#linuxfu: Fixing your network interfaces after a motherboard change
I've loved Linux since I first spent a long day downloading 150 or more floppy disks' worth of Slackware in my first autumn term at University in 1992.*
There was something magical about it even then, in those days before the web. There was so much it could do, and you could tinker beneath the hood and add go-faster stripes if you wanted.
I remember my first Linux kernel compile (version 1.1.16), and how I spent hours researching how X11 rendered the desktop, so I could squeeze the maximum number of pixels out of my cheap 15" CRT monitor without the refresh rate dropping below 60Hz (I think I ended up with some very odd resolution such as 886x664).
Times have changed, but my love of Linux hasn't.
I wanted to share a story today of how wonderful Debian is, but also how utterly perplexing it can be if you don't know what is going on.
The story starts with a server that had a faulty motherboard. Specifically, the IPMI interface (if you don't know what that is, find out - it is a sysadmin's best friend) had stopped working, and no amount of rebooting or reflashing would make it work.
A replacement motherboard was found, and sent out to the Data Centre, where I got remote hands engineers to fit it. In no time at all, the engineer popped up on chat to tell me the job was done, and that the server was up and at the login prompt on the console.
But I couldn't ping it. Why?
As always, a methodical and logical approach is the best way to deal with an issue like this.
First I realised that I didn't have IPMI access to the server, and there was a very logical reason for this - it was a new motherboard, so the IPMI on this board had not been configured. I got the engineer to reboot the server and configure the IPMI via the BIOS. That worked a treat - now I had control over the power and also a console onto the server through which I could properly debug the issue.
I logged in and su'd to root via the IPMI console (isn't this brilliant - the server itself has no working network interfaces, and yet I can still log into it from a different continent!), and tried pinging the intranet gateway. Nothing.
So, what is my network config looking like? I used 'ip addr list' to tell me what IPs I had configured. I didn't have any! Why would that be? The server was booting from the same hard drive it used before the motherboard was changed. It was the same server. So surely it should have the same running config as before?
It was then I made a breakthrough. Looking closely at the output of the 'ip addr list' command, I noticed something that wasn't right. The two network interfaces on my servers are universally designated eth0 and eth1 - it is one of those constants that helps you to keep your configs uniform when you have a lot of machines to deal with. However, what I saw was eth2 and eth3!
Neither eth0 nor eth1 was there at all. Now the lack of pinging made sense - the interface configs in /etc/network/interfaces were for eth0 and eth1, and there was nothing in there for eth2 or eth3. At boot time the init scripts had not found any of the interfaces they were told to configure.
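If you want to run the same check without squinting at console output, here is a minimal Python sketch of the comparison I did by eye. It pulls the interface names out of an /etc/network/interfaces style file and reports any that the running system doesn't have. The sample config and its addresses are made up for illustration (they come from the documentation ranges), and the function names are my own.

```python
import re


def configured_interfaces(interfaces_text):
    """Return interface names that have an 'iface' stanza, skipping loopback."""
    names = []
    for line in interfaces_text.splitlines():
        match = re.match(r"\s*iface\s+(\S+)\s+inet6?\b", line)
        if match and match.group(1) != "lo" and match.group(1) not in names:
            names.append(match.group(1))
    return names


def missing_interfaces(interfaces_text, present):
    """Names that are configured in the file but absent from the system."""
    return [n for n in configured_interfaces(interfaces_text) if n not in present]


# Hypothetical config, not taken from the real server.
SAMPLE = """\
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    gateway 192.0.2.1

auto eth1
iface eth1 inet static
    address 198.51.100.10
    netmask 255.255.255.0
"""

if __name__ == "__main__":
    # What the kernel showed after the motherboard swap.
    print(missing_interfaces(SAMPLE, ["lo", "eth2", "eth3"]))
```

On a live Linux box you could pass os.listdir("/sys/class/net") as the list of present interfaces and read the real file instead of SAMPLE.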
What I didn't immediately understand was why it had happened.
Google proved to be my friend here, and ultimately the answer is pretty straightforward, although it took a while to pin down. I'm here to save you the leg-work if you are stuck with a similar issue now.
At boot time, the udev subsystem in Debian takes a look at your hardware, and applies a bunch of rules that help determine how those devices get configured for use. This means that every time you boot your server, the devices it finds are configured in the same consistent way.
One set of rules concerns network interfaces. These are enumerated by MAC address, which is of course unique for each interface. The rules match a MAC address to a network interface name.
When the server was first built, the interfaces were added to the rules file and assigned eth0 and eth1.
When the motherboard was replaced and the server booted, udev spotted that there were two new NICs present with MAC addresses it hadn't seen before. It built rules for those interfaces, and assigned them the next available interface names - eth2 and eth3.
Makes sense, doesn't it? But it is one of those scenarios that is difficult to get to the bottom of until you've experienced it and scratched your head a bit.
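To make the naming behaviour concrete, here is a toy model of it in Python. This is not udev's actual code, just a sketch of the behaviour described above: MAC addresses that already have a rule keep their name, and unseen ones get the lowest free ethN. The MAC addresses are hypothetical, taken from the range reserved for documentation.

```python
def assign_names(known, macs_at_boot, prefix="eth"):
    """Toy model: known MACs keep their name, new MACs get the next free one."""
    rules = {mac.lower(): name for mac, name in known.items()}
    for mac in macs_at_boot:
        mac = mac.lower()
        if mac in rules:
            continue  # already has a persistent name
        index = 0
        while f"{prefix}{index}" in rules.values():
            index += 1  # skip names that are taken, even by absent NICs
        rules[mac] = f"{prefix}{index}"
    return rules


# Rules written when the server was first built (hypothetical MACs).
OLD_BOARD = {
    "00:00:5e:00:53:01": "eth0",
    "00:00:5e:00:53:02": "eth1",
}

# NICs on the replacement motherboard (also hypothetical).
NEW_BOARD = ["00:00:5e:00:53:a1", "00:00:5e:00:53:a2"]

if __name__ == "__main__":
    for mac, name in assign_names(OLD_BOARD, NEW_BOARD).items():
        print(mac, name)
```

The old entries never go away on their own, which is why eth0 and eth1 stay reserved for NICs that are no longer in the machine.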
There is nowhere obvious to look that might explain why the names of your network interfaces have changed - but that's because you rarely ever have to tinker with udev.
To fix the issue, you have to modify /etc/udev/rules.d/70-persistent-net.rules.
In that file you'll find an entry for every NIC that udev has ever seen. That means there will be entries for your original eth0 and eth1, along with the newer entries for eth2 and eth3.
The fix is simple. Comment out the entries for the original eth0 and eth1, and then modify the eth2 and eth3 entries so that they assign the NICs as eth0 and eth1. Then save and close the file, and reboot.
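I made the change by hand in an editor, but if you have several machines to fix, the same edit can be sketched in Python. The rule lines in SAMPLE below follow the general shape I'd expect in that file, but treat the exact attribute list as an assumption and check it against your own system. The MAC addresses are hypothetical and fix_rules is my own helper name.

```python
import re

NAME_RE = re.compile(r'NAME="([^"]+)"')


def fix_rules(rules_text, rename):
    """Comment out rules holding a wanted name, rename the rules in `rename`."""
    wanted = set(rename.values())
    out = []
    for line in rules_text.splitlines():
        match = NAME_RE.search(line)
        if line.lstrip().startswith("#") or match is None:
            out.append(line)  # comments and blank lines pass through
        elif match.group(1) in rename:
            new_name = rename[match.group(1)]
            out.append(NAME_RE.sub(f'NAME="{new_name}"', line))
        elif match.group(1) in wanted:
            out.append("# " + line)  # stale entry from the old board
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical file contents; the format is an assumption, check your own.
SAMPLE = '''\
# old motherboard
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:00:5e:00:53:01", KERNEL=="eth*", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:00:5e:00:53:02", KERNEL=="eth*", NAME="eth1"
# new motherboard
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:00:5e:00:53:a1", KERNEL=="eth*", NAME="eth2"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:00:5e:00:53:a2", KERNEL=="eth*", NAME="eth3"
'''

if __name__ == "__main__":
    print(fix_rules(SAMPLE, {"eth2": "eth0", "eth3": "eth1"}), end="")
```

Run it against a copy of the file first and keep a backup of the original. You still need the reboot afterwards for the new names to take effect.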
Problem solved.
I love Linux. I love that it uses logic and rules to ensure that devices get enumerated in the same way every time you boot. I love it that when things break you can still look under the hood and fix them.
--
*That included at least 25 floppies' worth of LaTeX. And nope, before you ask, I didn't install them. Do you know anyone who ever did? Oh, and yes - I really was at University in 1992.










