Having worked in IT for many years, I came to understand this simple phrase very well. It isn’t that change is bad. Change can be a good thing if what currently exists is just plain stinking bad. The problem is that when something is working reliably and does what it is supposed to do, more often than not it is better to leave well enough alone. This is one of the reasons I ended up switching to Apple back in the ’90s for personal use: for the most part, stuff just worked and I didn’t have to waste personal time being my own computer technician (why when I had to deal with that for regular

Such is the case with the server upgrade for my site. My trusty old 500MHz Power Mac G4, running a circa 2005 operating system (Mac OS X Server 10.4.11), did exactly what I wanted, reliably and without fail. While it isn’t normally recommended, I’ve also been using this system as my router and firewall in addition to hosting a number of network services, including my web and mail servers. Back in the “old” days, I used a Sun SPARCstation IPX to do the same, and I have never used a dedicated hardware router because BSD UNIX could do it all. That is how robust and reliable my circa 1999 Graphite G4 has been over the years, but performance-wise it was caving in from having to handle so many different tasks.

The Mac mini upgrade was a no-brainer. I don’t need an uber box, and the performance the mini offered would be a huge increase over the G4. The mini itself isn’t the problem though. The issue is Snow Leopard Server (SLS). The irony is that I’ve been involved in the software seeds for every single Mac OS X Server release dating back to Rhapsody. It was also because of this testing that I personally never upgraded to anything beyond Tiger Server, as there were some design decisions which I just did not care for. When I bought my first Mac Pro (the 2.66GHz quad Xeon) in 2006, it stayed on Mac OS X 10.4.11. When I upgraded to the 3.33GHz hexacore Mac Pro, I begrudgingly accepted that I had to use Snow Leopard. If I could, I would run Tiger Server on this mini even though it is EOL’d and Apple no longer provides security updates for it. Unfortunately, the minimum system the mini supports is 10.6.3, so I’ve little choice in the matter unless I go and buy one of those pre-2007 systems.

Now routing (IP forwarding in my case) should be pretty much a no-brainer to implement given that Mac OS X is all BSD Unix underneath. It has worked without fail for me from OS X 10.1 to 10.4.11 in both client and server versions. Unfortunately, Apple sometimes adds things which break the way things are supposed to work. One of those things is scoped routing and, as I ended up finding out, I’m not the only one having problems using Snow Leopard Server to do very simple, basic routing. Again, the irony is that I was part of the Snow Leopard Server testing, BUT I only tested it as a standalone server and never tested its routing and NAT functionality, as I assumed (yes, I know, it is dangerous to assume anything) it wouldn’t be much different from previous versions. Turns out I was very wrong.

To make a long story short, after migrating everything over, I suddenly found every single client system on my internal network having issues loading web sites completely. Trying to download anything, even a small 1 MB zip archive, also turned into a nightmare where the connection would eventually just time out and the download would fail. Initially, I thought there might be a misconfiguration in the firewall and went over its configuration carefully. That wasn’t the problem though.

A simple search led me to Apple’s discussion forums and to a kernel parameter, net.inet.ip.scopedroute, which seemed to be affecting those trying to use SLS to do either just IP forwarding or also NAT (under certain configurations which cannot be set up using Gateway Assistant). Turns out Apple introduced source based routing (simply put, the notion of a single default gateway no longer exists) in Snow Leopard (this kernel parameter does not exist in Mac OS X 10.4 and was disabled in 10.5). IP forwarding and NAT just worked in Tiger Server and earlier (I’ve never used Leopard Server and, up until this migration, had never used Snow Leopard Server in a production setting). Attempting to disable scoped (aka source based) routing via sysctl -w net.inet.ip.scopedroute=0 should, in theory, make routing work the way it did in earlier versions of Mac OS X. Unfortunately, it didn’t, and routing would break after 15 minutes. And even if it did keep working, the same slow performance issues on client machines would remain, making SLS totally useless as a router solution. After trying various voodoo troubleshooting tips, I threw in the towel. I wasn’t about to waste time running tcpdump and analyzing where those packets were going or trying to set static routes to force the issue. If it gets to that level, I might as well move to a dedicated hardware router… or maybe just go back to what did work in the first place.
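For readers unfamiliar with the idea, here is a toy model of what scoping does to a route lookup. It is a minimal sketch in Python, not Apple’s kernel code: the Route class, the lookup function, the interface name en1 and all addresses are made up for illustration, and the rule that a scoped match always wins over an unscoped one is a simplifying assumption. It only shows why “the default gateway” stops being a single answer once routes can be tied to an interface.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    network: ipaddress.IPv4Network
    gateway: str
    ifscope: Optional[str] = None  # None means a classic, unscoped route


def lookup(routes: List[Route], dst: str, scope: Optional[str] = None) -> Optional[Route]:
    """Longest-prefix match with an optional interface scope (toy model).

    If a scope is given and a route bound to that interface matches,
    it wins. Otherwise fall back to the unscoped routes.
    """
    addr = ipaddress.ip_address(dst)
    candidates = [r for r in routes if addr in r.network]
    if scope is not None:
        scoped = [r for r in candidates if r.ifscope == scope]
        if scoped:
            return max(scoped, key=lambda r: r.network.prefixlen)
    unscoped = [r for r in candidates if r.ifscope is None]
    if not unscoped:
        return None
    return max(unscoped, key=lambda r: r.network.prefixlen)


# Hypothetical table: one classic default route plus a second default
# route that only applies to traffic scoped to interface en1.
ROUTES = [
    Route(ipaddress.ip_network("0.0.0.0/0"), "203.0.113.1"),
    Route(ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1", "en1"),
]
```

In this model, lookup(ROUTES, "198.51.100.7") picks the unscoped default via 203.0.113.1, while the same destination looked up with scope="en1" goes out via 192.168.1.1. A box that forwards packets between interfaces therefore has to get the scope right for every packet, which is one plausible place for things to go wrong.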

So temporarily, I moved the routing and firewall functions back to my G4, where they are working just fine and dandy. The mini with SLS is basically just handling the web server now, and my earlier hopes of gracefully retiring that workhorse G4 have been put on hold. All I can say is unfrickinbelievable. I ended up wasting an incredible amount of time prepping that system (recompiling programs and libraries) to take over everything, only to find it has issues that are not unique to me. Rhetorical question but… what happened to “it just works”, Apple? Something as simple and basic as NAT should just work out of the box by turning it on. But it is unusable in SLS unless set up via the Gateway Assistant (which in my case is useless because I need more control than what Apple provides; see the followup at the end of this post). Apple’s forte used to be making complicated stuff just work out of the box, but this recent trend of changing things just for the sake of change and making them somewhat more aggravating to work with (Mission Control in Lion is one example) is off-putting to this long-term user and AAPL shareholder.

I suppose this is one of the many reasons why they have trended away from the enterprise (the discontinuation of the Xserve and the way Lion Server is being developed and marketed). Frankly, the writing was on the wall when they stopped eating their own dog food to power some of their own backend services over the past few years (at least as far as being the primary hardware and OS). And with iCloud on the horizon and Apple not utilizing their own hardware and software, they would take even more flak for not using Xserve and Mac OS X Server in their data center (thus reinforcing the discontinuation of that product).

It’s also why I stopped recommending any of their enterprise offerings after the Leopard Server seed was completed. When I file a bunch of BRs or ERs and get back “works as expected”, I figure I’m just wasting my time: the decision has already been made higher up that this is how things should work, and only a large amount of fuss will possibly result in modifications being made (I know, I’ve seen that part from the inside). My biggest beefs were: the constant rearrangements within the Server Admin application with each version of the OS (which was disappointing because a great bulk of the ideas I had been involved in eventually made it into the Server Admin app by the time Tiger Server was released); the inclusion of Server Preferences (I understand the reasoning behind it in terms of the target market, but it reminded me of the early OS X Server seeds when I had to meticulously file many ERs to get the redundant settings out of the general System Preferences, and here they were spawning “yet another way to configure the system”); and how incomplete and half-assed the QuickTime Broadcaster application was, which was then unceremoniously dropped as they began pushing Podcast Producer. It’s not that I was a heavy user of Broadcaster; the point is that Apple has pushed technologies only to drop them or change things up at will. When Apple does that sort of stuff in an enterprise-level product (puts out something half-baked which never gets properly fixed and then drops it later), then wearing my enterprise IT hat, Apple as a provider is an unreliable “partner” from my point of view. Wearing my consultant hat, I’m glad I never recommended Xserve and OS X Server (note that for the general tinkerer and hobbyist, I’ve had no qualms about recommending OS X Server for personal use, since the expectations there are completely different from introducing it in a business environment where there is a lot more on the line).

As for Lion and Lion Server, can’t say much at this time due to the NDA. But once it is released, I’m going to have my 100 yen worth to say about some of the design decisions.

Followup: there might be some “confusion” regarding the routing and NAT issues that I’ve encountered with Snow Leopard Server, so I thought I should expand on them. Note that NAT will work perfectly fine for those who are able to use the Gateway Assistant to set up their box for internet sharing.

In my case, however, I cannot use the Gateway Assistant because it makes certain assumptions about which private IP addresses it uses as well as which network interfaces are used and how they are configured. First, keep the latter in mind while noting the configuration limitations imposed by Gateway Assistant when it hits this section. Second, take note of the hardware configuration relating to network ports for the Power Mac G4 that I’m migrating from and the 2010 Mac mini which had the issues with Snow Leopard Server’s source based routing. Based on these hints, it should be easy for the experienced networking admin to figure out how I’m configuring IP forwarding and NAT and why this setup worked well prior to Snow Leopard Server but causes some issues in SLS. I won’t go into further detail because that leads into all sorts of discussions regarding network security best practices (which is missing the forest for the trees since this isn’t any sort of mission critical data center that I’m running here; if such best practices were a high priority requirement, I wouldn’t have been running everything on a single box to begin with).

Getting back to the main issue though, disabling source based routing in SLS should revert the behavior to the previous way, but that doesn’t seem to be the case. Furthermore, there are individuals out there who still lose routing or encounter connectivity issues even when they used the Gateway Assistant to set up their internet sharing. What this seems to imply is that there is some sort of quirkiness in the way NAT actually operates under SLS depending on the configuration. Once I replace the G4 Power Mac with the 2007 Mac mini (running the universal version of Tiger Server), I plan on cloning that configuration and then performing an upgrade to Leopard Server to see whether routing and NAT work fine or encounter the same issues (again, source based routing became a kernel parameter in Leopard but is disabled by default).
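When comparing Tiger, Leopard and Snow Leopard side by side, it may help to record what each box reports for the relevant kernel parameters. The following is a rough sketch, assuming a modern Python 3 is available on the machine (the Python shipped with these OS releases is older and would need adjustments). parse_sysctl and read_sysctl are hypothetical helper names, and net.inet.ip.forwarding is the usual BSD forwarding switch rather than a parameter mentioned earlier in this post.

```python
import subprocess

# Parameters of interest; the second one is the scoped routing switch
# discussed in this post, the first is the standard BSD forwarding flag.
WATCHED = ("net.inet.ip.forwarding", "net.inet.ip.scopedroute")


def parse_sysctl(text):
    """Parse 'name: value' lines, as printed by BSD-style sysctl, into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            values[name.strip()] = value.strip()
    return values


def read_sysctl(names=WATCHED):
    """Ask sysctl for the given parameters and return whatever it printed.

    Errors go to stderr and are ignored here, so a parameter the kernel
    does not know about is expected to be missing from the result.
    """
    result = subprocess.run(
        ["sysctl", *names], capture_output=True, text=True, check=False
    )
    return parse_sysctl(result.stdout)
```

Calling read_sysctl() on each system and keeping the output next to the test notes would at least confirm whether scoped routing is really on or off at the moment routing breaks.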
