Friday, June 17, 2005

Genunix.Org is Alive



genunix.org equipment with the N2120 staged for the camera only


Take a look at GenUnix.Org. There's not much content there now beyond a mirror of the OpenSolaris launch files and some video from the first OpenSolaris User Group meeting, but that'll change in the future. Cyril Plisko has an operational Subversion (SVN) source repository hosted at the site.

How genunix.org got started (Part 1 of 2)

Early in May, I got the idea to host an OpenSolaris Community/Mirror site. The first step was to leave a message for Paul Vixie of Internet Systems Consortium (ISC), because I knew they currently host kernel.org and a bunch of other successful Open Source projects. I wanted to add OpenSolaris to that list.

Within a week I had been contacted by Peter Losher and we got an OK to proceed. I could hardly believe it - access to a clean one-gigabit connection to the Internet, with the rackspace, power, cooling and bandwidth sponsored by ISC.

Next I needed to scrounge up some equipment. We (at Logical Approach) decided to sponsor the site with a maxed-out V20Z: two 146 GB drives, 8 GB of memory and two AMD Opteron 252 (2.6GHz) processors. This would ensure that the site would go online and indicate our commitment to this project. However, I was reluctant to bring up the site to support the upcoming launch of OpenSolaris with just one server. I wanted high performance, but also realized that high reliability and high availability were primary requirements.

So I put together a generic technical spec - generic in that it described the basic architectural building blocks of the site, but did not specify vendor-specific part numbers or detailed configurations. The spec also broke the equipment down into two procurement phases, called the Starter System Configuration and the Enhanced System Configuration. This would allow the site to go online with the starter config and be expanded later to the enhanced config. Here is what the top-level generic spec looked like:

Starter System Configuration Overview
  1. Server Load Balancer (aka Application Switch) standalone appliance with:
  • 12 * gigabit ethernet ports:
    - 2 * optical ports to connect to the ISC infrastructure
    - 10 * copper UTP ports to connect to the web servers
  • 2 * A/C power supplies
  2. Four 1U dual AMD Opteron based rackmount servers, each configured with:
  • 2 * AMD Opteron 252 (2.6GHz) CPUs
  • 8 GB RAM
  • 2 * 146 GB U320 SCSI disk drives
  • 2 * built-in copper gigabit ethernet ports
  • 1 * dual-port gigabit ethernet expansion card
Enhanced System Configuration Overview
  1. One Fibre Channel (FC) SAN disk subsystem configured with:
  • 12 * 146 GB Fibre Channel 3.5" disk drives
  • 2 * RAID controllers, each with 1 GB cache and battery backup
  • 4 * 2 Gb/sec FC host ports
  • 2 * A/C power supplies
  2. Four Fibre Channel host adapters:
  • PCI 64-bit low-profile form factor
  • 2 Gb/sec optical LC connectors
  • 2m optical cable
As you can tell, the reliability/availability comes from using a Server Load Balancer (SLB), aka Application Switch, to load balance incoming requests across multiple backend servers. The load balancer issues periodic health checks and, assuming all four servers are healthy, distributes requests to the available servers in the defined pool according to the selected load balancing algorithm. The real beauty of this approach is that you can also do scheduled maintenance on any of the servers by "telling" the SLB to take a particular server out of the available pool. You wait until all active sessions on the server expire, then disconnect it. Now you are free to upgrade or repair it. Let's assume you're upgrading the operating system. After you've completed the upgrade, you have plenty of time to test exhaustively, because the other servers in the pool are serving your client requests. When you're satisfied that the upgraded server is ready for production, simply tell the SLB to put it back into the pool. Your user community experiences no impact and is completely unaware that you've just upgraded a server.
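
To make the traffic flow concrete, here is a minimal sketch in Python of the idea behind the SLB - purely illustrative; the Server/Pool classes, the round-robin choice and the drain/restore names are my own shorthand, not the switch's actual configuration language:

    import itertools

    class Server:
        def __init__(self, name):
            self.name = name
            self.healthy = True     # flipped by the SLB's periodic health checks
            self.draining = False   # set when the server is pulled for maintenance

    class Pool:
        """Round-robin selection over servers that are healthy and not draining."""

        def __init__(self, servers):
            self.servers = servers
            self._ring = itertools.cycle(servers)

        def pick(self):
            # Walk the ring, skipping unhealthy or draining servers.
            for _ in range(len(self.servers)):
                server = next(self._ring)
                if server.healthy and not server.draining:
                    return server
            raise RuntimeError("no servers available")

        def drain(self, name):
            # Take a server out of rotation for scheduled maintenance.
            for server in self.servers:
                if server.name == name:
                    server.draining = True

        def restore(self, name):
            # Return an upgraded/repaired server to the pool.
            for server in self.servers:
                if server.name == name:
                    server.draining = False

    # Four backends; drain web2 for an OS upgrade, the others keep serving.
    pool = Pool([Server("web1"), Server("web2"), Server("web3"), Server("web4")])
    pool.drain("web2")
    print([pool.pick().name for _ in range(6)])   # web2 never appears
    pool.restore("web2")                          # back in rotation after testing

The drain/restore pair is the heart of the maintenance story: a drained server simply stops receiving new requests, and the clients never notice the difference.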

This architecture is also cost effective, because you can consider each server a throwaway server. I don't mean this literally. Each server can have a single power supply, a single SCSI bus, or non-mirrored disks - because if it fails, it will have little impact on the service you're providing. This is in stark contrast to using high-end (read: expensive) servers with multiple power supplies, multiple disk subsystem buses and mirrored disk drives.

Next the generic spec was translated into a detailed vendor-specific specification, including a parts list. Of course I preferred that Sun provide the hardware sponsorship - so there was a little Sun bias in the original generic spec. For the servers, I really wanted to use the Sun V20Z - it's an awesome server based on the AMD Opteron processor and runs Solaris based applications with impressive speed and efficiency.

I ran the spec by the other members of the CAB as a sanity check. No feedback = good news. Next I presented it to Jim Grisanzio and Stephen Harpster. Initially I got a No - for various reasons. Then Simon Phipps (also a CAB member) told me to forward the proposal to John Fowler.

In the meantime, I was busy upgrading Logical's V20Z with the required new CPUs, expanded memory capacity and a couple of 146 GB disk drives. Unfortunately the new CPUs were not compatible with the existing motherboard or Voltage Regulator Modules (VRMs). The V20Z uses separate VRMs for the CPU and memory. The Sun 252 processor upgrade kits came with the required VRMs, so that was not an issue. But the included documentation indicated the requirement for a revision K2.5 motherboard or, in Sun's terminology, the Super FRU chassis assembly, where FRU means Field Replaceable Unit. Since this was a Sun-supplied upgrade, I called Sun's tech support and explained the issue. In less than an hour I had a case number and was told that a replacement motherboard would be dispatched.

It takes about one hour of careful work to strip your existing motherboard and "transplant" the parts [1] to the replacement, and then about 10 minutes to install the new CPUs, heatsinks, and the CPU and memory VRMs. It helps if you are comfortable working on PC hardware - if not, I'd recommend that you find someone who is. One (big) advantage of the updated motherboard is (IMHO) quieter, speed-controlled fans and support for DDR400 memory parts (with the upgraded CPUs).

On June 1 an email arrived with the news I had been awaiting. Bill Channel now had my request, via John Fowler, for Hardware sponsorship and he was ready to get started on making this happen! :)

The hardware was scheduled for delivery on Monday June 6th.

Note [1]: CDROM/floppy assembly, SCSI backplane, PCI risers, Power Supply, SCSI backplane cable assemblies, daughter board with keyboard/mouse connectors, memory, disk drive(s).


Continued in Part II.
