Friday, November 09, 2007

hype versus reality

Like every engineer, I have to admit, upfront, that I have a limited tolerance for spin-meistering and marketing terminology. I find it boring, repetitive, tiring and annoying. It's also totally predictable - after you identify the marketing speakwords, aka buzzwords, it's really annoying to see them over-used again and again and again and again - you get the picture. Most engineers can relate to this. The flip-side is that a certain amount of buzzword repetition actually works! OK - while I don't pretend to understand this phenomenon, I'll buy into it, based on anecdotal evidence. But... there is a point where hype and buzzword usage crosses my personal line-in-the-sand. I guess every technocrat has their own line-in-the-sand.

That line, in my case, is where hype and buzzwords go from being hype that really can't be verified or validated to being totally bogus and unbelievable. In fact, I'll go further and state that, in some extreme cases, it can be plain dumb and/or possibly dishonest. Such is the case with Marc Hamilton's blog entry (blogs.sun.com/marchamilton/entry/busy_weekend), which states that the Indiana project preview had seen 100,000 downloads in less than 72 hours.

Why did I think that this number was just plain wrong? Because I had an email exchange with Jesse Silver of Sun, in which he asked me if I could provide download numbers for the Project Indiana Developer Preview (filename in-preview.iso) that we were mirroring on www.genunix.org, and he told me to expect "big numbers". So I asked him, "what do you mean by big numbers?" and he said that they had seen over 100,000 downloads from dlc.sun.com. I was curious, because my initial reaction was that I (personally) didn't feel that there was this level of interest in Indiana - particularly since the marketing team had not released it under the widely expected name of Project Indiana but had instead chosen to rename/re-brand it as the OpenSolaris Developer Preview, which no-one, including me, really expected. Well, after taking a look at genunix.org's numbers - we had shipped 690 copies at that time (Mon Nov 5 11:52:54 PST 2007) - I just didn't see that level of interest. After a quick back-of-the-napkin calculation, I knew those numbers were just plain wrong, and I advised Jesse that I felt they were flawed. Why? Well, for the answer, take a look at the following email, which I sent to Marc Hamilton three days later, after I noticed his blog entry (a post to one of the OpenSolaris mailing lists referred to it, which prompted me to read it):

--------- begin Marc Hamilton email -----------
Date: Thu, 8 Nov 2007 10:22:09 -0600 (CST)
From: Al Hopper
To: Marc Hamilton
Subject: 100k download - hard to believe

Hi Marc,

I saw your recent blog[1] and the number you quoted for Project Indiana downloads (100,000) does not look reasonable to me. A "back-of-the-napkin" calculation reveals that for 100,000 downloads of 660226048 bytes per iso image, delivered over 72 hours, you'd be pushing over 2Gbits/Sec to the 'net. From a quick test of dlc.sun.com, it looks like you've got a 5Mbit/Sec cap on your connection (my best techguess).

I had an earlier email "conversation" with Jesse Silver - where he asked me for our genunix.org stats[2] and quoted his download numbers. I expressed scepticism that his numbers were accurate - suggesting that they may have counted the number of download transactions from the http access logs, rather than accumulating a count of the bytes transferred per in-preview.iso transaction and dividing the result by 660226048 (the size of the iso image). As of about 1 hour ago, we've shipped 808 copies of in-preview.iso.

I would suggest that you update your blog ASAP.
Comments welcome.

[1] http://blogs.sun.com/marchamilton/entry/busy_weekend
[2] we are providing downloads of the in-preview.iso file

--------- end Marc Hamilton email -----------
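
For anyone who wants to check the napkin math, here is a minimal Python sketch of the two calculations described in the email. The image size, the 100,000 figure and the 72 hour window come from the text above; the function names and the sample byte counts are made up purely for illustration and are not taken from any real access log.

```python
ISO_BYTES = 660_226_048        # size of in-preview.iso, from the email
CLAIMED_DOWNLOADS = 100_000    # figure quoted in the blog entry
WINDOW_SECONDS = 72 * 3600     # "less than 72 hours"


def required_bits_per_second(downloads, iso_bytes, seconds):
    """Average line rate needed to ship `downloads` full copies in `seconds`."""
    return downloads * iso_bytes * 8 / seconds


def full_copies(bytes_per_transaction, iso_bytes):
    """Equivalent complete copies: total bytes shipped divided by image size."""
    return sum(bytes_per_transaction) / iso_bytes


rate = required_bits_per_second(CLAIMED_DOWNLOADS, ISO_BYTES, WINDOW_SECONDS)
print(f"{rate / 1e9:.2f} Gbit/s sustained")

# Hypothetical sample: five log entries for in-preview.iso, only one of
# which transferred the whole image (the rest were aborted or partial).
sample = [ISO_BYTES, ISO_BYTES // 2, ISO_BYTES // 4, ISO_BYTES // 4, 0]
print(len(sample), "transactions =", full_copies(sample, ISO_BYTES), "full copies")
```

The second function is the point of the email: counting log entries treats every aborted or partial transfer as a download, while dividing the bytes actually shipped by the image size does not.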

More than 28 hours later, I still hadn't received any feedback from Marc - hence this blog entry.
Should inaccurate hype be allowed to go unchallenged? What do you think?

PS: screen capture of Marc Hamilton's blog as of Fri Nov 9th 19:28 Pacific

Sunday, September 30, 2007

Setup ZFS boot for Build 72

This cheat sheet will use a very simple and minimal PXE boot to help you set up a machine with ZFS boot and SXCE build 72 (and later). In all, from scratch, you should be able to complete this entire process in about one hour! We make the following assumptions:

  • the install server is on the install network at 192.168.80.18
  • the install server is using a ZFS-based filesystem with a pool called tanku. The user's home directory is also in this pool at /tanku/home/al
  • the target machine has ethernet address: 00:e0:81:2f:e1:4f
  • there are no other DHCP servers active on the install network

Verify that your ethernet interface supports PXE boot. Most systems do - except for low-end ethernet cards that don't have an option ROM. Determine the ethernet address of the interface you'll be using for PXE boot. Make a note of this address.

Download Lori Alt's/Dave Miner's ZFS boot tools:
wget http://www.opensolaris.org/os/community/install/files/zfsboot-kit-20060418.i386.tar.bz2

Yes - the filename says 20060418, but the date should be 20070418. Unzip and untar them - in this case they'll end up in /tanku/home/al/zfsboot/20070418 (aka ~al/zfsboot/20070418):
cd
mkdir zfsboot
cd zfsboot
bunzip2 -c zfsboot-kit-20060418.i386.tar.bz2 | tar xvf -
Notice that the directory name has been changed to 20070418. Find and read the README file. But don't spend too much time studying it. This cheat sheet will tell you what to do.

On the install server, set up a ZFS bootable netinstall image for b72:
mkdir /mnt72
chown root:sys /mnt72
chmod 755 /mnt72
# FYI only: /solimages is an NFS mount
lofiadm -a /solimages/sol-nv-b72-x86-dvd.iso
Assumes that lofiadm returned "/dev/lofi/2"
mount -F hsfs -o ro /dev/lofi/2 /mnt72
zfs create tanku/b72installzfs
zfs set sharenfs='ro,anon=0' tanku/b72installzfs
cd /mnt72/Solaris_11/Tools
./setup_install_server /tanku/b72installzfs
cd /tanku/home/al/zfsboot/20070418
The next step takes around 13 minutes (why?)
ptime ./patch_image_for_zfsboot /tanku/b72installzfs
Remove the DVD image mount and clean up:
umount /mnt72
lofiadm -d /dev/lofi/2
Verify that you can mount /tanku/b72installzfs on another machine as a quick test. Better to check this now than to troubleshoot it later. Use a mount command similar to:
mount -F nfs -o ro,vers=3,proto=tcp 192.168.80.18:/tanku/b72installzfs /mnt
Now cd to the Tools subdirectory in the prepared zfs boot area - in this case under /tanku/b72installzfs:
cd /tanku/b72installzfs/Solaris_11/Tools
Generate the target client files:
./add_install_client -d -e 00:e0:81:2f:e1:4f -s 192.168.80.18:/tanku/b72installzfs i86pc
You'll see instructions to add the client macros (something) like:
If not already configured, enable PXE boot by creating
a macro named 0100E0812FE14F with:
Boot server IP (BootSrvA) : 192.168.80.18
Boot file (BootFile) : 0100E0812FE14F
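
If you have several targets to set up, you can work out the macro name ahead of time. Judging from the output above, the name is simply 01 (the hardware type code for Ethernet) followed by the ethernet address in upper-case hex with the colons removed. Here is a minimal Python sketch of that rule; the function name pxe_macro_name is my own invention and is not part of the install tools, so always trust what add_install_client prints over this.

```python
def pxe_macro_name(mac):
    """Build the client macro name: '01' + the six MAC octets in upper-case hex."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 colon-separated octets, got {mac!r}")
    # int(..., 16) rejects non-hex input; :02X zero-pads and upper-cases,
    # so an address printed without leading zeros (0:e0:81:...) still works.
    return "01" + "".join(f"{int(octet, 16):02X}" for octet in octets)


print(pxe_macro_name("00:e0:81:2f:e1:4f"))  # 0100E0812FE14F
```

The same string is also the name of the boot file you will fetch with tftp in the test further down.
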
Using the screen-by-screen guide at http://www.sun.com/bigadmin/features/articles/jumpstart_x86_x64.jsp, starting at step 5, entitled "Configure and Run the DHCP Server", set up the DHCP server and add the required two macros. NB: Ignore everything up to step 5. You don't need any of it!

At step 5.n, "n. Type the number of IP addresses and click Next." you should consider adding more than two addresses, in case something else on this network (unexpectedly) requests a DHCP lease.

Now add the two macros and use the name 0100E0812FE14F. Note Well: the macro must have the correct name.

Verify that the tftp based files are available. Again - a quick test now will save you a bunch of troubleshooting time down the road.
df | grep tftp
It should look something *like* this:
/tanku/b72installzfs/boot    260129046 3564877 256564169     2%    /tftpboot/I86PC.Solaris_11-2
Test that the tftp files can be successfully retrieved via tftp:
$ cd /tmp
$ tftp 192.168.80.18
tftp> get 0100E0812FE14F
Received 134028 bytes in 0.0 seconds
tftp> quit
Don't forget to clean up:
rm /tmp/0100E0812FE14F
Enable FTP on your boot server to allow snagging the zfs boot profile file:
svcadm enable ftp
Change your password before you dare use FTP. Remember to use a disposable password - because it can be sniffed on the LAN. After we're finished using FTP, restore your original password.

Now enable PXE boot in the target system's BIOS.
Boot the target system.
During the early phases of booting, press F12 ASAP.

You should see the machine contact the DHCP server and start downloading the required files within a couple of seconds.

NB: verify that the ethernet address displayed by the PXE code is the same one you expected and is associated with the physical interface in use. Some machines pick the ethernet port that will be used for PXE boot for you - you simply don't have a choice. Newer BIOSes allow you to enable PXE separately for each supported interface. Expect to see a GRUB prompt for the release you're installing (i.e., b72)

There is a known bug with build 72 that you might encounter when the target machine contacts the DHCP server. If you see something similar to:
Alarm Clock

ERROR: Unable to configure the network interface
exiting to shell
Then you've hit bug 6598201. The workaround is to simply enter ^D (control-D) in the terminal and the process will continue as if nothing had happened.

Select 4 (Console install) - it's the least likely to cause you issues. If you're using bge0 as the PXE boot interface, ensure that you leave the bge0 interface enabled for networking "[x] bge0" - otherwise you won't be able to "see" the install server. Fill in the minimum required config details, and take the first Exit option as soon as you see one. Now you should be looking at a command line prompt.

The following assumes you've set up a profile file called (simply) profile.zfs on your boot server. See samples below.

At the prompt:
cd /tmp
ftp 192.168.80.18
user:
password:
(use the dummy login/password you setup earlier)
get profile.zfs
quit
Now load the system with pfinstall:
pfinstall /tmp/profile.zfs
The system should begin loading Solaris within a couple of seconds.

Sample ZFS boot profile #1 (simple).
You may wish to change the cluster type to SUNWCXall (see next sample)
install_type initial_install
cluster SUNWCall
filesys c1t0d0s1 auto swap
pool mypool free / mirror c1t0d0s0 c2t1d0s0
dataset mypool/be1 auto /
dataset mypool/be1/usr auto /usr
dataset mypool/be1/opt auto /opt
dataset mypool/be1/var auto /var
dataset mypool/be1/export auto /export

Sample ZFS boot profile #2 (more complex).

Note the subtle change in the cluster name in this sample. We will load all the available locales by using the geo keyword. This will almost double the required install disk space. Instead of the C default system locale we'll make the system default be en_US.UTF-8.
install_type initial_install
cluster SUNWCXall
system_locale en_US.UTF-8
geo N_Africa
geo C_America
geo N_America
geo S_America
geo Asia
geo Ausi
geo C_Europe
geo E_Europe
geo N_Europe
geo S_Europe
geo W_Europe
geo M_East
filesys c1t0d0s1 auto swap
pool tanks free / mirror c1t0d0s0 c2t0d0s0
dataset tanks/be1 auto /
dataset tanks/be1/usr auto /usr
dataset tanks/be1/opt auto /opt
dataset tanks/be1/var auto /var
dataset tanks/be1/export auto /export
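
If you have more than a couple of machines to load, you may prefer to generate profile.zfs rather than edit it by hand. What follows is a rough Python sketch that emits only the keywords used in the two samples above. The function render_profile and its parameters are hypothetical (nothing like this ships with the zfsboot kit), and I'm assuming a mirrored pool laid out exactly as in the samples; other layouts are untested.

```python
def render_profile(pool, disks, swap_slice, cluster="SUNWCall",
                   locale=None, geos=(), be="be1"):
    """Return profile text using only the keywords shown in the two samples."""
    lines = ["install_type initial_install", f"cluster {cluster}"]
    if locale:
        lines.append(f"system_locale {locale}")
    lines += [f"geo {region}" for region in geos]
    lines.append(f"filesys {swap_slice} auto swap")
    lines.append(f"pool {pool} free / mirror {' '.join(disks)}")
    # Root dataset for the boot environment, then one dataset per filesystem.
    lines.append(f"dataset {pool}/{be} auto /")
    for fs in ("usr", "opt", "var", "export"):
        lines.append(f"dataset {pool}/{be}/{fs} auto /{fs}")
    return "\n".join(lines) + "\n"


# Reproduces sample #1 above.
print(render_profile("mypool", ["c1t0d0s0", "c2t1d0s0"], "c1t0d0s1"), end="")
```

Passing cluster="SUNWCXall", locale="en_US.UTF-8" and a list of geo regions gives you the shape of sample #2.
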
With cluster SUNWCXall and no additional geo regions, you should be ready to reboot in 7 to 10 minutes. Now reboot the machine gracefully:
init 6
That's it! Your machine should now reboot successfully.
Enjoy!

PS: Don't forget to change your password back and disable FTP on the install server. If you're going to reboot the install server, remember to remove the /etc/vfstab entry for /tftpboot - or the machine will not boot cleanly to run-level 3.