Sunday, September 30, 2007

Setup ZFS boot for Build 72

This cheat sheet uses a very simple and minimal PXE boot to help you set up a machine with ZFS boot and SXCE build 72 (and later). In all, from scratch, you should be able to complete this entire process in about one hour! We make the following assumptions:

  • the install server is on the install network at 192.168.80.18
  • the install server is using a ZFS-based filesystem with a pool called tanku. The user's home directory is also in this pool at /tanku/home/al
  • the target machine has ethernet address: 00:e0:81:2f:e1:4f
  • there are no other DHCP servers active on the install network
Verify that your ethernet interface supports PXE boot. Most systems do - except for low-end ethernet cards that don't have an option ROM. Determine the ethernet address of the interface you'll be using for PXE boot. Make a note of this address.

Download Lori Alt's/Dave Miner's ZFS boot tools:
wget http://www.opensolaris.org/os/community/install/files/zfsboot-kit-20060418.i386.tar.bz2

Yes - the date in the file name should be 20070418. Unzip and untar the tools - in this case they'll end up in /tanku/home/al/zfsboot/20070418 (aka ~al/zfsboot/20070418):
cd
mkdir zfsboot
cd zfsboot
bunzip2 -c zfsboot-kit-20060418.i386.tar.bz2 | tar xvf -
Notice that the directory name has been changed to 20070418. Find and read the README file. But don't spend too much time studying it. This cheat sheet will tell you what to do.

On the install server, set up a ZFS bootable netinstall image for b72:
mkdir /mnt72
chown root:sys /mnt72
chmod 755 /mnt72
# FYI only: /solimages is an NFS mount
lofiadm -a /solimages/sol-nv-b72-x86-dvd.iso
The following assumes that lofiadm returned "/dev/lofi/2":
mount -F hsfs -o ro /dev/lofi/2 /mnt72
zfs create tanku/b72installzfs
zfs set sharenfs='ro,anon=0' tanku/b72installzfs
cd /mnt72/Solaris_11/Tools
./setup_install_server /tanku/b72installzfs
cd /tanku/home/al/zfsboot/20070418
The next step takes around 13 minutes (why?)
ptime ./patch_image_for_zfsboot /tanku/b72installzfs
Remove the DVD image mount and clean up:
umount /mnt72
lofiadm -d /dev/lofi/2
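The commands above hard-code /dev/lofi/2. Here is a minimal sketch of how the attach and detach steps could remember whatever device lofiadm hands back instead. The function names attach_and_mount and unmount_and_detach are made up for illustration (they are not part of the zfsboot kit), and the sketch assumes, as the walkthrough does, that lofiadm -a prints the device path on standard output.

```shell
# Hypothetical helpers - not part of the zfsboot kit.

# attach_and_mount ISO MOUNTPOINT
#   Attaches ISO as a lofi device, remembers the device path in LOFI_DEV,
#   and mounts it read-only as an hsfs filesystem on MOUNTPOINT.
attach_and_mount() {
    LOFI_DEV=$(lofiadm -a "$1") || return 1
    mount -F hsfs -o ro "$LOFI_DEV" "$2"
}

# unmount_and_detach MOUNTPOINT
#   Unmounts MOUNTPOINT, then detaches the device remembered in LOFI_DEV.
unmount_and_detach() {
    umount "$1" && lofiadm -d "$LOFI_DEV"
}
```

With these defined, "attach_and_mount /solimages/sol-nv-b72-x86-dvd.iso /mnt72" would replace the lofiadm -a and mount lines, and "unmount_and_detach /mnt72" would replace the two cleanup lines, with no need to read the device name off the screen.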
Verify that you can mount /tanku/b72installzfs on another machine as a quick test. It is better to check this now than to troubleshoot it later. Use a mount command similar to:
mount -F nfs -o ro,vers=3,proto=tcp 192.168.80.18:/tanku/b72installzfs /mnt
Now cd to the Tools subdirectory in the prepared ZFS boot area - in this case under /tanku/b72installzfs:
cd /tanku/b72installzfs/Solaris_11/Tools
Generate the target client files:
./add_install_client -d -e 00:e0:81:2f:e1:4f -s 192.168.80.18:/tanku/b72installzfs i86pc
You'll see instructions to add the client macros (something) like:
If not already configured, enable PXE boot by creating
a macro named 0100E0812FE14F with:
Boot server IP (BootSrvA) : 192.168.80.18
Boot file (BootFile) : 0100E0812FE14F
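The macro name in that output follows a visible pattern: "01" followed by the ethernet address in upper case with the colons removed. If you are setting up more than one client, a small helper can derive the name so you don't mistype it. This is only a sketch; pxe_macro_name is a made-up name, and the zero-padding is there on the assumption that you may have copied the address from a tool that drops leading zeros (0:e0:... rather than 00:e0:...).

```shell
# pxe_macro_name MAC   (hypothetical helper)
#   Prints "01" followed by the six MAC octets, zero-padded to two hex
#   digits each and upper-cased, with the colons removed.
pxe_macro_name() {
    echo "$1" | tr 'abcdef' 'ABCDEF' | awk -F: '{
        out = "01"
        for (i = 1; i <= NF; i++) {
            if (length($i) == 1)
                out = out "0"
            out = out $i
        }
        print out
    }'
}
```

For the target machine in this cheat sheet, "pxe_macro_name 00:e0:81:2f:e1:4f" prints 0100E0812FE14F, which matches what add_install_client reported.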
Using the screen-by-screen guide at http://www.sun.com/bigadmin/features/articles/jumpstart_x86_x64.jsp
starting at step 5, entitled Configure and Run the DHCP Server, set up the DHCP server and add the required two macros. NB: Ignore everything up to step 5. You don't need any of it!

At step 5.n, "n. Type the number of IP addresses and click Next." you should consider adding more than two addresses, in case something else on this network (unexpectedly) requests a DHCP lease.

Now add the two macros and use the name 0100E0812FE14F. Note well: the macro must have the correct name. Verify that the tftp-based files are available. Again - a quick test now will save you a bunch of troubleshooting time down the road.
df | grep tftp
It should look something *like* this:
/tanku/b72installzfs/boot    260129046 3564877 256564169     2%    /tftpboot/I86PC.Solaris_11-2
Test that the tftp files can be successfully retrieved via tftp:
$ cd /tmp
$ tftp 192.168.80.18
tftp> get 0100E0812FE14F
Received 134028 bytes in 0.0 seconds
tftp> quit
Don't forget to clean up:
rm /tmp/0100E0812FE14F
Enable FTP on your boot server to allow snagging the zfs boot profile file:
svcadm enable ftp
Change your password before you dare use FTP. Remember to use a disposable password - because it can be sniffed on the LAN. After we're finished using FTP, restore your original password.

Now enable PXE boot in the target system's BIOS.
Boot the target system.
During the early phases of booting, press F12 ASAP.

You should see the machine contact the DHCP server and start downloading the required files within a couple of seconds.

NB: verify that the ethernet address displayed by the PXE code is the same one you expected and is associated with the physical interface in use. Some machines pick the ethernet port that will be used for PXE boot for you - you simply don't have a choice. Newer BIOSes allow you to enable PXE separately for each supported interface. Expect to see a GRUB prompt for the release you're installing (i.e., b72)

There is a known bug with build 72 that you might encounter when the target machine contacts the DHCP server. If you see something similar to:
Alarm Clock

ERROR: Unable to configure the network interface
exiting to shell
then you've hit bug 6598201. The workaround is to simply enter ^D (control-D) in the terminal and the process will continue as if nothing had happened.

Select 4 (Console install) - it's the least likely to cause you issues. If you're using bge0 as the PXE boot interface, ensure that you leave the bge0 interface enabled for networking "[x] bge0" - otherwise you won't be able to "see" the install server. Fill in the minimum required config details, and take the first Exit option as soon as you see one. Now you should be looking at a command line prompt.

The following assumes you've set up a profile file called (simply) profile.zfs on your boot server. See samples below.

At the prompt:
cd /tmp
ftp 192.168.80.18
user:
password:
(use the dummy login/password you set up earlier)
get profile.zfs
quit
Now load the system with pfinstall:
pfinstall /tmp/profile.zfs
The system should begin loading Solaris within a couple of seconds.

Sample ZFS boot profile #1 (simple).
You may wish to change the cluster type to SUNWCXall (see next sample)
install_type initial_install
cluster SUNWCall
filesys c1t0d0s1 auto swap
pool mypool free / mirror c1t0d0s0 c2t1d0s0
dataset mypool/be1 auto /
dataset mypool/be1/usr auto /usr
dataset mypool/be1/opt auto /opt
dataset mypool/be1/var auto /var
dataset mypool/be1/export auto /export
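The dataset lines in that profile all follow one pattern, so if you need the same layout for several machines you could generate the file rather than hand-edit it. The following is only a sketch under that assumption: make_zfs_profile is a made-up name, the argument order is arbitrary, and the keywords it prints are simply copied from the sample above.

```shell
# make_zfs_profile POOL SWAP_SLICE DISK1 DISK2   (hypothetical helper)
#   Prints a profile with the same layout as sample #1: one swap slice,
#   a mirrored pool on DISK1 and DISK2, and a be1 dataset tree.
make_zfs_profile() {
    pool=$1
    swap=$2
    disk1=$3
    disk2=$4
    echo "install_type initial_install"
    echo "cluster SUNWCall"
    echo "filesys $swap auto swap"
    echo "pool $pool free / mirror $disk1 $disk2"
    echo "dataset $pool/be1 auto /"
    # One dataset per top-level directory, mounted at the matching path.
    for fs in usr opt var export; do
        echo "dataset $pool/be1/$fs auto /$fs"
    done
}
```

Running "make_zfs_profile mypool c1t0d0s1 c1t0d0s0 c2t1d0s0 > profile.zfs" reproduces sample #1 line for line; change the arguments to suit the disks in your target machine.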

Sample ZFS boot profile #2 (more complex).

Note the subtle change in the cluster name in this sample. We will load all the available locales by using the geo keyword. This will almost double the required install disk space. Instead of the C default system locale we'll make the system default be en_US.UTF-8.
install_type initial_install
cluster SUNWCXall
system_locale en_US.UTF-8
geo N_Africa
geo C_America
geo N_America
geo S_America
geo Asia
geo Ausi
geo C_Europe
geo E_Europe
geo N_Europe
geo S_Europe
geo W_Europe
geo M_East
filesys c1t0d0s1 auto swap
pool tanks free / mirror c1t0d0s0 c2t0d0s0
dataset tanks/be1 auto /
dataset tanks/be1/usr auto /usr
dataset tanks/be1/opt auto /opt
dataset tanks/be1/var auto /var
dataset tanks/be1/export auto /export
With cluster SUNWCXall and no additional geo regions, you should be ready to reboot in 7 to 10 minutes. Now reboot the machine gracefully:
init 6
That's it! Your machine should now reboot successfully.
Enjoy!

PS: Don't forget to change your password back and disable FTP on the install server. If you're going to reboot the install server, remember to remove the /etc/vfstab entry for /tftpboot - or the machine will not boot cleanly to run-level 3.
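To see which /etc/vfstab lines that warning is about before you reboot, something like the following sketch could help. It only lists candidate lines; it does not edit anything. The name tftpboot_vfstab_entries is made up, and it relies on the mount point being the third field of a vfstab line.

```shell
# tftpboot_vfstab_entries [FILE]   (hypothetical helper)
#   Prints the non-comment lines of FILE (default /etc/vfstab) whose
#   mount point, the third field, starts with /tftpboot.
tftpboot_vfstab_entries() {
    awk '$1 !~ /^#/ && $3 ~ /^\/tftpboot/' "${1:-/etc/vfstab}"
}
```

Run it with no arguments on the install server. If it prints nothing, there is no /tftpboot entry left to remove.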