The random and rare writings of Al Tobey.

Saturday, January 31, 2009

Why OpenSolaris is growing on me ...

I crashed my OpenSolaris 2008.11 box and it was awesome. The crash was entirely my fault and I knew it was likely, but I can't resist playing with shiny new toys. I'm testing ZFS on iSCSI LUNs and was taking packet traces of iscsitgt and COMSTAR for the developers of LIO Target v3.0 so they can get all the bits right for automatic MPxIO setup when using a Solaris initiator. When I installed COMSTAR on OpenSolaris 2008.11, I noticed that one of the things included in the tarball I found was an updated iSCSI initiator package with some interesting infrastructural changes (iscsidm). I decided to give it a whirl on my test cluster with ten 2.5TB LUNs connected to a hopped-up Dell r905 running OS2008.11.
atobey@opensolaris01:~# wget http://www.opensolaris.org/os/project/iser/SUNWiscsitr-bins-121108.i386.tar.gz
atobey@opensolaris01:~# tar -xzvf SUNWiscsitr-bins-121108.i386.tar.gz
atobey@opensolaris01:~# pfexec su -
# Hmm, good chance this is gonna end badly ...
root@opensolaris01:~# zfs snapshot rpool/ROOT/opensolaris-1@2009-01-31-01:53:47
# Ok, now that I have a backout plan, continue ...
root@opensolaris01:~# pkgadd -d . SUNWiscsidmr
root@opensolaris01:~# pkgadd -d . SUNWiscsidmu
root@opensolaris01:~# pkgadd -d . SUNWiscsir
root@opensolaris01:~# pkgadd -d . SUNWiscsiu
root@opensolaris01:~# rm -f /etc/iscsi/* # only known way to truly clean up all iSCSI configuration
root@opensolaris01:~# reboot
# get some coffee, reconnect ...
atobey@opensolaris01:~# pfexec su -
root@opensolaris01:~# iscsiadm modify discovery --static disable
root@opensolaris01:~# iscsiadm modify initiator-node -N iqn.2008-12.org.tobert.opensolaris01
root@opensolaris01:~# iscsiadm modify initiator-node --node-alias opensolaris01
root@opensolaris01:~# iscsiadm modify initiator-node --configured-sessions 2 # required for MPxIO
root@opensolaris01:~# iscsiadm modify initiator-node --authentication chap
root@opensolaris01:~# iscsiadm modify initiator-node --CHAP-name torgiscsi
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box01:lio0,192.168.1.11:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box01:lio0,192.168.2.11:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box02:lio0,192.168.1.12:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box02:lio0,192.168.2.12:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box03:lio0,192.168.1.13:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box03:lio0,192.168.2.13:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box04:lio0,192.168.1.14:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box04:lio0,192.168.2.14:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box05:lio0,192.168.1.15:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box05:lio0,192.168.2.15:3260
root@opensolaris01:~# iscsiadm modify initiator-node --CHAP-secret
# enter password
root@opensolaris01:~# iscsiadm modify discovery --static enable
root@opensolaris01:~# zpool import -o cachefile=/etc/zfs/tank.cache tank
root@opensolaris01:~# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
tank 26.3T 800G 25.5T 2% ONLINE -
rpool 68G 20.8G 47.2G 30% ONLINE -
root@opensolaris01:~# zpool status tank
pool: tank
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
c0t234d0 ONLINE 0 0 0
c0t235d0 ONLINE 0 0 0
c0t230d0 ONLINE 0 0 0
c0t231d0 ONLINE 0 0 0
c0t226d0 ONLINE 0 0 0
c0t227d0 ONLINE 0 0 0
c0t222d0 ONLINE 0 0 0
c0t223d0 ONLINE 0 0 0
c0t216d0 ONLINE 0 0 0
c0t217d0 ONLINE 0 0 0

errors: No known data errors
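The ten static-config entries above follow a regular pattern: five targets (box01 through box05), each reachable on two subnets. Here is a minimal sketch of a loop that prints the same commands, assuming that naming scheme holds. The function name gen_static_config is hypothetical, and it only echoes the commands so they can be reviewed before being run:

```shell
#!/bin/sh
# Print (not run) the iscsiadm static-config commands for box01..box05,
# one per subnet, in the same order as the session above.
gen_static_config() {
    for i in 1 2 3 4 5; do
        for net in 1 2; do
            echo "iscsiadm add static-config iqn.2009-01.org.tobert.box0${i}:lio0,192.168.${net}.1${i}:3260"
        done
    done
}

gen_static_config
```

Piping the output to sh would apply it, though with only ten entries typing them out is not much slower.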
The devices are just regular iSCSI devices without MPxIO since I didn't configure it for this test. I fired off an instance of ffsb set to run for a day over 800GB of data with a read-heavy profile. This test in various forms has run perfectly for a couple of months now. This time, the new initiator crashed the kernel sometime during the night. This is what I saw on my serial console:
apnainci[ccp[uc6]p/ut2h]re/atdh=rfefafdfff0=fff07b9f2ff113c80: abs8s3ertion 15fa4i6l0e:d B:A D TR0, fAiPl:e :t y.p.e/=.e./com6
dd
Lr=0f fffofcfc0u0r7rbe9d2 3ian mo80 gduleenun "uix:naisxs" fdauiel +t7oe a( )N
L fpfofifff0n07bter9 2d3ad0e riedfme:riedmn_ce
.tyanscki_nagb ofritl_eo nsey+s1tde7m s(.).
ffffff007b923b20 idm:idm_task_abort+9a ()
ffffff007b923b70 idm:idm_update_state+1fb ()
ffffff007b923ba0 idm:idm_state_s8_cleanup+85 ()
ffffff007b923be0 idm:idm_conn_event_handler+144 ()
ffffff007b923c60 genunix:taskq_thread+193 ()
ffffff007b923c70 unix:thread_start+8 ()

dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
done
dump aborted: please record the above information!
rebooting...
Clearly the idm module still needs some polishing, and I'm sure it'll get it. My story so far is not here to disparage the idm or COMSTAR work; it's here as a backdrop to why I dig OpenSolaris more and more the longer I use it. Backing out patches always sucks, on nearly every OS, and with nearly every package manager. With ZFS, life sucks a lot less when you find yourself backed into a corner:
root@opensolaris01:~# zfs list -t all
NAME USED AVAIL REFER MOUNTPOINT
rpool 52.8G 14.1G 75K /rpool
rpool/ROOT 4.08G 14.1G 18K legacy
rpool/ROOT/opensolaris 106M 14.1G 2.31G -
rpool/ROOT/opensolaris-1 3.90G 14.1G 3.66G /
rpool/ROOT/opensolaris-1@install 71.0M - 2.21G -
rpool/ROOT/opensolaris-1@2009-01-13-23:06:36 40.0M - 2.26G -
rpool/ROOT/opensolaris-1@2009-01-31-01:53:47 78.4M - 3.66G -
rpool/dump 16.0G 14.1G 16.0G -
rpool/export 761M 14.1G 19K /export
rpool/export/home 761M 14.1G 32K /export/home
rpool/export/home/atobey 761M 14.1G 761M /export/home/atobey
rpool/swap 32.0G 46.1G 16K -
root@opensolaris01:~# zfs clone rpool/ROOT/opensolaris-1@2009-01-31-01:53:47 rpool/ROOT/opensolaris-2
root@opensolaris01:~# zfs list -t all
NAME USED AVAIL REFER MOUNTPOINT
rpool 52.8G 14.1G 75K /rpool
rpool/ROOT 4.08G 14.1G 18K legacy
rpool/ROOT/opensolaris 106M 14.1G 2.31G -
rpool/ROOT/opensolaris-1 3.90G 14.1G 3.66G /
rpool/ROOT/opensolaris-1@install 71.0M - 2.21G -
rpool/ROOT/opensolaris-1@2009-01-13-23:06:36 40.0M - 2.26G -
rpool/ROOT/opensolaris-1@2009-01-31-01:53:47 78.4M - 3.66G -
rpool/ROOT/opensolaris-2 74.8M 14.1G 3.66G -
rpool/dump 16.0G 14.1G 16.0G -
rpool/export 761M 14.1G 19K /export
rpool/export/home 761M 14.1G 32K /export/home
rpool/export/home/atobey 761M 14.1G 761M /export/home/atobey
rpool/swap 32.0G 46.1G 16K -
root@opensolaris01:~# zfs set mountpoint=/ rpool/ROOT/opensolaris-2
root@opensolaris01:~# zfs set canmount=noauto rpool/ROOT/opensolaris-2
# just because they're there on the other clones ...
root@opensolaris01:~# zfs set org.opensolaris.libbe:policy=static rpool/ROOT/opensolaris-2
# ran uuidgen on a linux box
root@opensolaris01:~# zfs set org.opensolaris.libbe:uuid=35f4532d-e452-453b-86a9-1de8e1b296f9 rpool/ROOT/opensolaris-2
root@opensolaris01:~# sed -i 's/opensolaris-1/opensolaris-2/g' /rpool/boot/grub/menu.lst
root@opensolaris01:~# reboot
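The manual steps above can be collected into a small script. This is a sketch, not a replacement for beadm: the function name print_be_clone_cmds and its arguments are hypothetical, it assumes the rpool/ROOT layout shown in this post, and it only prints the commands rather than running them. The libbe properties are left out for brevity.

```shell
#!/bin/sh
# Print the commands that clone a boot-environment snapshot into a new
# root dataset and point GRUB at it. Nothing is executed here.
# Usage: print_be_clone_cmds <source BE> <snapshot name> <new BE>
print_be_clone_cmds() {
    src="rpool/ROOT/$1"
    snap="$2"
    new="rpool/ROOT/$3"
    echo "zfs clone ${src}@${snap} ${new}"
    echo "zfs set mountpoint=/ ${new}"
    echo "zfs set canmount=noauto ${new}"
    # Same substitution as the session above; it rewrites every
    # occurrence of the old BE name in menu.lst.
    echo "sed -i 's/$1/$3/g' /rpool/boot/grub/menu.lst"
}

print_be_clone_cmds opensolaris-1 2009-01-31-01:53:47 opensolaris-2
```

Printing first and running second keeps the dangerous part (the GRUB edit) in plain view before anything changes.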
ZFS is cool. Maybe someday when btrfs grows up, Linux can do this for its root filesystems too.

Notes:
  • The shell sessions in this post are reenactments of a sort, because I didn't think to blog about this until I was all done.

  • pkg image-update will also set up snapshots/clones for easy backout, which is why rpool/ROOT/opensolaris-1 existed.

  • Network connection was 2 ixgbe NICs to 5 Linux boxes with 2 e1000e NICs each.

  • I used sed on the GRUB config to save space. Don't try it at home unless you really understand what it's doing.

  • Hooray for /usr/gnu and having it at the front of $PATH

  • pfexec su - for root access isn't really best practice, but it sure saves a lot of typing.
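
On the sed note above: one way to see what the substitution will do is to run it on a copy and look at the diff before touching the real file. A minimal sketch, using a made-up menu.lst fragment in a temporary directory; the sample contents are illustrative only, not copied from the real system:

```shell
#!/bin/sh
# Work on a throwaway copy so nothing under /rpool/boot is modified.
tmp=$(mktemp -d)
cat > "$tmp/menu.lst" <<'EOF'
title OpenSolaris 2008.11
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/opensolaris-1
EOF

# Same substitution as in the post, written to a new file instead of -i.
sed 's/opensolaris-1/opensolaris-2/g' "$tmp/menu.lst" > "$tmp/menu.lst.new"

# Show exactly which lines would change (diff exits 1 when files differ).
diff "$tmp/menu.lst" "$tmp/menu.lst.new" || true
```

If the diff shows only the lines you expected to change, the same expression is a more reasonable bet with -i on the real file.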

2 comments:

Ivan said...

Besides doing the snapshot, clone, and set properties by hand, you could use the beadm command:

# beadm create opensolaris-2

( by default it will use your current boot environment, or use -e to select a different BE as the source )

And thanks for the write up on this, I was chatting with someone earlier trying to get beadm to work with their own customized opensolaris distro, and this was everything needed to get up and running.

Ivan R.

Albert P. Tobey said...

That's a much better way to do it. Thanks.