atobey@opensolaris01:~# wget http://www.opensolaris.org/os/project/iser/SUNWiscsitr-bins-121108.i386.tar.gz
atobey@opensolaris01:~# tar -xzvf SUNWiscsitr-bins-121108.i386.tar.gz
atobey@opensolaris01:~# pfexec su -
# Hmm, good chance this is gonna end badly ...
root@opensolaris01:~# zfs snapshot rpool/ROOT/opensolaris-1@2009-01-31-01:53:47
# Ok, now that I have a backout plan, continue ...
root@opensolaris01:~# pkgadd -d . SUNWiscsidmr
root@opensolaris01:~# pkgadd -d . SUNWiscsidmu
root@opensolaris01:~# pkgadd -d . SUNWiscsir
root@opensolaris01:~# pkgadd -d . SUNWiscsiu
root@opensolaris01:~# rm -f /etc/iscsi/* # only known way to truly clean up all iSCSI configuration
root@opensolaris01:~# reboot
# get some coffee, reconnect ...
atobey@opensolaris01:~# pfexec su -
root@opensolaris01:~# iscsiadm modify discovery --static disable
root@opensolaris01:~# iscsiadm modify initiator-node -N iqn.2008-12.org.tobert.opensolaris01
root@opensolaris01:~# iscsiadm modify initiator-node --node-alias opensolaris01
root@opensolaris01:~# iscsiadm modify initiator-node --configured-sessions 2 # required for MPxIO
root@opensolaris01:~# iscsiadm modify initiator-node --authentication chap
root@opensolaris01:~# iscsiadm modify initiator-node --CHAP-name torgiscsi
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box01:lio0,192.168.1.11:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box01:lio0,192.168.2.11:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box02:lio0,192.168.1.12:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box02:lio0,192.168.2.12:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box03:lio0,192.168.1.13:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box03:lio0,192.168.2.13:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box04:lio0,192.168.1.14:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box04:lio0,192.168.2.14:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box05:lio0,192.168.1.15:3260
root@opensolaris01:~# iscsiadm add static-config iqn.2009-01.org.tobert.box05:lio0,192.168.2.15:3260
root@opensolaris01:~# iscsiadm modify initiator-node --CHAP-secret
# enter password
root@opensolaris01:~# iscsiadm modify discovery --static enable
root@opensolaris01:~# zpool import -o cachefile=/etc/zfs/tank.cache tank
root@opensolaris01:~# zpool list
NAME   SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
tank   26.3T  800G   25.5T  2%   ONLINE  -
rpool  68G    20.8G  47.2G  30%  ONLINE  -
root@opensolaris01:~# zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          c0t234d0  ONLINE       0     0     0
          c0t235d0  ONLINE       0     0     0
          c0t230d0  ONLINE       0     0     0
          c0t231d0  ONLINE       0     0     0
          c0t226d0  ONLINE       0     0     0
          c0t227d0  ONLINE       0     0     0
          c0t222d0  ONLINE       0     0     0
          c0t223d0  ONLINE       0     0     0
          c0t216d0  ONLINE       0     0     0
          c0t217d0  ONLINE       0     0     0

errors: No known data errors

The devices are just regular iSCSI devices without MPxIO since I didn't configure it for this test. I fired off an instance of ffsb set to run for a day over 800GB of data with a read-heavy profile. This test in various forms has run perfectly for a couple months now. The initiator crashed the kernel sometime during the night. This is what I saw on my serial console:
panic[cpu6]/thread=ffffff007b923c80: assertion failed: 0, file: ../../com[...]
panic[cpu2]/thread=ffffff11b8315460: BAD TRAP: type=e [...] addr=0 occurred in module "unix" due to a NULL pointer dereference

ffffff007b923a80 genunix:assfail+7e ()
ffffff007b923ad0 idm:idm_task_abort_one+1d7 ()
ffffff007b923b20 idm:idm_task_abort+9a ()
ffffff007b923b70 idm:idm_update_state+1fb ()
ffffff007b923ba0 idm:idm_state_s8_cleanup+85 ()
ffffff007b923be0 idm:idm_conn_event_handler+144 ()
ffffff007b923c60 genunix:taskq_thread+193 ()
ffffff007b923c70 unix:thread_start+8 ()

syncing file systems...
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
done
dump aborted: please record the above information!
rebooting...

Clearly the idm module still needs some polishing, and I'm sure it'll get it. My story so far is not here to disparage the idm or COMSTAR work; it's here as a backdrop to why I dig OpenSolaris more and more the longer I use it. Backing out patches always sucks, on nearly every OS, and with nearly every package manager. With ZFS, life sucks a lot less when you find yourself backed into a corner:
root@opensolaris01:~# zfs list -t all
NAME                                          USED   AVAIL  REFER  MOUNTPOINT
rpool                                         52.8G  14.1G  75K    /rpool
rpool/ROOT                                    4.08G  14.1G  18K    legacy
rpool/ROOT/opensolaris                        106M   14.1G  2.31G  -
rpool/ROOT/opensolaris-1                      3.90G  14.1G  3.66G  /
rpool/ROOT/opensolaris-1@install              71.0M  -      2.21G  -
rpool/ROOT/opensolaris-1@2009-01-13-23:06:36  40.0M  -      2.26G  -
rpool/ROOT/opensolaris-1@2009-01-31-01:53:47  78.4M  -      3.66G  -
rpool/dump                                    16.0G  14.1G  16.0G  -
rpool/export                                  761M   14.1G  19K    /export
rpool/export/home                             761M   14.1G  32K    /export/home
rpool/export/home/atobey                      761M   14.1G  761M   /export/home/atobey
rpool/swap                                    32.0G  46.1G  16K    -
root@opensolaris01:~# zfs clone rpool/ROOT/opensolaris-1@2009-01-31-01:53:47 rpool/ROOT/opensolaris-2
root@opensolaris01:~# zfs list -t all
NAME                                          USED   AVAIL  REFER  MOUNTPOINT
rpool                                         52.8G  14.1G  75K    /rpool
rpool/ROOT                                    4.08G  14.1G  18K    legacy
rpool/ROOT/opensolaris                        106M   14.1G  2.31G  -
rpool/ROOT/opensolaris-1                      3.90G  14.1G  3.66G  /
rpool/ROOT/opensolaris-1@install              71.0M  -      2.21G  -
rpool/ROOT/opensolaris-1@2009-01-13-23:06:36  40.0M  -      2.26G  -
rpool/ROOT/opensolaris-1@2009-01-31-01:53:47  78.4M  -      3.66G  -
rpool/ROOT/opensolaris-2                      74.8M  14.1G  3.66G  -
rpool/dump                                    16.0G  14.1G  16.0G  -
rpool/export                                  761M   14.1G  19K    /export
rpool/export/home                             761M   14.1G  32K    /export/home
rpool/export/home/atobey                      761M   14.1G  761M   /export/home/atobey
rpool/swap                                    32.0G  46.1G  16K    -
root@opensolaris01:~# zfs set mountpoint=/ rpool/ROOT/opensolaris-2
root@opensolaris01:~# zfs set canmount=noauto rpool/ROOT/opensolaris-2
# just because they're there on the other clones ...
root@opensolaris01:~# zfs set org.opensolaris.libbe:policy=static rpool/ROOT/opensolaris-2
# ran uuidgen on a linux box
root@opensolaris01:~# zfs set org.opensolaris.libbe:uuid=35f4532d-e452-453b-86a9-1de8e1b296f9 rpool/ROOT/opensolaris-2
root@opensolaris01:~# sed -i 's/opensolaris-1/opensolaris-2/g' /rpool/boot/grub/menu.lst
root@opensolaris01:~# reboot

ZFS is cool. Maybe someday when btrfs grows up, Linux can do this for its root filesystems too.
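
The backout steps above follow a fixed pattern, so they can be sketched as a dry run that only prints the commands instead of running them. This is a rough sketch, not a tested tool: the function name backout_plan and its three arguments are made up for illustration, and the dataset names and UUID are simply the ones from this post. Note that zfs set takes the dataset as its last operand, so every property line names the new boot environment.

```shell
#!/bin/sh
# Dry run: print the commands that clone a root snapshot into a new boot
# environment. Nothing here touches ZFS or GRUB; it only echoes text.
# backout_plan is a hypothetical helper, not part of any OpenSolaris tool.
backout_plan() {
    snap=$1      # snapshot to fall back to (dataset@name)
    new_be=$2    # dataset name for the new boot environment
    uuid=$3      # any freshly generated UUID

    old_be=${snap%@*}    # dataset that the snapshot belongs to

    echo "zfs clone $snap $new_be"
    echo "zfs set mountpoint=/ $new_be"
    echo "zfs set canmount=noauto $new_be"
    echo "zfs set org.opensolaris.libbe:policy=static $new_be"
    echo "zfs set org.opensolaris.libbe:uuid=$uuid $new_be"
    # Point GRUB at the new dataset by swapping the last path component.
    echo "sed -i 's/${old_be##*/}/${new_be##*/}/g' /rpool/boot/grub/menu.lst"
}

plan=$(backout_plan \
    rpool/ROOT/opensolaris-1@2009-01-31-01:53:47 \
    rpool/ROOT/opensolaris-2 \
    35f4532d-e452-453b-86a9-1de8e1b296f9)
printf '%s\n' "$plan"
```

Reading the printed plan before typing anything as root is the whole point; the commands themselves are the same ones shown in the session above.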
Notes:
- The shell sessions in this post are reenactments of a sort, because I didn't think to blog about this until I was all done.
- pkg image-update will also set up snapshots/clones for easy backout, which is why rpool/ROOT/opensolaris-1 existed.
- The network connection was 2 ixgbe interfaces to 5 Linux boxes with 2 e1000e interfaces each.
- I used sed on the GRUB config to save space. Don't try it at home unless you really understand what it's doing.
- Hooray for /usr/gnu and having it at the front of $PATH
- pfexec su - for root access isn't really best practice, but it sure saves a lot of typing.
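
To make the warning about sed on the GRUB config concrete, here is a small sketch that previews the same substitution on a throwaway file before anything is edited in place. The menu contents are invented for the example and are not my real /rpool/boot/grub/menu.lst; the opensolaris-10 entries are hypothetical and only there to show how the unanchored pattern also rewrites names that merely start with opensolaris-1.

```shell
#!/bin/sh
# Preview a global sed substitution on a scratch copy instead of using -i.
# The sample below is a made-up stand-in for a GRUB menu.lst.
menu=$(mktemp)
cat > "$menu" <<'EOF'
title opensolaris
bootfs rpool/ROOT/opensolaris
title opensolaris-1
bootfs rpool/ROOT/opensolaris-1
title opensolaris-10
bootfs rpool/ROOT/opensolaris-10
EOF

# Same expression as in the post, but writing to stdout rather than in place.
preview=$(sed 's/opensolaris-1/opensolaris-2/g' "$menu")

# Show exactly which lines would change (diff exits non-zero on differences).
printf '%s\n' "$preview" | diff "$menu" - || true

# Count the lines that now mention opensolaris-2, including the
# opensolaris-10 entries that were turned into opensolaris-20.
changed=$(printf '%s\n' "$preview" | grep -c 'opensolaris-2')
echo "lines rewritten: $changed"

rm -f "$menu"
```

If the diff shows more lines than expected, tighten the pattern (for example by matching the whole bootfs line) before reaching for -i.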