Monday, March 24, 2014

Mirrored FAT32 EFI Boot Partitions

Most modern hardware ships with UEFI firmware and boots in EFI mode automatically when it finds a disk set up for EFI booting. When I recently put Arch Linux on my desktop machine, I went with gummiboot to try something new, and I really like the results.

Now that I'm bringing my Xeon machine back online for some benchmarking, I want to use gummiboot there too. The twist is that while my desktop has a single SSD for root+data, this machine has a few more drives installed.


The root drives actually aren't visible in the photo. They're 2.5" SATA drives behind the side panel on the left side, directly behind the red and black SAS tray.

Since I'm installing a mirrored root on btrfs and using EFI, I want to have /boot mirrored to both drives so the system will still boot if one of them fails. The easy way would be to format both and rsync with a cron job. While that would catch 99% of updates, I figure since I'm using this machine for crazy disk stuff I might as well try mirroring the EFI filesystem.

Because of the way EFI works, FAT32 is pretty much the only decent choice for a filesystem on the EFI system partition (gdisk type code ef00). Since /boot only needs to hold the initramfs images, kernels, and the EFI configuration, I'll simply mount the partition on /boot as vfat.

[root@sysresccd /]# mount /dev/sdd1 /boot -t vfat
[root@sysresccd /]# mount -t vfat
/dev/sdd1 on /boot type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=cp437,iocharset=ascii,shortname=mixed,errors=remount-ro)
This is how it's set up on my desktop. But now I want mirroring. I tried mdraid, but even with metadata 1.0 the FAT filesystem can't be mounted directly. No big deal. Linux LVM is actually a frontend to a kernel disk abstraction called device mapper. It includes a mirror target, so all I had to do was spin up a quick mirrored LV and then dump the device mapper table.

#!/bin/bash
# create empty volume files
truncate --size 256M /a
truncate --size 256M /b
# attach them to loopback devices
deva=$(losetup --find)
losetup $deva /a
devb=$(losetup --find)
losetup $devb /b
# format as LVM physical volumes
pvcreate $deva
pvcreate $devb
# create the volume group
vgcreate test $deva $devb
# create a mirrored (RAID1) logical volume with an in-memory replication log
lvcreate --extents 1 --mirrors 1 --corelog --name lv_derp test
dmsetup table /dev/mapper/test-lv_derp
# 0 8192 mirror core 1 1024 2 253:0 0 253:1 0 1 handle_errors
Here's the breakdown of what the device mapper table says in English:

0 8192 mirror core 1 1024 2 253:0 0 253:1 0 1 handle_errors

Present sectors 0 through 8191 (start 0, length 8192, in 512-byte sectors) as a mirror with an in-core replication log that takes 1 argument, a region size of 1024 sectors, across 2 devices, 253:0 and 253:1, both starting at offset 0, with 1 feature argument, 'handle_errors'. The syntax is terse and the documentation is incomplete, so that's as far as I can tell. Device mapper can do a lot more than this, but this is all I need for now.
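
To make the field order easier to see, here is a minimal sketch that splits a mirror table line into labeled pieces. The function name describe_mirror_table is made up for this post, and the field layout is my reading of the table above (start, length, target, log type, log argument count, log arguments, device count, device/offset pairs, feature count, features), not something taken from official documentation.

```bash
#!/bin/bash
# Hypothetical helper (not part of dmsetup): label the fields of a
# device mapper "mirror" table line. Sizes are in 512-byte sectors.
describe_mirror_table() {
    local -a f
    read -r -a f <<< "$1"
    local start=${f[0]} length=${f[1]} target=${f[2]}
    local log_type=${f[3]} log_argc=${f[4]}
    [ "$target" = "mirror" ] || { echo "not a mirror table" >&2; return 1; }
    # Assumption: the first log argument is the region size.
    local region_size=${f[5]}
    # Skip over the log arguments to reach the device count.
    local i=$((5 + log_argc))
    local ndevs=${f[i]}
    i=$((i + 1))
    echo "start: $start"
    echo "length: $length sectors ($((length * 512 / 1024 / 1024)) MiB)"
    echo "log: $log_type, region size $region_size sectors"
    local n
    for ((n = 0; n < ndevs; n++)); do
        echo "leg $n: ${f[i]} at offset ${f[i+1]}"
        i=$((i + 2))
    done
    # What remains is a feature count followed by the features.
    local nfeat=${f[i]}
    echo "features: ${f[*]:i+1:nfeat}"
}

describe_mirror_table "0 8192 mirror core 1 1024 2 253:0 0 253:1 0 1 handle_errors"
```

The 8192-sector length works out to 4 MiB, which matches the single extent requested from lvcreate.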

With the knowledge of what an LVM-created device mapper table looks like, writing a script that sets up the mirror is pretty easy. I'll throw this into a systemd unit file when I'm done with the setup.

#!/bin/bash
die () { echo "$*" ; exit 1; }
# the WWN path to the disk + partition
# device mapper will NOT protect you if this is wrong!
primary="/dev/disk/by-id/wwn-0x5000c50026c51411-part1"
secondary="/dev/disk/by-id/wwn-0x5000c50026c5694e-part1"
# make sure the devices are available
test -L "$primary" || die "$primary does not seem to be available right now"
test -L "$secondary" || die "$secondary does not seem to be available right now"
# get the size of the primary partition in 512-byte sectors
size=$(blockdev --getsz "$primary")
set -x
echo "0 $size mirror core 1 1024 2 $primary 0 $secondary 0 1 handle_errors" | \
dmsetup create efiboot
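
One thing the script above takes on faith is that both partitions are the same size, since it only measures the primary. A minimal sketch of a more cautious variant follows. build_mirror_table is a hypothetical helper name, and capping the mirror at the smaller leg is my assumption about what is safe; it is not something the setup above was tested with.

```bash
#!/bin/bash
# Hypothetical helper: build a two-leg mirror table line.
# Arguments: size of leg A, size of leg B (both in 512-byte sectors),
# then the device path of leg A and the device path of leg B.
build_mirror_table() {
    local size_a=$1 size_b=$2 dev_a=$3 dev_b=$4
    # Assumption: the mirror should be no larger than its smaller leg.
    local size=$(( size_a < size_b ? size_a : size_b ))
    if [ "$size_a" -ne "$size_b" ]; then
        echo "warning: legs differ in size, using $size sectors" >&2
    fi
    echo "0 $size mirror core 1 1024 2 $dev_a 0 $dev_b 0 1 handle_errors"
}

# Usage, with $primary and $secondary set as in the script above:
#   build_mirror_table "$(blockdev --getsz "$primary")" \
#       "$(blockdev --getsz "$secondary")" \
#       "$primary" "$secondary" | dmsetup create efiboot
```
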
With the mirror script written, the rest is mostly by the book (the wiki), but first I'll go ahead and test that each partition is usable on its own.

[root@sysresccd /]# ls /boot
EFI initramfs-linux-fallback.img initramfs-linux.img vmlinuz-linux
[root@sysresccd /]# mv /boot /boot.bak
[root@sysresccd /]# mkdir /boot
[root@sysresccd /]# mkfs.vfat -n EFI -F32 /dev/mapper/efiboot
mkfs.fat 3.0.26 (2014-03-07)
unable to get drive geometry, using default 255/63
[root@sysresccd /]# mount /dev/mapper/efiboot /boot
[root@sysresccd /]# mount -t vfat
/dev/mapper/efiboot on /boot type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=cp437,iocharset=ascii,shortname=mixed,errors=remount-ro)
[root@sysresccd /]# rsync -a /boot.bak/ /boot
[root@sysresccd /]# ls /boot
EFI initramfs-linux-fallback.img initramfs-linux.img vmlinuz-linux
[root@sysresccd /]# umount /boot
[root@sysresccd /]# mount /dev/sdd1 /boot
mount: /dev/sdd1 is already mounted or /boot busy
[root@sysresccd /]# dmsetup remove /dev/mapper/efiboot
[root@sysresccd /]# mount /dev/sdd1 /boot
[root@sysresccd /]# ls /boot
EFI initramfs-linux-fallback.img initramfs-linux.img vmlinuz-linux
[root@sysresccd /]# umount /boot
[root@sysresccd /]# mount /dev/sde1 /boot
[root@sysresccd /]# ls /boot
EFI initramfs-linux-fallback.img initramfs-linux.img vmlinuz-linux
[root@sysresccd /]# umount /boot
[root@sysresccd /]# bash /root/dmirror.sh
+ echo '0 4194304 mirror core 1 1024 2 /dev/disk/by-id/wwn-0x5000c50026c51411-part1 0 /dev/disk/by-id/wwn-0x5000c50026c5694e-part1 0 1 handle_errors'
+ dmsetup create efiboot
[root@sysresccd /]# mount /dev/mapper/efiboot /boot
[root@sysresccd /]# ls /boot
EFI initramfs-linux-fallback.img initramfs-linux.img vmlinuz-linux
[root@sysresccd /]#
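
Listing the files on each drive shows both halves are mountable, but it doesn't prove they are byte-for-byte identical. A minimal sketch of a stricter check, assuming the filesystem is unmounted and the mirror has been removed with dmsetup first, and that the initial copy between the legs has finished (legs_match is a made-up name):

```bash
#!/bin/bash
# Hypothetical helper: succeed only if both mirror legs hold identical bytes.
# Works on any two readable paths, so it can be tried on plain files first.
legs_match() {
    local leg_a=$1 leg_b=$2
    # cmp -s prints nothing and reports the result through its exit status.
    cmp -s "$leg_a" "$leg_b"
}

# Usage on this machine, as root, with nothing mounted:
#   legs_match /dev/sdd1 /dev/sde1 && echo "legs are identical"
```

Reading two whole partitions takes a while, so this is a one-off sanity check rather than something to run on every boot.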
And with that, my workstation has redundant boot drives and can be set up to boot with gummiboot per the wiki instructions.

Edit: I may have spoken too soon. I'll update again when I figure out why gummiboot won't run.
