Manually Creating a ZFS Pool From Mismatched Drives

I recently suffered a drive failure on my main ZFS array. Naturally I wasn’t as prepared for this event as I should have been, so I wasn’t 100% sure how to go about replacing the drive. I decided to press my old rack server into action to get some practice in before working on the real thing. I dug out some old hard drives, but they were mismatched in size. This will work, but the Proxmox interface can’t be used because it expects the far more sensible situation where all drives are matched.

Finding out what drives you have

To create an array from mismatched drives you need to do it manually. Start by using fdisk to find out what drives your system has installed.

fdisk -l

Disk /dev/sda: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: ST2000DL003-9VT1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 77070F1F-38F8-41D4-B39B-D921AD600CFA

Device       Start        End    Sectors  Size Type
/dev/sda1       34       2047       2014 1007K BIOS boot
/dev/sda2     2048    1050623    1048576  512M EFI System
/dev/sda3  1050624 3907029134 3905978511  1.8T Linux LVM

Disk /dev/mapper/pve-swap: 8 GiB, 8589934592 bytes, 16777216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/pve-root: 96 GiB, 103079215104 bytes, 201326592 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/sdb: 149.01 GiB, 160000000000 bytes, 312500000 sectors
Disk model: ST3160812AS     
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 9B6002BF-9EA1-674B-8D6E-B953AAF16C6F

Device         Start       End   Sectors  Size Type
/dev/sdb1       2048 312481791 312479744  149G Solaris /usr & Apple ZFS
/dev/sdb9  312481792 312498175     16384    8M Solaris reserved 1

Disk /dev/sdc: 298.09 GiB, 320072933376 bytes, 625142448 sectors
Disk model: WDC WD3200AAKX-0
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F891A5A3-B02B-084D-ACC8-8BD998ABF9CF

Device         Start       End   Sectors   Size Type
/dev/sdc1       2048 625125375 625123328 298.1G Solaris /usr & Apple ZFS
/dev/sdc9  625125376 625141759     16384     8M Solaris reserved 1
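The fdisk listing gets long on a system with more drives. As an optional aid, here is a minimal sketch of a filter that reduces it to one line per physical disk. The function name summarise_disks is my own invention, and it assumes the "Disk /dev/...: size, N bytes, M sectors" line format shown above.

```sh
#!/bin/sh
# Reduce `fdisk -l` output to one line per physical disk: device name and
# size in bytes. Device-mapper volumes (/dev/mapper/...) are skipped.
# summarise_disks is a hypothetical helper name, not part of fdisk.
summarise_disks() {
    awk '
        /^Disk \/dev\// && $2 !~ /^\/dev\/mapper\// {
            name = $2
            sub(/:$/, "", name)          # strip the trailing colon
            for (i = 3; i <= NF; i++)    # the byte count precedes the word "bytes,"
                if ($i == "bytes,") print name, $(i - 1)
        }
    '
}

# Usage (as root): fdisk -l | summarise_disks
```

Fed the listing from this system it would print sda, sdb and sdc with their byte counts and leave out the pve- volumes.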

Looking at the above output, you can see that there are three physical drives in this system. The first, sda, is the OS drive so I won’t be touching that. You can safely ignore the pve- devices as they are LVM volumes that belong to Proxmox. The two drives I’ll be using for the array are sdb and sdc. As can be seen, sdb is 160GB and sdc is 320GB, so these are mismatched drives. Now run the following command, which will show the drives by ID.

ls -l /dev/disk/by-id/

total 0
lrwxrwxrwx 1 root root  9 Jul 29 09:43 ata-ST2000DL003-9VT166_5YD1KN65 -> ../../sda
lrwxrwxrwx 1 root root 10 Jul 29 09:43 ata-ST2000DL003-9VT166_5YD1KN65-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jul 29 09:43 ata-ST2000DL003-9VT166_5YD1KN65-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Jul 29 09:43 ata-ST2000DL003-9VT166_5YD1KN65-part3 -> ../../sda3
lrwxrwxrwx 1 root root  9 Jul 29 10:09 ata-ST3160812AS_5LS9ADML -> ../../sdb
lrwxrwxrwx 1 root root 10 Jul 29 10:09 ata-ST3160812AS_5LS9ADML-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Jul 29 10:09 ata-ST3160812AS_5LS9ADML-part9 -> ../../sdb9
lrwxrwxrwx 1 root root  9 Jul 29 10:13 ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860 -> ../../sdc
lrwxrwxrwx 1 root root 10 Jul 29 10:13 ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Jul 29 10:13 ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860-part9 -> ../../sdc9

... snip ...

These are the drive IDs that you’ll be using to create the array. You can create an array using names such as /dev/sdb, but it’s a bad idea as the OS might change which disk that name points to (especially if you swap disks around). Using the ID given here means you will always know exactly which disk you are interacting with. You should also write this ID on the end of the disk (or the carrier / caddy) so you know which disk to pull out of an array. I’ve snipped the content here to remove virtual devices and show only the physical disks.
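If you want to check which kernel name an ID currently points to, the symlinks can be resolved with readlink. The following is only a sketch: list_disk_ids is a made-up helper name, and it assumes your disks show up with the ata- prefix as mine do (NVMe or SAS drives use other prefixes and would need the glob changed).

```sh
#!/bin/sh
# Print each whole-disk ID in a by-id style directory with the kernel device
# it currently resolves to. Partition links (ending in -partN) are skipped.
# list_disk_ids is a hypothetical helper; the ata- prefix is an assumption.
list_disk_ids() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/ata-*; do
        [ -L "$link" ] || continue                  # glob matched nothing
        case $link in *-part[0-9]*) continue ;; esac
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

list_disk_ids
```

Running it again after swapping drives around is a quick way to see that the sdX names have moved while the IDs have not.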

Wiping the drives

Before you can create a ZFS pool on the drives you need to wipe them. If you don’t, you’ll almost certainly get an error message from the zpool create command, typically complaining that the drives already contain a file system or are in use by something else. For example:

zpool create -f -o 'ashift=12' arrayz1 tank /dev/disk/by-id/ata-ST3160812AS_5LS9ADML /dev/disk/by-id/ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860
cannot create 'arrayz1': one or more vdevs refer to the same device, or one of
the devices is part of an active md or lvm device

To wipe a disk use the wipefs command; this is what Proxmox uses if you select wipe through the interface. Note that this doesn’t clear the data, it simply removes the filesystem and partition table signatures. For example:

wipefs -a /dev/disk/by-id/ata-ST3160812AS_5LS9ADML

/dev/disk/by-id/ata-ST3160812AS_5LS9ADML: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/disk/by-id/ata-ST3160812AS_5LS9ADML: 8 bytes were erased at offset 0x2540be3e00 (gpt): 45 46 49 20 50 41 52 54
/dev/disk/by-id/ata-ST3160812AS_5LS9ADML: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/disk/by-id/ata-ST3160812AS_5LS9ADML: calling ioctl to re-read partition table: Success

As can be seen, the wipefs command has erased several key bytes on the disk, which removes the existing partition table and filesystems, or at least renders them inaccessible. Repeat the command for the other drive. You can confirm they are gone by running the fdisk -l command again. The wiped drives will show no partitions.
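Since wipefs -a is destructive, I find it reassuring to put a small guard in front of it. This is a sketch rather than anything Proxmox or wipefs provides: wipe_disks and the DRY_RUN variable are names I made up. It refuses anything that isn’t a whole-disk /dev/disk/by-id/ path and only prints the command unless DRY_RUN=0 is set.

```sh
#!/bin/sh
# Guarded wrapper around `wipefs -a`. With DRY_RUN=1 (the default here) it
# only prints the command it would run; set DRY_RUN=0 to actually wipe.
# wipe_disks and DRY_RUN are hypothetical names used for illustration.
wipe_disks() {
    for disk in "$@"; do
        case $disk in
            /dev/disk/by-id/*-part[0-9]*)
                echo "refusing $disk: this is a partition, not a whole disk" >&2
                return 1 ;;
            /dev/disk/by-id/*)
                if [ "${DRY_RUN:-1}" = 1 ]; then
                    echo "would run: wipefs -a $disk"
                else
                    wipefs -a "$disk"
                fi ;;
            *)
                echo "refusing $disk: pass a /dev/disk/by-id/ path" >&2
                return 1 ;;
        esac
    done
}

# Usage: DRY_RUN=0 wipe_disks /dev/disk/by-id/ata-ST3160812AS_5LS9ADML
```

Insisting on by-id paths means a stray /dev/sda can never be passed by accident, which matters when the OS disk sits right next to the ones being wiped.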

Creating the pool

Now try running the following command without the -f flag.

zpool create -o 'ashift=12' tank raidz1 /dev/disk/by-id/ata-ST3160812AS_5LS9ADML /dev/disk/by-id/ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860

invalid vdev specification
use '-f' to override the following errors:
raidz contains devices of different sizes

ZFS complains because the drives are mismatched in size, and it tells you to use the -f flag to create the array anyway. This pool will be called “tank”, and although it only contains two disks I have specified that it be created as a raidz1 array. For now it essentially operates like a mirror, with each drive treated as if it were the size of the smallest one. Choosing raidz leaves the door open to expanding the array later, provided your OpenZFS release supports raidz expansion. Running the command a second time, but with the -f flag, you should see no output, which indicates that the command succeeded.

zpool create -f -o 'ashift=12' tank raidz1 /dev/disk/by-id/ata-ST3160812AS_5LS9ADML /dev/disk/by-id/ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860

Success can be confirmed by running zpool status as shown here.

zpool status

  pool: tank
 state: ONLINE

        NAME                                           STATE     READ WRITE CKSUM
        tank                                           ONLINE       0     0     0
          raidz1-0                                     ONLINE       0     0     0
            ata-ST3160812AS_5LS9ADML                   ONLINE       0     0     0
            ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860  ONLINE       0     0     0

errors: No known data errors
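It’s worth knowing roughly how much space a mismatched raidz1 gives you. A raidz vdev treats every member as being the size of its smallest disk, and one disk’s worth of space goes to parity. The sketch below is just that arithmetic, under the assumption that metadata, padding and the small partition ZFS reserves on each disk are ignored; raidz1_usable_bytes is a name I made up.

```sh
#!/bin/sh
# Rough usable capacity of a raidz1 vdev: every member counts as the size of
# the smallest one, and one member's worth of space is used for parity.
# Ignores metadata, padding and the small reserved partition on each disk.
# raidz1_usable_bytes is a hypothetical helper name.
raidz1_usable_bytes() {
    min=$1
    for size in "$@"; do
        [ "$size" -lt "$min" ] && min=$size
    done
    echo $(( ($# - 1) * min ))
}

# Byte sizes of sdb and sdc as reported by fdisk -l
raidz1_usable_bytes 160000000000 320072933376   # prints 160000000000
```

So this pool holds roughly 160GB, and the extra half of the 320GB drive goes unused for as long as the smaller drive is part of the array.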

Creating a dataset

This is just for demonstration purposes; you will probably want to organize your datasets into logical chunks. To create a dataset called data, issue the following command:

zfs create tank/data

There will be no output, which indicates that the command was successful. If you run an ls command under /tank you’ll see a new directory, data, which is the root of the dataset.
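If you do split things into several datasets, a short loop saves some typing. This is a sketch under assumptions: create_datasets and DRY_RUN are made-up names, and media and backups are example dataset names rather than anything on my system. It only prints the zfs commands unless DRY_RUN=0 is set.

```sh
#!/bin/sh
# Create one dataset per name under a pool. With DRY_RUN=1 (the default here)
# the zfs commands are only printed, not run.
# create_datasets, DRY_RUN and the dataset names are hypothetical.
create_datasets() {
    pool=$1
    shift
    for name in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "would run: zfs create $pool/$name"
        else
            zfs create "$pool/$name"
        fi
    done
}

# Usage: DRY_RUN=0 create_datasets tank media backups
```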

That’s all there is to it. The array is created and we went the extra step of adding a dataset.