I recently suffered a drive failure on my main ZFS array. Naturally I wasn’t as prepared for this event as I should have been, so I wasn’t 100% sure how to go about replacing the drive. I decided to press my old rack server into action to get some practice in before working on the real thing. I dug out some old hard drives, but they were mismatched in terms of size. This will work, but the Proxmox interface can’t be used: it expects the far more sensible situation where all the drives are matched.
Finding out what drives you have
To create an array from mismatched drives you need to do it manually. Start by using fdisk to find out what drives your system has installed.
fdisk -l
Disk /dev/sda: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: ST2000DL003-9VT1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 77070F1F-38F8-41D4-B39B-D921AD600CFA

Device       Start        End    Sectors  Size Type
/dev/sda1       34       2047       2014 1007K BIOS boot
/dev/sda2     2048    1050623    1048576  512M EFI System
/dev/sda3  1050624 3907029134 3905978511  1.8T Linux LVM

Disk /dev/mapper/pve-swap: 8 GiB, 8589934592 bytes, 16777216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/pve-root: 96 GiB, 103079215104 bytes, 201326592 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/sdb: 149.01 GiB, 160000000000 bytes, 312500000 sectors
Disk model: ST3160812AS
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 9B6002BF-9EA1-674B-8D6E-B953AAF16C6F

Device         Start       End   Sectors  Size Type
/dev/sdb1       2048 312481791 312479744  149G Solaris /usr & Apple ZFS
/dev/sdb9  312481792 312498175     16384    8M Solaris reserved 1

Disk /dev/sdc: 298.09 GiB, 320072933376 bytes, 625142448 sectors
Disk model: WDC WD3200AAKX-0
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F891A5A3-B02B-084D-ACC8-8BD998ABF9CF

Device         Start       End   Sectors   Size Type
/dev/sdc1       2048 625125375 625123328 298.1G Solaris /usr & Apple ZFS
/dev/sdc9  625125376 625141759     16384     8M Solaris reserved 1
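As an aside, if the fdisk output is more detail than you need, lsblk (part of util-linux, so it should be available on any Proxmox install) gives a more compact view of the same information:

# Compact view of block devices: name, size, model and serial number
lsblk -o NAME,SIZE,MODEL,SERIAL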
Looking at the above output you can see that there are three physical drives in this system. The first, sda, is the OS drive so I won’t be touching that. You can safely ignore the pve- devices as they are part of Proxmox. The two drives I’ll be using for the array are sdb and sdc; sdb is 160GB and sdc is 320GB, so these are mismatched drives. Now run the following command, which will show the drives by ID.
ls -l /dev/disk/by-id/
total 0
lrwxrwxrwx 1 root root  9 Jul 29 09:43 ata-ST2000DL003-9VT166_5YD1KN65 -> ../../sda
lrwxrwxrwx 1 root root 10 Jul 29 09:43 ata-ST2000DL003-9VT166_5YD1KN65-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jul 29 09:43 ata-ST2000DL003-9VT166_5YD1KN65-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Jul 29 09:43 ata-ST2000DL003-9VT166_5YD1KN65-part3 -> ../../sda3
lrwxrwxrwx 1 root root  9 Jul 29 10:09 ata-ST3160812AS_5LS9ADML -> ../../sdb
lrwxrwxrwx 1 root root 10 Jul 29 10:09 ata-ST3160812AS_5LS9ADML-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Jul 29 10:09 ata-ST3160812AS_5LS9ADML-part9 -> ../../sdb9
lrwxrwxrwx 1 root root  9 Jul 29 10:13 ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860 -> ../../sdc
lrwxrwxrwx 1 root root 10 Jul 29 10:13 ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Jul 29 10:13 ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860-part9 -> ../../sdc9
... snip ...
These are the drive IDs that you’ll be using to create the array. You can create an array using names such as /dev/sdb, but it’s a bad idea as the OS might change which disk that name points to (especially if you swap disks around). Using the IDs given here means you will always know exactly which disk you are interacting with. You should also write the ID on the end of each disk (or its carrier / caddy) so you know which disk to pull out of the array. I’ve snipped the output here to remove virtual devices and show only the physical disks.
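If you want to double-check which serial belongs to which physical drive before labelling them, smartctl (from the smartmontools package, assuming you have it installed) will read the identity straight off the disk:

# Print the drive's identity section, including model and serial number
smartctl -i /dev/sdb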
Wiping the drives
Before you can create a ZFS pool on the drives you need to wipe them. If you don’t, you’ll almost certainly get an error message from the zpool create command telling you that the drives are already in use or contain an existing filesystem. For example:
zpool create -f -o 'ashift=12' arrayz1 raidz1 /dev/disk/by-id/ata-ST3160812AS_5LS9ADML /dev/disk/by-id/ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860
cannot create 'arrayz1': one or more vdevs refer to the same device, or one of the devices is part of an active md or lvm device
To wipe a disk use the wipefs command; this is what Proxmox uses if you select wipe through the interface. Note that this doesn’t clear the data, it simply removes the filesystem signatures. For example:
wipefs -a /dev/disk/by-id/ata-ST3160812AS_5LS9ADML
/dev/disk/by-id/ata-ST3160812AS_5LS9ADML: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/disk/by-id/ata-ST3160812AS_5LS9ADML: 8 bytes were erased at offset 0x2540be3e00 (gpt): 45 46 49 20 50 41 52 54
/dev/disk/by-id/ata-ST3160812AS_5LS9ADML: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/disk/by-id/ata-ST3160812AS_5LS9ADML: calling ioctl to re-read partition table: Success
As can be seen, the wipefs command has erased several key bytes on the disk, which removes the existing filesystem signatures, or at least renders them inaccessible. You can confirm they are gone by running the fdisk -l command again; the wiped drive will show no partitions.
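Remember that both drives need wiping before the pool can be created; the same command works for the second disk using its ID:

wipefs -a /dev/disk/by-id/ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860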
Creating the Pool
Now try running the following command without the -f flag.
zpool create -o 'ashift=12' tank raidz1 /dev/disk/by-id/ata-ST3160812AS_5LS9ADML /dev/disk/by-id/ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860
invalid vdev specification
use '-f' to override the following errors:
raidz contains devices of different sizes
ZFS complains because the drives are mismatched in size, and it tells you to use the -f flag to create the array anyway. This pool will be called “tank” and, although it only contains two disks, I have specified that it be created as a raidz1 array. This allows the array to be expanded later, but for now it will essentially operate like a mirror. Note that ZFS sizes a raidz vdev to its smallest member, so the extra capacity of the 320GB drive goes unused. Running the command a second time with the -f flag, you should see no output, indicating that the command succeeded.
zpool create -f -o 'ashift=12' tank raidz1 /dev/disk/by-id/ata-ST3160812AS_5LS9ADML /dev/disk/by-id/ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860
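As an aside, a two-way mirror is the more conventional layout for two disks. This is just a sketch of the equivalent command, not what I ran; the -f flag is still needed because the drive sizes differ:

# Hypothetical alternative: create the same pool as a two-way mirror.
# -f is still required because the two drives are different sizes.
zpool create -f -o 'ashift=12' tank mirror /dev/disk/by-id/ata-ST3160812AS_5LS9ADML /dev/disk/by-id/ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860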
Success can be confirmed by running zpool status as shown here.
zpool status
  pool: tank
 state: ONLINE
config:

        NAME                                           STATE     READ WRITE CKSUM
        tank                                           ONLINE       0     0     0
          raidz1-0                                     ONLINE       0     0     0
            ata-ST3160812AS_5LS9ADML                   ONLINE       0     0     0
            ata-WDC_WD3200AAKX-001CA0_WD-WCAYUL400860  ONLINE       0     0     0

errors: No known data errors
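It’s also worth glancing at zpool list. The exact figures will depend on your disks, but because every member of a raidz vdev is sized to the smallest device, you should see a pool size based on the 160GB drive rather than the sum of both:

# Show the pool's size and allocation; SIZE is driven by the smallest disk
zpool list tank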
Create a dataset
This is just for demonstration purposes; you will probably want to organize your datasets into logical chunks. To create a dataset called data, issue the following command:
zfs create tank/data
There will be no output; as with zpool create, silence indicates that the command was successful. If you run an ls command under /tank you’ll see a new directory, which is the root of the dataset.
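You can also confirm the dataset with zfs list, and optionally set per-dataset properties; enabling lz4 compression, for example, is a common and cheap win:

# List all datasets and their mountpoints
zfs list
# Optional: enable lz4 compression on the new dataset
zfs set compression=lz4 tank/data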
That’s all there is to it. The array is created, and we took the extra step of adding a dataset.