ZFS replace drive before it fails

The main virtual machine server was seeing hardware errors, and ZFS scrubs were repairing a non-zero number of bytes. One of the (cheap) Hitachi Ultrastar 2TB disks was starting to fail after only 1.5 years. smartctl was showing 53 recent errors.

  pool: vmstorage
 state: ONLINE
  scan: scrub repaired 1.75M in 5h12m with 0 errors on Tue Oct  1 07:12:22 2019

        NAME                                            STATE     READ WRITE CKSUM
        vmstorage                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            ata-Hitachi_HUA723020ALA641_YFG31Y3A-part2  ONLINE       0     0     0
            ata-Hitachi_HUA723020ALA641_YFG4GJ8A-part2  ONLINE       0     0     0

/var/log/messages was showing entries like this, over and over:

Sep 15 03:03:42 dellt3600 smartd[19459]: Device: /dev/sdc [SAT], 831 Currently unreadable (pending) sectors
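
The pending-sector count that smartd logs can also be read on demand. This is a minimal sketch, assuming smartmontools is installed and the drive reports the usual Current_Pending_Sector attribute; the function name pending_sectors is made up for illustration.

```shell
# Filter: reads `smartctl -A` output on stdin and prints the RAW_VALUE
# (tenth column) of the Current_Pending_Sector attribute.
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $10 }'
}

# Usage on the real machine (needs root):
#   smartctl -A /dev/sdc | pending_sectors
#   smartctl -l error /dev/sdc    # the drive's own error log
```

A count that keeps growing between runs is a reasonable signal to replace the drive before ZFS marks it as faulted.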

Looking in /dev/disk/by-id/, the failing drive (/dev/sdc in the log above) has a serial number of YFG4GJ8A.
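
One way to map a kernel device name such as sdc to its by-id names is to follow the symlinks. This is a sketch only: the function name disk_ids_for and the BY_ID_DIR override are my own additions (the override exists just so the function can be tried against a scratch directory).

```shell
# Print every name in /dev/disk/by-id that resolves to the given
# kernel device (e.g. "sdc"). BY_ID_DIR is a hypothetical override.
disk_ids_for() {
    dev="$1"
    dir="${BY_ID_DIR:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue            # only look at symlinks
        target=$(readlink -f "$link")         # e.g. /dev/sdc
        if [ "$(basename "$target")" = "$dev" ]; then
            basename "$link"
        fi
    done
}

# Usage:
#   disk_ids_for sdc
```

The serial number is the last part of each ata-... name, before any -partN suffix.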

So the setup for the commands below became:

export DISK_GOOD=/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YFG31Y3A-part2
export DISK_BAD=/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YFG4GJ8A-part2
export DISK_REPLACE=/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YGJ0JSYA-part2

The command to remove the failing, but not yet failed, drive from the mirror:

zpool detach vmstorage $DISK_BAD

(At this point, I shut down the machine and swapped the disks, since there was only room for two 3.5″ HDDs.)

After reboot, the command to add the new disk into the mirror:

zpool attach vmstorage $DISK_GOOD $DISK_REPLACE

Resilvering 1.25TB took 4h56m.
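
Progress can be followed with zpool status while the resilver runs. The sketch below polls until it is done; it assumes the scan line contains the text "resilver in progress" while the resilver is running (as it does on the ZFS on Linux versions I have seen), and the function names are made up for illustration.

```shell
# Filter: reads `zpool status` output on stdin and succeeds (exit 0)
# if it reports a resilver that is still running.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Poll the pool every 5 minutes, then print the final status.
wait_for_resilver() {
    pool="$1"
    while zpool status "$pool" | resilver_in_progress; do
        sleep 300
    done
    zpool status "$pool"
}

# Usage:
#   wait_for_resilver vmstorage
```

The mirror only has redundancy again once the resilver has finished, so it is worth checking the final status for errors before trusting it.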

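Note: if you “pre-partition” your ZFS disks (like I do), then you also need the whole-disk (“root”) device path to run parted. Do this before the zpool attach, since the -part2 partition has to exist first: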

export DISK_REPLACE_ROOT=/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YGJ0JSYA

Use ‘unit s’ in parted to create the partitions with exactly the same sector counts as on the drive being replaced.
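
To read the sector layout off the surviving disk, parted's machine-readable output (-m) is easy to filter. This is a sketch under assumptions: the function names and the DISK_GOOD_ROOT variable are hypothetical, and the actual start and end sectors have to come from your own disk.

```shell
# Filter: reads `parted -m ... unit s print` output on stdin and prints
# "number start end size" (in sectors) for each partition line.
sector_table() {
    awk -F: '$1 ~ /^[0-9]+$/ { print $1, $2, $3, $4 }'
}

# Print the sector table of a whole-disk device.
print_sector_table() {
    parted -s -m "$1" unit s print | sector_table
}

# Usage (DISK_GOOD_ROOT is the by-id path of the good disk without -part2):
#   print_sector_table "$DISK_GOOD_ROOT"
#   print_sector_table "$DISK_REPLACE_ROOT"
```

After partitioning the new disk, running the function on both disks and comparing the two outputs confirms that the layouts match.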

Just recording the replacement drive: $42, HGST/Hitachi Ultrastar 7K3000 2TB 7200RPM Enterprise Grade SATA III. For the record, the drive arrived new, with 0 hours power-on time. Vendor was DBSKY.
