{"id":1049,"date":"2019-10-12T17:16:24","date_gmt":"2019-10-12T22:16:24","guid":{"rendered":"http:\/\/tiemensfamily.com\/TimOnCS\/?p=1049"},"modified":"2019-10-12T17:16:24","modified_gmt":"2019-10-12T22:16:24","slug":"zfs-replace-drive-before-it-fails","status":"publish","type":"post","link":"https:\/\/tiemensfamily.com\/timoncs\/2019\/10\/12\/zfs-replace-drive-before-it-fails\/","title":{"rendered":"ZFS replace drive before it fails"},"content":{"rendered":"<p>The main <a href=\"http:\/\/tiemensfamily.com\/TimOnCS\/2018\/04\/05\/virtual-machine-server\/\">Virtual Machine Server<\/a> was seeing hardware failures and ZFS &#8220;scrub not zero bytes&#8221;.  One of the (cheap) Hitachi Ultrastar 2TB disks was starting to fail, after only 1.5 years.  Smartctl was showing 53 recent errors.<\/p>\n<pre>\n  pool: vmstorage\n state: ONLINE\n  scan: scrub repaired 1.75M in 5h12m with 0 errors on Tue Oct  1 07:12:22 2019\nconfig:\n\n        NAME                                            STATE     READ WRITE CKSUM\n        vmstorage                                       ONLINE       0     0     0\n          mirror-0                                      ONLINE       0     0     0\n            ata-Hitachi_HUA723020ALA641_YFG31Y3A-part2  ONLINE       0     0     0\n            ata-Hitachi_HUA723020ALA641_YFG4GJ8A-part2  ONLINE       0     0     0\n<\/pre>\n<p>\/var\/log\/messages was showing stuff like this, over and over:<\/p>\n<pre>\nSep 15 03:03:42 dellt3600 smartd[19459]: Device: \/dev\/sdc [SAT], 831 Currently unreadable (pending) sectors\n<\/pre>\n<p>Looking in \/dev\/disk\/by-id\/*, the failing drive has a serial number of YFG4GJ8A.<\/p>\n<p>So, the setup for these commands became:<\/p>\n<pre>\nexport DISK_GOOD=\/dev\/disk\/by-id\/ata-Hitachi_HUA723020ALA641_YFG31Y3A-part2\nexport DISK_BAD=\/dev\/disk\/by-id\/ata-Hitachi_HUA723020ALA641_YFG4GJ8A-part2\nexport DISK_REPLACE=\/dev\/disk\/by-id\/ata-Hitachi_HUA723020ALA641_YGJ0JSYA-part2\n<\/pre>\n<p>The command to remove the 
failing, but not yet failed, drive from the mirror:<\/p>\n<pre>\nzpool detach vmstorage $DISK_BAD\n<\/pre>\n<p>(At this point, I shut down the machine and swapped the disks, since there was only room for two 3.5&#8243; HDDs.)<\/p>\n<p>After reboot, the command to add the new disk into the mirror:<\/p>\n<pre>\nzpool attach vmstorage $DISK_GOOD $DISK_REPLACE\n<\/pre>\n<p>Resilvering 1.25TB took 4h56m.<\/p>\n<p>Note: if you &#8220;pre-partition&#8221; your ZFS disks (like I do), then you also need the path to the whole (&#8220;root&#8221;) disk, not the -part2 partition, to run parted:<\/p>\n<pre>\nexport DISK_REPLACE_ROOT=\/dev\/disk\/by-id\/ata-Hitachi_HUA723020ALA641_YGJ0JSYA\nparted $DISK_REPLACE_ROOT\n<\/pre>\n<p>Inside parted, use &#8216;unit s&#8217; to work in sectors, and create the partitions with exactly the same sector counts as on the drive being replaced.<\/p>\n<p>Just recording the replacement drive: $42 &#8211; <a href=\"https:\/\/smile.amazon.com\/gp\/product\/B00HRLI2FU\/\">HGST\/Hitachi Ultrastar 7K3000 2TB 7200RPM Enterprise Grade Sata III<\/a>  For the record &#8211; the drive arrived new, with 0 hours power-on time.  Vendor was DBSKY.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The main Virtual Machine Server was seeing hardware failures and ZFS &#8220;scrub not zero bytes&#8221;. One of the (cheap) Hitachi Ultrastar 2TB disks was starting to fail, after only 1.5 years. Smartctl was showing 53 recent errors. 
pool: vmstorage state: &hellip; <a href=\"https:\/\/tiemensfamily.com\/timoncs\/2019\/10\/12\/zfs-replace-drive-before-it-fails\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[10],"tags":[],"_links":{"self":[{"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/posts\/1049"}],"collection":[{"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/comments?post=1049"}],"version-history":[{"count":0,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/posts\/1049\/revisions"}],"wp:attachment":[{"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/media?parent=1049"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/categories?post=1049"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/tags?post=1049"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}