From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f53.google.com ([74.125.82.53]:35493 "EHLO mail-wm0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751888AbcF1Atn (ORCPT ); Mon, 27 Jun 2016 20:49:43 -0400
Received: by mail-wm0-f53.google.com with SMTP id v199so119581481wmv.0 for ; Mon, 27 Jun 2016 17:49:43 -0700 (PDT)
Received: from system (dslb-088-067-121-064.088.067.pools.vodafone-ip.de. [88.67.121.64]) by smtp.gmail.com with ESMTPSA id bb4sm1516878wjb.32.2016.06.27.17.49.40 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 27 Jun 2016 17:49:41 -0700 (PDT)
Date: Tue, 28 Jun 2016 02:49:40 +0200
From: Saint Germain
Cc: Btrfs BTRFS
Subject: Re: Kernel bug during RAID1 replace
Message-ID: <20160628024940.3b323b26@system>
In-Reply-To:
References: <20160627233612.662d2a9a@system> <20160628002602.022258bf@system> <20160628010618.58e235fa@system>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
To: unlisted-recipients:; (no To-header on input)
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On Mon, 27 Jun 2016 18:00:34 -0600, Chris Murphy wrote :

> On Mon, Jun 27, 2016 at 5:06 PM, Saint Germain
> wrote:
> > On Mon, 27 Jun 2016 16:58:37 -0600, Chris Murphy
> > wrote :
> >
> >> On Mon, Jun 27, 2016 at 4:55 PM, Chris Murphy
> >> wrote:
> >>
> >> >> BTRFS info (device sdb1): dev_replace from /dev/sda1 (devid 1)
> >> >> to /dev/sdd1 started scrub_handle_errored_block: 166 callbacks
> >> >> suppressed BTRFS warning (device sdb1): checksum error at
> >> >> logical 93445255168 on dev /dev/sda1, sector 77669048, root 5,
> >> >> inode 3434831, offset 479232, length 4096, links 1 (path:
> >> >> user/.local/share/zeitgeist/activity.sqlite-wal)
> >> >> btrfs_dev_stat_print_on_error: 166 callbacks suppressed BTRFS
> >> >> error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0,
> >> >> corrupt 14221, gen 24 scrub_handle_errored_block: 166 callbacks
> >> >> suppressed BTRFS error
> >> >> (device sdb1): unable to fixup (regular)
> >> >> error at logical 93445255168 on dev /dev/sda1
> >> >
> >> > Shoot. You have a lot of these. It looks suspiciously like you're
> >> > hitting a case list regulars are only just starting to understand
> >>
> >> Forget this part completely. It doesn't affect raid1. I just
> >> re-read that your setup is not raid1, I don't know why I thought
> >> it was raid5.
> >>
> >> The likely issue here is that you've got legit corruptions on sda
> >> (mix of slow and flat out bad sectors), as well as a failing drive.
> >>
> >> This is also safe to issue:
> >>
> >> smartctl -l scterc /dev/sda
> >> smartctl -l scterc /dev/sdb
> >> cat /sys/block/sda/device/timeout
> >> cat /sys/block/sdb/device/timeout
> >>
> >
> > My setup is indeed RAID1 (and not RAID5)
> >
> > root@system:/# smartctl -l scterc /dev/sda
> > smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.6.0-0.bpo.1-amd64]
> > (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke,
> > www.smartmontools.org
> >
> > SCT Error Recovery Control:
> > Read: Disabled
> > Write: Disabled
> >
> > root@system:/# smartctl -l scterc /dev/sdb
> > smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.6.0-0.bpo.1-amd64]
> > (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke,
> > www.smartmontools.org
> >
> > SCT Error Recovery Control:
> > Read: Disabled
> > Write: Disabled
> >
> > root@system:/# cat /sys/block/sda/device/timeout
> > 30
> > root@system:/# cat /sys/block/sdb/device/timeout
> > 30
>
> Good news and bad news. The bad news is this is a significant
> misconfiguration, it's very common, and it means that any bad sectors
> that don't result in read errors before 30 seconds will mean they
> don't get fixed by Btrfs (or even mdadm or LVM raid). So they can
> accumulate.
>
> There are two options since your drives support SCT ERC.
>
> 1.
> smartctl -l scterc,70,70 /dev/sdX ## done for both drives
>
> That will make sure the drive reports a read error in 7 seconds, well
> under the kernel's command timer of 30 seconds. This is how your drives
> should normally be configured for RAID usage.
>
> 2.
> echo 180 > /sys/block/sda/device/timeout
> echo 180 > /sys/block/sdb/device/timeout
>
> This *might* actually work better in your case. If you permit the
> drives to have really long error recovery, it might actually allow the
> data to be returned to Btrfs and then it can start fixing problems.
> Maybe. It's a long shot. And there will be upwards of 3 minute hangs.
>
> I would give this a shot first. You can issue these commands safely at
> any time, no umount is needed or anything like that. I would do this
> even before using cp/rsync or ddrescue because it increases the chance
> the drive can recover data from these bad sectors and fix the other
> drive.
>
> These settings are not persistent across a reboot unless you set a
> udev rule or equivalent.
>
> On one of my drives that supports SCT ERC it only accepts the smartctl
> -l command to set the timeout once. I can't change it without power
> cycling the drive or it just crashes (yay firmware bugs). Just FYI
> it's possible to run into other weirdness.
>

I've tried both options and launched a replace, but I got the same error
(replace is cancelled, kernel bug).
I will leave these options on and attempt a ddrescue from /dev/sda
to /dev/sdd.
Then I will disconnect /dev/sda, reboot, and see if it works better.

> Last, I have no idea if the massive Btrfs write errors on sda are from
> an earlier problem where the drive data or power cable got jiggled or
> was otherwise absent temporarily? So depending on how the block
> timeout change affects your data recovery, you might end up needing to
> do a reboot to get back to a more stable state for all of this?
> It really should be able to fix things *if* at least one copy can be
> read and then written to the other drive.
>

I also have no idea why sda is behaving like this. I haven't done
anything particular on these drives.

Thanks for your help!
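
For reference, the relationship between the two settings discussed above can be sketched in shell. This is only an illustration: the function names erc_fits_timeout and print_option_commands are hypothetical helpers, not part of smartctl or the kernel, and the values 70 and 180 are simply the ones quoted in this thread. The second helper only prints the commands; it does not run them.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# erc_fits_timeout ERC_DECISECONDS KERNEL_TIMEOUT_SECONDS
# Succeeds when the drive would give up on a bad sector before the
# kernel's command timer expires. SCT ERC values are expressed in
# tenths of a second, so both sides are compared in deciseconds.
# An ERC value of 0 is treated as "disabled" (no drive-side limit).
erc_fits_timeout() {
    erc_ds=$1
    timeout_s=$2
    [ "$erc_ds" -gt 0 ] && [ "$erc_ds" -lt $((timeout_s * 10)) ]
}

# print_option_commands DEVICE
# Prints (does not execute) the commands for option 1 and option 2
# for a single device name such as "sda".
print_option_commands() {
    dev=$1
    echo "smartctl -l scterc,70,70 /dev/$dev"
    echo "echo 180 > /sys/block/$dev/device/timeout"
}
```

For example, `erc_fits_timeout 70 30` succeeds (7 seconds is under the 30 second kernel timer), while `erc_fits_timeout 0 30` fails, which matches the "Disabled" state reported for both drives above.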