Re: BTRFS RAID1 behavior after one drive temporal disconection

From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Pavel Pisa <pisa@cmp.felk.cvut.cz>, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS RAID1 behavior after one drive temporal disconection
Date: Thu, 8 Oct 2015 07:47:33 -0400
Message-ID: <561657D5.1070809@gmail.com>
In-Reply-To: <201510081028.02178.pisa@cmp.felk.cvut.cz>

On 2015-10-08 04:28, Pavel Pisa wrote:
> Hello everybody,
>
> On Monday 05 of October 2015 22:26:46 Pavel Pisa wrote:
>> Hello everybody,
> ...
>> BTRFS has recognized the reappearance of its partition (even though it changed
>> from sdb5 to sde5 when the disk was "hotplugged" again).
>> But it seems that RAID1 components are not in sync and BTRFS
>> continues to report
>>
>> BTRFS: lost page write due to I/O error on /dev/sde5
>> BTRFS: bdev /dev/sde5 errs: wr 11021805, rd 8526080, flush 29099, corrupt
>> 0, gen
>>
>> I have tried to find the best way to resync RAID1 BTRFS partitions.
>> But problem is that filesystem is the root one of the system.
>> So reboot to some rescue media is required to run btrfsck --repair
>> which is intended for unmounted devices.
>>
>> What is the behavior of BTRFS in this situation?
>> Is BTRFS able to use data from the out-of-date partition in
>> cases where the data in the respective files has not been modified?
>> The main reason for the question is whether such (stable) data can be backed up
>> by the out-of-sync partition if some random block wears
>> out on the other device. Or is this situation equivalent to running
>> with only one disk?
>>
>> Are there some parameters or a command
>> (scrub, balance) which bring the devices back into sync
>> without an unmount or reboot?
>>
>> I believe that attaching one more drive and running "btrfs replace"
>> would solve the described situation. But is there some equivalent that
>> runs the operation "in place"?
>
> It seems that the SATA controller is not able to activate a link which
> was not connected at BIOS POST time. This means that I cannot add a new drive
> without a reboot.
Check your BIOS options; there should be an option to set SATA ports 
as either 'Hot-Plug' or 'External', which should allow you to hot-plug 
drives without needing a reboot (unless it's a Dell system; they have 
never properly implemented the SATA standard on their desktops).
>
> Before reboot, the server bleeds with messages
>
> BTRFS: bdev /dev/sde5 errs: wr 11715459, rd 8526080, flush 29099, corrupt 0, gen 0
> BTRFS: lost page write due to I/O error on /dev/sde5
> BTRFS: bdev /dev/sde5 errs: wr 11715460, rd 8526080, flush 29099, corrupt 0, gen 0
> BTRFS: lost page write due to I/O error on /dev/sde5
Even aside from the issues mentioned below, if your disk is showing that 
many errors, you should probably run a SMART self-test on it to 
determine whether this is just a transient issue or an indication of 
impending disk failure.  The commands I'd suggest are:
smartctl -t short /dev/sde
That will tell you how long to wait for the test to complete. After 
waiting that long, run:
smartctl -H /dev/sde
If that says the health check failed, replace the disk as soon as 
possible, and don't use it for storing any data you can't afford to lose.
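
As a rough sketch of how that health check could be scripted: the helper 
below only interprets the text printed by 'smartctl -H'. The function name 
smart_health_ok is made up for illustration, and the matched strings 
assume the usual smartmontools wording for ATA ("PASSED") and SCSI ("OK") 
drives.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -H` output on stdin and return 0
# only when the overall-health line reports PASSED (ATA) or OK (SCSI).
smart_health_ok() {
    grep -Eq 'test result: PASSED|Health Status: OK'
}

# Possible usage on the drive from this thread (needs root):
#   smartctl -t short /dev/sde       # start the short self-test
#   smartctl -l selftest /dev/sde    # after the wait, list self-test results
#   smartctl -H /dev/sde | smart_health_ok || echo "replace the disk"
```

Note that 'smartctl -H' reports the drive's overall health assessment; the 
result of the individual self-test run is listed by 'smartctl -l selftest'.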
>
> which changed to the following messages after the reboot
>
> Btrfs loaded
> BTRFS: device label riki-pool devid 1 transid 282383 /dev/sda3
> BTRFS: device label riki-pool devid 2 transid 249562 /dev/sdb5
> BTRFS info (device sda3): disk space caching is enabled
> BTRFS (device sda3): parent transid verify failed on 44623216640 wanted 263476 found 212766
> BTRFS (device sda3): parent transid verify failed on 45201899520 wanted 282383 found 246891
> BTRFS (device sda3): parent transid verify failed on 45202571264 wanted 282383 found 246890
> BTRFS (device sda3): parent transid verify failed on 45201965056 wanted 282383 found 246889
> BTRFS (device sda3): parent transid verify failed on 45202505728 wanted 282383 found 246890
> BTRFS (device sda3): parent transid verify failed on 45202866176 wanted 282383 found 246890
> BTRFS (device sda3): parent transid verify failed on 45207126016 wanted 282383 found 246894
> BTRFS (device sda3): parent transid verify failed on 45202522112 wanted 282383 found 246890
> BTRFS: bdev /dev/disk/by-uuid/1627e557-d063-40b6-9450-3694dd1fd1ba errs: wr 11723314, rd 8526080, flush 2
> BTRFS (device sda3): parent transid verify failed on 45206945792 wanted 282383 found 67960
> BTRFS (device sda3): parent transid verify failed on 45204471808 wanted 282382 found 67960
>
> which looks really frightening to me. The temporarily disconnected drive has an old transid
> at start (OK). But what do the rest of the lines mean? If they mean that files with
> older transaction IDs are used from the temporarily disconnected drive (now /dev/sdb5)
> and newer versions from /dev/sda3 are ignored and reported as invalid, then this means
> severe data loss and possibly a mismatch, because all transactions after the disk disconnect
> are lost (i.e. the FS root has been taken from the misbehaving drive at an old version).
>
> BTRFS does not even fall back to read-only/degraded mode after the system restart.
This actually surprises me.
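
To see how far behind the stale device is, the "device label ... transid" 
lines from the boot log quoted above can be compared mechanically. This is 
only a sketch: the function names list_transids and transid_gap are 
hypothetical, and the parsing assumes the kernel message format shown in 
that log.

```shell
#!/bin/sh
# Print "<transid> <device>" for every btrfs device line on stdin,
# highest transaction ID first.
list_transids() {
    sed -n 's|.*devid [0-9][0-9]* transid \([0-9][0-9]*\) \(/[^ ]*\).*|\1 \2|p' |
        sort -rn
}

# Print the difference between the newest and the oldest transid seen.
transid_gap() {
    list_transids | awk 'NR == 1 { max = $1 } { min = $1 } END { if (NR) print max - min }'
}

# Possible usage:
#   dmesg | grep 'BTRFS: device label' | transid_gap
```

For the two devices in the quoted log (282383 on /dev/sda3 and 249562 on 
/dev/sdb5) the gap is 32821 transactions.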
>
> On the other hand, from the logs (all stored on the possibly damaged root FS) it seems
> that there are no missing messages from the days when the discs were out of sync,
> so it looks like all data are OK. So should I expect that BTRFS handled the problems
> well and all data are consistent?
I would be very careful in that situation. You may still have issues; at 
the very least, make a backup of the system as soon as possible.
>
> I am going to use "btrfs replace" because there has not been any reply to my in-place correction
> question. But I expect that clarification of whether and how to resync RAID1 after one
> drive temporarily disappears is really important to many BTRFS users.
As of right now, there is no way that I know of to safely re-sync a 
drive that's been disconnected for a while.  The best bet is probably to 
use replace, but for that to work reliably, you would need to tell it to 
ignore the now stale drive when trying to read each chunk.

It is theoretically possible to wipe the FS signature on the out-of-sync 
drive, run a device scan, then run 'replace missing' pointing at the now 
'blank' device, although going that route is really risky.
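
Purely as an illustration of that sequence, here is a dry-run sketch that 
only prints the commands and runs nothing. The function name 
print_resync_plan is hypothetical, and the mapping of 'replace missing' 
onto 'btrfs replace start <devid> <target> <path>' is my assumption about 
how it would be spelled. It is untested and carries all the risk mentioned 
above.

```shell
#!/bin/sh
# Dry run only: print the risky in-place resync sequence, execute nothing.
#   $1 = stale device, $2 = its btrfs devid, $3 = mount point
print_resync_plan() {
    stale_dev=$1
    devid=$2
    mount_point=$3
    # Step 1: wipe the filesystem signature on the out-of-sync device.
    printf 'wipefs -a %s\n' "$stale_dev"
    # Step 2: make btrfs rescan the available devices.
    printf 'btrfs device scan\n'
    # Step 3: replace the now-missing devid with the blanked device.
    printf 'btrfs replace start %s %s %s\n' "$devid" "$stale_dev" "$mount_point"
}

# Example with the names from the boot log in this thread:
#   print_resync_plan /dev/sdb5 2 /
```

For the plain replace route onto a spare drive, 'btrfs replace start -r' 
is the documented flag that reads from the source device only when no 
other good mirror exists.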



Thread overview: 8+ messages
2015-10-05 20:26 BTRFS RAID1 behavior after one drive temporal disconection Pavel Pisa
2015-10-08  8:28 ` Pavel Pisa
2015-10-08 11:47   ` Austin S Hemmelgarn [this message]
2015-10-08 16:40     ` Pavel Pisa
2015-10-08 21:13     ` Hugo Mills
2015-10-08 22:16       ` Pavel Pisa
2015-10-08 22:22         ` Hugo Mills
2015-10-09 11:13           ` Austin S Hemmelgarn
