From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: james harvey <jamespharvey20@gmail.com>
Cc: Chris Murphy <lists@colorremedies.com>,
    Goffredo Baroncelli <kreijack@inwind.it>,
    Anand Jain <anand.jain@oracle.com>,
    Remi Gauvin <remi@georgianit.com>,
    Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files
Date: Fri, 29 Jun 2018 14:31:04 -0400
Message-ID: <f76a93ba-05c8-967f-d7c0-8fdc0a618a7e@gmail.com>
In-Reply-To: <CA+X5Wn7jVZS5USHnmj1SrHh87V1O6vgzRcLM64di4ALc81hjmQ@mail.gmail.com>
On 2018-06-29 13:58, james harvey wrote:
> On Fri, Jun 29, 2018 at 1:09 PM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> On 2018-06-29 11:15, james harvey wrote:
>>>
>>> On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy <lists@colorremedies.com>
>>> wrote:
>>>>
>>>> And an open question I have about scrub is whether it only ever is
>>>> checking csums, meaning nodatacow files are never scrubbed, or if the
>>>> copies are at least compared to each other?
>>>
>>>
>>> Scrub never looks at nodatacow files. It does not compare the copies
>>> to each other.
>>>
>>> Qu submitted a patch to make check compare the copies:
>>> https://patchwork.kernel.org/patch/10434509/
>>>
>>> This hasn't been added to btrfs-progs git yet.
>>>
>>> IMO, I think the offline check should look at nodatacow copies like
>>> this, but I still think this also needs to be added to scrub. In the
>>> patch thread, I discuss my reasons why. In brief: online scanning;
>>> this goes along with user's expectation of scrub ensuring mirrored
>>> data integrity; and recommendations to set up scrub on a periodic basis
>>> to me means it's the place to put it.
>>
>> That said, it can't sanely fix things if there is a mismatch. At least, not
>> unless BTRFS gets proper generational tracking to handle temporarily missing
>> devices. As of right now, sanely fixing things requires significant manual
>> intervention, as you have to bypass the device read selection algorithm to
>> be able to look at the state of the individual copies so that you can pick
>> one to use and forcibly rewrite the whole file by hand.
>
> Absolutely. User would need to use manual intervention as you
> describe, or restore the single file(s) from backup. But, it's a good
> opportunity to tell the user they had partial data corruption, even if
> it can't be auto-fixed. Otherwise they get intermittent data
> corruption, depending on which copies are read.
The thing is, as things stand right now, you need to either edit the
data on disk directly or restore the file from a backup to fix it.
While it's technically true that you can manually repair this type of
thing, doing it without those patches I mentioned is functionally
impossible for a regular user without potentially losing some data.

Unless that changes, scrub telling you a file is corrupt is not going
to help much, aside from making sure you don't make things worse by
trying to use it. Given this, it would make sense to have a (disabled
by default)
option to have scrub repair it by just using the newer or older copy of
the data. That would require classic RAID generational tracking though,
which BTRFS doesn't have yet.
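
To illustrate what such an opt-in repair policy might decide, here is a
minimal sketch. It assumes each mirror copy carries a per-copy write
counter, which is exactly the generational tracking BTRFS does not have
for this case, so MirrorCopy, its generation field, and
pick_repair_source are all hypothetical names and not part of any
existing btrfs interface.

```python
from dataclasses import dataclass


@dataclass
class MirrorCopy:
    device: str      # hypothetical device label
    generation: int  # hypothetical per-copy write counter
    data: bytes


def pick_repair_source(copies, prefer="newer"):
    """Choose which mirror copy to treat as authoritative.

    Returns None when all generations match, because the counters then
    give no basis for preferring one copy over another.
    """
    if prefer not in ("newer", "older"):
        raise ValueError("prefer must be 'newer' or 'older'")
    generations = {c.generation for c in copies}
    if len(generations) <= 1:
        return None
    chooser = max if prefer == "newer" else min
    return chooser(copies, key=lambda c: c.generation)
```

Returning None for equal counters keeps the sketch from silently
picking a copy when there is nothing to distinguish them, which is the
situation where a user would still have to step in by hand.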
>> A while back, Anand Jain posted some patches that would let you select a
>> particular device to direct all reads to via a mount option, but I don't
>> think they ever got merged. That would have made manual recovery in cases
>> like this exponentially easier (mount read-only with one device selected,
>> copy the file out somewhere, remount read-only with the other device, drop
>> caches, copy the file out again, compare and reconcile the two copies, then
>> remount the volume writable and write out the repaired file).