From: "Nik." <btrfs@avgustinov.eu>
To: Chris Murphy <lists@colorremedies.com>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
Qu Wenruo <quwenruo.btrfs@gmx.com>
Subject: Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
Date: Wed, 10 Apr 2019 23:03:55 +0200 [thread overview]
Message-ID: <f6ee7655-e6e2-4162-6c34-7b4b3dba71c7@avgustinov.eu> (raw)
In-Reply-To: <CAJCQCtQJkJwEyouCUzcV1MzPkcxhvtqxkWqmrwnB9txV=MUTXA@mail.gmail.com>
On 2019-04-07 20:45, Chris Murphy wrote:
> On Sun, Apr 7, 2019 at 1:42 AM Nik. <btrfs@avgustinov.eu> wrote:
>> 2019-04-07 01:18, Qu Wenruo:
>
>>> You have 2 bits flipped just in one tree block!
>>>
>> If the data-tree structures alone have so many bits flipped, how much
>> flipped bits are to be expected in the data itself? What should a normal
>> btrfs user do in order to prevent such disasters?
>
> I think the corruption in your case is inferred by Btrfs only by bad
> key ordering, not csum failure for the leaf? I can't tell for sure
> from the error, but I don't see a csum complaint.
I do not quite understand where the "bad key ordering" came from, but my
question is: why does this keep happening only to my btrfs file systems?
Is it relevant that all four failed systems were initially formatted as
ext4 and converted to btrfs (with the btrfs-progs in use 5-6 years ago)?
Another question: I am sure that many btrfs users would, in some cases,
be ready to trade performance for reliability; wouldn't it be
interesting to introduce a kind of switch/option like the "VERIFY ON"
used many years ago on MS-DOS systems to ensure that write operations
(especially on floppy disks) were successful? Just an idea...
My btrfs-restore is still running (since Monday evening; about 50%
restored so far), and I am on a business trip. As soon as it finishes
and I am back home, I will compare with the backup and give more
information, but it seems this will need another day or two.
Kind regards,
Nik.
--
> I'd expect a RAM caused corruption could affect a metadata leaf data,
> followed by csum computation. Therefore no csum failure on subsequent
> read. Whereas if the corruption is storage stack related, we'd see a
> csum error on subsequent read.
>
> Once there's corruption in a block address, the corruption can
> propagate into anything else that depends on that block address even
> if there isn't another corruption event. So one event, multiple
> corruptions.
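Your timing argument can be sketched like this (using zlib's crc32 as a
stand-in for the crc32c btrfs actually uses; the buffer and helper are
made up for illustration):

```python
import zlib

def flip_bit(buf, bit):
    """Return a copy of buf with one bit flipped (simulated corruption)."""
    b = bytearray(buf)
    b[bit // 8] ^= 1 << (bit % 8)
    return bytes(b)

leaf = b"btrfs metadata leaf contents"

# RAM flips a bit *before* the csum is computed: the csum covers the
# already-corrupted buffer, so a later read verifies clean and only
# the bad key ordering betrays the damage.
bad_in_ram = flip_bit(leaf, 3)
csum = zlib.crc32(bad_in_ram)
assert zlib.crc32(bad_in_ram) == csum        # no csum complaint

# The storage stack flips a bit *after* the csum was computed over
# good data: the readback no longer matches and a csum error appears.
csum_good = zlib.crc32(leaf)
bad_on_disk = flip_bit(leaf, 3)
assert zlib.crc32(bad_on_disk) != csum_good  # csum error on read
```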
>
>
>> And another thing: if I am getting it right, it would have been more
>> reliable/appropriate to let btrfs manage the five disks behind the md0
>> with a raid1 profile instead of binding them in a RAID5 and "giving"
>> just a single device to btrfs.
>
> Not necessarily. If corruption happens early enough, it gets baked
> into all copies of the metadata.
>
>