From: Duncan <1i5t5.duncan@cox.net>
To: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: What is the vision for btrfs fs repair?
Date: Thu, 9 Oct 2014 05:34:02 -0700
Message-ID: <20141009053402.7dc286f0@ws>
In-Reply-To: <107Y1p00G0wm9Bl0107vjZ>

On Thu, 09 Oct 2014 08:07:51 -0400
Austin S Hemmelgarn <ahferroin7@gmail.com> wrote:

> On 2014-10-09 07:53, Duncan wrote:
> > Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
> > excerpted:
> >
> >> Also, you should be running btrfs scrub regularly to correct
> >> bit-rot and force remapping of blocks with read errors. While
> >> BTRFS technically handles both transparently on reads, it only
> >> corrects things on disk when you do a scrub.
> >
> > AFAIK that isn't quite correct. Currently, the number of copies is
> > limited to two, meaning if one of the two is bad, there's a 50%
> > chance of btrfs reading the good one on first try.
> >
> > If btrfs reads the good copy, it simply uses it. If btrfs reads
> > the bad one, it checks the other one and assuming it's good,
> > replaces the bad one with the good one both for the read (which
> > otherwise errors out), and by overwriting the bad one.
> >
> > But here's the rub. The chances of detecting that bad block are
> > relatively low in most cases. First, the system must try reading
> > it for some reason, but even then, chances are 50% it'll pick the
> > good one and won't even notice the bad one.
> >
> > Thus, while btrfs may randomly bump into a bad block and rewrite it
> > with the good copy, scrub is the only way to systematically detect
> > and (if there's a good copy) fix these checksum errors. It's not
> > that btrfs doesn't do it if it finds them, it's that the chances of
> > finding them are relatively low, unless you do a scrub, which
> > systematically checks the entire filesystem (well, other than files
> > marked nocsum, or nocow, which implies nocsum, or files written
> > when mounted with nodatacow or nodatasum).
> >
> > At least that's the way it /should/ work. I guess it's possible
> > that btrfs isn't doing those routine "bump-into-it-and-fix-it"
> > fixes yet, but if so, that's the first /I/ remember reading of it.
>
> I'm not 100% certain, but I believe it doesn't actually fix things on
> disk when it detects an error during a read. I know it doesn't if the
> fs is mounted ro (even if the media is writable), because I did some
> testing to see how 'read-only' mounting a btrfs filesystem really is.

It definitely won't with a read-only mount. But then scrub shouldn't
be able to write to a read-only mount either. The only way the data
behind a read-only mount should change is if the same filesystem is
also mounted read-write elsewhere (bind-mounted or
btrfs-subvolume-mounted), and the write goes through that mount, not
the read-only one.

There's even debate about whether to replay the journal or do
orphan-delete on read-only mounts. Some argue that read-only means just
that: don't write /anything/ to the media until it's mounted
read-write. The change could, and arguably should, instead occur in RAM
and be cached, with the cache marked "dirty" so it's appropriately
flushed if/when the filesystem goes writable.
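
To make the in-RAM option concrete, here is a toy Python sketch. It is
not how btrfs or any kernel code implements this; the class, its
methods, and the dict standing in for the media are all made up for
illustration. Replayed changes are held in memory, reads see them, and
nothing reaches the media until the mount goes writable.

```python
class ReadOnlyOverlay:
    """Toy wrapper: pending changes stay in RAM while mounted read-only.

    Hypothetical names throughout; `media` is a plain dict standing in
    for on-media blocks.
    """

    def __init__(self, media):
        self.media = media      # block number -> bytes (the "disk")
        self.dirty = {}         # replayed changes not yet written out
        self.writable = False

    def replay(self, journal):
        # Apply journal entries in memory only; the media is untouched
        # unless the mount is already writable.
        for block, data in journal:
            self.dirty[block] = data
        if self.writable:
            self.flush()

    def read(self, block):
        # Reads see the replayed state first, then fall through to media.
        if block in self.dirty:
            return self.dirty[block]
        return self.media[block]

    def remount_rw(self):
        # Going writable is the point at which the dirty cache is flushed.
        self.writable = True
        self.flush()

    def flush(self):
        self.media.update(self.dirty)
        self.dirty.clear()


media = {0: b"old"}
overlay = ReadOnlyOverlay(media)
overlay.replay([(0, b"new")])
print(media[0], overlay.read(0))   # media unchanged, reader sees the replay
overlay.remount_rw()
print(media[0])                    # flushed once writable
```

The point of the sketch is only that "consistent view for readers" and
"no writes to the media" don't have to conflict.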

But on a writable mount, detected checksum errors (with a good copy
available) should be rewritten, as far as I know. If not, I'd call it
a bug. The problem is in the detection, not in the rewriting. Scrub is
the only way to reliably detect these errors, since it's the only thing
that systematically checks /everything/.
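
A toy Python model of the detection problem, as I understand it. The
assumptions are loud ones: exactly two copies, a uniformly random
choice of which copy a read hits (standing in for whatever policy the
kernel really uses), crc32 as a stand-in checksum, and repair-on-read
on a writable mount. All names are invented for this sketch.

```python
import random
import zlib


class MirroredBlock:
    """One block stored as two checksummed copies (toy model)."""

    def __init__(self, data):
        self.copies = [data, data]
        self.csum = zlib.crc32(data)    # stand-in for the stored checksum

    def corrupt(self, index):
        # Flip the first byte of one copy to simulate bit-rot.
        bad = bytearray(self.copies[index])
        bad[0] ^= 0xFF
        self.copies[index] = bytes(bad)

    def is_good(self, index):
        return zlib.crc32(self.copies[index]) == self.csum

    def read(self, rng, writable=True):
        # An ordinary read examines one copy; if it verifies, the other
        # copy is never looked at.
        first = rng.randrange(2)
        if self.is_good(first):
            return self.copies[first]
        other = 1 - first
        if not self.is_good(other):
            raise IOError("both copies fail checksum")
        if writable:
            self.copies[first] = self.copies[other]   # repair on read
        return self.copies[other]

    def scrub(self, writable=True):
        # Scrub verifies every copy; returns the number of bad copies found.
        bad = [i for i in range(2) if not self.is_good(i)]
        if writable and len(bad) == 1:
            self.copies[bad[0]] = self.copies[1 - bad[0]]
        return len(bad)


def still_bad_after_reads(reads, trials=10000, seed=0):
    """Fraction of corrupted blocks still unrepaired after `reads` reads."""
    rng = random.Random(seed)
    undetected = 0
    for _ in range(trials):
        block = MirroredBlock(b"payload")
        block.corrupt(rng.randrange(2))
        for _ in range(reads):
            block.read(rng)
        if block.scrub(writable=False) > 0:   # inspect only, don't fix
            undetected += 1
    return undetected / trials


for n in (0, 1, 4):
    print(n, still_bad_after_reads(n))
```

Under these assumptions the undetected fraction should come out near
0.5**n after n reads of that block, and blocks that are never read
stay bad indefinitely, while a single scrub() call finds every bad
copy.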

> Also, that's a much better description of how multiple copies work
> than I could probably have ever given.

Thanks. =:^)

--
Duncan - No HTML messages please, as they are filtered as spam.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman