* What does scrub do?
@ 2014-04-11 11:23 Alex
2014-04-11 18:36 ` Duncan
2014-04-15 16:26 ` David Sterba
0 siblings, 2 replies; 4+ messages in thread
From: Alex @ 2014-04-11 11:23 UTC (permalink / raw)
To: linux-btrfs
Hi all,
Debian testing/Jessie-to-be, except that the kernel and btrfs-tools are from
unstable, so usually a couple of weeks later than you/Linus publish.
Linux XX 3.13-1-amd64 #1 SMP Debian 3.13.7-1 (2014-03-25) x86_64
Btrfs-tools v3.12 Debian standard (not particularly messed with looks like)
I've never had scrub report anything other than 0 (zero) errors. Ever.
Yet I've had more than one ( ;-) ) problem which required btrfs-zero-log
and/or btrfs --repair. These are usually my fault - fixed it 'til it broke.
root@XX ~ # btrfs scrub status /
scrub status for f8152a67-3c2e-4da1-812e-9a6ab2ad1102
scrub started at Fri Apr 11 09:55:36 2014 and finished after 44 seconds
total bytes scrubbed: 1.40GiB with 0 errors
[ 7.502338] btrfs: device label china devid 1 transid 938773 /dev/vda1
[ 7.514213] btrfs: device label china devid 1 transid 938773 /dev/vda1
[ 7.530893] btrfs: disk space caching is enabled
[ 7.530897] btrfs: has skinny extents
[ 7.720288] btrfs: bdev /dev/vda1 errs: wr 0, rd 0, flush 0, corrupt 66, gen 2
[ 18.967319] btrfs: device label china devid 1 transid 938773 /dev/vda1
[ 19.360767] btrfs: device label china devid 1 transid 938773 /dev/vda1
This scrub and dmesg were taken within minutes of each other. So what is the
utility of running scrub? Or have I got the wrong idea of what scrub
should report? This VM guest doesn't get messed with often, and is kept
Very small KVM virtual machine - easy to send you a btrfs dump. Almost
vanilla set-up too. Just say the word.
Have been running btrfs here for quite some while (years, since Linux 3.1 I
think) on a server. Very, very stable (lzo compression sometimes not quite as
stable as zlib, and I only run it on the desktop m/c).
People: for auto snapshots use Snapper (a la SUSE) which is now in Debian et
al. Only peculiarity is that clear-down of daily snapshots only happens in
the night so you don't need to put many/any hourly snapshots in.
Thank you. And well done/thank you to the contributors.
Al.
PS: please get the 3.14 tools release out - perhaps the fixes have
already gone through the tree and I am just shouting at the wind.
* Re: What does scrub do?
From: Duncan @ 2014-04-11 18:36 UTC (permalink / raw)
To: linux-btrfs
Alex posted on Fri, 11 Apr 2014 11:23:31 +0000 as excerpted:
> I've never had scrub report anything other than 0 (zero) errors. Ever.
> Yet I've had more than one ( ;-) ) problem which required btrfs-zero-log
> and/or btrfs --repair. These are usually my fault - fixed it 'til it
> broke.
>
> root@XX ~ # btrfs scrub status /
> scrub status for f8152a67-3c2e-4da1-812e-9a6ab2ad1102
> scrub started at Fri Apr 11 09:55:36 2014 and finished after 44 seconds
> total bytes scrubbed: 1.40GiB with 0 errors
[snip]
> [ 7.720288] btrfs: bdev /dev/vda1 errs: wr 0, rd 0, flush 0,
> corrupt 66, gen 2
[snip]
> This scrub and dmesg were taken within minutes of each other. So what is
> the utility of running scrub? Or have I got the wrong idea of what
> scrub should report?
Probably the latter (wrong idea...), although you might have the wrong idea
of what the mount is reporting, rather than the wrong idea about scrub,
or more likely, a bit of wrong on both.
Scrub is designed to fix one specific kind of error, and then in only one
specific (but somewhat common) case. Btrfs data and metadata are both
checksummed. Scrub goes over each individual checksummed object and
calculates its checksum, verifying it against the checksum stored for
it. If the checksums don't match, it records an error.
Additionally, for errors, *IF* there's a second copy of the object and
that copy DOES pass checksum validation, scrub will rewrite the bad copy
using the good copy, "scrubbing" the data and fixing the errors it found.
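As a rough illustration of the verify-then-repair logic described above,
here is a minimal Python sketch. It is a toy model, not how the kernel
implements scrub: the names checksum() and scrub_object() are made up for
this example, the "copies" are plain byte strings in a list, and
zlib.crc32 is only a convenient stand-in for the real checksum.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum (plain CRC-32 from the standard library).
    # Assumption for illustration only; not the checksum btrfs itself uses.
    return zlib.crc32(block)


def scrub_object(copies: list, stored_csum: int) -> dict:
    """Verify every copy of one checksummed object and repair where possible.

    `copies` holds one bytes entry per stored copy: one entry models
    single mode, two entries model dup or raid1. The list is modified
    in place when a bad copy can be rewritten from a good one.
    """
    bad = [i for i, c in enumerate(copies) if checksum(c) != stored_csum]
    good = [i for i in range(len(copies)) if i not in bad]
    result = {"errors": len(bad), "corrected": 0, "uncorrectable": 0}
    for i in bad:
        if good:
            # A copy that passes validation exists: overwrite the bad one.
            copies[i] = copies[good[0]]
            result["corrected"] += 1
        else:
            # No valid copy anywhere: the error is detected but stays.
            result["uncorrectable"] += 1
    return result


if __name__ == "__main__":
    payload = b"some file data"
    csum = checksum(payload)
    print(scrub_object([b"some f1le data", payload], csum))  # two copies
    print(scrub_object([b"some f1le data"], csum))           # one copy
```

With two copies the damaged one is counted as an error and then rewritten
from the good one. With a single copy the error is still counted, but
there is nothing to repair it from.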
Here's the critical bit. By default, btrfs keeps two copies of metadata,
but *NOT* data. On a single device filesystem, this is dup mode metadata
(except on ssd, where it's single mode), single mode data. On a multi-
device filesystem, metadata will default to raid1 mode instead of dup
mode (a copy on each device instead of two copies on one device), while
data still defaults to single mode -- just one copy. There is one
further exception: for filesystems under 1 GiB in size, btrfs defaults to
mixed-mode, data/metadata in the same mixed chunks.
Of course if you created the filesystem with specific modes (say -draid1,
for raid1 mode data, or -msingle, for single mode metadata) or if you did
a balance-convert to change the mode or switched between multi-device and
single-device filesystem, the defaults won't apply -- you'll have what
you set (or the default for the originally created filesystem).
While scrub can detect checksum errors in single (and raid0) mode, there
won't be a second hopefully valid copy to replace bad copies with, so it
will detect checksum errors but won't be able to fix them. Only if
there's a second, valid copy, can it fix the errors it detects.
Which is one reason I run most of my btrfs filesystems with two devices
configured as raid1 for both data and metadata. (I do have a couple very
small filesystems, /boot and its backup on the other device, that are
mixed-mode dup-mode, on a single device, but of course dup-mode has a
second copy too.)
Anyway, if you have never seen scrub errors, that's because scrub has
never come across such checksum validation errors on your system.
Meanwhile, the corrupt errors you see in the above mount are likely
historical. The errors reported by mount above, and by btrfs device stats,
are the number of errors since the filesystem was created or since the
last reset (btrfs device stats -z prints AND RESETS the stats). As you've
never had scrub report an error, the corruptions likely got fixed some
other way, possibly by deleting the affected files. But the count has
never been reset, so you're still seeing those historical errors.
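Because these counters are cumulative, a nonzero value on its own says
nothing about whether anything new has gone wrong. A small Python sketch
of how one might compare two readings follows. The regular expression is
written against the single "bdev ... errs:" line quoted earlier in this
thread (unwrapped onto one line), and the function names are made up for
this example. Other kernel versions may format the line differently.

```python
import re

# Matches the counter summary printed at mount time, for example:
#   btrfs: bdev /dev/vda1 errs: wr 0, rd 0, flush 0, corrupt 66, gen 2
BDEV_ERRS = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_bdev_errs(line: str):
    """Return (device, counters) for a 'bdev ... errs:' line, else None."""
    m = BDEV_ERRS.search(line)
    if m is None:
        return None
    counters = {k: int(v) for k, v in m.groupdict().items() if k != "dev"}
    return m.group("dev"), counters


def new_errors(before: dict, after: dict) -> dict:
    """Return only the counters that changed between two readings."""
    return {k: after[k] - before[k] for k in after if after[k] != before[k]}


if __name__ == "__main__":
    line = ("[ 7.720288] btrfs: bdev /dev/vda1 errs: "
            "wr 0, rd 0, flush 0, corrupt 66, gen 2")
    dev, counters = parse_bdev_errs(line)
    print(dev, counters)
    # Identical readings at two mounts mean no new errors were recorded.
    print(new_errors(counters, counters))
```

If the counters are the same from one mount to the next, the errors they
record are old ones, which matches the "historical" reading above.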
> PS: please get the 3.14 tools release out - perhaps the fixes have
> already gone through the tree and I am just shouting at the wind.
FWIW, btrfs-progs v3.14 is tagged in git, and I'm running it here. I
don't know the tarball release status since I build from git, but the
tag is there and available, so it's definitely out.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: What does scrub do?
From: David Sterba @ 2014-04-15 16:26 UTC (permalink / raw)
To: Alex; +Cc: linux-btrfs
On Fri, Apr 11, 2014 at 11:23:31AM +0000, Alex wrote:
> People: for auto snapshots use Snapper (a la SUSE) which is now in Debian et
> al. Only peculiarity is that clear-down of daily snapshots only happens in
> the night so you don't need to put many/any hourly snapshots in.
Can be fixed by copying
/etc/cron.daily/suse.de-snapper -> ../cron.hourly
though this would increase the device load, as there will always be a
snapshot to clean.
Or you can run the script any time you want to start the cleanup.
* Re: What does scrub do?
From: Alex @ 2014-04-15 17:13 UTC (permalink / raw)
To: linux-btrfs
Duncan <1i5t5.duncan <at> XXX> writes:
Wow Duncan! Thank you so much for your extensive post. Well written and very
well received.
I do think your 'critical bit' comments are worth reiterating! I've
bookmarked your email!
Qu: By-the-by, does anyone know how to re-lay the CRCs down again?
I don't seem to be able to 'coax' it myself (but the corruption error is
still there). I created backup problems (on the VMs) for myself by using a
naive approach at the time. So it is a good chance to put it right now.
Thank you (!) for the 3.14 tools poke; I deserved it! Just bad timing on my
end, it seems .. tl;d(on't)r!
Qu: is anyone actively using seed devices? I saw one post relatively
recently. I can see "Ebonacco" and possibly "Killermist" are.
Kind regards
Alex.