From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: csum failed : d-raid0, m-raid1
Date: Sat, 30 Jan 2016 05:58:45 +0000 (UTC)
Message-ID: <pan$75b98$f785acae$3724509e$fea963e1@cox.net>
In-Reply-To: CAAcrkYJmDHNijZx45cdfZNTgVCgvih_06WROGJg+pjYsQT-=tg@mail.gmail.com
John Smith posted on Fri, 29 Jan 2016 19:04:42 +0100 as excerpted:
> Hi
>
> i built btrfs volume using 2x3tb brand new /tested for badblocks drives.
> I copied into volume around 5Tb of data.
>
> I tried to read one file which is around 4GB and i got input / output
> error.
>
> Dmesg contains:
>
> [154159.040059] BTRFS warning (device sdd): csum failed ino 9995246 off
> 4506214400 csum 383964635 expected csum 6478505
>
>
> Any idea what is it? Whats the reason that this happened? Can I recover?
Btrfs crc32c-checksums all blocks on write, both data (except for data
written while mounted nodatasum and nocow attribute files) and metadata
(always), and verifies the checksum on read.
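
To make the write-then-verify idea concrete, here's a rough sketch in Python. It is NOT btrfs code: the bitwise CRC-32C is the generic Castagnoli CRC, and the 4096-byte block size, the dicts standing in for storage and the checksum tree, and the function names are all assumptions made up for the illustration.

```python
import errno

BLOCK = 4096  # assumed block size, for the demo only


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def write_block(store, csums, index, data):
    """Store a block and record its checksum alongside (hypothetical layout)."""
    store[index] = data
    csums[index] = crc32c(data)


def read_block(store, csums, index):
    """Return the block only if its checksum still matches, else fail with EIO."""
    data = store[index]
    if crc32c(data) != csums[index]:
        raise OSError(errno.EIO, "csum failed")
    return data
```

With a single copy of the block (the raid0 data case), a mismatch leaves nothing to return but the error, which is the input/output error you saw.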
The read-time csum verification failed on the block at that offset in
the file, and as your data is raid0, there's no second copy to fall back
on as there would be for raid1 data, and no parity data available to try
to rebuild from as there'd be for raid56 data. So the file can only be
read up to that point and, if you skip that block and there are no
further checksum failures, from beyond that point to the end of the file.
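
If you want to salvage what's readable, the read-up-to-and-skip idea looks roughly like the Python sketch below. The salvage() name, the 4096-byte block size and the injectable pread parameter are my own assumptions for illustration, and it assumes the kernel returns EIO only for the block that fails verification. Blocks that can't be read are zero-filled in the copy, so the result is NOT the original file, just the readable parts of it.

```python
import errno
import os

BLOCK = 4096  # assumed block size; adjust to match the filesystem


def salvage(src_path, dst_path, block=BLOCK, pread=os.pread):
    """Copy src to dst block by block, zero-filling blocks that fail with EIO.

    Returns the byte offsets of the blocks that could not be read.
    """
    bad = []
    fd = os.open(src_path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        with open(dst_path, "wb") as out:
            for off in range(0, size, block):
                want = min(block, size - off)
                try:
                    chunk = pread(fd, want, off)
                except OSError as exc:
                    if exc.errno != errno.EIO:
                        raise  # anything other than an I/O error is unexpected
                    chunk = bytes(want)  # placeholder zeros for the lost block
                    bad.append(off)
                out.seek(off)  # keep the copy aligned with the source
                out.write(chunk)
    finally:
        os.close(fd)
    return bad
```

Calling salvage("/path/to/damaged", "/path/to/copy") (both paths hypothetical) returns the list of lost offsets, which you can compare against the "off" values in dmesg.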
Of course the sysadmin's first rule of backups, in simplest form, is that
if you don't have at least one backup, you are by your failure to back up
defining the value of the data as less than the value of the time/hassle/
resources you'd otherwise spend making that backup, so you either have a
backup to fall back to, or your data is self-defined by that lack of a
backup as of only trivial value not worth the trouble.
And of course, btrfs, while stabilizING, isn't yet considered fully
stable and mature, so that sysadmin's rule of backups applies to an even
stronger degree than it does to fully stable and mature filesystems.
As a result, for recovery, you can either fall back to the backup,
rewriting the file from backup to the btrfs in question, or by action you
defined the data as too trivial to be worth backing up, so you can simply
delete the file in question and not worry about it.
The question then becomes one of finding out what file is involved, in
order to either delete it or recover it from backup. Keep in mind that
unlike most filesystems, inode numbers on btrfs are subvolume-specific,
so it's possible to have multiple inodes with the same inode number on
the filesystem, if you have multiple subvolumes. Thus, it's not as
simple as looking up what file that inode corresponds to, unless of
course you have only the primary/root subvolume, no others.
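
For that simple single-subvolume case, a plain tree walk matching on inode number is enough, much like find's -inum test. Here's a minimal Python sketch; the find_by_inode() name is made up, and it leans on my understanding (an assumption, do verify) that btrfs reports a different st_dev for each subvolume, which is what lets the walk stay inside the one subvolume it started in.

```python
import os


def find_by_inode(root, ino):
    """Return every path under root with inode number ino.

    Entries whose st_dev differs from root's are skipped, on the assumption
    that they belong to another subvolume or filesystem.
    """
    root_dev = os.lstat(root).st_dev
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for name in dirnames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_dev != root_dev:
                continue  # do not descend into a different device/subvolume
            kept.append(name)
            if st.st_ino == ino:
                hits.append(path)
        dirnames[:] = kept  # prune the walk in place
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_dev == root_dev and st.st_ino == ino:
                hits.append(path)
    return hits
```

With the numbers from your dmesg line, that would be find_by_inode("/mnt/volume", 9995246), where /mnt/volume is a stand-in for wherever the subvolume is mounted. Hard links show up as multiple paths.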
There are two ways to find what file corresponds to that inode on that
subvolume. One involves use of the btrfs debugging tools and is targeted
at devs. While I know this is possible and I've seen the method posted,
I'm not a dev, only a btrfs user and list regular, and I've not kept
track of the specifics, so I won't attempt to describe them further here.
The other one is btrfs scrub, which will systematically verify all
checksums on the filesystem, repairing errors where it's possible
(metadata in your case as it's raid1, assuming of course that the second
copy of the block isn't also bad), reporting those which it can't (the
raid0 data). Where it can't fix the problem, dmesg should name the
file with the problem (unless it's metadata and thus not a file, of
course).
Of course on 5 TB of data, scrub's going to take a while... likely over a
day and possibly two (5 TiB of data at 30 MiB/sec is about 48 hours; 30
MiB/sec might be a bit pessimistically slow but isn't out of real-world
range on spinning rust). Even on relatively fast (for spinning rust)
drives, 100 MiB/sec, you're looking at roughly 14.5 hours...
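
The arithmetic behind those estimates, as a small Python sketch. The scrub_hours() helper is just something I made up for the calculation, and it assumes a constant sustained read rate over the whole run, which real drives won't quite deliver.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4


def scrub_hours(data_tib, rate_mib_s):
    """Hours needed to read data_tib TiB at a steady rate_mib_s MiB/sec."""
    seconds = data_tib * TIB / (rate_mib_s * MIB)
    return seconds / 3600


for rate in (30, 100):
    print(f"{rate:>3} MiB/sec: {scrub_hours(5, rate):.1f} hours")
```

That works out to about 48.5 hours at 30 MiB/sec and about 14.6 hours at 100 MiB/sec.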
Tho because scrub checksum-verifies all blocks, it'll cover any problems
in other files and in metadata too, not just the one file.
FWIW, maintenance time is one of several reasons I use multiple smaller
btrfs on partitioned up devices, here, instead of a single huge multi-TB
btrfs. My btrfs are also all raid1 both data and metadata, save for
/boot (and its backup on the other device) which are both mixed-mode dup,
two copies on the same device, so there's always that second copy to pull
from to repair the failed one, if something fails checksum verification.
They also happen to be on SSD, with the largest btrfs on a pair of 24 GiB
partitions. As such, scrubs, balances, checks, etc, all take under 10
minutes per filesystem, with scrubs often completing in under a minute,
instead of the day or longer it's likely to take you for 5 TiB on
spinning rust. Of course I have more than one btrfs, and scrubbing them
all would take somewhat longer than the minute a single one needs, say a
half hour, but some of them aren't even routinely mounted, and my 8 GiB (per
device, two devices) btrfs raid1 / is mounted read-only by default, so it
too is unlikely to be damaged. As such, generally only 2-3 btrfs need
scrubbed at once and often it's only 1-2, and on fast SSD, I'm done in
under 5 minutes. /Much/ more feasible maintenance time than several
/days/! =:^)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman