linux-btrfs.vger.kernel.org archive mirror
From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: csum failed : d-raid0, m-raid1
Date: Sat, 30 Jan 2016 05:58:45 +0000 (UTC)	[thread overview]
Message-ID: <pan$75b98$f785acae$3724509e$fea963e1@cox.net> (raw)
In-Reply-To: CAAcrkYJmDHNijZx45cdfZNTgVCgvih_06WROGJg+pjYsQT-=tg@mail.gmail.com

John Smith posted on Fri, 29 Jan 2016 19:04:42 +0100 as excerpted:

> Hi
> 
> i built btrfs volume using 2x3tb brand new /tested for badblocks drives.
> I copied into volume around 5Tb of data.
> 
> I tried to read one file which is around 4GB and i got input / output
> error.
> 
> Dmesg contains:
> 
> [154159.040059] BTRFS warning (device sdd): csum failed ino 9995246 off
> 4506214400 csum 383964635 expected csum 6478505
> 
> 
> Any idea what is it? Whats the reason that this happened? Can I recover?

Btrfs crc32c-checksums all blocks on write, both data (except for data 
written while mounted nodatasum, and for files with the nocow attribute 
set) and metadata (always), and verifies those checksums on read.

The read-time csum verification failed on the block at that offset of 
the file.  Because your data is raid0, there's no second copy to fall 
back on as there would be for raid1 data, and no parity available to 
rebuild from as there would be for raid56 data.  The file can therefore 
only be read up to that point, and, if you skip the bad block and there 
are no further checksum failures, from beyond that point to the end of 
the file.
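If you want to salvage the readable portion before deleting the file, 
GNU dd can read past the failing block.  A minimal sketch; the paths 
would be your real file and a destination of your choosing, but here a 
scratch file stands in for the damaged one so the commands can be run 
anywhere:

```shell
# Scratch file standing in for the damaged one; on the real system,
# if= would be the path reported in dmesg (hypothetical stand-ins).
src=$(mktemp); dst=$(mktemp)
head -c 16384 /dev/urandom > "$src"
# conv=noerror keeps dd reading past a failing block instead of
# aborting; conv=sync zero-pads the unreadable block so the data
# after it stays at the correct offset in the copy.
dd if="$src" of="$dst" bs=4096 conv=noerror,sync 2>/dev/null
cmp -s "$src" "$dst" && echo "copy complete"
```

On the real file, the bad 4 KiB block comes back zero-filled, but 
everything after it is preserved at the right offset.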

Of course the sysadmin's first rule of backups, in its simplest form, is 
that if you don't have at least one backup, you are, by that failure to 
back up, defining the value of the data as less than the value of the 
time/hassle/resources you'd otherwise spend making the backup.  So you 
either have a backup to fall back on, or your data is self-defined by 
the lack of a backup as of only trivial value, not worth the trouble.

And of course btrfs, while stabilizing, isn't yet considered fully 
stable and mature, so that sysadmin's rule of backups applies even more 
strongly than it does to fully stable and mature filesystems.

As a result, recovery means either falling back to the backup, 
rewriting the file from it to the btrfs in question, or, since by your 
actions you defined the data as too trivial to be worth backing up, 
simply deleting the file in question and not worrying about it.

The question then becomes one of finding out which file is involved, in 
order to either delete it or restore it from backup.  Keep in mind that 
unlike on most filesystems, inode numbers on btrfs are 
subvolume-specific, so it's possible to have multiple inodes with the 
same number on the filesystem if you have multiple subvolumes.  Thus 
it's not as simple as looking up which file the inode corresponds to, 
unless of course you have only the primary/root subvolume and no others.

There are two ways to find which file corresponds to that inode on that 
subvolume.  One involves use of the btrfs debugging tools and is 
targeted at devs.  While I know this is possible and I've seen the 
method posted, I'm not a dev, only a btrfs user and list regular, and I 
haven't kept track of the specifics, so I won't attempt to describe it 
further here.
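That said, reasonably recent btrfs-progs do ship a user-facing helper: 
btrfs inspect-internal inode-resolve maps an inode number to path(s) 
within the subvolume mounted at a given point.  A sketch, with the 
mount point a hypothetical stand-in and availability depending on your 
btrfs-progs version:

```shell
# Pull the inode number out of the kernel warning, then resolve it.
# The log line is the one from the report above; /mnt/volume is a
# hypothetical mount point.
line='BTRFS warning (device sdd): csum failed ino 9995246 off 4506214400 csum 383964635 expected csum 6478505'
ino=$(printf '%s\n' "$line" | grep -oE 'ino [0-9]+' | cut -d' ' -f2)
echo "$ino"   # 9995246
# Run once per mounted subvolume, since inode numbers repeat across
# subvolumes (needs root and a btrfs-progs with inspect-internal):
#   btrfs inspect-internal inode-resolve "$ino" /mnt/volume
```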

The other is btrfs scrub, which systematically verifies all checksums 
on the filesystem, repairing errors where possible (the metadata in 
your case, as it's raid1, assuming of course that the second copy of 
the block isn't also bad) and reporting those it can't repair (the 
raid0 data).  Where it can't fix the problem, dmesg should name the 
affected file (unless the error is in metadata and thus not a file, of 
course).
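The scrub itself is two commands, and the unrepairable-error warnings 
it triggers carry the path directly, so it can be grepped out.  A 
sketch; the mount point and the sample log line's values are made up, 
though the line follows the general shape of real scrub warnings:

```shell
mnt=/mnt/volume                 # hypothetical mount point
# Start a scrub in the background, then poll its progress:
#   btrfs scrub start "$mnt"
#   btrfs scrub status "$mnt"
# Unrepairable data errors land in dmesg; scrub warnings include the
# path, e.g. (values fabricated, shape as in real warnings):
line='BTRFS warning (device sdd): checksum error at logical 812345344 on dev /dev/sdd, sector 1586612, root 5, inode 9995246, offset 4506214400, length 4096, links 1 (path: video/movie.mkv)'
path=$(printf '%s\n' "$line" | sed -n 's/.*(path: \(.*\))$/\1/p')
echo "$path"   # video/movie.mkv
```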

Of course, on 5 TiB of data a scrub is going to take a while... likely 
over a day and possibly two (5 TiB at 30 MiB/sec is about 48 hours; 30 
MiB/sec may be a bit pessimistically slow but isn't out of real-world 
range on spinning rust).  Even on relatively fast (for spinning rust) 
drives at 100 MiB/sec, you're looking at about 14 hours...
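The back-of-the-envelope arithmetic above is just total data divided by 
throughput:

```shell
# Scrub-time estimate: total data over sequential read rate.
data_tib=5
for rate_mib in 30 100; do
    # TiB -> MiB is *1024*1024; seconds -> hours is /3600.
    hours=$(( data_tib * 1024 * 1024 / rate_mib / 3600 ))
    echo "${data_tib} TiB at ${rate_mib} MiB/s: ~${hours} hours"
done
```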

Tho because scrub checksum-verifies all blocks, it'll cover any problems 
in other files and in metadata too, not just the one file.

FWIW, maintenance time is one of several reasons I use multiple smaller 
btrfs on partitioned-up devices here, instead of a single huge multi-TB 
btrfs.  My btrfs are also all raid1 for both data and metadata, save for 
/boot (and its backup on the other device), which are both mixed-mode 
dup, two copies on the same device, so there's always a second copy to 
pull from to repair the failed one if something fails checksum 
verification.  They also happen to be on SSD, with the largest btrfs on 
a pair of 24 GiB partitions.  As such, scrubs, balances, checks, etc, 
all take under 10 minutes per filesystem, with scrubs often completing 
in under a minute, instead of the day or longer it's likely to take you 
for 5 TiB on spinning rust.  Of course I have more btrfs, and it'd take 
me somewhat longer than that minute, say a half hour, to scrub them 
all, but some of them aren't even routinely mounted, and my 8 GiB (per 
device, two devices) btrfs raid1 / is mounted read-only by default, so 
it too is unlikely to be damaged.  As such, generally only 2-3 btrfs 
need scrubbing at once, and often it's only 1-2, and on fast SSD I'm 
done in under 5 minutes.  /Much/ more feasible maintenance time than 
several /days/! =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Thread overview: 4+ messages
2016-01-29 18:04 csum failed : d-raid0, m-raid1 John Smith
2016-01-29 18:13 ` John Smith
2016-01-29 20:26 ` Chris Murphy
2016-01-30  5:58 ` Duncan [this message]
