From: "Richard A. Lochner" <lochner@clone1.com>
To: Chris Murphy <lists@colorremedies.com>
Cc: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: BTRFS Data at Rest File Corruption
Date: Thu, 12 May 2016 23:49:17 -0500
Message-ID: <1463114957.3636.140.camel@clone1.com>
In-Reply-To: <CAJCQCtSSbv5dAC-uBN9RnYKKRMtr04KmLZVzhvAh7=Xq3ej7dQ@mail.gmail.com>

Chris,

See notes inline.

On Thu, 2016-05-12 at 19:41 -0600, Chris Murphy wrote:
> On Thu, May 12, 2016 at 11:49 AM, Richard A. Lochner <lochner@clone1.com> wrote:
> 
> > 
> > I suspected, and I still suspect, that the error occurred upon a
> > metadata update that corrupted the checksum for the file, probably
> > due to silent memory corruption.  If the checksum was silently
> > corrupted, it would simply be written to both drives, causing this
> > type of error.
> Metadata is checksummed independently of data. So if the data isn't
> updated, its checksum doesn't change, only metadata checksum is
> changed.
> > 
> > 
> > btrfs dmesg(s):
> > 
> > [16510.334020] BTRFS warning (device sdb1): checksum error at logical 3037444042752 on dev /dev/sdb1, sector 4988789496, root 259, inode 1437377, offset 75754369024, length 4096, links 1 (path: Rick/sda4.img)
> > [16510.334043] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
> > [16510.345662] BTRFS error (device sdb1): unable to fixup (regular) error at logical 3037444042752 on dev /dev/sdb1
> > 
> > [17606.978439] BTRFS warning (device sdb1): checksum error at logical 3037444042752 on dev /dev/sdc1, sector 4988750584, root 259, inode 1437377, offset 75754369024, length 4096, links 1 (path: Rick/sda4.img)
> > [17606.978460] BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 13, flush 0, corrupt 4, gen 0
> > [17606.989497] BTRFS error (device sdb1): unable to fixup (regular) error at logical 3037444042752 on dev /dev/sdc1
> This is confusing. Are these the same boot? The later time has a lower
> corrupt count. Can you just 'dd if=sda4.img of=/dev/null' and report
> all (new) messages in dmesg? It seems to me there should be pretty
> much all the same monotonic-time for the problem with both devices.

My apologies, they were from different boots.  After the dd, I get
these:

[109479.550836] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
[109479.596626] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
[109479.601969] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
[109479.602189] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
[109479.602323] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
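
A minimal sketch of how the repeated warnings above can be collapsed to confirm that they all describe the same block. It assumes the message format shown above (kernel 4.4) and a 4096-byte block size, taken from the "length 4096" in the earlier scrub messages. The name summarize_csum_failures is hypothetical, chosen only for this illustration.

```python
import re
from collections import Counter

# Pattern for the "csum failed" warnings as they appear above (kernel 4.4 format).
CSUM_RE = re.compile(
    r"csum failed ino (?P<ino>\d+)\s+off (?P<off>\d+)\s+"
    r"csum (?P<csum>\d+)\s+expected csum (?P<expected>\d+)"
)

BLOCK_SIZE = 4096  # assumption: taken from "length 4096" in the scrub messages


def summarize_csum_failures(log_text):
    """Count distinct (inode, offset, computed csum, expected csum) tuples."""
    # Join wrapped lines so a message split by the mail client still matches.
    flat = " ".join(log_text.split())
    counts = Counter()
    for m in CSUM_RE.finditer(flat):
        key = tuple(int(m.group(name)) for name in ("ino", "off", "csum", "expected"))
        counts[key] += 1
    return counts


# Two of the five messages, wrapped the way a mail client might wrap them.
sample = """
[109479.550836] BTRFS warning (device sdb1): csum failed ino 1437377
off 75754369024 csum 1689728329 expected csum 2165338402
[109479.596626] BTRFS warning (device sdb1): csum failed ino 1437377
off 75754369024 csum 1689728329 expected csum 2165338402
"""

for (ino, off, csum, expected), n in summarize_csum_failures(sample).items():
    print(f"inode {ino}, offset {off} (block {off // BLOCK_SIZE}): "
          f"{n} failures, csum {csum} != expected {expected}")
```

All five warnings name the same inode and offset with identical computed and expected checksums, so they are repeated reads of one 4096-byte block rather than five separate bad blocks.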
> 
> Also what do you get for these for each device:
> 
> smartctl -l scterc /dev/sdX
> cat /sys/block/sdX/device/timeout
> 
# smartctl -l scterc  /dev/sdb
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.4.8-300.fc23.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

# smartctl -l scterc  /dev/sdc
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.4.8-300.fc23.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

# cat /sys/block/sdb/device/timeout
30
# cat /sys/block/sdc/device/timeout
30
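
A short sketch of the comparison these two values are commonly collected for: the drive's SCT ERC limit should be shorter than the kernel's command timeout, so the drive reports a failed read before the kernel gives up on the command. That rationale is an assumption about why the values were requested; it is not stated in this message. The helper name erc_within_kernel_timeout is hypothetical, and the numbers are copied from the output above.

```python
def erc_within_kernel_timeout(erc_deciseconds, kernel_timeout_seconds):
    """Return True if the drive's ERC limit is shorter than the kernel command timer."""
    erc_seconds = erc_deciseconds / 10  # smartctl reports SCT ERC in units of 100 ms
    return erc_seconds < kernel_timeout_seconds


# Values copied from the output above: (read ERC, write ERC, kernel timeout in seconds).
devices = {
    "sdb": (70, 70, 30),
    "sdc": (70, 70, 30),
}

for name, (read_erc, write_erc, timeout) in devices.items():
    ok = (erc_within_kernel_timeout(read_erc, timeout)
          and erc_within_kernel_timeout(write_erc, timeout))
    print(f"{name}: read {read_erc / 10:.1f}s, write {write_erc / 10:.1f}s, "
          f"kernel timeout {timeout}s -> {'ok' if ok else 'ERC too long'}")
```

With 7.0 seconds against a 30 second timer, both drives pass this check.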
> 
