From: Hugo Mills <hugo@carfax.org.uk>
To: Oliver Freyermuth <o.freyermuth@googlemail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfs recovery
Date: Thu, 26 Jan 2017 10:00:38 +0000
Message-ID: <20170126100038.GE24076@carfax.org.uk>
In-Reply-To: <24f6cfb2-d008-af12-ad94-4a4da1be1ee2@googlemail.com>
On Thu, Jan 26, 2017 at 10:36:55AM +0100, Oliver Freyermuth wrote:
> Hi and thanks for the quick reply!
>
> Am 26.01.2017 um 10:25 schrieb Hugo Mills:
> > Can you post the output of "btrfs-debug-tree -b 35028992
> > /dev/sdb1", specifically the 5 or so entries around item 243. It is
> > quite likely that you have bad RAM, and the output will help confirm
> > that.
> >
>
> Since I did not find item 243 in the debug output at all, I uploaded the complete output of the debug-tree command here:
> http://pastebin.com/xM8qUnSx
It's on line 248 of the paste:
246. key (5547032576 EXTENT_ITEM 204800) block 596426752 (36403) gen 20441
247. key (5561905152 EXTENT_ITEM 184320) block 596443136 (36404) gen 20441
248. key (15606380089319694336 UNKNOWN.76 303104) block 596459520 (36405) gen 20441
249. key (5726711808 EXTENT_ITEM 524288) block 596475904 (36406) gen 20441
250. key (5820571648 EXTENT_ITEM 524288) block 350322688 (21382) gen 20427
I was wrong in my assumption: this isn't a simple bitflip. It looks
like a small random write of data over the item key. That's not to say
that bad hardware isn't the culprit -- it's worth checking anyway --
but it could also be a bug in... well, almost anything.
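
A rough way to check the "not a simple bitflip" reading: flip each of
the 64 bits of the bad first field in turn and see whether any result
sorts between the neighbouring keys. This is only a sketch; the three
numbers come from lines 247-249 of the paste, and the function and
variable names are made up for illustration.

```python
# Values copied from lines 247, 248 and 249 of the paste.
PREV_OBJECTID = 5561905152
BAD_OBJECTID = 15606380089319694336
NEXT_OBJECTID = 5726711808


def single_bitflip_candidates(bad, lo, hi, width=64):
    """Return every value one bit away from `bad` that lies strictly
    between `lo` and `hi` (hypothetical helper, not part of btrfs-progs)."""
    found = []
    for bit in range(width):
        flipped = bad ^ (1 << bit)
        if lo < flipped < hi:
            found.append(flipped)
    return found


candidates = single_bitflip_candidates(BAD_OBJECTID, PREV_OBJECTID, NEXT_OBJECTID)
print(candidates)  # → []

# Same idea for the type byte: UNKNOWN.76 versus EXTENT_ITEM (0xa8).
type_bits_differing = bin(76 ^ 0xa8).count("1")
print(type_bits_differing)  # → 4
```

No single flipped bit puts the key back in order, and the type byte is
four bits away from EXTENT_ITEM, which fits a small overwrite better
than one flipped bit.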
It's not corruption on the disk, because that would be caught by
the checksum mechanism. This data was corrupted in RAM, before it was
checksummed and written to disk. That could have happened because
some rogue piece of kernel code wrote to an incorrect address, or
because some _other_ memory corruption altered an address which was
later used as the target of a write.
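
The difference between the two cases can be sketched in a few lines.
This is an illustration only: zlib.crc32 is a stand-in (btrfs uses
crc32c), and the 64-byte "block" and the overwritten bytes are
invented for the example.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in only: plain CRC-32 from zlib, not the crc32c btrfs uses.
    return zlib.crc32(block)


good = bytes(range(64))  # made-up block contents

# Case 1: the buffer is damaged in RAM, then checksummed and written.
in_ram = bytearray(good)
in_ram[6:8] = b"\x95\xd8"              # stray write lands in the buffer
stored_block = bytes(in_ram)
stored_csum = checksum(stored_block)   # checksum covers the bad data
ram_case_verifies = checksum(stored_block) == stored_csum

# Case 2: the block is checksummed and written, then damaged on disk.
stored_csum_2 = checksum(good)
on_disk = bytearray(good)
on_disk[6:8] = b"\x95\xd8"
disk_case_verifies = checksum(bytes(on_disk)) == stored_csum_2

print(ram_case_verifies, disk_case_verifies)  # → True False
```

In the first case the checksum is computed over data that is already
wrong, so it verifies cleanly on every later read.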
Looking at the data, I think this should be manually fixable, with
sufficient effort (and a hex editor).
Looking at the first field of the bad key (the objectid):
>>> hex(15606380089319694336)
'0xd89500014da12000'
Compared to the same field in the preceding key:
>>> hex(5561905152)
'0x14b83f000'
It looks like it's just the top couple of bytes in this field that are
affected, so those (d8, 95) can be zeroed. The second field should
clearly be EXTENT_ITEM, which is 0xa8. The offset field (the third
one) looks OK to me -- the bottom byte is 0.
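
As a quick sanity check of that proposal (a sketch, with variable names
chosen for illustration): clear the top two bytes and confirm the
result sorts between the keys on either side, using the values from
lines 247-249 of the paste.

```python
PREV_OBJECTID = 5561905152
BAD_OBJECTID = 15606380089319694336
NEXT_OBJECTID = 5726711808

# Zero the top two bytes (0xd8, 0x95) of the 64-bit field.
LOW_48_BITS = (1 << 48) - 1
fixed_objectid = BAD_OBJECTID & LOW_48_BITS

print(hex(fixed_objectid))  # → 0x14da12000
print(fixed_objectid)       # → 5597372416

# The repaired key should sort between its neighbours.
in_order = PREV_OBJECTID < fixed_objectid < NEXT_OBJECTID
print(in_order)             # → True

# Third field: the bottom byte being 0 is implied by 4 KiB alignment.
offset = 303104
print(offset % 4096)        # → 0
```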
We can probably talk you through fixing this by hand with a decent
hex editor. I've done it before...
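
To give an idea of what to look for in the hex editor, here is a
sketch that packs the bad key and the proposed replacement the way a
key is laid out on disk. It assumes the usual packed little-endian
layout of a btrfs disk key (64-bit objectid, 8-bit type, 64-bit
offset, 17 bytes in total) and Python 3.8 or later for bytes.hex()
with a separator; disk_key_bytes is a made-up helper name.

```python
import struct


def disk_key_bytes(objectid, key_type, offset):
    """Pack a key as little-endian u64, u8, u64 with no padding
    (assumed on-disk layout; hypothetical helper)."""
    return struct.pack("<QBQ", objectid, key_type, offset)


bad = disk_key_bytes(15606380089319694336, 76, 303104)
fixed = disk_key_bytes(5597372416, 0xA8, 303104)

print(bad.hex(" "))
# → 00 20 a1 4d 01 00 95 d8 4c 00 a0 04 00 00 00 00 00
print(fixed.hex(" "))
# → 00 20 a1 4d 01 00 00 00 a8 00 a0 04 00 00 00 00 00

# Byte positions within the key that would need editing.
changed = [i for i, (a, b) in enumerate(zip(bad, fixed)) if a != b]
print(changed)  # → [6, 7, 8]
```

Only three bytes differ. Since the checksum covers the block contents,
it would no longer match after such an edit until it is recomputed.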
> > Check and fix your hardware first. :)
> >
> > If it is bad RAM, then the error is likely to be a simple bitflip,
> > and there are patches for btrfs check which will fix those in most
> > cases.
>
> I'll schedule a memcheck as soon as I can turn off the machine for a while,
> which sadly may be a week or so in the future from now...
Bear in mind that if it is unreliable hardware, then continued use
of the FS in read-write operation is likely to cause additional
damage.
Hugo.
--
Hugo Mills             | This: Rock. You throw rock.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |              Graeme Swann on fast bowlers