From: Hugo Mills <hugo@carfax.org.uk>
To: Claes Fransson <claes.v.fransson@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: bad key ordering - repairable?
Date: Mon, 22 Jan 2018 21:22:50 +0000
Message-ID: <20180122212250.GY3807@carfax.org.uk>
In-Reply-To: <CAEY8F1qw-6Xa+ESJH0X3zhJcQ1UaoJO4wkPjdDt63JEYHBuAoQ@mail.gmail.com>
On Mon, Jan 22, 2018 at 10:06:58PM +0100, Claes Fransson wrote:
> Hi!
>
> I really like the features of BTRFS, especially deduplication,
> snapshotting and checksumming. However, when using it on my laptop the
> last couple of years, it has became corrupted a lot of times.
> Sometimes I have managed to fix the problems (at least so much that I
> can continue to use the filesystem) with check --repair, but several
> times I had to recreate the file system and reinstall the operating
> system.
>
> I am guessing the corruptions might be the results of unclean
> shutdowns, mostly after system hangs, but also because of running out
> of battery sometimes?
> Furthermore, the power-led has recently started blinking (also when
> the power-cable is plugged in), I guess because of an old and bad
> battery. Maybe the current corruption also can have something to do
> with this? However I almost always run with power cable plugged in in
> last year, only on battery a few seconds a few times when moving the
> laptop.
>
> Currently, I can only mount the filesystem readonly, it goes readonly
> automatically if I try to mount it normally.
>
> When booting an OpenSUSE Tumbleweed-20180119 live-iso:
> localhost:~ # uname -r
> 4.14.13-1-default
> localhost:~ # btrfs --version
> btrfs-progs v4.14.1
>
> localhost:~ # btrfs check -p /dev/sda12
> Checking filesystem on /dev/sda12
[fixing up bad paste]
> UUID: d2819d5a-fd69-484b-bf34-f2b5692cbe1f
> bad key ordering 159 160 bad block 690436964352
> ERROR: errors found in extent allocation tree or chunk allocation
> checking free space cache [.]
> checking fs roots [o]
> checking csums
> bad key ordering 159 160
> Error looking up extent record -1
[snip]
> localhost:~ # btrfs inspect-internal dump-tree -b 690436964352
> /dev/sda12
> btrfs-progs v4.14.1
> leaf 690436964352 items 170 free space 1811 generation 196864 owner 2
> leaf 690436964352 flags 0x1(WRITTEN) backref revision 1
> fs uuid d2819d5a-fd69-484b-bf34-f2b5692cbe1f
> chunk uuid 52f81fe6-893b-4432-9336-895057ee81e1
> .
> .
> .
> item 157 key (22732500992 EXTENT_ITEM 16384) itemoff 6538 itemsize 53
> refs 1 gen 821 flags DATA
> extent data backref root 287 objectid 51665 offset 0 count 1
> item 158 key (22732517376 EXTENT_ITEM 16384) itemoff 6485 itemsize 53
> refs 1 gen 821 flags DATA
> extent data backref root 287 objectid 51666 offset 0 count 1
> item 159 key (22732533760 EXTENT_ITEM 16384) itemoff 6485 itemsize 0
> print-tree.c:428: print_extent_item: BUG_ON `item_size != sizeof(*ei0)` triggered, value 1
> btrfs(+0x365c6)[0x55bdfaada5c6]
> btrfs(print_extent_item+0x424)[0x55bdfaadb284]
> btrfs(btrfs_print_leaf+0x94e)[0x55bdfaadbc1e]
> btrfs(btrfs_print_tree+0x295)[0x55bdfaadcf05]
> btrfs(cmd_inspect_dump_tree+0x734)[0x55bdfab1b024]
> btrfs(main+0x7d)[0x55bdfaac7d4d]
> /lib64/libc.so.6(__libc_start_main+0xea)[0x7ff42100ff4a]
> btrfs(_start+0x2a)[0x55bdfaac7e5a]
> Aborted (core dumped)
Wow, I've never seen dump-tree do that before. That output is the
next thing I'd have asked for, so it's good you've preempted it.
The main thing is that bad key ordering is almost always due to RAM
corruption. That's either bad RAM, or dodgy power regulation -- the
latter could be the PSU, or capacitors on the motherboard. (In this
case, it might also be something funny with the battery).
I would definitely recommend a long run of memtest86. At least 8
hours, preferably 24. If you get errors repeatedly in the same place,
it's the RAM. If they appear randomly, it's probably the power
regulation.
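
For anyone wondering what "bad key ordering" actually means: the items
in a btrfs leaf are indexed by (objectid, type, offset) keys, and the
keys must be strictly ascending. Here is a minimal sketch of that check,
and of how a single flipped bit in RAM is enough to break it. The keys
for items 157-159 are copied from the dump above. The key for item 160
is NOT in the dump (the tool crashed first), so the value below is made
up purely for illustration, as are the helper names. This is not the
code btrfs check runs.

```python
from typing import List, Optional, Tuple

# A btrfs key is (objectid, type, offset); keys compare in that order,
# which is exactly how Python compares tuples.
Key = Tuple[int, int, int]

EXTENT_ITEM = 168  # numeric value of BTRFS_EXTENT_ITEM_KEY


def first_bad_slot(keys: List[Key], first_slot: int = 0) -> Optional[Tuple[int, int]]:
    """Return the first pair of adjacent slots whose keys are not
    strictly ascending, or None if the whole run is ordered."""
    for i in range(len(keys) - 1):
        if not keys[i] < keys[i + 1]:
            return (first_slot + i, first_slot + i + 1)
    return None


def flip_bit(value: int, bit: int) -> int:
    """Simulate a single-bit memory error."""
    return value ^ (1 << bit)


# Items 157-159, as printed by dump-tree above.
keys_157_159: List[Key] = [
    (22732500992, EXTENT_ITEM, 16384),
    (22732517376, EXTENT_ITEM, 16384),
    (22732533760, EXTENT_ITEM, 16384),
]

# HYPOTHETICAL item 160: simply the next 16 KiB extent. The real key is unknown.
good_160: Key = (22732550144, EXTENT_ITEM, 16384)

# The same key after one bit (bit 34) of the objectid flips in memory.
bad_160: Key = (flip_bit(good_160[0], 34), EXTENT_ITEM, 16384)

print(first_bad_slot(keys_157_159 + [good_160], first_slot=157))  # None
print(first_bad_slot(keys_157_159 + [bad_160], first_slot=157))   # (159, 160)
```

One flipped bit turns a perfectly sensible key into one that sorts
before its neighbour, and the leaf then gets written to disk with a
valid checksum over the bad contents. That's why this kind of error
points at RAM or power rather than at the disk.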
[snip]
>
> The filesystem had become pretty full, I had planned to increase the
> Btrfs-partition size before it became corrupt.
>
> Active kernel when the filesystem went read only: OpenSUSE Linux
> 4.14.14-1.geef6178-default, from the
> http://download.opensuse.org/repositories/Kernel:/stable/standard/stable
> repository.
>
> Fstab mount options: noatime,autodefrag (I have been using the option
> nossd with older kernels one period in the past on the filesystem).
>
> If it matters, I have been running duperemove many times on the
> filesystem since creation.
>
> To test the RAM, I have been running mprime Blend-test for 24 hours
> after the corruption without any error or warning.
Of all of the bad key order errors I've seen (dozens), I think
there were a whole two which turned out not to be obviously related to
corrupt RAM. I still say that it's most likely the hardware.
> Is there a way I can try to repair this filesystem without the need to
> recreate it and reinstall the operating system? A reinstall including
> all currently installed packages, and restoring all current system
> settings, would probably take some time for me to do.
> If it is currently not repairable, it would be nice if this kind of
> corruption could be repaired in the future, even if losing a few
> files. Or if the corruptions could be avoided in the first place.
Given that the current tools crash, the answer's a definite
no. However, if you can get a developer interested, they may be able
to write a fix for it, given an image of the FS (using btrfs-image).
[snip]
> I have never noticed any corruptions on the NTFS and Ext4 file systems
> on the laptop, only on the Btrfs file systems.
You've never _noticed_ them. :)
Hugo.
--
Hugo Mills             | ... one ping(1) to rule them all, and in the
hugo@... carfax.org.uk | darkness bind(2) them.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                          Illiad