* bad key ordering - repairable?
@ 2018-01-22 21:06 Claes Fransson
2018-01-22 21:22 ` Hugo Mills
2018-01-23 2:35 ` Chris Murphy
0 siblings, 2 replies; 17+ messages in thread
From: Claes Fransson @ 2018-01-22 21:06 UTC (permalink / raw)
To: linux-btrfs
Hi!
I really like the features of BTRFS, especially deduplication,
snapshotting and checksumming. However, when using it on my laptop the
last couple of years, it has become corrupted a lot of times.
Sometimes I have managed to fix the problems (at least so much that I
can continue to use the filesystem) with check --repair, but several
times I had to recreate the file system and reinstall the operating
system.
I am guessing the corruptions might be the results of unclean
shutdowns, mostly after system hangs, but also because of running out
of battery sometimes?
Furthermore, the power-led has recently started blinking (also when
the power-cable is plugged in), I guess because of an old and bad
battery. Maybe the current corruption also can have something to do
with this? However, during the last year I have almost always run with
the power cable plugged in, only on battery for a few seconds a few
times when moving the laptop.
Currently, I can only mount the filesystem readonly, it goes readonly
automatically if I try to mount it normally.
When booting an OpenSUSE Tumbleweed-20180119 live-iso:
localhost:~ # uname -r
4.14.13-1-default
localhost:~ # btrfs --version
btrfs-progs v4.14.1
localhost:~ # btrfs check -p /dev/sda12
Checking filesystem on /dev/sda12
UUID: d2819d5a-fd69-484b-bf34-f2b5692cbe1f
bad key ordering 159 160
bad block 690436964352
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache [.]
checking fs roots [o]
checking csums
bad key ordering 159 160
Error looking up extent record -1
Right section didn't have a record
There are no extents for csum range 22732550144-24923615232
Csum exists for 16303538176-24923615232 but there is no extent record
ERROR: errors found in csum tree
found 344063430663 bytes used, error(s) found
total csum bytes: 0
total tree bytes: 453410816
total fs tree bytes: 0
total extent tree bytes: 452952064
btree space waste bytes: 140165932
file data blocks allocated: 108462080
 referenced 108462080
localhost:~ # btrfs inspect-internal dump-tree -b 690436964352 /dev/sda12
btrfs-progs v4.14.1
leaf 690436964352 items 170 free space 1811 generation 196864 owner 2
leaf 690436964352 flags 0x1(WRITTEN) backref revision 1
fs uuid d2819d5a-fd69-484b-bf34-f2b5692cbe1f
chunk uuid 52f81fe6-893b-4432-9336-895057ee81e1
.
.
.
item 157 key (22732500992 EXTENT_ITEM 16384) itemoff 6538 itemsize 53
refs 1 gen 821 flags DATA
extent data backref root 287 objectid 51665 offset 0 count 1
item 158 key (22732517376 EXTENT_ITEM 16384) itemoff 6485 itemsize 53
refs 1 gen 821 flags DATA
extent data backref root 287 objectid 51666 offset 0 count 1
item 159 key (22732533760 EXTENT_ITEM 16384) itemoff 6485 itemsize 0
print-tree.c:428: print_extent_item: BUG_ON `item_size != sizeof(*ei0)` triggered, value 1
btrfs(+0x365c6)[0x55bdfaada5c6]
btrfs(print_extent_item+0x424)[0x55bdfaadb284]
btrfs(btrfs_print_leaf+0x94e)[0x55bdfaadbc1e]
btrfs(btrfs_print_tree+0x295)[0x55bdfaadcf05]
btrfs(cmd_inspect_dump_tree+0x734)[0x55bdfab1b024]
btrfs(main+0x7d)[0x55bdfaac7d4d]
/lib64/libc.so.6(__libc_start_main+0xea)[0x7ff42100ff4a]
btrfs(_start+0x2a)[0x55bdfaac7e5a]
Aborted (core dumped)
check --repair hangs after reporting "bad key ordering 159 160" with
no disk activity but constant high cpu usage.
localhost:~ # smartctl -a /dev/sda
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.14.13-1-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: SanDisk SD8SB8U1T001122
Serial Number: 163076421231
LU WWN Device Id: 5 001b44 4a4dde388
Firmware Version: X4140000
User Capacity: 1,024,209,543,168 bytes [1.02 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Jan 22 15:28:46 2018 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x11) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 10) minutes.
SMART Attributes Data Structure revision number: 4
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always   -           0
  9 Power_On_Hours          0x0032   100   100   ---    Old_age   Always   -           7692
 12 Power_Cycle_Count       0x0032   100   100   ---    Old_age   Always   -           496
165 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           1112516724361
166 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           1
167 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           25
168 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           44
169 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           753
170 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           0
171 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           0
172 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           0
173 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           18
174 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           57
184 End-to-End_Error        0x0032   100   100   ---    Old_age   Always   -           0
187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always   -           0
188 Command_Timeout         0x0032   100   100   ---    Old_age   Always   -           1
194 Temperature_Celsius     0x0022   061   062   ---    Old_age   Always   -           39 (Min/Max 9/62)
199 UDMA_CRC_Error_Count    0x0032   100   100   ---    Old_age   Always   -           0
230 Unknown_SSD_Attribute   0x0032   100   100   ---    Old_age   Always   -           4733091251278
232 Available_Reservd_Space 0x0033   100   100   004    Pre-fail  Always   -           100
233 Media_Wearout_Indicator 0x0032   100   100   ---    Old_age   Always   -           19202
234 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always   -           32167
241 Total_LBAs_Written      0x0030   253   253   ---    Old_age   Offline  -           22520
242 Total_LBAs_Read         0x0030   253   253   ---    Old_age   Offline  -           183882
244 Unknown_Attribute       0x0032   000   100   ---    Old_age   Always   -           0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error  00%        7570             -
# 2  Extended offline    Completed without error  00%        7395             -
# 3  Extended offline    Completed without error  00%        6253             -
# 4  Short offline       Completed without error  00%        4030             -
# 5  Extended offline    Completed without error  00%        1568             -
# 6  Extended offline    Completed without error  00%        1434             -
Selective Self-tests/Logging not supported
localhost:~ # btrfs fi usage /mnt
Overall:
Device size: 450.00GiB
Device allocated: 424.04GiB
Device unallocated: 25.96GiB
Device missing: 0.00B
Used: 420.38GiB
Free (estimated): 27.39GiB (min: 27.39GiB)
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 512.00MiB (used: 0.00B)
Data,single: Size:411.98GiB, Used:410.55GiB
/dev/sda12 411.98GiB
Metadata,single: Size:12.00GiB, Used:9.83GiB
/dev/sda12 12.00GiB
System,single: Size:64.00MiB, Used:64.00KiB
/dev/sda12 64.00MiB
Unallocated:
/dev/sda12 25.96GiB
The filesystem had become pretty full, I had planned to increase the
Btrfs-partition size before it became corrupt.
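As a cross-check of how full the filesystem was, the summary lines of
the usage report above can be recomputed from the per-profile lines.
This is only a sketch: the variable names are made up for illustration,
and the "free (estimated)" formula is an assumption that applies to a
single-device filesystem with data ratio 1.00, as reported here.

```python
# Figures copied from the `btrfs fi usage` output above, all in GiB.
GIB_PER_MIB = 1 / 1024
GIB_PER_KIB = 1 / (1024 * 1024)

device_size = 450.00
unallocated = 25.96
data_size, data_used = 411.98, 410.55
meta_size, meta_used = 12.00, 9.83
system_size, system_used = 64.00 * GIB_PER_MIB, 64.00 * GIB_PER_KIB

# Space handed out to chunks, and space actually occupied inside them.
allocated = data_size + meta_size + system_size
used = data_used + meta_used + system_used

# Assumed formula (data ratio 1.00): unallocated space plus the slack
# still left inside the already-allocated data chunks.
free_estimated = unallocated + (data_size - data_used)

print(f"allocated:      {allocated:.2f} GiB")
print(f"used:           {used:.2f} GiB")
print(f"free estimated: {free_estimated:.2f} GiB")
print(f"data chunks:    {100 * data_used / data_size:.1f}% full")
print(f"metadata:       {100 * meta_used / meta_size:.1f}% full")
```

The first three values round to the 424.04GiB, 420.38GiB and 27.39GiB
that the tool printed, so the report is at least internally consistent.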
Active kernel when the filesystem went read only: OpenSUSE Linux
4.14.14-1.geef6178-default, from the
http://download.opensuse.org/repositories/Kernel:/stable/standard/stable
repository.
Fstab mount options: noatime,autodefrag (I have been using the option
nossd with older kernels one period in the past on the filesystem).
If it matters, I have been running duperemove many times on the
filesystem since creation.
To test the RAM, I have been running mprime Blend-test for 24 hours
after the corruption without any error or warning.
Is there a way I can try to repair this filesystem without the need to
recreate it and reinstall the operating system? A reinstall including
all currently installed packages, and restoring all current system
settings, would probably take some time for me to do.
If it is currently not repairable, it would be nice if this kind of
corruption could be repaired in the future, even if losing a few
files. Or if the corruptions could be avoided in the first place.
Laptop: Asus N56JR-S4075H, bought new in 2014
Hard drive: for the last 14 months a SanDisk X400 SD8SB8U1T001122 1TB
SSD; originally a Seagate ST750LM000 SSHD
RAM (from lshw):
*-memory
description: System Memory
physical id: c
slot: System board or motherboard
size: 12GiB
*-bank:0
description: SODIMM DDR3 Synchronous 1600 MHz (0,6 ns)
product: ASU16D3LS1KBG/4G
vendor: Kingston
physical id: 0
serial: C32D5655
slot: ChannelA-DIMM0
size: 4GiB
width: 64 bits
clock: 1600MHz (0.6ns)
*-bank:1
description: DIMM [empty]
product: [Empty]
vendor: [Empty]
physical id: 1
serial: [Empty]
slot: ChannelA-DIMM1
*-bank:2
description: SODIMM DDR3 Synchronous 1600 MHz (0,6 ns)
product: M471B1G73QH0-YK0
vendor: Samsung
physical id: 2
serial: 1519AD27
slot: ChannelB-DIMM0
size: 8GiB
width: 64 bits
clock: 1600MHz (0.6ns)
*-bank:3
description: DIMM [empty]
product: [Empty]
vendor: [Empty]
physical id: 3
serial: [Empty]
slot: ChannelB-DIMM1
CPU: Intel(R) Core(TM) i7-4700HQ CPU @ 2.40GHz
BIOS version: N56JRH.202
SSD Partitions (among others): Btrfs with OpenSUSE Tumbleweed
installation, NTFS with Windows 10, Ext4 with Fedora installation.
I have never noticed any corruptions on the NTFS and Ext4 file systems
on the laptop, only on the Btrfs file systems.
Best regards,
Claes Fransson
* Re: bad key ordering - repairable?
2018-01-22 21:06 bad key ordering - repairable? Claes Fransson
@ 2018-01-22 21:22 ` Hugo Mills
2018-01-23 13:06 ` Claes Fransson
2018-01-23 2:35 ` Chris Murphy
1 sibling, 1 reply; 17+ messages in thread
From: Hugo Mills @ 2018-01-22 21:22 UTC (permalink / raw)
To: Claes Fransson; +Cc: linux-btrfs
On Mon, Jan 22, 2018 at 10:06:58PM +0100, Claes Fransson wrote:
> Hi!
>
> I really like the features of BTRFS, especially deduplication,
> snapshotting and checksumming. However, when using it on my laptop the
> last couple of years, it has become corrupted a lot of times.
> Sometimes I have managed to fix the problems (at least so much that I
> can continue to use the filesystem) with check --repair, but several
> times I had to recreate the file system and reinstall the operating
> system.
>
> I am guessing the corruptions might be the results of unclean
> shutdowns, mostly after system hangs, but also because of running out
> of battery sometimes?
> Furthermore, the power-led has recently started blinking (also when
> the power-cable is plugged in), I guess because of an old and bad
> battery. Maybe the current corruption also can have something to do
> with this? However I almost always run with power cable plugged in in
> last year, only on battery a few seconds a few times when moving the
> laptop.
>
> Currently, I can only mount the filesystem readonly, it goes readonly
> automatically if I try to mount it normally.
>
> When booting an OpenSUSE Tumbleweed-20180119 live-iso:
> localhost:~ # uname -r
> 4.14.13-1-default
> localhost:~ # btrfs --version
> btrfs-progs v4.14.1
>
> localhost:~ # btrfs check -p /dev/sda12
> Checking filesystem on /dev/sda12
[fixing up bad paste]
> UUID: d2819d5a-fd69-484b-bf34-f2b5692cbe1f
> bad key ordering 159 160 bad block 690436964352
> ERROR: errors found in extent allocation tree or chunk allocation
> checking free space cache [.]
> checking fs roots [o]
> checking csums
> bad key ordering 159 160
> Error looking up extent record -1
[snip]
> localhost:~ # btrfs inspect-internal dump-tree -b 690436964352
> /dev/sda12
> btrfs-progs v4.14.1
> leaf 690436964352 items 170 free space 1811 generation 196864 owner 2
> leaf 690436964352 flags 0x1(WRITTEN) backref revision 1
> fs uuid d2819d5a-fd69-484b-bf34-f2b5692cbe1f
> chunk uuid 52f81fe6-893b-4432-9336-895057ee81e1
> .
> .
> .
> item 157 key (22732500992 EXTENT_ITEM 16384) itemoff 6538 itemsize 53
> refs 1 gen 821 flags DATA
> extent data backref root 287 objectid 51665 offset 0 count 1
> item 158 key (22732517376 EXTENT_ITEM 16384) itemoff 6485 itemsize 53
> refs 1 gen 821 flags DATA
> extent data backref root 287 objectid 51666 offset 0 count 1
> item 159 key (22732533760 EXTENT_ITEM 16384) itemoff 6485 itemsize 0
> print-tree.c:428: print_extent_item: BUG_ON `item_size != sizeof(*ei0)` triggered, value 1
> btrfs(+0x365c6)[0x55bdfaada5c6]
> btrfs(print_extent_item+0x424)[0x55bdfaadb284]
> btrfs(btrfs_print_leaf+0x94e)[0x55bdfaadbc1e]
> btrfs(btrfs_print_tree+0x295)[0x55bdfaadcf05]
> btrfs(cmd_inspect_dump_tree+0x734)[0x55bdfab1b024]
> btrfs(main+0x7d)[0x55bdfaac7d4d]
> /lib64/libc.so.6(__libc_start_main+0xea)[0x7ff42100ff4a]
> btrfs(_start+0x2a)[0x55bdfaac7e5a]
> Aborted (core dumped)
Wow, I've never seen it do that before. It's the next thing I'd
have asked for, so it's good you've preempted it.
The main thing is that bad key ordering is almost always due to RAM
corruption. That's either bad RAM, or dodgy power regulation -- the
latter could be the PSU, or capacitors on the motherboard. (In this
case, it might also be something funny with the battery).
I would definitely recommend a long run of memtest86. At least 8
hours, preferably 24. If you get errors repeatedly in the same place,
it's the RAM. If they appear randomly, it's probably the power
regulation.
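To make the leaf damage concrete, here is a small sketch of the kind of
consistency checks that trip over the quoted dump. It is not the
btrfs-progs code. The Item type and check_leaf() are hypothetical
names, and the 24-byte minimum assumes an extent item starts with three
64-bit fields (refs, generation, flags). Only items 157 to 159 are used,
because item 160 never appears in the posted output.

```python
from typing import List, NamedTuple

EXTENT_ITEM_KEY = 168        # numeric key type behind "EXTENT_ITEM"
EXTENT_ITEM_MIN_SIZE = 24    # assumed: refs + generation + flags, 8 bytes each


class Item(NamedTuple):
    slot: int
    objectid: int
    key_type: int
    offset: int
    itemoff: int
    itemsize: int


def check_leaf(items: List[Item]) -> List[str]:
    """Return human-readable problems found in a list of leaf items."""
    problems = []
    for item in items:
        # An extent item cannot be smaller than its fixed header.
        if item.key_type == EXTENT_ITEM_KEY and item.itemsize < EXTENT_ITEM_MIN_SIZE:
            problems.append(
                f"slot {item.slot}: itemsize {item.itemsize} is below "
                f"the {EXTENT_ITEM_MIN_SIZE}-byte minimum"
            )
    for prev, cur in zip(items, items[1:]):
        # Keys must be strictly increasing as (objectid, type, offset).
        if (cur.objectid, cur.key_type, cur.offset) <= (
            prev.objectid, prev.key_type, prev.offset
        ):
            problems.append(f"bad key ordering {prev.slot} {cur.slot}")
        # Item data is packed back to front: each item ends where the
        # previous item's data begins.
        if cur.itemoff + cur.itemsize != prev.itemoff:
            problems.append(f"slot {cur.slot}: data does not abut slot {prev.slot}")
    return problems


# Values copied from the dump-tree output quoted above.
leaf = [
    Item(157, 22732500992, EXTENT_ITEM_KEY, 16384, 6538, 53),
    Item(158, 22732517376, EXTENT_ITEM_KEY, 16384, 6485, 53),
    Item(159, 22732533760, EXTENT_ITEM_KEY, 16384, 6485, 0),
]
problems = check_leaf(leaf)
for problem in problems:
    print(problem)
```

With only these three items the sketch flags the zero-sized item 159,
which is the item the dump-tree BUG_ON fired on. Reproducing the "bad
key ordering 159 160" message itself would need the key of item 160.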
[snip]
>
> The filesystem had become pretty full, I had planned to increase the
> Btrfs-partition size before it became corrupt.
>
> Active kernel when the filesystem went read only: OpenSUSE Linux
> 4.14.14-1.geef6178-default, from the
> http://download.opensuse.org/repositories/Kernel:/stable/standard/stable
> repository.
>
> Fstab mount options: noatime,autodefrag (I have been using the option
> nossd with older kernels one period in the past on the filesystem).
>
> If it matters, I have been running duperemove many times on the
> filesystem since creation.
>
> To test the RAM, I have been running mprime Blend-test for 24 hours
> after the corruption without any error or warning.
Of all of the bad key order errors I've seen (dozens), I think
there were a whole two which turned out not to be obviously related to
corrupt RAM. I still say that it's most likely the hardware.
> Is there a way I can try to repair this filesystem without the need to
> recreate it and reinstall the operating system? A reinstall including
> all currently installed packages, and restoring all current system
> settings, would probably take some time for me to do.
> If it is currently not repairable, it would be nice if this kind of
> corruption could be repaired in the future, even if losing a few
> files. Or if the corruptions could be avoided in the first place.
Given that the current tools crash, the answer's a definite
no. However, if you can get a developer interested, they may be able
to write a fix for it, given an image of the FS (using btrfs-image).
[snip]
> I have never noticed any corruptions on the NTFS and Ext4 file systems
> on the laptop, only on the Btrfs file systems.
You've never _noticed_ them. :)
Hugo.
--
Hugo Mills | ... one ping(1) to rule them all, and in the
hugo@... carfax.org.uk | darkness bind(2) them.
http://carfax.org.uk/ |
PGP: E2AB1DE4 | Illiad
* Re: bad key ordering - repairable?
2018-01-22 21:06 bad key ordering - repairable? Claes Fransson
2018-01-22 21:22 ` Hugo Mills
@ 2018-01-23 2:35 ` Chris Murphy
2018-01-23 12:51 ` Austin S. Hemmelgarn
2018-01-23 13:17 ` Claes Fransson
1 sibling, 2 replies; 17+ messages in thread
From: Chris Murphy @ 2018-01-23 2:35 UTC (permalink / raw)
To: Claes Fransson; +Cc: Btrfs BTRFS
On Mon, Jan 22, 2018 at 2:06 PM, Claes Fransson
<claes.v.fransson@gmail.com> wrote:
> Hi!
>
> I really like the features of BTRFS, especially deduplication,
> snapshotting and checksumming. However, when using it on my laptop the
> last couple of years, it has become corrupted a lot of times.
> Sometimes I have managed to fix the problems (at least so much that I
> can continue to use the filesystem) with check --repair, but several
> times I had to recreate the file system and reinstall the operating
> system.
>
> I am guessing the corruptions might be the results of unclean
> shutdowns, mostly after system hangs, but also because of running out
> of battery sometimes?
I think it's something else because I intentionally and
unintentionally do unclean shutdowns (I'm really impatient and I'm a
saboteur) on my laptop and I never get corruptions. In 18 months with
an HP Spectre which doesn't even have ECC memory, and has an NVMe
drive, *and* (really remarkably) for almost half this time I used the
discard mount option which pretty much instantly obliterates unused
roots, even when referenced in the super block as backup roots - and
yet still zero corruption. No complaints on mount, scrub, or readonly
checks. *shrug*
Anyway I suspect hardware or power issue. Or even SSD firmware issue.
> Furthermore, the power-led has recently started blinking (also when
> the power-cable is plugged in), I guess because of an old and bad
> battery. Maybe the current corruption also can have something to do
> with this? However I almost always run with power cable plugged in in
> last year, only on battery a few seconds a few times when moving the
> laptop.
>
> Currently, I can only mount the filesystem readonly, it goes readonly
> automatically if I try to mount it normally.
Btrfs is confused and doesn't want to make the corruption worse.
>
> Fstab mount options: noatime,autodefrag (I have been using the option
> nossd with older kernels one period in the past on the filesystem).
>
> If it matters, I have been running duperemove many times on the
> filesystem since creation.
I don't think it's related.
>
> To test the RAM, I have been running mprime Blend-test for 24 hours
> after the corruption without any error or warning.
I'm not familiar with it; I'm pretty sure you want this for UEFI:
https://www.memtest86.com/download.htm
You can use either that or memtest86+ if the firmware is BIOS based.
> I have never noticed any corruptions on the NTFS and Ext4 file systems
> on the laptop, only on the Btrfs file systems.
NTFS and ext4 likely won't notice such corruptions either (although
new ext4 volumes any day now will have checksummed metadata by
default) as they weren't designed with such detection in mind.
--
Chris Murphy
* Re: bad key ordering - repairable?
2018-01-23 2:35 ` Chris Murphy
@ 2018-01-23 12:51 ` Austin S. Hemmelgarn
2018-01-23 13:29 ` Claes Fransson
2018-01-24 0:44 ` Chris Murphy
2018-01-23 13:17 ` Claes Fransson
1 sibling, 2 replies; 17+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-23 12:51 UTC (permalink / raw)
To: Chris Murphy, Claes Fransson; +Cc: Btrfs BTRFS
On 2018-01-22 21:35, Chris Murphy wrote:
> On Mon, Jan 22, 2018 at 2:06 PM, Claes Fransson
> <claes.v.fransson@gmail.com> wrote:
>> Hi!
>>
>> I really like the features of BTRFS, especially deduplication,
>> snapshotting and checksumming. However, when using it on my laptop the
>> last couple of years, it has become corrupted a lot of times.
>> Sometimes I have managed to fix the problems (at least so much that I
>> can continue to use the filesystem) with check --repair, but several
>> times I had to recreate the file system and reinstall the operating
>> system.
>>
>> I am guessing the corruptions might be the results of unclean
>> shutdowns, mostly after system hangs, but also because of running out
>> of battery sometimes?
>
> I think it's something else because I intentionally and
> unintentionally do unclean shutdowns (I'm really impatient and I'm a
> saboteur) on my laptop and I never get corruptions. In 18 months with
> an HP Spectre which doesn't even have ECC memory, and has an NVMe
> drive, *and* really remarkable for almost half this time I used the
> discard mount option which pretty much instantly obliterates unused
> roots, even when referenced in the super block as backup roots - and
> yet still zero corruption. No complaints on mount, scrub, or readonly
> checks. *shrug*
>
> Anyway I suspect hardware or power issue. Or even SSD firmware issue.
I would tend to agree here, with one caveat: if it's a laptop that's
less than 3 years old, you can probably rule out power issues. Some
more info on the particular system might help identify what's wrong.
>
>> Furthermore, the power-led has recently started blinking (also when
>> the power-cable is plugged in), I guess because of an old and bad
>> battery. Maybe the current corruption also can have something to do
>> with this? However I almost always run with power cable plugged in in
>> last year, only on battery a few seconds a few times when moving the
>> laptop.
>>
>> Currently, I can only mount the filesystem readonly, it goes readonly
>> automatically if I try to mount it normally.
>
> Btrfs is confused and doesn't want to make the corruption worse.
>
>>
>> Fstab mount options: noatime,autodefrag (I have been using the option
>> nossd with older kernels one period in the past on the filesystem).
>>
>> If it matters, I have been running duperemove many times on the
>> filesystem since creation.
>
> I don't think it's related.
>
>
>>
>> To test the RAM, I have been running mprime Blend-test for 24 hours
>> after the corruption without any error or warning.
>
> I'm not familiar with it, pretty sure you want this for UEFI:
>
> https://www.memtest86.com/download.htm
>
> Where you can use that or memtest86+ if the firmware is BIOS based.
Do keep in mind that just because it passes memory checks does not mean
it's not an issue with the RAM. Memory testers rarely throw false
positives, but it's pretty common to get false negatives from them.
>
>> I have never noticed any corruptions on the NTFS and Ext4 file systems
>> on the laptop, only on the Btrfs file systems.
>
> NTFS and ext4 likely won't notice such corruptions either (although
> new ext4 volumes any day now will have checksummed metadata by
> default) as they weren't designed with such detection in mind.
This is extremely important to understand. BTRFS and ZFS are
essentially the only filesystems available on Linux that actually
validate things enough to notice this reliably (ReFS on Windows probably
does, and I think whatever Apple is calling their new FS does too).
Even if ext4 did notice it, it would just mark the filesystem for a
check and then keep going without doing anything else about it
(seriously, the default behavior for internal errors on ext4 is to just
continue like nothing happened and mark the FS for fsck).
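As a rough illustration of why a checksumming filesystem notices damage
that others silently accept, the sketch below stores a checksum next to
a block and verifies it after one bit has flipped. It uses zlib.crc32
from the Python standard library as a stand-in. Btrfs defaults to
CRC-32C, which is a different polynomial. The helper names and the
16 KiB block are made up for the example.

```python
import zlib


def seal(block: bytes) -> int:
    """Compute the checksum that would be stored alongside the block."""
    return zlib.crc32(block)


def verify(block: bytes, stored: int) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    return zlib.crc32(block) == stored


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Return a copy of the block with exactly one bit inverted."""
    damaged = bytearray(block)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


block = bytes(range(256)) * 64      # 16 KiB stand-in for a metadata block
stored = seal(block)
damaged = flip_bit(block, 12345)    # simulate a single flipped bit at rest

print("intact block verifies: ", verify(block, stored))
print("damaged block verifies:", verify(damaged, stored))
```

A checksum only covers damage that happens after it was computed. If a
block is corrupted in RAM before it is checksummed, it is written out
with a valid checksum, and only structural checks such as key ordering
can still catch it.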
* Re: bad key ordering - repairable?
2018-01-22 21:22 ` Hugo Mills
@ 2018-01-23 13:06 ` Claes Fransson
2018-01-23 18:13 ` Claes Fransson
2018-01-27 14:54 ` Claes Fransson
0 siblings, 2 replies; 17+ messages in thread
From: Claes Fransson @ 2018-01-23 13:06 UTC (permalink / raw)
To: linux-btrfs
2018-01-22 22:22 GMT+01:00 Hugo Mills <hugo@carfax.org.uk>:
> On Mon, Jan 22, 2018 at 10:06:58PM +0100, Claes Fransson wrote:
>> Hi!
>>
>> I really like the features of BTRFS, especially deduplication,
>> snapshotting and checksumming. However, when using it on my laptop the
>> last couple of years, it has become corrupted a lot of times.
>> Sometimes I have managed to fix the problems (at least so much that I
>> can continue to use the filesystem) with check --repair, but several
>> times I had to recreate the file system and reinstall the operating
>> system.
>>
>> I am guessing the corruptions might be the results of unclean
>> shutdowns, mostly after system hangs, but also because of running out
>> of battery sometimes?
>> Furthermore, the power-led has recently started blinking (also when
>> the power-cable is plugged in), I guess because of an old and bad
>> battery. Maybe the current corruption also can have something to do
>> with this? However I almost always run with power cable plugged in in
>> last year, only on battery a few seconds a few times when moving the
>> laptop.
>>
>> Currently, I can only mount the filesystem readonly, it goes readonly
>> automatically if I try to mount it normally.
>>
>> When booting an OpenSUSE Tumbleweed-20180119 live-iso:
>> localhost:~ # uname -r
>> 4.14.13-1-default
>> localhost:~ # btrfs --version
>> btrfs-progs v4.14.1
>>
>> localhost:~ # btrfs check -p /dev/sda12
>> Checking filesystem on /dev/sda12
>
> [fixing up bad paste]
>
>> UUID: d2819d5a-fd69-484b-bf34-f2b5692cbe1f
>> bad key ordering 159 160 bad block 690436964352
>> ERROR: errors found in extent allocation tree or chunk allocation
>> checking free space cache [.]
>> checking fs roots [o]
>> checking csums
>> bad key ordering 159 160
>> Error looking up extent record -1
>
> [snip]
>
>> localhost:~ # btrfs inspect-internal dump-tree -b 690436964352
>> /dev/sda12
>> btrfs-progs v4.14.1
>> leaf 690436964352 items 170 free space 1811 generation 196864 owner 2
>> leaf 690436964352 flags 0x1(WRITTEN) backref revision 1
>> fs uuid d2819d5a-fd69-484b-bf34-f2b5692cbe1f
>> chunk uuid 52f81fe6-893b-4432-9336-895057ee81e1
>> .
>> .
>> .
>> item 157 key (22732500992 EXTENT_ITEM 16384) itemoff 6538 itemsize 53
>> refs 1 gen 821 flags DATA
>> extent data backref root 287 objectid 51665 offset 0 count 1
>> item 158 key (22732517376 EXTENT_ITEM 16384) itemoff 6485 itemsize 53
>> refs 1 gen 821 flags DATA
>> extent data backref root 287 objectid 51666 offset 0 count 1
>> item 159 key (22732533760 EXTENT_ITEM 16384) itemoff 6485 itemsize 0
>> print-tree.c:428: print_extent_item: BUG_ON `item_size != sizeof(*ei0)` triggered, value 1
>> btrfs(+0x365c6)[0x55bdfaada5c6]
>> btrfs(print_extent_item+0x424)[0x55bdfaadb284]
>> btrfs(btrfs_print_leaf+0x94e)[0x55bdfaadbc1e]
>> btrfs(btrfs_print_tree+0x295)[0x55bdfaadcf05]
>> btrfs(cmd_inspect_dump_tree+0x734)[0x55bdfab1b024]
>> btrfs(main+0x7d)[0x55bdfaac7d4d]
>> /lib64/libc.so.6(__libc_start_main+0xea)[0x7ff42100ff4a]
>> btrfs(_start+0x2a)[0x55bdfaac7e5a]
>> Aborted (core dumped)
>
> Wow, I've never seen it do that before. It's the next thing I'd
> have asked for, so it's good you've preempted it.
>
> The main thing is that bad key ordering is almost always due to RAM
> corruption. That's either bad RAM, or dodgy power regulation -- the
> latter could be the PSU, or capacitors on the motherboard. (In this
> case, it might also be something funny with the battery).
>
> I would definitely recommend a long run of memtest86. At least 8
> hours, preferably 24. If you get errors repeatedly in the same place,
> it's the RAM. If they appear randomly, it's probably the power
> regulation.
>
Thanks for the suggestion, I will try to do this in the next few days.
> [snip]
>
>>
>> The filesystem had become pretty full, I had planned to increase the
>> Btrfs-partition size before it became corrupt.
>>
>> Active kernel when the filesystem went read only: OpenSUSE Linux
>> 4.14.14-1.geef6178-default, from the
>> http://download.opensuse.org/repositories/Kernel:/stable/standard/stable
>> repository.
>>
>> Fstab mount options: noatime,autodefrag (I have been using the option
>> nossd with older kernels one period in the past on the filesystem).
>>
>> If it matters, I have been running duperemove many times on the
>> filesystem since creation.
>>
>> To test the RAM, I have been running mprime Blend-test for 24 hours
>> after the corruption without any error or warning.
>
> Of all of the bad key order errors I've seen (dozens), I think
> there were a whole two which turned out not to be obviously related to
> corrupt RAM. I still say that it's most likely the hardware.
Okay, thank you for sharing your experience with me.
>
>> Is there a way I can try to repair this filesystem without the need to
>> recreate it and reinstall the operating system? A reinstall including
>> all currently installed packages, and restoring all current system
>> settings, would probably take some time for me to do.
>> If it is currently not repairable, it would be nice if this kind of
>> corruption could be repaired in the future, even if losing a few
>> files. Or if the corruptions could be avoided in the first place.
>
> Given that the current tools crash, the answer's a definite
> no. However, if you can get a developer interested, they may be able
> to write a fix for it, given an image of the FS (using btrfs-image).
>
Okay, will try to produce and upload an image within the next week.
> [snip]
>> I have never noticed any corruptions on the NTFS and Ext4 file systems
>> on the laptop, only on the Btrfs file systems.
>
> You've never _noticed_ them. :)
>
> Hugo.
>
> --
> Hugo Mills | ... one ping(1) to rule them all, and in the
> hugo@... carfax.org.uk | darkness bind(2) them.
> http://carfax.org.uk/ |
> PGP: E2AB1DE4 | Illiad
Thank you for your answers.
Claes
* Re: bad key ordering - repairable?
2018-01-23 2:35 ` Chris Murphy
2018-01-23 12:51 ` Austin S. Hemmelgarn
@ 2018-01-23 13:17 ` Claes Fransson
1 sibling, 0 replies; 17+ messages in thread
From: Claes Fransson @ 2018-01-23 13:17 UTC (permalink / raw)
To: Btrfs BTRFS
2018-01-23 3:35 GMT+01:00 Chris Murphy <lists@colorremedies.com>:
> On Mon, Jan 22, 2018 at 2:06 PM, Claes Fransson
> <claes.v.fransson@gmail.com> wrote:
>> Hi!
>>
>> I really like the features of BTRFS, especially deduplication,
>> snapshotting and checksumming. However, when using it on my laptop the
>> last couple of years, it has become corrupted a lot of times.
>> Sometimes I have managed to fix the problems (at least so much that I
>> can continue to use the filesystem) with check --repair, but several
>> times I had to recreate the file system and reinstall the operating
>> system.
>>
>> I am guessing the corruptions might be the results of unclean
>> shutdowns, mostly after system hangs, but also because of running out
>> of battery sometimes?
>
> I think it's something else because I intentionally and
> unintentionally do unclean shutdowns (I'm really impatient and I'm a
> saboteur) on my laptop and I never get corruptions. In 18 months with
> an HP Spectre which doesn't even have ECC memory, and has an NVMe
> drive, *and* really remarkable for almost half this time I used the
> discard mount option which pretty much instantly obliterates unused
> roots, even when referenced in the super block as backup roots - and
> yet still zero corruption. No complaints on mount, scrub, or readonly
> checks. *shrug*
>
Okay, thank you for sharing your experience.
> Anyway I suspect hardware or power issue. Or even SSD firmware issue.
>
>> Furthermore, the power-led has recently started blinking (also when
>> the power-cable is plugged in), I guess because of an old and bad
>> battery. Maybe the current corruption also can have something to do
>> with this? However I almost always run with power cable plugged in in
>> last year, only on battery a few seconds a few times when moving the
>> laptop.
>>
>> Currently, I can only mount the filesystem readonly, it goes readonly
>> automatically if I try to mount it normally.
>
> Btrfs is confused and doesn't want to make the corruption worse.
>
>
>
>
>>
>> Fstab mount options: noatime,autodefrag (I have been using the option
>> nossd with older kernels one period in the past on the filesystem).
>>
>> If it matters, I have been running duperemove many times on the
>> filesystem since creation.
>
> I don't think it's related.
>
>
>>
>> To test the RAM, I have been running mprime Blend-test for 24 hours
>> after the corruption without any error or warning.
>
> I'm not familiar with it, pretty sure you want this for UEFI:
>
> https://www.memtest86.com/download.htm
>
Thanks, I will try this within the next few days (I boot my laptop in UEFI mode).
> Where you can use that or memtest86+ if the firmware is BIOS based.
>
>
>> I have never noticed any corruptions on the NTFS and Ext4 file systems
>> on the laptop, only on the Btrfs file systems.
>
> NTFS and ext4 likely won't notice such corruptions either (although
> new ext4 volumes any day now will have checksummed metadata by
> default) as they're weren't designed with such detection in mind.
>
>
> --
> Chris Murphy
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: bad key ordering - repairable?
2018-01-23 12:51 ` Austin S. Hemmelgarn
@ 2018-01-23 13:29 ` Claes Fransson
2018-01-24 0:44 ` Chris Murphy
1 sibling, 0 replies; 17+ messages in thread
From: Claes Fransson @ 2018-01-23 13:29 UTC (permalink / raw)
To: Btrfs BTRFS
2018-01-23 13:51 GMT+01:00 Austin S. Hemmelgarn <ahferroin7@gmail.com>:
> On 2018-01-22 21:35, Chris Murphy wrote:
>>
>> On Mon, Jan 22, 2018 at 2:06 PM, Claes Fransson
>> <claes.v.fransson@gmail.com> wrote:
>>>
>>> Hi!
>>>
>>> I really like the features of BTRFS, especially deduplication,
>>> snapshotting and checksumming. However, when using it on my laptop the
>>> last couple of years, it has became corrupted a lot of times.
>>> Sometimes I have managed to fix the problems (at least so much that I
>>> can continue to use the filesystem) with check --repair, but several
>>> times I had to recreate the file system and reinstall the operating
>>> system.
>>>
>>> I am guessing the corruptions might be the results of unclean
>>> shutdowns, mostly after system hangs, but also because of running out
>>> of battery sometimes?
>>
>>
>> I think it's something else because I intentionally and
>> unintentionally do unclean shutdowns (I'm really impatient and I'm a
>> saboteur) on my laptop and I never get corruptions. In 18 months with
>> an HP Spectre which doesn't even have ECC memory, and has an NVMe
>> drive, *and* really remarkable for almost half this time I used the
>> discard mount option which pretty much instantly obliterates unused
>> roots, even when referenced in the super block as backup roots - and
>> yet still zero corruption. No complaints on mount, scrub, or readonly
>> checks. *shrug*
>>
>> Anyway I suspect hardware or power issue. Or even SSD firmware issue.
>
> I would tend to agree here, with one caveat, if it's a laptop that's less
> than 3 years old, you can probably rule out power issues. Some more info on
> the particular system might help identify what's wrong.
Hi,
I bought the laptop new in July 2014, but I think I have had corruption
issues with btrfs for as long as I have been trying it, since around the
end of 2014. You can find additional info about my laptop in my
original post; please let me know if you want some more info.
>>
>>
>>> Furthermore, the power-led has recently started blinking (also when
>>> the power-cable is plugged in), I guess because of an old and bad
>>> battery. Maybe the current corruption also can have something to do
>>> with this? However I almost always run with power cable plugged in in
>>> last year, only on battery a few seconds a few times when moving the
>>> laptop.
>>>
>>> Currently, I can only mount the filesystem readonly, it goes readonly
>>> automatically if I try to mount it normally.
>>
>>
>> Btrfs is confused and doesn't want to make the corruption worse.
>>
>>>
>>>
>>> Fstab mount options: noatime,autodefrag (I have been using the option
>>> nossd with older kernels one period in the past on the filesystem).
>>>
>>> If it matters, I have been running duperemove many times on the
>>> filesystem since creation.
>>
>>
>> I don't think it's related.
>>
>>
>>>
>>> To test the RAM, I have been running mprime Blend-test for 24 hours
>>> after the corruption without any error or warning.
>>
>>
>> I'm not familiar with it, pretty sure you want this for UEFI:
>>
>> https://www.memtest86.com/download.htm
>>
>> Where you can use that or memtest86+ if the firmware is BIOS based.
>
> Do keep in mind that just because it passes memory checks does not mean it's
> not an issue with the RAM. Memory testers rarely throw false positives, but
> it's pretty common to get false negatives from them.
>
Okay, thanks for telling me.
>>>
>>> I have never noticed any corruptions on the NTFS and Ext4 file systems
>>> on the laptop, only on the Btrfs file systems.
>>
>>
>> NTFS and ext4 likely won't notice such corruptions either (although
>> new ext4 volumes any day now will have checksummed metadata by
>> default) as they're weren't designed with such detection in mind.
>
> This is extremely important to understand. BTRFS and ZFS are essentially
> the only filesystems available on Linux that actually validate things enough
> to notice this reliably (ReFS on Windows probably does, and I think whatever
> Apple is calling their new FS does too). Even if ext4 did notice it, it
> would just mark the filesystem for a check and then keep going without doing
> anything else about it (seriously, the default behavior for internal errors
> on ext4 is to just continue like nothing happened and mark the FS for fsck).
Well, personally I think it would be great if I could (optionally) do
that with Btrfs too. Even if it notifies me of corruption and I might
even lose a few files, I think it would be good if I could continue to
use the filesystem with normal read/write capabilities, so I wouldn't
need to reinstall the operating system.
Best regards,
Claes
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: bad key ordering - repairable?
2018-01-23 13:06 ` Claes Fransson
@ 2018-01-23 18:13 ` Claes Fransson
2018-01-24 0:31 ` Chris Murphy
2018-01-27 14:54 ` Claes Fransson
1 sibling, 1 reply; 17+ messages in thread
From: Claes Fransson @ 2018-01-23 18:13 UTC (permalink / raw)
To: Btrfs BTRFS
2018-01-23 14:06 GMT+01:00 Claes Fransson <claes.v.fransson@gmail.com>:
> 2018-01-22 22:22 GMT+01:00 Hugo Mills <hugo@carfax.org.uk>:
>> On Mon, Jan 22, 2018 at 10:06:58PM +0100, Claes Fransson wrote:
>>> Hi!
>>>
>>> I really like the features of BTRFS, especially deduplication,
>>> snapshotting and checksumming. However, when using it on my laptop the
>>> last couple of years, it has became corrupted a lot of times.
>>> Sometimes I have managed to fix the problems (at least so much that I
>>> can continue to use the filesystem) with check --repair, but several
>>> times I had to recreate the file system and reinstall the operating
>>> system.
>>>
>>> I am guessing the corruptions might be the results of unclean
>>> shutdowns, mostly after system hangs, but also because of running out
>>> of battery sometimes?
>>> Furthermore, the power-led has recently started blinking (also when
>>> the power-cable is plugged in), I guess because of an old and bad
>>> battery. Maybe the current corruption also can have something to do
>>> with this? However I almost always run with power cable plugged in in
>>> last year, only on battery a few seconds a few times when moving the
>>> laptop.
>>>
>>> Currently, I can only mount the filesystem readonly, it goes readonly
>>> automatically if I try to mount it normally.
>>>
>>> When booting an OpenSUSE Tumbleweed-20180119 live-iso:
>>> localhost:~ # uname -r
>>> 4.14.13-1-default
>>> localhost:~ # btrfs --version
>>> btrfs-progs v4.14.1
>>>
>>> localhost:~ # btrfs check -p /dev/sda12
>>> Checking filesystem on /dev/sda12
>>
>> [fixing up bad paste]
>>
>>> UUID: d2819d5a-fd69-484b-bf34-f2b5692cbe1f
>>> bad key ordering 159 160 bad block 690436964352
>>> ERROR: errors found in extent allocation tree or chunk allocation
>>> checking free space cache [.]
>>> checking fs roots [o]
>>> checking csums
>>> bad key ordering 159 160
>>> Error looking up extent record -1
>>
>> [snip]
>>
>>> localhost:~ # btrfs inspect-internal dump-tree -b 690436964352
>>> /dev/sda12
>>> btrfs-progs v4.14.1
>>> leaf 690436964352 items 170 free space 1811 generation 196864 owner 2
>>> leaf 690436964352 flags 0x1(WRITTEN) backref revision 1
>>> fs uuid d2819d5a-fd69-484b-bf34-f2b5692cbe1f
>>> chunk uuid 52f81fe6-893b-4432-9336-895057ee81e1
>>> .
>>> .
>>> .
>>> item 157 key (22732500992 EXTENT_ITEM 16384) itemoff 6538 itemsize 53
>>> refs 1 gen 821 flags DATA
>>> extent data backref root 287 objectid 51665 offset 0 count 1
>>> item 158 key (22732517376 EXTENT_ITEM 16384) itemoff 6485 itemsize 53
>>> refs 1 gen 821 flags DATA
>>> extent data backref root 287 objectid 51666 offset 0 count 1
>>> item 159 key (22732533760 EXTENT_ITEM 16384) itemoff 6485 itemsize 0
>>> print-tree.c:428: print_extent_item: BUG_ON `item_size != sizeof(*ei0)` triggered, value 1
>>> btrfs(+0x365c6)[0x55bdfaada5c6]
>>> btrfs(print_extent_item+0x424)[0x55bdfaadb284]
>>> btrfs(btrfs_print_leaf+0x94e)[0x55bdfaadbc1e]
>>> btrfs(btrfs_print_tree+0x295)[0x55bdfaadcf05]
>>> btrfs(cmd_inspect_dump_tree+0x734)[0x55bdfab1b024]
>>> btrfs(main+0x7d)[0x55bdfaac7d4d]
>>> /lib64/libc.so.6(__libc_start_main+0xea)[0x7ff42100ff4a]
>>> btrfs(_start+0x2a)[0x55bdfaac7e5a]
>>> Aborted (core dumped)
>>
>> Wow, I've never seen it do that before. It's the next thing I'd
>> have asked for, so it's good you've preempted it.
>>
>> The main thing is that bad key ordering is almost always due to RAM
>> corruption. That's either bad RAM, or dodgy power regulation -- the
>> latter could be the PSU, or capacitors on the motherboard. (In this
>> case, it might also be something funny with the battery).
>>
>> I would definitely recommend a long run of memtest86. At least 8
>> hours, preferably 24. If you get errors repeatedly in the same place,
>> it's the RAM. If they appear randomly, it's probably the power
>> regulation.
>>
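To illustrate the check behind the "bad key ordering" message: items in a
btrfs leaf are sorted by their (objectid, type, offset) key, so every key
must compare strictly greater than the one before it. The Python sketch
below is not btrfs-progs code. The function name, the fourth key and the
flipped bit are assumptions made up for illustration; only the first three
keys are taken from the dump above (EXTENT_ITEM is key type 168). It shows
how a single bit flip in RAM is enough to put two neighbouring keys out of
order.

```python
# Illustrative sketch only, not btrfs-progs code.
EXTENT_ITEM = 168  # btrfs key type value for EXTENT_ITEM


def first_bad_key_pair(keys):
    """Return (i, i + 1) for the first adjacent pair of keys that is not
    strictly increasing, or None if the whole leaf is ordered."""
    for i in range(len(keys) - 1):
        if keys[i] >= keys[i + 1]:  # tuples compare field by field
            return (i, i + 1)
    return None


# Keys of items 157-159 from the dump above, plus a hypothetical next key.
leaf = [
    (22732500992, EXTENT_ITEM, 16384),
    (22732517376, EXTENT_ITEM, 16384),
    (22732533760, EXTENT_ITEM, 16384),
    (22732550144, EXTENT_ITEM, 16384),  # assumed, not from the dump
]
healthy = first_bad_key_pair(leaf)

# Simulate a single bit flip (bit 34, chosen arbitrarily) in the last objectid.
damaged = list(leaf)
objectid, key_type, offset = damaged[3]
damaged[3] = (objectid ^ (1 << 34), key_type, offset)
broken = first_bad_key_pair(damaged)
```

Here `healthy` is None for the ordered leaf, while `broken` names the pair
of list positions whose keys are no longer in order, much like the pair of
item numbers that btrfs check prints.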
> Thanks for the suggestion, I will try to do this in the next days.
>
I hadn't noticed before that there are actually RAM modules from
different vendors in the laptop. One 8GB by Samsung, and one 4GB by
Kingston! Maybe that is a source of the corruptions.
I also found that there was indeed a new firmware version for my
SSD, so I have now updated its firmware to the newest version.
Unfortunately I couldn't find any information about which issues
it was supposed to fix. The laptop already has the latest BIOS version
provided by ASUS for the model.
I have not yet run memtest86.
Claes
>> [snip]
>>
>>>
>>> The filesystem had become pretty full, I had planned to increase the
>>> Btrfs-partition size before it became corrupt.
>>>
>>> Active kernel when the filesystem went read only: OpenSUSE Linux
>>> 4.14.14-1.geef6178-default, from the
>>> http://download.opensuse.org/repositories/Kernel:/stable/standard/stable
>>> repository.
>>>
>>> Fstab mount options: noatime,autodefrag (I have been using the option
>>> nossd with older kernels one period in the past on the filesystem).
>>>
>>> If it matters, I have been running duperemove many times on the
>>> filesystem since creation.
>>>
>>> To test the RAM, I have been running mprime Blend-test for 24 hours
>>> after the corruption without any error or warning.
>>
>> Of all of the bad key order errors I've seen (dozens), I think
>> there were a whole two which turned out not to be obviously related to
>> corrupt RAM. I still say that it's most likely the hardware.
>
> Okay, thank you for sharing your experience with me.
>
>>
>>> Is there a way I can try to repair this filesystem without the need to
>>> recreate it and reinstall the operating system? A reinstall including
>>> all currently installed packages, and restoring all current system
>>> settings, would probably take some time for me to do.
>>> If it is currently not repairable, it would be nice if this kind of
>>> corruption could be repaired in the future, even if losing a few
>>> files. Or if the corruptions could be avoided in the first place.
>>
>> Given that the current tools crash, the answer's a definite
>> no. However, if you can get a developer interested, they may be able
>> to write a fix for it, given an image of the FS (using btrfs-image).
>>
> Okay, will try to produce and upload an image within the next week.
>
>
>> [snip]
>>> I have never noticed any corruptions on the NTFS and Ext4 file systems
>>> on the laptop, only on the Btrfs file systems.
>>
>> You've never _noticed_ them. :)
>>
>> Hugo.
>>
>> --
>> Hugo Mills | ... one ping(1) to rule them all, and in the
>> hugo@... carfax.org.uk | darkness bind(2) them.
>> http://carfax.org.uk/ |
>> PGP: E2AB1DE4 | Illiad
>
> Thank you for your answers.
>
> Claes
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: bad key ordering - repairable?
2018-01-23 18:13 ` Claes Fransson
@ 2018-01-24 0:31 ` Chris Murphy
2018-01-24 19:44 ` Claes Fransson
0 siblings, 1 reply; 17+ messages in thread
From: Chris Murphy @ 2018-01-24 0:31 UTC (permalink / raw)
To: Claes Fransson; +Cc: Btrfs BTRFS
On Tue, Jan 23, 2018 at 11:13 AM, Claes Fransson
<claes.v.fransson@gmail.com> wrote:
> I haven't noticed before that there is actually RAM-modules from
> different vendors in the laptop. One 8GB by Samsung, and one 4GB by
> Kingston!
If they have the correct tolerances, I don't think it's a problem.
Some memory controllers use a kind of interleaving if the module sizes
are the same, so worst case you might be leaving a bit of a
performance improvement on the table by the fact they aren't the same
size.
If the memory testing doesn't pan out, you could go down a bit of a
rabbit hole and run each module in production for twice the length of
time you figure you should see a corruption appear.
> I also found that there indeed was a new firmware version for my
> SSD-disk, so I have now updated it's firmware to the newest version.
> Unfortunately I couldn't find any information of what possible issues
> it was supposed to fix. The laptop has already the latest BIOS version
> provided by ASUS for the model.
I don't know enough about the bad key ordering error and its cause. If
that corruption can happen only in memory then the SSD firmware update
may change nothing. If there's some possibility the corruption can be
the result of SSD firmware bugs, then it might make sense to use DUP
metadata in the short term, even on an SSD. Any memory corruption
would affect both copies. Any SSD induced corruption *might* affect
both copies, depending on whether the SSD deduplicates or colocates
the two copies of metadata...but I'd like to think that there's at
least a pretty decent chance one of the copies would be good in which
case you'd get Btrfs self-healing for metadata only.
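A toy sketch of the DUP reasoning above, in Python. It is not how btrfs is
implemented: zlib.crc32 stands in for the crc32c that btrfs uses, and
store_dup and read_dup are made-up names. It only illustrates that a
checksum plus a second copy can heal damage introduced after the checksum
was computed (for example by the SSD), but not damage that was already in
RAM when the block was checksummed.

```python
import zlib


def store_dup(block):
    """Checksum the block as it sits in memory and write two copies."""
    csum = zlib.crc32(block)
    return [(csum, bytearray(block)), (csum, bytearray(block))]


def read_dup(copies):
    """Return the first copy whose checksum verifies, repairing the others."""
    for csum, data in copies:
        if zlib.crc32(data) == csum:
            good = bytes(data)
            for _, other in copies:
                other[:] = good  # "self-heal" any bad copy from the good one
            return good
    raise IOError("no copy passes its checksum")


original = b"key 159 < key 160"

# Case 1: the device damages one copy after it was written.
copies = store_dup(original)
copies[0][1][0] ^= 0x01
healed = read_dup(copies)

# Case 2: a bit flips in RAM before checksumming, so both copies are
# written wrong and both checksums still verify.
in_ram = bytearray(original)
in_ram[0] ^= 0x01
undetected = read_dup(store_dup(bytes(in_ram)))
```

In the first case the read returns the original data and rewrites the
damaged copy; in the second the read succeeds but returns the already
corrupted block, which is why DUP is no help against bad RAM.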
Anyway, it's a tedious search.
As for Btrfs getting better at handling these kinds of cases: yeah,
it's a valid question. What we know about other file systems is they
can become unrepairable because they don't detect corruption soon
enough. Whereas Btrfs has detected a problem early on yet it's still
damaged enough now that effectively you can no longer mount it rw.
From a data integrity point of view, at least you can ro mount and get
your data off the volume with a normal file copy operation, not
something that's certain with other file systems.
If you were to try another file system, I'd look at XFS; tools and
kernels from the past couple of years support metadata checksumming with
the V5 format.
--
Chris Murphy
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: bad key ordering - repairable?
2018-01-23 12:51 ` Austin S. Hemmelgarn
2018-01-23 13:29 ` Claes Fransson
@ 2018-01-24 0:44 ` Chris Murphy
2018-01-24 12:30 ` Austin S. Hemmelgarn
1 sibling, 1 reply; 17+ messages in thread
From: Chris Murphy @ 2018-01-24 0:44 UTC (permalink / raw)
To: Austin S. Hemmelgarn; +Cc: Chris Murphy, Claes Fransson, Btrfs BTRFS
On Tue, Jan 23, 2018 at 5:51 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> This is extremely important to understand. BTRFS and ZFS are essentially
> the only filesystems available on Linux that actually validate things enough
> to notice this reliably (ReFS on Windows probably does, and I think whatever
> Apple is calling their new FS does too).
ReFS always checksums metadata, optionally can checksum data.
APFS is really vague on this front, it may be checksumming metadata,
it's not checksumming data and with no option to. Apple proposes their
branded storage devices do not return bogus data. OK so then why
checksum the metadata?
>Even if ext4 did notice it, it
> would just mark the filesystem for a check and then keep going without doing
> anything else about it (seriously, the default behavior for internal errors
> on ext4 is to just continue like nothing happened and mark the FS for fsck).
I haven't used ext4 with metadata checksumming enabled, and have no
idea how it behaves when it starts encountering checksum errors during
normal use. For sure XFS will complain a lot and will go read only
when it gets confused. I'd expect any file system going to the trouble
of checksumming would have to have some means of bailing out, rather
than just continuing on.
Btrfs (and maybe ZFS) COW everything except supers. So ostensibly a
future feature might let them continue on with a kind of
integrated/single volume variation on seed/sprout device. I'd like to
see something like this just for undoable and testable offline
repairs, rather than offline repair only being predicated on
overwriting metadata.
--
Chris Murphy
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: bad key ordering - repairable?
2018-01-24 0:44 ` Chris Murphy
@ 2018-01-24 12:30 ` Austin S. Hemmelgarn
2018-01-24 23:54 ` Chris Murphy
0 siblings, 1 reply; 17+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-24 12:30 UTC (permalink / raw)
To: Chris Murphy; +Cc: Claes Fransson, Btrfs BTRFS
On 2018-01-23 19:44, Chris Murphy wrote:
> On Tue, Jan 23, 2018 at 5:51 AM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>
>> This is extremely important to understand. BTRFS and ZFS are essentially
>> the only filesystems available on Linux that actually validate things enough
>> to notice this reliably (ReFS on Windows probably does, and I think whatever
>> Apple is calling their new FS does too).
>
> ReFS always checksums metadata, optionally can checksum data.
Good to know, I've not actually dealt with ReFS myself yet (we're mostly
a Linux shop where I work, and the two Windows servers we do have aren't
using ReFS simply because it wasn't beyond the technology preview level
when we installed them and we don't want to screw anything up).
>
> APFS is really vague on this front, it may be checksumming metadata,
> it's not checksumming data and with no option to. Apple proposes their
> branded storage devices do not return bogus data. OK so then why
> checksum the metadata?
Even aside from the fact that it might be checksumming data, Apple's
storage engineers are still smoking something pretty damn strong if they
think that they can claim their storage devices _never_ return bogus
data. Either they're running some kind of checksumming _and_
replication below the block layer in the storage device itself (which
actually might explain the insane cost of at least one piece of their
hardware), or they think they've come up with some fail-safe way to
detect corruption and return errors reliably, and in either case things
can still fail. I smell a potential future lawsuit in the works...
>
>> Even if ext4 did notice it, it
>> would just mark the filesystem for a check and then keep going without doing
>> anything else about it (seriously, the default behavior for internal errors
>> on ext4 is to just continue like nothing happened and mark the FS for fsck).
>
> I haven't used ext4 with metadata checksumming enabled, and have no
> idea how it behaves when it starts encountering checksum errors during
> normal use. For sure XFS will complain a lot and will go read only
> when it gets confused. I'd expect any file system going to the trouble
> of checksumming would have to have some means of bailing out, rather
> than just continuing on.
Actually, I forgot about the (newer) metadata checksumming feature in
ext4, and was just basing my statement on behavior the last time I used
it for anything serious. Having just checked mkfs.ext4, it appears that
the metadata in the SB that tells the kernel what to do when it runs
into an error for the FS still defaults to continuing on as if nothing
happened, even if you enable metadata checksumming (which still seems to
be disabled by default). Whether or not that actually is honored by
modern kernels, I don't know, but I've seen no evidence to suggest that
it isn't.
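For reference, the setting described here is the s_errors field of the
ext4 superblock. A minimal Python sketch of decoding it follows, assuming
the documented ext4 on-disk layout (primary superblock at byte 1024, with
little-endian 16-bit s_magic at offset 0x38 and s_errors at offset 0x3C).
The function name is made up, and the example runs against a synthetic
buffer rather than a real device.

```python
import struct

EXT4_MAGIC = 0xEF53
ERROR_POLICY = {1: "continue", 2: "remount-ro", 3: "panic"}


def ext4_error_policy(image):
    """Decode the on-error behaviour from the start of an ext4 image."""
    sb = image[1024:1024 + 1024]  # primary superblock
    magic, = struct.unpack_from("<H", sb, 0x38)
    if magic != EXT4_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    errors, = struct.unpack_from("<H", sb, 0x3C)
    return ERROR_POLICY.get(errors, "unknown")


# Synthetic image: zero-filled, with only the two fields we read set.
image = bytearray(2048)
struct.pack_into("<H", image, 1024 + 0x38, EXT4_MAGIC)
struct.pack_into("<H", image, 1024 + 0x3C, 1)
policy = ext4_error_policy(bytes(image))
```

On a real filesystem the same value is reported by tune2fs -l as "Errors
behavior" and can be changed with tune2fs -e.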
>
> Btrfs (and maybe ZFS) COW everything except supers. So ostensibly a
> future feature might let them continue on with a kind of
> integrated/single volume variation on seed/sprout device. I'd like to
> see something like this just for undoable and testable offline
> repairs, rather than offline repair only being predicated on
> overwriting metadata.
Agreed.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: bad key ordering - repairable?
2018-01-24 0:31 ` Chris Murphy
@ 2018-01-24 19:44 ` Claes Fransson
2018-01-24 23:15 ` Duncan
[not found] ` <CAEY8F1pVrZnf3M6mGJaxogx14ZrJ5CV3++_-y13sTniJ3ds4ww@mail.gmail.com>
0 siblings, 2 replies; 17+ messages in thread
From: Claes Fransson @ 2018-01-24 19:44 UTC (permalink / raw)
Cc: Btrfs BTRFS
On Jan 24, 2018 01:31, "Chris Murphy" <lists@colorremedies.com> wrote:
> On Tue, Jan 23, 2018 at 11:13 AM, Claes Fransson
> <claes.v.fransson@gmail.com> wrote:
>> I haven't noticed before that there is actually RAM-modules from
>> different vendors in the laptop. One 8GB by Samsung, and one 4GB by
>> Kingston!
>
> If they have the correct tolerances, I don't think it's a problem.
> Some memory controllers use a kind of interleaving if the module sizes
> are the same, so worst case you might be leaving a bit of a
> performance improvement on the table by the fact they aren't the same
> size.
>
> If the memory testing doesn't pan out, you could go down a bit of a
> rabbit hole and run each module in production for twice the length of
> time you figure you should see a corruption appear.
So, I now have some results from the PassMark Memtest86! I let the
default automatic tests run for about 19 hours and 16 passes. It
reported zero "Errors", but 4 lines of "[Note] RAM may be vulnerable
to high frequency row hammer bit flips". If I understand it correctly,
it means that some errors were detected when the RAM was tested at
higher rates than guaranteed accurate by the vendors. I am not sure
what that may indicate regarding the performance of the RAM for my
Btrfs filesystem. I "only" got irreparable corruptions maybe once
every couple of months or half a year.
I also forgot to mention that I have been trying out Zswap for the last
couple of months with OpenSUSE on the Btrfs filesystem (and also Fedora
on the Ext4 partition). Maybe that is a source of the latest corruption
(I am pretty sure I was not using Zswap during previous corruptions, of
which I think at least one was reporting "transid verify failed" or
similar). Sometimes during the last months, but not when the filesystem
went read-only, the computer has frozen almost completely (mouse pointer
moving only extremely slowly) when running out of RAM. I have
sometimes waited many hours for the operating system to swap out less
important memory to the swap partition, but ended up having to force
a reboot. I suspect that it might be Zswap not working optimally;
maybe it also affects Btrfs? I have used pretty low swappiness values,
1 or 10.
I might try using only one of the RAM modules in the future if nothing
else works. I usually use most of my available 12 GB RAM though (and
often even more :) ) when using my laptop.
>> I also found that there indeed was a new firmware version for my
>> SSD-disk, so I have now updated it's firmware to the newest version.
>> Unfortunately I couldn't find any information of what possible issues
>> it was supposed to fix. The laptop has already the latest BIOS version
>> provided by ASUS for the model.
>
> I don't know enough about the bad key ordering error and its cause. If
> that corruption can happen only in memory then the SSD firmware update
> may change nothing. If there's some possibility the corruption can be
> the result of SSD firmware bugs, then it might make sense to use DUP
> metadata in the short term, even on an SSD. Any memory corruption
> would affect both copies. Any SSD induced corruption *might* affect
> both copies, depending on whether the SSD deduplicates or colocates
> the two copies of metadata...but I'd like to think that there's at
> least a pretty decent chance one of the copies would be good in which
> case you'd get Btrfs self-healing for metadata only.
Thanks, I might try metadata DUP in the future.
> Anyway, it's a tedious search.
>
> As for Btrfs getting better at handling these kinds of cases. Yeah
> it's a valid question. What we know about other file systems is they
> can become unrepairable because they don't detect corruption soon
> enough. Whereas Btrfs has detected a problem early on yet it's still
> damaged enough now that effectively you can no longer mount it rw.
> From a data integrity point of view, at least you can ro mount and get
> your data off the volume with a normal file copy operation, not
> something that's certain with other file systems.
>
> If you were to try another file system, I'd look at XFS, tools and
> kernels in the past couple of years support metadata checksumming with
> the V5 format.
Yes, XFS should also have deduplication as an experimental feature.
I don't know how stable it is yet, but I might try it. In the future it
is also supposed to get a snapshot feature.
Thanks for all your tips and thoughts.
Claes
> --
> Chris Murphy
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: bad key ordering - repairable?
2018-01-24 19:44 ` Claes Fransson
@ 2018-01-24 23:15 ` Duncan
[not found] ` <CAEY8F1pVrZnf3M6mGJaxogx14ZrJ5CV3++_-y13sTniJ3ds4ww@mail.gmail.com>
1 sibling, 0 replies; 17+ messages in thread
From: Duncan @ 2018-01-24 23:15 UTC (permalink / raw)
To: linux-btrfs
Claes Fransson posted on Wed, 24 Jan 2018 20:44:33 +0100 as excerpted:
> So, I have now some results from the PassMark Memtest86! I let the
> default automatic tests run for about 19 hours and 16 passes. It
> reported zero "Errors", but 4 lines of "[Note] RAM may be vulnerable to
> high frequency row hammer bit flips". If I understand it correctly,
> it means that some errors were detected when the RAM was tested at
> higher rates than guaranteed accurate by the vendors.
From Wikipedia:
Row hammer (also written as rowhammer) is an unintended side effect in
dynamic random-access memory (DRAM) that causes memory cells to leak
their charges and interact electrically between themselves, possibly
altering the contents of nearby memory rows that were not addressed in
the original memory access. This circumvention of the isolation between
DRAM memory cells results from the high cell density in modern DRAM, and
can be triggered by specially crafted memory access patterns that rapidly
activate the same memory rows numerous times.
The row hammer effect has been used in some privilege escalation computer
security exploits.
https://en.wikipedia.org/wiki/Row_hammer
So it has nothing to do with (generic) testing the RAM at higher rates
than guaranteed by the vendors, but rather, with deliberate rapid
repeated access (at normal clock rates) of the same cell rows in order
to trigger a bitflip in nearby memory cells that could not normally be
accessed due to process separation and insufficient privileges.
IOW, it's unlikely to be accidentally tripped, and thus is exceedingly
unlikely to be relevant here, unless you're being hacked, of course.
That said, and entirely unrelated to rowhammer, I know one of the
problems of memory test false-negatives from experience.
In my case, I was even running ECC RAM. But the memory I had purchased
(back in the day when memory was far more expensive and sub-GB memory was
the norm) was cheap, and as it happened, marked as stable at slightly
higher clock rates than it actually was. But I couldn't afford more (or
I'd have procured less dodgy RAM in the first place) and had little
recourse but to live with it for awhile. A year or so later there was a
BIOS update that added better memory clocking control, and I was able to
declock the RAM slightly from its rating (IIRC to PC-3000 level, it was
PC3200 rated, this was DDR1 era), after which it was /entirely/ stable,
even after reducing some of the wait-state settings somewhat to try to
claw back some of what I lost due to the underclocking.
I run gentoo, and nearly all of my problems occurred when I was doing
updates, building packages at 100% CPU with multiple cores accessing the
same RAM. FWIW, the most frequent /detected/ problem was bunzip checksum
errors as it decompressed and verified the data in memory (before writing
out)... that would move or go away if I tried again. Occasionally I'd
get machine-check errors (MCEs), but not frequently, and the ECC RAM
subsystem /never/ reported errors.
But the memory tests gave that memory an all-clear.
The problem with the memory tests in this case is that they tend to work
on an otherwise unloaded system, and test the retention of the memory
cells, /not/ so much the speed and reliability at which they are accessed
under fully loaded system stress -- and how could they when memory speed
is normally set by the BIOS and not something the memory tester has
access to?
But my memory problems weren't with the memory cells themselves -- they
retained their data just fine and indeed it was ECC RAM so would have
triggered ECC errors if they didn't -- but with the precision timing of
memory IO -- it wasn't quite up to the specs it claimed to support and
would occasionally produce in-transit errors (the ECC would have detected
and possibly corrected errors in storage), and the memory testers simply
didn't test that like a fully loaded system doing unpacks of sources and
builds from them did.
As mentioned, once I got a BIOS update that let me declock the RAM a bit,
everything was fine, and it remained fine when I did upgrade the RAM some
years later, after prices had fallen, as well.
(The system was first-gen AMD Opteron, on a server-grade Tyan board, that
I ran from purchase in late 2003 for over eight years, maxing out the
pair of CPUs to dual-core Opteron 290s and the RAM to 8 gigs, over time,
until the board finally died in 2012 due to burst capacitors. Which
reminds me, I'm still running the replacement, a Gigabyte with an fx6100
overclocked a bit to 3.9 GHz and 16 gig RAM, and it's now nearing six
years old, so I suppose I better start planning for the next upgrade...
I've spent that six years upgrading to big-screen TVs as monitors, with a
65inch/165cm 4K as my primary now and a 48inch/122cm as a secondary to
put youtube or whatever on fullscreen, and to now my second generation of
ssds, a pair of 1 TB samsung evos, but this reminds me that at nearing
six years old the main system's aging too, so I better start thinking of
replacing it again...)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: bad key ordering - repairable?
2018-01-24 12:30 ` Austin S. Hemmelgarn
@ 2018-01-24 23:54 ` Chris Murphy
2018-01-25 12:41 ` Austin S. Hemmelgarn
0 siblings, 1 reply; 17+ messages in thread
From: Chris Murphy @ 2018-01-24 23:54 UTC (permalink / raw)
To: Austin S. Hemmelgarn; +Cc: Chris Murphy, Claes Fransson, Btrfs BTRFS
On Wed, Jan 24, 2018 at 5:30 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
>> APFS is really vague on this front, it may be checksumming metadata,
>> it's not checksumming data and with no option to. Apple proposes their
>> branded storage devices do not return bogus data. OK so then why
>> checksum the metadata?
>
> Even aside from the fact that it might be checksumming data, Apple's storage
> engineers are still smoking something pretty damn strong if they think that
> they can claim their storage devices _never_ return bogus data. Either
> they're running some kind of checksumming _and_ replication below the block
> layer in the storage device itself (which actually might explain the insane
> cost of at least one piece of their hardware), or they think they've come up
> with some fail-safe way to detect corruption and return errors reliably, and
> in either case things can still fail. I smell a potential future lawsuit in
> the works.
I read somewhere the hardware (or more correctly their flash firmware)
supposedly uses 128 bytes of checksum per 4KB data. That's a lot, I
wonder if it's actually some kind of parity. But regardless, this kind
of in-hardware checksumming won't account for things like misdirected
or torn writes or literally any sort of corruption happening prior to
the flash firmware computing those checksums.
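To make that last point concrete, here is a minimal Python sketch. It is purely illustrative: the store() and verify() helpers are hypothetical stand-ins for a device write path, and CRC32 is only a convenient placeholder, since nothing in this thread says which code the firmware really uses. It shows that a checksum computed after the data has already been damaged will verify cleanly, while the same damage applied after the checksum was computed is caught.

```python
import zlib


def store(block):
    """Hypothetical device write path: checksum whatever bytes arrive."""
    return block, zlib.crc32(block)


def verify(block, checksum):
    """Hypothetical device read path: recompute and compare."""
    return zlib.crc32(block) == checksum


def flip_bit(block, bit):
    """Return a copy of block with a single bit inverted."""
    damaged = bytearray(block)
    damaged[bit // 8] ^= 1 << (bit % 8)
    return bytes(damaged)


original = b"A" * 4096

# Case 1: the bit flips in RAM *before* the device computes its checksum.
# The device faithfully checksums the damaged data, so nothing is detected.
stored, csum = store(flip_bit(original, 12345))
upstream_corruption_detected = not verify(stored, csum)

# Case 2: the bit flips *after* the checksum was computed (bit rot at rest).
# CRC32 catches every single-bit error, so this one is detected.
stored2, csum2 = store(original)
at_rest_corruption_detected = not verify(flip_bit(stored2, 12345), csum2)

print(upstream_corruption_detected, at_rest_corruption_detected)
```

In the first case the stored block differs from what the application meant to write, yet the device has no way of knowing that.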
On flash storage, maybe they're just concerned about bit rot or even
the most superficial bit flips, and having just enough information to
detect and correct for 1 or 2 flips per 4KB, not totally dissimilar to
ECC memory. But the fact that they don't use ECC memory leaves them open to
corruption in the storage stack happening outside the literal storage
device.
> Actually, I forgot about the (newer) metadata checksumming feature in ext4,
> and was just basing my statement on behavior the last time I used it for
> anything serious. Having just checked mkfs.ext4, it appears that the
> metadata in the SB that tells the kernel what to do when it runs into an
> error for the FS still defaults to continuing on as if nothing happens, even
> if you enable metadata checksumming (which still seems to be disabled by
> default). Whether or not that actually is honored by modern kernels, I
> don't know, but I've seen no evidence to suggest that it isn't.
Depending on the corruption, Btrfs continues as well. If I corrupt a
dead-end leaf that contains file metadata (like names or security
contexts), I just get some complaints of corruption. The file system
remains rw mounted though. I don't know the metric by which metadata
can be damaged and Btrfs says "whoooaa!!" and puts on the brakes by
going read only. XFS certainly has its limits and goes read only when
it detects certain metadata corruption via checksum fail. I'd guess
ext4 will do the same thing, otherwise what's the point if it's going
to knowingly eat itself alive?
--
Chris Murphy
* Re: bad key ordering - repairable?
2018-01-24 23:54 ` Chris Murphy
@ 2018-01-25 12:41 ` Austin S. Hemmelgarn
0 siblings, 0 replies; 17+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-25 12:41 UTC (permalink / raw)
To: Chris Murphy; +Cc: Claes Fransson, Btrfs BTRFS
On 2018-01-24 18:54, Chris Murphy wrote:
> On Wed, Jan 24, 2018 at 5:30 AM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>
>>> APFS is really vague on this front, it may be checksumming metadata,
>>> it's not checksumming data and with no option to. Apple proposes their
>>> branded storage devices do not return bogus data. OK so then why
>>> checksum the metadata?
>>
>> Even aside from the fact that it might be checksumming data, Apple's storage
>> engineers are still smoking something pretty damn strong if they think that
>> they can claim their storage devices _never_ return bogus data. Either
>> they're running some kind of checksumming _and_ replication below the block
>> layer in the storage device itself (which actually might explain the insane
>> cost of at least one piece of their hardware), or they think they've come up
>> with some fail-safe way to detect corruption and return errors reliably, and
>> in either case things can still fail. I smell a potential future lawsuit in
>> the works.
>
>
> I read somewhere the hardware (or more correctly their flash firmware)
> supposedly uses 128 bytes of checksum per 4KB data. That's a lot, I
> wonder if it's actually some kind of parity. But regardless, this kind
> of in-hardware checksumming won't account for things like misdirected
> or torn writes or literally any sort of corruption happening prior to
> the flash firmware computing those checksums.
It's most likely more generic erasure coding. Parity as most people
think of it in the storage sense (RAID5 and RAID6) is a special case,
(n, n-1) or (n, n-2) erasure coding that happens to be optimal. So in
theory they could correct up to 1024 bits of errors, which is all well
and good, but as you say doesn't really protect against much (more
specifically, it only protects reliably against cell discharges from
various sources, or more generic read-disturb errors).
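As a rough illustration of parity being a special case of erasure coding, here is a minimal Python sketch of single XOR parity in the RAID5 style. The block contents and the xor_blocks() helper are made up for the example, and the 128-bytes-per-4KB figure is just the unconfirmed number quoted above. Note that this only rebuilds a block whose position is already known to be lost (an erasure); locating an error at an unknown position needs more redundancy.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks plus one parity block: an (n, n-1) layout with n = 4.
data = [b"\x11" * 8, b"\x22" * 8, b"\x44" * 8]
parity = xor_blocks(data)

# Lose block 1 (we know which one is gone) and rebuild it from the rest.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
recovered = rebuilt == data[1]

# Redundancy budget for the figure quoted in this thread.
redundancy_bits = 128 * 8            # bits of redundancy per 4KB of data
overhead_percent = 128 / 4096 * 100  # space overhead of that redundancy

print(recovered, redundancy_bits, overhead_percent)
```

The 1024-bit figure is simply the size of the redundancy (128 bytes times 8 bits), which bounds how much lost data such a code could rebuild at best.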
>
> On flash storage, maybe they're just concerned about bit rot or even
> the most superficial bit flips, and having just enough information to
> detect and correct for 1 or 2 flips per 4KB, not totally dissimilar to
> ECC memory. But that they don't use ECC memory, leave them open to
> corruption in the storage stack happening outside the literal storage
> device.
They also don't appear to use T10 DIF (or whatever the T13 equivalent
is; I can never remember its name), which means even if they did
use ECC RAM they would still have a period of time where the data is
unprotected.
>
>> Actually, I forgot about the (newer) metadata checksumming feature in ext4,
>> and was just basing my statement on behavior the last time I used it for
>> anything serious. Having just checked mkfs.ext4, it appears that the
>> metadata in the SB that tells the kernel what to do when it runs into an
>> error for the FS still defaults to continuing on as if nothing happens, even
>> if you enable metadata checksumming (which still seems to be disabled by
>> default). Whether or not that actually is honored by modern kernels, I
>> don't know, but I've seen no evidence to suggest that it isn't.
>
>
> Depending on the corruption, Btrfs continues as well. If I corrupt a
> deadend leaf that contains file metadata (like names or security
> contexts), I just get some complaints of corruption. The file system
> remains rw mounted though. I don't know the metric by which metadata
> can be damaged and Btrfs says "whoooaa!!" and puts on the brakes by
> going read only. XFS certainly has its limits and goes read only when
> it detects certain metadata corruption via checksum fail. I'd guess
> ext4 will do the same thing, otherwise whats the point if it's going
> to knowingly eat itself alive?
I'm pretty sure the ext4 behavior is a hold-over from the original ext
filesystem, and I think it goes back even as far as the version of the
MINIX filesystem that Linux originally used (which ext evolved out of).
At a minimum, all three error behaviors (panic, go read-only, or flag
and ignore) have been around since the early days of ext2.
FWIW, there are some cases where it does make sense to just not care and
ignore the errors. As a pretty specific example, one of the last
remaining places I still use ext4 is on top of compressed ramdisks when
I need some quick ephemeral storage that I want to be more memory
efficient than tmpfs. In such cases, the FS gets mounted exactly once,
and is usually used only for a very short period of time, and as a
result, the 'on-disk' data doesn't really matter much, so there's not
much point in worrying about it.
* Re: bad key ordering - repairable?
2018-01-23 13:06 ` Claes Fransson
2018-01-23 18:13 ` Claes Fransson
@ 2018-01-27 14:54 ` Claes Fransson
1 sibling, 0 replies; 17+ messages in thread
From: Claes Fransson @ 2018-01-27 14:54 UTC (permalink / raw)
To: Btrfs BTRFS
2018-01-23 14:06 GMT+01:00 Claes Fransson <claes.v.fransson@gmail.com>:
> 2018-01-22 22:22 GMT+01:00 Hugo Mills <hugo@carfax.org.uk>:
>> On Mon, Jan 22, 2018 at 10:06:58PM +0100, Claes Fransson wrote:
>>> Hi!
>>>
>>> I really like the features of BTRFS, especially deduplication,
>>> snapshotting and checksumming. However, when using it on my laptop the
>>> last couple of years, it has became corrupted a lot of times.
>>> Sometimes I have managed to fix the problems (at least so much that I
>>> can continue to use the filesystem) with check --repair, but several
>>> times I had to recreate the file system and reinstall the operating
>>> system.
>>>
>>> I am guessing the corruptions might be the results of unclean
>>> shutdowns, mostly after system hangs, but also because of running out
>>> of battery sometimes?
>>> Furthermore, the power-led has recently started blinking (also when
>>> the power-cable is plugged in), I guess because of an old and bad
>>> battery. Maybe the current corruption also can have something to do
>>> with this? However I almost always run with power cable plugged in in
>>> last year, only on battery a few seconds a few times when moving the
>>> laptop.
>>>
>>> Currently, I can only mount the filesystem readonly, it goes readonly
>>> automatically if I try to mount it normally.
>>>
>>> When booting an OpenSUSE Tumbleweed-20180119 live-iso:
>>> localhost:~ # uname -r
>>> 4.14.13-1-default
>>> localhost:~ # btrfs --version
>>> btrfs-progs v4.14.1
>>>
>>> localhost:~ # btrfs check -p /dev/sda12
>>> Checking filesystem on /dev/sda12
>>
>> [fixing up bad paste]
>>
>>> UUID: d2819d5a-fd69-484b-bf34-f2b5692cbe1f
>>> bad key ordering 159 160 bad block 690436964352
>>> ERROR: errors found in extent allocation tree or chunk allocation
>>> checking free space cache [.]
>>> checking fs roots [o]
>>> checking csums
>>> bad key ordering 159 160
>>> Error looking up extent record -1
>>
>> [snip]
>>
>>> localhost:~ # btrfs inspect-internal dump-tree -b 690436964352
>>> /dev/sda12
>>> btrfs-progs v4.14.1
>>> leaf 690436964352 items 170 free space 1811 generation 196864 owner 2
>>> leaf 690436964352 flags 0x1(WRITTEN) backref revision 1
>>> fs uuid d2819d5a-fd69-484b-bf34-f2b5692cbe1f
>>> chunk uuid 52f81fe6-893b-4432-9336-895057ee81e1
>>> .
>>> .
>>> .
>>> item 157 key (22732500992 EXTENT_ITEM 16384) itemoff 6538 itemsize 53
>>> refs 1 gen 821 flags DATA
>>> extent data backref root 287 objectid 51665 offset 0 count 1
>>> item 158 key (22732517376 EXTENT_ITEM 16384) itemoff 6485 itemsize 53
>>> refs 1 gen 821 flags DATA
>>> extent data backref root 287 objectid 51666 offset 0 count 1
>>> item 159 key (22732533760 EXTENT_ITEM 16384) itemoff 6485 itemsize 0
>>> print-tree.c:428: print_extent_item: BUG_ON `item_size != sizeof(*ei0)` triggered, value 1
>>> btrfs(+0x365c6)[0x55bdfaada5c6]
>>> btrfs(print_extent_item+0x424)[0x55bdfaadb284]
>>> btrfs(btrfs_print_leaf+0x94e)[0x55bdfaadbc1e]
>>> btrfs(btrfs_print_tree+0x295)[0x55bdfaadcf05]
>>> btrfs(cmd_inspect_dump_tree+0x734)[0x55bdfab1b024]
>>> btrfs(main+0x7d)[0x55bdfaac7d4d]
>>> /lib64/libc.so.6(__libc_start_main+0xea)[0x7ff42100ff4a]
>>> btrfs(_start+0x2a)[0x55bdfaac7e5a]
>>> Aborted (core dumped)
>>
>> Wow, I've never seen it do that before. It's the next thing I'd
>> have asked for, so it's good you've preempted it.
>>
>> The main thing is that bad key ordering is almost always due to RAM
>> corruption. That's either bad RAM, or dodgy power regulation -- the
>> latter could be the PSU, or capacitors on the motherboard. (In this
>> case, it might also be something funny with the battery).
>>
>> I would definitely recommend a long run of memtest86. At least 8
>> hours, preferably 24. If you get errors repeatedly in the same place,
>> it's the RAM. If they appear randomly, it's probably the power
>> regulation.
>>
> Thanks for the suggestion, I will try to do this in the next days.
>
>> [snip]
>>
>>>
>>> The filesystem had become pretty full, I had planned to increase the
>>> Btrfs-partition size before it became corrupt.
>>>
>>> Active kernel when the filesystem went read only: OpenSUSE Linux
>>> 4.14.14-1.geef6178-default, from the
>>> http://download.opensuse.org/repositories/Kernel:/stable/standard/stable
>>> repository.
>>>
>>> Fstab mount options: noatime,autodefrag (I have been using the option
>>> nossd with older kernels one period in the past on the filesystem).
>>>
>>> If it matters, I have been running duperemove many times on the
>>> filesystem since creation.
>>>
>>> To test the RAM, I have been running mprime Blend-test for 24 hours
>>> after the corruption without any error or warning.
>>
>> Of all of the bad key order errors I've seen (dozens), I think
>> there were a whole two which turned out not to be obviously related to
>> corrupt RAM. I still say that it's most likely the hardware.
>
> Okay, thank you for sharing your experience with me.
>
>>
>>> Is there a way I can try to repair this filesystem without the need to
>>> recreate it and reinstall the operating system? A reinstall including
>>> all currently installed packages, and restoring all current system
>>> settings, would probably take some time for me to do.
>>> If it is currently not repairable, it would be nice if this kind of
>>> corruption could be repaired in the future, even if losing a few
>>> files. Or if the corruptions could be avoided in the first place.
>>
>> Given that the current tools crash, the answer's a definite
>> no. However, if you can get a developer interested, they may be able
>> to write a fix for it, given an image of the FS (using btrfs-image).
>>
> Okay, will try to produce and upload an image within the next week.
>
>
I have now uploaded a btrfs-image of the file system to the cloud:
https://drive.google.com/file/d/1r2nesQy_W4wVb00BdZc5o2wqS8mHtOkK/view?usp=sharing
Claes
>> [snip]
>>> I have never noticed any corruptions on the NTFS and Ext4 file systems
>>> on the laptop, only on the Btrfs file systems.
>>
>> You've never _noticed_ them. :)
>>
>> Hugo.
>>
>> --
>> Hugo Mills | ... one ping(1) to rule them all, and in the
>> hugo@... carfax.org.uk | darkness bind(2) them.
>> http://carfax.org.uk/ |
>> PGP: E2AB1DE4 | Illiad
>
> Thank you for your answers.
>
> Claes
* Re: bad key ordering - repairable?
[not found] ` <CAEY8F1pVrZnf3M6mGJaxogx14ZrJ5CV3++_-y13sTniJ3ds4ww@mail.gmail.com>
@ 2018-01-27 17:42 ` Claes Fransson
0 siblings, 0 replies; 17+ messages in thread
From: Claes Fransson @ 2018-01-27 17:42 UTC (permalink / raw)
To: Btrfs BTRFS
2018-01-27 18:32 GMT+01:00 Claes Fransson <claes.v.fransson@gmail.com>:
>
> Duncan Wed, 24 Jan 2018 15:18:25 -0800
>
> Claes Fransson posted on Wed, 24 Jan 2018 20:44:33 +0100 as excerpted:
>
> > So, I have now some results from the PassMark Memtest86! I let the
> > default automatic tests run for about 19 hours and 16 passes. It
> > reported zero "Errors", but 4 lines of "[Note] RAM may be vulnerable to
> > high frequency row hammer bit flips". If I understand it correctly,
> > it means that some errors were detected when the RAM was tested at
> > higher rates than guaranteed accurate by the vendors.
>
> From Wikipedia:
>
>> Row hammer (also written as rowhammer) is an unintended side effect in
>> dynamic random-access memory (DRAM) that causes memory cells to leak
>> their charges and interact electrically between themselves, possibly
>> altering the contents of nearby memory rows that were not addressed in
>> the original memory access. This circumvention of the isolation between
>> DRAM memory cells results from the high cell density in modern DRAM, and
>> can be triggered by specially crafted memory access patterns that rapidly
>> activate the same memory rows numerous times.[1][2][3]
>>
>> The row hammer effect has been used in some privilege escalation computer
>> security exploits.
>>
>> https://en.wikipedia.org/wiki/Row_hammer
>>
>> So it has nothing to do with (generic) testing the RAM at higher rates
>> than guaranteed by the vendors, but rather, with deliberate rapid
>> repeated access (at normal clock rates) of the same cell rows in order
>> to trigger a bitflip in nearby memory cells that could not normally be
>> accessed due to process separation and insufficient privileges.
>
>
Well, I was thinking of the specific message reported by memtest86. See
the PassMark website, https://www.memtest86.com/troubleshooting.htm,
under "Why am I only getting errors during Test 13 Hammer Test?", second
paragraph.
Thanks for the Wikipedia explanation though.
>
>> IOW, it's unlikely to be accidentally tripped, and thus is exceedingly
>> unlikely to be relevant here, unless you're being hacked, of course.
>
>
Okay, thanks for your conclusion.
>
>>
> That said, and entirely unrelated to rowhammer, I know one of the
> problems of memory test false-negatives from experience.
>
> In my case, I was even running ECC RAM. But the memory I had purchased
> (back in the day when memory was far more expensive and sub-GB memory was
> the norm) was cheap, and as it happened, marked as stable at slightly
> higher clock rates than it actually was. But I couldn't afford more (or
> I'd have procured less dodgy RAM in the first place) and had little
> recourse but to live with it for awhile. A year or so later there was a
> BIOS update that added better memory clocking control, and I was able to
> declock the RAM slightly from its rating (IIRC to PC-3000 level, it was
> PC3200 rated, this was DDR1 era), after which it was /entirely/ stable,
> even after reducing some of the wait-state settings somewhat to try to
> claw back some of what I lost due to the underclocking.
>
> I run gentoo, and nearly all of my problems occurred when I was doing
> updates, building packages at 100% CPU with multiple cores accessing the
> same RAM. FWIW, the most frequent /detected/ problem was bunzip checksum
> errors as it decompressed and verified the data in memory (before writing
> out)... that would move or go away if I tried again. Occasionally I'd
> get machine-check errors (MCEs), but not frequently, and the ECC RAM
> subsystem /never/ reported errors.
>
My filesystem went readonly just after I did some updating of a lot of
packages (I think it was thousands of packages :) ), so massive
disk-IO for me, but possibly also some CPU and RAM usage...
>
>> But the memory tests gave that memory an all-clear.
>
>
>>> The problem with the memory tests in this case is that they tend to work
>>> on an otherwise unloaded system, and test the retention of the memory
>>> cells, /not/ so much the speed and reliability at which they are accessed
>>> under fully loaded system stress -- and how could they when memory speed
>>> is normally set by the BIOS and not something the memory tester has
>>> access to?
>>>
>>> But my memory problems weren't with the memory cells themselves -- they
>>> retained their data just fine and indeed it was ECC RAM so would have
>>> triggered ECC errors if they didn't -- but with the precision timing of
>>> memory IO -- it wasn't quite up to the specs it claimed to support and
>>> would occasionally produce in-transit errors (the ECC would have detected
>>> and possibly corrected errors in storage), and the memory testers simply
>>> didn't test that like a fully loaded system doing unpacks of sources and
>>> builds from them did.
>>>
>>> As mentioned, once I got a BIOS update that let me declock the RAM a bit,
>>> everything was fine, and it remained fine when I did upgrade the RAM some
>>> years later, after prices had fallen, as well.
>
>
Thanks for telling me, but unfortunately I do not have any setting to
change the clocking of the RAM on my laptop when booting into the
BIOS-settings menus.
Claes
>
>> (The system was first-gen AMD Opteron, on a server-grade Tyan board, that
>> I ran from purchase in late 2003 for over eight years, maxing out the
>> pair of CPUs to dual-core Opteron 290s and the RAM to 8 gigs, over time,
>> until the board finally died in 2012 due to burst capacitors. Which
>> reminds me, I'm still running the replacement, a Gigabyte with an fx6100
>> overclocked a bit to 3.9 GHz and 16 gig RAM, and it's now nearing six
>> years old, so I suppose I better start planning for the next upgrade...
>> I've spent that six years upgrading to big-screen TVs as monitors, with a
>> 65inch/165cm 4K as my primary now and a 48inch/122cm as a secondary to
>> put youtube or whatever on fullscreen, and to now my second generation of
>> ssds, a pair of 1 TB samsung evos, but this reminds me that at nearing
>> six years old the main system's aging too, so I better start thinking of
>> replacing it again...)
>>
>> --
>> Duncan - List replies preferred. No HTML msgs.
>> "Every nonfree program has a lord, a master --
>> and if you use the program, he is your master." Richard Stallman
>>
>> --
end of thread, other threads:[~2018-01-27 17:42 UTC | newest]
Thread overview: 17+ messages
2018-01-22 21:06 bad key ordering - repairable? Claes Fransson
2018-01-22 21:22 ` Hugo Mills
2018-01-23 13:06 ` Claes Fransson
2018-01-23 18:13 ` Claes Fransson
2018-01-24 0:31 ` Chris Murphy
2018-01-24 19:44 ` Claes Fransson
2018-01-24 23:15 ` Duncan
[not found] ` <CAEY8F1pVrZnf3M6mGJaxogx14ZrJ5CV3++_-y13sTniJ3ds4ww@mail.gmail.com>
2018-01-27 17:42 ` Claes Fransson
2018-01-27 14:54 ` Claes Fransson
2018-01-23 2:35 ` Chris Murphy
2018-01-23 12:51 ` Austin S. Hemmelgarn
2018-01-23 13:29 ` Claes Fransson
2018-01-24 0:44 ` Chris Murphy
2018-01-24 12:30 ` Austin S. Hemmelgarn
2018-01-24 23:54 ` Chris Murphy
2018-01-25 12:41 ` Austin S. Hemmelgarn
2018-01-23 13:17 ` Claes Fransson