* btrfs check --repair: ERROR: cannot read chunk root
@ 2016-10-30 18:34 Marc MERLIN
2016-10-31 1:02 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-10-30 18:34 UTC (permalink / raw)
To: linux-btrfs
I have a filesystem on top of md raid5 that got a few problems due to the
underlying block layer (bad data cable).
The filesystem mounts fine, but has had a few issues.
Scrub runs (I didn't let it finish; it takes a _long_ time),
but check --repair won't even run at all:
myth:~# btrfs --version
btrfs-progs v4.7.3
myth:~# uname -r
4.8.5-ia32-20161028
myth:~# btrfs check -p --repair /dev/mapper/crypt_bcache0 2>&1 | tee /var/spool/repair
bytenr mismatch, want=13835462344704, have=0
ERROR: cannot read chunk root
Couldn't open file system
enabling repair mode
myth:~#
myth:~# btrfs rescue super-recover -v /dev//mapper/crypt_bcache0
All Devices:
Device: id = 1, name = /dev//mapper/crypt_bcache0
Before Recovering:
[All good supers]:
device name = /dev//mapper/crypt_bcache0
superblock bytenr = 65536
device name = /dev//mapper/crypt_bcache0
superblock bytenr = 67108864
device name = /dev//mapper/crypt_bcache0
superblock bytenr = 274877906944
[All bad supers]:
All supers are valid, no need to recover
I don't care about the data, it's a backup array, but I'd still like to know
if I can recover from this state and do a repair to see how much data got
damaged
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-30 18:34 btrfs check --repair: ERROR: cannot read chunk root Marc MERLIN
@ 2016-10-31 1:02 ` Qu Wenruo
2016-10-31 2:06 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-10-31 1:02 UTC (permalink / raw)
To: Marc MERLIN, linux-btrfs
At 10/31/2016 02:34 AM, Marc MERLIN wrote:
> I have a filesystem on top of md raid5 that got a few problems due to the
> underlying block layer (bad data cable).
> The filesystem mounts fine, but had a few issues
> Scrub runs (I didn't let it finish, it takes a _long_ time)
> But check --repair won't even run at all:
>
> myth:~# btrfs --version
> btrfs-progs v4.7.3
> myth:~# uname -r
> 4.8.5-ia32-20161028
>
> myth:~# btrfs check -p --repair /dev/mapper/crypt_bcache0 2>&1 | tee
> /var/spool/repair
> bytenr mismatch, want=13835462344704, have=0
> ERROR: cannot read chunk root
Your chunk root is corrupted. The chunk tree provides the underlying
disk layout, even for a single device, so if we fail to read it, the
filesystem can never be mounted.
You could try using a backup chunk root.
Run "btrfs inspect-internal dump-super -f" to find the backup chunk root,
then use "btrfs check --chunk-root <backup chunk root bytenr>" to try
again.
Thanks,
Qu
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 1:02 ` Qu Wenruo
@ 2016-10-31 2:06 ` Marc MERLIN
2016-10-31 4:21 ` Marc MERLIN
2016-10-31 5:27 ` Qu Wenruo
0 siblings, 2 replies; 40+ messages in thread
From: Marc MERLIN @ 2016-10-31 2:06 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Mon, Oct 31, 2016 at 09:02:50AM +0800, Qu Wenruo wrote:
> Your chunk root is corrupted, and since chunk tree provides the
> underlying disk layout, even for single device, so if we failed to read
> it, then it will never be able to be mounted.
That's the thing though, I can mount the filesystem just fine :)
> You could try to use backup chunk root.
>
> "btrfs inspect-internal dump-super -f" to find the backup chunk root,
> and use "btrfs check --chunk-root <backup chunk root bytenr>" to have
> another try.
Am I doing this right? It doesn't seem to work
myth:~# btrfs check -p --repair --chunk-root 13835462344704 /dev/mapper/crypt_bcache0 2>&1 | tee /var/spool/repair2
bytenr mismatch, want=13835462344704, have=0
ERROR: cannot read chunk root
Couldn't open file system
enabling repair mode
myth:~# btrfs inspect-internal dump-super -f /dev/mapper/crypt_bcache0 | less
superblock: bytenr=65536, device=/dev/mapper/crypt_bcache0
---------------------------------------------------------
csum_type 0 (crc32c)
csum_size 4
csum 0x3814e4a0 [match]
bytenr 65536
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
fsid 6692cf4c-93d9-438c-ac30-5db6381dc4f2
label DS5
generation 51176
root 13845513109504
sys_array_size 129
chunk_root_generation 51135
root_level 1
chunk_root 13835462344704
chunk_root_level 1
log_root 0
log_root_transid 0
log_root_level 0
total_bytes 16002599346176
bytes_used 14584560160768
sectorsize 4096
nodesize 16384
leafsize 16384
stripesize 4096
root_dir 6
num_devices 1
compat_flags 0x0
compat_ro_flags 0x0
incompat_flags 0x169
( MIXED_BACKREF |
COMPRESS_LZO |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA )
cache_generation 51176
uuid_tree_generation 51176
dev_item.uuid 0cf779be-8e16-4982-b7d7-f8241deea0d1
dev_item.fsid 6692cf4c-93d9-438c-ac30-5db6381dc4f2 [match]
dev_item.type 0
dev_item.total_bytes 16002599346176
dev_item.bytes_used 14691011133440
dev_item.io_align 4096
dev_item.io_width 4096
dev_item.sector_size 4096
dev_item.devid 1
dev_item.dev_group 0
dev_item.seek_speed 0
dev_item.bandwidth 0
dev_item.generation 0
sys_chunk_array[2048]:
item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 13835461197824)
chunk length 33554432 owner 2 stripe_len 65536
type SYSTEM|DUP num_stripes 2
stripe 0 devid 1 offset 13500327919616
dev uuid: 0cf779be-8e16-4982-b7d7-f8241deea0d1
stripe 1 devid 1 offset 13500361474048
dev uuid: 0cf779be-8e16-4982-b7d7-f8241deea0d1
backup_roots[4]:
backup 0:
backup_tree_root: 12801101791232 gen: 51174 level: 1
backup_chunk_root: 13835462344704 gen: 51135 level: 1
backup_extent_root: 12801124352000 gen: 51174 level: 3
backup_fs_root: 10548133724160 gen: 51172 level: 0
backup_dev_root: 11125467824128 gen: 51172 level: 1
backup_csum_root: 12801133953024 gen: 51174 level: 3
backup_total_bytes: 16002599346176
backup_bytes_used: 14584560160768
backup_num_devices: 1
backup 1:
backup_tree_root: 13842532810752 gen: 51175 level: 1
backup_chunk_root: 13835462344704 gen: 51135 level: 1
backup_extent_root: 13843784695808 gen: 51175 level: 3
backup_fs_root: 10548133724160 gen: 51172 level: 0
backup_dev_root: 11125467824128 gen: 51172 level: 1
backup_csum_root: 13842542362624 gen: 51175 level: 3
backup_total_bytes: 16002599346176
backup_bytes_used: 14584560160768
backup_num_devices: 1
backup 2:
backup_tree_root: 13845513109504 gen: 51176 level: 1
backup_chunk_root: 13835462344704 gen: 51135 level: 1
backup_extent_root: 13845513191424 gen: 51176 level: 3
backup_fs_root: 10548133724160 gen: 51172 level: 0
backup_dev_root: 11125467824128 gen: 51172 level: 1
backup_csum_root: 13852180938752 gen: 51176 level: 3
backup_total_bytes: 16002599346176
backup_bytes_used: 14584560160768
backup_num_devices: 1
backup 3:
backup_tree_root: 12750807580672 gen: 51173 level: 1
backup_chunk_root: 13835462344704 gen: 51135 level: 1
backup_extent_root: 12750810447872 gen: 51173 level: 3
backup_fs_root: 10548133724160 gen: 51172 level: 0
backup_dev_root: 11125467824128 gen: 51172 level: 1
backup_csum_root: 12684302712832 gen: 51173 level: 3
backup_total_bytes: 16002599346176
backup_bytes_used: 14584560177152
backup_num_devices: 1
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 2:06 ` Marc MERLIN
@ 2016-10-31 4:21 ` Marc MERLIN
2016-10-31 5:27 ` Qu Wenruo
1 sibling, 0 replies; 40+ messages in thread
From: Marc MERLIN @ 2016-10-31 4:21 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Sun, Oct 30, 2016 at 07:06:16PM -0700, Marc MERLIN wrote:
> On Mon, Oct 31, 2016 at 09:02:50AM +0800, Qu Wenruo wrote:
> > Your chunk root is corrupted, and since chunk tree provides the
> > underlying disk layout, even for single device, so if we failed to read
> > it, then it will never be able to be mounted.
>
> That's the thing though, I can mount the filesystem just fine :)
Actually, has anyone seen a configuration where the kernel can mount a
filesystem read/write, without ro or recovery options, and yet
btrfs check --repair can't open it?
This kind of sounds like a bug in check --repair IMO.
Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 2:06 ` Marc MERLIN
2016-10-31 4:21 ` Marc MERLIN
@ 2016-10-31 5:27 ` Qu Wenruo
2016-10-31 5:47 ` Marc MERLIN
1 sibling, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-10-31 5:27 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-btrfs
At 10/31/2016 10:06 AM, Marc MERLIN wrote:
> On Mon, Oct 31, 2016 at 09:02:50AM +0800, Qu Wenruo wrote:
>> Your chunk root is corrupted, and since chunk tree provides the
>> underlying disk layout, even for single device, so if we failed to read
>> it, then it will never be able to be mounted.
>
> That's the thing though, I can mount the filesystem just fine :)
That's strange, pretty strange.
And according to your super dump, I don't see anything btrfs-progs
can't handle.
Your chunk tree lies in a DUP chunk, which btrfs-progs should be able to
handle. (Unlike RAID5/6, which btrfs-progs can't recover at read time.)
>
>> You could try to use backup chunk root.
>>
>> "btrfs inspect-internal dump-super -f" to find the backup chunk root,
>> and use "btrfs check --chunk-root <backup chunk root bytenr>" to have
>> another try.
>
> Am I doing this right? It doesn't seem to work
>
> myth:~# btrfs check -p --repair --chunk-root 13835462344704 /dev/mapper/crypt_bcache0 2>&1 | tee /var/spool/repair2
> bytenr mismatch, want=13835462344704, have=0
> ERROR: cannot read chunk root
> Couldn't open file system
> enabling repair mode
You're doing it right, but the superblock doesn't contain any older
chunk root bytenr: every backup slot points at the current chunk root.
So this method doesn't work at all. :(
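One way to confirm this from the dump is to pull every chunk root bytenr out of the dump-super output and see whether more than one distinct value appears. This is a minimal sketch, assuming the field layout shown in the dump earlier in this thread; the function name is hypothetical and not part of btrfs-progs.

```shell
# Hypothetical helper: print the distinct chunk root bytenrs found in
# "btrfs inspect-internal dump-super -f" output read from stdin.
# Assumes lines of the form "chunk_root <bytenr>" and
# "backup_chunk_root: <bytenr> gen: ...", as in the dump in this thread.
distinct_chunk_roots() {
    # Exact match on field 1, so chunk_root_generation and
    # chunk_root_level are not picked up.
    awk '$1 == "chunk_root" || $1 == "backup_chunk_root:" { print $2 }' | sort -u
}

# Usage (needs root and the real device):
#   btrfs inspect-internal dump-super -f /dev/mapper/crypt_bcache0 | distinct_chunk_roots
```

If only one bytenr comes back, there is no older chunk root for --chunk-root to fall back on.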
Would you please dump the following bytes?
That's the chunk root tree block on your disk.
offset: 13500329066496 length: 16384
offset: 13500330213376 length: 16384
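For reference, physical offsets like these can be derived from the sys_chunk_array in the dump. The sketch below assumes a DUP chunk, where each stripe holds a full copy, so each copy sits at the stripe offset plus the block's distance from the chunk's logical start. The function name is hypothetical.

```shell
# Hypothetical helper: map a logical bytenr inside a DUP chunk to the
# physical offset of each copy.
#   physical = stripe_offset + (logical - chunk_logical_start)
# Arguments: <logical> <chunk_logical_start> <stripe_offset>...
dup_physical_offsets() {
    local logical=$1 chunk_start=$2 stripe
    shift 2
    for stripe in "$@"; do
        echo $(( stripe + logical - chunk_start ))
    done
}

# Values taken from the dump-super output in this thread:
# chunk_root, the CHUNK_ITEM key offset, then stripe 0 and stripe 1 offsets.
dup_physical_offsets 13835462344704 13835461197824 13500327919616 13500361474048
```

With the values from the dump this yields 13500329066496 and 13500362620928. The first matches the first offset listed above; the second does not match 13500330213376, so that figure may be worth double-checking.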
According to your fsck error output, I assume btrfs-progs fails to read
the first copy of the chunk root and, due to a bug, doesn't go on to
read the 2nd copy.
The kernel, meanwhile, goes on to read the 2nd copy and everything works.
IIRC btrfs-progs can handle a csum error and keep trying, so maybe some
logic goes wrong.
Thanks,
Qu
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 5:27 ` Qu Wenruo
@ 2016-10-31 5:47 ` Marc MERLIN
2016-10-31 6:04 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-10-31 5:47 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Mon, Oct 31, 2016 at 01:27:56PM +0800, Qu Wenruo wrote:
> Would you please dump the following bytes?
> That's the chunk root tree block on your disk.
>
> offset: 13500329066496 length: 16384
> offset: 13500330213376 length: 16384
Sorry for asking, am I doing this wrong?
myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32 skip=26367830208
dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000401393 s, 0.0 kB/s
> According to your fsck error output, I assume btrfs-progs fails to read
> the first copy of chunk root, and due to a bug, it doesn't continue to
> read 2nd copy.
>
> While kernel continues to read the 2nd copy and everything goes on.
Ah, that would make sense.
But from what you're saying, I should be able to do recovery by pointing
to the 2nd copy of the chunk root, but somehow I haven't typed the right
command to do so yet, correct?
Should I try another command offset than
btrfs check -p --repair --chunk-root 13835462344704 /dev/mapper/crypt_bcache0
?
Or are you saying the btrfs progs bug causes it to fail to even try to read
the 2nd copy of the chunk root even though it was given on the command line?
Thanks,
Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 5:47 ` Marc MERLIN
@ 2016-10-31 6:04 ` Qu Wenruo
2016-10-31 6:25 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-10-31 6:04 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-btrfs
At 10/31/2016 01:47 PM, Marc MERLIN wrote:
> On Mon, Oct 31, 2016 at 01:27:56PM +0800, Qu Wenruo wrote:
>> Would you please dump the following bytes?
>> That's the chunk root tree block on your disk.
>>
>> offset: 13500329066496 length: 16384
>> offset: 13500330213376 length: 16384
>
> Sorry for asking, am I doing this wrong?
> myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32
> skip=26367830208
> dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
> 0+0 records in
> 0+0 records out
> 0 bytes (0 B) copied, 0.000401393 s, 0.0 kB/s
So, the underlying MD RAID5 is complaining about some wrong data, and
refuses to read it out.
It seems that btrfs-progs can't handle a read failure?
Maybe dm-error could emulate it.
And what about the 2nd range?
>
>> According to your fsck error output, I assume btrfs-progs fails to read
>> the first copy of chunk root, and due to a bug, it doesn't continue to
>> read 2nd copy.
>>
>> While kernel continues to read the 2nd copy and everything goes on.
>
> Ah, that would make sense.
> But from what you're saying, I should be able to do recovery by pointing
> to the 2nd copy of the chunk root, but somehow I haven't typed the right
> command to do so yet, correct?
Unfortunately, that's not the case.
The --chunk-root option takes a *logical* bytenr.
We can only tell btrfs-progs (the kernel is the same) to look for the tree
root/chunk root at a given *logical* bytenr.
We can't specify which *physical* copy to read.
Normally, btrfs-progs/kernel should find the correct physical copy
without problem, but not this time for btrfs-progs.
Furthermore, all backup chunk roots in fact point to the current
chunk root, so --chunk-root doesn't work at all.
>
> Should I try another command offset than
> btrfs check -p --repair --chunk-root 13835462344704 /dev/mapper/crypt_bcache0
> ?
Nope, that bytenr is a *physical* bytenr, not the *logical* bytenr
--chunk-root accepts.
But the read error for the first tree block already gives a hint.
I'll try to emulate it.
Thanks,
Qu
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 6:04 ` Qu Wenruo
@ 2016-10-31 6:25 ` Marc MERLIN
2016-10-31 6:32 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-10-31 6:25 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Mon, Oct 31, 2016 at 02:04:10PM +0800, Qu Wenruo wrote:
> >Sorry for asking, am I doing this wrong?
> >myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32
> >skip=26367830208
> >dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
> >0+0 records in
> >0+0 records out
> >0 bytes (0 B) copied, 0.000401393 s, 0.0 kB/s
>
> So, the underlying MD RAID5 are complaining about some wrong data, and
> refuse to read out.
>
> It seems that btrfs-progs can't handle read failure?
> Maybe dm-error could emulate it.
>
> And what about the 2nd range?
They both fail the same way, but I wasn't sure if I typed the wrong dd command
or not.
myth:~# btrfs fi df /mnt/mnt
Data, single: total=13.22TiB, used=13.19TiB
System, DUP: total=32.00MiB, used=1.42MiB
Metadata, DUP: total=74.00GiB, used=72.82GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
myth:~# btrfs fi show
Label: 'DS5' uuid: 6692cf4c-93d9-438c-ac30-5db6381dc4f2
Total devices 1 FS bytes used 13.26TiB
devid 1 size 14.55TiB used 13.36TiB path /dev/mapper/crypt_bcache0
For now, I mounted the filesystem and I'm running scrub on it to see how
much damage there is. It will take all night:
BTRFS warning (device dm-0): checksum error at logical 27886878720 on dev /dev/mapper/crypt_bcache0, sector 56580096, root 9461, inode 45837, offset 15460089856, length 4096, links 1 (path: system/mlocate/mlocate.db)
BTRFS error (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 27887009792 on dev /dev/mapper/crypt_bcache0
BTRFS error (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 27886878720 on dev /dev/mapper/crypt_bcache0
BTRFS warning (device dm-0): checksum error at logical 27885961216 on dev /dev/mapper/crypt_bcache0, sector 56578304, root 9461, inode 45837, offset 15459172352, length 4096, links 1 (path: system/mlocate/mlocate.db)
BTRFS warning (device dm-0): checksum error at logical 27885830144 on dev /dev/mapper/crypt_bcache0, sector 56578048, root 9461, inode 45837, offset 15459041280, length 4096, links 1 (path: system/mlocate/mlocate.db)
BTRFS error (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 27885830144 on dev /dev/mapper/crypt_bcache0
BTRFS error (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 27885961216 on dev /dev/mapper/crypt_bcache0
BTRFS warning (device dm-0): checksum error at logical 27887013888 on dev /dev/mapper/crypt_bcache0, sector 56580360, root 9461, inode 45837, offset 15460225024, length 4096, links 1 (path: system/mlocate/mlocate.db)
BTRFS error (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 27887013888 on dev /dev/mapper/crypt_bcache0
BTRFS warning (device dm-0): checksum error at logical 27885834240 on dev /dev/mapper/crypt_bcache0, sector 56578056, root 9461, inode 45837, offset 15459045376, length 4096, links 1 (path: system/mlocate/mlocate.db)
BTRFS error (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 27885834240 on dev /dev/mapper/crypt_bcache0
BTRFS warning (device dm-0): checksum error at logical 27887017984 on dev /dev/mapper/crypt_bcache0, sector 56580368, root 9461, inode 45837, offset 15460229120, length 4096, links 1 (path: system/mlocate/mlocate.db)
BTRFS error (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 27887017984 on dev /dev/mapper/crypt_bcache0
So far, it looks like minor damage limited to one file. I'll see tomorrow morning after it's done reading the whole array.
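To keep track of which files are affected as the scrub proceeds, the checksum warnings can be tallied by path. This is a rough sketch that assumes the kernel log format shown above, with each warning ending in "(path: ...)"; the function name is hypothetical.

```shell
# Hypothetical helper: count scrub checksum-error warnings per file path.
# Reads kernel log lines on stdin; assumes each warning ends in "(path: ...)".
count_csum_errors_by_path() {
    sed -n '/checksum error at logical/s/.*(path: \(.*\))$/\1/p' \
        | sort | uniq -c | sort -rn
}

# Usage:
#   dmesg | count_csum_errors_by_path
```

Each output line is a count followed by a path, most affected file first.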
> And further more, all backup chunk root are in facts pointing to current
> chunk root, so --chunk-root doesn't work at all.
Ah, ok, so there is nothing I can do at the moment until I get a new btrfs-progs, correct?
Thanks for your answers
Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 6:25 ` Marc MERLIN
@ 2016-10-31 6:32 ` Qu Wenruo
2016-10-31 6:37 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-10-31 6:32 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-btrfs
At 10/31/2016 02:25 PM, Marc MERLIN wrote:
> On Mon, Oct 31, 2016 at 02:04:10PM +0800, Qu Wenruo wrote:
>>> Sorry for asking, am I doing this wrong?
>>> myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32
>>> skip=26367830208
>>> dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
>>> 0+0 records in
>>> 0+0 records out
>>> 0 bytes (0 B) copied, 0.000401393 s, 0.0 kB/s
>>
>> So, the underlying MD RAID5 are complaining about some wrong data, and
>> refuse to read out.
>>
>> It seems that btrfs-progs can't handle read failure?
>> Maybe dm-error could emulate it.
>>
>> And what about the 2nd range?
>
> they both fail the same way, but I wasn't sure if I typed the wrong dd command
> or not.
Strange, your command seems OK to me.
Does it have anything to do with your security setup or something like that?
Or is it related to dm-crypt or bcache?
But this reminds me: if dd can't read it, maybe btrfs-progs can't either.
Maybe only the kernel can read the dm-crypt device, while user space tools
can't access dm-crypt devices directly?
Thanks,
Qu
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 6:32 ` Qu Wenruo
@ 2016-10-31 6:37 ` Marc MERLIN
2016-10-31 7:04 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-10-31 6:37 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Mon, Oct 31, 2016 at 02:32:53PM +0800, Qu Wenruo wrote:
>
>
> At 10/31/2016 02:25 PM, Marc MERLIN wrote:
> >On Mon, Oct 31, 2016 at 02:04:10PM +0800, Qu Wenruo wrote:
> >>>Sorry for asking, am I doing this wrong?
> >>>myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32
> >>>skip=26367830208
> >>>dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
> >>>0+0 records in
> >>>0+0 records out
> >>>0 bytes (0 B) copied, 0.000401393 s, 0.0 kB/s
> >>
> >>So, the underlying MD RAID5 are complaining about some wrong data, and
> >>refuse to read out.
> >>
> >>It seems that btrfs-progs can't handle read failure?
> >>Maybe dm-error could emulate it.
> >>
> >>And what about the 2nd range?
> >
> >they both fail the same way, but I wasn't sure if I typed the wrong dd command
> >or not.
>
> Strange, your command seems OK to me.
>
> Does it has anything to do with your security setup or something like that?
> Or is it related to dm-crypt or bcache?
>
>
> But this reminds me, if dd can't read it, maybe btrfs-progs is the same.
>
> Maybe only kernel can read dm-crypt device while user space tools can't
> access dm-crypt devices directly?
It can, it's just the offset seems wrong:
myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32 skip=26367830208
dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000421662 s, 0.0 kB/s
If I divide by 1000, it works:
myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32 skip=26367830
32+0 records in
32+0 records out
16384 bytes (16 kB) copied, 0.139005 s, 118 kB/s
So that's why I was asking whether I computed the offset wrong. I took the
value you gave and divided by 512, but it seems too big:
13500329066496 / 512 = 26367830208
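The arithmetic itself checks out. As a sketch, the dd parameters for a byte range can be computed like this (the function name is hypothetical, and it assumes the offset and length are multiples of the block size):

```shell
# Hypothetical helper: turn a byte offset and length into dd's bs/skip/count.
# Arguments: <byte_offset> <byte_length> [block_size, default 512]
dd_args_for_range() {
    local offset=$1 length=$2 bs=${3:-512}
    # dd's skip= and count= are in units of bs, so both values must be aligned.
    if [ $(( offset % bs )) -ne 0 ] || [ $(( length % bs )) -ne 0 ]; then
        echo "offset and length must be multiples of bs=$bs" >&2
        return 1
    fi
    echo "bs=$bs skip=$(( offset / bs )) count=$(( length / bs ))"
}

# First chunk root copy from this thread: offset 13500329066496, length 16384.
dd_args_for_range 13500329066496 16384
```

This gives bs=512 skip=26367830208 count=32, the same values as in the failing command. Note that skip=26367830 reads at byte 13500328960, roughly 13.5 GB into the device, so the read that succeeds is of a different region and not the chunk root.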
Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 6:37 ` Marc MERLIN
@ 2016-10-31 7:04 ` Qu Wenruo
2016-10-31 8:44 ` Hugo Mills
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-10-31 7:04 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-btrfs
At 10/31/2016 02:37 PM, Marc MERLIN wrote:
> On Mon, Oct 31, 2016 at 02:32:53PM +0800, Qu Wenruo wrote:
>>
>>
>> At 10/31/2016 02:25 PM, Marc MERLIN wrote:
>>> On Mon, Oct 31, 2016 at 02:04:10PM +0800, Qu Wenruo wrote:
>>>>> Sorry for asking, am I doing this wrong?
>>>>> myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32
>>>>> skip=26367830208
>>>>> dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
>>>>> 0+0 records in
>>>>> 0+0 records out
>>>>> 0 bytes (0 B) copied, 0.000401393 s, 0.0 kB/s
>>>>
>>>> So, the underlying MD RAID5 are complaining about some wrong data, and
>>>> refuse to read out.
>>>>
>>>> It seems that btrfs-progs can't handle read failure?
>>>> Maybe dm-error could emulate it.
>>>>
>>>> And what about the 2nd range?
>>>
>>> they both fail the same, but I wasn' tsure if I typed the wrong dd command
>>> or not.
>>
>> Strange, your command seems OK to me.
>>
>> Does it has anything to do with your security setup or something like that?
>> Or is it related to dm-crypt or bcache?
>>
>>
>> But this reminds me, if dd can't read it, maybe btrfs-progs is the same.
>>
>> Maybe only kernel can read dm-crypt device while user space tools can't
>> access dm-crypt devices directly?
>
> It can, it's just the offset seems wrong:
>
> myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32 skip=26367830208
> dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
> 0+0 records in
> 0+0 records out
> 0 bytes (0 B) copied, 0.000421662 s, 0.0 kB/s
>
> If I divide by 1000, it works:
> myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32 skip=26367830
> 32+0 records in
> 32+0 records out
> 16384 bytes (16 kB) copied, 0.139005 s, 118 kB/s
>
> so that's why I was asking you if I counted the offset wrong. I took the
> value you asked and divided by 512, but it seems too big
>
> 13500329066496 / 512 = 26367830208
>
> Marc
>
But according to your dump-super output, that's strange.
------
chunk_root 13835462344704 (CR)
item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 13835461197824) (CS)
chunk length 33554432 owner 2 stripe_len 65536
type SYSTEM|DUP num_stripes 2
stripe 0 devid 1 offset 13500327919616 (ST1)
dev uuid: 0cf779be-8e16-4982-b7d7-f8241deea0d1
stripe 1 devid 1 offset 13500361474048 (ST2)
dev uuid: 0cf779be-8e16-4982-b7d7-f8241deea0d1
------
Here, your chunk logical bytenr is 13835461197824, and its physical
bytenr is 13500327919616 and 13500361474048.
My calculation is quite simple.
Start1 = CR - CS + ST1
Start2 = CR - CS + ST2
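The arithmetic above can be reproduced with a short Python sketch. The constants are copied from the dump-super output; the helper name physical_offset is only illustrative, and the plain subtraction assumes a SYSTEM|DUP chunk on a single device, where each stripe holds a full copy of the chunk.

```python
# Values copied from the dump-super output quoted above.
CR = 13835462344704   # chunk_root logical bytenr
CS = 13835461197824   # logical start of the SYSTEM chunk
ST1 = 13500327919616  # physical offset of stripe 0
ST2 = 13500361474048  # physical offset of stripe 1

SECTOR = 512          # block size used with dd (bs=512)
TIB = 2**40


def physical_offset(logical, chunk_start, stripe_offset):
    """Map a logical bytenr inside a DUP chunk to a physical byte offset."""
    return logical - chunk_start + stripe_offset


start1 = physical_offset(CR, CS, ST1)
start2 = physical_offset(CR, CS, ST2)

for name, start in (("Start1", start1), ("Start2", start2)):
    # dd's skip= is counted in units of bs, so divide the byte offset by 512.
    print(name, start, "bytes, dd skip =", start // SECTOR,
          "(about %.2f TiB)" % (start / TIB))
```

With these inputs the first copy lands at byte 13500329066496, which divided by 512 gives the skip=26367830208 used in the dd command, so the offset itself was computed correctly.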
Unless the superblock is incorrect, a wrong offset is not possible.
And the physical offset is about 12.2 TiB, which is smaller than the 15TiB
of your device.
So it's quite strange that dd can't read out the data.
And if dd can't read it out, then I see no reason btrfs-progs could read
it out.
Any idea about a special dm setup that could make us fail to read out some
data range?
Thanks,
Qu
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 7:04 ` Qu Wenruo
@ 2016-10-31 8:44 ` Hugo Mills
2016-10-31 15:04 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Hugo Mills @ 2016-10-31 8:44 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Marc MERLIN, linux-btrfs
On Mon, Oct 31, 2016 at 03:04:27PM +0800, Qu Wenruo wrote:
>
>
> At 10/31/2016 02:37 PM, Marc MERLIN wrote:
> >On Mon, Oct 31, 2016 at 02:32:53PM +0800, Qu Wenruo wrote:
> >>
> >>
> >>At 10/31/2016 02:25 PM, Marc MERLIN wrote:
> >>>On Mon, Oct 31, 2016 at 02:04:10PM +0800, Qu Wenruo wrote:
> >>>>>Sorry for asking, am I doing this wrong?
> >>>>>myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32
> >>>>>skip=26367830208
> >>>>>dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
> >>>>>0+0 records in
> >>>>>0+0 records out
> >>>>>0 bytes (0 B) copied, 0.000401393 s, 0.0 kB/s
> >>>>
> >>>>So, the underlying MD RAID5 are complaining about some wrong data, and
> >>>>refuse to read out.
> >>>>
> >>>>It seems that btrfs-progs can't handle read failure?
> >>>>Maybe dm-error could emulate it.
> >>>>
> >>>>And what about the 2nd range?
> >>>
> >>>they both fail the same, but I wasn' tsure if I typed the wrong dd command
> >>>or not.
> >>
> >>Strange, your command seems OK to me.
> >>
> >>Does it has anything to do with your security setup or something like that?
> >>Or is it related to dm-crypt or bcache?
> >>
> >>
> >>But this reminds me, if dd can't read it, maybe btrfs-progs is the same.
> >>
> >>Maybe only kernel can read dm-crypt device while user space tools can't
> >>access dm-crypt devices directly?
> >
> >It can, it's just the offset seems wrong:
> >
> >myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32 skip=26367830208
> >dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
> >0+0 records in
> >0+0 records out
> >0 bytes (0 B) copied, 0.000421662 s, 0.0 kB/s
> >
> >If I divide by 1000, it works:
> >myth:~# dd if=/dev/mapper/crypt_bcache0 of=/tmp/dump1 bs=512 count=32 skip=26367830
> >32+0 records in
> >32+0 records out
> >16384 bytes (16 kB) copied, 0.139005 s, 118 kB/s
> >
> >so that's why I was asking you if I counted the offset wrong. I took the
> >value you asked and divided by 512, but it seems too big
> >
> >13500329066496 / 512 = 26367830208
> >
> >Marc
> >
> But according to your dump-super output, that's strange.
> ------
> chunk_root 13835462344704 (CR)
> item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 13835461197824) (CS)
> chunk length 33554432 owner 2 stripe_len 65536
> type SYSTEM|DUP num_stripes 2
> stripe 0 devid 1 offset 13500327919616 (ST1)
> dev uuid: 0cf779be-8e16-4982-b7d7-f8241deea0d1
> stripe 1 devid 1 offset 13500361474048 (ST2)
> dev uuid: 0cf779be-8e16-4982-b7d7-f8241deea0d1
> ------
>
> Here, your chunk logical bytenr is 13835461197824, and its physical
> bytenr is 13500327919616 and 13500361474048.
>
> My calculation is quite simple.
> Start1 = CR - CS + ST1
> Start2 = CR - CS + ST2
>
> Unless the superblock is incorrect, it is not possile.
>
> And the physical offset, is about 12.2 TiB, which is smaller than
> 15TiB of your device.
>
> So that's quite strange that dd can't read out the data.
> And if dd can't read it out, then I see no reason btrfs-progs can
> read it out.
>
> Any idea on special dm setup which can make us fail to read out some
> data range?
I've seen both btrfs check and btrfs dump-super give wrong answers
(particularly, some addresses end up larger than the device, for some
reason) when run on a mounted filesystem. Worth ruling that one out.
Hugo.
--
Hugo Mills | Great films about cricket: Silly Point Break
hugo@... carfax.org.uk |
http://carfax.org.uk/ |
PGP: E2AB1DE4 |
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 8:44 ` Hugo Mills
@ 2016-10-31 15:04 ` Marc MERLIN
2016-11-01 3:48 ` Marc MERLIN
2016-11-01 4:13 ` Qu Wenruo
0 siblings, 2 replies; 40+ messages in thread
From: Marc MERLIN @ 2016-10-31 15:04 UTC (permalink / raw)
To: Hugo Mills, Qu Wenruo, linux-btrfs
On Mon, Oct 31, 2016 at 08:44:12AM +0000, Hugo Mills wrote:
> > Any idea on special dm setup which can make us fail to read out some
> > data range?
>
> I've seen both btrfs check and btrfs dump-super give wrong answers
> (particularly, some addresses end up larger than the device, for some
> reason) when run on a mounted filesystem. Worth ruling that one out.
I just finished running my scrub overnight, and it failed around 10%:
[115500.316921] BTRFS error (device dm-0): bad tree block start 8461247125784585065 17619396231168
[115500.332354] BTRFS error (device dm-0): bad tree block start 8461247125784585065 17619396231168
[115500.332626] BTRFS: error (device dm-0) in __btrfs_free_extent:6954: errno=-5 IO failure
[115500.332629] BTRFS info (device dm-0): forced readonly
[115500.332632] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2960: errno=-5 IO failure
[115500.436002] btrfs_printk: 550 callbacks suppressed
[115500.436024] BTRFS warning (device dm-0): Skipping commit of aborted transaction.
[115500.436029] BTRFS: error (device dm-0) in cleanup_transaction:1854: errno=-5 IO failure
myth:~# ionice -c 3 nice -10 btrfs scrub start -Bd /mnt/mnt
(...)
scrub device /dev/mapper/crypt_bcache0 (id 1) canceled
scrub started at Sun Oct 30 22:52:59 2016 and was aborted after 09:03:11
total bytes scrubbed: 1.15TiB with 512 errors
error details: csum=512
corrected errors: 0, uncorrectable errors: 512, unverified errors: 0
Am I correct that if I see "__btrfs_free_extent:6954: errno=-5 IO failure" it means
that btrfs had physical read errors from the underlying block layer?
Do I have some weird mismatch between the size of my md array and the size of my filesystem
(as per dd apparently thinking parts of it are out of bounds?)
Yet, the sizes seem to match:
myth:~# mdadm --query --detail /dev/md5
/dev/md5:
Version : 1.2
Creation Time : Tue Jan 21 10:35:52 2014
Raid Level : raid5
Array Size : 15627542528 (14903.59 GiB 16002.60 GB)
Used Dev Size : 3906885632 (3725.90 GiB 4000.65 GB)
Raid Devices : 5
Total Devices : 5
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Oct 31 07:56:07 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : gargamel.svh.merlins.org:5
UUID : ec672af7:a66d9557:2f00d76c:38c9f705
Events : 147992
Number Major Minor RaidDevice State
0 8 97 0 active sync /dev/sdg1
6 8 113 1 active sync /dev/sdh1
2 8 81 2 active sync /dev/sdf1
3 8 65 3 active sync /dev/sde1
5 8 49 4 active sync /dev/sdd1
myth:~# btrfs fi df /mnt/mnt
Data, single: total=13.22TiB, used=13.19TiB
System, DUP: total=32.00MiB, used=1.42MiB
Metadata, DUP: total=75.00GiB, used=72.82GiB
GlobalReserve, single: total=512.00MiB, used=6.73MiB
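A rough cross-check of the sizes above, as a Python sketch. It assumes mdadm reports Array Size and Used Dev Size in KiB (which is consistent with the GiB figures printed next to them) and ignores the small overhead that the bcache and dm-crypt headers add on top of md5. The chunk root offset is the byte offset that was divided by 512 for the dd command earlier in the thread.

```python
KIB = 1024
TIB = 2**40

# Figures from the mdadm --detail output above (assumed to be in KiB).
array_size = 15627542528 * KIB
used_dev_size = 3906885632 * KIB
raid_devices = 5

# RAID5 keeps one device's worth of parity, so usable space is (n - 1) devices.
expected_size = (raid_devices - 1) * used_dev_size

# Physical byte offset of the first chunk root copy, from earlier in the thread.
chunk_root_copy = 13500329066496

print("array size matches RAID5 geometry:", array_size == expected_size)
print("array size in TiB: %.2f" % (array_size / TIB))
print("chunk root copy inside the array:", chunk_root_copy < array_size)
```

If these checks hold, the failing offset is well inside the md array, which suggests the dd failure is not a simple out-of-bounds read.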
Thanks,
Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 15:04 ` Marc MERLIN
@ 2016-11-01 3:48 ` Marc MERLIN
2016-11-01 4:13 ` Qu Wenruo
1 sibling, 0 replies; 40+ messages in thread
From: Marc MERLIN @ 2016-11-01 3:48 UTC (permalink / raw)
To: Hugo Mills, Qu Wenruo, linux-btrfs
So, I'm willing to wait 2 more days before I wipe this filesystem and
start over if I can't get check --repair to work on it.
If you need longer, please let me know if you have an upcoming patch for me
to try and I'll wait.
Thanks,
Marc
On Mon, Oct 31, 2016 at 08:04:22AM -0700, Marc MERLIN wrote:
> On Mon, Oct 31, 2016 at 08:44:12AM +0000, Hugo Mills wrote:
> > > Any idea on special dm setup which can make us fail to read out some
> > > data range?
> >
> > I've seen both btrfs check and btrfs dump-super give wrong answers
> > (particularly, some addresses end up larger than the device, for some
> > reason) when run on a mounted filesystem. Worth ruling that one out.
>
> I just finished running my scrub overnight, and it failed around 10%:
> [115500.316921] BTRFS error (device dm-0): bad tree block start 8461247125784585065 17619396231168
> [115500.332354] BTRFS error (device dm-0): bad tree block start 8461247125784585065 17619396231168
> [115500.332626] BTRFS: error (device dm-0) in __btrfs_free_extent:6954: errno=-5 IO failure
> [115500.332629] BTRFS info (device dm-0): forced readonly
> [115500.332632] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2960: errno=-5 IO failure
> [115500.436002] btrfs_printk: 550 callbacks suppressed
> [115500.436024] BTRFS warning (device dm-0): Skipping commit of aborted transaction.
> [115500.436029] BTRFS: error (device dm-0) in cleanup_transaction:1854: errno=-5 IO failure
>
>
> myth:~# ionice -c 3 nice -10 btrfs scrub start -Bd /mnt/mnt
> (...)
> scrub device /dev/mapper/crypt_bcache0 (id 1) canceled
> scrub started at Sun Oct 30 22:52:59 2016 and was aborted after 09:03:11
> total bytes scrubbed: 1.15TiB with 512 errors
> error details: csum=512
> corrected errors: 0, uncorrectable errors: 512, unverified errors: 0
>
> Am I correct that if I see "__btrfs_free_extent:6954: errno=-5 IO failure" it means
> that btrfs had physical read errors from the underlying block layer?
>
> Do I have some weird mismatch between the size of my md array and the size of my filesystem
> (as per dd apparently thinking parts of it are out of bounds?)
> Yet, the sizes seem to match:
>
>
> myth:~# mdadm --query --detail /dev/md5
> /dev/md5:
> Version : 1.2
> Creation Time : Tue Jan 21 10:35:52 2014
> Raid Level : raid5
> Array Size : 15627542528 (14903.59 GiB 16002.60 GB)
> Used Dev Size : 3906885632 (3725.90 GiB 4000.65 GB)
> Raid Devices : 5
> Total Devices : 5
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Mon Oct 31 07:56:07 2016
> State : clean
> Active Devices : 5
> Working Devices : 5
> Failed Devices : 0
> Spare Devices : 0
>
> Layout : left-symmetric
> Chunk Size : 512K
>
> Name : gargamel.svh.merlins.org:5
> UUID : ec672af7:a66d9557:2f00d76c:38c9f705
> Events : 147992
>
> Number Major Minor RaidDevice State
> 0 8 97 0 active sync /dev/sdg1
> 6 8 113 1 active sync /dev/sdh1
> 2 8 81 2 active sync /dev/sdf1
> 3 8 65 3 active sync /dev/sde1
> 5 8 49 4 active sync /dev/sdd1
>
> myth:~# btrfs fi df /mnt/mnt
> Data, single: total=13.22TiB, used=13.19TiB
> System, DUP: total=32.00MiB, used=1.42MiB
> Metadata, DUP: total=75.00GiB, used=72.82GiB
> GlobalReserve, single: total=512.00MiB, used=6.73MiB
>
> Thanks,
> Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-10-31 15:04 ` Marc MERLIN
2016-11-01 3:48 ` Marc MERLIN
@ 2016-11-01 4:13 ` Qu Wenruo
2016-11-01 4:21 ` Marc MERLIN
1 sibling, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-01 4:13 UTC (permalink / raw)
To: Marc MERLIN, Hugo Mills, linux-btrfs
At 10/31/2016 11:04 PM, Marc MERLIN wrote:
> On Mon, Oct 31, 2016 at 08:44:12AM +0000, Hugo Mills wrote:
>>> Any idea on special dm setup which can make us fail to read out some
>>> data range?
>>
>> I've seen both btrfs check and btrfs dump-super give wrong answers
>> (particularly, some addresses end up larger than the device, for some
>> reason) when run on a mounted filesystem. Worth ruling that one out.
>
> I just finished running my scrub overnight, and it failed around 10%:
> [115500.316921] BTRFS error (device dm-0): bad tree block start 8461247125784585065 17619396231168
> [115500.332354] BTRFS error (device dm-0): bad tree block start 8461247125784585065 17619396231168
> [115500.332626] BTRFS: error (device dm-0) in __btrfs_free_extent:6954: errno=-5 IO failure
> [115500.332629] BTRFS info (device dm-0): forced readonly
> [115500.332632] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2960: errno=-5 IO failure
> [115500.436002] btrfs_printk: 550 callbacks suppressed
> [115500.436024] BTRFS warning (device dm-0): Skipping commit of aborted transaction.
> [115500.436029] BTRFS: error (device dm-0) in cleanup_transaction:1854: errno=-5 IO failure
>
>
> myth:~# ionice -c 3 nice -10 btrfs scrub start -Bd /mnt/mnt
> (...)
> scrub device /dev/mapper/crypt_bcache0 (id 1) canceled
> scrub started at Sun Oct 30 22:52:59 2016 and was aborted after 09:03:11
> total bytes scrubbed: 1.15TiB with 512 errors
> error details: csum=512
> corrected errors: 0, uncorrectable errors: 512, unverified errors: 0
>
> Am I correct that if I see "__btrfs_free_extent:6954: errno=-5 IO failure" it means
> that btrfs had physical read errors from the underlying block layer?
Not really sure if it's physical read errors, as we throw -EIO almost
everywhere.
But it's possible that your extent tree got corrupted, so
__btrfs_free_extent() failed to modify the extent tree.
And in that case, we do throw -EIO.
>
> Do I have some weird mismatch between the size of my md array and the size of my filesystem
> (as per dd apparently thinking parts of it are out of bounds?)
> Yet, the sizes seem to match:
Would you try to locate the range where we start to fail to read?
I still think the root problem is that we failed to read the device in user
space.
Thanks,
Qu
>
>
> myth:~# mdadm --query --detail /dev/md5
> /dev/md5:
> Version : 1.2
> Creation Time : Tue Jan 21 10:35:52 2014
> Raid Level : raid5
> Array Size : 15627542528 (14903.59 GiB 16002.60 GB)
> Used Dev Size : 3906885632 (3725.90 GiB 4000.65 GB)
> Raid Devices : 5
> Total Devices : 5
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Mon Oct 31 07:56:07 2016
> State : clean
> Active Devices : 5
> Working Devices : 5
> Failed Devices : 0
> Spare Devices : 0
>
> Layout : left-symmetric
> Chunk Size : 512K
>
> Name : gargamel.svh.merlins.org:5
> UUID : ec672af7:a66d9557:2f00d76c:38c9f705
> Events : 147992
>
> Number Major Minor RaidDevice State
> 0 8 97 0 active sync /dev/sdg1
> 6 8 113 1 active sync /dev/sdh1
> 2 8 81 2 active sync /dev/sdf1
> 3 8 65 3 active sync /dev/sde1
> 5 8 49 4 active sync /dev/sdd1
>
> myth:~# btrfs fi df /mnt/mnt
> Data, single: total=13.22TiB, used=13.19TiB
> System, DUP: total=32.00MiB, used=1.42MiB
> Metadata, DUP: total=75.00GiB, used=72.82GiB
> GlobalReserve, single: total=512.00MiB, used=6.73MiB
>
> Thanks,
> Marc
>
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-11-01 4:13 ` Qu Wenruo
@ 2016-11-01 4:21 ` Marc MERLIN
2016-11-04 8:01 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-01 4:21 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Hugo Mills, linux-btrfs
On Tue, Nov 01, 2016 at 12:13:38PM +0800, Qu Wenruo wrote:
> Would you try to locate the range where we starts to fail to read?
>
> I still think the root problem is we failed to read the device in user
> space.
Understood.
I'll run this then:
myth:~# dd if=/dev/mapper/crypt_bcache0 of=/dev/null bs=1M &
[2] 21108
myth:~# while :; do killall -USR1 dd; sleep 1200; done
275+0 records in
274+0 records out
287309824 bytes (287 MB) copied, 7.20248 s, 39.9 MB/s
This will take a while to run, I'll report back on how far it goes.
Thanks,
Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-11-01 4:21 ` Marc MERLIN
@ 2016-11-04 8:01 ` Marc MERLIN
2016-11-04 9:00 ` Roman Mamedov
2016-11-07 1:11 ` Qu Wenruo
0 siblings, 2 replies; 40+ messages in thread
From: Marc MERLIN @ 2016-11-04 8:01 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Hugo Mills, linux-btrfs
On Mon, Oct 31, 2016 at 09:21:40PM -0700, Marc MERLIN wrote:
> On Tue, Nov 01, 2016 at 12:13:38PM +0800, Qu Wenruo wrote:
> > Would you try to locate the range where we starts to fail to read?
> >
> > I still think the root problem is we failed to read the device in user
> > space.
>
> Understood.
>
> I'll run this then:
> myth:~# dd if=/dev/mapper/crypt_bcache0 of=/dev/null bs=1M &
> [2] 21108
> myth:~# while :; do killall -USR1 dd; sleep 1200; done
> 275+0 records in
> 274+0 records out
> 287309824 bytes (287 MB) copied, 7.20248 s, 39.9 MB/s
>
> This will take a while to run, I'll report back on how far it goes.
Well, turns out you were right. My array is 14TB and dd was only able to
copy 8.8TB out of it.
I wonder if it's a bug with bcache and source devices that are too big?
8782434271232 bytes (8.8 TB) copied, 214809 s, 40.9 MB/s
dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
8388608+0 records in
8388608+0 records out
8796093022208 bytes (8.8 TB) copied, 215197 s, 40.9 MB/s
[2]+ Exit 1 dd if=/dev/mapper/crypt_bcache0 of=/dev/null bs=1M
What's vexing is that absolutely nothing has been logged in the kernel dmesg
buffer about this read error.
Basically I have this:
sde 8:64 0 3.7T 0
└─sde1 8:65 0 3.7T 0
└─md5 9:5 0 14.6T 0
└─bcache0 252:0 0 14.6T 0
└─crypt_bcache0 (dm-0) 253:0 0 14.6T 0
I'll try dd'ing the md5 directly now, but that's going to take another 2 days :(
That said, given that almost half the device is not readable from user space
for some reason, that would explain why btrfs check is failing. Obviously it
can't do its job if it can't read blocks.
I'll report back on what I find out with this problem but if you have
suggestions on what to look for, let me know :)
Thanks.
Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-11-04 8:01 ` Marc MERLIN
@ 2016-11-04 9:00 ` Roman Mamedov
2016-11-04 17:59 ` Marc MERLIN
2016-11-07 1:11 ` Qu Wenruo
1 sibling, 1 reply; 40+ messages in thread
From: Roman Mamedov @ 2016-11-04 9:00 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Qu Wenruo, Hugo Mills, linux-btrfs
On Fri, 4 Nov 2016 01:01:13 -0700
Marc MERLIN <marc@merlins.org> wrote:
> Basically I have this:
> sde 8:64 0 3.7T 0
> └─sde1 8:65 0 3.7T 0
> └─md5 9:5 0 14.6T 0
> └─bcache0 252:0 0 14.6T 0
> └─crypt_bcache0 (dm-0) 253:0 0 14.6T 0
>
> I'll try dd'ing the md5 directly now, but that's going to take another 2 days :(
>
> That said, given that almost half the device is not readable from user space
> for some reason, that would explain why btrfs check is failing. Obviously it
> can't do its job if it can't read blocks.
I don't see anything to support the notion that "half is unreadable", maybe
just a 512-byte sector is unreadable -- but that would be enough to make
regular dd bail out -- which is why you should be using dd_rescue for this,
not regular dd. Assuming you just want to copy over as much data as possible,
and not simply test if dd fails or not (but in any case dd_rescue at least
would not fail instantly and would tell you precise count of how much is
unreadable).
There is "GNU ddrescue" and "dd_rescue", I liked the first one better, but
they both work on a similar principle.
Also, didn't you recently have issues with bad block lists on mdadm? This
mysterious "unreadable and nothing in dmesg" does appear to be a continuation
of that.
--
With respect,
Roman
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-11-04 9:00 ` Roman Mamedov
@ 2016-11-04 17:59 ` Marc MERLIN
0 siblings, 0 replies; 40+ messages in thread
From: Marc MERLIN @ 2016-11-04 17:59 UTC (permalink / raw)
To: Roman Mamedov; +Cc: Qu Wenruo, Hugo Mills, linux-btrfs
On Fri, Nov 04, 2016 at 02:00:43PM +0500, Roman Mamedov wrote:
> On Fri, 4 Nov 2016 01:01:13 -0700
> Marc MERLIN <marc@merlins.org> wrote:
>
> > Basically I have this:
> > sde 8:64 0 3.7T 0
> > └─sde1 8:65 0 3.7T 0
> > └─md5 9:5 0 14.6T 0
> > └─bcache0 252:0 0 14.6T 0
> > └─crypt_bcache0 (dm-0) 253:0 0 14.6T 0
> >
> > I'll try dd'ing the md5 directly now, but that's going to take another 2 days :(
> >
> > That said, given that almost half the device is not readable from user space
> > for some reason, that would explain why btrfs check is failing. Obviously it
> > can't do its job if it can't read blocks.
>
> I don't see anything to support the notion that "half is unreadable", maybe
> just a 512-byte sector is unreadable -- but that would be enough to make
> regular dd bail out -- which is why you should be using dd_rescue for this,
> not regular dd. Assuming you just want to copy over as much data as possible,
> and not simply test if dd fails or not (but in any case dd_rescue at least
> would not fail instantly and would tell you precise count of how much is
> unreadable).
Thanks for the plug on ddrescue, I have used it to rescue drives in the
past.
Here, however, everything after the 8.8TB mark is unreadable, so there
is nothing to skip.
Because the underlying drives are fine, I'm not entirely sure where the
issue is although it has to be on the mdadm side and not related to
btrfs.
And of course the mdadm array shows clean, and I have already disabled
the mdadm per drive bad block (mis-)feature which probably is
responsible for all the problems I've had here.
myth:~# mdadm --examine-badblocks /dev/sd[defgh]1
No bad-blocks list configured on /dev/sdd1
No bad-blocks list configured on /dev/sde1
No bad-blocks list configured on /dev/sdf1
No bad-blocks list configured on /dev/sdg1
No bad-blocks list configured on /dev/sdh1
I'm also still perplexed as to why, despite the read error I'm getting,
absolutely nothing is logged in the kernel :-/
I'll pursue that further and post a summary on the thread here if I find
something interesting.
Marc
* Re: btrfs check --repair: ERROR: cannot read chunk root
2016-11-04 8:01 ` Marc MERLIN
2016-11-04 9:00 ` Roman Mamedov
@ 2016-11-07 1:11 ` Qu Wenruo
[not found] ` <87lgwwnnyf.fsf@notabene.neil.brown.name>
1 sibling, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-07 1:11 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Hugo Mills, linux-btrfs
At 11/04/2016 04:01 PM, Marc MERLIN wrote:
> On Mon, Oct 31, 2016 at 09:21:40PM -0700, Marc MERLIN wrote:
>> On Tue, Nov 01, 2016 at 12:13:38PM +0800, Qu Wenruo wrote:
>>> Would you try to locate the range where we starts to fail to read?
>>>
>>> I still think the root problem is we failed to read the device in user
>>> space.
>>
>> Understood.
>>
>> I'll run this then:
>> myth:~# dd if=/dev/mapper/crypt_bcache0 of=/dev/null bs=1M &
>> [2] 21108
>> myth:~# while :; do killall -USR1 dd; sleep 1200; done
>> 275+0 records in
>> 274+0 records out
>> 287309824 bytes (287 MB) copied, 7.20248 s, 39.9 MB/s
>>
>> This will take a while to run, I'll report back on how far it goes.
>
> Well, turns out you were right. My array is 14TB and dd was only able to
> copy 8.8TB out of it.
>
> I wonder if it's a bug with bcache and source devices that are too big?
At least we know it's not a problem of btrfs-progs.
And for bcache/soft raid/encryption, unfortunately I'm not familiar with
any of them.
I would recommend reporting it to the bcache/mdadm/encryption ML after
locating the layer which returns EINVAL.
>
> 8782434271232 bytes (8.8 TB) copied, 214809 s, 40.9 MB/s
> dd: reading `/dev/mapper/crypt_bcache0': Invalid argument
> 8388608+0 records in
> 8388608+0 records out
> 8796093022208 bytes (8.8 TB) copied, 215197 s, 40.9 MB/s
> [2]+ Exit 1 dd if=/dev/mapper/crypt_bcache0 of=/dev/null bs=1M
>
> What's vexing is that absolutely nothing has been logged in the kernel dmesg
> buffer about this read error.
>
> Basically I have this:
> sde 8:64 0 3.7T 0
> └─sde1 8:65 0 3.7T 0
> └─md5 9:5 0 14.6T 0
> └─bcache0 252:0 0 14.6T 0
> └─crypt_bcache0 (dm-0) 253:0 0 14.6T 0
>
> I'll try dd'ing the md5 directly now, but that's going to take another 2 days :(
No need to read them out, just reading from the 8T would be good enough
for me.
BTW, that's a really complicated layout: with soft raid, bcache, and
encryption, it will take a long time to find the real cause.
But at least we know the 8.8T position, we can save some time not
reading the whole disk.
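One way to avoid a multi-day sequential read is to bisect for the first unreadable offset. The Python sketch below is only an illustration: the function names are made up here, and it assumes that readability is monotonic (everything below some boundary reads fine, everything at or above it fails), which matches the report that everything after the 8.8TB mark is unreadable but would not hold for scattered bad sectors.

```python
import os


def first_unreadable_offset(probe, device_size, block=4096):
    """Return the lowest block-aligned offset where probe(offset) is False.

    Assumes readability is monotonic: every block below some boundary reads
    fine and every block at or above it fails. Returns None if the last
    block is readable (no boundary found under that assumption).
    """
    nblocks = device_size // block
    if nblocks == 0 or probe((nblocks - 1) * block):
        return None
    lo, hi = 0, nblocks - 1  # block index hi is known to be unreadable
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(mid * block):
            lo = mid + 1     # mid reads fine, boundary is above it
        else:
            hi = mid         # mid fails, boundary is at or below it
    return lo * block


def make_device_probe(path, block=4096):
    """Build a probe doing one buffered read of `block` bytes at an offset."""
    def probe(offset):
        fd = os.open(path, os.O_RDONLY)
        try:
            return len(os.pread(fd, block, offset)) == block
        except OSError:
            # Any read error (EINVAL, EIO, ...) counts as unreadable.
            return False
        finally:
            os.close(fd)
    return probe
```

A hypothetical invocation would be first_unreadable_offset(make_device_probe("/dev/mapper/crypt_bcache0"), size), with size taken from blockdev --getsize64. It needs only a few dozen reads, and running it against each layer (md5, bcache0, crypt_bcache0) would show which one starts returning the error.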
Thanks,
Qu
>
> That said, given that almost half the device is not readable from user space
> for some reason, that would explain why btrfs check is failing. Obviously it
> can't do its job if it can't read blocks.
>
> I'll report back on what I find out with this problem but if you have
> suggestions on what to look for, let me know :)
>
> Thanks.
> Marc
>
* Re: clearing blocks wrongfully marked as bad if --update=no-bbl can't be used?
[not found] ` <87lgwwnnyf.fsf@notabene.neil.brown.name>
@ 2016-11-07 1:20 ` Marc MERLIN
2016-11-07 1:39 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-07 1:20 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Hugo Mills, linux-btrfs
On Mon, Nov 07, 2016 at 09:11:54AM +0800, Qu Wenruo wrote:
> > Well, turns out you were right. My array is 14TB and dd was only able to
> > copy 8.8TB out of it.
> >
> > I wonder if it's a bug with bcache and source devices that are too big?
>
> At least we know it's not a problem of btrfs-progs.
>
> And for bcache/soft raid/encryption, unfortunately I'm not familiar with any
> of them.
>
> I would recommend to report it to bcache/mdadm/encryption ML after locating
> the layer which returns EINVAL.
So, Neil Brown found the problem.
myth:/sys/block/md5/md# dd if=/dev/md5 of=/dev/null bs=1GiB skip=8190
dd: reading `/dev/md5': Invalid argument
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 37.0785 s, 57.9 MB/s
myth:/sys/block/md5/md# dd if=/dev/md5 of=/dev/null bs=1GiB skip=8190 count=3 iflag=direct
3+0 records in
3+0 records out
On Mon, Nov 07, 2016 at 11:16:56AM +1100, NeilBrown wrote:
> EINVAL from a read() system call is surprising in this context.....
>
> do_generic_file_read can return it:
> if (unlikely(*ppos >= inode->i_sb->s_maxbytes))
> return -EINVAL;
>
> s_maxbytes will be MAX_LFS_FILESIZE which, on a 32bit system, is
>
> #define MAX_LFS_FILESIZE (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
>
> That is 2^(12+31) or 2^43 or 8TB.
>
> Is this a 32bit system you are using? Such systems can only support
> buffered IO up to 8TB. If you use iflags=direct to avoid buffering, you
> should get access to the whole device.
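To make the arithmetic in the quoted explanation concrete, here is a small Python sketch of the MAX_LFS_FILESIZE formula as quoted. It assumes a 4096-byte PAGE_SIZE (implied by the 2^(12+31) above); the function name is illustrative, and the two byte counts are the dd total and the chunk root offset reported earlier in the thread.

```python
def max_lfs_filesize(page_size, bits_per_long):
    """Mirror of the quoted macro: (PAGE_SIZE << (BITS_PER_LONG - 1)) - 1."""
    return (page_size << (bits_per_long - 1)) - 1


limit_32bit = max_lfs_filesize(4096, 32)   # assumed 4 KiB pages, 32-bit long

dd_bytes_copied = 8796093022208            # total from the failed buffered dd
chunk_root_copy = 13500329066496           # physical offset btrfs check needs

print("32-bit buffered I/O limit:", limit_32bit, "bytes")
print("dd stopped exactly at the limit:", dd_bytes_copied == limit_32bit + 1)
print("chunk root lies beyond the limit:", chunk_root_copy > limit_32bit)
```

The dd total is exactly 2^43 bytes, so the next buffered read would start at the first position the check rejects, and the chunk root copy sits well past that point.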
I am indeed using a 32bit system, and now we know why the kernel can
mount and use my filesystem just fine while btrfs check repair fails to
deal with it.
The filesystem is more than 8TB on a 32bit kernel with 32bit userland.
Since iflag=direct fixes the issue with dd, it sounds like something
similar could be done for btrfs progs, to support filesystems bigger
than 8TB on 32bit systems.
However, could you confirm that filesystems more than 8TB are supported
by the kernel code itself on 32bit systems? (I think so, but just
wanting to make sure)
Thanks,
Marc
* Re: clearing blocks wrongfully marked as bad if --update=no-bbl can't be used?
2016-11-07 1:20 ` clearing blocks wrongfully marked as bad if --update=no-bbl can't be used? Marc MERLIN
@ 2016-11-07 1:39 ` Qu Wenruo
2016-11-07 4:18 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-07 1:39 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Hugo Mills, linux-btrfs
At 11/07/2016 09:20 AM, Marc MERLIN wrote:
> On Mon, Nov 07, 2016 at 09:11:54AM +0800, Qu Wenruo wrote:
>>> Well, turns out you were right. My array is 14TB and dd was only able to
>>> copy 8.8TB out of it.
>>>
>>> I wonder if it's a bug with bcache and source devices that are too big?
>>
>> At least we know it's not a problem of btrfs-progs.
>>
>> And for bcache/soft raid/encryption, unfortunately I'm not familiar with any
>> of them.
>>
>> I would recommend to report it to bcache/mdadm/encryption ML after locating
>> the layer which returns EINVAL.
>
> So, Neil Brown found the problem.
>
> myth:/sys/block/md5/md# dd if=/dev/md5 of=/dev/null bs=1GiB skip=8190
> dd: reading `/dev/md5': Invalid argument
> 2+0 records in
> 2+0 records out
> 2147483648 bytes (2.1 GB) copied, 37.0785 s, 57.9 MB/s
> myth:/sys/block/md5/md# dd if=/dev/md5 of=/dev/null bs=1GiB skip=8190 count=3 iflag=direct
> 3+0 records in
> 3+0 records out
That's interesting.
>
>
> On Mon, Nov 07, 2016 at 11:16:56AM +1100, NeilBrown wrote:
>> EINVAL from a read() system call is surprising in this context.....
>>
>> do_generic_file_read can return it:
>> if (unlikely(*ppos >= inode->i_sb->s_maxbytes))
>> return -EINVAL;
At least the return value is a bug.
Normally we should return -EFBIG instead of -EINVAL.
>>
>> s_maxbytes will be MAX_LFS_FILESIZE which, on a 32bit system, is
>>
>> #define MAX_LFS_FILESIZE (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
>>
>> That is 2^(12+31) or 2^43 or 8TB.
>>
>> Is this a 32bit system you are using? Such systems can only support
>> buffered IO up to 8TB. If you use iflag=direct to avoid buffering, you
>> should get access to the whole device.
>
> I am indeed using a 32bit system, and now we know why the kernel can
> mount and use my filesystem just fine while btrfs check repair fails to
> deal with it.
> The filesystem is more than 8TB on a 32bit kernel with 32bit userland.
>
> Since iflag=direct fixes the issue with dd, it sounds like something
> similar could be done for btrfs progs, to support filesystems bigger
> than 8TB on 32bit systems.
>
> However, could you confirm that filesystems more than 8TB are supported
> by the kernel code itself on 32bit systems? (I think so, but just
> wanting to make sure)
Yep, the fs itself can support sizes up to the u64 max. (But I'd assume u63
max, as some fs may use the highest bit for a special purpose.)
It's just the VFS/mm layer that is blocking things.
Direct IO can handle it because it avoids the cache, while for buffered IO
it's the cache (memory) size that limits the offset.
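The dd runs quoted above fit that picture. Below is a minimal Python model of the buffered case. It assumes 4 KiB pages and assumes that only the starting offset of each read is tested against s_maxbytes, as in the quoted do_generic_file_read check. The function name is made up for illustration.

```python
GIB = 1 << 30
S_MAXBYTES = (4096 << 31) - 1  # quoted 32bit MAX_LFS_FILESIZE, 4 KiB pages assumed


def buffered_blocks_read(skip, count, bs=GIB, limit=S_MAXBYTES):
    """Count how many bs-sized reads succeed before one starts at or past limit."""
    done = 0
    for i in range(count):
        pos = (skip + i) * bs  # starting offset of this read, like dd skip=
        if pos >= limit:       # the check that returns -EINVAL in the quote
            break
        done += 1
    return done


print(buffered_blocks_read(8190, 3))  # blocks read with bs=1GiB skip=8190
```

With skip=8190 the third 1GiB block starts exactly at 2^43, so the model stops after two blocks, matching the "2+0 records" in the buffered dd run.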
It's good to have located the root cause.
It doesn't look hard to add such a workaround to btrfs-progs.
I'll send one soon.
Thanks,
Qu
>
> Thanks,
> Marc
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: clearing blocks wrongfully marked as bad if --update=no-bbl can't be used?
2016-11-07 1:39 ` Qu Wenruo
@ 2016-11-07 4:18 ` Qu Wenruo
2016-11-07 5:36 ` btrfs support for filesystems >8TB on 32bit architectures Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-07 4:18 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Hugo Mills, linux-btrfs
At 11/07/2016 09:39 AM, Qu Wenruo wrote:
>
>
> At 11/07/2016 09:20 AM, Marc MERLIN wrote:
>> On Mon, Nov 07, 2016 at 09:11:54AM +0800, Qu Wenruo wrote:
>>>> Well, turns out you were right. My array is 14TB and dd was only
>>>> able to
>>>> copy 8.8TB out of it.
>>>>
>>>> I wonder if it's a bug with bcache and source devices that are too big?
>>>
>>> At least we know it's not a problem of btrfs-progs.
>>>
>>> And for bcache/soft raid/encryption, unfortunately I'm not familiar
>>> with any
>>> of them.
>>>
>>> I would recommend to report it to bcache/mdadm/encryption ML after
>>> locating
>>> the layer which returns EINVAL.
>>
>> So, Neil Brown found the problem.
>>
>> myth:/sys/block/md5/md# dd if=/dev/md5 of=/dev/null bs=1GiB skip=8190
>> dd: reading `/dev/md5': Invalid argument
>> 2+0 records in
>> 2+0 records out
>> 2147483648 bytes (2.1 GB) copied, 37.0785 s, 57.9 MB/s
>> myth:/sys/block/md5/md# dd if=/dev/md5 of=/dev/null bs=1GiB skip=8190
>> count=3 iflag=direct
>> 3+0 records in
>> 3+0 records out
>
> That's interesting.
>
>>
>>
>> On Mon, Nov 07, 2016 at 11:16:56AM +1100, NeilBrown wrote:
>>> EINVAL from a read() system call is surprising in this context.....
>>>
>>> do_generic_file_read can return it:
>>> if (unlikely(*ppos >= inode->i_sb->s_maxbytes))
>>> return -EINVAL;
>
> At least the return value is a bug.
> Normally we should return -EFBIG instead of -EINVAL.
>
>>>
>>> s_maxbytes will be MAX_LFS_FILESIZE which, on a 32bit system, is
>>>
>>> #define MAX_LFS_FILESIZE (((loff_t)PAGE_SIZE <<
>>> (BITS_PER_LONG-1))-1)
>>>
>>> That is 2^(12+31) or 2^43 or 8TB.
>>>
>>> Is this a 32bit system you are using? Such systems can only support
>>> buffered IO up to 8TB. If you use iflag=direct to avoid buffering, you
>>> should get access to the whole device.
>>
>> I am indeed using a 32bit system, and now we know why the kernel can
>> mount and use my filesystem just fine while btrfs check repair fails to
>> deal with it.
>> The filesystem is more than 8TB on a 32bit kernel with 32bit userland.
>>
>> Since iflag=direct fixes the issue with dd, it sounds like something
>> similar could be done for btrfs progs, to support filesystems bigger
>> than 8TB on 32bit systems.
>>
>> However, could you confirm that filesystems more than 8TB are supported
>> by the kernel code itself on 32bit systems? (I think so, but just
>> wanting to make sure)
>
> Yep, fs can support to u64 max size fs. (But I'd assume u63 max as some
> fs may use the highest bit for special purpose)
> Just VFS/mm layer is blocking things.
>
> Direct IO can handle it because it avoids cache, while for buffered IO,
> it's cache(memory) size limiting the offsize.
>
> It's good to locate the root cause.
>
> It doesn't look hard to add such workaround for btrfs-progs.
> I'll send such workaround soon.
I was totally wrong here.
Direct IO needs the 'buf' parameter of read()/pread() to be 512-byte
aligned.
But we use a lot of stack memory and normal malloc()/calloc()
allocated memory, which is seldom aligned to 512 bytes.
So to *work around* the problem in btrfs-progs, we may need to change every
pread() caller to use aligned memory allocation.
I really don't think David will accept such a huge change for a workaround...
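To illustrate what "aligned" means for such a read, here is a minimal Python sketch. It is not btrfs-progs code (btrfs-progs is C, where something like posix_memalign() would play the role of the buffer allocation here), and the helper name aligned_pread is hypothetical. It uses an anonymous mmap as a page-aligned buffer and rounds the offset and length to the 512-byte boundary mentioned above. Real devices may require a larger alignment, so treat 512 as an assumption.

```python
import mmap
import os

SECTOR = 512  # alignment mentioned in the thread; some devices need more


def aligned_pread(fd, length, offset, align=SECTOR):
    """Read length bytes at offset using an aligned buffer, offset and size."""
    if length <= 0:
        return b""
    start = offset - (offset % align)             # round the offset down
    end = -(-(offset + length) // align) * align  # round the end up
    buf = mmap.mmap(-1, end - start)              # anonymous mmap is page aligned
    try:
        got = os.preadv(fd, [buf], start)         # bytes actually read
        lo = offset - start
        return bytes(buf[lo:min(lo + length, got)])
    finally:
        buf.close()
```

A caller that wants direct IO would open the device with os.open(path, os.O_RDONLY | os.O_DIRECT) and pass that descriptor in. The sketch also shows why the change is invasive: every read path has to go through a wrapper like this instead of reading straight into whatever buffer it already has.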
Thanks,
Qu
>
> Thanks,
> Qu
>
>>
>> Thanks,
>> Marc
>>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-07 4:18 ` Qu Wenruo
@ 2016-11-07 5:36 ` Marc MERLIN
2016-11-07 6:16 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-07 5:36 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Hugo Mills, linux-btrfs
(sorry for the bad subject line from the mdadm list on the previous mail)
On Mon, Nov 07, 2016 at 12:18:10PM +0800, Qu Wenruo wrote:
> I'm totally wrong here.
>
> DirectIO needs the 'buf' parameter of read()/pread() to be 512 bytes
> aligned.
>
> While we are using a lot of stack memory() and normal malloc()/calloc()
> allocated memory, which are seldom aligned to 512 bytes.
>
> So to *workaround* the problem in btrfs-progs, we may need to change any
> pread() caller to use aligned memory allocation.
>
> I really don't think David will accept such huge change for a workdaround...
Thanks for looking into it.
So basically should we just document that btrfs filesystems past 8TB in
size are not supported on 32bit architectures?
(as in you can mount them and use them I believe, but you cannot create,
or repair them)
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-07 5:36 ` btrfs support for filesystems >8TB on 32bit architectures Marc MERLIN
@ 2016-11-07 6:16 ` Qu Wenruo
2016-11-07 14:55 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-07 6:16 UTC (permalink / raw)
To: Marc MERLIN, David Sterba; +Cc: Hugo Mills, linux-btrfs
At 11/07/2016 01:36 PM, Marc MERLIN wrote:
> (sorry for the bad subject line from the mdadm list on the previous mail)
>
> On Mon, Nov 07, 2016 at 12:18:10PM +0800, Qu Wenruo wrote:
>> I'm totally wrong here.
>>
>> DirectIO needs the 'buf' parameter of read()/pread() to be 512 bytes
>> aligned.
>>
>> While we are using a lot of stack memory() and normal malloc()/calloc()
>> allocated memory, which are seldom aligned to 512 bytes.
>>
>> So to *workaround* the problem in btrfs-progs, we may need to change any
>> pread() caller to use aligned memory allocation.
>>
>> I really don't think David will accept such huge change for a workdaround...
>
> Thanks for looking into it.
> So basically should we just document that btrfs filesystems past 8TB in
> size are not supported on 32bit architectures?
> (as in you can mount them and use them I believe, but you cannot create,
> or repair them)
>
> Marc
>
Adding David to this thread.
For create, it should be OK, as at create time we hardly write beyond
3G. So it won't be a big problem.
For repair, there is a possibility that btrfsck can't handle it.
Anyway, I'd like to hear what David thinks we should do to handle
the problem.
Thanks,
Qu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-07 6:16 ` Qu Wenruo
@ 2016-11-07 14:55 ` Marc MERLIN
2016-11-08 0:35 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-07 14:55 UTC (permalink / raw)
To: Qu Wenruo; +Cc: David Sterba, Hugo Mills, linux-btrfs
On Mon, Nov 07, 2016 at 02:16:37PM +0800, Qu Wenruo wrote:
>
>
> At 11/07/2016 01:36 PM, Marc MERLIN wrote:
> > (sorry for the bad subject line from the mdadm list on the previous mail)
> >
> > On Mon, Nov 07, 2016 at 12:18:10PM +0800, Qu Wenruo wrote:
> > > I'm totally wrong here.
> > >
> > > DirectIO needs the 'buf' parameter of read()/pread() to be 512 bytes
> > > aligned.
> > >
> > > While we are using a lot of stack memory() and normal malloc()/calloc()
> > > allocated memory, which are seldom aligned to 512 bytes.
> > >
> > > So to *workaround* the problem in btrfs-progs, we may need to change any
> > > pread() caller to use aligned memory allocation.
> > >
> > > I really don't think David will accept such huge change for a workdaround...
> >
> > Thanks for looking into it.
> > So basically should we just document that btrfs filesystems past 8TB in
> > size are not supported on 32bit architectures?
> > (as in you can mount them and use them I believe, but you cannot create,
> > or repair them)
> >
> > Marc
> >
> Add David to this thread.
>
> For create, it should be OK. As at create time, we hardly write beyond 3G.
> So it won't be a big problem.
>
> For repair, we do have a possibility that btrfsck can't handle it.
>
> Anyway, I'd like to see how David thinks what we should do the handle the
> problem.
Understood. One big thing (for me) I forgot to confirm:
1) btrfs receive
2) btrfs scrub
should both be able to work because the IO operations are done directly
inside the kernel and not from user space, correct?
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-07 14:55 ` Marc MERLIN
@ 2016-11-08 0:35 ` Qu Wenruo
2016-11-08 0:39 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-08 0:35 UTC (permalink / raw)
To: Marc MERLIN; +Cc: David Sterba, Hugo Mills, linux-btrfs
At 11/07/2016 10:55 PM, Marc MERLIN wrote:
> On Mon, Nov 07, 2016 at 02:16:37PM +0800, Qu Wenruo wrote:
>>
>>
>> At 11/07/2016 01:36 PM, Marc MERLIN wrote:
>>> (sorry for the bad subject line from the mdadm list on the previous mail)
>>>
>>> On Mon, Nov 07, 2016 at 12:18:10PM +0800, Qu Wenruo wrote:
>>>> I'm totally wrong here.
>>>>
>>>> DirectIO needs the 'buf' parameter of read()/pread() to be 512 bytes
>>>> aligned.
>>>>
>>>> While we are using a lot of stack memory() and normal malloc()/calloc()
>>>> allocated memory, which are seldom aligned to 512 bytes.
>>>>
>>>> So to *workaround* the problem in btrfs-progs, we may need to change any
>>>> pread() caller to use aligned memory allocation.
>>>>
>>>> I really don't think David will accept such huge change for a workdaround...
>>>
>>> Thanks for looking into it.
>>> So basically should we just document that btrfs filesystems past 8TB in
>>> size are not supported on 32bit architectures?
>>> (as in you can mount them and use them I believe, but you cannot create,
>>> or repair them)
>>>
>>> Marc
>>>
>> Add David to this thread.
>>
>> For create, it should be OK. As at create time, we hardly write beyond 3G.
>> So it won't be a big problem.
>>
>> For repair, we do have a possibility that btrfsck can't handle it.
>>
>> Anyway, I'd like to see how David thinks what we should do the handle the
>> problem.
>
> Understood. One big thing (for me) I forgot to confirm:
> 1) btrfs receive
Unfortunately, receive is done completely in userspace.
Only send works inside the kernel.
So receive will fail to reconstruct any file larger than 8T.
Any other normal file smaller than 8T is not affected.
> 2) btrfs scrub
Scrub does work in kernel, so it's unaffected.
Thanks,
Qu
> should both be able to work because the IO operations are done directly
> inside the kernel and not from user space, correct?
>
> Thanks,
> Marc
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-08 0:35 ` Qu Wenruo
@ 2016-11-08 0:39 ` Marc MERLIN
2016-11-08 0:43 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-08 0:39 UTC (permalink / raw)
To: Qu Wenruo; +Cc: David Sterba, Hugo Mills, linux-btrfs
On Tue, Nov 08, 2016 at 08:35:54AM +0800, Qu Wenruo wrote:
> >Understood. One big thing (for me) I forgot to confirm:
> >1) btrfs receive
>
> Unfortunately, receive is completely done in userspace.
> Only send works inside kernel.
right, I've confirmed that btrfs receive fails.
It looks like btrfs balance is also failing, which is more surprising.
Isn't that one in the kernel?
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-08 0:39 ` Marc MERLIN
@ 2016-11-08 0:43 ` Qu Wenruo
2016-11-08 1:06 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-08 0:43 UTC (permalink / raw)
To: Marc MERLIN; +Cc: David Sterba, Hugo Mills, linux-btrfs
At 11/08/2016 08:39 AM, Marc MERLIN wrote:
> On Tue, Nov 08, 2016 at 08:35:54AM +0800, Qu Wenruo wrote:
>>> Understood. One big thing (for me) I forgot to confirm:
>>> 1) btrfs receive
>>
>> Unfortunately, receive is completely done in userspace.
>> Only send works inside kernel.
>
> right, I've confirmed that btrfs receive fails.
> It looks like btrfs balance is also failing, which is more surprising.
> Isn't that one in the kernel?
That's strange, balance is done completely in kernel space.
Unless we're calling a vfs_* function, we won't go through the extra check.
What's the error reported?
Thanks,
Qu
>
> Marc
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-08 0:43 ` Qu Wenruo
@ 2016-11-08 1:06 ` Marc MERLIN
2016-11-08 1:17 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-08 1:06 UTC (permalink / raw)
To: Qu Wenruo; +Cc: David Sterba, Hugo Mills, linux-btrfs
On Tue, Nov 08, 2016 at 08:43:34AM +0800, Qu Wenruo wrote:
> That's strange, balance is done completely in kernel space.
>
> Unless we're calling vfs_* function we won't go through the extra check.
>
> What's the error reported?
See below. Note however that it may be because btrfs receive messed up the
filesystem first.
BTRFS info (device dm-0): use zlib compression
BTRFS info (device dm-0): disk space caching is enabled
BTRFS info (device dm-0): has skinny extents
BTRFS info (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 512, gen 0
BTRFS info (device dm-0): detected SSD devices, enabling SSD mode
BTRFS info (device dm-0): continuing balance
BTRFS info (device dm-0): The free space cache file (1593999097856) is invalid. skip it
BTRFS info (device dm-0): The free space cache file (1671308509184) is invalid. skip it
BTRFS info (device dm-0): relocating block group 13835461197824 flags 34
------------[ cut here ]------------
WARNING: CPU: 0 PID: 22825 at fs/btrfs/disk-io.c:520 btree_csum_one_bio.isra.39+0xf7/0x100
Modules linked in: bcache configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative autofs4 snd_hda_codec_hdmi joydev snd_hda_codec_realtek snd_hda_codec_generic tuner_simple tuner_types tda9887 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep tda8290 coretemp snd_pcm_oss snd_mixer_oss tuner snd_pcm msp3400 snd_seq_midi snd_seq_midi_event firewire_sbp2 saa7127 snd_rawmidi hwmon_vid dm_crypt dm_mod saa7115 snd_seq bttv hid_generic snd_seq_device snd_timer ehci_pci ivtv tea575x videobuf_dma_sg rc_core videobuf_core input_leds tveeprom cx2341x v4l2_common ehci_hcd videodev media acpi_cpufreq tpm_tis tpm_tis_core gpio_ich snd soundcore tpm psmouse lpc_ich evdev asus_atk0110 serio_raw lp parport raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath usbhid hid sr_mod cdrom sg firewire_ohci firewire_core floppy crc_itu_t i915 atl1 fjes mii uhci_hcd usbcore usb_common
CPU: 0 PID: 22825 Comm: kworker/u9:2 Tainted: G W 4.8.5-ia32-20161028 #2
Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604 07/16/2008
Workqueue: btrfs-worker-high btrfs_worker_helper
00200286 00200286 d3d81e48 df414827 00000000 dfa12da5 d3d81e78 df05677a
df9ed884 00000000 00005929 dfa12da5 00000208 df2cf067 00000208 f7463fa0
f401a080 00000000 d3d81e8c df05684a 00000009 00000000 00000000 d3d81eb4
Call Trace:
[<df414827>] dump_stack+0x58/0x81
[<df05677a>] __warn+0xea/0x110
[<df2cf067>] ? btree_csum_one_bio.isra.39+0xf7/0x100
[<df05684a>] warn_slowpath_null+0x2a/0x30
[<df2cf067>] btree_csum_one_bio.isra.39+0xf7/0x100
[<df2cf085>] __btree_submit_bio_start+0x15/0x20
[<df2cdd10>] run_one_async_start+0x30/0x40
[<df31286d>] btrfs_scrubparity_helper+0xcd/0x2d0
[<df2cde70>] ? run_one_async_free+0x20/0x20
[<df312bbd>] btrfs_worker_helper+0xd/0x10
[<df06d05b>] process_one_work+0x10b/0x400
[<df06d387>] worker_thread+0x37/0x4b0
[<df06d350>] ? process_one_work+0x400/0x400
[<df0722db>] kthread+0x9b/0xb0
[<df799922>] ret_from_kernel_thread+0xe/0x24
[<df072240>] ? kthread_stop+0x100/0x100
---[ end trace f461faff989bf258 ]---
BTRFS: error (device dm-0) in btrfs_commit_transaction:2232: errno=-5 IO failure (Error while writing out transaction)
BTRFS info (device dm-0): forced readonly
BTRFS warning (device dm-0): Skipping commit of aborted transaction.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 22318 at fs/btrfs/transaction.c:1854 btrfs_commit_transaction+0x2f5/0xcc0
BTRFS: Transaction aborted (error -5)
Modules linked in: bcache configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative autofs4 snd_hda_codec_hdmi joydev snd_hda_codec_realtek snd_hda_codec_generic tuner_simple tuner_types tda9887 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep tda8290 coretemp snd_pcm_oss snd_mixer_oss tuner snd_pcm msp3400 snd_seq_midi snd_seq_midi_event firewire_sbp2 saa7127 snd_rawmidi hwmon_vid dm_crypt dm_mod saa7115 snd_seq bttv hid_generic snd_seq_device snd_timer ehci_pci ivtv tea575x videobuf_dma_sg rc_core videobuf_core input_leds tveeprom cx2341x v4l2_common ehci_hcd videodev media acpi_cpufreq tpm_tis tpm_tis_core gpio_ich snd soundcore tpm psmouse lpc_ich evdev asus_atk0110 serio_raw lp parport raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath usbhid hid sr_mod cdrom sg firewire_ohci firewire_core floppy crc_itu_t i915 atl1 fjes mii uhci_hcd usbcore usb_common
CPU: 0 PID: 22318 Comm: btrfs-balance Tainted: G W 4.8.5-ia32-20161028 #2
Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604 07/16/2008
00000286 00000286 d74a3ca4 df414827 d74a3ce8 dfa132ab d74a3cd4 df05677a
dfa075cc d74a3d04 0000572e dfa132ab 0000073e df2d7de5 0000073e f698dc00
e9173e70 fffffffb d74a3cf0 df0567db 00000009 00000000 d74a3ce8 dfa075cc
Call Trace:
[<df414827>] dump_stack+0x58/0x81
[<df05677a>] __warn+0xea/0x110
[<df2d7de5>] ? btrfs_commit_transaction+0x2f5/0xcc0
[<df0567db>] warn_slowpath_fmt+0x3b/0x40
[<df2d7de5>] btrfs_commit_transaction+0x2f5/0xcc0
[<df096800>] ? prepare_to_wait_event+0xd0/0xd0
[<df33334f>] prepare_to_relocate+0x12f/0x180
[<df339a41>] relocate_block_group+0x31/0x790
[<df0b1427>] ? vprintk_default+0x37/0x40
[<df796ca0>] ? mutex_lock+0x10/0x30
[<df2f8f45>] ? btrfs_wait_ordered_roots+0x1d5/0x1f0
[<df14eed6>] ? printk+0x17/0x19
[<df2a47b2>] ? btrfs_printk+0x102/0x110
[<df33a388>] btrfs_relocate_block_group+0x1e8/0x2e0
[<df308a9f>] btrfs_relocate_chunk.isra.29+0x3f/0xf0
[<df30221f>] ? free_extent_buffer+0x4f/0xa0
[<df30a555>] btrfs_balance+0xb05/0x1820
[<df0b0afa>] ? console_unlock+0x40a/0x630
[<df30b2c1>] balance_kthread+0x51/0x80
[<df30b270>] ? btrfs_balance+0x1820/0x1820
[<df0722db>] kthread+0x9b/0xb0
[<df799922>] ret_from_kernel_thread+0xe/0x24
[<df072240>] ? kthread_stop+0x100/0x100
---[ end trace f461faff989bf259 ]---
BTRFS: error (device dm-0) in cleanup_transaction:1854: errno=-5 IO failure
BTRFS info (device dm-0): delayed_refs has NO entry
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-08 1:06 ` Marc MERLIN
@ 2016-11-08 1:17 ` Qu Wenruo
2016-11-08 15:24 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-08 1:17 UTC (permalink / raw)
To: Marc MERLIN; +Cc: David Sterba, Hugo Mills, linux-btrfs
At 11/08/2016 09:06 AM, Marc MERLIN wrote:
> On Tue, Nov 08, 2016 at 08:43:34AM +0800, Qu Wenruo wrote:
>> That's strange, balance is done completely in kernel space.
>>
>> Unless we're calling vfs_* function we won't go through the extra check.
>>
>> What's the error reported?
>
> See below. Note however that is may be because btrfs received messed up the
> filesystem first.
If receive could easily screw up the fs, then fsstress could also screw up
btrfs easily.
So I don't think that's the case. (Several years ago it was possible.)
>
> BTRFS info (device dm-0): use zlib compression
> BTRFS info (device dm-0): disk space caching is enabled
> BTRFS info (device dm-0): has skinny extents
> BTRFS info (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 512, gen 0
> BTRFS info (device dm-0): detected SSD devices, enabling SSD mode
> BTRFS info (device dm-0): continuing balance
> BTRFS info (device dm-0): The free space cache file (1593999097856) is invalid. skip it
> BTRFS info (device dm-0): The free space cache file (1671308509184) is invalid. skip it
> BTRFS info (device dm-0): relocating block group 13835461197824 flags 34
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 22825 at fs/btrfs/disk-io.c:520 btree_csum_one_bio.isra.39+0xf7/0x100
The dirty tree block's bytenr doesn't match the page's logical address.
It seems that the tree block is not up to date, or maybe corrupted.
It doesn't seem related to the 8T limit.
Could you please add a pr_info() to print out 'found_start' and 'start'?
Although I'm not familiar with this code, the numbers may give a clue about
what's going wrong.
Thanks,
Qu
> Modules linked in: bcache configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative autofs4 snd_hda_codec_hdmi joydev snd_hda_codec_realtek snd_hda_codec_generic tuner_simple tuner_types tda9887 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep tda8290 coretemp snd_pcm_oss snd_mixer_oss tuner snd_pcm msp3400 snd_seq_midi snd_seq_midi_event firewire_sbp2 saa7127 snd_rawmidi hwmon_vid dm_crypt dm_mod saa7115 snd_seq bttv hid_generic snd_seq_device snd_timer ehci_pci ivtv tea575x videobuf_dma_sg rc_core videobuf_core input_leds tveeprom cx2341x v4l2_common ehci_hcd videodev media acpi_cpufreq tpm_tis tpm_tis_core gpio_ich snd soundcore tpm psmouse lpc_ich evdev asus_atk0110 serio_raw lp parport raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath usbhid hid sr_mod cdrom sg firewire_ohci firewire_core floppy crc_itu_t i915 atl1 fjes mii uhci_hcd usbcore usb_common
> CPU: 0 PID: 22825 Comm: kworker/u9:2 Tainted: G W 4.8.5-ia32-20161028 #2
> Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604 07/16/2008
> Workqueue: btrfs-worker-high btrfs_worker_helper
> 00200286 00200286 d3d81e48 df414827 00000000 dfa12da5 d3d81e78 df05677a
> df9ed884 00000000 00005929 dfa12da5 00000208 df2cf067 00000208 f7463fa0
> f401a080 00000000 d3d81e8c df05684a 00000009 00000000 00000000 d3d81eb4
> Call Trace:
> [<df414827>] dump_stack+0x58/0x81
> [<df05677a>] __warn+0xea/0x110
> [<df2cf067>] ? btree_csum_one_bio.isra.39+0xf7/0x100
> [<df05684a>] warn_slowpath_null+0x2a/0x30
> [<df2cf067>] btree_csum_one_bio.isra.39+0xf7/0x100
> [<df2cf085>] __btree_submit_bio_start+0x15/0x20
> [<df2cdd10>] run_one_async_start+0x30/0x40
> [<df31286d>] btrfs_scrubparity_helper+0xcd/0x2d0
> [<df2cde70>] ? run_one_async_free+0x20/0x20
> [<df312bbd>] btrfs_worker_helper+0xd/0x10
> [<df06d05b>] process_one_work+0x10b/0x400
> [<df06d387>] worker_thread+0x37/0x4b0
> [<df06d350>] ? process_one_work+0x400/0x400
> [<df0722db>] kthread+0x9b/0xb0
> [<df799922>] ret_from_kernel_thread+0xe/0x24
> [<df072240>] ? kthread_stop+0x100/0x100
> ---[ end trace f461faff989bf258 ]---
> BTRFS: error (device dm-0) in btrfs_commit_transaction:2232: errno=-5 IO failure (Error while writing out transaction)
> BTRFS info (device dm-0): forced readonly
> BTRFS warning (device dm-0): Skipping commit of aborted transaction.
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 22318 at fs/btrfs/transaction.c:1854 btrfs_commit_transaction+0x2f5/0xcc0
> BTRFS: Transaction aborted (error -5)
> Modules linked in: bcache configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative autofs4 snd_hda_codec_hdmi joydev snd_hda_codec_realtek snd_hda_codec_generic tuner_simple tuner_types tda9887 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep tda8290 coretemp snd_pcm_oss snd_mixer_oss tuner snd_pcm msp3400 snd_seq_midi snd_seq_midi_event firewire_sbp2 saa7127 snd_rawmidi hwmon_vid dm_crypt dm_mod saa7115 snd_seq bttv hid_generic snd_seq_device snd_timer ehci_pci ivtv tea575x videobuf_dma_sg rc_core videobuf_core input_leds tveeprom cx2341x v4l2_common ehci_hcd videodev media acpi_cpufreq tpm_tis tpm_tis_core gpio_ich snd soundcore tpm psmouse lpc_ich evdev asus_atk0110 serio_raw lp parport raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath usbhid hid sr_mod cdrom sg firewire_ohci firewire_core floppy crc_itu_t i915 atl1 fjes mii uhci_hcd usbcore usb_common
> CPU: 0 PID: 22318 Comm: btrfs-balance Tainted: G W 4.8.5-ia32-20161028 #2
> Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604 07/16/2008
> 00000286 00000286 d74a3ca4 df414827 d74a3ce8 dfa132ab d74a3cd4 df05677a
> dfa075cc d74a3d04 0000572e dfa132ab 0000073e df2d7de5 0000073e f698dc00
> e9173e70 fffffffb d74a3cf0 df0567db 00000009 00000000 d74a3ce8 dfa075cc
> Call Trace:
> [<df414827>] dump_stack+0x58/0x81
> [<df05677a>] __warn+0xea/0x110
> [<df2d7de5>] ? btrfs_commit_transaction+0x2f5/0xcc0
> [<df0567db>] warn_slowpath_fmt+0x3b/0x40
> [<df2d7de5>] btrfs_commit_transaction+0x2f5/0xcc0
> [<df096800>] ? prepare_to_wait_event+0xd0/0xd0
> [<df33334f>] prepare_to_relocate+0x12f/0x180
> [<df339a41>] relocate_block_group+0x31/0x790
> [<df0b1427>] ? vprintk_default+0x37/0x40
> [<df796ca0>] ? mutex_lock+0x10/0x30
> [<df2f8f45>] ? btrfs_wait_ordered_roots+0x1d5/0x1f0
> [<df14eed6>] ? printk+0x17/0x19
> [<df2a47b2>] ? btrfs_printk+0x102/0x110
> [<df33a388>] btrfs_relocate_block_group+0x1e8/0x2e0
> [<df308a9f>] btrfs_relocate_chunk.isra.29+0x3f/0xf0
> [<df30221f>] ? free_extent_buffer+0x4f/0xa0
> [<df30a555>] btrfs_balance+0xb05/0x1820
> [<df0b0afa>] ? console_unlock+0x40a/0x630
> [<df30b2c1>] balance_kthread+0x51/0x80
> [<df30b270>] ? btrfs_balance+0x1820/0x1820
> [<df0722db>] kthread+0x9b/0xb0
> [<df799922>] ret_from_kernel_thread+0xe/0x24
> [<df072240>] ? kthread_stop+0x100/0x100
> ---[ end trace f461faff989bf259 ]---
> BTRFS: error (device dm-0) in cleanup_transaction:1854: errno=-5 IO failure
> BTRFS info (device dm-0): delayed_refs has NO entry
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-08 1:17 ` Qu Wenruo
@ 2016-11-08 15:24 ` Marc MERLIN
2016-11-09 1:50 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-08 15:24 UTC (permalink / raw)
To: Qu Wenruo; +Cc: David Sterba, Hugo Mills, linux-btrfs
On Tue, Nov 08, 2016 at 09:17:43AM +0800, Qu Wenruo wrote:
>
>
> At 11/08/2016 09:06 AM, Marc MERLIN wrote:
> >On Tue, Nov 08, 2016 at 08:43:34AM +0800, Qu Wenruo wrote:
> >>That's strange, balance is done completely in kernel space.
> >>
> >>Unless we're calling vfs_* function we won't go through the extra check.
> >>
> >>What's the error reported?
> >
> >See below. Note however that is may be because btrfs received messed up the
> >filesystem first.
>
> If receive can easily screw up the fs, then fsstress can also screw up
> btrfs easily.
>
> So I didn't think that's the case. (Several years ago it's possible)
So now I'm even more confused. I put the array back in my 64bit system and
check --repair comes back clean, but scrub does not. Is that supposed to be possible?
gargamel:~# btrfs check -p --repair /dev/mapper/crypt_bcache2 2>&1 | tee /mnt/dshelf1/other/btrfs2
enabling repair mode
Checking filesystem on /dev/mapper/crypt_bcache2
UUID: 6692cf4c-93d9-438c-ac30-5db6381dc4f2
checking extents [.]
Fixed 0 roots.
cache and super generation don't match, space cache will be invalidated
checking fs roots [o]
checking csums
checking root refs
found 14622791987200 bytes used err is 0
total csum bytes: 14200176492
total tree bytes: 78239416320
total fs tree bytes: 59524497408
total extent tree bytes: 3236872192
btree space waste bytes: 10068589919
file data blocks allocated: 18101311373312
referenced 18038641020928
Nov 8 06:55:40 gargamel kernel: [35631.988882] BTRFS warning (device dm-6): checksum error at logical 27887403008 on dev /dev/mapper/crypt_bcache2, sector 56581120, root 9461, inode 45837, offset 15460614144, length 4096, links 1 (path: system/mlocate/mlocate.db)
Nov 8 06:55:40 gargamel kernel: [35631.988885] BTRFS warning (device dm-6): checksum error at logical 27887009792 on dev /dev/mapper/crypt_bcache2, sector 56580352, root 9461, inode 45837, offset 15460220928, length 4096, links 1 (path: system/mlocate/mlocate.db)
Nov 8 06:55:40 gargamel kernel: [35631.988887] BTRFS warning (device dm-6): checksum error at logical 27886878720 on dev /dev/mapper/crypt_bcache2, sector 56580096, root 9461, inode 45837, offset 15460089856, length 4096, links 1 (path: system/mlocate/mlocate.db)
Nov 8 06:55:40 gargamel kernel: [35631.988890] BTRFS warning (device dm-6): checksum error at logical 27887837184 on dev /dev/mapper/crypt_bcache2, sector 56581968, root 9461, inode 45837, offset 15461048320, length 4096, links 1 (path: system/mlocate/mlocate.db)
Nov 8 06:55:40 gargamel kernel: [35631.988895] BTRFS warning (device dm-6): checksum error at logical 27885830144 on dev /dev/mapper/crypt_bcache2, sector 56578048, root 9461, inode 45837, offset 15459041280, length 4096, links 1 (path: system/mlocate/mlocate.db)
Nov 8 06:55:40 gargamel kernel: [35631.988896] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 513, gen 0
Nov 8 06:55:40 gargamel kernel: [35631.988897] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 514, gen 0
Nov 8 06:55:40 gargamel kernel: [35631.988899] BTRFS warning (device dm-6): checksum error at logical 27885961216 on dev /dev/mapper/crypt_bcache2, sector 56578304, root 9461, inode 45837, offset 15459172352, length 4096, links 1 (path: system/mlocate/mlocate.db)
Nov 8 06:55:40 gargamel kernel: [35631.988900] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 515, gen 0
Nov 8 06:55:40 gargamel kernel: [35631.988903] BTRFS warning (device dm-6): checksum error at logical 27887534080 on dev /dev/mapper/crypt_bcache2, sector 56581376, root 9461, inode 45837, offset 15460745216, length 4096, links 1 (path: system/mlocate/mlocate.db)
Nov 8 06:55:40 gargamel kernel: [35631.988904] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27887009792 on dev /dev/mapper/crypt_bcache2
Nov 8 06:55:40 gargamel kernel: [35631.988905] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27886878720 on dev /dev/mapper/crypt_bcache2
Nov 8 06:55:40 gargamel kernel: [35631.988906] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 516, gen 0
Nov 8 06:55:40 gargamel kernel: [35631.988907] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27887837184 on dev /dev/mapper/crypt_bcache2
Nov 8 06:55:40 gargamel kernel: [35631.988908] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 517, gen 0
Nov 8 06:55:40 gargamel kernel: [35631.988909] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 518, gen 0
Nov 8 06:55:40 gargamel kernel: [35631.988910] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27885830144 on dev /dev/mapper/crypt_bcache2
Nov 8 06:55:40 gargamel kernel: [35631.988911] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27885961216 on dev /dev/mapper/crypt_bcache2
Nov 8 06:55:40 gargamel kernel: [35631.988912] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27887534080 on dev /dev/mapper/crypt_bcache2
> >
> >BTRFS info (device dm-0): use zlib compression
> >BTRFS info (device dm-0): disk space caching is enabled
> >BTRFS info (device dm-0): has skinny extents
> >BTRFS info (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0,
> >flush 0, corrupt 512, gen 0
> >BTRFS info (device dm-0): detected SSD devices, enabling SSD mode
> >BTRFS info (device dm-0): continuing balance
> >BTRFS info (device dm-0): The free space cache file (1593999097856) is
> >invalid. skip it
> >
> >BTRFS info (device dm-0): The free space cache file (1671308509184) is
> >invalid. skip it
> >
> >BTRFS info (device dm-0): relocating block group 13835461197824 flags 34
> >------------[ cut here ]------------
> >WARNING: CPU: 0 PID: 22825 at fs/btrfs/disk-io.c:520
> >btree_csum_one_bio.isra.39+0xf7/0x100
>
> Dirty tree block's bytenr doesn't match with page's logical.
> It seems that the tree block is not up-to-date, maybe corrupted.
>
> Seems not related to the 8T limit.
>
> Could you please add pr_info() to print out the 'found_start' and 'start'?
> Also, I'm not familiar with this code; the numbers may have a clue to show
> what's going wrong.
>
> Thanks,
> Qu
>
> >Modules linked in: bcache configs rc_hauppauge ir_kbd_i2c
> >cpufreq_userspace cpufreq_powersave cpufreq_conservative autofs4
> >snd_hda_codec_hdmi joydev snd_hda_codec_realtek snd_hda_codec_generic
> >tuner_simple tuner_types tda9887 snd_hda_intel snd_hda_codec snd_hda_core
> >snd_hwdep tda8290 coretemp snd_pcm_oss snd_mixer_oss tuner snd_pcm msp3400
> >snd_seq_midi snd_seq_midi_event firewire_sbp2 saa7127 snd_rawmidi
> >hwmon_vid dm_crypt dm_mod saa7115 snd_seq bttv hid_generic snd_seq_device
> >snd_timer ehci_pci ivtv tea575x videobuf_dma_sg rc_core videobuf_core
> >input_leds tveeprom cx2341x v4l2_common ehci_hcd videodev media
> >acpi_cpufreq tpm_tis tpm_tis_core gpio_ich snd soundcore tpm psmouse
> >lpc_ich evdev asus_atk0110 serio_raw lp parport raid456 async_raid6_recov
> >async_pq async_xor async_memcpy async_tx multipath usbhid hid sr_mod cdrom
> >sg firewire_ohci firewire_core floppy crc_itu_t i915 atl1 fjes mii
> >uhci_hcd usbcore usb_common
> >CPU: 0 PID: 22825 Comm: kworker/u9:2 Tainted: G W
> >4.8.5-ia32-20161028 #2
> >Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604
> >07/16/2008
> >Workqueue: btrfs-worker-high btrfs_worker_helper
> > 00200286 00200286 d3d81e48 df414827 00000000 dfa12da5 d3d81e78 df05677a
> > df9ed884 00000000 00005929 dfa12da5 00000208 df2cf067 00000208 f7463fa0
> > f401a080 00000000 d3d81e8c df05684a 00000009 00000000 00000000 d3d81eb4
> >Call Trace:
> > [<df414827>] dump_stack+0x58/0x81
> > [<df05677a>] __warn+0xea/0x110
> > [<df2cf067>] ? btree_csum_one_bio.isra.39+0xf7/0x100
> > [<df05684a>] warn_slowpath_null+0x2a/0x30
> > [<df2cf067>] btree_csum_one_bio.isra.39+0xf7/0x100
> > [<df2cf085>] __btree_submit_bio_start+0x15/0x20
> > [<df2cdd10>] run_one_async_start+0x30/0x40
> > [<df31286d>] btrfs_scrubparity_helper+0xcd/0x2d0
> > [<df2cde70>] ? run_one_async_free+0x20/0x20
> > [<df312bbd>] btrfs_worker_helper+0xd/0x10
> > [<df06d05b>] process_one_work+0x10b/0x400
> > [<df06d387>] worker_thread+0x37/0x4b0
> > [<df06d350>] ? process_one_work+0x400/0x400
> > [<df0722db>] kthread+0x9b/0xb0
> > [<df799922>] ret_from_kernel_thread+0xe/0x24
> > [<df072240>] ? kthread_stop+0x100/0x100
> >---[ end trace f461faff989bf258 ]---
> >BTRFS: error (device dm-0) in btrfs_commit_transaction:2232: errno=-5 IO
> >failure (Error while writing out transaction)
> >BTRFS info (device dm-0): forced readonly
> >BTRFS warning (device dm-0): Skipping commit of aborted transaction.
> >------------[ cut here ]------------
> >WARNING: CPU: 0 PID: 22318 at fs/btrfs/transaction.c:1854
> >btrfs_commit_transaction+0x2f5/0xcc0
> >BTRFS: Transaction aborted (error -5)
> >Modules linked in: bcache configs rc_hauppauge ir_kbd_i2c
> >cpufreq_userspace cpufreq_powersave cpufreq_conservative autofs4
> >snd_hda_codec_hdmi joydev snd_hda_codec_realtek snd_hda_codec_generic
> >tuner_simple tuner_types tda9887 snd_hda_intel snd_hda_codec snd_hda_core
> >snd_hwdep tda8290 coretemp snd_pcm_oss snd_mixer_oss tuner snd_pcm msp3400
> >snd_seq_midi snd_seq_midi_event firewire_sbp2 saa7127 snd_rawmidi
> >hwmon_vid dm_crypt dm_mod saa7115 snd_seq bttv hid_generic snd_seq_device
> >snd_timer ehci_pci ivtv tea575x videobuf_dma_sg rc_core videobuf_core
> >input_leds tveeprom cx2341x v4l2_common ehci_hcd videodev media
> >acpi_cpufreq tpm_tis tpm_tis_core gpio_ich snd soundcore tpm psmouse
> >lpc_ich evdev asus_atk0110 serio_raw lp parport raid456 async_raid6_recov
> >async_pq async_xor async_memcpy async_tx multipath usbhid hid sr_mod cdrom
> >sg firewire_ohci firewire_core floppy crc_itu_t i915 atl1 fjes mii
> >uhci_hcd usbcore usb_common
> >CPU: 0 PID: 22318 Comm: btrfs-balance Tainted: G W
> >4.8.5-ia32-20161028 #2
> >Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604
> >07/16/2008
> > 00000286 00000286 d74a3ca4 df414827 d74a3ce8 dfa132ab d74a3cd4 df05677a
> > dfa075cc d74a3d04 0000572e dfa132ab 0000073e df2d7de5 0000073e f698dc00
> > e9173e70 fffffffb d74a3cf0 df0567db 00000009 00000000 d74a3ce8 dfa075cc
> >Call Trace:
> > [<df414827>] dump_stack+0x58/0x81
> > [<df05677a>] __warn+0xea/0x110
> > [<df2d7de5>] ? btrfs_commit_transaction+0x2f5/0xcc0
> > [<df0567db>] warn_slowpath_fmt+0x3b/0x40
> > [<df2d7de5>] btrfs_commit_transaction+0x2f5/0xcc0
> > [<df096800>] ? prepare_to_wait_event+0xd0/0xd0
> > [<df33334f>] prepare_to_relocate+0x12f/0x180
> > [<df339a41>] relocate_block_group+0x31/0x790
> > [<df0b1427>] ? vprintk_default+0x37/0x40
> > [<df796ca0>] ? mutex_lock+0x10/0x30
> > [<df2f8f45>] ? btrfs_wait_ordered_roots+0x1d5/0x1f0
> > [<df14eed6>] ? printk+0x17/0x19
> > [<df2a47b2>] ? btrfs_printk+0x102/0x110
> > [<df33a388>] btrfs_relocate_block_group+0x1e8/0x2e0
> > [<df308a9f>] btrfs_relocate_chunk.isra.29+0x3f/0xf0
> > [<df30221f>] ? free_extent_buffer+0x4f/0xa0
> > [<df30a555>] btrfs_balance+0xb05/0x1820
> > [<df0b0afa>] ? console_unlock+0x40a/0x630
> > [<df30b2c1>] balance_kthread+0x51/0x80
> > [<df30b270>] ? btrfs_balance+0x1820/0x1820
> > [<df0722db>] kthread+0x9b/0xb0
> > [<df799922>] ret_from_kernel_thread+0xe/0x24
> > [<df072240>] ? kthread_stop+0x100/0x100
> >---[ end trace f461faff989bf259 ]---
> >BTRFS: error (device dm-0) in cleanup_transaction:1854: errno=-5 IO failure
> >BTRFS info (device dm-0): delayed_refs has NO entry
> >
>
>
>
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-08 15:24 ` Marc MERLIN
@ 2016-11-09 1:50 ` Qu Wenruo
2016-11-09 2:05 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-09 1:50 UTC (permalink / raw)
To: Marc MERLIN; +Cc: David Sterba, Hugo Mills, linux-btrfs
At 11/08/2016 11:24 PM, Marc MERLIN wrote:
> On Tue, Nov 08, 2016 at 09:17:43AM +0800, Qu Wenruo wrote:
>>
>>
>> At 11/08/2016 09:06 AM, Marc MERLIN wrote:
>>> On Tue, Nov 08, 2016 at 08:43:34AM +0800, Qu Wenruo wrote:
>>>> That's strange, balance is done completely in kernel space.
>>>>
>>>> Unless we're calling vfs_* function we won't go through the extra check.
>>>>
>>>> What's the error reported?
>>>
>>> See below. Note however that this may be because btrfs receive messed up the
>>> filesystem first.
>>
>> If receive can easily screw up the fs, then fsstress can also screw up
>> btrfs easily.
>>
>> So I don't think that's the case. (Several years ago it was possible.)
>
> So now I'm even more confused. I put the array back in my 64bit system and
> check --repair comes back clean, but scrub does not. Is that supposed to be possible?

Yes, that is quite possible.

Currently, btrfs check only checks:
1) Metadata
   (the --check-data-csum option also checks data, but it is still
   subject to restriction 3)
2) Cross references between metadata (the contents of metadata)
3) The first good mirror/backup

So quite a lot of problems can't be detected by btrfs check:
1) Data corruption (csum mismatch)
2) Corruption of the 2nd mirror (DUP/RAID0/10) or parity errors (RAID5/6)

For btrfsck to check all mirrors and data, you could try the out-of-tree
offline scrub patchset:
https://github.com/adam900710/btrfs-progs/tree/fsck_scrub

It implements the equivalent of the kernel scrub in btrfs-progs.
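
To make "data corruption (csum mismatch)" concrete, here is a minimal Python
sketch of a per-block checksum comparison. It assumes CRC-32C as the data
checksum and 4096-byte blocks (the "length 4096" in the scrub messages). The
names crc32c and block_is_intact are hypothetical, and the sketch says nothing
about how btrfs seeds or lays out its checksums on disk.

```python
# Minimal CRC-32C (Castagnoli) sketch in pure Python, for illustration only.
# Assumption: 4096-byte data blocks, matching "length 4096" in the scrub log.

BLOCK_SIZE = 4096  # assumed block size


def _make_table():
    """Build the 256-entry lookup table for the reflected polynomial."""
    poly = 0x82F63B78  # reflected Castagnoli polynomial
    table = []
    for n in range(256):
        crc = n
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Standard CRC-32C: initial value and final XOR are both 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_is_intact(block: bytes, stored_csum: int) -> bool:
    """Hypothetical helper: compare a block against its recorded checksum."""
    return crc32c(block) == stored_csum


# Illustration: a checksum recorded at write time, then one bit flips on disk.
good = bytes(BLOCK_SIZE)
stored = crc32c(good)
damaged = bytearray(good)
damaged[100] ^= 0x01

print(block_is_intact(good, stored))            # block matches its checksum
print(block_is_intact(bytes(damaged), stored))  # one flipped bit is detected
```

Roughly speaking, scrub reads the data and makes this kind of comparison for
every block, which is why it can report errors on a filesystem whose metadata
passes btrfs check.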
Thanks,
Qu
>
> gargamel:~# btrfs check -p --repair /dev/mapper/crypt_bcache2 2>&1 | tee /mnt/dshelf1/other/btrfs2
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_bcache2
> UUID: 6692cf4c-93d9-438c-ac30-5db6381dc4f2
> checking extents [.]
> Fixed 0 roots.
> cache and super generation don't match, space cache will be invalidated
> checking fs roots [o]
> checking csums
> checking root refs
> found 14622791987200 bytes used err is 0
> total csum bytes: 14200176492
> total tree bytes: 78239416320
> total fs tree bytes: 59524497408
> total extent tree bytes: 3236872192
> btree space waste bytes: 10068589919
> file data blocks allocated: 18101311373312
> referenced 18038641020928
>
> Nov 8 06:55:40 gargamel kernel: [35631.988882] BTRFS warning (device dm-6): checksum error at logical 27887403008 on dev /dev/mapper/crypt_bcache2, sector 56581120, root 9461, inode 45837, offset 15460614144, length 4096, links 1 (path: system/mlocate/mlocate.db)
> Nov 8 06:55:40 gargamel kernel: [35631.988885] BTRFS warning (device dm-6): checksum error at logical 27887009792 on dev /dev/mapper/crypt_bcache2, sector 56580352, root 9461, inode 45837, offset 15460220928, length 4096, links 1 (path: system/mlocate/mlocate.db)
> Nov 8 06:55:40 gargamel kernel: [35631.988887] BTRFS warning (device dm-6): checksum error at logical 27886878720 on dev /dev/mapper/crypt_bcache2, sector 56580096, root 9461, inode 45837, offset 15460089856, length 4096, links 1 (path: system/mlocate/mlocate.db)
> Nov 8 06:55:40 gargamel kernel: [35631.988890] BTRFS warning (device dm-6): checksum error at logical 27887837184 on dev /dev/mapper/crypt_bcache2, sector 56581968, root 9461, inode 45837, offset 15461048320, length 4096, links 1 (path: system/mlocate/mlocate.db)
> Nov 8 06:55:40 gargamel kernel: [35631.988895] BTRFS warning (device dm-6): checksum error at logical 27885830144 on dev /dev/mapper/crypt_bcache2, sector 56578048, root 9461, inode 45837, offset 15459041280, length 4096, links 1 (path: system/mlocate/mlocate.db)
> Nov 8 06:55:40 gargamel kernel: [35631.988896] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 513, gen 0
> Nov 8 06:55:40 gargamel kernel: [35631.988897] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 514, gen 0
> Nov 8 06:55:40 gargamel kernel: [35631.988899] BTRFS warning (device dm-6): checksum error at logical 27885961216 on dev /dev/mapper/crypt_bcache2, sector 56578304, root 9461, inode 45837, offset 15459172352, length 4096, links 1 (path: system/mlocate/mlocate.db)
> Nov 8 06:55:40 gargamel kernel: [35631.988900] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 515, gen 0
> Nov 8 06:55:40 gargamel kernel: [35631.988903] BTRFS warning (device dm-6): checksum error at logical 27887534080 on dev /dev/mapper/crypt_bcache2, sector 56581376, root 9461, inode 45837, offset 15460745216, length 4096, links 1 (path: system/mlocate/mlocate.db)
> Nov 8 06:55:40 gargamel kernel: [35631.988904] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27887009792 on dev /dev/mapper/crypt_bcache2
> Nov 8 06:55:40 gargamel kernel: [35631.988905] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27886878720 on dev /dev/mapper/crypt_bcache2
> Nov 8 06:55:40 gargamel kernel: [35631.988906] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 516, gen 0
> Nov 8 06:55:40 gargamel kernel: [35631.988907] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27887837184 on dev /dev/mapper/crypt_bcache2
> Nov 8 06:55:40 gargamel kernel: [35631.988908] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 517, gen 0
> Nov 8 06:55:40 gargamel kernel: [35631.988909] BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 518, gen 0
> Nov 8 06:55:40 gargamel kernel: [35631.988910] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27885830144 on dev /dev/mapper/crypt_bcache2
> Nov 8 06:55:40 gargamel kernel: [35631.988911] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27885961216 on dev /dev/mapper/crypt_bcache2
> Nov 8 06:55:40 gargamel kernel: [35631.988912] BTRFS error (device dm-6): unable to fixup (regular) error at logical 27887534080 on dev /dev/mapper/crypt_bcache2
>
>
>
>>>
>>> BTRFS info (device dm-0): use zlib compression
>>> BTRFS info (device dm-0): disk space caching is enabled
>>> BTRFS info (device dm-0): has skinny extents
>>> BTRFS info (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0,
>>> flush 0, corrupt 512, gen 0
>>> BTRFS info (device dm-0): detected SSD devices, enabling SSD mode
>>> BTRFS info (device dm-0): continuing balance
>>> BTRFS info (device dm-0): The free space cache file (1593999097856) is
>>> invalid. skip it
>>>
>>> BTRFS info (device dm-0): The free space cache file (1671308509184) is
>>> invalid. skip it
>>>
>>> BTRFS info (device dm-0): relocating block group 13835461197824 flags 34
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 0 PID: 22825 at fs/btrfs/disk-io.c:520
>>> btree_csum_one_bio.isra.39+0xf7/0x100
>>
>> Dirty tree block's bytenr doesn't match with page's logical.
>> It seems that the tree block is not up-to-date, maybe corrupted.
>>
>> Seems not related to the 8T limit.
>>
>> Could you please add pr_info() to print out the 'found_start' and 'start'?
>> Also, I'm not familiar with this code; the numbers may have a clue to show
>> what's going wrong.
>>
>> Thanks,
>> Qu
>>
>>> Modules linked in: bcache configs rc_hauppauge ir_kbd_i2c
>>> cpufreq_userspace cpufreq_powersave cpufreq_conservative autofs4
>>> snd_hda_codec_hdmi joydev snd_hda_codec_realtek snd_hda_codec_generic
>>> tuner_simple tuner_types tda9887 snd_hda_intel snd_hda_codec snd_hda_core
>>> snd_hwdep tda8290 coretemp snd_pcm_oss snd_mixer_oss tuner snd_pcm msp3400
>>> snd_seq_midi snd_seq_midi_event firewire_sbp2 saa7127 snd_rawmidi
>>> hwmon_vid dm_crypt dm_mod saa7115 snd_seq bttv hid_generic snd_seq_device
>>> snd_timer ehci_pci ivtv tea575x videobuf_dma_sg rc_core videobuf_core
>>> input_leds tveeprom cx2341x v4l2_common ehci_hcd videodev media
>>> acpi_cpufreq tpm_tis tpm_tis_core gpio_ich snd soundcore tpm psmouse
>>> lpc_ich evdev asus_atk0110 serio_raw lp parport raid456 async_raid6_recov
>>> async_pq async_xor async_memcpy async_tx multipath usbhid hid sr_mod cdrom
>>> sg firewire_ohci firewire_core floppy crc_itu_t i915 atl1 fjes mii
>>> uhci_hcd usbcore usb_common
>>> CPU: 0 PID: 22825 Comm: kworker/u9:2 Tainted: G W
>>> 4.8.5-ia32-20161028 #2
>>> Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604
>>> 07/16/2008
>>> Workqueue: btrfs-worker-high btrfs_worker_helper
>>> 00200286 00200286 d3d81e48 df414827 00000000 dfa12da5 d3d81e78 df05677a
>>> df9ed884 00000000 00005929 dfa12da5 00000208 df2cf067 00000208 f7463fa0
>>> f401a080 00000000 d3d81e8c df05684a 00000009 00000000 00000000 d3d81eb4
>>> Call Trace:
>>> [<df414827>] dump_stack+0x58/0x81
>>> [<df05677a>] __warn+0xea/0x110
>>> [<df2cf067>] ? btree_csum_one_bio.isra.39+0xf7/0x100
>>> [<df05684a>] warn_slowpath_null+0x2a/0x30
>>> [<df2cf067>] btree_csum_one_bio.isra.39+0xf7/0x100
>>> [<df2cf085>] __btree_submit_bio_start+0x15/0x20
>>> [<df2cdd10>] run_one_async_start+0x30/0x40
>>> [<df31286d>] btrfs_scrubparity_helper+0xcd/0x2d0
>>> [<df2cde70>] ? run_one_async_free+0x20/0x20
>>> [<df312bbd>] btrfs_worker_helper+0xd/0x10
>>> [<df06d05b>] process_one_work+0x10b/0x400
>>> [<df06d387>] worker_thread+0x37/0x4b0
>>> [<df06d350>] ? process_one_work+0x400/0x400
>>> [<df0722db>] kthread+0x9b/0xb0
>>> [<df799922>] ret_from_kernel_thread+0xe/0x24
>>> [<df072240>] ? kthread_stop+0x100/0x100
>>> ---[ end trace f461faff989bf258 ]---
>>> BTRFS: error (device dm-0) in btrfs_commit_transaction:2232: errno=-5 IO
>>> failure (Error while writing out transaction)
>>> BTRFS info (device dm-0): forced readonly
>>> BTRFS warning (device dm-0): Skipping commit of aborted transaction.
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 0 PID: 22318 at fs/btrfs/transaction.c:1854
>>> btrfs_commit_transaction+0x2f5/0xcc0
>>> BTRFS: Transaction aborted (error -5)
>>> Modules linked in: bcache configs rc_hauppauge ir_kbd_i2c
>>> cpufreq_userspace cpufreq_powersave cpufreq_conservative autofs4
>>> snd_hda_codec_hdmi joydev snd_hda_codec_realtek snd_hda_codec_generic
>>> tuner_simple tuner_types tda9887 snd_hda_intel snd_hda_codec snd_hda_core
>>> snd_hwdep tda8290 coretemp snd_pcm_oss snd_mixer_oss tuner snd_pcm msp3400
>>> snd_seq_midi snd_seq_midi_event firewire_sbp2 saa7127 snd_rawmidi
>>> hwmon_vid dm_crypt dm_mod saa7115 snd_seq bttv hid_generic snd_seq_device
>>> snd_timer ehci_pci ivtv tea575x videobuf_dma_sg rc_core videobuf_core
>>> input_leds tveeprom cx2341x v4l2_common ehci_hcd videodev media
>>> acpi_cpufreq tpm_tis tpm_tis_core gpio_ich snd soundcore tpm psmouse
>>> lpc_ich evdev asus_atk0110 serio_raw lp parport raid456 async_raid6_recov
>>> async_pq async_xor async_memcpy async_tx multipath usbhid hid sr_mod cdrom
>>> sg firewire_ohci firewire_core floppy crc_itu_t i915 atl1 fjes mii
>>> uhci_hcd usbcore usb_common
>>> CPU: 0 PID: 22318 Comm: btrfs-balance Tainted: G W
>>> 4.8.5-ia32-20161028 #2
>>> Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604
>>> 07/16/2008
>>> 00000286 00000286 d74a3ca4 df414827 d74a3ce8 dfa132ab d74a3cd4 df05677a
>>> dfa075cc d74a3d04 0000572e dfa132ab 0000073e df2d7de5 0000073e f698dc00
>>> e9173e70 fffffffb d74a3cf0 df0567db 00000009 00000000 d74a3ce8 dfa075cc
>>> Call Trace:
>>> [<df414827>] dump_stack+0x58/0x81
>>> [<df05677a>] __warn+0xea/0x110
>>> [<df2d7de5>] ? btrfs_commit_transaction+0x2f5/0xcc0
>>> [<df0567db>] warn_slowpath_fmt+0x3b/0x40
>>> [<df2d7de5>] btrfs_commit_transaction+0x2f5/0xcc0
>>> [<df096800>] ? prepare_to_wait_event+0xd0/0xd0
>>> [<df33334f>] prepare_to_relocate+0x12f/0x180
>>> [<df339a41>] relocate_block_group+0x31/0x790
>>> [<df0b1427>] ? vprintk_default+0x37/0x40
>>> [<df796ca0>] ? mutex_lock+0x10/0x30
>>> [<df2f8f45>] ? btrfs_wait_ordered_roots+0x1d5/0x1f0
>>> [<df14eed6>] ? printk+0x17/0x19
>>> [<df2a47b2>] ? btrfs_printk+0x102/0x110
>>> [<df33a388>] btrfs_relocate_block_group+0x1e8/0x2e0
>>> [<df308a9f>] btrfs_relocate_chunk.isra.29+0x3f/0xf0
>>> [<df30221f>] ? free_extent_buffer+0x4f/0xa0
>>> [<df30a555>] btrfs_balance+0xb05/0x1820
>>> [<df0b0afa>] ? console_unlock+0x40a/0x630
>>> [<df30b2c1>] balance_kthread+0x51/0x80
>>> [<df30b270>] ? btrfs_balance+0x1820/0x1820
>>> [<df0722db>] kthread+0x9b/0xb0
>>> [<df799922>] ret_from_kernel_thread+0xe/0x24
>>> [<df072240>] ? kthread_stop+0x100/0x100
>>> ---[ end trace f461faff989bf259 ]---
>>> BTRFS: error (device dm-0) in cleanup_transaction:1854: errno=-5 IO failure
>>> BTRFS info (device dm-0): delayed_refs has NO entry
>>>
>>
>>
>>
>
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-09 1:50 ` Qu Wenruo
@ 2016-11-09 2:05 ` Marc MERLIN
2016-11-11 3:48 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-09 2:05 UTC (permalink / raw)
To: Qu Wenruo; +Cc: David Sterba, Hugo Mills, linux-btrfs
On Wed, Nov 09, 2016 at 09:50:08AM +0800, Qu Wenruo wrote:
> Yeah, quite possible!
>
> The truth is, current btrfs check only checks:
> 1) Metadata
> while --check-data-csum option will check data, but still
> follow the restriction 3).
> 2) Crossing reference of metadata (contents of metadata)
> 3) The first good mirror/backup
>
> So quite a lot of problems can't be detected by btrfs check:
> 1) Data corruption (csum mismatch)
> 2) 2nd mirror corruption(DUP/RAID0/10) or parity error(RAID5/6)
>
> For btrfsck to check all mirror and data, you could try out-of-tree
> offline scrub patchset:
> https://github.com/adam900710/btrfs-progs/tree/fsck_scrub
>
> Which implements the kernel scrub equivalent in btrfs-progs.

I see, thanks for the answer.
Note that this is very confusing to the end user.
If check --repair returns success, the filesystem should be clean.
Hopefully that patchset can be included in btrfs-progs.

But sure enough, I'm seeing a lot of these:
BTRFS warning (device dm-6): checksum error at logical 269783986176 on dev /dev/mapper/crypt_bcache2, sector 529035384, root 16755, inode 1225897, offset 77824, length 4096, links 5 (path: magic/20150624/home/merlin/public_html/rig3/img/thumb800_302_1-Wire.jpg)

This is bad because I would expect check --repair to find them all and then
either offer to remove all the corrupted files after giving me a list of what
I've lost, or just recompute the checksums so that they are correct. In the
second case I would know the files are corrupted but "clean", and I would have
the option of keeping them as is (ok-ish for a video file) or restoring them
from backup.

The worst part with scrub is that I have to find all these files, then find
all the snapshots they're in (maybe 10 or 20) and delete them all. Some of
those snapshots are read-only because they are btrfs send sources, so I need
to destroy those snapshots, lose my btrfs send relationship, and recreate it
(maybe 2 to 6 days of syncing over a slow-ish link).
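
Collecting the affected files from the kernel log can at least be scripted.
The following is a rough Python sketch, assuming the message format shown
above. CSUM_RE and damaged_files are hypothetical names, and it only handles
lines where the kernel could resolve a path.

```python
import re
from collections import defaultdict

# Matches scrub "checksum error" lines that end with a resolved path.
CSUM_RE = re.compile(
    r"checksum error at logical (?P<logical>\d+) on dev (?P<dev>\S+), "
    r"sector (?P<sector>\d+), root (?P<root>\d+), inode (?P<inode>\d+), "
    r"offset (?P<offset>\d+), length (?P<length>\d+), links (?P<links>\d+) "
    r"\(path: (?P<path>.*)\)\s*$"
)


def damaged_files(lines):
    """Map (root, inode, path) -> sorted list of damaged file offsets."""
    hits = defaultdict(set)  # a set drops log lines that were printed twice
    for line in lines:
        m = CSUM_RE.search(line)
        if m:
            key = (int(m["root"]), int(m["inode"]), m["path"])
            hits[key].add(int(m["offset"]))
    return {key: sorted(offsets) for key, offsets in hits.items()}


# Sample lines copied from the logs in this thread (the first one twice).
PREFIX = "BTRFS warning (device dm-6): checksum error at logical "
DEV = " on dev /dev/mapper/crypt_bcache2, "
SAMPLE = [
    PREFIX + "27885961216" + DEV + "sector 56578304, root 9461, inode 45837, "
    "offset 15459172352, length 4096, links 1 (path: system/mlocate/mlocate.db)",
    PREFIX + "27885961216" + DEV + "sector 56578304, root 9461, inode 45837, "
    "offset 15459172352, length 4096, links 1 (path: system/mlocate/mlocate.db)",
    PREFIX + "27887534080" + DEV + "sector 56581376, root 9461, inode 45837, "
    "offset 15460745216, length 4096, links 1 (path: system/mlocate/mlocate.db)",
    PREFIX + "269783986176" + DEV + "sector 529035384, root 16755, inode 1225897, "
    "offset 77824, length 4096, links 5 "
    "(path: magic/20150624/home/merlin/public_html/rig3/img/thumb800_302_1-Wire.jpg)",
]

for (root, inode, path), offsets in damaged_files(SAMPLE).items():
    print(f"root {root} inode {inode}: {len(offsets)} bad block(s) in {path}")
```

The same function accepts any iterable of lines, for example an open file
holding saved kernel messages. Grouping by root makes it easier to see which
snapshots contain a given damaged file.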
When data is corrupted, no solution is perfect, but hopefully check --repair
will indeed be able to restore the entire filesystem to a clean state, even
if some data must be lost in the process.
Thanks for considering.
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-09 2:05 ` Marc MERLIN
@ 2016-11-11 3:48 ` Marc MERLIN
2016-11-11 3:55 ` Qu Wenruo
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-11 3:48 UTC (permalink / raw)
To: Qu Wenruo; +Cc: David Sterba, Hugo Mills, linux-btrfs
On Tue, Nov 08, 2016 at 06:05:19PM -0800, Marc MERLIN wrote:
> On Wed, Nov 09, 2016 at 09:50:08AM +0800, Qu Wenruo wrote:
> > Yeah, quite possible!
> >
> > The truth is, current btrfs check only checks:
> > 1) Metadata
> > while --check-data-csum option will check data, but still
> > follow the restriction 3).
> > 2) Crossing reference of metadata (contents of metadata)
> > 3) The first good mirror/backup
> >
> > So quite a lot of problems can't be detected by btrfs check:
> > 1) Data corruption (csum mismatch)
> > 2) 2nd mirror corruption(DUP/RAID0/10) or parity error(RAID5/6)
> >
> > For btrfsck to check all mirror and data, you could try out-of-tree
> > offline scrub patchset:
> > https://github.com/adam900710/btrfs-progs/tree/fsck_scrub
> >
> > Which implements the kernel scrub equivalent in btrfs-progs.
>
> I see, thanks for the answer.
> Note that this is very confusing to the end user.
> If check --repair returns success, the filesystem should be clean.
> Hopefully that patchset can be included in btrfs-progs
>
> But sure enough, I'm seeing a lot of these:
> BTRFS warning (device dm-6): checksum error at logical 269783986176 on dev /dev/mapper/crypt_bcache2, sector 529035384, root 16755, inode 1225897, offset 77824, length 4096, links 5 (path: magic/20150624/home/merlin/public_html/rig3/img/thumb800_302_1-Wire.jpg)
So, I ran check --repair, then I ran scrub, and I deleted all the files
that were referenced by pathname and failed scrub.
Now I have this:
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785128960 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1545, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785133056 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1546, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785137152 on dev /dev/mapper/crypt_bcache2
BTRFS warning (device dm-6): checksum error at logical 269784580096 on dev /dev/mapper/crypt_bcache2, sector 529036544, root 17564, inode 1225903, offset 16384: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269784584192 on dev /dev/mapper/crypt_bcache2, sector 529036552, root 17564, inode 1225903, offset 20480: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269784588288 on dev /dev/mapper/crypt_bcache2, sector 529036560, root 17564, inode 1225903, offset 24576: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269784592384 on dev /dev/mapper/crypt_bcache2, sector 529036568, root 17564, inode 1225903, offset 28672: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269784596480 on dev /dev/mapper/crypt_bcache2, sector 529036576, root 17564, inode 1225903, offset 32768: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269784600576 on dev /dev/mapper/crypt_bcache2, sector 529036584, root 17564, inode 1225903, offset 36864: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269784604672 on dev /dev/mapper/crypt_bcache2, sector 529036592, root 17564, inode 1225903, offset 40960: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269784608768 on dev /dev/mapper/crypt_bcache2, sector 529036600, root 17564, inode 1225903, offset 45056: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269784612864 on dev /dev/mapper/crypt_bcache2, sector 529036608, root 17564, inode 1225903, offset 49152: path resolving failed with ret=-2
How am I supposed to deal with those?
Thanks,
Marc
* Re: btrfs support for filesystems >8TB on 32bit architectures
2016-11-11 3:48 ` Marc MERLIN
@ 2016-11-11 3:55 ` Qu Wenruo
2016-11-12 3:17 ` when btrfs scrub reports errors and btrfs check --repair does not Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Qu Wenruo @ 2016-11-11 3:55 UTC (permalink / raw)
To: Marc MERLIN; +Cc: David Sterba, Hugo Mills, linux-btrfs
At 11/11/2016 11:48 AM, Marc MERLIN wrote:
> On Tue, Nov 08, 2016 at 06:05:19PM -0800, Marc MERLIN wrote:
>> On Wed, Nov 09, 2016 at 09:50:08AM +0800, Qu Wenruo wrote:
>>> Yeah, quite possible!
>>>
>>> The truth is, current btrfs check only checks:
>>> 1) Metadata
>>> while --check-data-csum option will check data, but still
>>> follow the restriction 3).
>>> 2) Crossing reference of metadata (contents of metadata)
>>> 3) The first good mirror/backup
>>>
>>> So quite a lot of problems can't be detected by btrfs check:
>>> 1) Data corruption (csum mismatch)
>>> 2) 2nd mirror corruption(DUP/RAID0/10) or parity error(RAID5/6)
>>>
>>> For btrfsck to check all mirror and data, you could try out-of-tree
>>> offline scrub patchset:
>>> https://github.com/adam900710/btrfs-progs/tree/fsck_scrub
>>>
>>> Which implements the kernel scrub equivalent in btrfs-progs.
>>
>> I see, thanks for the answer.
>> Note that this is very confusing to the end user.
>> If check --repair returns success, the filesystem should be clean.
>> Hopefully that patchset can be included in btrfs-progs
>>
>> But sure enough, I'm seeing a lot of these:
>> BTRFS warning (device dm-6): checksum error at logical 269783986176 on dev /dev/mapper/crypt_bcache2, sector 529035384, root 16755, inode 1225897, offset 77824, length 4096, links 5 (path: magic/20150624/home/merlin/public_html/rig3/img/thumb800_302_1-Wire.jpg)
>
> So, I ran check --repair, then I ran scrub, and I deleted all the files
> that were referenced by pathname and failed scrub.
> Now I have this:
> BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785128960 on dev /dev/mapper/crypt_bcache2
> BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1545, gen 0
> BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785133056 on dev /dev/mapper/crypt_bcache2
> BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1546, gen 0
> BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785137152 on dev /dev/mapper/crypt_bcache2
> BTRFS warning (device dm-6): checksum error at logical 269784580096 on dev /dev/mapper/crypt_bcache2, sector 529036544, root 17564, inode 1225903, offset 16384: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269784584192 on dev /dev/mapper/crypt_bcache2, sector 529036552, root 17564, inode 1225903, offset 20480: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269784588288 on dev /dev/mapper/crypt_bcache2, sector 529036560, root 17564, inode 1225903, offset 24576: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269784592384 on dev /dev/mapper/crypt_bcache2, sector 529036568, root 17564, inode 1225903, offset 28672: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269784596480 on dev /dev/mapper/crypt_bcache2, sector 529036576, root 17564, inode 1225903, offset 32768: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269784600576 on dev /dev/mapper/crypt_bcache2, sector 529036584, root 17564, inode 1225903, offset 36864: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269784604672 on dev /dev/mapper/crypt_bcache2, sector 529036592, root 17564, inode 1225903, offset 40960: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269784608768 on dev /dev/mapper/crypt_bcache2, sector 529036600, root 17564, inode 1225903, offset 45056: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269784612864 on dev /dev/mapper/crypt_bcache2, sector 529036608, root 17564, inode 1225903, offset 49152: path resolving failed with ret=-2
>
> How am I supposed to deal with those?
These look like orphan inodes.
Btrfs doesn't remove all the contents of an inode at rm time.
It just unlinks the inode and puts it into a state called orphan
inode (it can no longer be referenced from any directory).
It then frees its data extents over the next several transactions.
Try to find these inodes by inode number in the specified subvolume.
If they are not found, they are orphan inodes and nothing to worry about.
These bad data extents will disappear sooner or later.
Or you can use "btrfs fi sync" to make sure orphan inodes are really
removed from the tree.
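The lookup described above can be sketched roughly as follows. This is
only a sketch under assumptions: the function names are made up for
illustration, the top-level subvolume (subvolid 5) is assumed to be
mounted at the given mount point, and the "root" number in the scrub
warning is taken to be the ID of the subvolume in which the inode number
should be searched.

```sh
#!/bin/sh
# Extract "<root> <inode>" from one scrub checksum warning line.
parse_scrub_warning() {
    printf '%s\n' "$1" |
        sed -n 's/.*root \([0-9][0-9]*\), inode \([0-9][0-9]*\),.*/\1 \2/p'
}

# Hypothetical helper: look up an inode inside one subvolume only.
# Assumes $1 is where the top-level subvolume is mounted, because
# subvolid-resolve prints a path relative to the top level.
resolve_in_subvol() {
    mnt=$1 root=$2 inode=$3
    subvol=$(btrfs inspect-internal subvolid-resolve "$root" "$mnt") || return 1
    btrfs inspect-internal inode-resolve "$inode" "$mnt/$subvol"
}
```

For example, "resolve_in_subvol /mnt/mnt 17564 1225903" would search
subvolume 17564 only. If inode-resolve finds no path there, that would be
consistent with the orphan inode explanation.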
Thanks,
Qu
>
> Thanks,
> Marc
>
* Re: when btrfs scrub reports errors and btrfs check --repair does not
2016-11-11 3:55 ` Qu Wenruo
@ 2016-11-12 3:17 ` Marc MERLIN
2016-11-13 15:06 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-12 3:17 UTC (permalink / raw)
To: Qu Wenruo; +Cc: David Sterba, Hugo Mills, linux-btrfs
On Fri, Nov 11, 2016 at 11:55:21AM +0800, Qu Wenruo wrote:
> These look like orphan inodes.
> Btrfs doesn't remove all the contents of an inode at rm time.
> It just unlinks the inode and puts it into a state called orphan inode (it can
> no longer be referenced from any directory).
BTRFS warning (device dm-6): checksum error at logical 269783928832 on dev /dev/mapper/crypt_bcache2, sector 529035272, root 17564, inode 1225897, offset 20480: path resolving failed with ret=-2
BTRFS warning (device dm-6): checksum error at logical 269783932928 on dev /dev/mapper/crypt_bcache2, sector 529035280, root 17564, inode 1225897, offset 24576: path resolving failed with ret=-2
Do you mean I should be using find /mnt/mnt -inum ?
Well, how about that, you're right:
gargamel:/mnt/mnt/DS2/backup# find /mnt/mnt -inum 1225897
/mnt/mnt/DS2/backup/debian64_rw.20160713_03:21:57/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y678z6.jpg
So basically the breakage in my filesystem is enough that the backlink
from the inode to the pathname is gone? That's not good :-/
> It then frees its data extents over the next several transactions.
>
> Try to find these inodes by inode number in the specified subvolume.
> If they are not found, they are orphan inodes and nothing to worry about.
> These bad data extents will disappear sooner or later.
>
> Or you can use "btrfs fi sync" to make sure orphan inodes are really removed
> from the tree.
So, I ran btrfs fi sync /mnt/mnt, but it returned instantly.
A scrub after that still returns:
btrfs scrub start -Bd /mnt/mnt
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1793, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785628672 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1794, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269784580096 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1795, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785632768 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1796, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785104384 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1797, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269784584192 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1798, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785636864 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1799, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785108480 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1800, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269784588288 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1801, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269784055808 on dev /dev/mapper/crypt_bcache2
BTRFS error (device dm-6): bdev /dev/mapper/crypt_bcache2 errs: wr 0, rd 0, flush 0, corrupt 1802, gen 0
BTRFS error (device dm-6): unable to fixup (regular) error at logical 269785640960 on dev /dev/mapper/crypt_bcache2
What am I supposed to do about these? I'm not even clear on where this
corruption is located or how to clear it.
I understand you're saying that this does not seem to affect any
remaining data, but if scrub is not clean, can't even tell which
file an inode is linked to, and that inode still isn't cleaned up 2 days
later, then my filesystem is in a bad state that check --repair should fix,
is it not?
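One way to get from the "unable to fixup" lines to something more
concrete, sketched under assumptions: collect the logical addresses from
the kernel log and hand each one to btrfs inspect-internal
logical-resolve, which tries to map a logical address back to file paths.
The function names are hypothetical, and this sketch cannot promise that
resolution succeeds for extents that no longer belong to any reachable
file.

```sh
#!/bin/sh
# Read kernel log lines on stdin and print each distinct logical
# address that scrub could not fix, one per line, sorted numerically.
extract_logicals() {
    sed -n 's/.*unable to fixup (regular) error at logical \([0-9][0-9]*\) on dev.*/\1/p' |
        sort -un
}

# Hypothetical helper: try to map each logical address read from stdin
# back to file paths on the filesystem mounted at $1.
resolve_logicals() {
    mnt=$1
    while read -r logical; do
        printf '== logical %s\n' "$logical"
        btrfs inspect-internal logical-resolve "$logical" "$mnt" ||
            printf 'no path found for %s\n' "$logical"
    done
}
```

A possible invocation would be
"dmesg | extract_logicals | resolve_logicals /mnt/mnt".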
Yes, I can wipe it and start over, but I'm trying to use this as a
learning experience as well as seeing if the tools are working as they
should.
Thanks,
Marc
* Re: when btrfs scrub reports errors and btrfs check --repair does not
2016-11-12 3:17 ` when btrfs scrub reports errors and btrfs check --repair does not Marc MERLIN
@ 2016-11-13 15:06 ` Marc MERLIN
2016-11-13 15:13 ` Roman Mamedov
0 siblings, 1 reply; 40+ messages in thread
From: Marc MERLIN @ 2016-11-13 15:06 UTC (permalink / raw)
To: Qu Wenruo; +Cc: David Sterba, Hugo Mills, linux-btrfs
On Fri, Nov 11, 2016 at 07:17:08PM -0800, Marc MERLIN wrote:
> On Fri, Nov 11, 2016 at 11:55:21AM +0800, Qu Wenruo wrote:
> > These look like orphan inodes.
> > Btrfs doesn't remove all the contents of an inode at rm time.
> > It just unlinks the inode and puts it into a state called orphan inode (it can
> > no longer be referenced from any directory).
>
> BTRFS warning (device dm-6): checksum error at logical 269783928832 on dev /dev/mapper/crypt_bcache2, sector 529035272, root 17564, inode 1225897, offset 20480: path resolving failed with ret=-2
> BTRFS warning (device dm-6): checksum error at logical 269783932928 on dev /dev/mapper/crypt_bcache2, sector 529035280, root 17564, inode 1225897, offset 24576: path resolving failed with ret=-2
>
> Do you mean I should be using find /mnt/mnt -inum ?
> Well, how about that, you're right:
> gargamel:/mnt/mnt/DS2/backup# find /mnt/mnt -inum 1225897
> /mnt/mnt/DS2/backup/debian64_rw.20160713_03:21:57/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y678z6.jpg
> So basically the breakage in my filesystem is enough that the backlink
> from the inode to the pathname is gone? That's not good :-/
Mmmn, I've been doing find -inum, deleting hits, and running scrub, but
scrub still fails with more errors, and now I'm seeing this:
gargamel:~# find /mnt/mnt -inum 1225897
/mnt/mnt/DS2/backup/ubuntu_rw.20160713_03:25:42/gandalfthegrey/20100718/var/local/www/Pix/albums/Trips/200509_Malaysia/500_KapalaiIsland/BestOf/33_Diving-Dive5-2_139.jpg
/mnt/mnt/DS2/backup/debian64_ro.20160720_02:58:38/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y678z6.jpg
/mnt/mnt/DS2/backup/debian64_ro.20160720_02:58:38/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y679z6.jpg
(...)
/mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y678z6.jpg
/mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y679z6.jpg
/mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y81z9.jpg
And then I see this:
gargamel:~# ls -li /mnt/mnt/DS2/backup/ubuntu_rw.20160713_03:25:42/gandalfthegrey/20100718/var/local/www/Pix/albums/Trips/200509_Malaysia/500_KapalaiIsland/BestOf/33_Diving-Dive5-2_139.jpg /mnt/mnt/DS2/backup/debian64_ro.20160720_02:58:38/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y678z6.jpg /mnt/mnt/DS2/backup/debian64_ro.20160720_02:58:38/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y679z6.jpg /mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y678z6.jpg /mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y679z6.jpg /mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y81z9.jpg
1225897 -rw-r--r-- 5 merlin merlin 13794 Jan 7 2012 /mnt/mnt/DS2/backup/debian64_ro.20160720_02:58:38/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y678z6.jpg
1225898 -rw-r--r-- 5 merlin merlin 13048 Jan 7 2012 /mnt/mnt/DS2/backup/debian64_ro.20160720_02:58:38/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y679z6.jpg
1225897 -rw-r--r-- 5 merlin merlin 13794 Jan 7 2012 /mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y678z6.jpg
1225898 -rw-r--r-- 5 merlin merlin 13048 Jan 7 2012 /mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y679z6.jpg
1225913 -rw-r--r-- 5 merlin merlin 15247 Jan 7 2012 /mnt/mnt/DS2/backup/debian64_rw.20160727_02:59:03/gandalfthegreat/20120409/home/merlin/public_html/mirrors/rwf/vfrcharts/x9y81z9.jpg
1225897 lrwxrwxrwx 1 merlin merlin 35 Aug 1 2010 /mnt/mnt/DS2/backup/ubuntu_rw.20160713_03:25:42/gandalfthegrey/20100718/var/local/www/Pix/albums/Trips/200509_Malaysia/500_KapalaiIsland/BestOf/33_Diving-Dive5-2_139.jpg -> ../33_Diving/BestOf/Dive5-2_139.jpg
So first:
a) find -inum returns some inodes that don't match
b) but argh, multiple (very different) files have the same inode number, so finding
files by inode number after scrub has flagged an inode as bad isn't going to work :(
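As background for b), not as a diagnosis of this particular filesystem:
btrfs inode numbers are only unique within one subvolume, and every
snapshot is its own subvolume, so a tree-wide find -inum can match
unrelated files. A minimal sketch of narrowing the search follows. It
assumes GNU stat and find, and that each subvolume reports its own st_dev
so that -xdev stops at subvolume boundaries; the function names are
hypothetical.

```sh
#!/bin/sh
# Print "<st_dev>:<st_ino>" for a path. The pair, not the inode
# number alone, identifies a file across subvolumes.
file_identity() {
    stat -c '%d:%i' -- "$1"
}

# Search a single subvolume for an inode number. -xdev keeps find
# from descending into nested subvolumes or other mounts.
find_inode_in_subvol() {
    subvol_path=$1 inode=$2
    find "$subvol_path" -xdev -inum "$inode"
}
```

Two paths that report the same inode number but a different
file_identity are different files that merely share a number.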
At this point, I'm starting to lose patience (and running out of time),
so I'm going to wipe this filesystem after I hear back from you, but
basically scrub and repair are still not up to what they should be IMO
(as per my previous comment):
One should be able to fully repair an unclean filesystem with check --repair, and scrub should
give me things I can either fix by hand (delete the corrupt file) or
that check --repair would fix, and neither is true here.
Marc
* Re: when btrfs scrub reports errors and btrfs check --repair does not
2016-11-13 15:06 ` Marc MERLIN
@ 2016-11-13 15:13 ` Roman Mamedov
2016-11-13 15:52 ` Marc MERLIN
0 siblings, 1 reply; 40+ messages in thread
From: Roman Mamedov @ 2016-11-13 15:13 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-btrfs
On Sun, 13 Nov 2016 07:06:30 -0800
Marc MERLIN <marc@merlins.org> wrote:
> So first:
> a) find -inum returns some inodes that don't match
> b) but argh, multiple (very different) files have the same inode number, so finding
> files by inode number after scrub has flagged an inode as bad isn't going to work :(
I wonder why you even need scrub to verify file readability. Just try
reading all files with e.g. "cfv -Crr"; the read errors produced will
point you directly to the unreadable files, without the need to look
them up backwards via inum. Then just restore those from backups.
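The same idea can be sketched without cfv. The sketch below simply reads
every regular file and prints the ones that fail; the function name is
hypothetical, and it assumes that a failed read surfaces as a non-zero
exit status from cat.

```sh
#!/bin/sh
# Print every regular file under $1 that cannot be read end to end.
list_unreadable() {
    find "$1" -type f -exec sh -c '
        for f do
            # Discard the data; only the success of the read matters.
            cat -- "$f" >/dev/null 2>&1 || printf "%s\n" "$f"
        done' sh {} +
}
```

For example, "list_unreadable /mnt/mnt > /tmp/unreadable.txt" would
collect candidate files to restore from backups.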
--
With respect,
Roman
* Re: when btrfs scrub reports errors and btrfs check --repair does not
2016-11-13 15:13 ` Roman Mamedov
@ 2016-11-13 15:52 ` Marc MERLIN
0 siblings, 0 replies; 40+ messages in thread
From: Marc MERLIN @ 2016-11-13 15:52 UTC (permalink / raw)
To: Roman Mamedov; +Cc: linux-btrfs
On Sun, Nov 13, 2016 at 08:13:29PM +0500, Roman Mamedov wrote:
> On Sun, 13 Nov 2016 07:06:30 -0800
> Marc MERLIN <marc@merlins.org> wrote:
>
> > So first:
> > a) find -inum returns some inodes that don't match
> > b) but argh, multiple (very different) files have the same inode number, so finding
> > files by inode number after scrub has flagged an inode as bad isn't going to work :(
>
> I wonder why you even need scrub to verify file readability. Just try
> reading all files with e.g. "cfv -Crr"; the read errors produced will
> point you directly to the unreadable files, without the need to look
> them up backwards via inum. Then just restore those from backups.
I could read the files, but we're talking about maybe 100 million files?
That would take a while... (and most of them are COW copies of the same
physical data), so scrub is _much_ faster.
Scrub is also reporting issues that seem related not to files but to data
structures, while repair is not finding them.
As for the data, it's a backup device, so I can just wipe it, but again,
I'm using this as an example of how I would simply bring a drive back to
a clean state, and that's not pretty right now.
Marc
Thread overview: 40+ messages
2016-10-30 18:34 btrfs check --repair: ERROR: cannot read chunk root Marc MERLIN
2016-10-31 1:02 ` Qu Wenruo
2016-10-31 2:06 ` Marc MERLIN
2016-10-31 4:21 ` Marc MERLIN
2016-10-31 5:27 ` Qu Wenruo
2016-10-31 5:47 ` Marc MERLIN
2016-10-31 6:04 ` Qu Wenruo
2016-10-31 6:25 ` Marc MERLIN
2016-10-31 6:32 ` Qu Wenruo
2016-10-31 6:37 ` Marc MERLIN
2016-10-31 7:04 ` Qu Wenruo
2016-10-31 8:44 ` Hugo Mills
2016-10-31 15:04 ` Marc MERLIN
2016-11-01 3:48 ` Marc MERLIN
2016-11-01 4:13 ` Qu Wenruo
2016-11-01 4:21 ` Marc MERLIN
2016-11-04 8:01 ` Marc MERLIN
2016-11-04 9:00 ` Roman Mamedov
2016-11-04 17:59 ` Marc MERLIN
2016-11-07 1:11 ` Qu Wenruo
[not found] ` <87lgwwnnyf.fsf@notabene.neil.brown.name>
2016-11-07 1:20 ` clearing blocks wrongfully marked as bad if --update=no-bbl can't be used? Marc MERLIN
2016-11-07 1:39 ` Qu Wenruo
2016-11-07 4:18 ` Qu Wenruo
2016-11-07 5:36 ` btrfs support for filesystems >8TB on 32bit architectures Marc MERLIN
2016-11-07 6:16 ` Qu Wenruo
2016-11-07 14:55 ` Marc MERLIN
2016-11-08 0:35 ` Qu Wenruo
2016-11-08 0:39 ` Marc MERLIN
2016-11-08 0:43 ` Qu Wenruo
2016-11-08 1:06 ` Marc MERLIN
2016-11-08 1:17 ` Qu Wenruo
2016-11-08 15:24 ` Marc MERLIN
2016-11-09 1:50 ` Qu Wenruo
2016-11-09 2:05 ` Marc MERLIN
2016-11-11 3:48 ` Marc MERLIN
2016-11-11 3:55 ` Qu Wenruo
2016-11-12 3:17 ` when btrfs scrub reports errors and btrfs check --repair does not Marc MERLIN
2016-11-13 15:06 ` Marc MERLIN
2016-11-13 15:13 ` Roman Mamedov
2016-11-13 15:52 ` Marc MERLIN