* Recovery options for FS forced readonly due to 3.17 snapshot bug
@ 2015-01-20 12:15 brett.king
2015-01-20 12:27 ` Hugo Mills
2015-01-20 12:40 ` Filipe David Manana
0 siblings, 2 replies; 5+ messages in thread
From: brett.king @ 2015-01-20 12:15 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
Hi,
My FS has been forced readonly by the early 3.17 snapshot bug. After much reading, I'm looking for validation of some recovery scenarios:
1) btrfsck --repair under a later kernel.
2) replacing the devices one by one under a later kernel, effectively removing the corruption.
3) Just copying data across from the readonly source to a new FS created under a later kernel.
Option 1 with check only (not repair) segfaults and coredumps on 3.18 (see below) .. are there some other options needed here or perhaps patches pending to rectify this ?
Option 2 would be better than nothing, as I get to keep my snapshots and don't need any more surplus drives than 2x units of the largest size (data is RAID1).
Option 3 is a fallback and basically forfeits the source FS including snapshots, is guaranteed to work (I can read the data fine), however requires 1:1 capacity in new disks to achieve.
So being unable to complete a btrfsck check and generally scared off by a btrfsck -repair, will option 2 work ? I.E. can I still replace devices on a readonly FS ?
And are there any other options ?
Note the FS is very nearly full of data - I did also get continuous disk activity due to this high utilisation triggering the endless reclaim unused extent routine, for a few days prior to it being forced readonly.
[root@array ~]# uname -r
3.18.2-200.fc21.x86_64
(upgraded since getting the issue)
[root@array ~]# btrfsck /dev/sdm
<snip>
parent transid verify failed on 40198674350080 wanted 1623837 found 1622986
Ignoring transid failure
Segmentation fault (core dumped)
[root@array ~]# btrfs fi sh /export/archive/
Label: 'archive' uuid: 22c7663a-93ca-40a6-9491-26abaa62b924
Total devices 8 FS bytes used 18.07TiB
devid 1 size 5.46TiB used 5.46TiB path /dev/sdm
devid 2 size 3.64TiB used 3.64TiB path /dev/sdc
devid 3 size 3.64TiB used 3.64TiB path /dev/sdd
devid 4 size 5.46TiB used 5.46TiB path /dev/sdl
devid 5 size 4.55TiB used 4.55TiB path /dev/sdb
devid 8 size 4.55TiB used 4.55TiB path /dev/sdj
devid 9 size 4.55TiB used 4.55TiB path /dev/sdk
devid 10 size 4.55TiB used 4.55TiB path /dev/sde
Btrfs v3.18.1
[root@array ~]# df -h /export/archive/
Filesystem Size Used Avail Use% Mounted on
/dev/sdm 19T 19T 123G 100% /export/archive
[root@array ~]# btrfs fi df /export/archive/
Data, RAID1: total=18.17TiB, used=18.05TiB
System, RAID1: total=32.00MiB, used=3.00MiB
Metadata, RAID1: total=23.00GiB, used=22.49GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
[root@array ~]# dmesg |grep -i btrfs |less
[ 2.044134] Btrfs loaded
[ 2.046327] BTRFS: device label array devid 1 transid 289965 /dev/sda4
[ 2.472321] BTRFS info (device sda4): disk space caching is enabled
[ 2.513784] BTRFS: detected SSD devices, enabling SSD mode
[ 2.985877] BTRFS info (device sda4): disk space caching is enabled
[ 3.520737] BTRFS: device label archive devid 5 transid 1623944 /dev/sdf
[ 3.544394] BTRFS: device label archive devid 3 transid 1623944 /dev/sdh
[ 3.544553] BTRFS: device label archive devid 10 transid 1623944 /dev/sdi
[ 3.544957] BTRFS: device label archive devid 2 transid 1623944 /dev/sdg
[ 3.614636] BTRFS: device label archive devid 9 transid 1623944 /dev/sdc
[ 3.663006] BTRFS: device label archive devid 8 transid 1623944 /dev/sdb
[ 3.663092] BTRFS: device label archive devid 4 transid 1623944 /dev/sdd
[ 3.685397] BTRFS: device label archive devid 1 transid 1623944 /dev/sde
[ 3.701732] BTRFS info (device sde): disk space caching is enabled
[ 3.903413] BTRFS: bdev /dev/sde errs: wr 2805, rd 1548, flush 0, corrupt 0, gen 0
[ 3.903421] BTRFS: bdev /dev/sdd errs: wr 2805, rd 0, flush 0, corrupt 0, gen 0
[ 3.903426] BTRFS: bdev /dev/sdb errs: wr 2805, rd 33, flush 0, corrupt 0, gen 0
[ 3.903430] BTRFS: bdev /dev/sdc errs: wr 2806, rd 0, flush 0, corrupt 0, gen 0
[ 1432.231219] BTRFS: space cache generation (1623006) does not match inode (1623943)
[ 1432.231283] BTRFS warning (device sde): failed to load free space cache for block group 57021632675840, rebuild it now
[ 1434.603652] BTRFS (device sde): parent transid verify failed on 57022998740992 wanted 1623896 found 1622991
[ 1434.608726] BTRFS (device sde): parent transid verify failed on 57022998740992 wanted 1623896 found 1622991
[ 1434.919115] BTRFS (device sde): parent transid verify failed on 57022584373248 wanted 1623923 found 1622990
[ 1434.937616] BTRFS (device sde): parent transid verify failed on 57022584373248 wanted 1623923 found 1622990
[ 1434.937658] WARNING: CPU: 0 PID: 1571 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x54/0x120 [btrfs]()
[ 1434.937660] BTRFS: Transaction aborted (error -5)
[ 1434.937662] Modules linked in: binfmt_misc pl2303 iTCO_wdt iTCO_vendor_support intel_rapl x86_pkg_temp_thermal coretemp kvm_intel vfat fat kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_controller mei_me snd_hda_codec mei ir_xmp_decoder ir_lirc_codec snd_seq lirc_dev ir_mce_kbd_decoder ir_sharp_decoder ir_sanyo_decoder i2c_i801 ir_sony_decoder lpc_ich ir_jvc_decoder snd_hwdep mfd_core ir_rc6_decoder ir_rc5_decoder ir_nec_decoder snd_seq_device snd_pcm i2c_hid tpm_tis tpm rc_rc6_mce i2c_designware_platform nuvoton_cir dw_dmac i2c_designware_core dw_dmac_core rc_core snd_timer snd soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs xor i915 raid6_pq e1000e i2c_algo_bit drm_kms_helper drm
[ 1434.937732] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[ 1434.937779] [<ffffffffa026c284>] __btrfs_abort_transaction+0x54/0x120 [btrfs]
[ 1434.937795] [<ffffffffa028a38a>] btrfs_run_delayed_refs.part.64+0x12a/0x280 [btrfs]
[ 1434.937811] [<ffffffffa028a4f7>] btrfs_run_delayed_refs+0x17/0x20 [btrfs]
[ 1434.937829] [<ffffffffa029b5c7>] btrfs_commit_transaction+0x407/0xa20 [btrfs]
[ 1434.937843] [<ffffffffa02823fa>] flush_space+0x9a/0x4e0 [btrfs]
[ 1434.937857] [<ffffffffa0281e90>] ? btrfs_get_alloc_profile+0x30/0x40 [btrfs]
[ 1434.937870] [<ffffffffa02822c4>] ? can_overcommit+0x54/0xf0 [btrfs]
[ 1434.937883] [<ffffffffa0282925>] btrfs_async_reclaim_metadata_space+0xe5/0x200 [btrfs]
[ 1434.937920] BTRFS: error (device sde) in btrfs_run_delayed_refs:2792: errno=-5 IO failure
[ 1434.937972] BTRFS info (device sde): forced readonly
[ 1434.937975] BTRFS warning (device sde): Skipping commit of aborted transaction.
[ 1434.937978] BTRFS: error (device sde) in cleanup_transaction:1607: errno=-5 IO failure
[ 1435.424679] BTRFS (device sde): parent transid verify failed on 29792056147968 wanted 1623936 found 1623004
[ 1435.522805] BTRFS (device sde): parent transid verify failed on 29792056999936 wanted 1623936 found 1623004
[ 1435.565828] BTRFS (device sde): parent transid verify failed on 29792057737216 wanted 1623936 found 1623004
[ 1435.651292] BTRFS (device sde): parent transid verify failed on 29790951227392 wanted 1623935 found 1623003
[ 1435.718304] BTRFS (device sde): parent transid verify failed on 29790914019328 wanted 1623935 found 1623003
[ 1435.754679] BTRFS (device sde): parent transid verify failed on 29790914592768 wanted 1623935 found 1623003
[ 1439.616104] BTRFS (device sde): parent transid verify failed on 29793154777088 wanted 1623936 found 1623005
[ 1439.622553] BTRFS (device sde): parent transid verify failed on 29796082139136 wanted 1623939 found 1622980
[ 1439.635702] BTRFS (device sde): parent transid verify failed on 29795165683712 wanted 1623938 found 1623007
[ 1439.639677] BTRFS (device sde): parent transid verify failed on 29796037656576 wanted 1623939 found 1622980
[ 1439.644303] BTRFS (device sde): parent transid verify failed on 29792017874944 wanted 1623936 found 1623004
[ 1439.653535] BTRFS (device sde): parent transid verify failed on 29795102965760 wanted 1623939 found 1623007
[ 1439.663393] BTRFS (device sde): parent transid verify failed on 29793354055680 wanted 1623937 found 1623005
[ 1439.666222] BTRFS (device sde): parent transid verify failed on 29795590062080 wanted 1623939 found 1623007
[ 1439.677759] BTRFS (device sde): parent transid verify failed on 29798681542656 wanted 1623943 found 1622983
[ 1439.693965] BTRFS (device sde): parent transid verify failed on 29795888676864 wanted 1623939 found 1623007
[ 1444.624351] BTRFS (device sde): parent transid verify failed on 29799304413184 wanted 1623943 found 1622984
[ 1444.640856] BTRFS (device sde): parent transid verify failed on 29794051571712 wanted 1623937 found 1623005
[ 1444.654402] BTRFS (device sde): parent transid verify failed on 29796307681280 wanted 1623940 found 1622980
[ 1444.667444] BTRFS (device sde): parent transid verify failed on 29796331765760 wanted 1623940 found 1622980
[ 1444.677949] BTRFS (device sde): parent transid verify failed on 29795524362240 wanted 1623939 found 1623007
[ 1444.687418] BTRFS (device sde): parent transid verify failed on 29799375421440 wanted 1623943 found 1622984
[ 1444.694611] BTRFS (device sde): parent transid verify failed on 29797253136384 wanted 1623941 found 1622981
[ 1444.708855] BTRFS (device sde): parent transid verify failed on 29796242833408 wanted 1623940 found 1622980
[ 1444.717589] BTRFS (device sde): parent transid verify failed on 29795054223360 wanted 1623939 found 1623007
[ 1444.719838] BTRFS (device sde): parent transid verify failed on 29797253316608 wanted 1623941 found 1622981
The FS mounts RW fine on boot, until a write is actually attempted, at which point it spits these errors - ultimately the write doesn't get committed & the FS gets forced readonly.
Thanks in advance !
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Recovery options for FS forced readonly due to 3.17 snapshot bug
2015-01-20 12:15 brett.king
@ 2015-01-20 12:27 ` Hugo Mills
2015-01-20 12:40 ` Filipe David Manana
1 sibling, 0 replies; 5+ messages in thread
From: Hugo Mills @ 2015-01-20 12:27 UTC (permalink / raw)
To: brett.king; +Cc: linux-btrfs@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 10850 bytes --]
On Tue, Jan 20, 2015 at 11:15:22PM +1100, brett.king@commandict.com.au wrote:
> Hi,
> My FS has been forced readonly by the early 3.17 snapshot bug. After much reading, I'm looking for validation of some recovery scenarios:
>
> 1) btrfsck --repair under a later kernel.
Kernel won't make a difference here, only the version of the btrfs
progs you're using, which should be "as recent as possible", but at
least 3.17 (best to try one of the 3.18.x releases).
It's entirely possible that you've got a different issue as well
which is preventing the btrfs check from repairing the FS.
> 2) replacing the devices one by one under a later kernel, effectively removing the corruption.
This won't work -- the FS would still try to migrate the broken
metadata.
> 3) Just copying data across from the readonly source to a new FS created under a later kernel.
That would work.
> Option 1 with check only (not repair) segfaults and coredumps on 3.18 (see below) .. are there some other options needed here or perhaps patches pending to rectify this ?
>
> Option 2 would be better than nothing, as I get to keep my snapshots and don't need any more surplus drives than 2x units of the largest size (data is RAID1).
>
> Option 3 is a fallback and basically forfeits the source FS including snapshots, is guaranteed to work (I can read the data fine), however requires 1:1 capacity in new disks to achieve.
>
> So being unable to complete a btrfsck check and generally scared off by a btrfsck -repair, will option 2 work ? I.E. can I still replace devices on a readonly FS ?
>
> And are there any other options ?
>
> Note the FS is very nearly full of data - I did also get continuous disk activity due to this high utilisation triggering the endless reclaim unused extent routine, for a few days prior to it being forced readonly.
>
> [root@array ~]# uname -r
> 3.18.2-200.fc21.x86_64
>
> (upgraded since getting the issue)
>
> [root@array ~]# btrfsck /dev/sdm
> <snip>
> parent transid verify failed on 40198674350080 wanted 1623837 found 1622986
> Ignoring transid failure
> Segmentation fault (core dumped)
Even at its worst, this should not be segfaulting. This should be
considered a bug in btrfs check at the very least.
Hugo.
> [root@array ~]# btrfs fi sh /export/archive/
> Label: 'archive' uuid: 22c7663a-93ca-40a6-9491-26abaa62b924
> Total devices 8 FS bytes used 18.07TiB
> devid 1 size 5.46TiB used 5.46TiB path /dev/sdm
> devid 2 size 3.64TiB used 3.64TiB path /dev/sdc
> devid 3 size 3.64TiB used 3.64TiB path /dev/sdd
> devid 4 size 5.46TiB used 5.46TiB path /dev/sdl
> devid 5 size 4.55TiB used 4.55TiB path /dev/sdb
> devid 8 size 4.55TiB used 4.55TiB path /dev/sdj
> devid 9 size 4.55TiB used 4.55TiB path /dev/sdk
> devid 10 size 4.55TiB used 4.55TiB path /dev/sde
>
> Btrfs v3.18.1
>
> [root@array ~]# df -h /export/archive/
> Filesystem Size Used Avail Use% Mounted on
> /dev/sdm 19T 19T 123G 100% /export/archive
>
> [root@array ~]# btrfs fi df /export/archive/
> Data, RAID1: total=18.17TiB, used=18.05TiB
> System, RAID1: total=32.00MiB, used=3.00MiB
> Metadata, RAID1: total=23.00GiB, used=22.49GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> [root@array ~]# dmesg |grep -i btrfs |less
> [ 2.044134] Btrfs loaded
> [ 2.046327] BTRFS: device label array devid 1 transid 289965 /dev/sda4
> [ 2.472321] BTRFS info (device sda4): disk space caching is enabled
> [ 2.513784] BTRFS: detected SSD devices, enabling SSD mode
> [ 2.985877] BTRFS info (device sda4): disk space caching is enabled
> [ 3.520737] BTRFS: device label archive devid 5 transid 1623944 /dev/sdf
> [ 3.544394] BTRFS: device label archive devid 3 transid 1623944 /dev/sdh
> [ 3.544553] BTRFS: device label archive devid 10 transid 1623944 /dev/sdi
> [ 3.544957] BTRFS: device label archive devid 2 transid 1623944 /dev/sdg
> [ 3.614636] BTRFS: device label archive devid 9 transid 1623944 /dev/sdc
> [ 3.663006] BTRFS: device label archive devid 8 transid 1623944 /dev/sdb
> [ 3.663092] BTRFS: device label archive devid 4 transid 1623944 /dev/sdd
> [ 3.685397] BTRFS: device label archive devid 1 transid 1623944 /dev/sde
> [ 3.701732] BTRFS info (device sde): disk space caching is enabled
> [ 3.903413] BTRFS: bdev /dev/sde errs: wr 2805, rd 1548, flush 0, corrupt 0, gen 0
> [ 3.903421] BTRFS: bdev /dev/sdd errs: wr 2805, rd 0, flush 0, corrupt 0, gen 0
> [ 3.903426] BTRFS: bdev /dev/sdb errs: wr 2805, rd 33, flush 0, corrupt 0, gen 0
> [ 3.903430] BTRFS: bdev /dev/sdc errs: wr 2806, rd 0, flush 0, corrupt 0, gen 0
> [ 1432.231219] BTRFS: space cache generation (1623006) does not match inode (1623943)
> [ 1432.231283] BTRFS warning (device sde): failed to load free space cache for block group 57021632675840, rebuild it now
> [ 1434.603652] BTRFS (device sde): parent transid verify failed on 57022998740992 wanted 1623896 found 1622991
> [ 1434.608726] BTRFS (device sde): parent transid verify failed on 57022998740992 wanted 1623896 found 1622991
> [ 1434.919115] BTRFS (device sde): parent transid verify failed on 57022584373248 wanted 1623923 found 1622990
> [ 1434.937616] BTRFS (device sde): parent transid verify failed on 57022584373248 wanted 1623923 found 1622990
> [ 1434.937658] WARNING: CPU: 0 PID: 1571 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x54/0x120 [btrfs]()
> [ 1434.937660] BTRFS: Transaction aborted (error -5)
> [ 1434.937662] Modules linked in: binfmt_misc pl2303 iTCO_wdt iTCO_vendor_support intel_rapl x86_pkg_temp_thermal coretemp kvm_intel vfat fat kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_controller mei_me snd_hda_codec mei ir_xmp_decoder ir_lirc_codec snd_seq lirc_dev ir_mce_kbd_decoder ir_sharp_decoder ir_sanyo_decoder i2c_i801 ir_sony_decoder lpc_ich ir_jvc_decoder snd_hwdep mfd_core ir_rc6_decoder ir_rc5_decoder ir_nec_decoder snd_seq_device snd_pcm i2c_hid tpm_tis tpm rc_rc6_mce i2c_designware_platform nuvoton_cir dw_dmac i2c_designware_core dw_dmac_core rc_core snd_timer snd soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs xor i915 raid6_pq e1000e i2c_algo_bit drm_kms_helper drm
> [ 1434.937732] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
> [ 1434.937779] [<ffffffffa026c284>] __btrfs_abort_transaction+0x54/0x120 [btrfs]
> [ 1434.937795] [<ffffffffa028a38a>] btrfs_run_delayed_refs.part.64+0x12a/0x280 [btrfs]
> [ 1434.937811] [<ffffffffa028a4f7>] btrfs_run_delayed_refs+0x17/0x20 [btrfs]
> [ 1434.937829] [<ffffffffa029b5c7>] btrfs_commit_transaction+0x407/0xa20 [btrfs]
> [ 1434.937843] [<ffffffffa02823fa>] flush_space+0x9a/0x4e0 [btrfs]
> [ 1434.937857] [<ffffffffa0281e90>] ? btrfs_get_alloc_profile+0x30/0x40 [btrfs]
> [ 1434.937870] [<ffffffffa02822c4>] ? can_overcommit+0x54/0xf0 [btrfs]
> [ 1434.937883] [<ffffffffa0282925>] btrfs_async_reclaim_metadata_space+0xe5/0x200 [btrfs]
> [ 1434.937920] BTRFS: error (device sde) in btrfs_run_delayed_refs:2792: errno=-5 IO failure
> [ 1434.937972] BTRFS info (device sde): forced readonly
> [ 1434.937975] BTRFS warning (device sde): Skipping commit of aborted transaction.
> [ 1434.937978] BTRFS: error (device sde) in cleanup_transaction:1607: errno=-5 IO failure
> [ 1435.424679] BTRFS (device sde): parent transid verify failed on 29792056147968 wanted 1623936 found 1623004
> [ 1435.522805] BTRFS (device sde): parent transid verify failed on 29792056999936 wanted 1623936 found 1623004
> [ 1435.565828] BTRFS (device sde): parent transid verify failed on 29792057737216 wanted 1623936 found 1623004
> [ 1435.651292] BTRFS (device sde): parent transid verify failed on 29790951227392 wanted 1623935 found 1623003
> [ 1435.718304] BTRFS (device sde): parent transid verify failed on 29790914019328 wanted 1623935 found 1623003
> [ 1435.754679] BTRFS (device sde): parent transid verify failed on 29790914592768 wanted 1623935 found 1623003
> [ 1439.616104] BTRFS (device sde): parent transid verify failed on 29793154777088 wanted 1623936 found 1623005
> [ 1439.622553] BTRFS (device sde): parent transid verify failed on 29796082139136 wanted 1623939 found 1622980
> [ 1439.635702] BTRFS (device sde): parent transid verify failed on 29795165683712 wanted 1623938 found 1623007
> [ 1439.639677] BTRFS (device sde): parent transid verify failed on 29796037656576 wanted 1623939 found 1622980
> [ 1439.644303] BTRFS (device sde): parent transid verify failed on 29792017874944 wanted 1623936 found 1623004
> [ 1439.653535] BTRFS (device sde): parent transid verify failed on 29795102965760 wanted 1623939 found 1623007
> [ 1439.663393] BTRFS (device sde): parent transid verify failed on 29793354055680 wanted 1623937 found 1623005
> [ 1439.666222] BTRFS (device sde): parent transid verify failed on 29795590062080 wanted 1623939 found 1623007
> [ 1439.677759] BTRFS (device sde): parent transid verify failed on 29798681542656 wanted 1623943 found 1622983
> [ 1439.693965] BTRFS (device sde): parent transid verify failed on 29795888676864 wanted 1623939 found 1623007
> [ 1444.624351] BTRFS (device sde): parent transid verify failed on 29799304413184 wanted 1623943 found 1622984
> [ 1444.640856] BTRFS (device sde): parent transid verify failed on 29794051571712 wanted 1623937 found 1623005
> [ 1444.654402] BTRFS (device sde): parent transid verify failed on 29796307681280 wanted 1623940 found 1622980
> [ 1444.667444] BTRFS (device sde): parent transid verify failed on 29796331765760 wanted 1623940 found 1622980
> [ 1444.677949] BTRFS (device sde): parent transid verify failed on 29795524362240 wanted 1623939 found 1623007
> [ 1444.687418] BTRFS (device sde): parent transid verify failed on 29799375421440 wanted 1623943 found 1622984
> [ 1444.694611] BTRFS (device sde): parent transid verify failed on 29797253136384 wanted 1623941 found 1622981
> [ 1444.708855] BTRFS (device sde): parent transid verify failed on 29796242833408 wanted 1623940 found 1622980
> [ 1444.717589] BTRFS (device sde): parent transid verify failed on 29795054223360 wanted 1623939 found 1623007
> [ 1444.719838] BTRFS (device sde): parent transid verify failed on 29797253316608 wanted 1623941 found 1622981
>
> The FS mounts RW fine on boot, until a write is actually attempted, at which point it spits these errors - ultimately the write doesn't get committed & the FS gets forced readonly.
>
> Thanks in advance !
--
Hugo Mills | My code is never released, it escapes from the git
hugo@... carfax.org.uk | repo and kills a few beta testers on the way out
http://carfax.org.uk/ |
PGP: 65E74AC0 | Diablo-D3
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Recovery options for FS forced readonly due to 3.17 snapshot bug
2015-01-20 12:15 brett.king
2015-01-20 12:27 ` Hugo Mills
@ 2015-01-20 12:40 ` Filipe David Manana
1 sibling, 0 replies; 5+ messages in thread
From: Filipe David Manana @ 2015-01-20 12:40 UTC (permalink / raw)
To: brett.king; +Cc: linux-btrfs@vger.kernel.org
On Tue, Jan 20, 2015 at 12:15 PM, <brett.king@commandict.com.au> wrote:
> Hi,
> My FS has been forced readonly by the early 3.17 snapshot bug. After much reading, I'm looking for validation of some recovery scenarios:
>
> 1) btrfsck --repair under a later kernel.
>
> 2) replacing the devices one by one under a later kernel, effectively removing the corruption.
>
> 3) Just copying data across from the readonly source to a new FS created under a later kernel.
>
> Option 1 with check only (not repair) segfaults and coredumps on 3.18 (see below) .. are there some other options needed here or perhaps patches pending to rectify this ?
>
> Option 2 would be better than nothing, as I get to keep my snapshots and don't need any more surplus drives than 2x units of the largest size (data is RAID1).
>
> Option 3 is a fallback and basically forfeits the source FS including snapshots, is guaranteed to work (I can read the data fine), however requires 1:1 capacity in new disks to achieve.
>
> So being unable to complete a btrfsck check and generally scared off by a btrfsck -repair, will option 2 work ? I.E. can I still replace devices on a readonly FS ?
>
> And are there any other options ?
>
> Note the FS is very nearly full of data - I did also get continuous disk activity due to this high utilisation triggering the endless reclaim unused extent routine, for a few days prior to it being forced readonly.
>
> [root@array ~]# uname -r
> 3.18.2-200.fc21.x86_64
>
> (upgraded since getting the issue)
>
> [root@array ~]# btrfsck /dev/sdm
> <snip>
> parent transid verify failed on 40198674350080 wanted 1623837 found 1622986
> Ignoring transid failure
> Segmentation fault (core dumped)
>
> [root@array ~]# btrfs fi sh /export/archive/
> Label: 'archive' uuid: 22c7663a-93ca-40a6-9491-26abaa62b924
> Total devices 8 FS bytes used 18.07TiB
> devid 1 size 5.46TiB used 5.46TiB path /dev/sdm
> devid 2 size 3.64TiB used 3.64TiB path /dev/sdc
> devid 3 size 3.64TiB used 3.64TiB path /dev/sdd
> devid 4 size 5.46TiB used 5.46TiB path /dev/sdl
> devid 5 size 4.55TiB used 4.55TiB path /dev/sdb
> devid 8 size 4.55TiB used 4.55TiB path /dev/sdj
> devid 9 size 4.55TiB used 4.55TiB path /dev/sdk
> devid 10 size 4.55TiB used 4.55TiB path /dev/sde
>
> Btrfs v3.18.1
>
> [root@array ~]# df -h /export/archive/
> Filesystem Size Used Avail Use% Mounted on
> /dev/sdm 19T 19T 123G 100% /export/archive
>
> [root@array ~]# btrfs fi df /export/archive/
> Data, RAID1: total=18.17TiB, used=18.05TiB
> System, RAID1: total=32.00MiB, used=3.00MiB
> Metadata, RAID1: total=23.00GiB, used=22.49GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> [root@array ~]# dmesg |grep -i btrfs |less
> [ 2.044134] Btrfs loaded
> [ 2.046327] BTRFS: device label array devid 1 transid 289965 /dev/sda4
> [ 2.472321] BTRFS info (device sda4): disk space caching is enabled
> [ 2.513784] BTRFS: detected SSD devices, enabling SSD mode
> [ 2.985877] BTRFS info (device sda4): disk space caching is enabled
> [ 3.520737] BTRFS: device label archive devid 5 transid 1623944 /dev/sdf
> [ 3.544394] BTRFS: device label archive devid 3 transid 1623944 /dev/sdh
> [ 3.544553] BTRFS: device label archive devid 10 transid 1623944 /dev/sdi
> [ 3.544957] BTRFS: device label archive devid 2 transid 1623944 /dev/sdg
> [ 3.614636] BTRFS: device label archive devid 9 transid 1623944 /dev/sdc
> [ 3.663006] BTRFS: device label archive devid 8 transid 1623944 /dev/sdb
> [ 3.663092] BTRFS: device label archive devid 4 transid 1623944 /dev/sdd
> [ 3.685397] BTRFS: device label archive devid 1 transid 1623944 /dev/sde
> [ 3.701732] BTRFS info (device sde): disk space caching is enabled
> [ 3.903413] BTRFS: bdev /dev/sde errs: wr 2805, rd 1548, flush 0, corrupt 0, gen 0
> [ 3.903421] BTRFS: bdev /dev/sdd errs: wr 2805, rd 0, flush 0, corrupt 0, gen 0
> [ 3.903426] BTRFS: bdev /dev/sdb errs: wr 2805, rd 33, flush 0, corrupt 0, gen 0
> [ 3.903430] BTRFS: bdev /dev/sdc errs: wr 2806, rd 0, flush 0, corrupt 0, gen 0
You have device write and read errors here - the snapshots issue with
3.17 wouldn't bump these device error stats.
Even if you were affected by the snapshots issue, you clearly have
here another problem, very likely hardware related due to those stats.
If you are affected by the snapshots issue, btrfs check/btrfsck as of
btrfs-progs v3.17+ should be able to fix it and if you are affected by
it, it reports at a very early stage (without --repair) one or more
messages with the following format:
"root %llu has a root item with a more recent gen (%llu) compared to
the found root node (%llu)\n"
If the snapshots issue from 3.17.0/1 was the only issue you had, those
messages would be reported by fsck before it segfaults (at least for
all cases I've seen).
Also when one is affected by the snapshots issue, it should only cause
problems when reading from the affected readonly snapshots, and not
when running the delayed references code in the trace below.
My best recommendation is to check how healthy your drives are,
perform a backup of whatever is possible (btrfs restore for example)
and then only after run btrfsck with --repair.
> [ 1432.231219] BTRFS: space cache generation (1623006) does not match inode (1623943)
> [ 1432.231283] BTRFS warning (device sde): failed to load free space cache for block group 57021632675840, rebuild it now
> [ 1434.603652] BTRFS (device sde): parent transid verify failed on 57022998740992 wanted 1623896 found 1622991
> [ 1434.608726] BTRFS (device sde): parent transid verify failed on 57022998740992 wanted 1623896 found 1622991
> [ 1434.919115] BTRFS (device sde): parent transid verify failed on 57022584373248 wanted 1623923 found 1622990
> [ 1434.937616] BTRFS (device sde): parent transid verify failed on 57022584373248 wanted 1623923 found 1622990
> [ 1434.937658] WARNING: CPU: 0 PID: 1571 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x54/0x120 [btrfs]()
> [ 1434.937660] BTRFS: Transaction aborted (error -5)
> [ 1434.937662] Modules linked in: binfmt_misc pl2303 iTCO_wdt iTCO_vendor_support intel_rapl x86_pkg_temp_thermal coretemp kvm_intel vfat fat kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_controller mei_me snd_hda_codec mei ir_xmp_decoder ir_lirc_codec snd_seq lirc_dev ir_mce_kbd_decoder ir_sharp_decoder ir_sanyo_decoder i2c_i801 ir_sony_decoder lpc_ich ir_jvc_decoder snd_hwdep mfd_core ir_rc6_decoder ir_rc5_decoder ir_nec_decoder snd_seq_device snd_pcm i2c_hid tpm_tis tpm rc_rc6_mce i2c_designware_platform nuvoton_cir dw_dmac i2c_designware_core dw_dmac_core rc_core snd_timer snd soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs xor i915 raid6_pq e1000e i2c_algo_bit drm_kms_helper drm
> [ 1434.937732] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
> [ 1434.937779] [<ffffffffa026c284>] __btrfs_abort_transaction+0x54/0x120 [btrfs]
> [ 1434.937795] [<ffffffffa028a38a>] btrfs_run_delayed_refs.part.64+0x12a/0x280 [btrfs]
> [ 1434.937811] [<ffffffffa028a4f7>] btrfs_run_delayed_refs+0x17/0x20 [btrfs]
> [ 1434.937829] [<ffffffffa029b5c7>] btrfs_commit_transaction+0x407/0xa20 [btrfs]
> [ 1434.937843] [<ffffffffa02823fa>] flush_space+0x9a/0x4e0 [btrfs]
> [ 1434.937857] [<ffffffffa0281e90>] ? btrfs_get_alloc_profile+0x30/0x40 [btrfs]
> [ 1434.937870] [<ffffffffa02822c4>] ? can_overcommit+0x54/0xf0 [btrfs]
> [ 1434.937883] [<ffffffffa0282925>] btrfs_async_reclaim_metadata_space+0xe5/0x200 [btrfs]
> [ 1434.937920] BTRFS: error (device sde) in btrfs_run_delayed_refs:2792: errno=-5 IO failure
> [ 1434.937972] BTRFS info (device sde): forced readonly
> [ 1434.937975] BTRFS warning (device sde): Skipping commit of aborted transaction.
> [ 1434.937978] BTRFS: error (device sde) in cleanup_transaction:1607: errno=-5 IO failure
> [ 1435.424679] BTRFS (device sde): parent transid verify failed on 29792056147968 wanted 1623936 found 1623004
> [ 1435.522805] BTRFS (device sde): parent transid verify failed on 29792056999936 wanted 1623936 found 1623004
> [ 1435.565828] BTRFS (device sde): parent transid verify failed on 29792057737216 wanted 1623936 found 1623004
> [ 1435.651292] BTRFS (device sde): parent transid verify failed on 29790951227392 wanted 1623935 found 1623003
> [ 1435.718304] BTRFS (device sde): parent transid verify failed on 29790914019328 wanted 1623935 found 1623003
> [ 1435.754679] BTRFS (device sde): parent transid verify failed on 29790914592768 wanted 1623935 found 1623003
> [ 1439.616104] BTRFS (device sde): parent transid verify failed on 29793154777088 wanted 1623936 found 1623005
> [ 1439.622553] BTRFS (device sde): parent transid verify failed on 29796082139136 wanted 1623939 found 1622980
> [ 1439.635702] BTRFS (device sde): parent transid verify failed on 29795165683712 wanted 1623938 found 1623007
> [ 1439.639677] BTRFS (device sde): parent transid verify failed on 29796037656576 wanted 1623939 found 1622980
> [ 1439.644303] BTRFS (device sde): parent transid verify failed on 29792017874944 wanted 1623936 found 1623004
> [ 1439.653535] BTRFS (device sde): parent transid verify failed on 29795102965760 wanted 1623939 found 1623007
> [ 1439.663393] BTRFS (device sde): parent transid verify failed on 29793354055680 wanted 1623937 found 1623005
> [ 1439.666222] BTRFS (device sde): parent transid verify failed on 29795590062080 wanted 1623939 found 1623007
> [ 1439.677759] BTRFS (device sde): parent transid verify failed on 29798681542656 wanted 1623943 found 1622983
> [ 1439.693965] BTRFS (device sde): parent transid verify failed on 29795888676864 wanted 1623939 found 1623007
> [ 1444.624351] BTRFS (device sde): parent transid verify failed on 29799304413184 wanted 1623943 found 1622984
> [ 1444.640856] BTRFS (device sde): parent transid verify failed on 29794051571712 wanted 1623937 found 1623005
> [ 1444.654402] BTRFS (device sde): parent transid verify failed on 29796307681280 wanted 1623940 found 1622980
> [ 1444.667444] BTRFS (device sde): parent transid verify failed on 29796331765760 wanted 1623940 found 1622980
> [ 1444.677949] BTRFS (device sde): parent transid verify failed on 29795524362240 wanted 1623939 found 1623007
> [ 1444.687418] BTRFS (device sde): parent transid verify failed on 29799375421440 wanted 1623943 found 1622984
> [ 1444.694611] BTRFS (device sde): parent transid verify failed on 29797253136384 wanted 1623941 found 1622981
> [ 1444.708855] BTRFS (device sde): parent transid verify failed on 29796242833408 wanted 1623940 found 1622980
> [ 1444.717589] BTRFS (device sde): parent transid verify failed on 29795054223360 wanted 1623939 found 1623007
> [ 1444.719838] BTRFS (device sde): parent transid verify failed on 29797253316608 wanted 1623941 found 1622981
>
> The FS mounts RW fine on boot, until a write is actually attempted, at which point it spits these errors - ultimately the write doesn't get committed & the FS gets forced readonly.
>
> Thanks in advance !
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Filipe David Manana,
"Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men."
^ permalink raw reply [flat|nested] 5+ messages in thread
* RE: Recovery options for FS forced readonly due to 3.17 snapshot bug
@ 2015-01-20 22:26 Brett King
0 siblings, 0 replies; 5+ messages in thread
From: Brett King @ 2015-01-20 22:26 UTC (permalink / raw)
To: fdmanana@gmail.com; +Cc: linux-btrfs@vger.kernel.org
From: Filipe David Manana <fdmanana@gmail.com>
Sent: Tue 01-20-2015 11:40 pm
Subject: Re: Recovery options for FS forced readonly due to 3.17 snapshot bug
To: brett.king@commandict.com.au;
CC: linux-btrfs@vger.kernel.org;
> On Tue, Jan 20, 2015 at 12:15 PM, <brett.king@commandict.com.au> wrote:
> > Hi,
> > My FS has been forced readonly by the early 3.17 snapshot bug. After much
> reading, I'm looking for validation of some recovery scenarios:
> >
> > 1) btrfsck --repair under a later kernel.
> >
> > 2) replacing the devices one by one under a later kernel, effectively
> removing the corruption.
> >
> > 3) Just copying data across from the readonly source to a new FS created
> under a later kernel.
> >
> > Option 1 with check only (not repair) segfaults and coredumps on 3.18 (see
> below) .. are there some other options needed here or perhaps patches pending
> to rectify this ?
> >
> > Option 2 would be better than nothing, as I get to keep my snapshots and
> don't need any more surplus drives than 2x units of the largest size (data is
> RAID1).
> >
> > Option 3 is a fallback and basically forfeits the source FS including
> snapshots, is guaranteed to work (I can read the data fine), however requires
> 1:1 capacity in new disks to achieve.
> >
> > So being unable to complete a btrfsck check and generally scared off by a
> btrfsck -repair, will option 2 work ? I.E. can I still replace devices on a
> readonly FS ?
> >
> > And are there any other options ?
> >
> > Note the FS is very nearly full of data - I did also get continuous disk
> activity due to this high utilisation triggering the endless reclaim unused
> extent routine, for a few days prior to it being forced readonly.
> >
> > [root@array ~]# uname -r
> > 3.18.2-200.fc21.x86_64
> >
> > (upgraded since getting the issue)
> >
> > [root@array ~]# btrfsck /dev/sdm
> > <snip>
> > parent transid verify failed on 40198674350080 wanted 1623837 found 1622986
> > Ignoring transid failure
> > Segmentation fault (core dumped)
> >
> > [root@array ~]# btrfs fi sh /export/archive/
> > Label: 'archive' uuid: 22c7663a-93ca-40a6-9491-26abaa62b924
> > Total devices 8 FS bytes used 18.07TiB
> > devid 1 size 5.46TiB used 5.46TiB path /dev/sdm
> > devid 2 size 3.64TiB used 3.64TiB path /dev/sdc
> > devid 3 size 3.64TiB used 3.64TiB path /dev/sdd
> > devid 4 size 5.46TiB used 5.46TiB path /dev/sdl
> > devid 5 size 4.55TiB used 4.55TiB path /dev/sdb
> > devid 8 size 4.55TiB used 4.55TiB path /dev/sdj
> > devid 9 size 4.55TiB used 4.55TiB path /dev/sdk
> > devid 10 size 4.55TiB used 4.55TiB path /dev/sde
> >
> > Btrfs v3.18.1
> >
> > [root@array ~]# df -h /export/archive/
> > Filesystem Size Used Avail Use% Mounted on
> > /dev/sdm 19T 19T 123G 100% /export/archive
> >
> > [root@array ~]# btrfs fi df /export/archive/
> > Data, RAID1: total=18.17TiB, used=18.05TiB
> > System, RAID1: total=32.00MiB, used=3.00MiB
> > Metadata, RAID1: total=23.00GiB, used=22.49GiB
> > GlobalReserve, single: total=512.00MiB, used=0.00B
> >
> > [root@array ~]# dmesg |grep -i btrfs |less
> > [ 2.044134] Btrfs loaded
> > [ 2.046327] BTRFS: device label array devid 1 transid 289965 /dev/sda4
> > [ 2.472321] BTRFS info (device sda4): disk space caching is enabled
> > [ 2.513784] BTRFS: detected SSD devices, enabling SSD mode
> > [ 2.985877] BTRFS info (device sda4): disk space caching is enabled
> > [ 3.520737] BTRFS: device label archive devid 5 transid 1623944 /dev/sdf
> > [ 3.544394] BTRFS: device label archive devid 3 transid 1623944 /dev/sdh
> > [ 3.544553] BTRFS: device label archive devid 10 transid 1623944 /dev/sdi
> > [ 3.544957] BTRFS: device label archive devid 2 transid 1623944 /dev/sdg
> > [ 3.614636] BTRFS: device label archive devid 9 transid 1623944 /dev/sdc
> > [ 3.663006] BTRFS: device label archive devid 8 transid 1623944 /dev/sdb
> > [ 3.663092] BTRFS: device label archive devid 4 transid 1623944 /dev/sdd
> > [ 3.685397] BTRFS: device label archive devid 1 transid 1623944 /dev/sde
> > [ 3.701732] BTRFS info (device sde): disk space caching is enabled
> > [ 3.903413] BTRFS: bdev /dev/sde errs: wr 2805, rd 1548, flush 0, corrupt
> 0, gen 0
> > [ 3.903421] BTRFS: bdev /dev/sdd errs: wr 2805, rd 0, flush 0, corrupt 0,
> gen 0
> > [ 3.903426] BTRFS: bdev /dev/sdb errs: wr 2805, rd 33, flush 0, corrupt 0,
> gen 0
> > [ 3.903430] BTRFS: bdev /dev/sdc errs: wr 2806, rd 0, flush 0, corrupt 0,
> gen 0
>
> You have device write and read errors here - the snapshots issue with
> 3.17 wouldn't bump these device error stats.
> Even if you were affected by the snapshots issue, you clearly have
> here another problem, very likely hardware related due to those stats.
Yes that is my other problem, these disks are in external USB enclosures which are flakey from time to time; I'm redesigning this imminently.
>
> If you are affected by the snapshots issue, btrfs check/btrfsck as of
> btrfs-progs v3.17+ should be able to fix it and if you are affected by
> it, it reports at a very early stage (without --repair) one or more
> messages with the following format:
>
> "root %llu has a root item with a more recent gen (%llu) compared to
> the found root node (%llu)\n"
>
> If the snapshots issue from 3.17.0/1 was the only issue you had, those
> messages would be reported by fsck before it segfaults (at least for
> all cases I've seen).
Hmm I've not seen any of those errors - as you say it seems like something else is affecting it also.
>
> Also when one is affected by the snapshots issue, it should only cause
> problems when reading from the affected readonly snapshots, and not
> when running the delayed references code in the trace below.
This makes sense; I've only read from the (RO) snapshots when being paranoid about my data & manually comparing some sample files.
>
> My best recommendation is to check how healthy your drives are,
> perform a backup of whatever is possible (btrfs restore for example)
I hadn't heard of restore - looks good from the wiki, after a quick trial it seems to be the way to keep my snapshots if I had to clone the data over to a new FS - i.e. option 2.5; fantastic.
> and then only after run btrfsck with --repair.
Have kicked off a btrfs check --repair now, as it is the option which avoids the time & cost of procuring new disks; I have backups of my important data and am an optimist :)
Will report how it went - thanks for the info !
>
> > [ 1432.231219] BTRFS: space cache generation (1623006) does not match inode
> (1623943)
> > [ 1432.231283] BTRFS warning (device sde): failed to load free space cache
> for block group 57021632675840, rebuild it now
> > [ 1434.603652] BTRFS (device sde): parent transid verify failed on
> 57022998740992 wanted 1623896 found 1622991
> > [ 1434.608726] BTRFS (device sde): parent transid verify failed on
> 57022998740992 wanted 1623896 found 1622991
> > [ 1434.919115] BTRFS (device sde): parent transid verify failed on
> 57022584373248 wanted 1623923 found 1622990
> > [ 1434.937616] BTRFS (device sde): parent transid verify failed on
> 57022584373248 wanted 1623923 found 1622990
> > [ 1434.937658] WARNING: CPU: 0 PID: 1571 at fs/btrfs/super.c:260
> __btrfs_abort_transaction+0x54/0x120 [btrfs]()
> > [ 1434.937660] BTRFS: Transaction aborted (error -5)
> > [ 1434.937662] Modules linked in: binfmt_misc pl2303 iTCO_wdt
> iTCO_vendor_support intel_rapl x86_pkg_temp_thermal coretemp kvm_intel vfat fat
> kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
> snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel
> snd_hda_controller mei_me snd_hda_codec mei ir_xmp_decoder ir_lirc_codec
> snd_seq lirc_dev ir_mce_kbd_decoder ir_sharp_decoder ir_sanyo_decoder i2c_i801
> ir_sony_decoder lpc_ich ir_jvc_decoder snd_hwdep mfd_core ir_rc6_decoder
> ir_rc5_decoder ir_nec_decoder snd_seq_device snd_pcm i2c_hid tpm_tis tpm
> rc_rc6_mce i2c_designware_platform nuvoton_cir dw_dmac i2c_designware_core
> dw_dmac_core rc_core snd_timer snd soundcore nfsd auth_rpcgss nfs_acl lockd
> grace sunrpc btrfs xor i915 raid6_pq e1000e i2c_algo_bit drm_kms_helper drm
> > [ 1434.937732] Workqueue: events_unbound btrfs_async_reclaim_metadata_space
> [btrfs]
> > [ 1434.937779] [<ffffffffa026c284>] __btrfs_abort_transaction+0x54/0x120
> [btrfs]
> > [ 1434.937795] [<ffffffffa028a38a>]
> btrfs_run_delayed_refs.part.64+0x12a/0x280 [btrfs]
> > [ 1434.937811] [<ffffffffa028a4f7>] btrfs_run_delayed_refs+0x17/0x20 [btrfs]
> > [ 1434.937829] [<ffffffffa029b5c7>] btrfs_commit_transaction+0x407/0xa20
> [btrfs]
> > [ 1434.937843] [<ffffffffa02823fa>] flush_space+0x9a/0x4e0 [btrfs]
> > [ 1434.937857] [<ffffffffa0281e90>] ? btrfs_get_alloc_profile+0x30/0x40
> [btrfs]
> > [ 1434.937870] [<ffffffffa02822c4>] ? can_overcommit+0x54/0xf0 [btrfs]
> > [ 1434.937883] [<ffffffffa0282925>]
> btrfs_async_reclaim_metadata_space+0xe5/0x200 [btrfs]
> > [ 1434.937920] BTRFS: error (device sde) in btrfs_run_delayed_refs:2792:
> errno=-5 IO failure
> > [ 1434.937972] BTRFS info (device sde): forced readonly
> > [ 1434.937975] BTRFS warning (device sde): Skipping commit of aborted
> transaction.
> > [ 1434.937978] BTRFS: error (device sde) in cleanup_transaction:1607:
> errno=-5 IO failure
> > [ 1435.424679] BTRFS (device sde): parent transid verify failed on
> 29792056147968 wanted 1623936 found 1623004
> > [ 1435.522805] BTRFS (device sde): parent transid verify failed on
> 29792056999936 wanted 1623936 found 1623004
> > [ 1435.565828] BTRFS (device sde): parent transid verify failed on
> 29792057737216 wanted 1623936 found 1623004
> > [ 1435.651292] BTRFS (device sde): parent transid verify failed on
> 29790951227392 wanted 1623935 found 1623003
> > [ 1435.718304] BTRFS (device sde): parent transid verify failed on
> 29790914019328 wanted 1623935 found 1623003
> > [ 1435.754679] BTRFS (device sde): parent transid verify failed on
> 29790914592768 wanted 1623935 found 1623003
> > [ 1439.616104] BTRFS (device sde): parent transid verify failed on
> 29793154777088 wanted 1623936 found 1623005
> > [ 1439.622553] BTRFS (device sde): parent transid verify failed on
> 29796082139136 wanted 1623939 found 1622980
> > [ 1439.635702] BTRFS (device sde): parent transid verify failed on
> 29795165683712 wanted 1623938 found 1623007
> > [ 1439.639677] BTRFS (device sde): parent transid verify failed on
> 29796037656576 wanted 1623939 found 1622980
> > [ 1439.644303] BTRFS (device sde): parent transid verify failed on
> 29792017874944 wanted 1623936 found 1623004
> > [ 1439.653535] BTRFS (device sde): parent transid verify failed on
> 29795102965760 wanted 1623939 found 1623007
> > [ 1439.663393] BTRFS (device sde): parent transid verify failed on
> 29793354055680 wanted 1623937 found 1623005
> > [ 1439.666222] BTRFS (device sde): parent transid verify failed on
> 29795590062080 wanted 1623939 found 1623007
> > [ 1439.677759] BTRFS (device sde): parent transid verify failed on
> 29798681542656 wanted 1623943 found 1622983
> > [ 1439.693965] BTRFS (device sde): parent transid verify failed on
> 29795888676864 wanted 1623939 found 1623007
> > [ 1444.624351] BTRFS (device sde): parent transid verify failed on
> 29799304413184 wanted 1623943 found 1622984
> > [ 1444.640856] BTRFS (device sde): parent transid verify failed on
> 29794051571712 wanted 1623937 found 1623005
> > [ 1444.654402] BTRFS (device sde): parent transid verify failed on
> 29796307681280 wanted 1623940 found 1622980
> > [ 1444.667444] BTRFS (device sde): parent transid verify failed on
> 29796331765760 wanted 1623940 found 1622980
> > [ 1444.677949] BTRFS (device sde): parent transid verify failed on
> 29795524362240 wanted 1623939 found 1623007
> > [ 1444.687418] BTRFS (device sde): parent transid verify failed on
> 29799375421440 wanted 1623943 found 1622984
> > [ 1444.694611] BTRFS (device sde): parent transid verify failed on
> 29797253136384 wanted 1623941 found 1622981
> > [ 1444.708855] BTRFS (device sde): parent transid verify failed on
> 29796242833408 wanted 1623940 found 1622980
> > [ 1444.717589] BTRFS (device sde): parent transid verify failed on
> 29795054223360 wanted 1623939 found 1623007
> > [ 1444.719838] BTRFS (device sde): parent transid verify failed on
> 29797253316608 wanted 1623941 found 1622981
> >
> > The FS mounts RW fine on boot, until a write is actually attempted, at which
> point it spits these errors - ultimately the write doesn't get committed & the
> FS gets forced readonly.
> >
> > Thanks in advance !
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* RE: Recovery options for FS forced readonly due to 3.17 snapshot bug
@ 2015-01-21 9:15 Brett King
0 siblings, 0 replies; 5+ messages in thread
From: Brett King @ 2015-01-21 9:15 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
-----Original message-----
From: Brett King <brett.king@commandict.com.au>
Sent: Wed 01-21-2015 09:26 am
Subject: RE: Recovery options for FS forced readonly due to 3.17 snapshot bug
CC: linux-btrfs@vger.kernel.org;
To: fdmanana@gmail.com;
> From: Filipe David Manana <fdmanana@gmail.com>
> Sent: Tue 01-20-2015 11:40 pm
> Subject: Re: Recovery options for FS forced readonly due to 3.17 snapshot bug
> To: brett.king@commandict.com.au;
> CC: linux-btrfs@vger.kernel.org;
> > On Tue, Jan 20, 2015 at 12:15 PM, <brett.king@commandict.com.au> wrote:
> > > Hi,
> > > My FS has been forced readonly by the early 3.17 snapshot bug. After much
> > reading, I'm looking for validation of some recovery scenarios:
> > >
> > > 1) btrfsck --repair under a later kernel.
> > >
> > > 2) replacing the devices one by one under a later kernel, effectively
> > removing the corruption.
> > >
> > > 3) Just copying data across from the readonly source to a new FS created
> > under a later kernel.
> > >
> > > Option 1 with check only (not repair) segfaults and coredumps on 3.18 (see
> > below) .. are there some other options needed here or perhaps patches pending
> > to rectify this ?
> > >
> > > Option 2 would be better than nothing, as I get to keep my snapshots and
> > don't need any more surplus drives than 2x units of the largest size (data is
> > RAID1).
> > >
> > > Option 3 is a fallback and basically forfeits the source FS including
> > snapshots, is guaranteed to work (I can read the data fine), however requires
> > 1:1 capacity in new disks to achieve.
> > >
> > > So being unable to complete a btrfsck check and generally scared off by a
> > btrfsck -repair, will option 2 work ? I.E. can I still replace devices on a
> > readonly FS ?
> > >
> > > And are there any other options ?
> > >
> > > Note the FS is very nearly full of data - I did also get continuous disk
> > activity due to this high utilisation triggering the endless reclaim unused
> > extent routine, for a few days prior to it being forced readonly.
> > >
> > > [root@array ~]# uname -r
> > > 3.18.2-200.fc21.x86_64
> > >
> > > (upgraded since getting the issue)
> > >
> > > [root@array ~]# btrfsck /dev/sdm
> > > <snip>
> > > parent transid verify failed on 40198674350080 wanted 1623837 found 1622986
> > > Ignoring transid failure
> > > Segmentation fault (core dumped)
> > >
> > > [root@array ~]# btrfs fi sh /export/archive/
> > > Label: 'archive' uuid: 22c7663a-93ca-40a6-9491-26abaa62b924
> > > Total devices 8 FS bytes used 18.07TiB
> > > devid 1 size 5.46TiB used 5.46TiB path /dev/sdm
> > > devid 2 size 3.64TiB used 3.64TiB path /dev/sdc
> > > devid 3 size 3.64TiB used 3.64TiB path /dev/sdd
> > > devid 4 size 5.46TiB used 5.46TiB path /dev/sdl
> > > devid 5 size 4.55TiB used 4.55TiB path /dev/sdb
> > > devid 8 size 4.55TiB used 4.55TiB path /dev/sdj
> > > devid 9 size 4.55TiB used 4.55TiB path /dev/sdk
> > > devid 10 size 4.55TiB used 4.55TiB path /dev/sde
> > >
> > > Btrfs v3.18.1
> > >
> > > [root@array ~]# df -h /export/archive/
> > > Filesystem Size Used Avail Use% Mounted on
> > > /dev/sdm 19T 19T 123G 100% /export/archive
> > >
> > > [root@array ~]# btrfs fi df /export/archive/
> > > Data, RAID1: total=18.17TiB, used=18.05TiB
> > > System, RAID1: total=32.00MiB, used=3.00MiB
> > > Metadata, RAID1: total=23.00GiB, used=22.49GiB
> > > GlobalReserve, single: total=512.00MiB, used=0.00B
> > >
> > > [root@array ~]# dmesg |grep -i btrfs |less
> > > [ 2.044134] Btrfs loaded
> > > [ 2.046327] BTRFS: device label array devid 1 transid 289965 /dev/sda4
> > > [ 2.472321] BTRFS info (device sda4): disk space caching is enabled
> > > [ 2.513784] BTRFS: detected SSD devices, enabling SSD mode
> > > [ 2.985877] BTRFS info (device sda4): disk space caching is enabled
> > > [ 3.520737] BTRFS: device label archive devid 5 transid 1623944 /dev/sdf
> > > [ 3.544394] BTRFS: device label archive devid 3 transid 1623944 /dev/sdh
> > > [ 3.544553] BTRFS: device label archive devid 10 transid 1623944 /dev/sdi
> > > [ 3.544957] BTRFS: device label archive devid 2 transid 1623944 /dev/sdg
> > > [ 3.614636] BTRFS: device label archive devid 9 transid 1623944 /dev/sdc
> > > [ 3.663006] BTRFS: device label archive devid 8 transid 1623944 /dev/sdb
> > > [ 3.663092] BTRFS: device label archive devid 4 transid 1623944 /dev/sdd
> > > [ 3.685397] BTRFS: device label archive devid 1 transid 1623944 /dev/sde
> > > [ 3.701732] BTRFS info (device sde): disk space caching is enabled
> > > [ 3.903413] BTRFS: bdev /dev/sde errs: wr 2805, rd 1548, flush 0,
> corrupt
> > 0, gen 0
> > > [ 3.903421] BTRFS: bdev /dev/sdd errs: wr 2805, rd 0, flush 0, corrupt
> 0,
> > gen 0
> > > [ 3.903426] BTRFS: bdev /dev/sdb errs: wr 2805, rd 33, flush 0, corrupt
> 0,
> > gen 0
> > > [ 3.903430] BTRFS: bdev /dev/sdc errs: wr 2806, rd 0, flush 0, corrupt
> 0,
> > gen 0
> >
> > You have device write and read errors here - the snapshots issue with
> > 3.17 wouldn't bump these device error stats.
> > Even if you were affected by the snapshots issue, you clearly have
> > here another problem, very likely hardware related due to those stats.
>
> Yes that is my other problem, these disks are in external USB enclosures which
> are flakey from time to time; I'm redesigning this imminently.
>
> >
> > If you are affected by the snapshots issue, btrfs check/btrfsck as of
> > btrfs-progs v3.17+ should be able to fix it and if you are affected by
> > it, it reports at a very early stage (without --repair) one or more
> > messages with the following format:
> >
> > "root %llu has a root item with a more recent gen (%llu) compared to
> > the found root node (%llu)\n"
> >
> > If the snapshots issue from 3.17.0/1 was the only issue you had, those
> > messages would be reported by fsck before it segfaults (at least for
> > all cases I've seen).
>
> Hmm I've not seen any of those errors - as you say it seems like something else
> is affecting it also.
>
> >
> > Also when one is affected by the snapshots issue, it should only cause
> > problems when reading from the affected readonly snapshots, and not
> > when running the delayed references code in the trace below.
>
> This makes sense; I've only read from the (RO) snapshots when being paranoid
> about my data & manually comparing some sample files.
>
> >
> > My best recommendation is to check how healthy your drives are,
> > perform a backup of whatever is possible (btrfs restore for example)
>
> I hadn't heard of restore - looks good from the wiki, after a quick trial it
> seems to be the way to keep my snapshots if I had to clone the data over to a
> new FS - i.e. option 2.5; fantastic.
>
> > and then only after run btrfsck with --repair.
>
> Have kicked off a btrfs check --repair now, as it is the option which avoids
> the time & cost of procuring new disks; I have backups of my important data and
> am an optimist :)
>
> Will report how it went - thanks for the info !
Well the optimist invites disappointment .. btrfs check --repair still segfaults however my write attempts now end with out of space errors, which are more encouraging however as they appear after the forced readonly message, I suspect they are stemming from that instead of capacity.
<snip>
Jan 21 20:13:38 array kernel: [ 956.187594] BTRFS (device sdd): parent transid verify failed on 57023356125184 wanted 1623924 found 1622992
Jan 21 20:13:38 array kernel: [ 956.187632] WARNING: CPU: 2 PID: 471 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x54/0x120 [btrfs]()
Jan 21 20:13:38 array kernel: [ 956.187636] Modules linked in: binfmt_misc pl2303 vfat fat iTCO_wdt iTCO_vendor_support intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul crc32c_intel snd_hda_intel ghash_clmulni_intel snd_hda_controller i2c_i801 mei_me mei ir_mce_kbd_decoder ir_lirc_codec ir_xmp_decoder lirc_dev ir_sharp_decoder ir_sanyo_decoder snd_hda_codec ir_jvc_decoder ir_sony_decoder lpc_ich ir_rc6_decoder mfd_core ir_rc5_decoder snd_hwdep snd_seq ir_nec_decoder snd_seq_device snd_pcm snd_timer i2c_hid rc_rc6_mce tpm_tis tpm i2c_designware_platform nuvoton_cir snd rc_core i2c_designware_core dw_dmac dw_dmac_core soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc uas usb_storage btrfs xor i915 raid6_pq i2c_algo_bit drm_kms_helper e1000e drm ptp pps_core sdhci_acpi sdhci video mmc_core
Jan 21 20:13:38 array kernel: [ 956.187711] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187760] [<ffffffffa0253284>] __btrfs_abort_transaction+0x54/0x120 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187774] [<ffffffffa0266a09>] __btrfs_free_extent+0x179/0xca0 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187803] [<ffffffffa02ce4a9>] ? btrfs_merge_delayed_refs+0x3b9/0x490 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187819] [<ffffffffa026c76e>] __btrfs_run_delayed_refs+0x8fe/0x1210 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187866] [<ffffffffa02712ce>] btrfs_run_delayed_refs.part.64+0x6e/0x280 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187904] [<ffffffffa0271a01>] btrfs_write_dirty_block_groups+0x461/0x740 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187943] [<ffffffffa02f8b71>] commit_cowonly_roots+0x17a/0x234 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187974] [<ffffffffa0282651>] btrfs_commit_transaction+0x491/0xa20 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.187996] [<ffffffffa02693fa>] flush_space+0x9a/0x4e0 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.188016] [<ffffffffa0268e90>] ? btrfs_get_alloc_profile+0x30/0x40 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.188043] [<ffffffffa02692c4>] ? can_overcommit+0x54/0xf0 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.188063] [<ffffffffa0269925>] btrfs_async_reclaim_metadata_space+0xe5/0x200 [btrfs]
Jan 21 20:13:38 array kernel: [ 956.188133] BTRFS: error (device sdd) in __btrfs_free_extent:5978: errno=-5 IO failure
Jan 21 20:13:38 array kernel: [ 956.188224] BTRFS info (device sdd): forced readonly
Jan 21 20:13:38 array kernel: [ 956.188232] BTRFS: error (device sdd) in btrfs_run_delayed_refs:2792: errno=-5 IO failure
Jan 21 20:13:38 array kernel: BTRFS (device sdd): parent transid verify failed on 57023356125184 wanted 1623924 found 1622992
Jan 21 20:13:38 array kernel: [ 956.190791] BTRFS warning (device sdd): btrfs_uuid_scan_kthread failed -30
Jan 21 20:13:38 array kernel: WARNING: CPU: 2 PID: 471 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x54/0x120 [btrfs]()
Jan 21 20:13:38 array kernel: Modules linked in: binfmt_misc pl2303 vfat fat iTCO_wdt iTCO_vendor_support intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul crc32c_intel snd_hda_intel ghash_clmulni_intel snd_hda_controller i2c_i801 mei_me mei ir_mce_kbd_decoder ir_lirc_codec ir_xmp_decoder lirc_dev ir_sharp_decoder ir_sanyo_decoder snd_hda_codec ir_jvc_decoder ir_sony_decoder lpc_ich ir_rc6_decoder mfd_core ir_rc5_decoder snd_hwdep snd_seq ir_nec_decoder snd_seq_device snd_pcm snd_timer i2c_hid rc_rc6_mce tpm_tis tpm i2c_designware_platform nuvoton_cir snd rc_core i2c_designware_core dw_dmac dw_dmac_core soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc uas usb_storage btrfs xor i915 raid6_pq i2c_algo_bit drm_kms_helper
Jan 21 20:13:38 array kernel: Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa0253284>] __btrfs_abort_transaction+0x54/0x120 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa0266a09>] __btrfs_free_extent+0x179/0xca0 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa02ce4a9>] ? btrfs_merge_delayed_refs+0x3b9/0x490 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa026c76e>] __btrfs_run_delayed_refs+0x8fe/0x1210 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa02712ce>] btrfs_run_delayed_refs.part.64+0x6e/0x280 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa0271a01>] btrfs_write_dirty_block_groups+0x461/0x740 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa02f8b71>] commit_cowonly_roots+0x17a/0x234 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa0282651>] btrfs_commit_transaction+0x491/0xa20 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa02693fa>] flush_space+0x9a/0x4e0 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa0268e90>] ? btrfs_get_alloc_profile+0x30/0x40 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa02692c4>] ? can_overcommit+0x54/0xf0 [btrfs]
Jan 21 20:13:38 array kernel: [<ffffffffa0269925>] btrfs_async_reclaim_metadata_space+0xe5/0x200 [btrfs]
Jan 21 20:13:38 array kernel: BTRFS: error (device sdd) in __btrfs_free_extent:5978: errno=-5 IO failure
Jan 21 20:13:38 array kernel: BTRFS info (device sdd): forced readonly
Jan 21 20:13:38 array kernel: BTRFS: error (device sdd) in btrfs_run_delayed_refs:2792: errno=-5 IO failure
Jan 21 20:13:38 array kernel: BTRFS warning (device sdd): btrfs_uuid_scan_kthread failed -30
Jan 21 20:13:38 array kernel: [ 956.319756] BTRFS: error (device sdd) in write_one_cache_group:3153: errno=-28 No space left
Jan 21 20:13:38 array kernel: [ 956.319805] BTRFS warning (device sdd): Skipping commit of aborted transaction.
Jan 21 20:13:38 array kernel: [ 956.319807] BTRFS: error (device sdd) in cleanup_transaction:1607: errno=-28 No space left
Jan 21 20:13:38 array kernel: BTRFS: error (device sdd) in write_one_cache_group:3153: errno=-28 No space left
Jan 21 20:13:38 array kernel: BTRFS warning (device sdd): Skipping commit of aborted transaction.
Jan 21 20:13:38 array kernel: BTRFS: error (device sdd) in cleanup_transaction:1607: errno=-28 No space left
So it appears something else is awry, potentially exacerbated by my flakey hardware and which can now only effectively be recovered from with a restore / copy off to a healthier place.
Thanks all, good learnings - I'll be back as I couldn't sleep at night without a checksum-on-read FS !
>
> >
> > > [ 1432.231219] BTRFS: space cache generation (1623006) does not match inode
> > (1623943)
> > > [ 1432.231283] BTRFS warning (device sde): failed to load free space cache
> > for block group 57021632675840, rebuild it now
> > > [ 1434.603652] BTRFS (device sde): parent transid verify failed on
> > 57022998740992 wanted 1623896 found 1622991
> > > [ 1434.608726] BTRFS (device sde): parent transid verify failed on
> > 57022998740992 wanted 1623896 found 1622991
> > > [ 1434.919115] BTRFS (device sde): parent transid verify failed on
> > 57022584373248 wanted 1623923 found 1622990
> > > [ 1434.937616] BTRFS (device sde): parent transid verify failed on
> > 57022584373248 wanted 1623923 found 1622990
> > > [ 1434.937658] WARNING: CPU: 0 PID: 1571 at fs/btrfs/super.c:260
> > __btrfs_abort_transaction+0x54/0x120 [btrfs]()
> > > [ 1434.937660] BTRFS: Transaction aborted (error -5)
> > > [ 1434.937662] Modules linked in: binfmt_misc pl2303 iTCO_wdt
> > iTCO_vendor_support intel_rapl x86_pkg_temp_thermal coretemp kvm_intel vfat
> fat
> > kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
> > snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel
> > snd_hda_controller mei_me snd_hda_codec mei ir_xmp_decoder ir_lirc_codec
> > snd_seq lirc_dev ir_mce_kbd_decoder ir_sharp_decoder ir_sanyo_decoder
> i2c_i801
> > ir_sony_decoder lpc_ich ir_jvc_decoder snd_hwdep mfd_core ir_rc6_decoder
> > ir_rc5_decoder ir_nec_decoder snd_seq_device snd_pcm i2c_hid tpm_tis tpm
> > rc_rc6_mce i2c_designware_platform nuvoton_cir dw_dmac i2c_designware_core
> > dw_dmac_core rc_core snd_timer snd soundcore nfsd auth_rpcgss nfs_acl lockd
> > grace sunrpc btrfs xor i915 raid6_pq e1000e i2c_algo_bit drm_kms_helper drm
> > > [ 1434.937732] Workqueue: events_unbound btrfs_async_reclaim_metadata_space
> > [btrfs]
> > > [ 1434.937779] [<ffffffffa026c284>] __btrfs_abort_transaction+0x54/0x120
> > [btrfs]
> > > [ 1434.937795] [<ffffffffa028a38a>]
> > btrfs_run_delayed_refs.part.64+0x12a/0x280 [btrfs]
> > > [ 1434.937811] [<ffffffffa028a4f7>] btrfs_run_delayed_refs+0x17/0x20
> [btrfs]
> > > [ 1434.937829] [<ffffffffa029b5c7>] btrfs_commit_transaction+0x407/0xa20
> > [btrfs]
> > > [ 1434.937843] [<ffffffffa02823fa>] flush_space+0x9a/0x4e0 [btrfs]
> > > [ 1434.937857] [<ffffffffa0281e90>] ? btrfs_get_alloc_profile+0x30/0x40
> > [btrfs]
> > > [ 1434.937870] [<ffffffffa02822c4>] ? can_overcommit+0x54/0xf0 [btrfs]
> > > [ 1434.937883] [<ffffffffa0282925>]
> > btrfs_async_reclaim_metadata_space+0xe5/0x200 [btrfs]
> > > [ 1434.937920] BTRFS: error (device sde) in btrfs_run_delayed_refs:2792:
> > errno=-5 IO failure
> > > [ 1434.937972] BTRFS info (device sde): forced readonly
> > > [ 1434.937975] BTRFS warning (device sde): Skipping commit of aborted
> > transaction.
> > > [ 1434.937978] BTRFS: error (device sde) in cleanup_transaction:1607:
> > errno=-5 IO failure
> > > [ 1435.424679] BTRFS (device sde): parent transid verify failed on
> > 29792056147968 wanted 1623936 found 1623004
> > > [ 1435.522805] BTRFS (device sde): parent transid verify failed on
> > 29792056999936 wanted 1623936 found 1623004
> > > [ 1435.565828] BTRFS (device sde): parent transid verify failed on
> > 29792057737216 wanted 1623936 found 1623004
> > > [ 1435.651292] BTRFS (device sde): parent transid verify failed on
> > 29790951227392 wanted 1623935 found 1623003
> > > [ 1435.718304] BTRFS (device sde): parent transid verify failed on
> > 29790914019328 wanted 1623935 found 1623003
> > > [ 1435.754679] BTRFS (device sde): parent transid verify failed on
> > 29790914592768 wanted 1623935 found 1623003
> > > [ 1439.616104] BTRFS (device sde): parent transid verify failed on
> > 29793154777088 wanted 1623936 found 1623005
> > > [ 1439.622553] BTRFS (device sde): parent transid verify failed on
> > 29796082139136 wanted 1623939 found 1622980
> > > [ 1439.635702] BTRFS (device sde): parent transid verify failed on
> > 29795165683712 wanted 1623938 found 1623007
> > > [ 1439.639677] BTRFS (device sde): parent transid verify failed on
> > 29796037656576 wanted 1623939 found 1622980
> > > [ 1439.644303] BTRFS (device sde): parent transid verify failed on
> > 29792017874944 wanted 1623936 found 1623004
> > > [ 1439.653535] BTRFS (device sde): parent transid verify failed on
> > 29795102965760 wanted 1623939 found 1623007
> > > [ 1439.663393] BTRFS (device sde): parent transid verify failed on
> > 29793354055680 wanted 1623937 found 1623005
> > > [ 1439.666222] BTRFS (device sde): parent transid verify failed on
> > 29795590062080 wanted 1623939 found 1623007
> > > [ 1439.677759] BTRFS (device sde): parent transid verify failed on
> > 29798681542656 wanted 1623943 found 1622983
> > > [ 1439.693965] BTRFS (device sde): parent transid verify failed on
> > 29795888676864 wanted 1623939 found 1623007
> > > [ 1444.624351] BTRFS (device sde): parent transid verify failed on
> > 29799304413184 wanted 1623943 found 1622984
> > > [ 1444.640856] BTRFS (device sde): parent transid verify failed on
> > 29794051571712 wanted 1623937 found 1623005
> > > [ 1444.654402] BTRFS (device sde): parent transid verify failed on
> > 29796307681280 wanted 1623940 found 1622980
> > > [ 1444.667444] BTRFS (device sde): parent transid verify failed on
> > 29796331765760 wanted 1623940 found 1622980
> > > [ 1444.677949] BTRFS (device sde): parent transid verify failed on
> > 29795524362240 wanted 1623939 found 1623007
> > > [ 1444.687418] BTRFS (device sde): parent transid verify failed on
> > 29799375421440 wanted 1623943 found 1622984
> > > [ 1444.694611] BTRFS (device sde): parent transid verify failed on
> > 29797253136384 wanted 1623941 found 1622981
> > > [ 1444.708855] BTRFS (device sde): parent transid verify failed on
> > 29796242833408 wanted 1623940 found 1622980
> > > [ 1444.717589] BTRFS (device sde): parent transid verify failed on
> > 29795054223360 wanted 1623939 found 1623007
> > > [ 1444.719838] BTRFS (device sde): parent transid verify failed on
> > 29797253316608 wanted 1623941 found 1622981
> > >
> > > The FS mounts RW fine on boot, until a write is actually attempted, at
> which
> > point it spits these errors - ultimately the write doesn't get committed &
> the
> > FS gets forced readonly.
> > >
> > > Thanks in advance !
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> >
> >
> > --
> > Filipe David Manana,
> >
> > "Reasonable men adapt themselves to the world.
> > Unreasonable men adapt the world to themselves.
> > That's why all progress depends on unreasonable men."
> >
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-01-21 9:15 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-01-21 9:15 Recovery options for FS forced readonly due to 3.17 snapshot bug Brett King
-- strict thread matches above, loose matches on Subject: below --
2015-01-20 22:26 Brett King
2015-01-20 12:15 brett.king
2015-01-20 12:27 ` Hugo Mills
2015-01-20 12:40 ` Filipe David Manana
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).