Kernel Dump scanning directory

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Kernel Dump scanning directory
@ 2015-05-06 21:56 Anthony Plack
  2015-05-07 14:33 ` Anthony Plack
  0 siblings, 1 reply; 14+ messages in thread
From: Anthony Plack @ 2015-05-06 21:56 UTC (permalink / raw)
  To: linux-btrfs

Hey guys, I have some dump data from BTRFS.  This occurred when running rsync -nav to my backup server.  Data is all there (as far as I can tell about 9TB of data).  If I unmount the drive, and remount it, it will be writable (echo ‘b’ > b) for days, but I start the rsync and very quickly it will dump and switch to read-only.  Scrub finished successful (mount was still RW after completion) yesterday.  Let me know what else I can get you.

May  6 11:17:47 fatdrive kernel: verify_parent_transid: 19 callbacks suppressed
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
May  6 11:17:47 fatdrive kernel: WARNING: CPU: 0 PID: 9426 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x5a/0x130()
May  6 11:17:47 fatdrive kernel: BTRFS: Transaction aborted (error -5)
May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
May  6 11:17:47 fatdrive kernel: CPU: 0 PID: 9426 Comm: kworker/u12:4 Tainted: G        W  O    4.0.0-gentoo #1
May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
May  6 11:17:47 fatdrive kernel:  ffffffff81a9729e ffff8801704abc38 ffffffff817e6414 0000000000000000
May  6 11:17:47 fatdrive kernel:  ffff8801704abc88 ffff8801704abc78 ffffffff81049e60 0000000000000000
May  6 11:17:47 fatdrive kernel:  ffff8802142b22c0 ffff88020ce81800 00000000fffffffb ffffffff81869b50
May  6 11:17:47 fatdrive kernel: Call Trace:
May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81049ee1>] warn_slowpath_fmt+0x41/0x50
May  6 11:17:47 fatdrive kernel:  [<ffffffff812d9e7a>] __btrfs_abort_transaction+0x5a/0x130
May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6fa1>] btrfs_run_delayed_refs.part.65+0x121/0x270
May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c0 ]---
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
May  6 11:17:47 fatdrive kernel: WARNING: CPU: 1 PID: 9414 at fs/btrfs/delayed-ref.c:438 btrfs_select_ref_head+0x124/0x140()
May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
May  6 11:17:47 fatdrive kernel: CPU: 1 PID: 9414 Comm: kworker/u12:0 Tainted: G        W  O    4.0.0-gentoo #1
May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
May  6 11:17:47 fatdrive kernel:  ffffffff81a98a02 ffff88006476bb58 ffffffff817e6414 0000000000000000
May  6 11:17:47 fatdrive kernel:  0000000000000000 ffff88006476bb98 ffffffff81049e60 ffff88006476bbd8
May  6 11:17:47 fatdrive kernel:  ffff880214300320 ffff880178aec120 0000000000000000 ffff880178aec208
May  6 11:17:47 fatdrive kernel: Call Trace:
May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81049f45>] warn_slowpath_null+0x15/0x20
May  6 11:17:47 fatdrive kernel:  [<ffffffff81351f04>] btrfs_select_ref_head+0x124/0x140
May  6 11:17:47 fatdrive kernel:  [<ffffffff812f1f15>] __btrfs_run_delayed_refs+0x1f5/0x11f0
May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6cd>] ? check_preempt_curr+0x8d/0xa0
May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6f8>] ? ttwu_do_wakeup+0x18/0xd0
May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6eea>] btrfs_run_delayed_refs.part.65+0x6a/0x270
May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c1 ]---
May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
May  6 11:17:47 fatdrive kernel: BTRFS info (device sdm): forced readonly
May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
May  6 11:18:16 fatdrive kernel: verify_parent_transid: 28 callbacks suppressed
May  6 11:18:16 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 7645497978880 wanted 148829 found 147711


fatdrive data # uname -a && btrfs filesystem show /mnt/data
Linux fatdrive 4.0.0-gentoo #1 SMP Sat Apr 25 22:56:50 CDT 2015 x86_64 AMD Athlon(tm) II X2 240 Processor AuthenticAMD GNU/Linux
Label: 'data'  uuid: f591ac13-1a69-476d-bd30-346f87a491da
	Total devices 11 FS bytes used 9.85TiB
	devid    1 size 1.82TiB used 1.82TiB path /dev/sdm
	devid    2 size 1.82TiB used 1.82TiB path /dev/sdh
	devid    3 size 1.82TiB used 1.82TiB path /dev/sdg
	devid    4 size 1.82TiB used 1.81TiB path /dev/sdf
	devid    5 size 1.82TiB used 1.81TiB path /dev/sdc
	devid    6 size 1.82TiB used 1.82TiB path /dev/sdb
	devid    7 size 1.82TiB used 1.82TiB path /dev/sdk
	devid    8 size 1.82TiB used 1.82TiB path /dev/sdd
	devid    9 size 2.73TiB used 2.63TiB path /dev/sde
	devid   10 size 1.82TiB used 1.82TiB path /dev/sdl
	devid   11 size 1.82TiB used 836.00GiB path /dev/sda

btrfs-progs v3.19.1

fatdrive data # btrfs device stat /mnt/data
[/dev/sdm].write_io_errs   0
[/dev/sdm].read_io_errs    0
[/dev/sdm].flush_io_errs   0
[/dev/sdm].corruption_errs 0
[/dev/sdm].generation_errs 0
[/dev/sdh].write_io_errs   0
[/dev/sdh].read_io_errs    0
[/dev/sdh].flush_io_errs   0
[/dev/sdh].corruption_errs 0
[/dev/sdh].generation_errs 0
[/dev/sdg].write_io_errs   0
[/dev/sdg].read_io_errs    0
[/dev/sdg].flush_io_errs   0
[/dev/sdg].corruption_errs 0
[/dev/sdg].generation_errs 0
[/dev/sdf].write_io_errs   0
[/dev/sdf].read_io_errs    0
[/dev/sdf].flush_io_errs   0
[/dev/sdf].corruption_errs 0
[/dev/sdf].generation_errs 0
[/dev/sdc].write_io_errs   0
[/dev/sdc].read_io_errs    0
[/dev/sdc].flush_io_errs   0
[/dev/sdc].corruption_errs 0
[/dev/sdc].generation_errs 3
[/dev/sdb].write_io_errs   0
[/dev/sdb].read_io_errs    0
[/dev/sdb].flush_io_errs   0
[/dev/sdb].corruption_errs 0
[/dev/sdb].generation_errs 0
[/dev/sdk].write_io_errs   0
[/dev/sdk].read_io_errs    0
[/dev/sdk].flush_io_errs   0
[/dev/sdk].corruption_errs 0
[/dev/sdk].generation_errs 3
[/dev/sdd].write_io_errs   0
[/dev/sdd].read_io_errs    0
[/dev/sdd].flush_io_errs   0
[/dev/sdd].corruption_errs 0
[/dev/sdd].generation_errs 0
[/dev/sde].write_io_errs   0
[/dev/sde].read_io_errs    0
[/dev/sde].flush_io_errs   0
[/dev/sde].corruption_errs 0
[/dev/sde].generation_errs 0
[/dev/sdl].write_io_errs   0
[/dev/sdl].read_io_errs    0
[/dev/sdl].flush_io_errs   0
[/dev/sdl].corruption_errs 0
[/dev/sdl].generation_errs 0
[/dev/sda].write_io_errs   0
[/dev/sda].read_io_errs    0
[/dev/sda].flush_io_errs   0
[/dev/sda].corruption_errs 0
[/dev/sda].generation_errs 0

fatdrive data # btrfs scrub stat /mnt/data
scrub status for f591ac13-1a69-476d-bd30-346f87a491da
	scrub started at Mon May  4 13:21:02 2015 and was aborted after 25599 seconds
	total bytes scrubbed: 13.12TiB with 6 errors
	error details: verify=6
	corrected errors: 0, uncorrectable errors: 6, unverified errors: 0




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-06 21:56 Kernel Dump scanning directory Anthony Plack
@ 2015-05-07 14:33 ` Anthony Plack
  2015-05-07 21:30   ` Anthony Plack
  0 siblings, 1 reply; 14+ messages in thread
From: Anthony Plack @ 2015-05-07 14:33 UTC (permalink / raw)
  To: linux-btrfs

Okay, trying my programming skills at this unfamiliar code.

Okay so this looks like an IO error, then an abort transaction.  Is it a fair assumption that a normal transaction abort marks the volume as read-only, undoes the transaction, and then marks the transaction as read-write?  Therefore, because of the dump, in the middle of the undo, the volume remains read-only?

> On May 6, 2015, at 4:56 PM, Anthony Plack <anthony@plack.net> wrote:
> 
> Hey guys, I have some dump data from BTRFS.  This occurred when running rsync -nav to my backup server.  Data is all there (as far as I can tell about 9TB of data).  If I unmount the drive, and remount it, it will be writable (echo ‘b’ > b) for days, but I start the rsync and very quickly it will dump and switch to read-only.  Scrub finished successful (mount was still RW after completion) yesterday.  Let me know what else I can get you.
> 
> May  6 11:17:47 fatdrive kernel: verify_parent_transid: 19 callbacks suppressed
> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
> May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
> May  6 11:17:47 fatdrive kernel: WARNING: CPU: 0 PID: 9426 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x5a/0x130()
> May  6 11:17:47 fatdrive kernel: BTRFS: Transaction aborted (error -5)
> May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
> May  6 11:17:47 fatdrive kernel: CPU: 0 PID: 9426 Comm: kworker/u12:4 Tainted: G        W  O    4.0.0-gentoo #1
> May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
> May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
> May  6 11:17:47 fatdrive kernel:  ffffffff81a9729e ffff8801704abc38 ffffffff817e6414 0000000000000000
> May  6 11:17:47 fatdrive kernel:  ffff8801704abc88 ffff8801704abc78 ffffffff81049e60 0000000000000000
> May  6 11:17:47 fatdrive kernel:  ffff8802142b22c0 ffff88020ce81800 00000000fffffffb ffffffff81869b50
> May  6 11:17:47 fatdrive kernel: Call Trace:
> May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049ee1>] warn_slowpath_fmt+0x41/0x50
> May  6 11:17:47 fatdrive kernel:  [<ffffffff812d9e7a>] __btrfs_abort_transaction+0x5a/0x130
> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6fa1>] btrfs_run_delayed_refs.part.65+0x121/0x270
> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
> May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
> May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
> May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c0 ]---
> May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687

Not sure where to dig out btrfs_run_delayed_refs.part.65+0x121 as it is a structure (gdb is not a regular tool I use).  it seems that it is trying to run the delayed refs and something in the data for line 265 is wrong and generating a warning.

260					errno);
261		}
262		trans->aborted = errno;
263		/* Nothing used. The other threads that have joined this
264		 * transaction may be able to continue. */
265		if (!trans->blocks_used && list_empty(&trans->new_bgs)) {
266			const char *errstr;
267	
268			errstr = btrfs_decode_error(errno);
269			btrfs_warn(root->fs_info,

 

> May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
> May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
> May  6 11:17:47 fatdrive kernel: WARNING: CPU: 1 PID: 9414 at fs/btrfs/delayed-ref.c:438 btrfs_select_ref_head+0x124/0x140()
> May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
> May  6 11:17:47 fatdrive kernel: CPU: 1 PID: 9414 Comm: kworker/u12:0 Tainted: G        W  O    4.0.0-gentoo #1
> May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
> May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
> May  6 11:17:47 fatdrive kernel:  ffffffff81a98a02 ffff88006476bb58 ffffffff817e6414 0000000000000000
> May  6 11:17:47 fatdrive kernel:  0000000000000000 ffff88006476bb98 ffffffff81049e60 ffff88006476bbd8
> May  6 11:17:47 fatdrive kernel:  ffff880214300320 ffff880178aec120 0000000000000000 ffff880178aec208
> May  6 11:17:47 fatdrive kernel: Call Trace:
> May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049f45>] warn_slowpath_null+0x15/0x20
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81351f04>] btrfs_select_ref_head+0x124/0x140
> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f1f15>] __btrfs_run_delayed_refs+0x1f5/0x11f0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6cd>] ? check_preempt_curr+0x8d/0xa0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6f8>] ? ttwu_do_wakeup+0x18/0xd0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6eea>] btrfs_run_delayed_refs.part.65+0x6a/0x270
> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
> May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
> May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
> May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c1 ]---
> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
> May  6 11:17:47 fatdrive kernel: BTRFS info (device sdm): forced readonly
> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
> May  6 11:18:16 fatdrive kernel: verify_parent_transid: 28 callbacks suppressed
> May  6 11:18:16 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 7645497978880 wanted 148829 found 147711
> 

This next one seems a little more obvious.

0x12d4 is in btrfs_select_ref_head (fs/btrfs/delayed-ref.c:438).
433			head = rb_entry(node, struct btrfs_delayed_ref_head,
434					href_node);
435		}
436	
437		head->processing = 1;
438		WARN_ON(delayed_refs->num_heads_ready == 0);
439		delayed_refs->num_heads_ready--;
440		delayed_refs->run_delayed_start = head->node.bytenr +
441			head->node.num_bytes;
442		return head;

Because there are not heads ready, we are getting a warning.

So the real question is how do I know what is generating the IO error?  The rsync -nav is not changing the the volume, it is reading from it.

The devices are not showing errors in the log or in the "device stat" view.

What seems to be interesting is that whatever is generating the IO errors (which seems to be in the extents), triggers the Warning, which by default flips the whole volume to read-only.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-07 14:33 ` Anthony Plack
@ 2015-05-07 21:30   ` Anthony Plack
  2015-05-08 16:44     ` Anthony Plack
  0 siblings, 1 reply; 14+ messages in thread
From: Anthony Plack @ 2015-05-07 21:30 UTC (permalink / raw)
  To: linux-btrfs

On May 7, 2015, at 9:33 AM, Anthony Plack <anthony@plack.net> wrote:
> 
> Okay, trying my programming skills at this unfamiliar code.
> 
> Okay so this looks like an IO error, then an abort transaction.  Is it a fair assumption that a normal transaction abort marks the volume as read-only, undoes the transaction, and then marks the transaction as read-write?  Therefore, because of the dump, in the middle of the undo, the volume remains read-only?
> 
>> On May 6, 2015, at 4:56 PM, Anthony Plack <anthony@plack.net> wrote:
>> 
>> Hey guys, I have some dump data from BTRFS.  This occurred when running rsync -nav to my backup server.  Data is all there (as far as I can tell about 9TB of data).  If I unmount the drive, and remount it, it will be writable (echo ‘b’ > b) for days, but I start the rsync and very quickly it will dump and switch to read-only.  Scrub finished successful (mount was still RW after completion) yesterday.  Let me know what else I can get you.
>> 
>> May  6 11:17:47 fatdrive kernel: verify_parent_transid: 19 callbacks suppressed
>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
>> May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
>> May  6 11:17:47 fatdrive kernel: WARNING: CPU: 0 PID: 9426 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x5a/0x130()
>> May  6 11:17:47 fatdrive kernel: BTRFS: Transaction aborted (error -5)
>> May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
>> May  6 11:17:47 fatdrive kernel: CPU: 0 PID: 9426 Comm: kworker/u12:4 Tainted: G        W  O    4.0.0-gentoo #1
>> May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
>> May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
>> May  6 11:17:47 fatdrive kernel:  ffffffff81a9729e ffff8801704abc38 ffffffff817e6414 0000000000000000
>> May  6 11:17:47 fatdrive kernel:  ffff8801704abc88 ffff8801704abc78 ffffffff81049e60 0000000000000000
>> May  6 11:17:47 fatdrive kernel:  ffff8802142b22c0 ffff88020ce81800 00000000fffffffb ffffffff81869b50
>> May  6 11:17:47 fatdrive kernel: Call Trace:
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049ee1>] warn_slowpath_fmt+0x41/0x50
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812d9e7a>] __btrfs_abort_transaction+0x5a/0x130
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6fa1>] btrfs_run_delayed_refs.part.65+0x121/0x270
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>> May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c0 ]---
>> May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
> 
> Not sure where to dig out btrfs_run_delayed_refs.part.65+0x121 as it is a structure (gdb is not a regular tool I use).  it seems that it is trying to run the delayed refs and something in the data for line 265 is wrong and generating a warning.
> 
> 260					errno);
> 261		}
> 262		trans->aborted = errno;
> 263		/* Nothing used. The other threads that have joined this
> 264		 * transaction may be able to continue. */
> 265		if (!trans->blocks_used && list_empty(&trans->new_bgs)) {
> 266			const char *errstr;
> 267	
> 268			errstr = btrfs_decode_error(errno);
> 269			btrfs_warn(root->fs_info,
> 
> 
> 
>> May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
>> May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
>> May  6 11:17:47 fatdrive kernel: WARNING: CPU: 1 PID: 9414 at fs/btrfs/delayed-ref.c:438 btrfs_select_ref_head+0x124/0x140()
>> May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
>> May  6 11:17:47 fatdrive kernel: CPU: 1 PID: 9414 Comm: kworker/u12:0 Tainted: G        W  O    4.0.0-gentoo #1
>> May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
>> May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
>> May  6 11:17:47 fatdrive kernel:  ffffffff81a98a02 ffff88006476bb58 ffffffff817e6414 0000000000000000
>> May  6 11:17:47 fatdrive kernel:  0000000000000000 ffff88006476bb98 ffffffff81049e60 ffff88006476bbd8
>> May  6 11:17:47 fatdrive kernel:  ffff880214300320 ffff880178aec120 0000000000000000 ffff880178aec208
>> May  6 11:17:47 fatdrive kernel: Call Trace:
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049f45>] warn_slowpath_null+0x15/0x20
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81351f04>] btrfs_select_ref_head+0x124/0x140
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f1f15>] __btrfs_run_delayed_refs+0x1f5/0x11f0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6cd>] ? check_preempt_curr+0x8d/0xa0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6f8>] ? ttwu_do_wakeup+0x18/0xd0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6eea>] btrfs_run_delayed_refs.part.65+0x6a/0x270
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>> May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c1 ]---
>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>> May  6 11:17:47 fatdrive kernel: BTRFS info (device sdm): forced readonly
>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>> May  6 11:18:16 fatdrive kernel: verify_parent_transid: 28 callbacks suppressed
>> May  6 11:18:16 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 7645497978880 wanted 148829 found 147711
>> 
> 
> This next one seems a little more obvious.
> 
> 0x12d4 is in btrfs_select_ref_head (fs/btrfs/delayed-ref.c:438).
> 433			head = rb_entry(node, struct btrfs_delayed_ref_head,
> 434					href_node);
> 435		}
> 436	
> 437		head->processing = 1;
> 438		WARN_ON(delayed_refs->num_heads_ready == 0);
> 439		delayed_refs->num_heads_ready--;
> 440		delayed_refs->run_delayed_start = head->node.bytenr +
> 441			head->node.num_bytes;
> 442		return head;
> 
> Because there are not heads ready, we are getting a warning.
> 
> So the real question is how do I know what is generating the IO error?  The rsync -nav is not changing the the volume, it is reading from it.
> 
> The devices are not showing errors in the log or in the "device stat" view.
> 
> What seems to be interesting is that whatever is generating the IO errors (which seems to be in the extents), triggers the Warning, which by default flips the whole volume to read-only.

Okay, so I ran btrfs-debug-tree /dev/sdm.  Most of the leaves looked okay until it got to one where it gave me this and quit:

	item 35 key (2685703434240 EXTENT_ITEM 4096) itemoff 2159 itemsize 51
		extent refs 1 gen 150619 flags TREE_BLOCK
		tree block key (4231221 DIR_ITEM 1140804831) level 0
		tree block backref root 5
parent transid verify failed on 94238015488 wanted 150690 found 150691
parent transid verify failed on 94238015488 wanted 150690 found 150691
parent transid verify failed on 94238015488 wanted 150690 found 150691
parent transid verify failed on 94238015488 wanted 150690 found 150691
Ignoring transid failure
print-tree.c:1091: btrfs_print_tree: Assertion failed.
btrfs-debug-tree[0x418539]
btrfs-debug-tree[0x41a53a]
btrfs-debug-tree(btrfs_print_tree+0x1eb)[0x41a40b]
btrfs-debug-tree(main+0x704)[0x407d04]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f2936a42a85]
btrfs-debug-tree[0x40812d]

So it looks like the IO error is really just a bad extent, which makes sense from the warnings from the kernel.

I will need to figure out why the Assertion failed.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-07 21:30   ` Anthony Plack
@ 2015-05-08 16:44     ` Anthony Plack
  2015-05-08 20:58       ` Duncan
                         ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Anthony Plack @ 2015-05-08 16:44 UTC (permalink / raw)
  To: linux-btrfs


> On May 7, 2015, at 4:30 PM, Anthony Plack <anthony@plack.net> wrote:
> 
> On May 7, 2015, at 9:33 AM, Anthony Plack <anthony@plack.net> wrote:
>> 
>> Okay, trying my programming skills at this unfamiliar code.
>> 
>> Okay so this looks like an IO error, then an abort transaction.  Is it a fair assumption that a normal transaction abort marks the volume as read-only, undoes the transaction, and then marks the transaction as read-write?  Therefore, because of the dump, in the middle of the undo, the volume remains read-only?
>> 
>>> On May 6, 2015, at 4:56 PM, Anthony Plack <anthony@plack.net> wrote:
>>> 
>>> Hey guys, I have some dump data from BTRFS.  This occurred when running rsync -nav to my backup server.  Data is all there (as far as I can tell about 9TB of data).  If I unmount the drive, and remount it, it will be writable (echo ‘b’ > b) for days, but I start the rsync and very quickly it will dump and switch to read-only.  Scrub finished successful (mount was still RW after completion) yesterday.  Let me know what else I can get you.
>>> 
>>> May  6 11:17:47 fatdrive kernel: verify_parent_transid: 19 callbacks suppressed
>>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
>>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
>>> May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
>>> May  6 11:17:47 fatdrive kernel: WARNING: CPU: 0 PID: 9426 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x5a/0x130()
>>> May  6 11:17:47 fatdrive kernel: BTRFS: Transaction aborted (error -5)
>>> May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
>>> May  6 11:17:47 fatdrive kernel: CPU: 0 PID: 9426 Comm: kworker/u12:4 Tainted: G        W  O    4.0.0-gentoo #1
>>> May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
>>> May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
>>> May  6 11:17:47 fatdrive kernel:  ffffffff81a9729e ffff8801704abc38 ffffffff817e6414 0000000000000000
>>> May  6 11:17:47 fatdrive kernel:  ffff8801704abc88 ffff8801704abc78 ffffffff81049e60 0000000000000000
>>> May  6 11:17:47 fatdrive kernel:  ffff8802142b22c0 ffff88020ce81800 00000000fffffffb ffffffff81869b50
>>> May  6 11:17:47 fatdrive kernel: Call Trace:
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049ee1>] warn_slowpath_fmt+0x41/0x50
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812d9e7a>] __btrfs_abort_transaction+0x5a/0x130
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6fa1>] btrfs_run_delayed_refs.part.65+0x121/0x270
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>>> May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c0 ]---
>>> May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
>>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
>> 
>> Not sure where to dig out btrfs_run_delayed_refs.part.65+0x121 as it is a structure (gdb is not a regular tool I use).  it seems that it is trying to run the delayed refs and something in the data for line 265 is wrong and generating a warning.
>> 
>> 260					errno);
>> 261		}
>> 262		trans->aborted = errno;
>> 263		/* Nothing used. The other threads that have joined this
>> 264		 * transaction may be able to continue. */
>> 265		if (!trans->blocks_used && list_empty(&trans->new_bgs)) {
>> 266			const char *errstr;
>> 267	
>> 268			errstr = btrfs_decode_error(errno);
>> 269			btrfs_warn(root->fs_info,
>> 
>> 
>> 
>>> May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
>>> May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
>>> May  6 11:17:47 fatdrive kernel: WARNING: CPU: 1 PID: 9414 at fs/btrfs/delayed-ref.c:438 btrfs_select_ref_head+0x124/0x140()
>>> May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
>>> May  6 11:17:47 fatdrive kernel: CPU: 1 PID: 9414 Comm: kworker/u12:0 Tainted: G        W  O    4.0.0-gentoo #1
>>> May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
>>> May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
>>> May  6 11:17:47 fatdrive kernel:  ffffffff81a98a02 ffff88006476bb58 ffffffff817e6414 0000000000000000
>>> May  6 11:17:47 fatdrive kernel:  0000000000000000 ffff88006476bb98 ffffffff81049e60 ffff88006476bbd8
>>> May  6 11:17:47 fatdrive kernel:  ffff880214300320 ffff880178aec120 0000000000000000 ffff880178aec208
>>> May  6 11:17:47 fatdrive kernel: Call Trace:
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049f45>] warn_slowpath_null+0x15/0x20
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81351f04>] btrfs_select_ref_head+0x124/0x140
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f1f15>] __btrfs_run_delayed_refs+0x1f5/0x11f0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6cd>] ? check_preempt_curr+0x8d/0xa0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6f8>] ? ttwu_do_wakeup+0x18/0xd0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6eea>] btrfs_run_delayed_refs.part.65+0x6a/0x270
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>>> May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c1 ]---
>>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>>> May  6 11:17:47 fatdrive kernel: BTRFS info (device sdm): forced readonly
>>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>>> May  6 11:18:16 fatdrive kernel: verify_parent_transid: 28 callbacks suppressed
>>> May  6 11:18:16 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 7645497978880 wanted 148829 found 147711
>>> 
>> 
>> This next one seems a little more obvious.
>> 
>> 0x12d4 is in btrfs_select_ref_head (fs/btrfs/delayed-ref.c:438).
>> 433			head = rb_entry(node, struct btrfs_delayed_ref_head,
>> 434					href_node);
>> 435		}
>> 436	
>> 437		head->processing = 1;
>> 438		WARN_ON(delayed_refs->num_heads_ready == 0);
>> 439		delayed_refs->num_heads_ready--;
>> 440		delayed_refs->run_delayed_start = head->node.bytenr +
>> 441			head->node.num_bytes;
>> 442		return head;
>> 
>> Because there are not heads ready, we are getting a warning.
>> 
>> So the real question is how do I know what is generating the IO error?  The rsync -nav is not changing the the volume, it is reading from it.
>> 
>> The devices are not showing errors in the log or in the "device stat" view.
>> 
>> What seems to be interesting is that whatever is generating the IO errors (which seems to be in the extents), triggers the Warning, which by default flips the whole volume to read-only.
> 
> Okay, so I ran btrfs-debug-tree /dev/sdm.  Most of the leaves looked okay until it got to one where it gave me this and quit:
> 
> 	item 35 key (2685703434240 EXTENT_ITEM 4096) itemoff 2159 itemsize 51
> 		extent refs 1 gen 150619 flags TREE_BLOCK
> 		tree block key (4231221 DIR_ITEM 1140804831) level 0
> 		tree block backref root 5
> parent transid verify failed on 94238015488 wanted 150690 found 150691
> parent transid verify failed on 94238015488 wanted 150690 found 150691
> parent transid verify failed on 94238015488 wanted 150690 found 150691
> parent transid verify failed on 94238015488 wanted 150690 found 150691
> Ignoring transid failure
> print-tree.c:1091: btrfs_print_tree: Assertion failed.
> btrfs-debug-tree[0x418539]
> btrfs-debug-tree[0x41a53a]
> btrfs-debug-tree(btrfs_print_tree+0x1eb)[0x41a40b]
> btrfs-debug-tree(main+0x704)[0x407d04]
> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f2936a42a85]
> btrfs-debug-tree[0x40812d]
> 
> So it looks like the IO error is really just a bad extent, which makes sense from the warnings from the kernel.
> 

Okay, updated the btrfs-progs to 4.0.  No change.  Scrub runs fine, 6 uncorrectable errors.

I have a new drive, and the worst drive in the list is /dev/sdl.  So I started a replace operation on sdl figuring that if I can replace the drive causing the bad IO, I can fix the issue.

btrfs replace start -f /dev/sdu /dev/sdl /mnt/data

This generated another warning (and read-only location)

May  8 11:22:05 fatdrive kernel: BTRFS: dev_replace from /dev/sdl (devid 10) to /dev/sdu started
May  8 11:22:13 fatdrive kernel: BTRFS: btrfs_scrub_dev(/dev/sdl, 10, /dev/sdu) failed -5
May  8 11:22:13 fatdrive kernel: ------------[ cut here ]------------
May  8 11:22:13 fatdrive kernel: WARNING: CPU: 0 PID: 8719 at fs/btrfs/dev-replace.c:425 btrfs_dev_replace_start+0x345/0x370()
May  8 11:22:13 fatdrive kernel: Modules linked in: ntfs nfsd exportfs r8169 mii kvm_amd kvm k10temp ata_generic acpi_cpufreq
May  8 11:22:13 fatdrive kernel: CPU: 0 PID: 8719 Comm: btrfs Tainted: G        W       4.0.0-gentoo #2
May  8 11:22:13 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
May  8 11:22:13 fatdrive kernel:  ffffffff81a98eea ffff880214a9bcb8 ffffffff817e6934 0000000000000000
May  8 11:22:13 fatdrive kernel:  0000000000000000 ffff880214a9bcf8 ffffffff81049e60 ffff880214a9bcc8
May  8 11:22:13 fatdrive kernel:  ffff88020fdf2000 00000000fffffffb ffff88020c89d000 ffff88020c89dda8
May  8 11:22:13 fatdrive kernel: Call Trace:
May  8 11:22:13 fatdrive kernel:  [<ffffffff817e6934>] dump_stack+0x45/0x57
May  8 11:22:13 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
May  8 11:22:13 fatdrive kernel:  [<ffffffff81049f45>] warn_slowpath_null+0x15/0x20
May  8 11:22:13 fatdrive kernel:  [<ffffffff81377d75>] btrfs_dev_replace_start+0x345/0x370
May  8 11:22:13 fatdrive kernel:  [<ffffffff8133f41b>] btrfs_ioctl+0x1a2b/0x2a90
May  8 11:22:13 fatdrive kernel:  [<ffffffff8110e92b>] ? unlock_page+0x6b/0x70
May  8 11:22:13 fatdrive kernel:  [<ffffffff8110f30a>] ? filemap_map_pages+0x1ba/0x210
May  8 11:22:13 fatdrive kernel:  [<ffffffff811349c3>] ? handle_mm_fault+0x733/0xef0
May  8 11:22:13 fatdrive kernel:  [<ffffffff81170b00>] do_vfs_ioctl+0x2e0/0x4e0
May  8 11:22:13 fatdrive kernel:  [<ffffffff813957bf>] ? file_has_perm+0x8f/0xa0
May  8 11:22:13 fatdrive kernel:  [<ffffffff81170d89>] SyS_ioctl+0x89/0xa0
May  8 11:22:13 fatdrive kernel:  [<ffffffff8103e9fc>] ? do_page_fault+0xc/0x10
May  8 11:22:13 fatdrive kernel:  [<ffffffff817eea32>] system_call_fastpath+0x12/0x17
May  8 11:22:13 fatdrive kernel: ---[ end trace 205223c7c401157e ]---
May  8 11:25:39 fatdrive kernel: device: 'btrfs-6': device_unregister
May  8 11:25:39 fatdrive kernel: PM: Removing info for No Bus:btrfs-6
May  8 11:25:39 fatdrive kernel: device: 'btrfs-6': device_create_release


Now I need to figure out error -5.  The Warning is a thrown warning.

        /* the disk copy procedure reuses the scrub code */
        ret = btrfs_scrub_dev(fs_info, src_device->devid, 0,
                              btrfs_device_get_total_bytes(src_device),
                              &dev_replace->scrub_progress, 0, 1);

        ret = btrfs_dev_replace_finishing(root->fs_info, ret);
        /* don't warn if EINPROGRESS, someone else might be running scrub */
        if (ret == -EINPROGRESS) {
                args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
                ret = 0;
        } else {
                WARN_ON(ret);
        }


I am close to zeroing the log at this point.  It is sad that we cannot fix the transaction log instead of just flushing it.

Once again btrfsck --repair /dev/sdm ended with

parent transid verify failed on 94238015488 wanted 150690 found 150691
Ignoring transid failure

no attempt to actually repair the volume.  No indication from the tools why.

Maybe I am over hoping for a COW transaction system to actually have the ability to fix transaction issues since there is little to no documentation other than zero the log.  I am also wondering why we have a log in the file system if the fix is to just flush it.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-08 16:44     ` Anthony Plack
@ 2015-05-08 20:58       ` Duncan
  2015-05-08 21:37         ` Anthony Plack
  2015-05-08 21:18       ` Hugo Mills
  2015-05-08 22:01       ` Kernel Dump scanning directory - New Scary Issue Anthony Plack
  2 siblings, 1 reply; 14+ messages in thread
From: Duncan @ 2015-05-08 20:58 UTC (permalink / raw)
  To: linux-btrfs

Anthony Plack posted on Fri, 08 May 2015 11:44:39 -0500 as excerpted:

> Maybe I am over hoping for a COW transaction system to actually have the
> ability to fix transaction issues since there is little to no
> documentation other than zero the log.  I am also wondering why we have
> a log in the file system if the fix is to just flush it.

I'm not a coder, but for quite some time wondered the same thing, plus 
why a log at all if it was COW and atomic commits... which you may know, 
at least at the level I was wondering about it.

Then a dev chanced a comment that made it much clearer to me.  Hopefully 
I didn't misunderstand something and the below clarifies things for you 
as it did for me, or at least for others reading if being a coder, you 
already had this high level of understanding and were talking about 
details of the implementation...

Btrfs is indeed atomic commit, but in the interest of efficiency, it 
doesn't commit every little change /all/ the way up the tree.  Instead, 
commit time is every 30 seconds by default, with the log collecting 
changes between commits, ideally at least, allowing a replay of 
everything that been partially written since the last commit that hasn't 
been committed yet -- the changes should be on device in a partially 
written tree, and the log should allow that partially written tree to be 
fully committed, thus saving all but perhaps the last actually in-process 
write at the moment of the crash.

With that, the log actually made sense to me, it's just a log of what 
hasn't been fully committed yet, given that in the interest of 
efficiency, a full commit is only done every 30 seconds.

And then I wasn't quite so concerned either about losing something 
critical in the log, or the safety of simply blowing it away, since by 
design, the log only deals with what hasn't been committed yet, and the 
commits are themselves designed to be atomic and leave the filesystem in 
a known good state.

But beyond that, at my non-coding sysadmin level anyway, I agree with 
where you seem to be taking this.  Currently, there's little to nothing 
documented or in the output about that log, and it's either keep it all 
or blow it away.  How much nicer it might be to have a bit more detail 
available, and to be able to actually trace the log and apply or blow 
away individual partial transactions as necessary, with the trace 
available so at least coder-level folks could trace exactly why the 
replay either committed or discarded each between-commits partial 
transaction, repairing the log instead of whole-hog commit or delete.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-08 16:44     ` Anthony Plack
  2015-05-08 20:58       ` Duncan
@ 2015-05-08 21:18       ` Hugo Mills
  2015-05-08 21:49         ` Anthony Plack
  2015-05-08 22:01       ` Kernel Dump scanning directory - New Scary Issue Anthony Plack
  2 siblings, 1 reply; 14+ messages in thread
From: Hugo Mills @ 2015-05-08 21:18 UTC (permalink / raw)
  To: Anthony Plack; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 8854 bytes --]

On Fri, May 08, 2015 at 11:44:39AM -0500, Anthony Plack wrote:
> > On May 7, 2015, at 4:30 PM, Anthony Plack <anthony@plack.net> wrote:
> > On May 7, 2015, at 9:33 AM, Anthony Plack <anthony@plack.net> wrote:
> >> Okay, trying my programming skills at this unfamiliar code.
> >> 
> >> Okay so this looks like an IO error, then an abort transaction.  Is it a fair assumption that a normal transaction abort marks the volume as read-only, undoes the transaction, and then marks the transaction as read-write?  Therefore, because of the dump, in the middle of the undo, the volume remains read-only?
> >> 
> >>> On May 6, 2015, at 4:56 PM, Anthony Plack <anthony@plack.net> wrote:
> >> 
> >> This next one seems a little more obvious.
> >> 
> >> 0x12d4 is in btrfs_select_ref_head (fs/btrfs/delayed-ref.c:438).
> >> 433			head = rb_entry(node, struct btrfs_delayed_ref_head,
> >> 434					href_node);
> >> 435		}
> >> 436	
> >> 437		head->processing = 1;
> >> 438		WARN_ON(delayed_refs->num_heads_ready == 0);
> >> 439		delayed_refs->num_heads_ready--;
> >> 440		delayed_refs->run_delayed_start = head->node.bytenr +
> >> 441			head->node.num_bytes;
> >> 442		return head;
> >> 
> >> Because there are not heads ready, we are getting a warning.
> >> 
> >> So the real question is how do I know what is generating the IO error?  The rsync -nav is not changing the the volume, it is reading from it.
> >> 
> >> The devices are not showing errors in the log or in the "device stat" view.
> >> 
> >> What seems to be interesting is that whatever is generating the IO errors (which seems to be in the extents), triggers the Warning, which by default flips the whole volume to read-only.
> > 
> > Okay, so I ran btrfs-debug-tree /dev/sdm.  Most of the leaves looked okay until it got to one where it gave me this and quit:
> > 
> > 	item 35 key (2685703434240 EXTENT_ITEM 4096) itemoff 2159 itemsize 51
> > 		extent refs 1 gen 150619 flags TREE_BLOCK
> > 		tree block key (4231221 DIR_ITEM 1140804831) level 0
> > 		tree block backref root 5
> > parent transid verify failed on 94238015488 wanted 150690 found 150691
> > parent transid verify failed on 94238015488 wanted 150690 found 150691
> > parent transid verify failed on 94238015488 wanted 150690 found 150691
> > parent transid verify failed on 94238015488 wanted 150690 found 150691
> > Ignoring transid failure
> > print-tree.c:1091: btrfs_print_tree: Assertion failed.
> > btrfs-debug-tree[0x418539]
> > btrfs-debug-tree[0x41a53a]
> > btrfs-debug-tree(btrfs_print_tree+0x1eb)[0x41a40b]
> > btrfs-debug-tree(main+0x704)[0x407d04]
> > /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f2936a42a85]
> > btrfs-debug-tree[0x40812d]
> > 
> > So it looks like the IO error is really just a bad extent, which makes sense from the warnings from the kernel.
> 
> Okay, updated the btrfs-progs to 4.0.  No change.  Scrub runs fine, 6 uncorrectable errors.
> 
> I have a new drive, and the worst drive in the list is /dev/sdl.  So I started a replace operation on sdl figuring that if I can replace the drive causing the bad IO, I can fix the issue.
> 
> btrfs replace start -f /dev/sdu /dev/sdl /mnt/data
> 
> This generated another warning (and read-only location)
> 
> May  8 11:22:05 fatdrive kernel: BTRFS: dev_replace from /dev/sdl (devid 10) to /dev/sdu started
> May  8 11:22:13 fatdrive kernel: BTRFS: btrfs_scrub_dev(/dev/sdl, 10, /dev/sdu) failed -5
> May  8 11:22:13 fatdrive kernel: ------------[ cut here ]------------
> May  8 11:22:13 fatdrive kernel: WARNING: CPU: 0 PID: 8719 at fs/btrfs/dev-replace.c:425 btrfs_dev_replace_start+0x345/0x370()
> May  8 11:22:13 fatdrive kernel: Modules linked in: ntfs nfsd exportfs r8169 mii kvm_amd kvm k10temp ata_generic acpi_cpufreq
> May  8 11:22:13 fatdrive kernel: CPU: 0 PID: 8719 Comm: btrfs Tainted: G        W       4.0.0-gentoo #2
> May  8 11:22:13 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
> May  8 11:22:13 fatdrive kernel:  ffffffff81a98eea ffff880214a9bcb8 ffffffff817e6934 0000000000000000
> May  8 11:22:13 fatdrive kernel:  0000000000000000 ffff880214a9bcf8 ffffffff81049e60 ffff880214a9bcc8
> May  8 11:22:13 fatdrive kernel:  ffff88020fdf2000 00000000fffffffb ffff88020c89d000 ffff88020c89dda8
> May  8 11:22:13 fatdrive kernel: Call Trace:
> May  8 11:22:13 fatdrive kernel:  [<ffffffff817e6934>] dump_stack+0x45/0x57
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81049f45>] warn_slowpath_null+0x15/0x20
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81377d75>] btrfs_dev_replace_start+0x345/0x370
> May  8 11:22:13 fatdrive kernel:  [<ffffffff8133f41b>] btrfs_ioctl+0x1a2b/0x2a90
> May  8 11:22:13 fatdrive kernel:  [<ffffffff8110e92b>] ? unlock_page+0x6b/0x70
> May  8 11:22:13 fatdrive kernel:  [<ffffffff8110f30a>] ? filemap_map_pages+0x1ba/0x210
> May  8 11:22:13 fatdrive kernel:  [<ffffffff811349c3>] ? handle_mm_fault+0x733/0xef0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81170b00>] do_vfs_ioctl+0x2e0/0x4e0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff813957bf>] ? file_has_perm+0x8f/0xa0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81170d89>] SyS_ioctl+0x89/0xa0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff8103e9fc>] ? do_page_fault+0xc/0x10
> May  8 11:22:13 fatdrive kernel:  [<ffffffff817eea32>] system_call_fastpath+0x12/0x17
> May  8 11:22:13 fatdrive kernel: ---[ end trace 205223c7c401157e ]---
> May  8 11:25:39 fatdrive kernel: device: 'btrfs-6': device_unregister
> May  8 11:25:39 fatdrive kernel: PM: Removing info for No Bus:btrfs-6
> May  8 11:25:39 fatdrive kernel: device: 'btrfs-6': device_create_release
> 
> 
> Now I need to figure out error -5.  The Warning is a thrown warning.
> 
>         /* the disk copy procedure reuses the scrub code */
>         ret = btrfs_scrub_dev(fs_info, src_device->devid, 0,
>                               btrfs_device_get_total_bytes(src_device),
>                               &dev_replace->scrub_progress, 0, 1);
> 
>         ret = btrfs_dev_replace_finishing(root->fs_info, ret);
>         /* don't warn if EINPROGRESS, someone else might be running scrub */
>         if (ret == -EINPROGRESS) {
>                 args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
>                 ret = 0;
>         } else {
>                 WARN_ON(ret);
>         }
> 
> 
> I am close to zeroing the log at this point.  It is sad that we cannot fix the transaction log instead of just flushing it.

   That's not likely to make any difference here. If the FS mounts OK
with -o ro, then zeroing the log might be useful. Otherwise, it won't
e (a read-only mount won't replay the log; if it mounts without
replaying the log, the problem is in the log, and it should be
dropped; if not, then the problem is nothing to do with the log).

   Dropping the log is *not* a panacea to all btrfs problems. It's
there for a very limited class of issues, which rarely show up these
days. I wish there were some way of editing the internet(*) to remove the
idea that btrfs-zero-log will magically fix everything. It won't.

(*) https://www.xkcd.com/386/ is probably appropriate here.

> Once again btrfsck --repair /dev/sdm ended with
> 
> parent transid verify failed on 94238015488 wanted 150690 found 150691
> Ignoring transid failure
> 
> no attempt to actually repair the volume.  No indication from the tools why.

   A transid failure means that the superblock has been written out to
disk *before* a part of the metadata that forms that transaction, and
then the machine has crashed in some way that prevented the
late-arriving metadata from hitting the disk. There are two ways that
this can happen: it's a bug in the kernel, or the hardware lies about
having written data. Both are possible, but the former is more likely.

   Once this failure has happened, the FS is corrupt in a way that's
hard to repair reliably. I did raise this question with Chris a while
ago, and my understanding from the conversation was that he didn't
think that it was possible to fix transid failures in btrfsck.

> Maybe I am over hoping for a COW transaction system to actually have
> the ability to fix transaction issues since there is little to no
> documentation other than zero the log.  I am also wondering why we
> have a log in the file system if the fix is to just flush it.

   Once again: zeroing the log won't help. It doesn't fix everything.
In fact, it rarely fixes anything.

   The reason there's no documentation on fixing transid failures is
because there's no good fix for them.

   Hugo.

-- 
Hugo Mills             | To an Englishman, 100 miles is a long way; to an
hugo@... carfax.org.uk | American, 100 years is a long time.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                        Earle Hitchner

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-08 20:58       ` Duncan
@ 2015-05-08 21:37         ` Anthony Plack
  2015-05-08 22:06           ` Duncan
  2015-05-08 22:06           ` Noah Massey
  0 siblings, 2 replies; 14+ messages in thread
From: Anthony Plack @ 2015-05-08 21:37 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

> On May 8, 2015, at 3:58 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> 
> And then I wasn't quite so concerned either about losing something 
> critical in the log, or the safety of simply blowing it away, since by 
> design, the log only deals with what hasn't been committed yet, and the 
> commits are themselves designed to be atomic and leave the filesystem in 
> a known good state.
> 

Duncan,
Thank you for joining me in my monolog here.

30 seconds on a server can be allot of data, and is to small of a time span for adequate backups to occur.  What you say makes sense, however, without this fixed going forward, BTRFS can never become a serious data storage file system.

While there are major distros who have switched to this as their default filesystem, and acknowledging that btrfs has no control over who chooses to do this, there should be more than just a warning about it being unstable, but wholly unreliable.

I do get the point that you are relaying what you understood, and it does help me a bit.  Since I have a backup (and always I would like another of the data), I thought I would try to help solve an ongoing problem of the tranid issues.  I realize I am just stumbling around in the dark, so maybe this comes for nothing in the end.

Maybe we need a warning at the top of the BTRFS page that it is highly likely, if BTRFS crashes and the transactions get corrupt, that you WILL loose at a minimum the last 30 seconds of your data and maybe more.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-08 21:18       ` Hugo Mills
@ 2015-05-08 21:49         ` Anthony Plack
  2015-05-08 22:37           ` Hugo Mills
  0 siblings, 1 reply; 14+ messages in thread
From: Anthony Plack @ 2015-05-08 21:49 UTC (permalink / raw)
  To: Hugo Mills; +Cc: linux-btrfs


> On May 8, 2015, at 4:18 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> 
> On Fri, May 08, 2015 at 11:44:39AM -0500, Anthony Plack wrote:
>> I am close to zeroing the log at this point.  It is sad that we cannot fix the transaction log instead of just flushing it.
> 
>   That's not likely to make any difference here. If the FS mounts OK
> with -o ro, then zeroing the log might be useful. Otherwise, it won't
> e (a read-only mount won't replay the log; if it mounts without
> replaying the log, the problem is in the log, and it should be
> dropped; if not, then the problem is nothing to do with the log).

The FS mounts okay no matter what, it is trying to read the inodes where it crashes.  Thanks for the advise.

>   Dropping the log is *not* a panacea to all btrfs problems. It's
> there for a very limited class of issues, which rarely show up these
> days. I wish there were some way of editing the internet(*) to remove the
> idea that btrfs-zero-log will magically fix everything. It won't.

Good to know.

> (*) https://www.xkcd.com/386/ is probably appropriate here.
> 
>> Once again btrfsck --repair /dev/sdm ended with
>> 
>> parent transid verify failed on 94238015488 wanted 150690 found 150691
>> Ignoring transid failure
>> 
>> no attempt to actually repair the volume.  No indication from the tools why.
> 
>   A transid failure means that the superblock has been written out to
> disk *before* a part of the metadata that forms that transaction, and
> then the machine has crashed in some way that prevented the
> late-arriving metadata from hitting the disk. There are two ways that
> this can happen: it's a bug in the kernel, or the hardware lies about
> having written data. Both are possible, but the former is more likely.

Also good to know.  I agree to the bug in the kernel.  I think you just hit on the issue here.  Since this COW, we should be able to assume that the superblock is then still "empty."  Or do I misunderstand?

> 
>   Once this failure has happened, the FS is corrupt in a way that's
> hard to repair reliably. I did raise this question with Chris a while
> ago, and my understanding from the conversation was that he didn't
> think that it was possible to fix transid failures in btrfsck.

I guess I don't understand how that could be designed that way.

>   Once again: zeroing the log won't help. It doesn't fix everything.
> In fact, it rarely fixes anything.

>   The reason there's no documentation on fixing transid failures is
> because there's no good fix for them.

Then what is the point of the transactions?  Why do we care about transid mismatch?  Why keep something, that if it fails, breaks the whole thing?

Thanks for the thoughts.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory - New Scary Issue
  2015-05-08 16:44     ` Anthony Plack
  2015-05-08 20:58       ` Duncan
  2015-05-08 21:18       ` Hugo Mills
@ 2015-05-08 22:01       ` Anthony Plack
  2 siblings, 0 replies; 14+ messages in thread
From: Anthony Plack @ 2015-05-08 22:01 UTC (permalink / raw)
  To: linux-btrfs


> On May 8, 2015, at 11:44 AM, Anthony Plack <anthony@plack.net> wrote:
> 
>> 
>> On May 7, 2015, at 4:30 PM, Anthony Plack <anthony@plack.net> wrote:
>> 
>> On May 7, 2015, at 9:33 AM, Anthony Plack <anthony@plack.net> wrote:
>>> 
>>> Okay, trying my programming skills at this unfamiliar code.
>>> 
>>> Okay so this looks like an IO error, then an abort transaction.  Is it a fair assumption that a normal transaction abort marks the volume as read-only, undoes the transaction, and then marks the transaction as read-write?  Therefore, because of the dump, in the middle of the undo, the volume remains read-only?
>>> 
>>>> On May 6, 2015, at 4:56 PM, Anthony Plack <anthony@plack.net> wrote:
>>>> 
>>>> Hey guys, I have some dump data from BTRFS.  This occurred when running rsync -nav to my backup server.  Data is all there (as far as I can tell about 9TB of data).  If I unmount the drive, and remount it, it will be writable (echo ‘b’ > b) for days, but I start the rsync and very quickly it will dump and switch to read-only.  Scrub finished successful (mount was still RW after completion) yesterday.  Let me know what else I can get you.
>>>> 
>>>> May  6 11:17:47 fatdrive kernel: verify_parent_transid: 19 callbacks suppressed
>>>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
>>>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
>>>> May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
>>>> May  6 11:17:47 fatdrive kernel: WARNING: CPU: 0 PID: 9426 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x5a/0x130()
>>>> May  6 11:17:47 fatdrive kernel: BTRFS: Transaction aborted (error -5)
>>>> May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
>>>> May  6 11:17:47 fatdrive kernel: CPU: 0 PID: 9426 Comm: kworker/u12:4 Tainted: G        W  O    4.0.0-gentoo #1
>>>> May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015 11/11/2011
>>>> May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
>>>> May  6 11:17:47 fatdrive kernel:  ffffffff81a9729e ffff8801704abc38 ffffffff817e6414 0000000000000000
>>>> May  6 11:17:47 fatdrive kernel:  ffff8801704abc88 ffff8801704abc78 ffffffff81049e60 0000000000000000
>>>> May  6 11:17:47 fatdrive kernel:  ffff8802142b22c0 ffff88020ce81800 00000000fffffffb ffffffff81869b50
>>>> May  6 11:17:47 fatdrive kernel: Call Trace:
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049ee1>] warn_slowpath_fmt+0x41/0x50
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812d9e7a>] __btrfs_abort_transaction+0x5a/0x130
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6fa1>] btrfs_run_delayed_refs.part.65+0x121/0x270
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>>>> May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c0 ]---
>>>> May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
>>>> May  6 11:17:47 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 94051291136 wanted 150686 found 150687
>>> 
>>> Not sure where to dig out btrfs_run_delayed_refs.part.65+0x121 as it is a structure (gdb is not a regular tool I use).  it seems that it is trying to run the delayed refs and something in the data for line 265 is wrong and generating a warning.
>>> 
>>> 260					errno);
>>> 261		}
>>> 262		trans->aborted = errno;
>>> 263		/* Nothing used. The other threads that have joined this
>>> 264		 * transaction may be able to continue. */
>>> 265		if (!trans->blocks_used && list_empty(&trans->new_bgs)) {
>>> 266			const char *errstr;
>>> 267	
>>> 268			errstr = btrfs_decode_error(errno);
>>> 269			btrfs_warn(root->fs_info,
>>> 
>>> 
>>> 
>>>> May  6 11:17:47 fatdrive kernel: BTRFS warning (device sdm): btrfs_run_delayed_refs:2790: Aborting unused transaction(IO failure).
>>>> May  6 11:17:47 fatdrive kernel: ------------[ cut here ]------------
>>>> May  6 11:17:47 fatdrive kernel: WARNING: CPU: 1 PID: 9414 at fs/btrfs/delayed-ref.c:438 btrfs_select_ref_head+0x124/0x140()
>>>> May  6 11:17:47 fatdrive kernel: Modules linked in: nfsd exportfs kvm_amd r8168(O) kvm k10temp ata_generic acpi_cpufreq
>>>> May  6 11:17:47 fatdrive kernel: CPU: 1 PID: 9414 Comm: kworker/u12:0 Tainted: G        W  O    4.0.0-gentoo #1
>>>> May  6 11:17:47 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015 11/11/2011
>>>> May  6 11:17:47 fatdrive kernel: Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
>>>> May  6 11:17:47 fatdrive kernel:  ffffffff81a98a02 ffff88006476bb58 ffffffff817e6414 0000000000000000
>>>> May  6 11:17:47 fatdrive kernel:  0000000000000000 ffff88006476bb98 ffffffff81049e60 ffff88006476bbd8
>>>> May  6 11:17:47 fatdrive kernel:  ffff880214300320 ffff880178aec120 0000000000000000 ffff880178aec208
>>>> May  6 11:17:47 fatdrive kernel: Call Trace:
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817e6414>] dump_stack+0x45/0x57
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81049f45>] warn_slowpath_null+0x15/0x20
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81351f04>] btrfs_select_ref_head+0x124/0x140
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f1f15>] __btrfs_run_delayed_refs+0x1f5/0x11f0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6cd>] ? check_preempt_curr+0x8d/0xa0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8106e6f8>] ? ttwu_do_wakeup+0x18/0xd0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f6eea>] btrfs_run_delayed_refs.part.65+0x6a/0x270
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff812f7190>] delayed_ref_async_start+0x80/0xa0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81336e07>] normal_work_helper+0xb7/0x2b0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff8133711d>] btrfs_extent_refs_helper+0xd/0x10
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff810606c7>] process_one_work+0x147/0x3f0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060a88>] worker_thread+0x118/0x440
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81060970>] ? process_one_work+0x3f0/0x3f0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065624>] kthread+0xc4/0xe0
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff817ee488>] ret_from_fork+0x58/0x90
>>>> May  6 11:17:47 fatdrive kernel:  [<ffffffff81065560>] ? kthread_create_on_node+0x190/0x190
>>>> May  6 11:17:47 fatdrive kernel: ---[ end trace f948e8ffd1f550c1 ]---
>>>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>>>> May  6 11:17:47 fatdrive kernel: BTRFS info (device sdm): forced readonly
>>>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>>>> May  6 11:17:47 fatdrive kernel: BTRFS: error (device sdm) in btrfs_run_delayed_refs:2790: errno=-5 IO failure
>>>> May  6 11:18:16 fatdrive kernel: verify_parent_transid: 28 callbacks suppressed
>>>> May  6 11:18:16 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 7645497978880 wanted 148829 found 147711
>>>> 
>>> 
>>> This next one seems a little more obvious.
>>> 
>>> 0x12d4 is in btrfs_select_ref_head (fs/btrfs/delayed-ref.c:438).
>>> 433			head = rb_entry(node, struct btrfs_delayed_ref_head,
>>> 434					href_node);
>>> 435		}
>>> 436	
>>> 437		head->processing = 1;
>>> 438		WARN_ON(delayed_refs->num_heads_ready == 0);
>>> 439		delayed_refs->num_heads_ready--;
>>> 440		delayed_refs->run_delayed_start = head->node.bytenr +
>>> 441			head->node.num_bytes;
>>> 442		return head;
>>> 
>>> Because there are not heads ready, we are getting a warning.
>>> 
>>> So the real question is how do I know what is generating the IO error?  The rsync -nav is not changing the the volume, it is reading from it.
>>> 
>>> The devices are not showing errors in the log or in the "device stat" view.
>>> 
>>> What seems to be interesting is that whatever is generating the IO errors (which seems to be in the extents), triggers the Warning, which by default flips the whole volume to read-only.
>> 
>> Okay, so I ran btrfs-debug-tree /dev/sdm.  Most of the leaves looked okay until it got to one where it gave me this and quit:
>> 
>> 	item 35 key (2685703434240 EXTENT_ITEM 4096) itemoff 2159 itemsize 51
>> 		extent refs 1 gen 150619 flags TREE_BLOCK
>> 		tree block key (4231221 DIR_ITEM 1140804831) level 0
>> 		tree block backref root 5
>> parent transid verify failed on 94238015488 wanted 150690 found 150691
>> parent transid verify failed on 94238015488 wanted 150690 found 150691
>> parent transid verify failed on 94238015488 wanted 150690 found 150691
>> parent transid verify failed on 94238015488 wanted 150690 found 150691
>> Ignoring transid failure
>> print-tree.c:1091: btrfs_print_tree: Assertion failed.
>> btrfs-debug-tree[0x418539]
>> btrfs-debug-tree[0x41a53a]
>> btrfs-debug-tree(btrfs_print_tree+0x1eb)[0x41a40b]
>> btrfs-debug-tree(main+0x704)[0x407d04]
>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f2936a42a85]
>> btrfs-debug-tree[0x40812d]
>> 
>> So it looks like the IO error is really just a bad extent, which makes sense from the warnings from the kernel.
>> 
> 
> Okay, updated the btrfs-progs to 4.0.  No change.  Scrub runs fine, 6 uncorrectable errors.
> 
> I have a new drive, and the worst drive in the list is /dev/sdl.  So I started a replace operation on sdl figuring that if I can replace the drive causing the bad IO, I can fix the issue.
> 
> btrfs replace start -f /dev/sdu /dev/sdl /mnt/data
> 
> This generated another warning (and read-only location)
> 
> May  8 11:22:05 fatdrive kernel: BTRFS: dev_replace from /dev/sdl (devid 10) to /dev/sdu started
> May  8 11:22:13 fatdrive kernel: BTRFS: btrfs_scrub_dev(/dev/sdl, 10, /dev/sdu) failed -5
> May  8 11:22:13 fatdrive kernel: ------------[ cut here ]------------
> May  8 11:22:13 fatdrive kernel: WARNING: CPU: 0 PID: 8719 at fs/btrfs/dev-replace.c:425 btrfs_dev_replace_start+0x345/0x370()
> May  8 11:22:13 fatdrive kernel: Modules linked in: ntfs nfsd exportfs r8169 mii kvm_amd kvm k10temp ata_generic acpi_cpufreq
> May  8 11:22:13 fatdrive kernel: CPU: 0 PID: 8719 Comm: btrfs Tainted: G        W       4.0.0-gentoo #2
> May  8 11:22:13 fatdrive kernel: Hardware name: BIOSTAR Group TA880G HD/TA880G HD, BIOS 080015  11/11/2011
> May  8 11:22:13 fatdrive kernel:  ffffffff81a98eea ffff880214a9bcb8 ffffffff817e6934 0000000000000000
> May  8 11:22:13 fatdrive kernel:  0000000000000000 ffff880214a9bcf8 ffffffff81049e60 ffff880214a9bcc8
> May  8 11:22:13 fatdrive kernel:  ffff88020fdf2000 00000000fffffffb ffff88020c89d000 ffff88020c89dda8
> May  8 11:22:13 fatdrive kernel: Call Trace:
> May  8 11:22:13 fatdrive kernel:  [<ffffffff817e6934>] dump_stack+0x45/0x57
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81049e60>] warn_slowpath_common+0x90/0xd0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81049f45>] warn_slowpath_null+0x15/0x20
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81377d75>] btrfs_dev_replace_start+0x345/0x370
> May  8 11:22:13 fatdrive kernel:  [<ffffffff8133f41b>] btrfs_ioctl+0x1a2b/0x2a90
> May  8 11:22:13 fatdrive kernel:  [<ffffffff8110e92b>] ? unlock_page+0x6b/0x70
> May  8 11:22:13 fatdrive kernel:  [<ffffffff8110f30a>] ? filemap_map_pages+0x1ba/0x210
> May  8 11:22:13 fatdrive kernel:  [<ffffffff811349c3>] ? handle_mm_fault+0x733/0xef0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81170b00>] do_vfs_ioctl+0x2e0/0x4e0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff813957bf>] ? file_has_perm+0x8f/0xa0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff81170d89>] SyS_ioctl+0x89/0xa0
> May  8 11:22:13 fatdrive kernel:  [<ffffffff8103e9fc>] ? do_page_fault+0xc/0x10
> May  8 11:22:13 fatdrive kernel:  [<ffffffff817eea32>] system_call_fastpath+0x12/0x17
> May  8 11:22:13 fatdrive kernel: ---[ end trace 205223c7c401157e ]---
> May  8 11:25:39 fatdrive kernel: device: 'btrfs-6': device_unregister
> May  8 11:25:39 fatdrive kernel: PM: Removing info for No Bus:btrfs-6
> May  8 11:25:39 fatdrive kernel: device: 'btrfs-6': device_create_release
> 
> 
> Now I need to figure out error -5.  The Warning is a thrown warning.
> 
>        /* the disk copy procedure reuses the scrub code */
>        ret = btrfs_scrub_dev(fs_info, src_device->devid, 0,
>                              btrfs_device_get_total_bytes(src_device),
>                              &dev_replace->scrub_progress, 0, 1);
> 
>        ret = btrfs_dev_replace_finishing(root->fs_info, ret);
>        /* don't warn if EINPROGRESS, someone else might be running scrub */
>        if (ret == -EINPROGRESS) {
>                args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
>                ret = 0;
>        } else {
>                WARN_ON(ret);
>        }
> 
> 
> I am close to zeroing the log at this point.  It is sad that we cannot fix the transaction log instead of just flushing it.
> 
> Once again btrfsck --repair /dev/sdm ended with
> 
> parent transid verify failed on 94238015488 wanted 150690 found 150691
> Ignoring transid failure
> 
> no attempt to actually repair the volume.  No indication from the tools why.
> 
> Maybe I am over hoping for a COW transaction system to actually have the ability to fix transaction issues since there is little to no documentation other than zero the log.  I am also wondering why we have a log in the file system if the fix is to just flush it.

Okay, new twist here, and this is even more scary than the tranid crashing the system.  I have 4 additional drives which I was going to place on a RAID5 mdadm array with BTRFS on-top.  The goal was to get these running and try btrfs recover on the damaged filesystem.  So here is what happens:

fatdrive data # umount /mnt/data
umount: /mnt/data: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)
fatdrive data # btrfs fi sh /mnt/data
Label: 'data'  uuid: f591ac13-1a69-476d-bd30-346f87a491da
	Total devices 11 FS bytes used 9.85TiB
	devid    1 size 1.82TiB used 1.82TiB path /dev/sdm
	devid    2 size 1.82TiB used 1.82TiB path /dev/sdh
	devid    3 size 1.82TiB used 1.82TiB path /dev/sdg
	devid    4 size 1.82TiB used 1.81TiB path /dev/sdf
	devid    5 size 1.82TiB used 1.81TiB path /dev/sdc
	devid    6 size 1.82TiB used 1.82TiB path /dev/sdb
	devid    7 size 1.82TiB used 1.82TiB path /dev/sdk
	devid    8 size 1.82TiB used 1.82TiB path /dev/sdd
	devid    9 size 2.73TiB used 2.63TiB path /dev/sde
	devid   10 size 1.82TiB used 1.82TiB path /dev/sdl
	devid   11 size 1.82TiB used 836.00GiB path /dev/sda

btrfs-progs v4.0
fatdrive data # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md2 : active raid5 sdt[4] sdq[2] sdp[1] sdo[0]
      8790405120 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [=>...................]  recovery =  6.1% (180268820/2930135040) finish=1144.3min speed=40050K/sec
      bitmap: 0/22 pages [0KB], 65536KB chunk

md1 : active raid1 sdj3[2] sdi3[1]
      479661056 blocks super 1.2 [2/2] [UU]
      bitmap: 0/4 pages [0KB], 65536KB chunk

md0 : active raid1 sdi1[0] sdj1[1]
      204736 blocks [2/2] [UU]
      
unused devices: <none>
fatdrive data # mkfs.btrfs -f /dev/md2
Error: error checking /dev/md2 mount status


This may not seem scary at first, but when you check messages, every time I run mkfs.btrfs -f /dev/md2, I get the following:

May  8 15:05:46 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 93472575488 wanted 150744 found 150742
May  8 15:05:46 fatdrive kernel: BTRFS (device sdm): parent transid verify failed on 93472575488 wanted 150744 found 150742

This is 100% reproducible on this machine.  Every time I try to create the new filesystem, it fails with those two messages in dmesg or message log.  Every time, like clockwork.

So what does /dev/md2 have to do with /dev/sdm?  Why does md2 fail because of a transid failure on a totally different set of disks?

So my new understanding is that if ANY BTRFS filesystem is damaged, then BTRFS will not work on the remainder of the disks in the system.

Can someone please prove me wrong?


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-08 21:37         ` Anthony Plack
@ 2015-05-08 22:06           ` Duncan
  2015-05-08 22:06           ` Noah Massey
  1 sibling, 0 replies; 14+ messages in thread
From: Duncan @ 2015-05-08 22:06 UTC (permalink / raw)
  To: linux-btrfs

Anthony Plack posted on Fri, 08 May 2015 16:37:05 -0500 as excerpted:

> 30 seconds on a server can be allot of data, and is to small of a time
> span for adequate backups to occur.

> Maybe we need a warning at the top of the BTRFS page that it is highly
> likely, if BTRFS crashes and the transactions get corrupt, that you WILL
> loose at a minimum the last 30 seconds of your data and maybe more.

Note that commit time has been a mount-time option for a number of kernel 
releases, now.  It's 30 seconds by default, and IMO that's a reasonable 
compromise, but it /is/ an option.

In general I agree, however, particularly about the stability, which 
simply isn't as good as some would make it, ATM[1], tho Hugo does make a 
valid point, there's another bug triggering the (current) problem, and 
getting that rooted out is critical, regardless of where this more 
general debate goes.

---
[1] And unfortunately, that lack of very visible stability warning, as 
they've all been removed, is causing people to rely on btrfs without 
backups, that really should either have backups, or if they really /must/ 
play without backups, they really /should/ be on something other than 
btrfs at this point.

Tho FWIW I think the unstated thinking is that the warnings were scaring 
away the broader level of testing that btrfs needs at this point, with 
the thought being chicken and egg, we wouldn't get the needed testing 
with the warnings, and the warnings couldn't be truthfully taken away 
without that needed testing and followup, so I think someone decided it 
was time to fudge a bit, at least to the extent of arranging for the 
absence of warnings that /were/ there previously, thus allowing deception 
by omission for those who simply don't look close enough at the small-
print when they choose their filesystem, as well as not realizing that 
even on "stable" filesystems let alone something still not entirely 
stable like btrfs, lack of a backup really does state by action that you 
simply don't care about the data, and prefer to risk its potential loss 
over the certain time and opportunity cost of actually doing/testing the 
backups.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-08 21:37         ` Anthony Plack
  2015-05-08 22:06           ` Duncan
@ 2015-05-08 22:06           ` Noah Massey
  1 sibling, 0 replies; 14+ messages in thread
From: Noah Massey @ 2015-05-08 22:06 UTC (permalink / raw)
  To: linux-btrfs

On Fri, May 8, 2015 at 5:37 PM, Anthony Plack <tmy@plack.net> wrote:
>
>> On May 8, 2015, at 3:58 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>>
>> And then I wasn't quite so concerned either about losing something
>> critical in the log, or the safety of simply blowing it away, since by
>> design, the log only deals with what hasn't been committed yet, and the
>> commits are themselves designed to be atomic and leave the filesystem in
>> a known good state.
>>
>
> Duncan,
> Thank you for joining me in my monolog here.
>
> 30 seconds on a server can be allot of data, and is to small of a time span for adequate backups to occur.  What you say makes sense, however, without this fixed going forward, BTRFS can never become a serious data storage file system.
>

Standard disclaimer: not a dev, just a user.
I believe the email Duncan referred to was
http://www.spinics.net/lists/linux-btrfs/msg43407.html
and my understanding was slightly different than his summary.
If I understood correctly, by the time an fsync returns success, the
BTRFS log will contain all the data needed to comply with the fsync
guarantee ("This data has been safely written to media").
If the system goes down before the next full transaction commit
(default 30 seconds), then on next rw mount, BTRFS will replay log.
The resulting filesystem state is the state of last transaction +
fsynced data.

In the event of log corruption, BTRFS may experience failures replaying the log.
In this case, you can zero the log: then the filesystem state becomes
the last transaction, without the last 30 seconds.
Normal operation should not result in a corrupt log, since the log is
it's own COW tree and is coherently on media before fsync returns.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-08 21:49         ` Anthony Plack
@ 2015-05-08 22:37           ` Hugo Mills
  2015-05-11 17:30             ` Anthony Plack
  0 siblings, 1 reply; 14+ messages in thread
From: Hugo Mills @ 2015-05-08 22:37 UTC (permalink / raw)
  To: Anthony Plack; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 5189 bytes --]

On Fri, May 08, 2015 at 04:49:19PM -0500, Anthony Plack wrote:
> 
> > On May 8, 2015, at 4:18 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> > 
> > On Fri, May 08, 2015 at 11:44:39AM -0500, Anthony Plack wrote:
> >> Once again btrfsck --repair /dev/sdm ended with
> >> 
> >> parent transid verify failed on 94238015488 wanted 150690 found 150691
> >> Ignoring transid failure
> >> 
> >> no attempt to actually repair the volume.  No indication from the tools why.
> > 
> >   A transid failure means that the superblock has been written out to
> > disk *before* a part of the metadata that forms that transaction, and
> > then the machine has crashed in some way that prevented the
> > late-arriving metadata from hitting the disk. There are two ways that
> > this can happen: it's a bug in the kernel, or the hardware lies about
> > having written data. Both are possible, but the former is more likely.
> 
> Also good to know.  I agree to the bug in the kernel.  I think you just hit on the issue here.  Since this COW, we should be able to assume that the superblock is then still "empty."  Or do I misunderstand?

   No, if the superblock didn't have anything in it, the FS wouldn't
be mountable.

   The CoW design works by making all writes to currently-unused
locations. At some point (a transaction boundary, every 30 seconds by
default), the FS ensures that everything written to date is consistent
-- that everything has been flushed to permanent storage -- and only
when all of that data has hit the disk does it write out the updated
superblock(s) that point to the new data structures. Since superblock
writes are atomic, the FS remains consistent at all times: either the
superblock refers to the old data or the new data, but cannot refer to
any partial state in between.

   It doesn't matter what order the data is written out in, *provided*
that there's a guarantee that it's *all* written out before the
superblock is sent.

   Now, where things go wrong, and results in a transid failure, is
where the metadata structures are written or modified *after* the
superblock is written, and then the machine crashes partway through
the sequence of writes. The intended design of the FS is such that
this should never happen -- however, there's a class of bug that means
it *can* happen, which is probably what's gone on here. I don't know
enough about the block layer to be able to characterise in more detail
what the problem looks like.

> >   Once this failure has happened, the FS is corrupt in a way that's
> > hard to repair reliably. I did raise this question with Chris a while
> > ago, and my understanding from the conversation was that he didn't
> > think that it was possible to fix transid failures in btrfsck.
> 
> I guess I don't understand how that could be designed that way.
> 
> >   Once again: zeroing the log won't help. It doesn't fix everything.
> > In fact, it rarely fixes anything.
> 
> >   The reason there's no documentation on fixing transid failures is
> > because there's no good fix for them.
> 
> Then what is the point of the transactions?  Why do we care about
> transid mismatch?  Why keep something, that if it fails, breaks the
> whole thing?

   A transaction is simply what you get when you write out a new
superblock. It's an atomic change to the FS state. In an ideal world,
the transactions would be fine-grained, and would match the POSIX
filesystem semantics. In practice, the overhead of CoWing all the
metadata for a change at that scale is (relatively) huge. As a result,
the FS collects together larger chunks of writes into a single
transaction.

   Now, there's a further difficulty, which is that if you make
transactions long-lived, you have two choices about the FS behaviour
towards applications: either you wait until the end of the
transaction, and then report success to the writer, in which case
performance is appalling; or you can report to the application that
the write succeeded as soon as you've accepted it.

   Btrfs uses the latter method for performance reasons. But... if you
do that and the machine crashes before the end of the transaction, the
FS state is still the old one, so the app thinks it's written data
successfully to disk, but there's no record of it in the FS. The log
tree fixes _this_ problem by keeping a minimal record of the writes
made so far in the current transaction, so that when the FS resumes
after a crash or power loss, the log can be examined and used to find
the outstanding writes that have been confirmed to userspace. From
that, the partial transaction can be completed.

   In the past, there have been a few bugs where the log tree has been
written out with bad data in it, and this has led to non-mountable
filesystems (because it tries replaying the log tree and crashes).
This is the *only* case where btrfs-zero-log helps.

   I hope all that helps explain what all the moving parts and for,
and why they're there.

   Hugo.

-- 
Hugo Mills             | To an Englishman, 100 miles is a long way; to an
hugo@... carfax.org.uk | American, 100 years is a long time.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                        Earle Hitchner

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-08 22:37           ` Hugo Mills
@ 2015-05-11 17:30             ` Anthony Plack
  2015-05-12  6:33               ` Duncan
  0 siblings, 1 reply; 14+ messages in thread
From: Anthony Plack @ 2015-05-11 17:30 UTC (permalink / raw)
  To: Hugo Mills; +Cc: linux-btrfs


> On May 8, 2015, at 5:37 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> 
> On Fri, May 08, 2015 at 04:49:19PM -0500, Anthony Plack wrote:
>> 
>>> On May 8, 2015, at 4:18 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
>>> 
>>> On Fri, May 08, 2015 at 11:44:39AM -0500, Anthony Plack wrote:
>>>> Once again btrfsck --repair /dev/sdm ended with
>>>> 
>>>> parent transid verify failed on 94238015488 wanted 150690 found 150691
>>>> Ignoring transid failure
>>>> 
>>>> no attempt to actually repair the volume.  No indication from the tools why.
>>> 
>>>  A transid failure means that the superblock has been written out to
>>> disk *before* a part of the metadata that forms that transaction, and
>>> then the machine has crashed in some way that prevented the
>>> late-arriving metadata from hitting the disk. There are two ways that
>>> this can happen: it's a bug in the kernel, or the hardware lies about
>>> having written data. Both are possible, but the former is more likely.
>> 
>> Also good to know.  I agree to the bug in the kernel.  I think you just hit on the issue here.  Since this COW, we should be able to assume that the superblock is then still "empty."  Or do I misunderstand?
> 
>   No, if the superblock didn't have anything in it, the FS wouldn't
> be mountable.
> 
>   The CoW design works by making all writes to currently-unused
> locations. At some point (a transaction boundary, every 30 seconds by
> default), the FS ensures that everything written to date is consistent
> -- that everything has been flushed to permanent storage -- and only
> when all of that data has hit the disk does it write out the updated
> superblock(s) that point to the new data structures. Since superblock
> writes are atomic, the FS remains consistent at all times: either the
> superblock refers to the old data or the new data, but cannot refer to
> any partial state in between.
> 
>   It doesn't matter what order the data is written out in, *provided*
> that there's a guarantee that it's *all* written out before the
> superblock is sent.
> 
>   Now, where things go wrong, and results in a transid failure, is
> where the metadata structures are written or modified *after* the
> superblock is written, and then the machine crashes partway through
> the sequence of writes. The intended design of the FS is such that
> this should never happen -- however, there's a class of bug that means
> it *can* happen, which is probably what's gone on here. I don't know
> enough about the block layer to be able to characterise in more detail
> what the problem looks like.
> 
>>>  Once this failure has happened, the FS is corrupt in a way that's
>>> hard to repair reliably. I did raise this question with Chris a while
>>> ago, and my understanding from the conversation was that he didn't
>>> think that it was possible to fix transid failures in btrfsck.
>> 
>> I guess I don't understand how that could be designed that way.
>> 
>>>  Once again: zeroing the log won't help. It doesn't fix everything.
>>> In fact, it rarely fixes anything.
>> 
>>>  The reason there's no documentation on fixing transid failures is
>>> because there's no good fix for them.
>> 
>> Then what is the point of the transactions?  Why do we care about
>> transid mismatch?  Why keep something, that if it fails, breaks the
>> whole thing?
> 
>   A transaction is simply what you get when you write out a new
> superblock. It's an atomic change to the FS state. In an ideal world,
> the transactions would be fine-grained, and would match the POSIX
> filesystem semantics. In practice, the overhead of CoWing all the
> metadata for a change at that scale is (relatively) huge. As a result,
> the FS collects together larger chunks of writes into a single
> transaction.
> 
>   Now, there's a further difficulty, which is that if you make
> transactions long-lived, you have two choices about the FS behaviour
> towards applications: either you wait until the end of the
> transaction, and then report success to the writer, in which case
> performance is appalling; or you can report to the application that
> the write succeeded as soon as you've accepted it.
> 
>   Btrfs uses the latter method for performance reasons. But... if you
> do that and the machine crashes before the end of the transaction, the
> FS state is still the old one, so the app thinks it's written data
> successfully to disk, but there's no record of it in the FS. The log
> tree fixes _this_ problem by keeping a minimal record of the writes
> made so far in the current transaction, so that when the FS resumes
> after a crash or power loss, the log can be examined and used to find
> the outstanding writes that have been confirmed to userspace. From
> that, the partial transaction can be completed.
> 
>   In the past, there have been a few bugs where the log tree has been
> written out with bad data in it, and this has led to non-mountable
> filesystems (because it tries replaying the log tree and crashes).
> This is the *only* case where btrfs-zero-log helps.
> 
>   I hope all that helps explain what all the moving parts and for,
> and why they're there.
> 
>   Hugo.
> 
> -- 
> Hugo Mills             | To an Englishman, 100 miles is a long way; to an
> hugo@... carfax.org.uk | American, 100 years is a long time.
> http://carfax.org.uk/  |
> PGP: E2AB1DE4          |                                        Earle Hitchner

Hugo,
Thank you.  I have been letting your email ruminate in the brain pan.

I needed that reminder of transaction based processes.

So the transaction has failed, and therefore the drive is in an unknown state, but at some point, we need to make a decision about that transaction.

Either we:
1. Accept fact that the transaction has failed, revert the data back to the earlier stated, and continue.
2. Backup the entire data set, and reset the transaction log
3. Flush the disk, recreate from scratch and hope our backup is good and current.
4. ????

Regardless, I still don't get not handling the error.  Yes, there is an error.  All fsck programs have to deal with errors.  btrfsck just ends.  No repair, no options.

It seems that in BTRFS we just crash.  The transaction has an error, well dump the kernel and set the volume to read only.  There is no repair tool to help the user make these decisions. There isn't even a good explanation other than -5 kernel dump.

"Just recreate the volume from scratch and restore you backup is all we can do" does not seem to be a long term viable solution.

If the code understands enough to know that the transaction is damaged, why can the code not walk the admin through the repair?  It seems that we need to get to that point before we can even call this a viable beta file system.

If I understood more of the transaction issue, I might just write the code to help btrfsck actually become a real program regarding transactions.  Maybe that is where I need to start...


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Kernel Dump scanning directory
  2015-05-11 17:30             ` Anthony Plack
@ 2015-05-12  6:33               ` Duncan
  0 siblings, 0 replies; 14+ messages in thread
From: Duncan @ 2015-05-12  6:33 UTC (permalink / raw)
  To: linux-btrfs

Anthony Plack posted on Mon, 11 May 2015 12:30:40 -0500 as excerpted:

> So the transaction has failed, and therefore the drive is in an unknown
> state, but at some point, we need to make a decision about that
> transaction.
> 
> Either we:
> 1. Accept fact that the transaction has failed, revert the data back to
> the earlier stated, and continue.
> 2. Backup the entire data set, and reset the transaction log
> 3. Flush the disk, recreate from scratch and hope our backup
> is good and current.
> 4. ????

FWIW, 4 would be btrfs restore, offline-restore any files that can be 
read from the unmounted filesystem, to some other mounted (not necessary 
btrfs) filesystem of sufficient size to contain them, using btrfs-find-
root if restore can't find a good root to work with on its own.

I've used that a couple times, a year or so apart, when one of my btrfs 
crapped out.  I have backups, as by definition anyone has if they value 
the data more than the (relatively trivial) time necessary to do the 
backup (if they don't, by definition they don't value the data to that 
level, regardless of what they claim), but they aren't always current -- 
the risk of total loss that restore can't deal with is low enough that by 
definition based on my behavior, I don't consider the incremental risk 
worth the hassle, and am willing to take the relatively low risk of 
having to fall back to relatively old backups, if restore won't do it for 
me.

Luckily, I keep relatively small btrfs (all on SSD), with my much larger 
media partition reiserfs on spinning rust, and I can mount it and use a 
subdir there as a destination for the restored files.

But btrfs restore... isn't exactly non-tech-user friendly... and it 
/does/ require someplace to restore all those files to, thus at least 
doubling the effective space requirements. =:^(

> Regardless, I still don't get not handling the error.  Yes, there is an
> error.  All fsck programs have to deal with errors.  btrfsck just ends.
> No repair, no options.
> 
> It seems that in BTRFS we just crash.  The transaction has an error,
> well dump the kernel and set the volume to read only.  There is no
> repair tool to help the user make these decisions. There isn't even a
> good explanation other than -5 kernel dump.
> 
> "Just recreate the volume from scratch and restore you backup is all we
> can do" does not seem to be a long term viable solution.
> 
> If the code understands enough to know that the transaction is damaged,
> why can the code not walk the admin through the repair?  It seems that
> we need to get to that point before we can even call this a viable beta
> file system.

Too bad.  I agree, but the warnings are removed and despite all the 
evidence to the contrary, forget about beta, btrfs is being billed as 
stable and ready for ordinary use, even shipping as the default in 
mainstream distros, these days. =:^(

Yes, btrfs is very nice for the tech-and-admin-literate user to play 
with, and in the hands of someone who can manage a restore with btrfs-
find-root and/or is an absolutely perfectly programmed robot with the 
backups, it's extremely powerful and stable /enough/, but stable for the 
masses... hardly.

> If I understood more of the transaction issue, I might just write the
> code to help btrfsck actually become a real program regarding
> transactions. Maybe that is where I need to start...

Given what I've seen restore able to do, perhaps looking at how it 
handles this sort of issue and coding a perhaps optional btrfs check mode 
to do the same (at least to the restore-only level, without resorting to 
find-root), would be useful.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-05-12  6:34 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-06 21:56 Kernel Dump scanning directory Anthony Plack
2015-05-07 14:33 ` Anthony Plack
2015-05-07 21:30   ` Anthony Plack
2015-05-08 16:44     ` Anthony Plack
2015-05-08 20:58       ` Duncan
2015-05-08 21:37         ` Anthony Plack
2015-05-08 22:06           ` Duncan
2015-05-08 22:06           ` Noah Massey
2015-05-08 21:18       ` Hugo Mills
2015-05-08 21:49         ` Anthony Plack
2015-05-08 22:37           ` Hugo Mills
2015-05-11 17:30             ` Anthony Plack
2015-05-12  6:33               ` Duncan
2015-05-08 22:01       ` Kernel Dump scanning directory - New Scary Issue Anthony Plack

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).