On Sun, Jan 05, 2014 at 08:28:57AM -0800, Muthu Kumar wrote:
> Fengguang,
> Instead of rebooting, can you trigger a crash dump when this happens
> and send us the backtrace (to start with)?

Muthu, good point! Attached is the full dmesg with backtrace:

[ 1398.988324] SysRq : Show Blocked State
[ 1398.992007] task PC stack pid father
[ 1398.992007] mount D 0000000000000002 0 2875 2870 0x00000000
[ 1398.992007] ffff88007f859a70 0000000000000082 ffff88007f859fd8 ffff8803d21c6c10
[ 1398.992007] 0000000000012fc0 ffff8803d21c6c10 0000000000000000 0000000000000000
[ 1398.992007] ffff8803d2d22068 0000000000000008 ffff88007f859a18 ffffffff814c2b62
[ 1398.992007] Call Trace:
[ 1398.992007] [] ? submit_bio+0x106/0x159
[ 1398.992007] [] ? __do_readpage+0x4b9/0x50e
[ 1398.992007] [] ? kvm_clock_read+0x27/0x31
[ 1398.992007] [] ? kvm_clock_get_cycles+0x9/0xb
[ 1398.992007] [] ? filemap_fdatawait+0x23/0x23
[ 1398.992007] [] schedule+0x6f/0x71
[ 1398.992007] [] io_schedule+0x8f/0xd6
[ 1398.992007] [] sleep_on_page+0xe/0x12
[ 1398.992007] [] __wait_on_bit+0x48/0x7b
[ 1398.992007] [] wait_on_page_bit+0x7a/0x7c
[ 1398.992007] [] ? autoremove_wake_function+0x34/0x34
[ 1398.992007] [] read_extent_buffer_pages+0x1ae/0x23b
[ 1398.992007] [] ? free_root_pointers+0x5b/0x5b
[ 1398.992007] [] btree_read_extent_buffer_pages.constprop.48+0x66/0x100
[ 1398.992007] [] read_tree_block+0x2f/0x47
[ 1398.992007] [] open_ctree+0x1271/0x1adf
[ 1398.992007] [] btrfs_mount+0x47b/0x771
[ 1398.992007] [] ? get_from_free_list+0x41/0x4b
[ 1398.992007] [] mount_fs+0x15/0xae
[ 1398.992007] [] vfs_kern_mount+0x64/0xf6
[ 1398.992007] [] do_mount+0x781/0x878
[ 1398.992007] [] ? strndup_user+0x3a/0xd6
[ 1398.992007] [] SyS_mount+0x85/0xbe
[ 1398.992007] [] system_call_fastpath+0x16/0x1b
[ 1398.992007] Sched Debug Version: v0.11, 3.13.0-rc6-00148-gc05f7ce #1

> Kent,
> Did you do any btrfs test with your changes? Just try simple dd writes.

Thanks,
Fengguang

> Regards,
> Muthu
>
> On Sun, Jan 5, 2014 at 1:46 AM, Fengguang Wu wrote:
> > Hi Muthu,
> >
> > On Fri, Jan 03, 2014 at 11:51:31AM -0800, Muthu Kumar wrote:
> >> Looks like Kent missed the btrfs endio in the original commit. How
> >> about this patch:
> >>
> >> ---------
> >>
> >> In btrfs_end_bio, call bio_endio_nodec on the restored bio so the
> >> bi_remaining is accounted for correctly.
> >>
> >> Reported-by: fengguang.wu@intel.com
> >> Cc: Kent Overstreet
> >> CC: Jens Axboe
> >> Signed-off-by: Muthukumar Ratty
> >> --------
> >>
> >>  fs/btrfs/volumes.c | 6 +++++-
> >>  1 files changed, 5 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index f2130de..edfed52 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -5316,7 +5316,11 @@ static void btrfs_end_bio(struct bio *bio, int err)
> >>                  }
> >>                  kfree(bbio);
> >>
> >> -                bio_endio(bio, err);
> >> +                /*
> >> +                 * Call endio_nodec on the restored bio so the bi_remaining is
> >> +                 * accounted for correctly
> >> +                 */
> >> +                bio_endio_nodec(bio, err);
> >>          } else if (!is_orig_bio) {
> >>                  bio_put(bio);
> >>          }
> >
> > Interestingly, the BUG message disappeared but it blocks the test run.
> > In the end, the test watchdog reboots the machine with SysRq:
> >
> > 2014-01-04 23:13:02 mount -t btrfs /dev/vda /fs/vda
> > [ 20.184264] btrfs: device fsid f0e06999-0518-47e0-a622-21b8749438be devid 1 transid 4 /dev/vda
> > [ 20.186552] btrfs: disk space caching is enabled
> > [ 131.360457] random: nonblocking pool is initialized
> > ==> [ 1465.069342] SysRq : Emergency Sync
> > ==> [ 1475.071055] SysRq : Resetting
> >
> > Attached is the full dmesg for a good run (v3.13-rc7) and a bad run
> > (this patch).
> >
> > Thanks,
> > Fengguang