* [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4)
@ 2013-05-13 6:49 Akira Fujita
2013-05-14 21:37 ` Jan Kara
0 siblings, 1 reply; 10+ messages in thread
From: Akira Fujita @ 2013-05-13 6:49 UTC (permalink / raw)
To: ext4 development
Hi,
I ran into a deadlock with xfstests generic/269 on linux-3.9.0.
It seems to happen between jbd2_log_wait_commit, sleep_on_buffer
and writeback_inodes (please see the ps log below).
Once it occurs, we can't touch the FS anymore.
In my case it takes 300 - 1000 trials to occur. Is this a known issue?
The following kernels seem to have the same problem:
- linux-3.5-rc5
- linux-3.8.5
- linux-3.9-rc7
And now I'm trying it on linux-3.10-rc1.
# ./check generic/269
FSTYP -- ext4
PLATFORM -- Linux/x86_64 mcds1 3.9.0
MKFS_OPTIONS -- /dev/sda12
MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2
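The test was driven in a loop because the hang needs hundreds of runs to trigger. The loop_xfstests.s wrapper visible in the ps output below is not included here; it is roughly equivalent to the following sketch (the script name and the iteration count are assumptions):
  #!/bin/sh
  # rerun generic/269 until the hang is hit or we give up
  i=0
  while [ $i -lt 1000 ]; do
      i=$((i + 1))
      echo "run $i"
      # when the deadlock hits, ./check simply never returns
      ./check generic/269 || break
  done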
# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm
PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND
1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init
2 2 TS - 0 19 0 0.0 S kthreadd kthreadd
3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0
...
2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0
...
22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1
22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep
22657 22657 TS - 0 19 2 0.0 R+ - ps
25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0
28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s
28964 28964 TS - 0 19 1 0.0 S+ wait check
29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8
29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit
29199 29199 TS - 0 19 3 0.0 S+ wait 269
29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8
29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit
29394 29394 TS - 0 19 0 0.0 S wait fsstress
29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress
# df -T /dev/sda11 /dev/sda12
Filesystem Type 1K-blocks Used Available Use% Mounted on
/dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1
/dev/sda12 ext4 499656 499656 0 100% /mnt/mp2
# tail -f /var/log/messages
May 11 05:59:22 mcds1 root: run xfstest generic/269
May 11 05:59:23 mcds1 kernel: [60999.075705] EXT4-fs (sda12): mounted filesystem with ordered data mode. Opts: acl,user_xattr
May 11 06:01:41 mcds1 kernel: [61137.377958] EXT4-fs (sda11): warning: maximal mount count reached, running e2fsck is recommended
May 11 06:01:41 mcds1 kernel: [61137.378756] EXT4-fs (sda11): mounted filesystem with ordered data mode. Opts: acl,user_xattr
May 11 06:01:48 mcds1 kernel: [61144.673759] EXT4-fs (sda12): mounted filesystem with ordered data mode. Opts: acl,user_xattr
May 11 06:01:48 mcds1 kernel: [61144.882217] EXT4-fs (sda11): warning: maximal mount count reached, running e2fsck is recommended
May 11 06:01:48 mcds1 kernel: [61144.882799] EXT4-fs (sda11): mounted filesystem with ordered data mode. Opts: acl,user_xattr
May 11 06:01:48 mcds1 root: run xfstest generic/269
May 11 06:01:49 mcds1 kernel: [61145.444455] EXT4-fs (sda12): mounted filesystem with ordered data mode. Opts: acl,user_xattr
Regards,
Akira Fujita
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) 2013-05-13 6:49 [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) Akira Fujita @ 2013-05-14 21:37 ` Jan Kara 2013-05-15 2:57 ` Akira Fujita 0 siblings, 1 reply; 10+ messages in thread From: Jan Kara @ 2013-05-14 21:37 UTC (permalink / raw) To: Akira Fujita; +Cc: ext4 development Hello, On Mon 13-05-13 15:49:24, Akira Fujita wrote: > I ran into the deaclock with xfs_test 269 on linux-3.9.0. > It seems happen between jbd2_log_wait_commit, sleep_on_buffer > and writeback_indoes (Please see ps log below). > Once it occurs we can't touch FS anymore. > In my case 300 - 1000 trials to occur. Is this known issue? > > The following kernels seems to have same problem: > - linux-3.5-rc5 > - linux-3.8.5 > - linux-3.9-rc7 > And now I'm trying it on linux-3.10-rc1. > > # ./check generic/269 > FSTYP -- ext4 > PLATFORM -- Linux/x86_64 mcds1 3.9.0 > MKFS_OPTIONS -- /dev/sda12 > MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2 > > > # ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm > PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND > 1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init > 2 2 TS - 0 19 0 0.0 S kthreadd kthreadd > 3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0 > ... > 2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0 > ... > 22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1 > 22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep > 22657 22657 TS - 0 19 2 0.0 R+ - ps > 25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0 > 28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s > 28964 28964 TS - 0 19 1 0.0 S+ wait check > 29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8 > 29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > 29199 29199 TS - 0 19 3 0.0 S+ wait 269 > 29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8 > 29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > 29394 29394 TS - 0 19 0 0.0 S wait fsstress > 29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress > > # df -T /dev/sda11 /dev/sda12 > Filesystem Type 1K-blocks Used Available Use% Mounted on > /dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1 > /dev/sda12 ext4 499656 499656 0 100% /mnt/mp2 Thanks for report. No I don't think this problem has been reported before. Seeing that sda12 is out of space and fsstress hangs in writeback_inodes(), I suspect we have some deadlock in ENOSPC recovery path when we want to flush data to disk to reduce delalloc uncertainty. Can you run 'echo w >/proc/sysrq-trigger' when the deadlock happens and post your dmesg here? Thanks! Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) 2013-05-14 21:37 ` Jan Kara @ 2013-05-15 2:57 ` Akira Fujita 2013-05-27 1:49 ` Akira Fujita 0 siblings, 1 reply; 10+ messages in thread From: Akira Fujita @ 2013-05-15 2:57 UTC (permalink / raw) To: Jan Kara; +Cc: ext4 development Hi, (2013/05/15 6:37), Jan Kara wrote: > Hello, > > On Mon 13-05-13 15:49:24, Akira Fujita wrote: >> I ran into the deaclock with xfs_test 269 on linux-3.9.0. >> It seems happen between jbd2_log_wait_commit, sleep_on_buffer >> and writeback_indoes (Please see ps log below). >> Once it occurs we can't touch FS anymore. >> In my case 300 - 1000 trials to occur. Is this known issue? >> >> The following kernels seems to have same problem: >> - linux-3.5-rc5 >> - linux-3.8.5 >> - linux-3.9-rc7 >> And now I'm trying it on linux-3.10-rc1. >> >> # ./check generic/269 >> FSTYP -- ext4 >> PLATFORM -- Linux/x86_64 mcds1 3.9.0 >> MKFS_OPTIONS -- /dev/sda12 >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2 >> >> >> # ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm >> PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND >> 1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init >> 2 2 TS - 0 19 0 0.0 S kthreadd kthreadd >> 3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0 >> ... >> 2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0 >> ... >> 22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1 >> 22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep >> 22657 22657 TS - 0 19 2 0.0 R+ - ps >> 25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0 >> 28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s >> 28964 28964 TS - 0 19 1 0.0 S+ wait check >> 29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8 >> 29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit >> 29199 29199 TS - 0 19 3 0.0 S+ wait 269 >> 29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8 >> 29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit >> 29394 29394 TS - 0 19 0 0.0 S wait fsstress >> 29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress >> >> # df -T /dev/sda11 /dev/sda12 >> Filesystem Type 1K-blocks Used Available Use% Mounted on >> /dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1 >> /dev/sda12 ext4 499656 499656 0 100% /mnt/mp2 > Thanks for report. No I don't think this problem has been reported > before. Seeing that sda12 is out of space and fsstress hangs in > writeback_inodes(), I suspect we have some deadlock in ENOSPC recovery path > when we want to flush data to disk to reduce delalloc uncertainty. Can you > run 'echo w >/proc/sysrq-trigger' when the deadlock happens and post your > dmesg here? Thanks! > Thanks for reply. I'll take that information when the deadlock happens again. Regards, Akira Fujita ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) 2013-05-15 2:57 ` Akira Fujita @ 2013-05-27 1:49 ` Akira Fujita 2013-05-27 8:16 ` Dmitry Monakhov 0 siblings, 1 reply; 10+ messages in thread From: Akira Fujita @ 2013-05-27 1:49 UTC (permalink / raw) To: Jan Kara; +Cc: ext4 development Hi Jan, (2013/05/15 11:57), Akira Fujita wrote: > Hi, > > (2013/05/15 6:37), Jan Kara wrote: >> Hello, >> >> On Mon 13-05-13 15:49:24, Akira Fujita wrote: >>> I ran into the deaclock with xfs_test 269 on linux-3.9.0. >>> It seems happen between jbd2_log_wait_commit, sleep_on_buffer >>> and writeback_indoes (Please see ps log below). >>> Once it occurs we can't touch FS anymore. >>> In my case 300 - 1000 trials to occur. Is this known issue? >>> >>> The following kernels seems to have same problem: >>> - linux-3.5-rc5 >>> - linux-3.8.5 >>> - linux-3.9-rc7 >>> And now I'm trying it on linux-3.10-rc1. >>> >>> # ./check generic/269 >>> FSTYP -- ext4 >>> PLATFORM -- Linux/x86_64 mcds1 3.9.0 >>> MKFS_OPTIONS -- /dev/sda12 >>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2 >>> >>> >>> # ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm >>> PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND >>> 1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init >>> 2 2 TS - 0 19 0 0.0 S kthreadd kthreadd >>> 3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0 >>> ... >>> 2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0 >>> ... >>> 22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1 >>> 22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep >>> 22657 22657 TS - 0 19 2 0.0 R+ - ps >>> 25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0 >>> 28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s >>> 28964 28964 TS - 0 19 1 0.0 S+ wait check >>> 29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8 >>> 29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit >>> 29199 29199 TS - 0 19 3 0.0 S+ wait 269 >>> 29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8 >>> 29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit >>> 29394 29394 TS - 0 19 0 0.0 S wait fsstress >>> 29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress >>> >>> # df -T /dev/sda11 /dev/sda12 >>> Filesystem Type 1K-blocks Used Available Use% Mounted on >>> /dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1 >>> /dev/sda12 ext4 499656 499656 0 100% /mnt/mp2 >> Thanks for report. No I don't think this problem has been reported >> before. Seeing that sda12 is out of space and fsstress hangs in >> writeback_inodes(), I suspect we have some deadlock in ENOSPC recovery path >> when we want to flush data to disk to reduce delalloc uncertainty. Can you >> run 'echo w >/proc/sysrq-trigger' when the deadlock happens and post your >> dmesg here? Thanks! >> > > Thanks for reply. > I'll take that information when the deadlock happens again. Finally, I got the following messages on linux-3.9.0. Could you take a look? 
#echo w > /proc/sysrq-trigger /var/log/messages: ----------------------- May 23 19:15:50 mcds1 kernel: [210728.230978] SysRq : Show Blocked State May 23 19:15:50 mcds1 kernel: [210728.230985] task PC stack pid father May 23 19:15:50 mcds1 kernel: [210728.230992] flush-8:0 D ffffffff81806d60 3192 2361 2 0x00000000 May 23 19:15:50 mcds1 kernel: [210728.230997] ffff880214d27a08 0000000000000046 0000000000000282 ffff88021608d330 May 23 19:15:50 mcds1 kernel: [210728.231001] 0000000000000000 ffff880215d95ac0 0000000000012980 ffff880214d27fd8 May 23 19:15:50 mcds1 kernel: [210728.231005] ffff880214d26010 0000000000012980 0000000000012980 ffff880214d27fd8 May 23 19:15:50 mcds1 kernel: [210728.231008] Call Trace: May 23 19:15:50 mcds1 kernel: [210728.231018] [<ffffffff81798ae4>] schedule+0x24/0x70 May 23 19:15:50 mcds1 kernel: [210728.231023] [<ffffffff81204b95>] jbd2_log_wait_commit+0xb5/0x130 May 23 19:15:50 mcds1 kernel: [210728.231029] [<ffffffff81060830>] ? wake_up_bit+0x40/0x40 May 23 19:15:50 mcds1 kernel: [210728.231033] [<ffffffff81204cb4>] ? __jbd2_log_start_commit+0xa4/0xb0 May 23 19:15:50 mcds1 kernel: [210728.231037] [<ffffffff81204e4b>] jbd2_journal_force_commit_nested+0x5b/0xa0 May 23 19:15:50 mcds1 kernel: [210728.231041] [<ffffffff811b75c8>] ext4_da_writepages+0x348/0x5a0 May 23 19:15:50 mcds1 kernel: [210728.231045] [<ffffffff81074e77>] ? check_preempt_wakeup+0x1b7/0x2a0 May 23 19:15:50 mcds1 kernel: [210728.231051] [<ffffffff810f0f5b>] do_writepages+0x1b/0x30 May 23 19:15:50 mcds1 kernel: [210728.231056] [<ffffffff8115d9d4>] __writeback_single_inode+0x44/0x220 May 23 19:15:50 mcds1 kernel: [210728.231061] [<ffffffff8115e93f>] writeback_sb_inodes+0x23f/0x430 May 23 19:15:50 mcds1 kernel: [210728.231065] [<ffffffff8115ed6d>] wb_writeback+0xed/0x2b0 May 23 19:15:50 mcds1 kernel: [210728.231071] [<ffffffff8104c537>] ? lock_timer_base+0x37/0x70 May 23 19:15:50 mcds1 kernel: [210728.231075] [<ffffffff8115efcd>] wb_do_writeback+0x9d/0x240 May 23 19:15:50 mcds1 kernel: [210728.231079] [<ffffffff8104c570>] ? lock_timer_base+0x70/0x70 May 23 19:15:50 mcds1 kernel: [210728.231084] [<ffffffff8115f212>] bdi_writeback_thread+0xa2/0x230 May 23 19:15:50 mcds1 kernel: [210728.231088] [<ffffffff8115f170>] ? wb_do_writeback+0x240/0x240 May 23 19:15:50 mcds1 kernel: [210728.231092] [<ffffffff8115f170>] ? wb_do_writeback+0x240/0x240 May 23 19:15:50 mcds1 kernel: [210728.231095] [<ffffffff81060136>] kthread+0xc6/0xd0 May 23 19:15:50 mcds1 kernel: [210728.231099] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 May 23 19:15:50 mcds1 kernel: [210728.231104] [<ffffffff817a19ac>] ret_from_fork+0x7c/0xb0 May 23 19:15:50 mcds1 kernel: [210728.231108] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 May 23 19:15:50 mcds1 kernel: [210728.231124] jbd2/sda12-8 D ffffffff81806d60 4736 25056 2 0x00000000 May 23 19:15:50 mcds1 kernel: [210728.231128] ffff8801e436db78 0000000000000046 ffff880215c06f00 ffffffff81c10420 May 23 19:15:50 mcds1 kernel: [210728.231131] ffff880200011200 ffff8801f8f296b0 0000000000012980 ffff8801e436dfd8 May 23 19:15:50 mcds1 kernel: [210728.231135] ffff8801e436c010 0000000000012980 0000000000012980 ffff8801e436dfd8 May 23 19:15:50 mcds1 kernel: [210728.231138] Call Trace: May 23 19:15:50 mcds1 kernel: [210728.231142] [<ffffffff81166c50>] ? 
__wait_on_buffer+0x30/0x30 May 23 19:15:50 mcds1 kernel: [210728.231146] [<ffffffff81798ae4>] schedule+0x24/0x70 May 23 19:15:50 mcds1 kernel: [210728.231150] [<ffffffff81798bb7>] io_schedule+0x87/0xd0 May 23 19:15:50 mcds1 kernel: [210728.231154] [<ffffffff81166c59>] sleep_on_buffer+0x9/0x10 May 23 19:15:50 mcds1 kernel: [210728.231157] [<ffffffff817971d7>] __wait_on_bit+0x57/0x80 May 23 19:15:50 mcds1 kernel: [210728.231161] [<ffffffff81165f7c>] ? submit_bh+0x13c/0x1f0 May 23 19:15:50 mcds1 kernel: [210728.231164] [<ffffffff81166c50>] ? __wait_on_buffer+0x30/0x30 May 23 19:15:50 mcds1 kernel: [210728.231168] [<ffffffff81797273>] out_of_line_wait_on_bit+0x73/0x90 May 23 19:15:50 mcds1 kernel: [210728.231172] [<ffffffff81060870>] ? autoremove_wake_function+0x40/0x40 May 23 19:15:50 mcds1 kernel: [210728.231175] [<ffffffff81166c46>] __wait_on_buffer+0x26/0x30 May 23 19:15:50 mcds1 kernel: [210728.231179] [<ffffffff811ff130>] jbd2_journal_commit_transaction+0xee0/0x1960 May 23 19:15:50 mcds1 kernel: [210728.231183] [<ffffffff817984fd>] ? __schedule+0x3ad/0x7a0 May 23 19:15:50 mcds1 kernel: [210728.231188] [<ffffffff8104c537>] ? lock_timer_base+0x37/0x70 May 23 19:15:50 mcds1 kernel: [210728.231191] [<ffffffff81204f62>] kjournald2+0xc2/0x240 May 23 19:15:50 mcds1 kernel: [210728.231195] [<ffffffff81060830>] ? wake_up_bit+0x40/0x40 May 23 19:15:50 mcds1 kernel: [210728.231199] [<ffffffff81204ea0>] ? commit_timeout+0x10/0x10 May 23 19:15:50 mcds1 kernel: [210728.231202] [<ffffffff81060136>] kthread+0xc6/0xd0 May 23 19:15:50 mcds1 kernel: [210728.231206] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 May 23 19:15:50 mcds1 kernel: [210728.231210] [<ffffffff817a19ac>] ret_from_fork+0x7c/0xb0 May 23 19:15:50 mcds1 kernel: [210728.231213] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 May 23 19:15:50 mcds1 kernel: [210728.231216] umount D ffffffff81806d60 5448 25241 24864 0x00000000 May 23 19:15:50 mcds1 kernel: [210728.231220] ffff8801f3a89c28 0000000000000086 ffff88021fc929f0 ffff88021608cba0 May 23 19:15:50 mcds1 kernel: [210728.231223] ffff8801f3a89bf8 ffff8801f84d8790 0000000000012980 ffff8801f3a89fd8 May 23 19:15:50 mcds1 kernel: [210728.231226] ffff8801f3a88010 0000000000012980 0000000000012980 ffff8801f3a89fd8 May 23 19:15:50 mcds1 kernel: [210728.231230] Call Trace: May 23 19:15:50 mcds1 kernel: [210728.231234] [<ffffffff81798ae4>] schedule+0x24/0x70 May 23 19:15:50 mcds1 kernel: [210728.231237] [<ffffffff81796f4d>] schedule_timeout+0x15d/0x1f0 May 23 19:15:50 mcds1 kernel: [210728.231241] [<ffffffff8106f2b5>] ? __cond_resched+0x25/0x40 May 23 19:15:50 mcds1 kernel: [210728.231245] [<ffffffff81798f64>] wait_for_completion+0x94/0x100 May 23 19:15:50 mcds1 kernel: [210728.231248] [<ffffffff81070ca0>] ? try_to_wake_up+0x2a0/0x2a0 May 23 19:15:50 mcds1 kernel: [210728.231252] [<ffffffff8115ddb1>] ? 
bdi_queue_work+0x81/0x110 May 23 19:15:50 mcds1 kernel: [210728.231256] [<ffffffff8115dec3>] writeback_inodes_sb_nr+0x83/0xb0 May 23 19:15:50 mcds1 kernel: [210728.231261] [<ffffffff8115df4a>] writeback_inodes_sb+0x5a/0x70 May 23 19:15:50 mcds1 kernel: [210728.231264] [<ffffffff8116463a>] __sync_filesystem+0x4a/0x50 May 23 19:15:50 mcds1 kernel: [210728.231267] [<ffffffff81164672>] sync_filesystem+0x32/0x60 May 23 19:15:50 mcds1 kernel: [210728.231273] [<ffffffff81137be6>] generic_shutdown_super+0x36/0xe0 May 23 19:15:50 mcds1 kernel: [210728.231277] [<ffffffff81137cbc>] kill_block_super+0x2c/0x80 May 23 19:15:50 mcds1 kernel: [210728.231281] [<ffffffff811381f5>] deactivate_locked_super+0x45/0x70 May 23 19:15:50 mcds1 kernel: [210728.231284] [<ffffffff81139025>] deactivate_super+0x45/0x60 May 23 19:15:50 mcds1 kernel: [210728.231287] [<ffffffff81152f63>] mntput_no_expire+0xf3/0x150 May 23 19:15:50 mcds1 kernel: [210728.231290] [<ffffffff811537dd>] sys_umount+0xcd/0x3f0 May 23 19:15:50 mcds1 kernel: [210728.231294] [<ffffffff817a1a52>] system_call_fastpath+0x16/0x1b Regards, Akira Fujita ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) 2013-05-27 1:49 ` Akira Fujita @ 2013-05-27 8:16 ` Dmitry Monakhov 2013-05-27 8:33 ` Jan Kara 0 siblings, 1 reply; 10+ messages in thread From: Dmitry Monakhov @ 2013-05-27 8:16 UTC (permalink / raw) To: Akira Fujita, Jan Kara; +Cc: ext4 development On Mon, 27 May 2013 10:49:11 +0900, Akira Fujita <a-fujita@rs.jp.nec.com> wrote: > Hi Jan, > > (2013/05/15 11:57), Akira Fujita wrote: > > Hi, > > > > (2013/05/15 6:37), Jan Kara wrote: > >> Hello, > >> > >> On Mon 13-05-13 15:49:24, Akira Fujita wrote: > >>> I ran into the deaclock with xfs_test 269 on linux-3.9.0. > >>> It seems happen between jbd2_log_wait_commit, sleep_on_buffer > >>> and writeback_indoes (Please see ps log below). > >>> Once it occurs we can't touch FS anymore. > >>> In my case 300 - 1000 trials to occur. Is this known issue? > >>> > >>> The following kernels seems to have same problem: > >>> - linux-3.5-rc5 > >>> - linux-3.8.5 > >>> - linux-3.9-rc7 > >>> And now I'm trying it on linux-3.10-rc1. > >>> > >>> # ./check generic/269 > >>> FSTYP -- ext4 > >>> PLATFORM -- Linux/x86_64 mcds1 3.9.0 > >>> MKFS_OPTIONS -- /dev/sda12 > >>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2 > >>> > >>> > >>> # ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm > >>> PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND > >>> 1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init > >>> 2 2 TS - 0 19 0 0.0 S kthreadd kthreadd > >>> 3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0 > >>> ... > >>> 2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0 > >>> ... > >>> 22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1 > >>> 22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep > >>> 22657 22657 TS - 0 19 2 0.0 R+ - ps > >>> 25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0 > >>> 28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s > >>> 28964 28964 TS - 0 19 1 0.0 S+ wait check > >>> 29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8 > >>> 29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > >>> 29199 29199 TS - 0 19 3 0.0 S+ wait 269 > >>> 29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8 > >>> 29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > >>> 29394 29394 TS - 0 19 0 0.0 S wait fsstress > >>> 29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress > >>> > >>> # df -T /dev/sda11 /dev/sda12 > >>> Filesystem Type 1K-blocks Used Available Use% Mounted on > >>> /dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1 > >>> /dev/sda12 ext4 499656 499656 0 100% /mnt/mp2 > >> Thanks for report. No I don't think this problem has been reported > >> before. Seeing that sda12 is out of space and fsstress hangs in > >> writeback_inodes(), I suspect we have some deadlock in ENOSPC recovery path > >> when we want to flush data to disk to reduce delalloc uncertainty. Can you > >> run 'echo w >/proc/sysrq-trigger' when the deadlock happens and post your > >> dmesg here? Thanks! > >> > > > > Thanks for reply. > > I'll take that information when the deadlock happens again. > > Finally, I got the following messages on linux-3.9.0. > Could you take a look? Is it a real deadlock? umount wait for flush-8:0 flush wait for jbd2 jbd2 wait for bh. All buffer-heads will be written at sooner or later. It probably will require significant amount of time, because fs is too fragmented(that is what this test all about), but it is not a deadlock. 
> > #echo w > /proc/sysrq-trigger > /var/log/messages: > ----------------------- > May 23 19:15:50 mcds1 kernel: [210728.230978] SysRq : Show Blocked State > May 23 19:15:50 mcds1 kernel: [210728.230985] task PC stack pid father > May 23 19:15:50 mcds1 kernel: [210728.230992] flush-8:0 D ffffffff81806d60 3192 2361 2 0x00000000 > May 23 19:15:50 mcds1 kernel: [210728.230997] ffff880214d27a08 0000000000000046 0000000000000282 ffff88021608d330 > May 23 19:15:50 mcds1 kernel: [210728.231001] 0000000000000000 ffff880215d95ac0 0000000000012980 ffff880214d27fd8 > May 23 19:15:50 mcds1 kernel: [210728.231005] ffff880214d26010 0000000000012980 0000000000012980 ffff880214d27fd8 > May 23 19:15:50 mcds1 kernel: [210728.231008] Call Trace: > May 23 19:15:50 mcds1 kernel: [210728.231018] [<ffffffff81798ae4>] schedule+0x24/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231023] [<ffffffff81204b95>] jbd2_log_wait_commit+0xb5/0x130 > May 23 19:15:50 mcds1 kernel: [210728.231029] [<ffffffff81060830>] ? wake_up_bit+0x40/0x40 > May 23 19:15:50 mcds1 kernel: [210728.231033] [<ffffffff81204cb4>] ? __jbd2_log_start_commit+0xa4/0xb0 > May 23 19:15:50 mcds1 kernel: [210728.231037] [<ffffffff81204e4b>] jbd2_journal_force_commit_nested+0x5b/0xa0 > May 23 19:15:50 mcds1 kernel: [210728.231041] [<ffffffff811b75c8>] ext4_da_writepages+0x348/0x5a0 > May 23 19:15:50 mcds1 kernel: [210728.231045] [<ffffffff81074e77>] ? check_preempt_wakeup+0x1b7/0x2a0 > May 23 19:15:50 mcds1 kernel: [210728.231051] [<ffffffff810f0f5b>] do_writepages+0x1b/0x30 > May 23 19:15:50 mcds1 kernel: [210728.231056] [<ffffffff8115d9d4>] __writeback_single_inode+0x44/0x220 > May 23 19:15:50 mcds1 kernel: [210728.231061] [<ffffffff8115e93f>] writeback_sb_inodes+0x23f/0x430 > May 23 19:15:50 mcds1 kernel: [210728.231065] [<ffffffff8115ed6d>] wb_writeback+0xed/0x2b0 > May 23 19:15:50 mcds1 kernel: [210728.231071] [<ffffffff8104c537>] ? lock_timer_base+0x37/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231075] [<ffffffff8115efcd>] wb_do_writeback+0x9d/0x240 > May 23 19:15:50 mcds1 kernel: [210728.231079] [<ffffffff8104c570>] ? lock_timer_base+0x70/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231084] [<ffffffff8115f212>] bdi_writeback_thread+0xa2/0x230 > May 23 19:15:50 mcds1 kernel: [210728.231088] [<ffffffff8115f170>] ? wb_do_writeback+0x240/0x240 > May 23 19:15:50 mcds1 kernel: [210728.231092] [<ffffffff8115f170>] ? wb_do_writeback+0x240/0x240 > May 23 19:15:50 mcds1 kernel: [210728.231095] [<ffffffff81060136>] kthread+0xc6/0xd0 > May 23 19:15:50 mcds1 kernel: [210728.231099] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231104] [<ffffffff817a19ac>] ret_from_fork+0x7c/0xb0 > May 23 19:15:50 mcds1 kernel: [210728.231108] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231124] jbd2/sda12-8 D ffffffff81806d60 4736 25056 2 0x00000000 > May 23 19:15:50 mcds1 kernel: [210728.231128] ffff8801e436db78 0000000000000046 ffff880215c06f00 ffffffff81c10420 > May 23 19:15:50 mcds1 kernel: [210728.231131] ffff880200011200 ffff8801f8f296b0 0000000000012980 ffff8801e436dfd8 > May 23 19:15:50 mcds1 kernel: [210728.231135] ffff8801e436c010 0000000000012980 0000000000012980 ffff8801e436dfd8 > May 23 19:15:50 mcds1 kernel: [210728.231138] Call Trace: > May 23 19:15:50 mcds1 kernel: [210728.231142] [<ffffffff81166c50>] ? 
__wait_on_buffer+0x30/0x30 > May 23 19:15:50 mcds1 kernel: [210728.231146] [<ffffffff81798ae4>] schedule+0x24/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231150] [<ffffffff81798bb7>] io_schedule+0x87/0xd0 > May 23 19:15:50 mcds1 kernel: [210728.231154] [<ffffffff81166c59>] sleep_on_buffer+0x9/0x10 > May 23 19:15:50 mcds1 kernel: [210728.231157] [<ffffffff817971d7>] __wait_on_bit+0x57/0x80 > May 23 19:15:50 mcds1 kernel: [210728.231161] [<ffffffff81165f7c>] ? submit_bh+0x13c/0x1f0 > May 23 19:15:50 mcds1 kernel: [210728.231164] [<ffffffff81166c50>] ? __wait_on_buffer+0x30/0x30 > May 23 19:15:50 mcds1 kernel: [210728.231168] [<ffffffff81797273>] out_of_line_wait_on_bit+0x73/0x90 > May 23 19:15:50 mcds1 kernel: [210728.231172] [<ffffffff81060870>] ? autoremove_wake_function+0x40/0x40 > May 23 19:15:50 mcds1 kernel: [210728.231175] [<ffffffff81166c46>] __wait_on_buffer+0x26/0x30 > May 23 19:15:50 mcds1 kernel: [210728.231179] [<ffffffff811ff130>] jbd2_journal_commit_transaction+0xee0/0x1960 > May 23 19:15:50 mcds1 kernel: [210728.231183] [<ffffffff817984fd>] ? __schedule+0x3ad/0x7a0 > May 23 19:15:50 mcds1 kernel: [210728.231188] [<ffffffff8104c537>] ? lock_timer_base+0x37/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231191] [<ffffffff81204f62>] kjournald2+0xc2/0x240 > May 23 19:15:50 mcds1 kernel: [210728.231195] [<ffffffff81060830>] ? wake_up_bit+0x40/0x40 > May 23 19:15:50 mcds1 kernel: [210728.231199] [<ffffffff81204ea0>] ? commit_timeout+0x10/0x10 > May 23 19:15:50 mcds1 kernel: [210728.231202] [<ffffffff81060136>] kthread+0xc6/0xd0 > May 23 19:15:50 mcds1 kernel: [210728.231206] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231210] [<ffffffff817a19ac>] ret_from_fork+0x7c/0xb0 > May 23 19:15:50 mcds1 kernel: [210728.231213] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231216] umount D ffffffff81806d60 5448 25241 24864 0x00000000 > May 23 19:15:50 mcds1 kernel: [210728.231220] ffff8801f3a89c28 0000000000000086 ffff88021fc929f0 ffff88021608cba0 > May 23 19:15:50 mcds1 kernel: [210728.231223] ffff8801f3a89bf8 ffff8801f84d8790 0000000000012980 ffff8801f3a89fd8 > May 23 19:15:50 mcds1 kernel: [210728.231226] ffff8801f3a88010 0000000000012980 0000000000012980 ffff8801f3a89fd8 > May 23 19:15:50 mcds1 kernel: [210728.231230] Call Trace: > May 23 19:15:50 mcds1 kernel: [210728.231234] [<ffffffff81798ae4>] schedule+0x24/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231237] [<ffffffff81796f4d>] schedule_timeout+0x15d/0x1f0 > May 23 19:15:50 mcds1 kernel: [210728.231241] [<ffffffff8106f2b5>] ? __cond_resched+0x25/0x40 > May 23 19:15:50 mcds1 kernel: [210728.231245] [<ffffffff81798f64>] wait_for_completion+0x94/0x100 > May 23 19:15:50 mcds1 kernel: [210728.231248] [<ffffffff81070ca0>] ? try_to_wake_up+0x2a0/0x2a0 > May 23 19:15:50 mcds1 kernel: [210728.231252] [<ffffffff8115ddb1>] ? 
bdi_queue_work+0x81/0x110 > May 23 19:15:50 mcds1 kernel: [210728.231256] [<ffffffff8115dec3>] writeback_inodes_sb_nr+0x83/0xb0 > May 23 19:15:50 mcds1 kernel: [210728.231261] [<ffffffff8115df4a>] writeback_inodes_sb+0x5a/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231264] [<ffffffff8116463a>] __sync_filesystem+0x4a/0x50 > May 23 19:15:50 mcds1 kernel: [210728.231267] [<ffffffff81164672>] sync_filesystem+0x32/0x60 > May 23 19:15:50 mcds1 kernel: [210728.231273] [<ffffffff81137be6>] generic_shutdown_super+0x36/0xe0 > May 23 19:15:50 mcds1 kernel: [210728.231277] [<ffffffff81137cbc>] kill_block_super+0x2c/0x80 > May 23 19:15:50 mcds1 kernel: [210728.231281] [<ffffffff811381f5>] deactivate_locked_super+0x45/0x70 > May 23 19:15:50 mcds1 kernel: [210728.231284] [<ffffffff81139025>] deactivate_super+0x45/0x60 > May 23 19:15:50 mcds1 kernel: [210728.231287] [<ffffffff81152f63>] mntput_no_expire+0xf3/0x150 > May 23 19:15:50 mcds1 kernel: [210728.231290] [<ffffffff811537dd>] sys_umount+0xcd/0x3f0 > May 23 19:15:50 mcds1 kernel: [210728.231294] [<ffffffff817a1a52>] system_call_fastpath+0x16/0x1b > > Regards, > Akira Fujita > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) 2013-05-27 8:16 ` Dmitry Monakhov @ 2013-05-27 8:33 ` Jan Kara [not found] ` <51A46C79.4000105@rs.jp.nec.com> 0 siblings, 1 reply; 10+ messages in thread From: Jan Kara @ 2013-05-27 8:33 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: Akira Fujita, Jan Kara, ext4 development On Mon 27-05-13 12:16:01, Dmitry Monakhov wrote: > On Mon, 27 May 2013 10:49:11 +0900, Akira Fujita <a-fujita@rs.jp.nec.com> wrote: > > Hi Jan, > > > > (2013/05/15 11:57), Akira Fujita wrote: > > > Hi, > > > > > > (2013/05/15 6:37), Jan Kara wrote: > > >> Hello, > > >> > > >> On Mon 13-05-13 15:49:24, Akira Fujita wrote: > > >>> I ran into the deaclock with xfs_test 269 on linux-3.9.0. > > >>> It seems happen between jbd2_log_wait_commit, sleep_on_buffer > > >>> and writeback_indoes (Please see ps log below). > > >>> Once it occurs we can't touch FS anymore. > > >>> In my case 300 - 1000 trials to occur. Is this known issue? > > >>> > > >>> The following kernels seems to have same problem: > > >>> - linux-3.5-rc5 > > >>> - linux-3.8.5 > > >>> - linux-3.9-rc7 > > >>> And now I'm trying it on linux-3.10-rc1. > > >>> > > >>> # ./check generic/269 > > >>> FSTYP -- ext4 > > >>> PLATFORM -- Linux/x86_64 mcds1 3.9.0 > > >>> MKFS_OPTIONS -- /dev/sda12 > > >>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2 > > >>> > > >>> > > >>> # ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm > > >>> PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND > > >>> 1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init > > >>> 2 2 TS - 0 19 0 0.0 S kthreadd kthreadd > > >>> 3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0 > > >>> ... > > >>> 2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0 > > >>> ... > > >>> 22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1 > > >>> 22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep > > >>> 22657 22657 TS - 0 19 2 0.0 R+ - ps > > >>> 25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0 > > >>> 28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s > > >>> 28964 28964 TS - 0 19 1 0.0 S+ wait check > > >>> 29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8 > > >>> 29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > > >>> 29199 29199 TS - 0 19 3 0.0 S+ wait 269 > > >>> 29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8 > > >>> 29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > > >>> 29394 29394 TS - 0 19 0 0.0 S wait fsstress > > >>> 29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress > > >>> > > >>> # df -T /dev/sda11 /dev/sda12 > > >>> Filesystem Type 1K-blocks Used Available Use% Mounted on > > >>> /dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1 > > >>> /dev/sda12 ext4 499656 499656 0 100% /mnt/mp2 > > >> Thanks for report. No I don't think this problem has been reported > > >> before. Seeing that sda12 is out of space and fsstress hangs in > > >> writeback_inodes(), I suspect we have some deadlock in ENOSPC recovery path > > >> when we want to flush data to disk to reduce delalloc uncertainty. Can you > > >> run 'echo w >/proc/sysrq-trigger' when the deadlock happens and post your > > >> dmesg here? Thanks! > > >> > > > > > > Thanks for reply. > > > I'll take that information when the deadlock happens again. > > > > Finally, I got the following messages on linux-3.9.0. > > Could you take a look? > Is it a real deadlock? > umount wait for flush-8:0 > flush wait for jbd2 > jbd2 wait for bh. > All buffer-heads will be written at sooner or later. 
> It probably will require significant amount of time, because fs is too > fragmented(that is what this test all about), but it is not a deadlock. Yeah, that is my analysis as well. I was only thinking whether we couldn't forget some buffer locked on some error recovery path (we are getting ENOSPC) but it's not obvious where it would be. Akira-san, could you also send us disassembly of your jbd2_journal_commit_transaction() function so that we can check where exactly offset 0xee0 is in the function? Honza > > #echo w > /proc/sysrq-trigger > > /var/log/messages: > > ----------------------- > > May 23 19:15:50 mcds1 kernel: [210728.230978] SysRq : Show Blocked State > > May 23 19:15:50 mcds1 kernel: [210728.230985] task PC stack pid father > > May 23 19:15:50 mcds1 kernel: [210728.230992] flush-8:0 D ffffffff81806d60 3192 2361 2 0x00000000 > > May 23 19:15:50 mcds1 kernel: [210728.230997] ffff880214d27a08 0000000000000046 0000000000000282 ffff88021608d330 > > May 23 19:15:50 mcds1 kernel: [210728.231001] 0000000000000000 ffff880215d95ac0 0000000000012980 ffff880214d27fd8 > > May 23 19:15:50 mcds1 kernel: [210728.231005] ffff880214d26010 0000000000012980 0000000000012980 ffff880214d27fd8 > > May 23 19:15:50 mcds1 kernel: [210728.231008] Call Trace: > > May 23 19:15:50 mcds1 kernel: [210728.231018] [<ffffffff81798ae4>] schedule+0x24/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231023] [<ffffffff81204b95>] jbd2_log_wait_commit+0xb5/0x130 > > May 23 19:15:50 mcds1 kernel: [210728.231029] [<ffffffff81060830>] ? wake_up_bit+0x40/0x40 > > May 23 19:15:50 mcds1 kernel: [210728.231033] [<ffffffff81204cb4>] ? __jbd2_log_start_commit+0xa4/0xb0 > > May 23 19:15:50 mcds1 kernel: [210728.231037] [<ffffffff81204e4b>] jbd2_journal_force_commit_nested+0x5b/0xa0 > > May 23 19:15:50 mcds1 kernel: [210728.231041] [<ffffffff811b75c8>] ext4_da_writepages+0x348/0x5a0 > > May 23 19:15:50 mcds1 kernel: [210728.231045] [<ffffffff81074e77>] ? check_preempt_wakeup+0x1b7/0x2a0 > > May 23 19:15:50 mcds1 kernel: [210728.231051] [<ffffffff810f0f5b>] do_writepages+0x1b/0x30 > > May 23 19:15:50 mcds1 kernel: [210728.231056] [<ffffffff8115d9d4>] __writeback_single_inode+0x44/0x220 > > May 23 19:15:50 mcds1 kernel: [210728.231061] [<ffffffff8115e93f>] writeback_sb_inodes+0x23f/0x430 > > May 23 19:15:50 mcds1 kernel: [210728.231065] [<ffffffff8115ed6d>] wb_writeback+0xed/0x2b0 > > May 23 19:15:50 mcds1 kernel: [210728.231071] [<ffffffff8104c537>] ? lock_timer_base+0x37/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231075] [<ffffffff8115efcd>] wb_do_writeback+0x9d/0x240 > > May 23 19:15:50 mcds1 kernel: [210728.231079] [<ffffffff8104c570>] ? lock_timer_base+0x70/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231084] [<ffffffff8115f212>] bdi_writeback_thread+0xa2/0x230 > > May 23 19:15:50 mcds1 kernel: [210728.231088] [<ffffffff8115f170>] ? wb_do_writeback+0x240/0x240 > > May 23 19:15:50 mcds1 kernel: [210728.231092] [<ffffffff8115f170>] ? wb_do_writeback+0x240/0x240 > > May 23 19:15:50 mcds1 kernel: [210728.231095] [<ffffffff81060136>] kthread+0xc6/0xd0 > > May 23 19:15:50 mcds1 kernel: [210728.231099] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231104] [<ffffffff817a19ac>] ret_from_fork+0x7c/0xb0 > > May 23 19:15:50 mcds1 kernel: [210728.231108] [<ffffffff81060070>] ? 
kthread_freezable_should_stop+0x70/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231124] jbd2/sda12-8 D ffffffff81806d60 4736 25056 2 0x00000000 > > May 23 19:15:50 mcds1 kernel: [210728.231128] ffff8801e436db78 0000000000000046 ffff880215c06f00 ffffffff81c10420 > > May 23 19:15:50 mcds1 kernel: [210728.231131] ffff880200011200 ffff8801f8f296b0 0000000000012980 ffff8801e436dfd8 > > May 23 19:15:50 mcds1 kernel: [210728.231135] ffff8801e436c010 0000000000012980 0000000000012980 ffff8801e436dfd8 > > May 23 19:15:50 mcds1 kernel: [210728.231138] Call Trace: > > May 23 19:15:50 mcds1 kernel: [210728.231142] [<ffffffff81166c50>] ? __wait_on_buffer+0x30/0x30 > > May 23 19:15:50 mcds1 kernel: [210728.231146] [<ffffffff81798ae4>] schedule+0x24/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231150] [<ffffffff81798bb7>] io_schedule+0x87/0xd0 > > May 23 19:15:50 mcds1 kernel: [210728.231154] [<ffffffff81166c59>] sleep_on_buffer+0x9/0x10 > > May 23 19:15:50 mcds1 kernel: [210728.231157] [<ffffffff817971d7>] __wait_on_bit+0x57/0x80 > > May 23 19:15:50 mcds1 kernel: [210728.231161] [<ffffffff81165f7c>] ? submit_bh+0x13c/0x1f0 > > May 23 19:15:50 mcds1 kernel: [210728.231164] [<ffffffff81166c50>] ? __wait_on_buffer+0x30/0x30 > > May 23 19:15:50 mcds1 kernel: [210728.231168] [<ffffffff81797273>] out_of_line_wait_on_bit+0x73/0x90 > > May 23 19:15:50 mcds1 kernel: [210728.231172] [<ffffffff81060870>] ? autoremove_wake_function+0x40/0x40 > > May 23 19:15:50 mcds1 kernel: [210728.231175] [<ffffffff81166c46>] __wait_on_buffer+0x26/0x30 > > May 23 19:15:50 mcds1 kernel: [210728.231179] [<ffffffff811ff130>] jbd2_journal_commit_transaction+0xee0/0x1960 > > May 23 19:15:50 mcds1 kernel: [210728.231183] [<ffffffff817984fd>] ? __schedule+0x3ad/0x7a0 > > May 23 19:15:50 mcds1 kernel: [210728.231188] [<ffffffff8104c537>] ? lock_timer_base+0x37/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231191] [<ffffffff81204f62>] kjournald2+0xc2/0x240 > > May 23 19:15:50 mcds1 kernel: [210728.231195] [<ffffffff81060830>] ? wake_up_bit+0x40/0x40 > > May 23 19:15:50 mcds1 kernel: [210728.231199] [<ffffffff81204ea0>] ? commit_timeout+0x10/0x10 > > May 23 19:15:50 mcds1 kernel: [210728.231202] [<ffffffff81060136>] kthread+0xc6/0xd0 > > May 23 19:15:50 mcds1 kernel: [210728.231206] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231210] [<ffffffff817a19ac>] ret_from_fork+0x7c/0xb0 > > May 23 19:15:50 mcds1 kernel: [210728.231213] [<ffffffff81060070>] ? kthread_freezable_should_stop+0x70/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231216] umount D ffffffff81806d60 5448 25241 24864 0x00000000 > > May 23 19:15:50 mcds1 kernel: [210728.231220] ffff8801f3a89c28 0000000000000086 ffff88021fc929f0 ffff88021608cba0 > > May 23 19:15:50 mcds1 kernel: [210728.231223] ffff8801f3a89bf8 ffff8801f84d8790 0000000000012980 ffff8801f3a89fd8 > > May 23 19:15:50 mcds1 kernel: [210728.231226] ffff8801f3a88010 0000000000012980 0000000000012980 ffff8801f3a89fd8 > > May 23 19:15:50 mcds1 kernel: [210728.231230] Call Trace: > > May 23 19:15:50 mcds1 kernel: [210728.231234] [<ffffffff81798ae4>] schedule+0x24/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231237] [<ffffffff81796f4d>] schedule_timeout+0x15d/0x1f0 > > May 23 19:15:50 mcds1 kernel: [210728.231241] [<ffffffff8106f2b5>] ? __cond_resched+0x25/0x40 > > May 23 19:15:50 mcds1 kernel: [210728.231245] [<ffffffff81798f64>] wait_for_completion+0x94/0x100 > > May 23 19:15:50 mcds1 kernel: [210728.231248] [<ffffffff81070ca0>] ? 
try_to_wake_up+0x2a0/0x2a0 > > May 23 19:15:50 mcds1 kernel: [210728.231252] [<ffffffff8115ddb1>] ? bdi_queue_work+0x81/0x110 > > May 23 19:15:50 mcds1 kernel: [210728.231256] [<ffffffff8115dec3>] writeback_inodes_sb_nr+0x83/0xb0 > > May 23 19:15:50 mcds1 kernel: [210728.231261] [<ffffffff8115df4a>] writeback_inodes_sb+0x5a/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231264] [<ffffffff8116463a>] __sync_filesystem+0x4a/0x50 > > May 23 19:15:50 mcds1 kernel: [210728.231267] [<ffffffff81164672>] sync_filesystem+0x32/0x60 > > May 23 19:15:50 mcds1 kernel: [210728.231273] [<ffffffff81137be6>] generic_shutdown_super+0x36/0xe0 > > May 23 19:15:50 mcds1 kernel: [210728.231277] [<ffffffff81137cbc>] kill_block_super+0x2c/0x80 > > May 23 19:15:50 mcds1 kernel: [210728.231281] [<ffffffff811381f5>] deactivate_locked_super+0x45/0x70 > > May 23 19:15:50 mcds1 kernel: [210728.231284] [<ffffffff81139025>] deactivate_super+0x45/0x60 > > May 23 19:15:50 mcds1 kernel: [210728.231287] [<ffffffff81152f63>] mntput_no_expire+0xf3/0x150 > > May 23 19:15:50 mcds1 kernel: [210728.231290] [<ffffffff811537dd>] sys_umount+0xcd/0x3f0 > > May 23 19:15:50 mcds1 kernel: [210728.231294] [<ffffffff817a1a52>] system_call_fastpath+0x16/0x1b > > > > Regards, > > Akira Fujita > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 10+ messages in thread
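The disassembly requested above can be obtained from a vmlinux built with debug info, for example as below (paths are illustrative, and this assumes jbd2 is built into the kernel rather than loaded as a module):
  # disassemble the whole function with gdb against the matching vmlinux
  gdb -batch -ex 'disassemble jbd2_journal_commit_transaction' ./vmlinux
  # or dump the object that contains it and search for the symbol
  objdump -d fs/jbd2/commit.o | less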
[parent not found: <51A46C79.4000105@rs.jp.nec.com>]
* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) [not found] ` <51A46C79.4000105@rs.jp.nec.com> @ 2013-06-24 11:04 ` Jan Kara 2013-06-25 10:06 ` Akira Fujita 0 siblings, 1 reply; 10+ messages in thread From: Jan Kara @ 2013-06-24 11:04 UTC (permalink / raw) To: Akira Fujita; +Cc: Jan Kara, Dmitry Monakhov, ext4 development On Tue 28-05-13 17:36:09, Akira Fujita wrote: > Hello, > > (2013/05/27 17:33), Jan Kara wrote: > > On Mon 27-05-13 12:16:01, Dmitry Monakhov wrote: > >> On Mon, 27 May 2013 10:49:11 +0900, Akira Fujita <a-fujita@rs.jp.nec.com> wrote: > >>> Hi Jan, > >>> > >>> (2013/05/15 11:57), Akira Fujita wrote: > >>>> Hi, > >>>> > >>>> (2013/05/15 6:37), Jan Kara wrote: > >>>>> Hello, > >>>>> > >>>>> On Mon 13-05-13 15:49:24, Akira Fujita wrote: > >>>>>> I ran into the deaclock with xfs_test 269 on linux-3.9.0. > >>>>>> It seems happen between jbd2_log_wait_commit, sleep_on_buffer > >>>>>> and writeback_indoes (Please see ps log below). > >>>>>> Once it occurs we can't touch FS anymore. > >>>>>> In my case 300 - 1000 trials to occur. Is this known issue? > >>>>>> > >>>>>> The following kernels seems to have same problem: > >>>>>> - linux-3.5-rc5 > >>>>>> - linux-3.8.5 > >>>>>> - linux-3.9-rc7 > >>>>>> And now I'm trying it on linux-3.10-rc1. > >>>>>> > >>>>>> # ./check generic/269 > >>>>>> FSTYP -- ext4 > >>>>>> PLATFORM -- Linux/x86_64 mcds1 3.9.0 > >>>>>> MKFS_OPTIONS -- /dev/sda12 > >>>>>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2 > >>>>>> > >>>>>> > >>>>>> # ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm > >>>>>> PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND > >>>>>> 1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init > >>>>>> 2 2 TS - 0 19 0 0.0 S kthreadd kthreadd > >>>>>> 3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0 > >>>>>> ... > >>>>>> 2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0 > >>>>>> ... > >>>>>> 22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1 > >>>>>> 22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep > >>>>>> 22657 22657 TS - 0 19 2 0.0 R+ - ps > >>>>>> 25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0 > >>>>>> 28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s > >>>>>> 28964 28964 TS - 0 19 1 0.0 S+ wait check > >>>>>> 29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8 > >>>>>> 29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > >>>>>> 29199 29199 TS - 0 19 3 0.0 S+ wait 269 > >>>>>> 29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8 > >>>>>> 29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > >>>>>> 29394 29394 TS - 0 19 0 0.0 S wait fsstress > >>>>>> 29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress > >>>>>> > >>>>>> # df -T /dev/sda11 /dev/sda12 > >>>>>> Filesystem Type 1K-blocks Used Available Use% Mounted on > >>>>>> /dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1 > >>>>>> /dev/sda12 ext4 499656 499656 0 100% /mnt/mp2 > >>>>> Thanks for report. No I don't think this problem has been reported > >>>>> before. Seeing that sda12 is out of space and fsstress hangs in > >>>>> writeback_inodes(), I suspect we have some deadlock in ENOSPC recovery path > >>>>> when we want to flush data to disk to reduce delalloc uncertainty. Can you > >>>>> run 'echo w >/proc/sysrq-trigger' when the deadlock happens and post your > >>>>> dmesg here? Thanks! > >>>>> > >>>> > >>>> Thanks for reply. > >>>> I'll take that information when the deadlock happens again. > >>> > >>> Finally, I got the following messages on linux-3.9.0. > >>> Could you take a look? 
> >> Is it a real deadlock? > >> umount wait for flush-8:0 > >> flush wait for jbd2 > >> jbd2 wait for bh. > >> All buffer-heads will be written at sooner or later. > >> It probably will require significant amount of time, because fs is too > >> fragmented(that is what this test all about), but it is not a deadlock. > > Yeah, that is my analysis as well. I was only thinking whether we > > couldn't forget some buffer locked on some error recovery path (we are > > getting ENOSPC) but it's not obvious where it would be. Akira-san, could > > you also send us disassembly of your jbd2_journal_commit_transaction() > > function so that we can check where exactly offset 0xee0 is in the > > function? > > > > Honza > > I attached disasemble code of jbd2_journal_commit_tnrasaction, > please check it. Thanks. I've looked into the disassembly now and we are waiting for the write of commit record to finish. We have submitted the buffer just before so it can hardly be some issue like forgotten locked buffer. Really strange. It looks like some problem in block layer, device driver, or HW itself... I'm thinking how to debug this further - it would be good if we had blktrace data just before the moment when the hang happens but given how hard this is to reproduce the traces would be extremly large... One idea - can you try whether the problem reproduces with 'nobarrier' mount option? It could be some problem with flush / fua handling. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) 2013-06-24 11:04 ` Jan Kara @ 2013-06-25 10:06 ` Akira Fujita 2013-06-25 11:27 ` Jan Kara 0 siblings, 1 reply; 10+ messages in thread From: Akira Fujita @ 2013-06-25 10:06 UTC (permalink / raw) To: Jan Kara; +Cc: Dmitry Monakhov, ext4 development (2013/06/24 20:04), Jan Kara wrote: > On Tue 28-05-13 17:36:09, Akira Fujita wrote: >> Hello, >> >> (2013/05/27 17:33), Jan Kara wrote: >>> On Mon 27-05-13 12:16:01, Dmitry Monakhov wrote: >>>> On Mon, 27 May 2013 10:49:11 +0900, Akira Fujita <a-fujita@rs.jp.nec.com> wrote: >>>>> Hi Jan, >>>>> >>>>> (2013/05/15 11:57), Akira Fujita wrote: >>>>>> Hi, >>>>>> >>>>>> (2013/05/15 6:37), Jan Kara wrote: >>>>>>> Hello, >>>>>>> >>>>>>> On Mon 13-05-13 15:49:24, Akira Fujita wrote: >>>>>>>> I ran into the deaclock with xfs_test 269 on linux-3.9.0. >>>>>>>> It seems happen between jbd2_log_wait_commit, sleep_on_buffer >>>>>>>> and writeback_indoes (Please see ps log below). >>>>>>>> Once it occurs we can't touch FS anymore. >>>>>>>> In my case 300 - 1000 trials to occur. Is this known issue? >>>>>>>> >>>>>>>> The following kernels seems to have same problem: >>>>>>>> - linux-3.5-rc5 >>>>>>>> - linux-3.8.5 >>>>>>>> - linux-3.9-rc7 >>>>>>>> And now I'm trying it on linux-3.10-rc1. >>>>>>>> >>>>>>>> # ./check generic/269 >>>>>>>> FSTYP -- ext4 >>>>>>>> PLATFORM -- Linux/x86_64 mcds1 3.9.0 >>>>>>>> MKFS_OPTIONS -- /dev/sda12 >>>>>>>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2 >>>>>>>> >>>>>>>> >>>>>>>> # ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm >>>>>>>> PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND >>>>>>>> 1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init >>>>>>>> 2 2 TS - 0 19 0 0.0 S kthreadd kthreadd >>>>>>>> 3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0 >>>>>>>> ... >>>>>>>> 2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0 >>>>>>>> ... >>>>>>>> 22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1 >>>>>>>> 22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep >>>>>>>> 22657 22657 TS - 0 19 2 0.0 R+ - ps >>>>>>>> 25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0 >>>>>>>> 28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s >>>>>>>> 28964 28964 TS - 0 19 1 0.0 S+ wait check >>>>>>>> 29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8 >>>>>>>> 29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit >>>>>>>> 29199 29199 TS - 0 19 3 0.0 S+ wait 269 >>>>>>>> 29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8 >>>>>>>> 29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit >>>>>>>> 29394 29394 TS - 0 19 0 0.0 S wait fsstress >>>>>>>> 29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress >>>>>>>> >>>>>>>> # df -T /dev/sda11 /dev/sda12 >>>>>>>> Filesystem Type 1K-blocks Used Available Use% Mounted on >>>>>>>> /dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1 >>>>>>>> /dev/sda12 ext4 499656 499656 0 100% /mnt/mp2 >>>>>>> Thanks for report. No I don't think this problem has been reported >>>>>>> before. Seeing that sda12 is out of space and fsstress hangs in >>>>>>> writeback_inodes(), I suspect we have some deadlock in ENOSPC recovery path >>>>>>> when we want to flush data to disk to reduce delalloc uncertainty. Can you >>>>>>> run 'echo w >/proc/sysrq-trigger' when the deadlock happens and post your >>>>>>> dmesg here? Thanks! >>>>>>> >>>>>> >>>>>> Thanks for reply. >>>>>> I'll take that information when the deadlock happens again. >>>>> >>>>> Finally, I got the following messages on linux-3.9.0. 
>>>>> Could you take a look? >>>> Is it a real deadlock? >>>> umount wait for flush-8:0 >>>> flush wait for jbd2 >>>> jbd2 wait for bh. >>>> All buffer-heads will be written at sooner or later. >>>> It probably will require significant amount of time, because fs is too >>>> fragmented(that is what this test all about), but it is not a deadlock. >>> Yeah, that is my analysis as well. I was only thinking whether we >>> couldn't forget some buffer locked on some error recovery path (we are >>> getting ENOSPC) but it's not obvious where it would be. Akira-san, could >>> you also send us disassembly of your jbd2_journal_commit_transaction() >>> function so that we can check where exactly offset 0xee0 is in the >>> function? >>> >>> Honza >> >> I attached disasemble code of jbd2_journal_commit_tnrasaction, >> please check it. > Thanks. I've looked into the disassembly now and we are waiting for the > write of commit record to finish. We have submitted the buffer just before > so it can hardly be some issue like forgotten locked buffer. Really > strange. It looks like some problem in block layer, device driver, or HW > itself... I'm thinking how to debug this further - it would be good if we > had blktrace data just before the moment when the hang happens but given > how hard this is to reproduce the traces would be extremly large... > > One idea - can you try whether the problem reproduces with 'nobarrier' > mount option? It could be some problem with flush / fua handling. > OK, I'll try to reproduce this issue with 'nobarrier' (and take blktrace log if I can). Thanks, Akira Fujita ^ permalink raw reply [flat|nested] 10+ messages in thread
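Running the test with the suggested option only needs the mount options exported before invoking check; xfstests picks up MOUNT_OPTIONS from the environment (a sketch, reusing the options from the original report):
  # same configuration as the report, with barriers disabled
  export MOUNT_OPTIONS="-o acl,user_xattr,nobarrier"
  ./check generic/269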
* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) 2013-06-25 10:06 ` Akira Fujita @ 2013-06-25 11:27 ` Jan Kara 2013-06-26 8:38 ` Akira Fujita 0 siblings, 1 reply; 10+ messages in thread From: Jan Kara @ 2013-06-25 11:27 UTC (permalink / raw) To: Akira Fujita; +Cc: Jan Kara, Dmitry Monakhov, ext4 development On Tue 25-06-13 19:06:53, Akira Fujita wrote: > > > (2013/06/24 20:04), Jan Kara wrote: > > On Tue 28-05-13 17:36:09, Akira Fujita wrote: > >> Hello, > >> > >> (2013/05/27 17:33), Jan Kara wrote: > >>> On Mon 27-05-13 12:16:01, Dmitry Monakhov wrote: > >>>> On Mon, 27 May 2013 10:49:11 +0900, Akira Fujita <a-fujita@rs.jp.nec.com> wrote: > >>>>> Hi Jan, > >>>>> > >>>>> (2013/05/15 11:57), Akira Fujita wrote: > >>>>>> Hi, > >>>>>> > >>>>>> (2013/05/15 6:37), Jan Kara wrote: > >>>>>>> Hello, > >>>>>>> > >>>>>>> On Mon 13-05-13 15:49:24, Akira Fujita wrote: > >>>>>>>> I ran into the deaclock with xfs_test 269 on linux-3.9.0. > >>>>>>>> It seems happen between jbd2_log_wait_commit, sleep_on_buffer > >>>>>>>> and writeback_indoes (Please see ps log below). > >>>>>>>> Once it occurs we can't touch FS anymore. > >>>>>>>> In my case 300 - 1000 trials to occur. Is this known issue? > >>>>>>>> > >>>>>>>> The following kernels seems to have same problem: > >>>>>>>> - linux-3.5-rc5 > >>>>>>>> - linux-3.8.5 > >>>>>>>> - linux-3.9-rc7 > >>>>>>>> And now I'm trying it on linux-3.10-rc1. > >>>>>>>> > >>>>>>>> # ./check generic/269 > >>>>>>>> FSTYP -- ext4 > >>>>>>>> PLATFORM -- Linux/x86_64 mcds1 3.9.0 > >>>>>>>> MKFS_OPTIONS -- /dev/sda12 > >>>>>>>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda12 /mnt/mp2 > >>>>>>>> > >>>>>>>> > >>>>>>>> # ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm > >>>>>>>> PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND > >>>>>>>> 1 1 TS - 0 19 0 0.0 Ss poll_schedule_ti init > >>>>>>>> 2 2 TS - 0 19 0 0.0 S kthreadd kthreadd > >>>>>>>> 3 3 TS - 0 19 0 0.0 S smpboot_thread_f ksoftirqd/0 > >>>>>>>> ... > >>>>>>>> 2391 2391 TS - 0 19 2 0.1 D jbd2_log_wait_co flush-8:0 > >>>>>>>> ... > >>>>>>>> 22647 22647 TS - 0 19 3 0.0 S worker_thread kworker/3:1 > >>>>>>>> 22655 22655 TS - 0 19 0 0.0 S hrtimer_nanoslee sleep > >>>>>>>> 22657 22657 TS - 0 19 2 0.0 R+ - ps > >>>>>>>> 25330 25330 TS - 0 19 0 0.0 S worker_thread kworker/0:0 > >>>>>>>> 28963 28963 TS - 0 19 1 0.0 S+ wait loop_xfstests.s > >>>>>>>> 28964 28964 TS - 0 19 1 0.0 S+ wait check > >>>>>>>> 29180 29180 TS - 0 19 3 0.0 S kjournald2 jbd2/sda11-8 > >>>>>>>> 29181 29181 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > >>>>>>>> 29199 29199 TS - 0 19 3 0.0 S+ wait 269 > >>>>>>>> 29391 29391 TS - 0 19 0 0.6 D sleep_on_buffer jbd2/sda12-8 > >>>>>>>> 29392 29392 TS - -20 39 3 0.0 S< rescuer_thread ext4-dio-unwrit > >>>>>>>> 29394 29394 TS - 0 19 0 0.0 S wait fsstress > >>>>>>>> 29505 29505 TS - 0 19 3 0.0 D writeback_inodes fsstress > >>>>>>>> > >>>>>>>> # df -T /dev/sda11 /dev/sda12 > >>>>>>>> Filesystem Type 1K-blocks Used Available Use% Mounted on > >>>>>>>> /dev/sda11 ext4 9857264 22308 9327564 1% /mnt/mp1 > >>>>>>>> /dev/sda12 ext4 499656 499656 0 100% /mnt/mp2 > >>>>>>> Thanks for report. No I don't think this problem has been reported > >>>>>>> before. Seeing that sda12 is out of space and fsstress hangs in > >>>>>>> writeback_inodes(), I suspect we have some deadlock in ENOSPC recovery path > >>>>>>> when we want to flush data to disk to reduce delalloc uncertainty. Can you > >>>>>>> run 'echo w >/proc/sysrq-trigger' when the deadlock happens and post your > >>>>>>> dmesg here? Thanks! 
> >>>>>>> > >>>>>> > >>>>>> Thanks for reply. > >>>>>> I'll take that information when the deadlock happens again. > >>>>> > >>>>> Finally, I got the following messages on linux-3.9.0. > >>>>> Could you take a look? > >>>> Is it a real deadlock? > >>>> umount wait for flush-8:0 > >>>> flush wait for jbd2 > >>>> jbd2 wait for bh. > >>>> All buffer-heads will be written at sooner or later. > >>>> It probably will require significant amount of time, because fs is too > >>>> fragmented(that is what this test all about), but it is not a deadlock. > >>> Yeah, that is my analysis as well. I was only thinking whether we > >>> couldn't forget some buffer locked on some error recovery path (we are > >>> getting ENOSPC) but it's not obvious where it would be. Akira-san, could > >>> you also send us disassembly of your jbd2_journal_commit_transaction() > >>> function so that we can check where exactly offset 0xee0 is in the > >>> function? > >>> > >>> Honza > >> > >> I attached disasemble code of jbd2_journal_commit_tnrasaction, > >> please check it. > > Thanks. I've looked into the disassembly now and we are waiting for the > > write of commit record to finish. We have submitted the buffer just before > > so it can hardly be some issue like forgotten locked buffer. Really > > strange. It looks like some problem in block layer, device driver, or HW > > itself... I'm thinking how to debug this further - it would be good if we > > had blktrace data just before the moment when the hang happens but given > > how hard this is to reproduce the traces would be extremly large... > > > > One idea - can you try whether the problem reproduces with 'nobarrier' > > mount option? It could be some problem with flush / fua handling. > > > > OK, I'll try to reproduce this issue with 'nobarrier' > (and take blktrace log if I can). You can restart the blktrace every 5 minutes or so, so that trace files don't get too big. If we have a trace for a minute before the machine crashes it should be enough for debugging. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 10+ messages in thread
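Restarting blktrace periodically, as suggested, can be scripted so that only the most recent capture windows are kept (a sketch; the device is the scratch device from this report, the window length and retention are arbitrary):
  #!/bin/sh
  # trace /dev/sda12 in 5-minute windows, keeping the two most recent
  while :; do
      rm -rf trace.old
      [ -d trace.cur ] && mv trace.cur trace.old
      mkdir trace.cur
      # -w 300 makes blktrace stop on its own after 300 seconds
      blktrace -d /dev/sda12 -D trace.cur -w 300
  done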
* Re: [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4)
2013-06-25 11:27 ` Jan Kara
@ 2013-06-26 8:38 ` Akira Fujita
0 siblings, 0 replies; 10+ messages in thread
From: Akira Fujita @ 2013-06-26 8:38 UTC (permalink / raw)
To: Jan Kara; +Cc: Dmitry Monakhov, ext4 development
(2013/06/25 20:27), Jan Kara wrote:
> On Tue 25-06-13 19:06:53, Akira Fujita wrote:
>> (2013/06/24 20:04), Jan Kara wrote:
>>> On Tue 28-05-13 17:36:09, Akira Fujita wrote:
>>>> [...]
>>>> I attached the disassembled code of jbd2_journal_commit_transaction,
>>>> please check it.
>>> Thanks. I've looked into the disassembly now and we are waiting for the
>>> write of the commit record to finish. We have submitted the buffer just
>>> before, so it can hardly be some issue like a forgotten locked buffer.
>>> Really strange. It looks like some problem in the block layer, device
>>> driver, or HW itself... I'm thinking how to debug this further - it would
>>> be good if we had blktrace data just before the moment when the hang
>>> happens, but given how hard this is to reproduce the traces would be
>>> extremely large...
>>>
>>> One idea - can you try whether the problem reproduces with the 'nobarrier'
>>> mount option? It could be some problem with flush / FUA handling.
>>>
>> OK, I'll try to reproduce this issue with 'nobarrier'
>> (and take a blktrace log if I can).
> You can restart blktrace every 5 minutes or so, so that the trace files don't
> get too big. If we have a trace for the minute before the machine hangs, it
> should be enough for debugging.

Thanks for the advice, I'll try it.
In my environment, one run of xfstests generic/269 generates about 150MB of
blktrace log, so taking the log seems manageable.

Thanks,
Akira Fujita

^ permalink raw reply [flat|nested] 10+ messages in thread
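For the 'nobarrier' retry itself, a possible driver loop, assuming the same xfstests configuration as in the original report and that MOUNT_OPTIONS is picked up from the environment the same way the acl,user_xattr options were; the path to the xfstests checkout is a placeholder:

    cd /path/to/xfstests        # wherever ./check was run from
    n=0
    while MOUNT_OPTIONS="-o acl,user_xattr,nobarrier" ./check generic/269; do
            n=$((n + 1))
            echo "trial $n passed"
    done
    # the loop exits when a trial fails; if the hang reproduces, ./check simply
    # never returns and the blktrace capture above holds the interesting window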
Thread overview: 10+ messages
2013-05-13 6:49 [BUG] xfstest269 causes deadlock on linux-3.9.0 (ext4) Akira Fujita
2013-05-14 21:37 ` Jan Kara
2013-05-15 2:57 ` Akira Fujita
2013-05-27 1:49 ` Akira Fujita
2013-05-27 8:16 ` Dmitry Monakhov
2013-05-27 8:33 ` Jan Kara
[not found] ` <51A46C79.4000105@rs.jp.nec.com>
2013-06-24 11:04 ` Jan Kara
2013-06-25 10:06 ` Akira Fujita
2013-06-25 11:27 ` Jan Kara
2013-06-26 8:38 ` Akira Fujita