linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG][dioread_nolock] blocked for more than 120s when we run xfstests #269
@ 2013-03-07 12:40 Zheng Liu
  2013-03-07 15:11 ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Zheng Liu @ 2013-03-07 12:40 UTC (permalink / raw)
  To: linux-ext4

Hi all,

This bug will be triggered when dioread_nolock enables.  In 3.8 kernel,
the test will be blocked for more than 120s, and we get the following
messages.  *But* in dev branch the system will hang silently without any
messages.  I need to run test 7 times to hit it.

wenqing: run xfstest 269
kernel: EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts:
acl,user_xattr,dioread_nolock
kernel: INFO: task umount:3376 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kernel: umount          D 0000000000000001     0  3376   3003 0x00000000
kernel: ffff880117417cd8 0000000000000082 ffff880117417bd8 ffff880117416010
kernel: ffff880112d12110 0000000000012080 ffff880117417fd8 0000000000004000
kernel: ffff880117417fd8 0000000000012080 ffff88011331d950 ffff880112d12110
kernel: Call Trace:
kernel: [<ffffffff820ba03c>] ? pagevec_lookup+0x22/0x2b
kernel: [<ffffffff820bb526>] ? truncate_inode_pages_range+0x273/0x3a2
kernel: [<ffffffff8237ef2f>] schedule+0x64/0x66
kernel: [<ffffffffa0203d7c>] ext4_ioend_wait+0x87/0x9f [ext4]
kernel: [<ffffffff8204f0be>] ? wake_up_bit+0x2a/0x2a
kernel: [<ffffffffa0201396>] ext4_evict_inode+0x4a/0x41d [ext4]
kernel: [<ffffffff82102c8c>] evict+0xa2/0x15b
kernel: [<ffffffff821ba2b8>] ? __list_del_entry+0x51/0x98
kernel: [<ffffffff821031b7>] dispose_list+0x3e/0x50
kernel: [<ffffffff8210365b>] evict_inodes+0xdb/0xe7
kernel: [<ffffffff820ef954>] generic_shutdown_super+0x4c/0xd3
kernel: [<ffffffff820efa02>] kill_block_super+0x27/0x69
kernel: [<ffffffff820efe56>] deactivate_locked_super+0x26/0x52
kernel: [<ffffffff820f0ad2>] deactivate_super+0x45/0x4a
kernel: [<ffffffff821074b9>] mntput_no_expire+0x110/0x118
kernel: [<ffffffff82108331>] sys_umount+0x306/0x331
kernel: [<ffffffff82041bf6>] ? sigprocmask+0x63/0x67
kernel: [<ffffffff82386942>] system_call_fastpath+0x16/0x1b

Regards,
                                                - Zheng

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG][dioread_nolock] blocked for more than 120s when we run xfstests #269
  2013-03-07 12:40 [BUG][dioread_nolock] blocked for more than 120s when we run xfstests #269 Zheng Liu
@ 2013-03-07 15:11 ` Jan Kara
       [not found]   ` <20130308135222.GA2768@gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2013-03-07 15:11 UTC (permalink / raw)
  To: Zheng Liu; +Cc: linux-ext4

On Thu 07-03-13 20:40:54, Zheng Liu wrote:
> Hi all,
> 
> This bug will be triggered when dioread_nolock enables.  In 3.8 kernel,
> the test will be blocked for more than 120s, and we get the following
> messages.  *But* in dev branch the system will hang silently without any
> messages.  I need to run test 7 times to hit it.
> 
> wenqing: run xfstest 269
> kernel: EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts:
> acl,user_xattr,dioread_nolock
> kernel: INFO: task umount:3376 blocked for more than 120 seconds.
> kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> kernel: umount          D 0000000000000001     0  3376   3003 0x00000000
> kernel: ffff880117417cd8 0000000000000082 ffff880117417bd8 ffff880117416010
> kernel: ffff880112d12110 0000000000012080 ffff880117417fd8 0000000000004000
> kernel: ffff880117417fd8 0000000000012080 ffff88011331d950 ffff880112d12110
> kernel: Call Trace:
> kernel: [<ffffffff820ba03c>] ? pagevec_lookup+0x22/0x2b
> kernel: [<ffffffff820bb526>] ? truncate_inode_pages_range+0x273/0x3a2
> kernel: [<ffffffff8237ef2f>] schedule+0x64/0x66
> kernel: [<ffffffffa0203d7c>] ext4_ioend_wait+0x87/0x9f [ext4]
> kernel: [<ffffffff8204f0be>] ? wake_up_bit+0x2a/0x2a
> kernel: [<ffffffffa0201396>] ext4_evict_inode+0x4a/0x41d [ext4]
> kernel: [<ffffffff82102c8c>] evict+0xa2/0x15b
> kernel: [<ffffffff821ba2b8>] ? __list_del_entry+0x51/0x98
> kernel: [<ffffffff821031b7>] dispose_list+0x3e/0x50
> kernel: [<ffffffff8210365b>] evict_inodes+0xdb/0xe7
> kernel: [<ffffffff820ef954>] generic_shutdown_super+0x4c/0xd3
> kernel: [<ffffffff820efa02>] kill_block_super+0x27/0x69
> kernel: [<ffffffff820efe56>] deactivate_locked_super+0x26/0x52
> kernel: [<ffffffff820f0ad2>] deactivate_super+0x45/0x4a
> kernel: [<ffffffff821074b9>] mntput_no_expire+0x110/0x118
> kernel: [<ffffffff82108331>] sys_umount+0x306/0x331
> kernel: [<ffffffff82041bf6>] ? sigprocmask+0x63/0x67
> kernel: [<ffffffff82386942>] system_call_fastpath+0x16/0x1b
  I think I see the problem
  Hum, either we missed a wakeup or ioend gets stuck for some reason and is
never processed. Can you maybe post sysrq-w output once the kernel gets
stuck? Thanks.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG][dioread_nolock] blocked for more than 120s when we run xfstests #269
       [not found]       ` <20130313091511.GB29730@quack.suse.cz>
@ 2013-03-13 10:52         ` Zheng Liu
  2013-03-13 11:04           ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Zheng Liu @ 2013-03-13 10:52 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Wed, Mar 13, 2013 at 10:15:11AM +0100, Jan Kara wrote:
[snip]
> > > I post the sysrq-w output here.  But IMHO it is not very useful.  So I
> > > also post the sysrq-t output.
> >   Heh, curious. Thanks for the data. So worker thinks there's nothing to do
> > but inode has elevated i_ioend_count... Maybe we leaked ioend somewhere.
> > I'll check the code when I have time.
>   Ah, I think I see what's going on.
> a) Code in ext4_ext_direct_IO() is racy wrt iocb->private handling (that
>    can get cleared concurrently from ext4_end_io_dio()).

Thanks for tracing this problem.  But I am still confused that iocb is
allocated on stack in do_sync_write(), and is allocated from slab in
ioctx_alloc().  You mean iocb in ext4_ext_direct_IO and ext4_end_io_dio
is the same one?  Then this iocb could be changed concurrently, and we
are blocked for more than 120s.  I must miss something.

> b) ext4_end_io_dio() forgets to free the io_end if size == 0 (but this
>    shouldn't really happen looking into fs/direct_io.c).

Yes, we will return directly from do_blockdev_direct_IO().

Regards,
                                                - Zheng

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG][dioread_nolock] blocked for more than 120s when we run xfstests #269
  2013-03-13 10:52         ` Zheng Liu
@ 2013-03-13 11:04           ` Jan Kara
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Kara @ 2013-03-13 11:04 UTC (permalink / raw)
  To: Zheng Liu; +Cc: Jan Kara, linux-ext4

On Wed 13-03-13 18:52:33, Zheng Liu wrote:
> On Wed, Mar 13, 2013 at 10:15:11AM +0100, Jan Kara wrote:
> [snip]
> > > > I post the sysrq-w output here.  But IMHO it is not very useful.  So I
> > > > also post the sysrq-t output.
> > >   Heh, curious. Thanks for the data. So worker thinks there's nothing to do
> > > but inode has elevated i_ioend_count... Maybe we leaked ioend somewhere.
> > > I'll check the code when I have time.
> >   Ah, I think I see what's going on.
> > a) Code in ext4_ext_direct_IO() is racy wrt iocb->private handling (that
> >    can get cleared concurrently from ext4_end_io_dio()).
> 
> Thanks for tracing this problem.  But I am still confused that iocb is
> allocated on stack in do_sync_write(), and is allocated from slab in
> ioctx_alloc().  You mean iocb in ext4_ext_direct_IO and ext4_end_io_dio
> is the same one?
  Yes, it is.

> Then this iocb could be changed concurrently, and we are blocked for more
> than 120s.  I must miss something.
  Well, the hang results from direct IO code forgetting to call
ext4_free_io_end() in some (likely error recovery) path. So
inode->i_ioend_count remains elevated and we never finish waiting in
ext4_evict_inode(). How that forgotten ext4_free_io_end() really happens
isn't 100% clear to me but I really suspect something with concurrent iocb
modification goes wrong...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-03-13 11:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-07 12:40 [BUG][dioread_nolock] blocked for more than 120s when we run xfstests #269 Zheng Liu
2013-03-07 15:11 ` Jan Kara
     [not found]   ` <20130308135222.GA2768@gmail.com>
     [not found]     ` <20130311163041.GL29799@quack.suse.cz>
     [not found]       ` <20130313091511.GB29730@quack.suse.cz>
2013-03-13 10:52         ` Zheng Liu
2013-03-13 11:04           ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).