Re: [PATCH 0/11] Per-bdi writeback flusher threads v8


From: Theodore Tso <tytso@mit.edu>
To: Jens Axboe <jens.axboe@oracle.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	chris.mason@oracle.com, david@fromorbit.com, hch@infradead.org,
	akpm@linux-foundation.org, jack@suse.cz,
	yanmin_zhang@linux.intel.com, richard@rsk.demon.co.uk,
	damien.wyart@free.fr
Subject: Re: [PATCH 0/11] Per-bdi writeback flusher threads v8
Date: Wed, 27 May 2009 13:53:53 -0400
Message-ID: <20090527175353.GE10842@mit.edu>
In-Reply-To: <20090527144754.GD10842@mit.edu>

On Wed, May 27, 2009 at 10:47:54AM -0400, Theodore Tso wrote:
> 
> I'll retry the test with your stock writeback-v8 git branch w/o any
> ext4 patches planned for the next mainline merge window to see if I
> get the same soft lockup, but I thought I should give you an early
> heads up.

Confirmed.  I had to run fsstress twice, but I was able to trigger a
soft lockup with just the per-bdi v8 patches using ext4.

With ext3, fsstress didn't cause a soft lockup while it was running
--- but after the test, when I tried to unmount the filesystem,
/sbin/umount hung:

[ 2040.893469] INFO: task umount:7154 blocked for more than 120 seconds.
[ 2040.893487] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2040.893503] umount        D 000001ba  2600  7154   5885
[ 2040.893531]  ec408db8 00000046 ba2bff0b 000001ba c0be7148 c0e68bc8 c0163ebd c0a78700
[ 2040.893572]  c0a78700 ec408d74 c0164e28 e95c0000 e95c027c c2d13700 00000000 ba2d9a13
[ 2040.893612]  000001ba c0165031 00000006 e95c0000 c05e9594 00000002 ec408d9c e95c027c
[ 2040.893652] Call Trace:
[ 2040.893683]  [<c0163ebd>] ? lock_release_holdtime+0x30/0x131
[ 2040.893702]  [<c0164e28>] ? mark_lock+0x1e/0x1e4
[ 2040.893720]  [<c0165031>] ? mark_held_locks+0x43/0x5b
[ 2040.893742]  [<c05e9594>] ? _spin_unlock_irqrestore+0x3c/0x48
[ 2040.893761]  [<c01652ba>] ? trace_hardirqs_on+0xb/0xd
[ 2040.893782]  [<c05e79ff>] schedule+0x8/0x17
[ 2040.893801]  [<c01d7009>] bdi_sched_wait+0x8/0xc
[ 2040.893818]  [<c05e7ee8>] __wait_on_bit+0x36/0x5d
[ 2040.893836]  [<c01d7001>] ? bdi_sched_wait+0x0/0xc
[ 2040.893854]  [<c05e7fba>] out_of_line_wait_on_bit+0xab/0xb3
[ 2040.893872]  [<c01d7001>] ? bdi_sched_wait+0x0/0xc
[ 2040.893892]  [<c01577ae>] ? wake_bit_function+0x0/0x43
[ 2040.893911]  [<c01d618e>] wait_on_bit+0x20/0x2c
[ 2040.893929]  [<c01d6d06>] bdi_writeback_all+0x161/0x18e
[ 2040.893951]  [<c0199f63>] ? wait_on_page_writeback_range+0x9d/0xdc
[ 2040.894052]  [<c01d6e47>] generic_sync_sb_inodes+0x2f/0xcc
[ 2040.894079]  [<c01d6f52>] sync_inodes_sb+0x6e/0x76
[ 2040.894107]  [<c01c1aa0>] __fsync_super+0x63/0x66
[ 2040.894131]  [<c01c1aae>] fsync_super+0xb/0x19
[ 2040.894149]  [<c01c1d16>] generic_shutdown_super+0x1c/0xde
[ 2040.894167]  [<c01c1df5>] kill_block_super+0x1d/0x31
[ 2040.894186]  [<c01f0a85>] ? vfs_quota_off+0x0/0x12
[ 2040.894204]  [<c01c2350>] deactivate_super+0x57/0x6b
[ 2040.894223]  [<c01d2156>] mntput_no_expire+0xca/0xfb
[ 2040.894242]  [<c01d2633>] sys_umount+0x28f/0x2b4
[ 2040.894262]  [<c01d2665>] sys_oldumount+0xd/0xf
[ 2040.894281]  [<c011c264>] sysenter_do_call+0x12/0x38
[ 2040.894297] 1 lock held by umount/7154:
[ 2040.894307]  #0:  (&type->s_umount_key#31){++++..}, at: [<c01c234b>] deactivate_super+0x52/0x6b


Given that the ext4 hangs were also related to s_umount being taken by
sync_inodes(), there seems to be something going on there:

[ 3720.900031] INFO: task fsstress:8487 blocked for more than 120 seconds.
[ 3720.900041] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3720.900049] fsstress      D 00000330  2060  8487   8484
[ 3720.900063]  eddcde38 00000046 633117b6 00000330 c0be7148 c0e58ba8 c0163ebd c0a78700
[ 3720.900084]  c0a78700 eddcddf4 c0164e28 f45a0000 f45a027c c2d13700 00000000 63320800
[ 3720.900104]  00000330 c0165031 00000006 f45a0000 c05e9594 00000002 eddcde1c f45a027c
[ 3720.900124] Call Trace:
[ 3720.900142]  [<c0163ebd>] ? lock_release_holdtime+0x30/0x131
[ 3720.900151]  [<c0164e28>] ? mark_lock+0x1e/0x1e4
[ 3720.900160]  [<c0165031>] ? mark_held_locks+0x43/0x5b
[ 3720.900172]  [<c05e9594>] ? _spin_unlock_irqrestore+0x3c/0x48
[ 3720.900181]  [<c01652ba>] ? trace_hardirqs_on+0xb/0xd
[ 3720.900192]  [<c05e79ff>] schedule+0x8/0x17
[ 3720.900202]  [<c01d7009>] bdi_sched_wait+0x8/0xc
[ 3720.900211]  [<c05e7ee8>] __wait_on_bit+0x36/0x5d
[ 3720.900220]  [<c01d7001>] ? bdi_sched_wait+0x0/0xc
[ 3720.900229]  [<c05e7fba>] out_of_line_wait_on_bit+0xab/0xb3
[ 3720.900238]  [<c01d7001>] ? bdi_sched_wait+0x0/0xc
[ 3720.900248]  [<c01577ae>] ? wake_bit_function+0x0/0x43
[ 3720.900258]  [<c01d618e>] wait_on_bit+0x20/0x2c
[ 3720.900267]  [<c01d6d06>] bdi_writeback_all+0x161/0x18e
[ 3720.900277]  [<c01d6e47>] generic_sync_sb_inodes+0x2f/0xcc
[ 3720.900287]  [<c01d6f52>] sync_inodes_sb+0x6e/0x76
[ 3720.900297]  [<c01d6f9d>] __sync_inodes+0x43/0x89
[ 3720.900306]  [<c01d6ffe>] sync_inodes+0x1b/0x1e
[ 3720.900314]  [<c01d96f9>] do_sync+0x38/0x5a
[ 3720.900323]  [<c01d973f>] sys_sync+0xd/0x12
[ 3720.900333]  [<c011c264>] sysenter_do_call+0x12/0x38
[ 3720.900340] 1 lock held by fsstress/8487:
[ 3720.900345]  #0:  (&type->s_umount_key#15){++++..}, at: [<c01d6f8e>] __sync_inodes+0x34/0x89
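
For anyone who wants to compare the two reports mechanically, here is a
rough Python sketch (not part of the original report) that pulls the
blocked task, the reliable stack frames, and the held lock class out of
hung-task output, then lists the frames both tasks share starting from
schedule().  The helper names parse_hung_tasks and shared_wait_path are
made up for illustration, the regexes only cover the log layout shown
above, and SAMPLE is an abbreviated copy of the two traces.

```python
import re

# Patterns for the hung-task layout shown in the traces above (an assumption:
# other kernel versions or architectures may format these lines differently).
HEADER = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
FRAME = re.compile(r"\[<[0-9a-f]+>\] (\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
LOCK = re.compile(r"#\d+:\s+\((\S+?)\)\{[^}]*\}, at: \[<[0-9a-f]+>\] (\w+)\+")

SAMPLE = """\
[ 2040.893469] INFO: task umount:7154 blocked for more than 120 seconds.
[ 2040.893742]  [<c05e9594>] ? _spin_unlock_irqrestore+0x3c/0x48
[ 2040.893782]  [<c05e79ff>] schedule+0x8/0x17
[ 2040.893801]  [<c01d7009>] bdi_sched_wait+0x8/0xc
[ 2040.893929]  [<c01d6d06>] bdi_writeback_all+0x161/0x18e
[ 2040.894079]  [<c01d6f52>] sync_inodes_sb+0x6e/0x76
[ 2040.894107]  [<c01c1aa0>] __fsync_super+0x63/0x66
[ 2040.894307]  #0:  (&type->s_umount_key#31){++++..}, at: [<c01c234b>] deactivate_super+0x52/0x6b
[ 3720.900031] INFO: task fsstress:8487 blocked for more than 120 seconds.
[ 3720.900192]  [<c05e79ff>] schedule+0x8/0x17
[ 3720.900202]  [<c01d7009>] bdi_sched_wait+0x8/0xc
[ 3720.900267]  [<c01d6d06>] bdi_writeback_all+0x161/0x18e
[ 3720.900287]  [<c01d6f52>] sync_inodes_sb+0x6e/0x76
[ 3720.900297]  [<c01d6f9d>] __sync_inodes+0x43/0x89
[ 3720.900345]  #0:  (&type->s_umount_key#15){++++..}, at: [<c01d6f8e>] __sync_inodes+0x34/0x89
"""


def parse_hung_tasks(log):
    """Return one dict per 'blocked for more than N seconds' report."""
    reports = []
    current = None
    for line in log.splitlines():
        m = HEADER.search(line)
        if m:
            current = {"task": m.group(1), "pid": int(m.group(2)),
                       "frames": [], "locks": []}
            reports.append(current)
            continue
        if current is None:
            continue
        # Check the lock line first: it also contains a symbol+offset pair.
        m = LOCK.search(line)
        if m:
            # lockdep appends "#N" to tell apart classes that share a name;
            # drop it so the two reports can be compared by name.
            lock_class = re.sub(r"#\d+$", "", m.group(1))
            current["locks"].append((lock_class, m.group(2)))
            continue
        m = FRAME.search(line)
        if m and not m.group(1):  # skip '?' frames, which are unreliable
            current["frames"].append(m.group(2))
    return reports


def shared_wait_path(a, b):
    """Frames common to both reports, innermost (schedule) first."""
    shared = []
    for fa, fb in zip(a["frames"], b["frames"]):
        if fa != fb:
            break
        shared.append(fa)
    return shared


if __name__ == "__main__":
    first, second = parse_hung_tasks(SAMPLE)
    for report in (first, second):
        print(report["task"], report["pid"], report["locks"])
    print(" <- ".join(shared_wait_path(first, second)))
```

Fed the full traces above, the shared path should run from schedule()
up through sync_inodes_sb(), with both tasks holding an s_umount_key
lock class at that point.  It only shows where the two call chains
coincide; it says nothing about what fails to clear the bit they are
waiting on.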

  		     				       	   - Ted
