public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Mark Tinguely <tinguely@sgi.com>
Cc: Christoph Hellwig <hch@infradead.org>, xfs@oss.sgi.com
Subject: Re: [PATCH 04/10] xfs: implement freezing by emptying the AIL
Date: Tue, 17 Apr 2012 09:54:32 +1000
Message-ID: <20120416235432.GZ6734@dastard>
In-Reply-To: <4F8C22D4.3040908@sgi.com>

On Mon, Apr 16, 2012 at 08:47:00AM -0500, Mark Tinguely wrote:
> On 03/27/12 11:44, Christoph Hellwig wrote:
> >Now that we write back all metadata either synchronously or through the AIL
> >we can simply implement metadata freezing in terms of emptying the AIL.
> >
> >The implementation for this is fairly simply and straight-forward:  A new
> >routine is added that increments a counter that tells xfsaild to not stop
> >until the AIL is empty and then waits on a wakeup from
> >xfs_trans_ail_delete_bulk to signal that the AIL is empty.
> >
> >As usual the devil is in the details, in this case the filesystem shutdown
> >code.  Currently we are a bit sloppy there and do not continue ail pushing
> >in that case, and thus never reach the code in the log item implementations
> >that can unwind in case of a shutdown filesystem.  Also the code to
> >abort inode and dquot flushes was rather sloppy before and did not remove
> >the log items from the AIL, which had to be fixed as well.
> >
> >Also treat unmount the same way as freeze now, except that we still keep a
> >synchronous inode reclaim pass to make sure we reclaim all clean inodes, too.
> >
> >As an upside we can now remove the radix tree based inode writeback and
> >xfs_unmountfs_writesb.
> >
> >Signed-off-by: Christoph Hellwig<hch@lst.de>
> 
> Sorry for the empty email.
> 
> This series hangs my test boxes. This patch is the first indication
> of the hang. Reboot, and remove patch 4 and the test are successful.
> 
> The machine is still responsive. Only the SCRATCH filesystem from
> the test suite is hung.
> 
> Per Dave's observation, I added a couple inode reclaims to this
> patch and the test gets further (hangs on run 9 of test 068 rather
> than run 3).

That implies that there are dirty inodes at the VFS level leaking
through the freeze.

.....

> The back traces are from a Linux 3.4-rc2 kernel with just patches
> 0-4 of this series applied. This traceback does not have extra inode
> reclaims. The hang is in test 068. I did an ls and sync to the
> filesystem, so I included their tracebacks as well.  live system.
> 
> I have looked at the remaining patches in the series, but have not
> reviewed them because they depend on this patch...
> 
> --Mark.
> ---
> 
> crash> bt -f 20050
> PID: 20050  TASK: ffff88034a6943c0  CPU: 0   COMMAND: "fsstress"
>  #0 [ffff88034aa93d18] __schedule at ffffffff81416e50
>  #1 [ffff88034aa93e60] schedule at ffffffff814171c4
>  #2 [ffff88034aa93e70] do_wait at ffffffff81040e39
>  #3 [ffff88034aa93ee0] sys_wait4 at ffffffff81040f11
>  #4 [ffff88034aa93f80] system_call_fastpath at ffffffff8141fff9
> 
> PID: 20051  TASK: ffff88034e31e600  CPU: 3   COMMAND: "fsstress"
>  #0 [ffff88034c5c1c08] __schedule at ffffffff81416e50
>  #1 [ffff88034c5c1d50] schedule at ffffffff814171c4
>  #2 [ffff88034c5c1d60] xfs_file_aio_write at ffffffffa044d4b5 [xfs]
>  #3 [ffff88034c5c1df0] do_sync_write at ffffffff8114d3d9
>  #4 [ffff88034c5c1f10] vfs_write at ffffffff8114da0b
>  #5 [ffff88034c5c1f40] sys_write at ffffffff8114db60
>  #6 [ffff88034c5c1f80] system_call_fastpath at ffffffff8141fff9

Frozen write, not holding any locks.

> PID: 20052  TASK: ffff88034ad56080  CPU: 3   COMMAND: "fsstress"
>  #0 [ffff88034a88fbb8] __schedule at ffffffff81416e50
>  #1 [ffff88034a88fd00] schedule at ffffffff814171c4
>  #2 [ffff88034a88fd10] schedule_timeout at ffffffff81415455
>  #3 [ffff88034a88fdb0] wait_for_common at ffffffff814166b7
>  #4 [ffff88034a88fe40] wait_for_completion at ffffffff81416828
>  #5 [ffff88034a88fe50] sync_inodes_sb at ffffffff81174eaa
>  #6 [ffff88034a88fee0] __sync_filesystem at ffffffff8117a4a0
>  #7 [ffff88034a88ff00] sync_one_sb at ffffffff8117a4c7
>  #8 [ffff88034a88ff10] iterate_supers at ffffffff8115126b
>  #9 [ffff88034a88ff50] sys_sync at ffffffff8117a515
> #10 [ffff88034a88ff80] system_call_fastpath at ffffffff8141fff9

Waiting for flusher thread completion, holding the sb->s_umount lock
in read mode.

> PID: 20089  TASK: ffff88034c5ca340  CPU: 2   COMMAND: "xfs_freeze"
>  #0 [ffff88034aaafd18] __schedule at ffffffff81416e50
>  #1 [ffff88034aaafe60] schedule at ffffffff814171c4
>  #2 [ffff88034aaafe70] do_wait at ffffffff81040e39
>  #3 [ffff88034aaafee0] sys_wait4 at ffffffff81040f11
>  #4 [ffff88034aaaff80] system_call_fastpath at ffffffff8141fff9
> 
> PID: 20093  TASK: ffff88034b42a4c0  CPU: 1   COMMAND: "xfs_io"
>  #0 [ffff88034c3abc98] __schedule at ffffffff81416e50
>  #1 [ffff88034c3abde0] schedule at ffffffff814171c4
>  #2 [ffff88034c3abdf0] rwsem_down_failed_common at ffffffff81417de5
>  #3 [ffff88034c3abe60] rwsem_down_write_failed at ffffffff81417e93
>  #4 [ffff88034c3abe70] call_rwsem_down_write_failed at ffffffff8123fd93
>  #5 [ffff88034c3abeb0] down_write at ffffffff81416110
>  #6 [ffff88034c3abec0] thaw_super at ffffffff81150343
>  #7 [ffff88034c3abef0] do_vfs_ioctl at ffffffff8115efb8
>  #8 [ffff88034c3abf30] sys_ioctl at ffffffff8115f139
>  #9 [ffff88034c3abf80] system_call_fastpath at ffffffff8141fff9

Waiting for sb->s_umount in write mode; the readers holding it can
only release it on flusher thread completion.

> PID: 20185  TASK: ffff88034c31c280  CPU: 1   COMMAND: "sync"
>  #0 [ffff88034afe7b88] __schedule at ffffffff81416e50
>  #1 [ffff88034afe7cd0] schedule at ffffffff814171c4
>  #2 [ffff88034afe7ce0] schedule_timeout at ffffffff81415455
>  #3 [ffff88034afe7d80] wait_for_common at ffffffff814166b7
>  #4 [ffff88034afe7e10] wait_for_completion at ffffffff81416828
>  #5 [ffff88034afe7e20] writeback_inodes_sb_nr at ffffffff81174c69
>  #6 [ffff88034afe7eb0] writeback_inodes_sb at ffffffff8117522c
>  #7 [ffff88034afe7ee0] __sync_filesystem at ffffffff8117a469
>  #8 [ffff88034afe7f00] sync_one_sb at ffffffff8117a4c7
>  #9 [ffff88034afe7f10] iterate_supers at ffffffff8115126b
> #10 [ffff88034afe7f50] sys_sync at ffffffff8117a4ff
> #11 [ffff88034afe7f80] system_call_fastpath at ffffffff8141fff9

Waiting for flusher thread completion, holding the sb->s_umount lock
in read mode.

> 
> PID: 20110  TASK: ffff88034a4820c0  CPU: 2   COMMAND: "ls"
>  #0 [ffff88034a855c78] __schedule at ffffffff81416e50
>  #1 [ffff88034a855dc0] schedule at ffffffff814171c4
>  #2 [ffff88034a855dd0] xfs_trans_alloc at ffffffffa0499fb5 [xfs]
>  #3 [ffff88034a855e30] xfs_fs_dirty_inode at ffffffffa0457aa2 [xfs]
>  #4 [ffff88034a855e60] __mark_inode_dirty at ffffffff811753da
>  #5 [ffff88034a855ea0] touch_atime at ffffffff811662db
>  #6 [ffff88034a855ef0] vfs_readdir at ffffffff8115f934
>  #7 [ffff88034a855f30] sys_getdents64 at ffffffff8115f9c3
>  #8 [ffff88034a855f80] system_call_fastpath at ffffffff8141fff9

Frozen attribute modification, no locks held.

So, what are the flusher threads doing - where are they stuck?
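
To spell out why everything points at the flusher threads, here is a
rough sketch of the wait-for chain read off the traces above. The dict
layout and the node names are purely illustrative, not anything from
the kernel, and the flusher has no outgoing edge only because its
stack is not in the dump:

```python
# Wait-for edges read off the annotated backtraces (illustrative model only).
# Each key is blocked until every task it maps to makes progress.
WAITS_ON = {
    "fsstress write (20051)": ["xfs_io thaw_super (20093)"],  # frozen write
    "ls atime update (20110)": ["xfs_io thaw_super (20093)"],  # frozen attr change
    # thaw_super needs s_umount in write mode; both syncs hold it in read mode
    "xfs_io thaw_super (20093)": ["fsstress sys_sync (20052)", "sync (20185)"],
    "fsstress sys_sync (20052)": ["flusher"],
    "sync (20185)": ["flusher"],
    # "flusher" has no entry: its state is unknown from the traces.
}


def root_blockers(start, waits_on):
    """Return the tasks reachable from start that wait on nothing we know of."""
    roots, stack, seen = set(), [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        deps = waits_on.get(node, [])
        if not deps:
            roots.add(node)
        stack.extend(deps)
    return roots


print(root_blockers("ls atime update (20110)", WAITS_ON))
```

Whichever blocked task you start from, the walk ends at the flusher,
so that is the stack trace that is needed next.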

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

