From: Dave Chinner <david@fromorbit.com>
To: Mark Tinguely <tinguely@sgi.com>
Cc: Christoph Hellwig <hch@infradead.org>, xfs@oss.sgi.com
Subject: Re: [PATCH 04/10] xfs: implement freezing by emptying the AIL
Date: Tue, 17 Apr 2012 09:54:32 +1000
Message-ID: <20120416235432.GZ6734@dastard>
In-Reply-To: <4F8C22D4.3040908@sgi.com>

On Mon, Apr 16, 2012 at 08:47:00AM -0500, Mark Tinguely wrote:
> On 03/27/12 11:44, Christoph Hellwig wrote:
> >Now that we write back all metadata either synchronously or through the AIL
> >we can simply implement metadata freezing in terms of emptying the AIL.
> >
> >The implementation for this is fairly simple and straightforward:  A new
> >routine is added that increments a counter that tells xfsaild to not stop
> >until the AIL is empty and then waits on a wakeup from
> >xfs_trans_ail_delete_bulk to signal that the AIL is empty.
> >
> >As usual the devil is in the details, in this case the filesystem shutdown
> >code.  Currently we are a bit sloppy there and do not continue ail pushing
> >in that case, and thus never reach the code in the log item implementations
> >that can unwind in case of a shutdown filesystem.  Also the code to
> >abort inode and dquot flushes was rather sloppy before and did not remove
> >the log items from the AIL, which had to be fixed as well.
> >
> >Also treat unmount the same way as freeze now, except that we still keep a
> >synchronous inode reclaim pass to make sure we reclaim all clean inodes, too.
> >
> >As an upside we can now remove the radix tree based inode writeback and
> >xfs_unmountfs_writesb.
> >
> >Signed-off-by: Christoph Hellwig<hch@lst.de>
> 
> Sorry for the empty email.
> 
> This series hangs my test boxes. This patch is the first indication
> of the hang. After rebooting and removing patch 4, the tests succeed.
> 
> The machine is still responsive. Only the SCRATCH filesystem from
> the test suite is hung.
> 
> Per Dave's observation, I added a couple inode reclaims to this
> patch and the test gets further (hangs on run 9 of test 068 rather
> than run 3).

That implies that there are dirty inodes at the VFS level leaking
through the freeze.
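For reference, the freeze mechanism described in the quoted commit message (a counter that tells the pusher not to stop until the AIL is empty, plus a wakeup from the delete path once it is) can be modelled roughly as below. This is a userspace toy under stated assumptions, not the XFS code: `ToyAIL`, `push_all_sync`, `flush_requested` and `run_pusher` are made-up names, and "writeback" is reduced to deleting items from a set.

```python
import threading


class ToyAIL:
    """Toy stand-in for the AIL: a set of items plus a flush-waiter counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._empty = threading.Condition(self._lock)
        self._items = set()
        self._flush_waiters = 0  # hypothetical name for the counter in the patch

    def insert(self, item):
        with self._lock:
            self._items.add(item)

    def snapshot(self):
        with self._lock:
            return set(self._items)

    def count(self):
        with self._lock:
            return len(self._items)

    def flush_requested(self):
        with self._lock:
            return self._flush_waiters > 0

    def delete_bulk(self, items):
        """Remove items; wake any waiter once the list is empty."""
        with self._lock:
            self._items.difference_update(items)
            if not self._items:
                self._empty.notify_all()

    def push_all_sync(self):
        """Bump the counter, then sleep until the delete path reports empty."""
        with self._lock:
            self._flush_waiters += 1
            try:
                while self._items:
                    self._empty.wait()
            finally:
                self._flush_waiters -= 1


def run_pusher(ail, stop):
    """Toy pusher: while a flush is requested, keep pushing until empty."""
    while not stop.is_set():
        if ail.flush_requested():
            batch = ail.snapshot()
            if batch:
                ail.delete_bulk(batch)  # stands in for writeback completion
        stop.wait(0.001)


def demo():
    ail = ToyAIL()
    for i in range(5):
        ail.insert(i)
    stop = threading.Event()
    pusher = threading.Thread(target=run_pusher, args=(ail, stop))
    pusher.start()
    ail.push_all_sync()  # "freeze": returns only once the AIL is empty
    remaining = ail.count()
    stop.set()
    pusher.join()
    return remaining
```

In this model the freeze can only complete if nothing keeps re-dirtying items behind the waiter's back, which is the concern raised above about dirty inodes leaking through.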

.....

> The back traces are from a Linux 3.4-rc2 kernel with just patches
> 0-4 of this series applied. This traceback does not have extra inode
> reclaims. The hang is in test 068. I did an ls and sync to the
> filesystem, so I included their tracebacks as well. This is a live system.
> 
> I have looked at the remaining patches in the series, but have not
> reviewed them because they depend on this patch...
> 
> --Mark.
> ---
> 
> crash> bt -f 20050
> PID: 20050  TASK: ffff88034a6943c0  CPU: 0   COMMAND: "fsstress"
>  #0 [ffff88034aa93d18] __schedule at ffffffff81416e50
>  #1 [ffff88034aa93e60] schedule at ffffffff814171c4
>  #2 [ffff88034aa93e70] do_wait at ffffffff81040e39
>  #3 [ffff88034aa93ee0] sys_wait4 at ffffffff81040f11
>  #4 [ffff88034aa93f80] system_call_fastpath at ffffffff8141fff9
> 
> PID: 20051  TASK: ffff88034e31e600  CPU: 3   COMMAND: "fsstress"
>  #0 [ffff88034c5c1c08] __schedule at ffffffff81416e50
>  #1 [ffff88034c5c1d50] schedule at ffffffff814171c4
>  #2 [ffff88034c5c1d60] xfs_file_aio_write at ffffffffa044d4b5 [xfs]
>  #3 [ffff88034c5c1df0] do_sync_write at ffffffff8114d3d9
>  #4 [ffff88034c5c1f10] vfs_write at ffffffff8114da0b
>  #5 [ffff88034c5c1f40] sys_write at ffffffff8114db60
>  #6 [ffff88034c5c1f80] system_call_fastpath at ffffffff8141fff9

Frozen write, not holding any locks.

> PID: 20052  TASK: ffff88034ad56080  CPU: 3   COMMAND: "fsstress"
>  #0 [ffff88034a88fbb8] __schedule at ffffffff81416e50
>  #1 [ffff88034a88fd00] schedule at ffffffff814171c4
>  #2 [ffff88034a88fd10] schedule_timeout at ffffffff81415455
>  #3 [ffff88034a88fdb0] wait_for_common at ffffffff814166b7
>  #4 [ffff88034a88fe40] wait_for_completion at ffffffff81416828
>  #5 [ffff88034a88fe50] sync_inodes_sb at ffffffff81174eaa
>  #6 [ffff88034a88fee0] __sync_filesystem at ffffffff8117a4a0
>  #7 [ffff88034a88ff00] sync_one_sb at ffffffff8117a4c7
>  #8 [ffff88034a88ff10] iterate_supers at ffffffff8115126b
>  #9 [ffff88034a88ff50] sys_sync at ffffffff8117a515
> #10 [ffff88034a88ff80] system_call_fastpath at ffffffff8141fff9

Waiting for flusher thread completion, holding the sb->s_umount lock
in read mode.

> PID: 20089  TASK: ffff88034c5ca340  CPU: 2   COMMAND: "xfs_freeze"
>  #0 [ffff88034aaafd18] __schedule at ffffffff81416e50
>  #1 [ffff88034aaafe60] schedule at ffffffff814171c4
>  #2 [ffff88034aaafe70] do_wait at ffffffff81040e39
>  #3 [ffff88034aaafee0] sys_wait4 at ffffffff81040f11
>  #4 [ffff88034aaaff80] system_call_fastpath at ffffffff8141fff9
> 
> PID: 20093  TASK: ffff88034b42a4c0  CPU: 1   COMMAND: "xfs_io"
>  #0 [ffff88034c3abc98] __schedule at ffffffff81416e50
>  #1 [ffff88034c3abde0] schedule at ffffffff814171c4
>  #2 [ffff88034c3abdf0] rwsem_down_failed_common at ffffffff81417de5
>  #3 [ffff88034c3abe60] rwsem_down_write_failed at ffffffff81417e93
>  #4 [ffff88034c3abe70] call_rwsem_down_write_failed at ffffffff8123fd93
>  #5 [ffff88034c3abeb0] down_write at ffffffff81416110
>  #6 [ffff88034c3abec0] thaw_super at ffffffff81150343
>  #7 [ffff88034c3abef0] do_vfs_ioctl at ffffffff8115efb8
>  #8 [ffff88034c3abf30] sys_ioctl at ffffffff8115f139
>  #9 [ffff88034c3abf80] system_call_fastpath at ffffffff8141fff9

Waiting for sb->s_umount, which can only be released by flusher
thread completion.

> PID: 20185  TASK: ffff88034c31c280  CPU: 1   COMMAND: "sync"
>  #0 [ffff88034afe7b88] __schedule at ffffffff81416e50
>  #1 [ffff88034afe7cd0] schedule at ffffffff814171c4
>  #2 [ffff88034afe7ce0] schedule_timeout at ffffffff81415455
>  #3 [ffff88034afe7d80] wait_for_common at ffffffff814166b7
>  #4 [ffff88034afe7e10] wait_for_completion at ffffffff81416828
>  #5 [ffff88034afe7e20] writeback_inodes_sb_nr at ffffffff81174c69
>  #6 [ffff88034afe7eb0] writeback_inodes_sb at ffffffff8117522c
>  #7 [ffff88034afe7ee0] __sync_filesystem at ffffffff8117a469
>  #8 [ffff88034afe7f00] sync_one_sb at ffffffff8117a4c7
>  #9 [ffff88034afe7f10] iterate_supers at ffffffff8115126b
> #10 [ffff88034afe7f50] sys_sync at ffffffff8117a4ff
> #11 [ffff88034afe7f80] system_call_fastpath at ffffffff8141fff9

Waiting for flusher thread completion, holding the sb->s_umount lock
in read mode.

> 
> PID: 20110  TASK: ffff88034a4820c0  CPU: 2   COMMAND: "ls"
>  #0 [ffff88034a855c78] __schedule at ffffffff81416e50
>  #1 [ffff88034a855dc0] schedule at ffffffff814171c4
>  #2 [ffff88034a855dd0] xfs_trans_alloc at ffffffffa0499fb5 [xfs]
>  #3 [ffff88034a855e30] xfs_fs_dirty_inode at ffffffffa0457aa2 [xfs]
>  #4 [ffff88034a855e60] __mark_inode_dirty at ffffffff811753da
>  #5 [ffff88034a855ea0] touch_atime at ffffffff811662db
>  #6 [ffff88034a855ef0] vfs_readdir at ffffffff8115f934
>  #7 [ffff88034a855f30] sys_getdents64 at ffffffff8115f9c3
>  #8 [ffff88034a855f80] system_call_fastpath at ffffffff8141fff9

Frozen attribute modification, no locks held.

So, what are the flusher threads doing - where are they stuck?
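The dependencies read off the backtraces above can be written down as a small wait-for graph, which makes it clear why the flusher state is the missing piece. The sketch below is only an illustration: the node labels are informal, and the `HYPOTHETICAL` edge (flusher blocked until the filesystem is thawed) is an assumption that the traces do not confirm.

```python
from typing import Dict, List, Optional

# "waiter -> what it waits on", taken from the annotated backtraces.
WAITS_ON: Dict[str, str] = {
    "xfs_io thaw (20093)": "s_umount (write)",
    # The write lock cannot be granted while sync holds it in read mode.
    "s_umount (write)": "sync (20052, 20185)",
    "sync (20052, 20185)": "flusher thread",
    # Frozen operations sleep until the thaw completes.
    "fsstress write (20051)": "thaw",
    "ls atime update (20110)": "thaw",
    "thaw": "xfs_io thaw (20093)",
}

# ASSUMPTION, not shown in the traces: the flusher is itself stuck
# behind the freeze.
HYPOTHETICAL: Dict[str, str] = {"flusher thread": "thaw"}


def find_cycle(graph: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow waits-on edges from start; return the cycle if one is hit."""
    path: List[str] = []
    seen: Dict[str, int] = {}
    node: Optional[str] = start
    while node is not None and node not in seen:
        seen[node] = len(path)
        path.append(node)
        node = graph.get(node)
    if node is None:
        return None  # chain ends at something with no known dependency
    return path[seen[node]:]


known_only = find_cycle(WAITS_ON, "xfs_io thaw (20093)")
with_guess = find_cycle({**WAITS_ON, **HYPOTHETICAL}, "xfs_io thaw (20093)")
```

With only the observed edges the chain simply ends at the flusher thread, so there is no provable deadlock yet. Adding the assumed edge closes the loop thaw -> s_umount -> sync -> flusher -> thaw, which is why the flusher backtraces are needed.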

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com