public inbox for linux-xfs@vger.kernel.org
From: Mark Tinguely <tinguely@sgi.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 09/10] xfs: on-stack delayed write buffer lists
Date: Fri, 20 Apr 2012 13:19:46 -0500
Message-ID: <4F91A8C2.3050907@sgi.com>
In-Reply-To: <20120327164646.975031281@bombadil.infradead.org>

On 03/27/12 11:44, Christoph Hellwig wrote:
> Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
> and write back the buffers per-process instead of by waking up xfsbufd.
>
> This is now easily doable given that we have very few places left that write
> delwri buffers:
>
>   - log recovery:
> 	Only done at mount time, and already forcing out the buffers
> 	synchronously using xfs_flush_buftarg
>
>   - quotacheck:
> 	Same story.
>
>   - dquot reclaim:
> 	Writes out dirty dquots on the LRU under memory pressure.  We might
> 	want to look into doing more of this via xfsaild, but it's already
> 	more optimal than the synchronous inode reclaim that writes each
> 	buffer synchronously.
>
>   - xfsaild:
> 	This is the main beneficiary of the change.  By keeping a local list
> 	of buffers to write we reduce latency of writing out buffers, and
> 	more importantly we can remove all the delwri list promotions which
> 	were hitting the buffer cache hard under sustained metadata loads.
>
> The implementation is very straightforward - xfs_buf_delwri_queue now gets
> a new list_head pointer that it adds the delwri buffers to, and all callers
> need to eventually submit the list using xfs_buf_delwri_submit or
> xfs_buf_delwri_submit_nowait.  Buffers that already are on a delwri list are
> skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
> list.  The biggest change to pass down the buffer list was done to the AIL
> pushing. Now that we operate on buffers the trylock, push and pushbuf log
> item methods are merged into a single push routine, which tries to lock the
> item, and if possible add the buffer that needs writeback to the buffer list.
> This leads to much simpler code than the previous split but requires the
> individual IOP_PUSH instances to unlock and reacquire the AIL around calls
> to blocking routines.
>
> Given that xfsaild now also handles writing out buffers, the conditions for
> log forcing and the sleep times needed some small changes.  The most
> important one is that we consider an AIL busy as long as we still have buffers
> to push, and the other one is that we do increment the pushed LSN for
> buffers that are under flushing at this moment, but still count them towards
> the stuck items for restart purposes.  Without this we could hammer on stuck
> items without ever forcing the log and not make progress under heavy random
> delete workloads on fast flash storage devices.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
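
To make the queue-then-submit contract described above concrete, here is a minimal user-space sketch. It is not the kernel code: struct buf, struct buf_list, delwri_queue(), delwri_submit() and demo() are hypothetical stand-ins for xfs_buf, the on-stack list_head, xfs_buf_delwri_queue and xfs_buf_delwri_submit, and the "write" is just a flag being set. It only illustrates that a buffer already on a delwri list is skipped and that the caller owns the list until it submits it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfs_buf; only what the sketch needs. */
struct buf {
	int		id;
	bool		queued;		/* already on some delwri list */
	bool		written;	/* set when the "I/O" has been done */
	struct buf	*next;
};

/* Hypothetical stand-in for the caller's on-stack list_head. */
struct buf_list {
	struct buf	*head;
	struct buf	**tail;
};

static void buf_list_init(struct buf_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Add bp to the caller's list; skip it if it already sits on a list. */
static bool delwri_queue(struct buf *bp, struct buf_list *l)
{
	if (bp->queued)
		return false;
	bp->queued = true;
	bp->next = NULL;
	*l->tail = bp;
	l->tail = &bp->next;
	return true;
}

/* "Write" every buffer on the list, empty the list, return the count. */
static int delwri_submit(struct buf_list *l)
{
	int		n = 0;
	struct buf	*bp = l->head;

	while (bp) {
		struct buf *next = bp->next;

		bp->written = true;
		bp->queued = false;
		bp->next = NULL;
		n++;
		bp = next;
	}
	buf_list_init(l);
	return n;
}

/* Usage: queue two buffers (one of them twice), then submit the list. */
static int demo(void)
{
	struct buf	a = { .id = 1 }, b = { .id = 2 };
	struct buf_list	list;

	buf_list_init(&list);
	delwri_queue(&a, &list);
	delwri_queue(&b, &list);
	bool again = delwri_queue(&a, &list);	/* already queued: skipped */
	assert(!again);
	return delwri_submit(&list);		/* both buffers written once */
}
```

The point of the design, as I read the description, is that the list lives in the caller's stack frame, so whoever queued the buffers is also the one who has to submit them.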

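The accounting change in the last quoted paragraph can be sketched the same way. This is an assumption-laden user-space model, not xfs_trans_ail.c: enum push_result, struct log_item, struct push_stats, item_push(), ail_push_pass() and ail_still_busy() are all made-up names. It shows one merged push routine that either queues an item for writeback or reports why it could not, a pass that advances the pushed LSN for items that are already being flushed while still counting them as stuck, and an AIL that counts as busy while buffers remain queued for writing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for the merged push routine. */
enum push_result { PUSH_SUCCESS, PUSH_FLUSHING, PUSH_LOCKED };

/* Hypothetical log item; real items carry far more state. */
struct log_item {
	long	lsn;
	bool	locked;		/* trylock would fail */
	bool	flushing;	/* backing buffer is already under I/O */
	bool	queued;		/* buffer was added to the caller's list */
};

struct push_stats {
	long	pushed_lsn;	/* last LSN we consider pushed */
	int	queued;		/* buffers waiting to be submitted */
	int	stuck;		/* items that made no progress */
};

/* Single push routine: try to lock, then queue the buffer if possible. */
static enum push_result item_push(struct log_item *lip, int *nqueued)
{
	if (lip->locked)
		return PUSH_LOCKED;
	if (lip->flushing)
		return PUSH_FLUSHING;
	lip->queued = true;
	(*nqueued)++;
	return PUSH_SUCCESS;
}

/* One pass over the items, in LSN order. */
static struct push_stats ail_push_pass(struct log_item *items, size_t n)
{
	struct push_stats st = { 0 };

	assert(items != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		switch (item_push(&items[i], &st.queued)) {
		case PUSH_SUCCESS:
			st.pushed_lsn = items[i].lsn;
			break;
		case PUSH_FLUSHING:
			/* advance the LSN, but still count it as stuck */
			st.pushed_lsn = items[i].lsn;
			st.stuck++;
			break;
		case PUSH_LOCKED:
			st.stuck++;
			break;
		}
	}
	return st;
}

/* The AIL counts as busy while buffers are still waiting to be written. */
static bool ail_still_busy(const struct push_stats *st)
{
	return st->queued > 0;
}
```

With one pushable, one flushing and one locked item, a pass in this model queues one buffer, counts two stuck items and leaves the pushed LSN at the flushing item's LSN.
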
Test 106 runs to completion with patch 06.

Patches 07 and 08 do not compile without patch 09.

Starting with patch 09, I get the following hang on every run of test 106:

PID: 27992  TASK: ffff8808310d00c0  CPU: 2   COMMAND: "mount"
  #0 [ffff880834237938] __schedule at ffffffff81417200
  #1 [ffff880834237a80] schedule at ffffffff81417574
  #2 [ffff880834237a90] schedule_timeout at ffffffff81415805
  #3 [ffff880834237b30] wait_for_common at ffffffff81416a67
  #4 [ffff880834237bc0] wait_for_completion at ffffffff81416bd8
  #5 [ffff880834237bd0] xfs_buf_iowait at ffffffffa04fc5a5 [xfs]
  #6 [ffff880834237c00] xfs_buf_delwri_submit at ffffffffa04fe4b9 [xfs]
  #7 [ffff880834237c40] xfs_qm_quotacheck at ffffffffa055cb2d [xfs]
  #8 [ffff880834237cc0] xfs_qm_mount_quotas at ffffffffa055cdf0 [xfs]
  #9 [ffff880834237cf0] xfs_mountfs at ffffffffa054c041 [xfs]
 #10 [ffff880834237d40] xfs_fs_fill_super at ffffffffa050ca80 [xfs]
 #11 [ffff880834237d70] mount_bdev at ffffffff81150c5c
 #12 [ffff880834237de0] xfs_fs_mount at ffffffffa050ac00 [xfs]
 #13 [ffff880834237df0] mount_fs at ffffffff811505f8
 #14 [ffff880834237e40] vfs_kern_mount at ffffffff8116c070
 #15 [ffff880834237e80] do_kern_mount at ffffffff8116c16e
 #16 [ffff880834237ec0] do_mount at ffffffff8116d6f0
 #17 [ffff880834237f20] sys_mount at ffffffff8116d7f3
 #18 [ffff880834237f80] system_call_fastpath at ffffffff814203b9


The workers seem to be idle. For example, xfsaild:

PID: 27676  TASK: ffff880832880240  CPU: 3   COMMAND: "xfsaild/sda7"
  #0 [ffff880832933cb0] __schedule at ffffffff81417200
  #1 [ffff880832933df8] schedule at ffffffff81417574
  #2 [ffff880832933e08] schedule_timeout at ffffffff81415805
  #3 [ffff880832933ea8] xfsaild at ffffffffa0555935 [xfs]
  #4 [ffff880832933ee8] kthread at ffffffff8105dd6e
  #5 [ffff880832933f48] kernel_thread_helper at ffffffff814216a4


The hang is on the third quotacheck.

This should be easy to reproduce.

--Mark Tinguely.

