linux-fsdevel.vger.kernel.org archive mirror
From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>,
	rpeterso@redhat.com, linux-fsdevel@vger.kernel.org,
	xfs@oss.sgi.com
Subject: Re: iomap infrastructure and multipage writes V5
Date: Sun, 31 Jul 2016 21:19:00 +0200	[thread overview]
Message-ID: <20160731191900.GA29301@lst.de> (raw)
In-Reply-To: <20160718111851.GD16044@dastard>

Another quiet weekend trying to debug this, and only minor progress.

The biggest difference in traces of the old vs new code is that we manage
to allocate much bigger delalloc reservations at a time in xfs_bmapi_delay
-> xfs_bmapi_reserve_delalloc.  The old code always went for a single FSB,
which also meant allocating an indlen of 7 FSBs.  With the iomap code
we always allocate at least 4 FSBs (aka a page), and sometimes 8 or 12.
All of these still need 7 FSBs for the worst case indirect blocks.  So
what happens here is that near ENOSPC we manage to allocate more
actual delalloc blocks before actually hitting ENOSPC - even though the
old code would immediately release them again a little later in
xfs_bmap_add_extent_hole_delay after merging the delalloc extents.

On the writeback side I don't see too many changes either.  We'll
eventually run out of blocks when allocating the transaction in
xfs_iomap_write_allocate because the reserved pool is too small.  The
only real difference to before is that under the ENOSPC / out of memory
case we have allocated between 4 to 12 times more blocks, so we have
to clean up 4 to 12 times as much while write_cache_pages continues
iterating over these dirty delalloc blocks.   For me this happens
~6 times as much as before, but I still don't manage to hit an
endless loop.

Now after spending this much time I've started wondering why we even
reserve blocks in xfs_iomap_write_allocate - after all we've reserved
space for the actual data blocks and the indlen worst case in
xfs_bmapi_reserve_delalloc.  And in fact a little hack to drop that
reservation seems to solve both the root cause (depleted reserved pool)
and the cleanup mess.  I just haven't spent enough time to convince
myself that it's actually safe, and in fact looking at the allocator
makes me think it only works by accident currently despite generally
positive test results.

Here is the quick patch if anyone wants to chime in:

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 620fc91..67c317f 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -717,7 +717,7 @@ xfs_iomap_write_allocate(
 
 		nimaps = 0;
 		while (nimaps == 0) {
-			nres = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
+			nres = 0; // XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
 
 			error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, nres,
 					0, XFS_TRANS_RESERVE, &tp);


Thread overview: 34+ messages
2016-06-01 14:44 iomap infrastructure and multipage writes V5 Christoph Hellwig
2016-06-01 14:44 ` [PATCH 01/14] fs: move struct iomap from exportfs.h to a separate header Christoph Hellwig
2016-06-01 14:44 ` [PATCH 02/14] fs: introduce iomap infrastructure Christoph Hellwig
2016-06-16 16:12   ` Jan Kara
2016-06-17 12:01     ` Christoph Hellwig
2016-06-20  2:29       ` Dave Chinner
2016-06-20 12:22         ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 03/14] fs: support DAX based iomap zeroing Christoph Hellwig
2016-06-01 14:44 ` [PATCH 04/14] xfs: make xfs_bmbt_to_iomap available outside of xfs_pnfs.c Christoph Hellwig
2016-06-01 14:44 ` [PATCH 05/14] xfs: reorder zeroing and flushing sequence in truncate Christoph Hellwig
2016-06-01 14:44 ` [PATCH 06/14] xfs: implement iomap based buffered write path Christoph Hellwig
2016-06-01 14:44 ` [PATCH 07/14] xfs: remove buffered write support from __xfs_get_blocks Christoph Hellwig
2016-06-01 14:44 ` [PATCH 08/14] fs: iomap based fiemap implementation Christoph Hellwig
2016-08-18 13:18   ` Andreas Grünbacher
2016-06-01 14:44 ` [PATCH 09/14] xfs: use iomap " Christoph Hellwig
2016-06-01 14:44 ` [PATCH 10/14] xfs: use iomap infrastructure for DAX zeroing Christoph Hellwig
2016-06-01 14:44 ` [PATCH 11/14] xfs: handle 64-bit length in xfs_iozero Christoph Hellwig
2016-06-01 14:44 ` [PATCH 12/14] xfs: use xfs_zero_range in xfs_zero_eof Christoph Hellwig
2016-06-01 14:44 ` [PATCH 13/14] xfs: split xfs_free_file_space in manageable pieces Christoph Hellwig
2016-06-01 14:44 ` [PATCH 14/14] xfs: kill xfs_zero_remaining_bytes Christoph Hellwig
2016-06-01 14:46 ` iomap infrastructure and multipage writes V5 Christoph Hellwig
2016-06-28  0:26 ` Dave Chinner
2016-06-28 13:28   ` Christoph Hellwig
2016-06-28 13:38     ` Christoph Hellwig
2016-06-30 17:22   ` Christoph Hellwig
2016-06-30 23:16     ` Dave Chinner
2016-07-18 11:14     ` Dave Chinner
2016-07-18 11:18       ` Dave Chinner
2016-07-31 19:19         ` Christoph Hellwig [this message]
2016-08-01  0:16           ` Dave Chinner
2016-08-02 23:42           ` Dave Chinner
2017-02-13 22:31             ` Eric Sandeen
2017-02-14  6:10               ` Christoph Hellwig
2016-07-19  3:50       ` Christoph Hellwig
