From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: rpeterso@redhat.com, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: iomap infrastructure and multipage writes V5
Date: Thu, 30 Jun 2016 19:22:39 +0200 [thread overview]
Message-ID: <20160630172239.GA23082@lst.de> (raw)
In-Reply-To: <20160628002649.GI12670@dastard>
On Tue, Jun 28, 2016 at 10:26:49AM +1000, Dave Chinner wrote:
> Christoph, it look slike there's an ENOSPC+ENOMEM behavioural regression here.
> generic/224 on my 1p/1GB RAM VM using a 1k lock size filesystem has
> significantly different behaviour once ENOSPC is hit withi this patchset.
>
> It ends up with an endless stream of errors like this:
I've spent some time trying to reproduce this. I'm actually getting
the OOM killer almost reproducible for for-next without the iomap
patches as well when just using 1GB of mem. 1400 MB is the minimum
I can reproducibly finish the test with either code base.
But with the 1400 MB setup I see a few interesting things. Even
with the baseline, no-iomap case I see a few errors in the log:
[ 70.407465] Filesystem "vdc": reserve blocks depleted! Consider increasing
reserve pool
size.
[ 70.195645] XFS (vdc): page discard on page ffff88005682a988, inode 0xd3, offset 761856.
[ 70.408079] Buffer I/O error on dev vdc, logical block 1048513, lost async
page write
[ 70.408598] Buffer I/O error on dev vdc, logical block 1048514, lost async
page write
27s
With iomap I also see the spew of page discard errors your see, but while
I see a lot of them, the rest still finishes after a reasonable time,
just a few seconds more than the pre-iomap baseline. I also see the
reserve block depleted message in this case.
Digging into the reserve block depleted message - it seems we have
too many parallel iomap_allocate transactions going on. I suspect
this might be because the writeback code will not finish a writeback
context if we have multiple blocks inside a page, which can
happen easily for this 1k ENOSPC setup. I've not had time to fully
check if this is what really happens, but I did a quick hack (see below)
to only allocate 1k at a time in iomap_begin, and with that generic/224
finishes without the warning spew. Of course this isn't a real fix,
and I need to fully understand what's going on in writeback due to
different allocation / dirtying patterns from the iomap change.
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 620fc91..d9afba2 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1018,7 +1018,7 @@ xfs_file_iomap_begin(
* Note that the values needs to be less than 32-bits wide until
* the lower level functions are updated.
*/
- length = min_t(loff_t, length, 1024 * PAGE_SIZE);
+ length = min_t(loff_t, length, 1024);
if (xfs_get_extsz_hint(ip)) {
/*
* xfs_iomap_write_direct() expects the shared lock. It
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
WARNING: multiple messages have this Message-ID (diff)
From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com, rpeterso@redhat.com, linux-fsdevel@vger.kernel.org
Subject: Re: iomap infrastructure and multipage writes V5
Date: Thu, 30 Jun 2016 19:22:39 +0200 [thread overview]
Message-ID: <20160630172239.GA23082@lst.de> (raw)
In-Reply-To: <20160628002649.GI12670@dastard>
On Tue, Jun 28, 2016 at 10:26:49AM +1000, Dave Chinner wrote:
> Christoph, it look slike there's an ENOSPC+ENOMEM behavioural regression here.
> generic/224 on my 1p/1GB RAM VM using a 1k lock size filesystem has
> significantly different behaviour once ENOSPC is hit withi this patchset.
>
> It ends up with an endless stream of errors like this:
I've spent some time trying to reproduce this. I'm actually getting
the OOM killer almost reproducible for for-next without the iomap
patches as well when just using 1GB of mem. 1400 MB is the minimum
I can reproducibly finish the test with either code base.
But with the 1400 MB setup I see a few interesting things. Even
with the baseline, no-iomap case I see a few errors in the log:
[ 70.407465] Filesystem "vdc": reserve blocks depleted! Consider increasing
reserve pool
size.
[ 70.195645] XFS (vdc): page discard on page ffff88005682a988, inode 0xd3, offset 761856.
[ 70.408079] Buffer I/O error on dev vdc, logical block 1048513, lost async
page write
[ 70.408598] Buffer I/O error on dev vdc, logical block 1048514, lost async
page write
27s
With iomap I also see the spew of page discard errors your see, but while
I see a lot of them, the rest still finishes after a reasonable time,
just a few seconds more than the pre-iomap baseline. I also see the
reserve block depleted message in this case.
Digging into the reserve block depleted message - it seems we have
too many parallel iomap_allocate transactions going on. I suspect
this might be because the writeback code will not finish a writeback
context if we have multiple blocks inside a page, which can
happen easily for this 1k ENOSPC setup. I've not had time to fully
check if this is what really happens, but I did a quick hack (see below)
to only allocate 1k at a time in iomap_begin, and with that generic/224
finishes without the warning spew. Of course this isn't a real fix,
and I need to fully understand what's going on in writeback due to
different allocation / dirtying patterns from the iomap change.
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 620fc91..d9afba2 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1018,7 +1018,7 @@ xfs_file_iomap_begin(
* Note that the values needs to be less than 32-bits wide until
* the lower level functions are updated.
*/
- length = min_t(loff_t, length, 1024 * PAGE_SIZE);
+ length = min_t(loff_t, length, 1024);
if (xfs_get_extsz_hint(ip)) {
/*
* xfs_iomap_write_direct() expects the shared lock. It
next prev parent reply other threads:[~2016-06-30 17:22 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-01 14:44 iomap infrastructure and multipage writes V5 Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 01/14] fs: move struct iomap from exportfs.h to a separate header Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 02/14] fs: introduce iomap infrastructure Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-16 16:12 ` Jan Kara
2016-06-16 16:12 ` Jan Kara
2016-06-17 12:01 ` Christoph Hellwig
2016-06-17 12:01 ` Christoph Hellwig
2016-06-20 2:29 ` Dave Chinner
2016-06-20 2:29 ` Dave Chinner
2016-06-20 12:22 ` Christoph Hellwig
2016-06-20 12:22 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 03/14] fs: support DAX based iomap zeroing Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 04/14] xfs: make xfs_bmbt_to_iomap available outside of xfs_pnfs.c Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 05/14] xfs: reorder zeroing and flushing sequence in truncate Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 06/14] xfs: implement iomap based buffered write path Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 07/14] xfs: remove buffered write support from __xfs_get_blocks Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 08/14] fs: iomap based fiemap implementation Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-08-18 13:18 ` Andreas Grünbacher
2016-08-18 13:18 ` Andreas Grünbacher
2016-06-01 14:44 ` [PATCH 09/14] xfs: use iomap " Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 10/14] xfs: use iomap infrastructure for DAX zeroing Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 11/14] xfs: handle 64-bit length in xfs_iozero Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 12/14] xfs: use xfs_zero_range in xfs_zero_eof Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 13/14] xfs: split xfs_free_file_space in manageable pieces Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:44 ` [PATCH 14/14] xfs: kill xfs_zero_remaining_bytes Christoph Hellwig
2016-06-01 14:44 ` Christoph Hellwig
2016-06-01 14:46 ` iomap infrastructure and multipage writes V5 Christoph Hellwig
2016-06-01 14:46 ` Christoph Hellwig
2016-06-28 0:26 ` Dave Chinner
2016-06-28 0:26 ` Dave Chinner
2016-06-28 13:28 ` Christoph Hellwig
2016-06-28 13:28 ` Christoph Hellwig
2016-06-28 13:38 ` Christoph Hellwig
2016-06-28 13:38 ` Christoph Hellwig
2016-06-30 17:22 ` Christoph Hellwig [this message]
2016-06-30 17:22 ` Christoph Hellwig
2016-06-30 23:16 ` Dave Chinner
2016-06-30 23:16 ` Dave Chinner
2016-07-18 11:14 ` Dave Chinner
2016-07-18 11:14 ` Dave Chinner
2016-07-18 11:18 ` Dave Chinner
2016-07-18 11:18 ` Dave Chinner
2016-07-31 19:19 ` Christoph Hellwig
2016-07-31 19:19 ` Christoph Hellwig
2016-08-01 0:16 ` Dave Chinner
2016-08-01 0:16 ` Dave Chinner
2016-08-02 23:42 ` Dave Chinner
2016-08-02 23:42 ` Dave Chinner
2017-02-13 22:31 ` Eric Sandeen
2017-02-14 6:10 ` Christoph Hellwig
2016-07-19 3:50 ` Christoph Hellwig
2016-07-19 3:50 ` Christoph Hellwig
-- strict thread matches above, loose matches on Subject: below --
2017-02-13 22:32 xfs-owner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160630172239.GA23082@lst.de \
--to=hch@lst.de \
--cc=david@fromorbit.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=rpeterso@redhat.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.