From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: rpeterso@redhat.com, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: iomap infrastructure and multipage writes V2
Date: Mon, 2 May 2016 20:23:41 +0200
Message-ID: <20160502182341.GA7077@lst.de>
In-Reply-To: <20160413215442.GS567@dastard>

Hi Dave,

sorry for taking forever to get back to this - travel to LSF and some
other meetings and a deadline last week didn't leave me any time for
XFS work.

On Thu, Apr 14, 2016 at 07:54:42AM +1000, Dave Chinner wrote:
> Christoph, have you done any perf testing of this patchset yet to
> check that it does indeed reduce the CPU overhead of large write
> operations? I'd also be interested to know if there is any change in
> overhead for single page (4k) IOs as well, even though I suspect
> there won't be.

I've done a lot of testing earlier, and this version also looks very
promising.  On the sort of hardware I have access to now, the 4k
numbers don't change much, but with 1M writes we both increase the
write bandwidth a little bit and significantly lower the CPU usage.

The simple test below demonstrates this. The runs are from a 4p VM
with 4G of RAM, access to a fast NVMe SSD, and a data size small
enough that writeback shouldn't throttle the buffered write path:

MNT=/mnt
PERF="perf_3.16"        # soo smart to have tools in the kernel tree..

#BS=4k
#COUNT=65536
BS=1M
COUNT=256

$PERF stat dd if=/dev/zero of=$MNT/testfile bs=$BS count=$COUNT
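
Both configurations in the script above write the same total amount of data, so the 4k and 1M runs differ only in the size of each write. A minimal sketch of that arithmetic, assuming dd reads `4k` as 4096 bytes and `1M` as 1048576 bytes (as GNU dd does); the `total_bytes` helper is hypothetical and not part of the original test:

```shell
# Hypothetical helper: total bytes dd writes for a given block size
# (in bytes) and block count.
total_bytes() {
    echo $(( $1 * $2 ))
}

total_bytes 4096 65536      # BS=4k, COUNT=65536
total_bytes 1048576 256     # BS=1M, COUNT=256
```

Both calls print 268435456, i.e. 256 MiB per run.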

with the baseline for-next tree I get the following bandwidth and
CPU utilization:

BS=4k: ~600MB/s     0.856 CPUs utilized ( +-  0.32% )
BS=1M: 1.45GB/s     0.820 CPUs utilized ( +-  0.77% )

with all patches applied:

BS=4k: ~610MB/s     0.848 CPUs utilized ( +-  0.36% )
BS=1M: ~1.55GB/s    0.615 CPUs utilized ( +-  0.80% )
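
To put the two sets of figures side by side, the relative change can be computed as below. This is only a sketch: the `pct_change` helper is hypothetical, and the bandwidth inputs are the approximate values quoted above, so the results are rough.

```shell
# Hypothetical helper: percentage change from a baseline value ($1)
# to the value with all patches applied ($2).
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%+.1f%%\n", (new - base) / base * 100 }'
}

pct_change 0.820 0.615    # CPUs utilized, BS=1M
pct_change 1.45 1.55      # bandwidth in GB/s, BS=1M
pct_change 0.856 0.848    # CPUs utilized, BS=4k
```

For 1M writes this works out to about 25% lower CPU utilization and roughly 7% more bandwidth, while the 4k CPU utilization moves by less than 1%.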

This is also visible in the wall time:

baseline, 4k:

real	0m0.540s
user	0m0.000s
sys	0m0.533s

baseline, 1M:

real	0m0.310s
user	0m0.000s
sys	0m0.313s

multipage, 4k:

real	0m0.541s
user	0m0.010s
sys	0m0.527s

multipage, 1M:

real	0m0.272s
user	0m0.000s
sys	0m0.263s



Thread overview: 30+ messages
2016-04-12 20:52 iomap infrastructure and multipage writes V2 Christoph Hellwig
2016-04-12 20:52 ` Christoph Hellwig
2016-04-12 20:52 ` [PATCH 1/8] fs: move struct iomap from exportfs.h to a separate header Christoph Hellwig
2016-04-12 20:52   ` Christoph Hellwig
2016-04-12 20:52 ` [PATCH 2/8] fs: introduce iomap infrastructure Christoph Hellwig
2016-04-12 20:52   ` Christoph Hellwig
2016-04-12 20:52 ` [PATCH 3/8] xfs: make xfs_bmbt_to_iomap available outside of xfs_pnfs.c Christoph Hellwig
2016-04-12 20:52   ` Christoph Hellwig
2016-04-12 20:52 ` [PATCH 4/8] xfs: reorder zeroing and flushing sequence in truncate Christoph Hellwig
2016-04-12 20:52   ` Christoph Hellwig
2016-04-12 20:52 ` [PATCH 5/8] xfs: implement iomap based buffered write path Christoph Hellwig
2016-04-12 20:52   ` Christoph Hellwig
2016-04-14 12:58   ` Brian Foster
2016-04-14 12:58     ` Brian Foster
2016-05-02 18:25     ` Christoph Hellwig
2016-05-02 18:25       ` Christoph Hellwig
2016-05-03 15:02       ` Brian Foster
2016-05-03 15:02         ` Brian Foster
2016-05-03 18:15         ` Christoph Hellwig
2016-05-03 18:15           ` Christoph Hellwig
2016-04-12 20:53 ` [PATCH 6/8] xfs: remove buffered write support from __xfs_get_blocks Christoph Hellwig
2016-04-12 20:53   ` Christoph Hellwig
2016-04-12 20:53 ` [PATCH 7/8] fs: iomap based fiemap implementation Christoph Hellwig
2016-04-12 20:53   ` Christoph Hellwig
2016-04-12 20:53 ` [PATCH 8/8] xfs: use iomap " Christoph Hellwig
2016-04-12 20:53   ` Christoph Hellwig
2016-04-13 21:54 ` iomap infrastructure and multipage writes V2 Dave Chinner
2016-04-13 21:54   ` Dave Chinner
2016-05-02 18:23   ` Christoph Hellwig [this message]
2016-05-02 18:23     ` Christoph Hellwig
