From: Stefan Hajnoczi <stefanha@gmail.com>
To: Roman Pen <roman.penyaev@profitbricks.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 0/3] linux-aio: reduce completion latency
Date: Wed, 20 Jul 2016 10:08:11 +0100
Message-ID: <20160720090811.GC13233@stefanha-x1.localdomain>
In-Reply-To: <1468931263-32667-1-git-send-email-roman.penyaev@profitbricks.com>
On Tue, Jul 19, 2016 at 02:27:40PM +0200, Roman Pen wrote:
> v2:
> o For the third patch, do not introduce an extra member in the
>   LinuxAioState structure; reuse ret == -EINPROGRESS instead.
>
> o Add an explicit comment explaining why we do not hang if requests
>   are still pending.
>
>
> This series is intended to reduce completion latency through two changes:
>
> 1. QEMU does not use any timeout value when harvesting completed AIO
>    requests from the ring buffer, so io_getevents() can be implemented
>    entirely in userspace (first patch); see the sketch after this list.
>
> 2. To reduce completion latency it makes sense to harvest completed
>    requests as soon as possible. A very fast backend device can complete
>    requests right after submission, so it is worth checking the ring
>    buffer for completed requests directly after io_submit() has been
>    called (third patch).
>
> Indeed, the series reduces completion latencies and increases overall
> throughput. For example, the following are the percentiles of the number
> of requests completed at once:
>
>          1st   10th  20th  30th  40th  50th  60th  70th  80th  90th  99.99th
> Before   2     4     42    112   128   128   128   128   128   128   128
> After    1     1     4     14    33    45    47    48    50    51    108
>
> That means that before the third patch is applied, the ring buffer is
> observed to be full (128 requests consumed at once) in 60% of the calls.
>
> After the third patch is applied, the distribution of the number of
> completed requests is smoother and the queue (requests in flight) is
> almost never full.
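>
> The submit-then-peek pattern behind this change can be sketched as
> follows. This is illustrative only: laio_submit_and_peek() and
> handle_completion() are hypothetical stand-ins for QEMU's internal
> LinuxAioState machinery, and the sketch reuses the io_getevents_peek()
> and io_getevents_commit() helpers shown above.
>
>     /* Submit the queued iocbs, then immediately drain any completions a
>      * fast backend has already produced instead of waiting for the
>      * eventfd notification and another event-loop iteration. */
>     static void laio_submit_and_peek(io_context_t ctx,
>                                      struct iocb **pending, int n_pending)
>     {
>         struct io_event *events;
>         unsigned int nr, i;
>
>         if (io_submit(ctx, n_pending, pending) < 0) {
>             return;  /* real code would requeue or fail the requests */
>         }
>
>         nr = io_getevents_peek(ctx, &events);
>         for (i = 0; i < nr; i++) {
>             handle_completion(&events[i]);  /* hypothetical completion hook */
>         }
>         io_getevents_commit(ctx, nr);
>     }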
>
> The fio read results are the following (write results are almost the
> same and are not shown here):
>
> Before
> ------
> job: (groupid=0, jobs=8): err= 0: pid=2227: Tue Jul 19 11:29:50 2016
> Description : [Emulation of Storage Server Access Pattern]
> read : io=54681MB, bw=1822.7MB/s, iops=179779, runt= 30001msec
> slat (usec): min=172, max=16883, avg=338.35, stdev=109.66
> clat (usec): min=1, max=21977, avg=1051.45, stdev=299.29
> lat (usec): min=317, max=22521, avg=1389.83, stdev=300.73
> clat percentiles (usec):
> | 1.00th=[ 346], 5.00th=[ 596], 10.00th=[ 708], 20.00th=[ 852],
> | 30.00th=[ 932], 40.00th=[ 996], 50.00th=[ 1048], 60.00th=[ 1112],
> | 70.00th=[ 1176], 80.00th=[ 1256], 90.00th=[ 1384], 95.00th=[ 1496],
> | 99.00th=[ 1800], 99.50th=[ 1928], 99.90th=[ 2320], 99.95th=[ 2672],
> | 99.99th=[ 4704]
> bw (KB /s): min=205229, max=553181, per=12.50%, avg=233278.26, stdev=18383.51
>
> After
> ------
> job: (groupid=0, jobs=8): err= 0: pid=2220: Tue Jul 19 11:31:51 2016
> Description : [Emulation of Storage Server Access Pattern]
> read : io=57637MB, bw=1921.2MB/s, iops=189529, runt= 30002msec
> slat (usec): min=169, max=20636, avg=329.61, stdev=124.18
> clat (usec): min=2, max=19592, avg=988.78, stdev=251.04
> lat (usec): min=381, max=21067, avg=1318.42, stdev=243.58
> clat percentiles (usec):
> | 1.00th=[ 310], 5.00th=[ 580], 10.00th=[ 748], 20.00th=[ 876],
> | 30.00th=[ 908], 40.00th=[ 948], 50.00th=[ 1012], 60.00th=[ 1064],
> | 70.00th=[ 1080], 80.00th=[ 1128], 90.00th=[ 1224], 95.00th=[ 1288],
> | 99.00th=[ 1496], 99.50th=[ 1608], 99.90th=[ 1960], 99.95th=[ 2256],
> | 99.99th=[ 5408]
> bw (KB /s): min=212149, max=390160, per=12.49%, avg=245746.04, stdev=11606.75
>
> Throughput increased from 1822MB/s to 1921MB/s, and average completion
> latency decreased from 1051us to 988us.
>
> Roman Pen (3):
> linux-aio: consume events in userspace instead of calling io_getevents
> linux-aio: split processing events function
> linux-aio: process completions from ioq_submit()
>
> block/linux-aio.c | 178 ++++++++++++++++++++++++++++++++++++++++++------------
> 1 file changed, 141 insertions(+), 37 deletions(-)
>
> Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: qemu-devel@nongnu.org
Thanks, applied to my block-next tree for QEMU 2.8:
https://github.com/stefanha/qemu/commits/block-next
Stefan