qemu-devel.nongnu.org archive mirror
From: Stefan Hajnoczi <stefanha@gmail.com>
To: Karl Rister <krister@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org, Andrew Theurer <atheurer@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <famz@redhat.com>
Subject: Re: [Qemu-devel] [RFC 0/3] aio: experimental virtio-blk polling mode
Date: Mon, 14 Nov 2016 15:26:42 +0000	[thread overview]
Message-ID: <20161114152642.GE26198@stefanha-x1.localdomain> (raw)
In-Reply-To: <5826231D.7070208@redhat.com>


On Fri, Nov 11, 2016 at 01:59:25PM -0600, Karl Rister wrote:
> On 11/09/2016 11:13 AM, Stefan Hajnoczi wrote:
> > Recent performance investigation work done by Karl Rister shows that the
> > guest->host notification takes around 20 us.  This is more than the "overhead"
> > of QEMU itself (e.g. block layer).
> > 
> > One way to avoid the costly exit is to use polling instead of notification.
> > The main drawback of polling is that it consumes CPU resources.  For
> > polling to benefit performance, the host must have spare CPU cycles
> > available on physical CPUs that aren't used by the guest.
> > 
> > This is an experimental AioContext polling implementation.  It adds a polling
> > callback into the event loop.  Polling functions are implemented for virtio-blk
> > virtqueue guest->host kick and Linux AIO completion.
> > 
> > The QEMU_AIO_POLL_MAX_NS environment variable sets the number of nanoseconds to
> > poll before entering the usual blocking poll(2) syscall.  Try setting this
> > variable to the time from old request completion to new virtqueue kick.
> > 
> > By default no polling is done; QEMU_AIO_POLL_MAX_NS must be set to
> > enable it.
> > 
> > Karl: I hope you can try this patch series with several QEMU_AIO_POLL_MAX_NS
> > values.  If you don't find a good value we should double-check the tracing data
> > to see if this experimental code can be improved.
> 
> Stefan
> 
> I ran some quick tests with your patches and got some pretty good gains,
> but also some seemingly odd behavior.
>
> These results are for a 5 minute test doing sequential 4KB requests from
> fio using O_DIRECT, libaio, and IO depth of 1.  The requests are
> performed directly against the virtio-blk device (no filesystem), which
> is backed by a 400GB NVMe card.
> 
> QEMU_AIO_POLL_MAX_NS      IOPS
>                unset    31,383
>                    1    46,860
>                    2    46,440
>                    4    35,246
>                    8    34,973
>                   16    46,794
>                   32    46,729
>                   64    35,520
>                  128    45,902

The environment variable is in nanoseconds.  The range of values you
tried is very small (all under 1 microsecond).  It would be interesting
to try larger values in the ballpark of the latencies you have traced,
for example 2000, 4000, 8000, 16000, and 32000 ns.
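As a reminder of the units involved: a knob like this is just a
nanosecond count read from the environment at startup.  The following is
only an illustrative sketch (poll_max_ns_from_env is a made-up helper
name, not QEMU's actual parsing code):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns the polling budget in nanoseconds, or 0 (polling disabled)
 * when the variable is unset.  Illustrative only. */
static int64_t poll_max_ns_from_env(const char *name)
{
    const char *val = getenv(name);
    return val ? strtoll(val, NULL, 10) : 0;
}
```

So QEMU_AIO_POLL_MAX_NS=2000 means "spin for up to 2 microseconds before
blocking", not 2000 of some larger unit.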

Very interesting that QEMU_AIO_POLL_MAX_NS=1 performs so well without
much CPU overhead.

> I found the results for 4, 8, and 64 odd so I re-ran some tests to check
> for consistency.  I used values of 2 and 4 and ran each 5 times.  Here
> is what I got:
> 
> Iteration    QEMU_AIO_POLL_MAX_NS=2   QEMU_AIO_POLL_MAX_NS=4
>         1                    46,972                   35,434
>         2                    46,939                   35,719
>         3                    47,005                   35,584
>         4                    47,016                   35,615
>         5                    47,267                   35,474
> 
> So the results seem consistent.

That is interesting.  I don't have an explanation for the consistent
difference between 2 and 4 ns polling time.  The time difference is so
small yet the IOPS difference is clear.

Comparing traces could shed light on the cause for this difference.

> I saw some discussion on the patches which makes me think you'll be
> making some changes, is that right?  If so, I may wait for the updates
> and then we can run the much more exhaustive set of workloads
> (sequential read and write, random read and write) at various block
> sizes (4, 8, 16, 32, 64, 128, and 256 KB) and multiple IO depths (1
> and 32) that we were doing when we started looking at this.

I'll send an updated version of the patches.

Stefan

