From: Stefan Hajnoczi <stefanha@redhat.com>
To: Sergio Lopez <slp@redhat.com>
Cc: qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] QEMU event loop optimizations
Date: Mon, 8 Apr 2019 09:29:58 +0100 [thread overview]
Message-ID: <20190408082958.GF15001@stefanha-x1.localdomain> (raw)
In-Reply-To: <878swomn42.fsf@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 3061 bytes --]
On Fri, Apr 05, 2019 at 06:29:49PM +0200, Sergio Lopez wrote:
>
> Stefan Hajnoczi writes:
>
> > Hi Sergio,
> > Here are the forgotten event loop optimizations I mentioned:
> >
> > https://github.com/stefanha/qemu/commits/event-loop-optimizations
> >
> > The goal was to eliminate or reorder syscalls so that useful work (like
> > executing BHs) occurs as soon as possible after an event is detected.
> >
> > I remember that these optimizations only shave off a handful of
> > microseconds, so they aren't a huge win. They do become attractive on
> > fast SSDs with <10us read/write latency.
> >
> > These optimizations are aggressive and there is a possibility of
> > introducing regressions.
> >
> > If you have time to pick up this work, try benchmarking each commit
> > individually so the performance changes can be attributed to each one.
> > There's no need to send them together in a single patch series; the
> > changes are quite independent.
>
> It took me a while to find a way to get meaningful numbers to evaluate
> those optimizations. The problem is that here (Xeon E5-2640 v3 and EPYC
> 7351P) the cost of event_notifier_set() is just ~0.4us when the code
> path is hot, and it's hard to differentiate it from the noise.
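
For reference, on Linux event_notifier_set() boils down to an 8-byte
write(2) on an eventfd, so a standalone loop over the raw syscalls, like
the sketch below, gives a rough feel for that ~0.4us hot-path cost. This
is only an approximation (it uses plain syscalls instead of QEMU's
EventNotifier helpers, and it times a write+read pair so the counter
never saturates), not a substitute for the in-QEMU measurement:

  /* Rough microbenchmark of eventfd signalling cost; build with
   *   gcc -O2 -o eventfd-bench eventfd-bench.c
   */
  #include <stdint.h>
  #include <stdio.h>
  #include <time.h>
  #include <unistd.h>
  #include <sys/eventfd.h>

  int main(void)
  {
      int fd = eventfd(0, EFD_NONBLOCK);
      if (fd < 0) {
          perror("eventfd");
          return 1;
      }

      const long iters = 1000000;
      uint64_t add = 1, drain;
      struct timespec t0, t1;

      clock_gettime(CLOCK_MONOTONIC, &t0);
      for (long i = 0; i < iters; i++) {
          /* Signal the eventfd (what event_notifier_set() does)... */
          if (write(fd, &add, sizeof(add)) != sizeof(add)) {
              perror("write");
              return 1;
          }
          /* ...and drain it so the counter doesn't keep growing. */
          if (read(fd, &drain, sizeof(drain)) != sizeof(drain)) {
              perror("read");
              return 1;
          }
      }
      clock_gettime(CLOCK_MONOTONIC, &t1);

      double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
      printf("%.3f us per write+read pair\n", ns / iters / 1000.0);

      close(fd);
      return 0;
  }
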
>
> To do so, I've used a patched kernel with a naive io_poll implementation
> for virtio_blk [1], a QEMU also patched with poll-inflight [2] (just to
> be sure we're polling) and ran the test on semi-isolated cores
> (nohz_full + rcu_nocbs + systemd_isolation) with idle siblings. The
> storage is simulated by null_blk with "completion_nsec=0 no_sched=1
> irqmode=0".
>
> # fio --time_based --runtime=30 --rw=randread --name=randread \
> --filename=/dev/vdb --direct=1 --ioengine=pvsync2 --iodepth=1 --hipri=1
>
> | avg_lat (us) | master | qbsn* |
> |--------------+--------+-------|
> | run1         |  11.32 | 10.96 |
> | run2         |  11.37 | 10.79 |
> | run3         |  11.42 | 10.67 |
> | run4         |  11.32 | 11.06 |
> | run5         |  11.42 | 11.19 |
> | run6         |  11.42 | 10.91 |
>
> * patched with "aio: add optimized qemu_bh_schedule_nested() API"
>
> Even though there's still some variance in the numbers, the ~0.4us
> improvement is clearly visible.
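
For readers who don't have the branch handy: the idea behind the
qemu_bh_schedule_nested() patch is, roughly, that a BH scheduled from a
callback which is already running inside the event loop does not need the
event_notifier_set() wake-up, because the loop re-checks the list of
scheduled BHs before it blocks again. A very rough sketch of that logic
(simplified, hypothetical names; not the actual QEMU code):

  #include <stdatomic.h>
  #include <stdbool.h>

  typedef struct EventLoop EventLoop;   /* stand-in for AioContext */

  typedef struct BH {
      void (*cb)(void *opaque);
      void *opaque;
      atomic_bool scheduled;
      EventLoop *loop;
  } BH;

  /* Stand-in for aio_notify()/event_notifier_set(): writes to the loop's
   * eventfd so a blocked ppoll() wakes up.  This syscall is the ~0.4us
   * being optimized away. */
  void event_loop_notify(EventLoop *loop);

  /* Normal path: the caller may be in another thread, or the loop may be
   * about to block, so the eventfd must be kicked. */
  void bh_schedule(BH *bh)
  {
      if (!atomic_exchange(&bh->scheduled, true)) {
          event_loop_notify(bh->loop);
      }
  }

  /* Nested path: the caller promises it runs from a handler inside the
   * event loop.  Scheduled BHs are re-scanned before the next ppoll(),
   * so the notification (and its syscall) can be skipped. */
  void bh_schedule_nested(BH *bh)
  {
      atomic_store(&bh->scheduled, true);
  }
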
>
> I haven't tested the other 3 patches, as their optimizations only take
> effect when the event loop is not running in polling mode. Without
> polling we get an additional overhead of at least 10us, plus a lot of
> noise, due to both direct costs (ppoll()...) and indirect ones
> (re-scheduling and TLB/cache pollution), so I don't think we can
> reliably benchmark them. Their impact probably won't be significant
> either, given the costs I've just mentioned.
Thanks for benchmarking them. We can leave them for now, since there is
a risk of introducing bugs and they don't make a great difference.
Stefan
> Sergio.
>
> [1] https://github.com/slp/linux/commit/d369b37db3e298933e8bb88c6eeacff07f39bc13
> [2] https://lists.nongnu.org/archive/html/qemu-devel/2019-04/msg00447.html
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]