From: Fam Zheng <famz-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Jason Baron <jbaron-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>,
x86-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
Alexander Viro
<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
David Herrmann
<dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>,
Miklos Szeredi <mszeredi-AlSwsSmVLrQ@public.gmane.org>,
David Drysdale <drysdale-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
"David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>,
Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Mike Frysinger <vapier-aBrp7R+bbdUdnm+yROfE0A@public.gmane.org>,
Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org>,
Heiko Carstens
<heiko.carstens-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
Rasmus Villemoes
<linux-qQsb+v5E8BnlAoU/VqSP6n9LOBIZ5rWg@public.gmane.org>,
Rashika Kheria
<rashika.kheria-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Mathieu Desnoyers
<mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>,
Peter Zijlstra <peter>
Subject: Re: [PATCH v4 0/9] epoll: Introduce new syscalls, epoll_ctl_batch and epoll_pwait1
Date: Fri, 13 Mar 2015 19:31:22 +0800
Message-ID: <20150313113122.GA7427@ad.nay.redhat.com>
In-Reply-To: <5501AA6B.2020209-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
On Thu, 03/12 11:02, Jason Baron wrote:
> On 03/09/2015 09:49 PM, Fam Zheng wrote:
> >
> > Benchmark for epoll_pwait1
> > ==========================
> >
> > By running fio tests inside VM with both original and modified QEMU, we can
> > compare their difference in performance.
> >
> > With a small VM setup [t1], the original QEMU (ppoll based) has a 4k read
> > latency overhead around 37 us. In this setup, the main loop polls 10~20 fds.
> >
> > With a slightly larger VM instance [t2] - attached a virtio-serial device so
> > that there are 80~90 fds in the main loop - the original QEMU has a latency
> > overhead around 49 us. By adding more such devices [t3], we can see the latency
> > go even higher - 83 us with ~200 FDs.
> >
> > Now modify QEMU to use epoll_pwait1 and test again; the latency numbers are
> > respectively 36us, 37us, 47us for t1, t2 and t3.
> >
> >
>
> Hi,
>
> So it sounds like you are comparing original qemu code (which was using
> ppoll) vs. using epoll with these new syscalls. Curious if you have numbers
> comparing the existing epoll (with say the timerfd in your epoll set), so
> we can see the improvement relative to epoll.
I did compare them, but the results are too close to show a difference. The
improvements in epoll_pwait1 don't really help the hot path of guest IO, but
they do affect the precision of program timers, which are used in various device
emulations in QEMU.

Although it's kind of subtle and difficult to summarize here, I can give an
example from the IO throttling implementation in QEMU to show the significance:

The throttling algorithm computes a duration for the next IO, which is used to
arm a timer in order to delay the request a bit. As timers are always rounded
*UP* to the effective granularity, the 1ms timeout granularity of epoll_pwait is
just too coarse and will lead to severe inaccuracy. With epoll_pwait1, we can
avoid the rounding-up.

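To make the rounding effect concrete, here is a minimal sketch of the
arithmetic only. It is not QEMU's throttling code and it does not call any
epoll syscall; the helper names and the NS_PER_MS constant are hypothetical,
chosen for illustration. It assumes the caller holds a timeout in nanoseconds
and must hand it to an interface that only accepts whole milliseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant for this sketch: nanoseconds per millisecond. */
#define NS_PER_MS 1000000LL

/*
 * Round a nanosecond timeout UP to whole milliseconds, as a caller has to
 * do when the wait primitive only takes a millisecond timeout. Rounding
 * down would wake up too early, so rounding up is the safe direction.
 */
static int64_t timeout_ns_to_ms_round_up(int64_t ns)
{
    assert(ns >= 0);
    return (ns + NS_PER_MS - 1) / NS_PER_MS;
}

/* Extra delay, in nanoseconds, that the round-up adds to the request. */
static int64_t rounding_overshoot_ns(int64_t ns)
{
    return timeout_ns_to_ms_round_up(ns) * NS_PER_MS - ns;
}
```

For a requested delay of 250 us (250000 ns), the rounded timeout is 1 ms, so
the request waits 750 us longer than the throttling algorithm asked for. A
timeout passed with nanosecond resolution would not add this overshoot.
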
I think this idea could be pretty generally desired by other applications, too.

Regarding the epoll_ctl_batch improvement, again, it is not going to change
the numbers in the small workload I managed to test.

Of course, if you have a specific application scenario in mind, I will try it. :)

Thanks,
Fam