From: Avi Kivity <avi@scylladb.com>
To: Stefan Hajnoczi <stefanha@gmail.com>,
Paolo Bonzini <pbonzini@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>,
Fam Zheng <famz@redhat.com>,
Eliezer Tamir <eliezer.tamir@linux.intel.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org, Jens Axboe <axboe@fb.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Davide Libenzi <davidel@xmailserver.org>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [Qemu-devel] Linux kernel polling for QEMU
Date: Wed, 7 Dec 2016 12:38:34 +0200
Message-ID: <3cdee89e-5d3f-c473-6c0b-4bb6b622d13f@scylladb.com>
In-Reply-To: <20161202101235.GB14548@stefanha-x1.localdomain>

On 12/02/2016 12:12 PM, Stefan Hajnoczi wrote:
> On Thu, Dec 01, 2016 at 09:35:27AM -0500, Paolo Bonzini wrote:
>>>>> Maybe we could do the same for sockets? When data is available on a
>>>>> socket (or when it becomes writable), write to a user memory location.
>>>>>
>>>>> I, too, have an interest in polling; in my situation most of the polling
>>>>> happens in userspace.
>>>> You are trying to improve on the latency of non-blocking
>>>> ppoll(2)/epoll_wait(2) call?
>>> Yes, but the concern is for throughput, not latency.
>>>
>>> My main loop looks like
>>>
>>>     execute some tasks
>>>     poll from many sources
>>>
>>> Since epoll_wait(..., 0) has non-trivial costs, I have to limit the
>>> polling rate, and even so it shows up in the profile. If the cost were
>>> lower, I could poll at a higher frequency, resulting in lower worst-case
>>> latencies for high-priority tasks.
>> IMHO, the ideal model wouldn't enter the kernel at all unless you _want_
>> to go to sleep.
> Something like mmapping an epoll file descriptor to get a ring of
> epoll_events. It has to play nice with blocking epoll_wait(2) and
> ppoll(2) of an eventpoll file descriptor so that the userspace process
> can go to sleep without a lot of work to switch epoll wait vs mmap
> modes.
I think that's overkill: as soon as the ring reports that data is
available, you have to enter the kernel anyway in order to fetch it.
Besides, leave something for userspace TCP stack implementations.
> I'm bothered by the fact that the kernel will not be able to poll NIC or
> storage controller rings if userspace is doing the polling :(.
Yes, that's a concern.
We have three conflicting requirements:

 - fast inter-thread IPC (or guest/host IPC) -> shared memory + polling
 - kernel multiplexing of hardware -> kernel interfaces + kernel poll/sleep
 - composable interfaces that allow you to use the two methods together
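
To make the first requirement concrete, here is a minimal sketch of "shared memory + polling" between one producer and one consumer. It is not code from QEMU or from this thread: the `mailbox` structure and the `mailbox_post()`/`mailbox_poll()` helpers are hypothetical names, and the sketch assumes a single producer, a single consumer, and C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-slot mailbox placed in memory visible to both sides
 * (two threads here; a guest/host mapping would need the same discipline). */
struct mailbox {
    _Atomic uint32_t ready;   /* 0 = empty, 1 = payload is valid */
    uint64_t payload;
};

/* Producer: write the payload first, then publish it with a release store. */
static bool mailbox_post(struct mailbox *mb, uint64_t value)
{
    if (atomic_load_explicit(&mb->ready, memory_order_acquire))
        return false;               /* previous message not consumed yet */
    mb->payload = value;
    atomic_store_explicit(&mb->ready, 1, memory_order_release);
    return true;
}

/* Consumer: each poll is one atomic load and never enters the kernel. */
static bool mailbox_poll(struct mailbox *mb, uint64_t *out)
{
    if (!atomic_load_explicit(&mb->ready, memory_order_acquire))
        return false;               /* nothing new */
    *out = mb->payload;
    atomic_store_explicit(&mb->ready, 0, memory_order_release);
    return true;
}
```

The consumer can call `mailbox_poll()` as often as it likes at the cost of a cache-line read, which is why this side of the trade-off is attractive. What it cannot do is sleep on the mailbox, and that is where the second and third requirements come in.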
You could use something like Flow Director to move interesting flows
directly to the application, let the application poll those queues (and
maybe tell the kernel if something interesting happens), while the
kernel manages the other queues. It gets ugly very quickly.
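
For readers who want to see the cost trade-off in the main loop quoted earlier ("execute some tasks / poll from many sources"), the following is a rough sketch of a loop that rate-limits its non-blocking `epoll_wait(..., 0)` calls. It is an illustration only, not the actual loop discussed in the thread: `TASKS_PER_POLL`, `loop_stats`, `run_one_task()`, `poll_sources()`, `run_loop()` and `add_ready_eventfd()` are all hypothetical names, and the fixed task count between polls is an assumed policy.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

/* Hypothetical knobs: tasks executed between two kernel polls, and the
 * size of the event buffer handed to epoll_wait(). */
enum { TASKS_PER_POLL = 64, MAX_EVENTS = 16 };

struct loop_stats {
    unsigned long tasks_run;
    unsigned long polls;
    unsigned long events_seen;
};

/* Stand-in for "execute some tasks". */
static void run_one_task(struct loop_stats *st) { st->tasks_run++; }

/* Non-blocking poll: a timeout of 0 returns immediately, but every call
 * still pays for a system call even when nothing is ready. */
static int poll_sources(int epfd, struct loop_stats *st)
{
    struct epoll_event ev[MAX_EVENTS];
    int n = epoll_wait(epfd, ev, MAX_EVENTS, 0);
    st->polls++;
    if (n > 0)
        st->events_seen += (unsigned long)n;
    return n;
}

/* Main loop: poll the kernel only once every TASKS_PER_POLL tasks. */
static void run_loop(int epfd, unsigned long iterations, struct loop_stats *st)
{
    assert(epfd >= 0);
    for (unsigned long i = 0; i < iterations; i++) {
        run_one_task(st);
        if ((i + 1) % TASKS_PER_POLL == 0)
            poll_sources(epfd, st);
    }
}

/* Example source for experiments: an eventfd that is readable at once. */
static int add_ready_eventfd(int epfd)
{
    int efd = eventfd(1, EFD_NONBLOCK);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
    if (efd < 0 || epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) != 0)
        return -1;
    return efd;
}
```

Lowering `TASKS_PER_POLL` shortens the worst-case delay before a ready source is noticed, at the price of more system calls per unit of work. That is the tension described in the quoted text: a cheaper poll would allow a higher polling frequency.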