qemu-devel.nongnu.org archive mirror
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Fam Zheng <famz@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Eliezer Tamir <eliezer.tamir@linux.intel.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>, Jens Axboe <axboe@fb.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Davide Libenzi <davidel@xmailserver.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [Qemu-devel] Linux kernel polling for QEMU
Date: Wed, 30 Nov 2016 09:38:00 +0000
Message-ID: <20161130093800.GA2589@stefanha-x1.localdomain>
In-Reply-To: <20161130054214.GA22613@lemon>

On Wed, Nov 30, 2016 at 01:42:14PM +0800, Fam Zheng wrote:
> On Tue, 11/29 20:43, Stefan Hajnoczi wrote:
> > On Tue, Nov 29, 2016 at 1:24 PM, Fam Zheng <famz@redhat.com> wrote:
> > > On Tue, 11/29 12:17, Paolo Bonzini wrote:
> > >> On 29/11/2016 11:32, Fam Zheng wrote:
> > >> * it still needs a system call before polling is entered.  Ideally, QEMU
> > >> could run without any system call while in polling mode.
> > >>
> > >> Another possibility is to add a system call for single_task_running().
> > >> It should be simple enough that you can implement it in the vDSO and
> > >> avoid a context switch.  There are convenient hooking points in
> > >> add_nr_running and sub_nr_running.
> > >
> > > That sounds good!
> > 
> > With this solution QEMU can either poll virtqueues or the host kernel
> > can poll NIC and storage controller descriptor rings, but not both at
> > the same time in one thread.  This is one of the reasons why I think
> > exploring polling in the kernel makes more sense.
> 
> That's true. I have one question though: controller rings are in a different
> layer in the kernel, I wonder what the syscall interface looks like to ask
> kernel to poll both hardware rings and memory locations in the same loop? It's
> not obvious to me after reading your eventfd patch.

Descriptor ring polling in select(2)/poll(2) is currently supported for
network sockets.  Take a look at the POLL_BUSY_LOOP flag in
fs/select.c:do_poll().  If an fd's .poll() callback sets the flag, that
indicates the fd supports busy loop polling.
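
To make the role of the flag concrete, here is a minimal userspace model
of the decision it drives.  This is a sketch, not the code in
fs/select.c: struct model_fd, model_do_poll() and the max_scans budget
are hypothetical names invented for illustration, and the iteration
count stands in for whatever time budget the kernel applies.  The point
is only that the loop keeps rescanning while at least one fd advertises
busy loop support, and otherwise falls back to sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one polled fd; not a kernel structure. */
struct model_fd {
    bool supports_busy_loop; /* models .poll() setting POLL_BUSY_LOOP */
    int ready_after;         /* fd reports an event from this scan on (0-based) */
};

enum model_result { MODEL_READY, MODEL_MUST_SLEEP };

/*
 * Scan all fds repeatedly.  Return MODEL_READY as soon as any fd has an
 * event.  Keep rescanning only while some fd supports busy looping and
 * the scan budget is not exhausted; otherwise report that the caller
 * would have to sleep.  *scans_out receives the number of scans done.
 */
static enum model_result model_do_poll(const struct model_fd *fds, size_t nfds,
                                       int max_scans, int *scans_out)
{
    assert(fds != NULL && scans_out != NULL && max_scans > 0);

    for (int scan = 0; ; scan++) {
        bool can_busy_loop = false;
        bool ready = false;

        for (size_t i = 0; i < nfds; i++) {
            if (fds[i].ready_after <= scan)
                ready = true;
            if (fds[i].supports_busy_loop)
                can_busy_loop = true;
        }

        *scans_out = scan + 1;
        if (ready)
            return MODEL_READY;
        if (!can_busy_loop || scan + 1 >= max_scans)
            return MODEL_MUST_SLEEP;
    }
}
```

With a single fd that supports busy looping, the model spins until the
event shows up or the budget runs out.  With an fd that does not set
the flag, it gives up after the first scan.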

The way this is implemented for network sockets is that the socket looks
up the napi index and is able to use the NIC driver to poll the rx ring.
Then it checks whether the socket's receive queue contains data after
the rx ring was processed.

The virtio_net.ko driver supports this interface, for example.  See
drivers/net/virtio_net.c:virtnet_busy_poll().
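
The network socket sequence described above (look up the napi index,
have the driver poll the rx ring, then check the socket's own receive
queue) can be sketched as a toy model.  Everything here is an
assumption made for illustration: the model_* types and functions are
hypothetical and do not correspond to kernel or virtio_net data
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RING_SIZE 8

struct model_packet { int dest_sock; };

struct model_rx_ring {
    struct model_packet slots[MODEL_RING_SIZE];
    size_t head; /* next slot the driver consumes */
    size_t tail; /* next slot the "NIC" fills */
};

struct model_socket {
    int id;
    size_t napi_index; /* which rx ring feeds this socket */
    int rx_queued;     /* packets waiting in the receive queue */
};

/* "NIC" side: place a packet for dest_sock into the ring. */
static bool model_ring_push(struct model_rx_ring *ring, int dest_sock)
{
    if (ring->tail - ring->head == MODEL_RING_SIZE)
        return false; /* ring full */
    ring->slots[ring->tail % MODEL_RING_SIZE].dest_sock = dest_sock;
    ring->tail++;
    return true;
}

/* Driver side: deliver every pending packet to its destination socket. */
static int model_poll_rx_ring(struct model_rx_ring *ring,
                              struct model_socket *socks, size_t nsocks)
{
    int processed = 0;

    while (ring->head != ring->tail) {
        int dest = ring->slots[ring->head % MODEL_RING_SIZE].dest_sock;

        for (size_t i = 0; i < nsocks; i++)
            if (socks[i].id == dest)
                socks[i].rx_queued++;
        ring->head++;
        processed++;
    }
    return processed;
}

/*
 * Socket side: find the ring via the socket's napi index, let the
 * driver process it, then check this socket's receive queue.
 */
static bool model_sk_busy_poll(struct model_socket *socks, size_t nsocks,
                               size_t sk, struct model_rx_ring *rings,
                               size_t nrings)
{
    assert(sk < nsocks && socks[sk].napi_index < nrings);
    model_poll_rx_ring(&rings[socks[sk].napi_index], socks, nsocks);
    return socks[sk].rx_queued > 0;
}
```

Note that polling on behalf of one socket may process packets destined
for another socket sharing the same ring, which is why the final check
is against the socket's own receive queue and not the ring.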

Busy loop polling isn't supported for block I/O yet.  There is currently
a completely independent code path for O_DIRECT synchronous I/O where
NVMe can poll for request completion.  But it doesn't work together with
asynchronous I/O (e.g. Linux AIO using eventfd with select(2)/poll(2)).

> > The disadvantage of the kernel approach is that you must make the
> > ppoll(2)/epoll_wait(2) syscall even for polling, and you probably need
> > to do eventfd reads afterwards so the minimum event loop iteration
> > latency is higher than doing polling in userspace.
> 
> And userspace drivers powered by dpdk or vfio will still want to do polling in
> userspace anyway, we may want to take that into account as well.

vfio supports interrupts so it can definitely be integrated with
adaptive kernel polling (i.e. poll for a little while and then wait for
an interrupt if there was no event).
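
As a userspace analogy for "poll for a little while and then wait", here
is a minimal sketch using an eventfd in place of the interrupt.  It is
not the proposed kernel mechanism and not vfio code: adaptive_wait(),
signal_event() and the spin_ns/block_ms parameters are hypothetical
names chosen for illustration.  Only eventfd(2), poll(2), write(2) and
clock_gettime(2) are real interfaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Stand-in for the device raising an interrupt: bump the eventfd counter. */
static int signal_event(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Wait for fd to become readable.
 * Phase 1: spin for about spin_ns nanoseconds using zero-timeout polls.
 * Phase 2: if nothing arrived, block in poll(2) for up to block_ms.
 * Returns 1 if an event is pending, 0 on timeout, -1 on error.
 */
static int adaptive_wait(int fd, int64_t spin_ns, int block_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int64_t deadline;
    int ret;

    assert(spin_ns >= 0 && block_ms >= 0);
    deadline = now_ns() + spin_ns;

    do {
        ret = poll(&pfd, 1, 0); /* returns immediately */
        if (ret != 0)
            return ret < 0 ? -1 : 1;
    } while (now_ns() < deadline);

    ret = poll(&pfd, 1, block_ms); /* sleep until event or timeout */
    return ret < 0 ? -1 : (ret > 0);
}
```

A caller would create the notifier with eventfd(0, EFD_NONBLOCK), call
adaptive_wait() in its event loop, and read the eventfd afterwards to
clear the counter.  Each spin iteration here still costs a system call,
which is the overhead discussed earlier in the thread.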

Does dpdk ever use interrupts?

Stefan


