From: "Michael S. Tsirkin" <mst@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: "snabb-devel@googlegroups.com" <snabb-devel@googlegroups.com>,
qemu-devel@nongnu.org,
Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
Subject: Re: [Qemu-devel] snabbswitch integration with QEMU for userspace ethernet I/O
Date: Tue, 28 May 2013 20:17:42 +0300
Message-ID: <20130528171742.GB30296@redhat.com>
In-Reply-To: <87r4grca4p.fsf@codemonkey.ws>

On Tue, May 28, 2013 at 12:00:38PM -0500, Anthony Liguori wrote:
> Julian Stecklina <jsteckli@os.inf.tu-dresden.de> writes:
>
> > On 05/28/2013 12:10 PM, Luke Gorrie wrote:
> >> On 27 May 2013 11:34, Stefan Hajnoczi <stefanha@redhat.com
> >> <mailto:stefanha@redhat.com>> wrote:
> >>
> >> vhost_net is about connecting a virtio-net speaking process to a
> >> tun-like device. The problem you are trying to solve is connecting a
> >> virtio-net speaking process to Snabb Switch.
> >>
> >>
> >> Yep!
> >
> > Since I am on a similar path as Luke, let me share another idea.
> >
> > What about extending qemu in a way to allow PCI device models to be
> > implemented in another process.
>
> We aren't going to support any interface that enables out of tree
> devices. This is just plugins in a different form with even more
> downsides. You cannot easily keep track of dirty info, the guest
> physical address translation to host is difficult to keep in sync
> (imagine the complexity of memory hotplug).
>
> Basically, it's easy to hack up but extremely hard to do something that
> works correctly overall.
>
> There isn't a compelling reason to implement something like this other
> than avoiding getting code into QEMU. Best to just submit your device
> to QEMU for inclusion.
>
> If you want to avoid copying in a vswitch, better to use something like
> vmsplice as I outlined in another thread.
>
> > This is not as hard as it may sound.
> > qemu would open a domain socket to this process and map VM memory over
> > to the other side. This can be accomplished by having file descriptors
> > in qemu to VM memory (reusing -mem-path code) and passing those over the
> > domain socket. The other side can then just mmap them. The socket would
> > also be used for configuration and I/O by the guest on the PCI
> > I/O/memory regions. You could also use this to do IRQs or use eventfds,
> > whatever works better.
> >
> > To have a zero copy userspace switch, the switch would offer virtio-net
> > devices to any qemu that wants to connect to it and implement the
> > complete device logic itself. Since it has access to all guest memory,
> > it can just do memcpy for packet data. Of course, this only works for
> > 64-bit systems, because you need vast amounts of virtual address space.
> > In my experience, doing this in userspace is _way less painful_.
> >
> > If you can get away with polling in the switch the overhead of doing all
> > this in userspace is zero. And as long as you can rate-limit explicit
> > notifications over the socket even that overhead should be okay.
> >
> > Opinions?
>
> I don't see any compelling reason to do something like this. It's
> jumping through a tremendous number of hoops to avoid putting code that
> belongs in QEMU in tree.
>
> Regards,
>
> Anthony Liguori
>
> >
> > Julian

OTOH, an in-tree device that runs in a separate process would
be useful, e.g. for security.
For example, we could limit a virtio-net device process
to only access tap and vhost files.
We could kill this process if there's a bug,
with the result that the NIC gets stalled but everything else
keeps going.
Possibly restart it on the next guest reset.
There could be other advantages.
--
MST
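
For illustration, here is a minimal sketch of the fd-passing scheme Julian
describes in the quoted text (file descriptors for VM memory sent over a
domain socket, then mmap'ed by the other side). It is not QEMU code: a
temporary file stands in for -mem-path backed guest RAM, both ends run in
one process via socketpair(), and all function names are hypothetical. It
assumes Python 3.9+ on a Unix system for socket.send_fds()/recv_fds().

```python
import mmap
import os
import socket
import tempfile

GUEST_RAM_SIZE = 4096  # hypothetical: one page standing in for guest RAM


def send_guest_memory(sock, fd):
    """'QEMU' side: hand the fd backing guest memory to the peer."""
    # send_fds() wraps sendmsg() with an SCM_RIGHTS control message.
    socket.send_fds(sock, [b"ram"], [fd])


def map_guest_memory(sock, size):
    """'Device process' side: receive the fd and mmap the same pages."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    mem = mmap.mmap(fds[0], size)  # shared mapping by default on Unix
    os.close(fds[0])  # the mapping stays valid after the fd is closed
    return mem


def demo():
    # Assumption: a temporary file plays the role of file-backed guest RAM.
    with tempfile.TemporaryFile() as backing:
        backing.truncate(GUEST_RAM_SIZE)
        qemu_view = mmap.mmap(backing.fileno(), GUEST_RAM_SIZE)
        qemu_end, device_end = socket.socketpair(socket.AF_UNIX,
                                                 socket.SOCK_STREAM)
        try:
            send_guest_memory(qemu_end, backing.fileno())
            device_view = map_guest_memory(device_end, GUEST_RAM_SIZE)
            # A write through one mapping is visible through the other
            # without any copy across the socket.
            qemu_view[0:6] = b"packet"
            seen = bytes(device_view[0:6])
            device_view.close()
        finally:
            qemu_end.close()
            device_end.close()
            qemu_view.close()
    return seen
```

Calling demo() returns the bytes read through the second mapping. In a real
setup the two ends would be separate processes, and the same socket could
carry configuration and notifications as described in the quoted proposal.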