qemu-devel.nongnu.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: "snabb-devel@googlegroups.com" <snabb-devel@googlegroups.com>,
	qemu-devel@nongnu.org, Anthony Liguori <anthony@codemonkey.ws>,
	Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
Subject: Re: [Qemu-devel] snabbswitch integration with QEMU for userspace ethernet I/O
Date: Wed, 29 May 2013 17:48:58 +0300
Message-ID: <20130529144858.GC10462@redhat.com>
In-Reply-To: <20130529142143.GA9545@stefanha-thinkpad.redhat.com>

On Wed, May 29, 2013 at 04:21:43PM +0200, Stefan Hajnoczi wrote:
> On Wed, May 29, 2013 at 12:08:59PM +0300, Michael S. Tsirkin wrote:
> > On Wed, May 29, 2013 at 09:49:29AM +0200, Stefan Hajnoczi wrote:
> > > On Tue, May 28, 2013 at 08:17:42PM +0300, Michael S. Tsirkin wrote:
> > > > On Tue, May 28, 2013 at 12:00:38PM -0500, Anthony Liguori wrote:
> > > > > Julian Stecklina <jsteckli@os.inf.tu-dresden.de> writes:
> > > > > 
> > > > > > On 05/28/2013 12:10 PM, Luke Gorrie wrote:
> > > > > >> On 27 May 2013 11:34, Stefan Hajnoczi <stefanha@redhat.com
> > > > > >> <mailto:stefanha@redhat.com>> wrote:
> > > > > >> 
> > > > > > >>     vhost_net is about connecting a virtio-net speaking process to a
> > > > > >>     tun-like device.  The problem you are trying to solve is connecting a
> > > > > >>     virtio-net speaking process to Snabb Switch.
> > > > > >> 
> > > > > >> 
> > > > > >> Yep!
> > > > > >
> > > > > > Since I am on a similar path as Luke, let me share another idea.
> > > > > >
> > > > > > What about extending qemu in a way to allow PCI device models to be
> > > > > > implemented in another process.
> > > > > 
> > > > > We aren't going to support any interface that enables out of tree
> > > > > devices.  This is just plugins in a different form with even more
> > > > > downsides.  You cannot easily keep track of dirty info, the guest
> > > > > physical address translation to host is difficult to keep in sync
> > > > > (imagine the complexity of memory hotplug).
> > > > > 
> > > > > Basically, it's easy to hack up but extremely hard to do something that
> > > > > works correctly overall.
> > > > > 
> > > > > There isn't a compelling reason to implement something like this other
> > > > > than avoiding getting code into QEMU.  Best to just submit your device
> > > > > to QEMU for inclusion.
> > > > > 
> > > > > If you want to avoid copying in a vswitch, better to use something like
> > > > > vmsplice as I outlined in another thread.
> > > > > 
> > > > > > This is not as hard as it may sound.
> > > > > > qemu would open a domain socket to this process and map VM memory over
> > > > > > to the other side. This can be accomplished by having file descriptors
> > > > > > in qemu to VM memory (reusing -mem-path code) and passing those over the
> > > > > > domain socket. The other side can then just mmap them. The socket would
> > > > > > also be used for configuration and I/O by the guest on the PCI
> > > > > > I/O/memory regions. You could also use this to do IRQs or use eventfds,
> > > > > > whatever works better.
> > > > > >
> > > > > > To have a zero copy userspace switch, the switch would offer virtio-net
> > > > > > devices to any qemu that wants to connect to it and implement the
> > > > > > complete device logic itself. Since it has access to all guest memory,
> > > > > > it can just do memcpy for packet data. Of course, this only works for
> > > > > > 64-bit systems, because you need vast amounts of virtual address space.
> > > > > > In my experience, doing this in userspace is _way less painful_.
> > > > > >
> > > > > > If you can get away with polling in the switch the overhead of doing all
> > > > > > this in userspace is zero. And as long as you can rate-limit explicit
> > > > > > notifications over the socket even that overhead should be okay.
> > > > > >
> > > > > > Opinions?
> > > > > 
> > > > > I don't see any compelling reason to do something like this.  It's
> > > > > jumping through a tremendous number of hoops to avoid putting code that
> > > > > belongs in QEMU in tree.
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > Anthony Liguori
> > > > > 
> > > > > >
> > > > > > Julian
> > > > 
> > > > OTOH an in-tree device that runs in a separate process would
> > > > be useful e.g. for security.
> > > > For example, we could limit a virtio-net device process
> > > > to only access tap and vhost files.
> > > 
> > > For tap or vhost files only this is good for security.  I'm not sure it
> > > has many advantages over a QEMU process under SELinux though.
> > 
> > At the moment SELinux necessarily gives QEMU rights to
> > e.g. access the filesystem.
> > This process would only get access to tap and vhost.
> > 
> > We can also run it as a different user.
> > Defence in depth.
> > 
> > We can also limit e.g. the CPU of this process aggressively
> > (as it's not doing anything on data path).
> > 
> > I could go on.
> > 
> > And it's really easy too, until you want to use it in production,
> > at which point you need to cover lots of
> > nasty details like hotplug and migration.
> 
> I think there are diminishing returns.  Once QEMU is isolated so it
> cannot open arbitrary files, just has access to the resources granted by
> the management tool on startup, etc then I'm not sure it's worth the
> complexity and performance-cost of splitting the model up into even
> smaller pieces.

Well, this part is network-facing, so there is some value in
isolating it; I don't know how much.

> IMO there isn't a trust boundary that's worth isolating
> here (compare to sshd privilege separation where separate uids really
> make sense and are necessary, with QEMU having multiple uids that lack
> capabilities to do much doesn't win much over the SELinux setup).
>
> > > Obviously when the switch process has shared memory access to multiple
> > > guests' RAM, the security is worse than a QEMU process solution but
> > > better than a vhost kernel solution.
> > > So the security story is not a clear win.
> > > 
> > > Stefan
> > 
> > How exactly you pass packets between guest and host is very unlikely to
> > affect your security in a meaningful way.
> > 
> > > Except, if you lose networking, or if it's just slow beyond any measure,
> > you are suddenly more secure against network-based attacks.
> 
> The fact that a single switch process has shared memory access to all
> guests' RAM is critical.  If the switch process is exploited, then that
> exposes other guests' data!  (Think of a multi-tenant host with guests
> belonging to different users.)
> 
> Stefan

Well, local privilege escalation bugs are common enough that you
should be very careful in any network-facing application,
whether or not it has access to all guests when well-behaved.
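
As an aside, the fd-passing scheme Julian describes above (qemu hands a
file descriptor for VM memory over a domain socket, and the other side
mmaps it) can be sketched roughly as below. This is a minimal
illustration and not QEMU code: RAM_SIZE, send_ram_fd, recv_ram_fd and
demo are hypothetical names, a temporary file stands in for the
-mem-path backing file, and both ends run in one process for brevity.
It assumes Python 3.9+ on a Unix host.

```python
import mmap
import os
import socket
import tempfile

RAM_SIZE = 4096  # hypothetical: a single page standing in for guest RAM


def send_ram_fd(sock, fd):
    """Pass one file descriptor over a Unix domain socket (SCM_RIGHTS)."""
    socket.send_fds(sock, [b"ram"], [fd])


def recv_ram_fd(sock):
    """Receive one file descriptor sent with send_ram_fd()."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    return fds[0]


def demo():
    # The two socket ends stand in for the qemu side and the switch side.
    qemu_side, switch_side = socket.socketpair(socket.AF_UNIX,
                                               socket.SOCK_STREAM)
    with tempfile.TemporaryFile() as backing:
        backing.truncate(RAM_SIZE)
        # "qemu" maps the backing file as guest memory (shared mapping).
        guest_ram = mmap.mmap(backing.fileno(), RAM_SIZE)

        # Hand the descriptor across; the peer maps the same pages.
        send_ram_fd(qemu_side, backing.fileno())
        peer_fd = recv_ram_fd(switch_side)
        peer_view = mmap.mmap(peer_fd, RAM_SIZE)

        # A write through one mapping is visible through the other.
        guest_ram[0:6] = b"packet"
        seen = bytes(peer_view[0:6])

        peer_view.close()
        guest_ram.close()
        os.close(peer_fd)
    qemu_side.close()
    switch_side.close()
    return seen


if __name__ == "__main__":
    print(demo())
```

In a real setup the two ends would be separate processes, and the
receiver keeps full read/write access to every region it has been
handed. That is the property the security argument in this thread is
about.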

-- 
MST
