From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: netdev@vger.kernel.org, David Miller <davem@davemloft.net>,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] AF_VMCHANNEL address family for guest<->host communication.
Date: Mon, 15 Dec 2008 15:44:22 -0800 [thread overview]
Message-ID: <4946EBD6.9080201@goop.org> (raw)
In-Reply-To: <4946E36D.8060503@codemonkey.ws>
Anthony Liguori wrote:
> Jeremy Fitzhardinge wrote:
>> Anthony Liguori wrote:
>>>
>>> That seems unnecessarily complex.
>>>
>>
>> Well, the simplest thing is to let the host TCP stack do TCP. Could
>> you go into more detail about why you'd want to avoid that?
>
> The KVM model is that a guest is a process. Any IO operations
> original from the process (QEMU). The advantage to this is that you
> get very good security because you can use things like SELinux and
> simply treat the QEMU process as you would the guest. In fact, in
> general, I think we want to assume that QEMU is guest code from a
> security perspective.
>
> By passing up the network traffic to the host kernel, we now face a
> problem when we try to get the data back. We could setup a tun device
> to send traffic to the kernel but then the rest of the system can see
> that traffic too. If that traffic is sensitive, it's potentially unsafe.
Well, one could come up with a mechanism to bind an interface to be only
visible to a particular context/container/something.
> You can use iptables to restrict who can receive traffic and possibly
> use SELinux packet tagging or whatever. This gets extremely complex
> though.
Well, if you can just tag everything based on interface its relatively
simple.
> It's far easier to avoid the host kernel entirely and implement the
> backends in QEMU. Then any actions the backend takes will be on
> behalf of the guest. You never have to worry about transport data
> leakage.
Well, a stream-like protocol layered over a reliable packet transport
would get you there without the complexity of tcp. Or just do a
usermode tcp; its not that complex if you really think it simplifies the
other aspects.
>
>>> This is why I've been pushing for the backends to be implemented in
>>> QEMU. Then QEMU can marshal the backend-specific state and transfer
>>> it during live migration. For something like copy/paste, this is
>>> obvious (the clipboard state). A general command interface is
>>> probably stateless so it's a nop.
>>>
>>
>> Copy/paste seems like a particularly bogus example. Surely this
>> isn't a sensible way to implement it?
>
> I think it's the most sensible way to implement it. Would you suggest
> something different?
Well, off the top of my head I'm assuming the requirements are:
* the goal is to unify the user's actual desktop session with a
virtual session within a vm
* a given user may have multiple VMs running on their desktop
* a VM may be serving multiple user sessions
* the VMs are not necessarily hosted by the user's desktop machine
* the VMs can migrate at any moment
To me that looks like a daemon running within the context of each of the
user's virtual sessions monitoring clipboard events, talking over a TCP
connection to a corresponding daemon in their desktop session, which is
responsible for reconciling cuts and pastes in all the various sessions.
I guess you'd say that each VM would multiplex all its cut/paste events
via its AF_VMCHANNEL/cut+paste channel to its qemu, which would then
demultiplex them off to the user's real desktops. And that since the VM
itself may have no networking, it needs to be a special magic connection.
And my counter argument to this nicely placed straw man is that the
VM<->qemu connection can still be TCP, even if its a private network
with no outside access.
>
>>> I'm not a fan of having external backends to QEMU for the very
>>> reasons you outline above. You cannot marshal the state of a
>>> channel we know nothing about. We're really just talking about
>>> extending virtio in a guest down to userspace so that we can
>>> implement paravirtual device drivers in guest userspace. This may
>>> be an X graphics driver, a mouse driver, copy/paste, remote
>>> shutdown, etc.
>>> A socket seems like a natural choice. If that's wrong, then we
>>> can explore other options (like a char device, virtual fs, etc.).
>>
>> I think a socket is a pretty poor choice. It's too low level, and it
>> only really makes sense for streaming data, not for data storage
>> (name/value pairs). It means that everyone ends up making up their
>> own serializations. A filesystem view with notifications seems to be
>> a better match for the use-cases you mention (aside from cut/paste),
>> with a single well-defined way to serialize onto any given channel.
>> Each "file" may well have an application-specific content, but in
>> general that's going to be something pretty simple.
>
> I had suggested a virtual file system at first and was thoroughly
> ridiculed for it :-) There is a 9p virtio transport already so we
> could even just use that.
You mean 9p directly over a virtio ringbuffer rather than via the
network stack? You could do that, but I'd still argue that using the
network stack is a better approach.
> The main issue with a virtual file system is that it does map well to
> other guests. It's actually easier to implement a socket interface
> for Windows than it is to implement a new file system.
There's no need to put the "filesystem" into the kernel unless something
else in the kernel needs to access it. A usermode implementation
talking over some stream interface would be fine.
> But we could find ways around this with libraries. If we used 9p as a
> transport, we could just provide a char device in Windows that
> received it in userspace.
Or just use a tcp connection, and do it all with no kernel mods.
(Is 9p a good choice? You need to be able to subscribe to events
happening to files, and you'd need some kind of atomicity guarantee. I
dunno, maybe 9p already has this or can be cleanly adapted.)
J
next prev parent reply other threads:[~2008-12-15 23:44 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-14 11:50 [PATCH] AF_VMCHANNEL address family for guest<->host communication Gleb Natapov
2008-12-14 12:23 ` Evgeniy Polyakov
2008-12-14 12:46 ` Gleb Natapov
2008-12-15 6:44 ` David Miller
2008-12-15 7:48 ` Gleb Natapov
2008-12-15 8:27 ` David Miller
2008-12-15 15:02 ` Anthony Liguori
2008-12-15 17:45 ` Jeremy Fitzhardinge
2008-12-15 18:26 ` Itamar Heim
2008-12-15 18:45 ` Anthony Liguori
2008-12-15 22:52 ` Jeremy Fitzhardinge
2008-12-15 23:08 ` Anthony Liguori
2008-12-15 23:44 ` Jeremy Fitzhardinge [this message]
2008-12-15 23:52 ` Evgeniy Polyakov
2008-12-16 0:01 ` Dor Laor
2008-12-15 19:43 ` David Miller
2008-12-15 20:44 ` Anthony Liguori
2008-12-15 22:29 ` David Miller
2008-12-15 23:01 ` Anthony Liguori
2008-12-15 23:10 ` David Miller
2008-12-15 23:17 ` Anthony Liguori
2008-12-16 2:55 ` Herbert Xu
2008-12-15 23:13 ` Stephen Hemminger
2008-12-15 23:45 ` Evgeniy Polyakov
2008-12-16 6:57 ` Gleb Natapov
2008-12-16 21:25 ` Evgeniy Polyakov
2008-12-16 23:20 ` Dor Laor
2008-12-17 14:31 ` Gleb Natapov
2008-12-18 12:30 ` Evgeniy Polyakov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4946EBD6.9080201@goop.org \
--to=jeremy@goop.org \
--cc=anthony@codemonkey.ws \
--cc=davem@davemloft.net \
--cc=kvm@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).