qemu-devel.nongnu.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Wei Wang <wei.w.wang@intel.com>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>,
	"Yang, Zhiyong" <zhiyong.yang@intel.com>,
	"jan.kiszka@siemens.com" <jan.kiszka@siemens.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"avi.cohen@huawei.com" <avi.cohen@huawei.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"marcandre.lureau@redhat.com" <marcandre.lureau@redhat.com>
Subject: Re: [Qemu-devel] [virtio-dev] [PATCH v3 0/7] Vhost-pci for inter-VM communication
Date: Fri, 8 Dec 2017 16:27:33 +0200
Message-ID: <20171208161606-mutt-send-email-mst@kernel.org>
In-Reply-To: <CAJSP0QXYMVBidUd5-NJb5FDYbc6wSkNYgdadjk8+NXvwosLMPw@mail.gmail.com>

On Fri, Dec 08, 2017 at 06:08:05AM +0000, Stefan Hajnoczi wrote:
> On Thu, Dec 7, 2017 at 11:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Thu, Dec 07, 2017 at 06:28:19PM +0000, Stefan Hajnoczi wrote:
> >> On Thu, Dec 7, 2017 at 5:38 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> > On Thu, Dec 07, 2017 at 05:29:14PM +0000, Stefan Hajnoczi wrote:
> >> >> On Thu, Dec 7, 2017 at 4:47 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >> > On Thu, Dec 07, 2017 at 04:29:45PM +0000, Stefan Hajnoczi wrote:
> >> >> >> On Thu, Dec 7, 2017 at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >> >> > On Thu, Dec 07, 2017 at 01:08:04PM +0000, Stefan Hajnoczi wrote:
> >> >> >> >> Instead of responding individually to these points, I hope this will
> >> >> >> >> explain my perspective.  Let me know if you do want individual
> >> >> >> >> responses, I'm happy to talk more about the points above but I think
> >> >> >> >> the biggest difference is our perspective on this:
> >> >> >> >>
> >> >> >> >> Existing vhost-user slave code should be able to run on top of
> >> >> >> >> vhost-pci.  For example, QEMU's
> >> >> >> >> contrib/vhost-user-scsi/vhost-user-scsi.c should work inside the guest
> >> >> >> >> with only minimal changes to the source file (i.e. today it explicitly
> >> >> >> >> opens a UNIX domain socket and that should be done by libvhost-user
> >> >> >> >> instead).  It shouldn't be hard to add vhost-pci vfio support to
> >> >> >> >> contrib/libvhost-user/ alongside the existing UNIX domain socket code.
> >> >> >> >>
> >> >> >> >> This seems pretty easy to achieve with the vhost-pci PCI adapter that
> >> >> >> >> I've described but I'm not sure how to implement libvhost-user on top
> >> >> >> >> of vhost-pci vfio if the device doesn't expose the vhost-user
> >> >> >> >> protocol.
> >> >> >> >>
> >> >> >> >> I think this is a really important goal.  Let's use a single
> >> >> >> >> vhost-user software stack instead of creating a separate one for guest
> >> >> >> >> code only.
> >> >> >> >>
> >> >> >> >> Do you agree that the vhost-user software stack should be shared
> >> >> >> >> between host userspace and guest code as much as possible?
> >> >> >> >
> >> >> >> > The sharing you propose is not necessarily practical because the security goals
> >> >> >> > of the two are different.
> >> >> >> >
> >> >> >> > It seems that the best presentation of the motivation is still the original RFC:
> >> >> >> >
> >> >> >> > http://virtualization.linux-foundation.narkive.com/A7FkzAgp/rfc-vhost-user-enhancements-for-vm2vm-communication
> >> >> >> >
> >> >> >> > So compared with vhost-user, iotlb handling is different:
> >> >> >> >
> >> >> >> > With vhost-user, the guest trusts the vhost-user backend on the host.
> >> >> >> >
> >> >> >> > With vhost-pci, we can strive to limit the trust to qemu only.
> >> >> >> > The switch running within a VM does not have to be trusted.
> >> >> >>
> >> >> >> Can you give a concrete example?
> >> >> >>
> >> >> >> I have an idea about what you're saying but it may be wrong:
> >> >> >>
> >> >> >> Today the iotlb mechanism in vhost-user does not actually enforce
> >> >> >> memory permissions.  The vhost-user slave has full access to mmapped
> >> >> >> memory regions even when iotlb is enabled.  Currently the iotlb just
> >> >> >> adds an indirection layer but no real security.  (Is this correct?)
> >> >> >
> >> >> > Not exactly. iotlb protects against malicious drivers within guest.
> >> >> > But yes, not against a vhost-user driver on the host.
> >> >> >
> >> >> >> Are you saying the vhost-pci device code in QEMU should enforce iotlb
> >> >> >> permissions so the vhost-user slave guest only has access to memory
> >> >> >> regions that are allowed by the iotlb?
> >> >> >
> >> >> > Yes.
> >> >>
> >> >> Okay, thanks for confirming.
> >> >>
> >> >> This can be supported by the approach I've described.  The vhost-pci
> >> >> QEMU code has control over the BAR memory so it can prevent the guest
> >> >> from accessing regions that are not allowed by the iotlb.
> >> >>
> >> >> Inside the guest the vhost-user slave still has the memory region
> >> >> descriptions and sends iotlb messages.  This is completely compatible
> >> >> with the libvhost-user APIs and existing vhost-user slave code can run
> >> >> fine.  The only unique thing is that guest accesses to memory regions
> >> >> not allowed by the iotlb do not work because QEMU has prevented it.
> >> >
> >> > I don't think this can work since suddenly you need
> >> > to map full IOMMU address space into BAR.
> >>
> >> The BAR covers all guest RAM but QEMU can set up MemoryRegions that
> >> hide parts from the guest (e.g. reads produce 0xff).  I'm not sure how
> >> expensive that is but implementing a strict IOMMU is hard to do
> >> without performance overhead.
> >
> > I'm worried about leaking PAs.
> > Fundamentally, if you want proper protection you
> > need your device driver to use VAs for addressing.
> >
> > On the one hand, the BAR then only needs to be as large as guest PA space.
> > On the other hand, it must cover all of guest PA space,
> > not just what is accessible to the device.
> 
> A more heavyweight iotlb implementation in QEMU's vhost-pci device
> could present only VAs to the vhost-pci driver.  It would use
> MemoryRegions to map pieces of shared guest memory dynamically.  The
> only information leak would be the overall guest RAM size because we
> still need to set the correct BAR size.

I'm not sure this will work. KVM simply isn't designed with a huge
number of fragmented regions in mind.
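
To make the fragmentation concern concrete, here is a minimal sketch
(not QEMU code) that counts how many separate contiguous mappings a set
of iotlb entries would need once entries that are adjacent in both
address spaces are merged. IotlbEntry and count_regions are hypothetical
names made up for this illustration, and the sketch assumes each merged
mapping would need its own region.

```python
from typing import List, NamedTuple


class IotlbEntry(NamedTuple):
    """One iotlb mapping (hypothetical structure, for illustration only)."""
    iova: int  # address the vhost-pci driver would use
    gpa: int   # offset into the other guest's RAM
    size: int  # length of the mapping in bytes


def count_regions(entries: List[IotlbEntry]) -> int:
    """Count the mappings left after merging entries that are contiguous
    in both iova and gpa. Each remaining mapping is assumed to need its
    own region."""
    regions = 0
    prev = None
    for entry in sorted(entries, key=lambda e: e.iova):
        contiguous = (
            prev is not None
            and prev.iova + prev.size == entry.iova
            and prev.gpa + prev.size == entry.gpa
        )
        if not contiguous:
            regions += 1
        prev = entry
    return regions


PAGE = 4096

# Pages that are contiguous on both sides collapse into one region.
linear = [IotlbEntry(i * PAGE, i * PAGE, PAGE) for i in range(1000)]

# Pages that are contiguous in iova but scattered in gpa cannot be merged.
scattered = [IotlbEntry(i * PAGE, ((i * 7) % 1000) * PAGE, PAGE)
             for i in range(1000)]

print(count_regions(linear), count_regions(scattered))
```

With page-granular mappings scattered across guest RAM, nothing merges
and the region count grows with the number of mapped pages, which is
the case the paragraph above is worried about.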

Wei, just what is the plan for the IOMMU? How will
all virtual addresses fit in a BAR?

Maybe we really do want a non-translating IOMMU
(leaking PA to userspace but oh well)?

> >>
> >> > Besides, this means implementing iotlb in both qemu and guest.
> >>
> >> It's free in the guest, the libvhost-user stack already has it.
> >
> > That library is designed to work with a unix domain socket
> > though. We'll need extra kernel code to make a device
> > pretend it's a socket.
> 
> A kernel vhost-pci driver isn't necessary because I don't think there
> are in-kernel users.
>
> A vfio vhost-pci backend can go alongside the UNIX domain socket
> backend that exists today in libvhost-user.
> 
> If we want to expose kernel vhost devices via vhost-pci then a
> libvhost-user program can translate the vhost-user protocol into
> kernel ioctls.  For example:
> $ vhost-pci-proxy --vhost-pci-addr 00:04.0 --vhost-fd 3 3<>/dev/vhost-net
> 
> The vhost-pci-proxy implements the vhost-user protocol callbacks and
> submits ioctls on the vhost kernel device fd.  I haven't compared the
> kernel ioctl interface vs the vhost-user protocol to see if everything
> maps cleanly though.
> 
> Stefan
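
As a rough illustration of the translation such a proxy would have to
do, the sketch below pairs vhost-user request names with the vhost
kernel ioctl of the same name. The request and ioctl names come from
the vhost-user protocol and <linux/vhost.h>; the pairing itself, the
None entries, and the helper names kernel_ioctl_for and
unmapped_requests are assumptions of this sketch, not a verified
compatibility analysis. It only looks names up and issues no ioctls.

```python
from typing import List, Optional

# vhost-user request name -> vhost kernel ioctl name.
# None marks requests that this sketch assumes have no direct generic
# kernel counterpart and would need special handling in a proxy.
VHOST_USER_TO_KERNEL = {
    "VHOST_USER_GET_FEATURES": "VHOST_GET_FEATURES",
    "VHOST_USER_SET_FEATURES": "VHOST_SET_FEATURES",
    "VHOST_USER_SET_OWNER": "VHOST_SET_OWNER",
    "VHOST_USER_RESET_OWNER": "VHOST_RESET_OWNER",
    "VHOST_USER_SET_MEM_TABLE": "VHOST_SET_MEM_TABLE",
    "VHOST_USER_SET_LOG_BASE": "VHOST_SET_LOG_BASE",
    "VHOST_USER_SET_LOG_FD": "VHOST_SET_LOG_FD",
    "VHOST_USER_SET_VRING_NUM": "VHOST_SET_VRING_NUM",
    "VHOST_USER_SET_VRING_ADDR": "VHOST_SET_VRING_ADDR",
    "VHOST_USER_SET_VRING_BASE": "VHOST_SET_VRING_BASE",
    "VHOST_USER_GET_VRING_BASE": "VHOST_GET_VRING_BASE",
    "VHOST_USER_SET_VRING_KICK": "VHOST_SET_VRING_KICK",
    "VHOST_USER_SET_VRING_CALL": "VHOST_SET_VRING_CALL",
    "VHOST_USER_SET_VRING_ERR": "VHOST_SET_VRING_ERR",
    "VHOST_USER_GET_PROTOCOL_FEATURES": None,
    "VHOST_USER_SET_PROTOCOL_FEATURES": None,
    "VHOST_USER_GET_QUEUE_NUM": None,
    "VHOST_USER_SET_VRING_ENABLE": None,
}


def kernel_ioctl_for(request: str) -> Optional[str]:
    """Return the kernel ioctl name paired with a vhost-user request,
    or None if the table marks it as having no direct counterpart.
    Raises KeyError for requests the table does not list."""
    return VHOST_USER_TO_KERNEL[request]


def unmapped_requests() -> List[str]:
    """List the requests a proxy could not simply forward as one ioctl."""
    return [name for name, ioctl in VHOST_USER_TO_KERNEL.items()
            if ioctl is None]


print(kernel_ioctl_for("VHOST_USER_SET_MEM_TABLE"))
print(unmapped_requests())
```

Even where the names line up, the payloads need not: for example,
vhost-user passes memory regions as file descriptors while the kernel
interface takes userspace addresses, so a name-level match does not by
itself show that everything maps cleanly.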

I don't really like this: it's yet another package to install, yet
another process to complicate debugging, and yet another service that
can go down.

Maybe vsock can do the trick though?

-- 
MST
