From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Nakajima, Jun" <jun.nakajima@intel.com>
Cc: virtio-dev@lists.oasis-open.org,
Jan Kiszka <jan.kiszka@siemens.com>,
Claudio.Fontana@huawei.com, qemu-devel@nongnu.org,
Linux Virtualization <virtualization@lists.linux-foundation.org>,
opnfv-tech-discuss@lists.opnfv.org
Subject: Re: [Qemu-devel] rfc: vhost user enhancements for vm2vm communication
Date: Wed, 7 Oct 2015 08:39:40 +0300
Message-ID: <20151007053940.GB13983@redhat.com>
In-Reply-To: <CAL54oT1Q=+y_oNuUGq=KOXa7Yc-skNeXzcSn4kza3EpCKUYQ1w@mail.gmail.com>

On Tue, Oct 06, 2015 at 02:42:34PM -0700, Nakajima, Jun wrote:
> Hi Michael,
>
> Looks like the discussions tapered off, but do you have a plan to
> implement this if people are eventually fine with it? We want to
> extend this to support multiple VMs.
Absolutely. We are just back from holidays and have started looking at
who does what. If anyone wants to help, that'd also be nice.
> On Mon, Aug 31, 2015 at 11:35 AM, Nakajima, Jun <jun.nakajima@intel.com> wrote:
> > On Mon, Aug 31, 2015 at 7:11 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> Hello!
> >> During the KVM forum, we discussed supporting virtio on top
> >> of ivshmem. I have considered it, and came up with an alternative
> >> that has several advantages over that - please see below.
> >> Comments welcome.
> >
> > Hi Michael,
> >
> > I like this, and it should be able to achieve what I presented at KVM
> > Forum (vhost-user-shmem).
> > Comments below.
> >
> >>
> >> -----
> >>
> >> Existing solutions for userspace switching between VMs on the
> >> same host are vhost-user and ivshmem.
> >>
> >> vhost-user works by mapping the memory of all VMs being bridged
> >> into the switch's memory space.
> >>
> >> By comparison, ivshmem works by exposing a shared region of memory to all VMs.
> >> VMs are required to use this region to store packets. The switch only
> >> needs access to this region.
> >>
> >> Another difference between vhost-user and ivshmem surfaces when
> >> polling is used. With vhost-user, the switch is required to handle
> >> data movement between VMs; with polling, this means that one host
> >> CPU must be dedicated to this task.
> >>
> >> This is easiest to understand when one of the VMs is
> >> used with VF pass-through. This can be schematically shown below:
> >>
> >> +-- VM1 --------------+ +---VM2-----------+
> >> | virtio-pci +-vhost-user-+ virtio-pci -- VF | -- VFIO -- IOMMU -- NIC
> >> +---------------------+ +-----------------+
> >>
> >>
> >> With ivshmem, in theory, communication can happen directly, with
> >> the two VMs polling the shared memory region.
> >>
> >>
> >> I won't spend time listing advantages of vhost-user over ivshmem.
> >> Instead, having identified two advantages of ivshmem over vhost-user,
> >> below is a proposal to extend vhost-user to gain the advantages
> >> of ivshmem.
> >>
> >>
> >> 1. virtio in the guest can be extended to allow support
> >> for IOMMUs. This provides the guest with full flexibility
> >> over which memory is readable or writable by each device.
> >
> > I assume that by "use of VFIO" you meant VFIO only for virtio. To get
> > VFIO working for general direct I/O (including VFs) in guests, as you
> > know, we need to virtualize the IOMMU (e.g. VT-d) and the interrupt
> > remapping table on x86 (i.e. nested VT-d).
> >
> >> By setting up a virtio device for each other VM it needs to
> >> communicate with, the guest gets full control of its security:
> >> from mapping all memory (as with current vhost-user), to mapping
> >> only the buffers used for networking (as with ivshmem), to
> >> transient mappings for the duration of a data transfer only.
> >
> > And I think that we can use VMFUNC to have such transient mappings.
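
To illustrate the per-transfer end of that spectrum: for an
IOMMU-aware virtio driver in the guest, a transient mapping is just
the standard Linux DMA API pattern. A minimal sketch (kernel code,
error handling trimmed, virtqueue details elided):

    #include <linux/dma-mapping.h>
    #include <linux/errno.h>

    /* Map one packet buffer only for the duration of the transfer. */
    static int send_buf(struct device *dev, void *buf, size_t len)
    {
            dma_addr_t addr;

            /* Creates an IOMMU mapping; the device can reach
             * only this buffer, and only until the unmap below. */
            addr = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
            if (dma_mapping_error(dev, addr))
                    return -ENOMEM;

            /* ... post addr/len to the virtqueue, wait for use ... */

            /* Revoke device access again. */
            dma_unmap_single(dev, addr, len, DMA_TO_DEVICE);
            return 0;
    }

The guest picks its policy purely by how it programs its IOMMU: map
everything once at boot, map just the rings, or map per buffer as
above.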
> >
> >> This also allows use of VFIO within guests, for improved
> >> security.
> >>
> >> vhost-user would need to be extended to send the
> >> mappings programmed by the guest IOMMU.
> >
> > Right. We need to think about cases where other VMs (VM3, etc.) join
> > the group or some existing VM leaves.
> > PCI hot-plug should work there (as you point out under "Advantages over
> > ivshmem" below).
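
To make the extension concrete: the vhost-user side could carry
IOTLB-style messages whenever the guest updates its IOMMU mappings.
A rough sketch of the payload (the message name and layout below are
purely illustrative, nothing is specified yet):

    #include <stdint.h>

    /* Hypothetical vhost-user payload: one guest IOMMU update. */
    struct vhost_user_iommu_msg {
            uint64_t iova;  /* bus address used by the guest device */
            uint64_t size;  /* length of the mapping in bytes       */
            uint64_t addr;  /* address that the iova translates to  */
            uint8_t  perm;  /* read/write permission bits           */
            uint8_t  op;    /* MAP or UNMAP                         */
    };

A VM3 joining the group would then just be another device hot-plugged
into the guest, announcing its own stream of these messages.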
> >
> >>
> >> 2. qemu can be extended to serve as a vhost-user client:
> >> it would receive remote VM mappings over the vhost-user
> >> protocol, and map them into another VM's memory.
> >> This mapping can take, for example, the form of
> >> a BAR of a PCI device, which I'll call here vhost-pci,
> >> with bus addresses allowed
> >> by VM1's IOMMU mappings being translated into
> >> offsets within this BAR within VM2's physical
> >> memory space.
> >
> > I think it's sensible.
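
To spell out how simple the translation can be: with a single linear
region it is just an offset subtraction (an illustrative sketch only,
names made up):

    #include <stdint.h>

    /* Translate a VM1 bus address (as allowed by VM1's IOMMU)
     * into an offset within VM2's vhost-pci BAR. */
    static inline uint64_t bus_to_bar_offset(uint64_t bus_addr,
                                             uint64_t region_base)
    {
            return bus_addr - region_base;
    }

A real device could keep a small table of such regions and pick the
matching one before subtracting.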
> >
> >>
> >> Since the translation can be a simple one, VM2
> >> can perform it within its vhost-pci device driver.
> >>
> >> While this setup would be the most useful with polling,
> >> VM1's ioeventfd can also be mapped to VM2's irqfd, and
> >> vice versa, such that the VMs can trigger interrupts for
> >> each other without the need for a helper thread on the host.
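
The interrupt coupling above is plumbing KVM already has: the same
eventfd that a doorbell write in VM1 signals can inject an interrupt
into VM2, entirely within the kernel. A sketch using the existing KVM
ioctls (setup only, error handling trimmed):

    #include <stdint.h>
    #include <sys/eventfd.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Wire an MMIO doorbell write in VM1 straight to an interrupt
     * in VM2; no host thread is involved once this is set up. */
    static int wire_doorbell(int vm1_fd, int vm2_fd,
                             uint64_t doorbell_gpa, uint32_t vm2_gsi)
    {
            int fd = eventfd(0, EFD_CLOEXEC);
            struct kvm_ioeventfd io = {
                    .addr = doorbell_gpa, /* VM1 write signals fd  */
                    .len  = 4,
                    .fd   = fd,
            };
            struct kvm_irqfd irq = {
                    .fd  = fd,            /* same fd raises gsi in VM2 */
                    .gsi = vm2_gsi,
            };

            if (fd < 0)
                    return -1;
            if (ioctl(vm1_fd, KVM_IOEVENTFD, &io) < 0 ||
                ioctl(vm2_fd, KVM_IRQFD, &irq) < 0)
                    return -1;
            return fd;
    }

In practice the two VMs live in different qemu processes, so the
eventfd would be passed over the vhost-user socket (SCM_RIGHTS), the
same way vhost-user passes fds today.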
> >>
> >>
> >> The resulting channel might look something like the following:
> >>
> >> +-- VM1 --------------+ +---VM2-----------+
> >> | virtio-pci -- iommu +--+ vhost-pci -- VF | -- VFIO -- IOMMU -- NIC
> >> +---------------------+ +-----------------+
> >>
> >> Comparing the two diagrams: a vhost-user thread on the host is
> >> no longer required, reducing host CPU utilization when
> >> polling is active. At the same time, VM2 cannot access all of
> >> VM1's memory; it is limited by the IOMMU configuration set up by VM1.
> >>
> >>
> >> Advantages over ivshmem:
> >>
> >> - more flexibility: endpoint VMs do not have to place data at any
> >>   specific location to use the device; in practice this likely
> >>   means fewer data copies.
> >> - better standardization/code reuse:
> >>   virtio changes within guests would be fairly easy to implement
> >>   and would also benefit other backends besides vhost-user;
> >>   standard hotplug interfaces can be used to add and remove these
> >>   channels as VMs are added or removed.
> >> - migration support:
> >>   it's easy to implement since ownership of memory is well defined;
> >>   for example, during migration VM2 can notify VM1's hypervisor
> >>   by updating a dirty bitmap each time it writes into VM1's memory.
> >
> > Also, the ivshmem functionality could be implemented with this proposal:
> > - the vswitch (or some VM) allocates memory regions in its address space, and
> > - it sets up IOMMU mappings on the VMs so that they translate into those regions
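
On the migration point above: the writer's side boils down to dirtying
every page it touches in the peer's memory. A toy sketch (names
invented for illustration, atomicity ignored):

    #include <stdint.h>
    #include <string.h>

    #define PAGE_SIZE 4096UL

    struct peer_region {
            uint8_t       *hva;    /* host mapping of VM1 memory     */
            unsigned long *dirty;  /* one bit per page of VM1 memory */
    };

    static void mark_dirty(unsigned long *map, uint64_t page)
    {
            map[page / (8 * sizeof(unsigned long))] |=
                    1UL << (page % (8 * sizeof(unsigned long)));
    }

    /* VM2-side write into VM1 memory, with dirty tracking so VM1
     * stays migratable while the channel is live. */
    static void peer_write(struct peer_region *r, uint64_t off,
                           const void *buf, size_t len)
    {
            memcpy(r->hva + off, buf, len);
            for (uint64_t p = off / PAGE_SIZE;
                 p <= (off + len - 1) / PAGE_SIZE; p++)
                    mark_dirty(r->dirty, p);
    }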
> >
> >>
> >> Thanks,
> >>
> >> --
> >> MST
>
>
>
> --
> Jun
> Intel Open Source Technology Center