From: Stefan Hajnoczi <stefanha@redhat.com>
To: Wei Wang <wei.w.wang@intel.com>
Cc: virtio-dev@lists.oasis-open.org, qemu-devel@nongnu.org,
mst@redhat.com, marcandre.lureau@redhat.com, jasowang@redhat.com,
pbonzini@redhat.com, jan.kiszka@siemens.com,
avi.cohen@huawei.com, zhiyong.yang@intel.com
Subject: Re: [Qemu-devel] [virtio-dev] [PATCH v3 0/7] Vhost-pci for inter-VM communication
Date: Wed, 6 Dec 2017 13:49:57 +0000
Message-ID: <20171206134957.GD12584@stefanha-x1.localdomain>
In-Reply-To: <1512444796-30615-1-git-send-email-wei.w.wang@intel.com>
On Tue, Dec 05, 2017 at 11:33:09AM +0800, Wei Wang wrote:
> Vhost-pci is a point-to-point based inter-VM communication solution. This
> patch series implements the vhost-pci-net device setup and emulation. The
> device is implemented as a virtio device, and it is set up via the
> vhost-user protocol to get the necessary info (e.g. the memory info of the
> remote VM, vring info).
>
> Currently, only the fundamental functions are implemented. More features,
> such as MQ and live migration, will be updated in the future.
>
> The DPDK PMD of vhost-pci has been posted to the dpdk mailing list here:
> http://dpdk.org/ml/archives/dev/2017-November/082615.html
I have asked questions about the scope of this feature. In particular,
I think it's best to support all device types rather than just
virtio-net. Here is a design document that shows how this can be
achieved.
What I'm proposing is different from the current approach:
1. It's a PCI adapter (see below for justification)
2. The vhost-user protocol is exposed by the device (not handled 100% in
QEMU). Ultimately I think your approach would also need to do this.
I'm not implementing this and not asking you to implement it. Let's
just use this for discussion so we can figure out what the final
vhost-pci will look like.
Please let me know what you think, Wei, Michael, and others.
---
vhost-pci device specification
-------------------------------
The vhost-pci device allows guests to act as vhost-user slaves. This
enables appliance VMs like network switches or storage targets to back
devices in other VMs. VM-to-VM communication is possible without
vmexits using polling mode drivers.
The vhost-user protocol has been used to implement virtio devices in
userspace processes on the host. vhost-pci maps the vhost-user protocol
to a PCI adapter so guest software can perform virtio device emulation.
This is useful in environments where high-performance VM-to-VM
communication is necessary or where it is preferable to deploy device
emulation as VMs instead of host userspace processes.
The vhost-user protocol involves file descriptor passing and shared
memory. This precludes vhost-user slave implementations over
virtio-vsock, virtio-serial, or TCP/IP. Therefore a new device type is
needed to expose the vhost-user protocol to guests.
The vhost-pci PCI adapter has the following resources:
Queues (used for vhost-user protocol communication):
1. Master-to-slave messages
2. Slave-to-master messages
Doorbells (used for slave->guest/master events):
1. Vring call (one doorbell per virtqueue)
2. Vring err (one doorbell per virtqueue)
3. Log changed
Interrupts (used for guest->slave events):
1. Vring kick (one MSI per virtqueue)
Shared Memory BARs:
1. Guest memory
2. Log
Master-to-Slave queue:
The following vhost-user protocol messages are relayed from the
vhost-user master. Each message follows the vhost-user protocol
VhostUserMsg layout.
Messages that include file descriptor passing are relayed but do not
carry file descriptors. The relevant resources (doorbells, interrupts,
or shared memory BARs) are initialized from the file descriptors prior
to the message becoming available on the Master-to-Slave queue.
Resources must only be used after the corresponding vhost-user message
has been received. For example, the Vring call doorbell can only be
used after VHOST_USER_SET_VRING_CALL becomes available on the
Master-to-Slave queue.
Messages must be processed in order.
The following vhost-user protocol messages are relayed:
* VHOST_USER_GET_FEATURES
* VHOST_USER_SET_FEATURES
* VHOST_USER_GET_PROTOCOL_FEATURES
* VHOST_USER_SET_PROTOCOL_FEATURES
* VHOST_USER_SET_OWNER
* VHOST_USER_SET_MEM_TABLE
The shared memory is available in the corresponding BAR.
* VHOST_USER_SET_LOG_BASE
The shared memory is available in the corresponding BAR.
* VHOST_USER_SET_LOG_FD
The logging file descriptor can be signalled through the logging
virtqueue.
* VHOST_USER_SET_VRING_NUM
* VHOST_USER_SET_VRING_ADDR
* VHOST_USER_SET_VRING_BASE
* VHOST_USER_GET_VRING_BASE
* VHOST_USER_SET_VRING_KICK
This message is still needed because it may indicate only polling
mode is supported.
* VHOST_USER_SET_VRING_CALL
This message is still needed because it may indicate only polling
mode is supported.
* VHOST_USER_SET_VRING_ERR
* VHOST_USER_GET_QUEUE_NUM
* VHOST_USER_SET_VRING_ENABLE
* VHOST_USER_SEND_RARP
* VHOST_USER_NET_SET_MTU
* VHOST_USER_SET_SLAVE_REQ_FD
* VHOST_USER_IOTLB_MSG
* VHOST_USER_SET_VRING_ENDIAN
Slave-to-Master queue:
Messages added to the Slave-to-Master queue are sent to the vhost-user
master. Each message follows the vhost-user protocol VhostUserMsg
layout.
The following vhost-user protocol messages are relayed:
* VHOST_USER_SLAVE_IOTLB_MSG
Theory of Operation:
When the vhost-pci adapter is detected, the queues must be set up by the
driver. Once the driver is ready, the vhost-pci device begins relaying
vhost-user protocol messages over the Master-to-Slave queue. The driver
must follow the vhost-user protocol specification to implement
virtio device initialization and virtqueue processing.
Notes:
The vhost-user UNIX domain socket connects two host processes. The
slave process interprets messages and initializes vhost-pci resources
(doorbells, interrupts, shared memory BARs) based on them before
relaying via the Master-to-Slave queue. All messages are relayed, even
if they only pass a file descriptor, because the message itself may act
as a signal (e.g. virtqueue is now enabled).
vhost-pci is a PCI adapter instead of a virtio device to allow doorbells
and interrupts to be connected to the virtio device in the master VM in
the most efficient way possible. This means the Vring call doorbell can
be an ioeventfd that signals an irqfd inside the host kernel without
host userspace involvement. The Vring kick interrupt can be an irqfd
that is signalled by the master VM's virtqueue ioeventfd.
It may be possible to write a Linux vhost-pci driver that implements the
drivers/vhost/ API. That way existing vhost drivers could work with
vhost-pci in the kernel.
Guest userspace vhost-pci drivers will be similar to QEMU's
contrib/libvhost-user/ except they will probably use vfio to access the
vhost-pci device directly from userspace.
TODO:
* Queue memory layout and hardware registers
* vhost-pci-level negotiation and configuration so the hardware
interface can be extended in the future.
* vhost-pci <-> driver initialization procedure
* Master<->Slave disconnected & reconnect