From: "Michael S. Tsirkin" <mst@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: jasowang@redhat.com, alex.williamson@redhat.com,
pbonzini@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org,
virtio-dev@lists.oasis-open.org, cunming.liang@intel.com,
dan.daly@intel.com, jianfeng.tan@intel.com,
zhihong.wang@intel.com, xiao.w.wang@intel.com
Subject: Re: [Qemu-devel] [PATCH v3 6/6] vhost-user: support registering external host notifiers
Date: Wed, 18 Apr 2018 19:34:06 +0300 [thread overview]
Message-ID: <20180418192154-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20180412151232.17506-7-tiwei.bie@intel.com>
On Thu, Apr 12, 2018 at 11:12:32PM +0800, Tiwei Bie wrote:
> This patch introduces VHOST_USER_PROTOCOL_F_HOST_NOTIFIER.
> With this feature negotiated, vhost-user backend can register
> memory region based host notifiers. And it will allow the guest
> driver in the VM to notify the hardware accelerator at the
> vhost-user backend directly.
>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Overall I think we can merge this approach, but I have two main concerns
about this:
1. Testing. Most people do not have the virtio hardware
so how to make sure this does not bit rot?
I have an idea: add an option like this to libvhost-user.
Naturally libvhost-user can not get notified about a write
to an mmapped area, but it can write a special value there
(e.g. all-ones?) and then poll it to detect VQ # writes.
Then include a vhost user bridge test with an option like this.
I'd like to see a patch doing this.
2. Memory barriers. Right now after updating the avail idx,
virtio does smp_wmb() and then the MMIO write.
Normal hardware drivers do wmb() which is an sfence.
Can a PCI device read bypass index write and see a stale
index value?
To make virtio pci do wmb() we would need a new feature bit.
Alternatively I guess we could maybe look at subsystem vendor/device id.
I'd like to see a patch doing one of these things.
Thanks!
> ---
> docs/interop/vhost-user.txt | 33 +++++++++++
> hw/virtio/vhost-user.c | 123 +++++++++++++++++++++++++++++++++++++++++
> include/hw/virtio/vhost-user.h | 8 +++
> 3 files changed, 164 insertions(+)
>
> diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> index 534caab18a..9e57b36b20 100644
> --- a/docs/interop/vhost-user.txt
> +++ b/docs/interop/vhost-user.txt
> @@ -132,6 +132,16 @@ Depending on the request type, payload can be:
> Payload: Size bytes array holding the contents of the virtio
> device's configuration space
>
> + * Vring area description
> + -----------------------
> + | u64 | size | offset |
> + -----------------------
> +
> + u64: a 64-bit integer contains vring index and flags
> + Size: a 64-bit size of this area
> + Offset: a 64-bit offset of this area from the start of the
> + supplied file descriptor
> +
> In QEMU the vhost-user message is implemented with the following struct:
>
> typedef struct VhostUserMsg {
> @@ -146,6 +156,7 @@ typedef struct VhostUserMsg {
> VhostUserLog log;
> struct vhost_iotlb_msg iotlb;
> VhostUserConfig config;
> + VhostUserVringArea area;
> };
> } QEMU_PACKED VhostUserMsg;
>
> @@ -380,6 +391,7 @@ Protocol features
> #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
> #define VHOST_USER_PROTOCOL_F_PAGEFAULT 8
> #define VHOST_USER_PROTOCOL_F_CONFIG 9
> +#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER 10
>
> Master message types
> --------------------
> @@ -777,6 +789,27 @@ Slave message types
> the VHOST_USER_NEED_REPLY flag, master must respond with zero when
> operation is successfully completed, or non-zero otherwise.
>
> + * VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG
> +
> + Id: 3
> + Equivalent ioctl: N/A
> + Slave payload: vring area description
> + Master payload: N/A
> +
> + Sets host notifier for a specified queue. The queue index is contained
> + in the u64 field of the vring area description. The host notifier is
> + described by the file descriptor (typically it's a VFIO device fd) which
> + is passed as ancillary data and the size (which is mmap size and should
> + be the same as host page size) and offset (which is mmap offset) carried
> + in the vring area description. QEMU can mmap the file descriptor based
> + on the size and offset to get a memory range. Registering a host notifier
> + means mapping this memory range to the VM as the specified queue's notify
> + MMIO region. Slave sends this request to tell QEMU to de-register the
> + existing notifier if any and register the new notifier if the request is
> + sent with a file descriptor.
> + This request should be sent only when VHOST_USER_PROTOCOL_F_HOST_NOTIFIER
> + protocol feature has been successfully negotiated.
> +
> VHOST_USER_PROTOCOL_F_REPLY_ACK:
> -------------------------------
> The original vhost-user specification only demands replies for certain
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 791e0a4763..1cd9c7276b 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -13,6 +13,7 @@
> #include "hw/virtio/vhost.h"
> #include "hw/virtio/vhost-user.h"
> #include "hw/virtio/vhost-backend.h"
> +#include "hw/virtio/virtio.h"
> #include "hw/virtio/virtio-net.h"
> #include "chardev/char-fe.h"
> #include "sysemu/kvm.h"
> @@ -48,6 +49,7 @@ enum VhostUserProtocolFeature {
> VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
> VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
> VHOST_USER_PROTOCOL_F_CONFIG = 9,
> + VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 10,
> VHOST_USER_PROTOCOL_F_MAX
> };
>
> @@ -92,6 +94,7 @@ typedef enum VhostUserSlaveRequest {
> VHOST_USER_SLAVE_NONE = 0,
> VHOST_USER_SLAVE_IOTLB_MSG = 1,
> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG = 2,
> + VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3,
> VHOST_USER_SLAVE_MAX
> } VhostUserSlaveRequest;
>
> @@ -136,6 +139,12 @@ static VhostUserConfig c __attribute__ ((unused));
> + sizeof(c.size) \
> + sizeof(c.flags))
>
> +typedef struct VhostUserVringArea {
> + uint64_t u64;
> + uint64_t size;
> + uint64_t offset;
> +} VhostUserVringArea;
> +
> typedef struct {
> VhostUserRequest request;
>
> @@ -157,6 +166,7 @@ typedef union {
> struct vhost_iotlb_msg iotlb;
> VhostUserConfig config;
> VhostUserCryptoSession session;
> + VhostUserVringArea area;
> } VhostUserPayload;
>
> typedef struct VhostUserMsg {
> @@ -638,9 +648,37 @@ static int vhost_user_set_vring_num(struct vhost_dev *dev,
> return vhost_set_vring(dev, VHOST_USER_SET_VRING_NUM, ring);
> }
>
> +static void vhost_user_host_notifier_restore(struct vhost_dev *dev,
> + int queue_idx)
> +{
> + struct vhost_user *u = dev->opaque;
> + VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
> + VirtIODevice *vdev = dev->vdev;
> +
> + if (n->addr && !n->set) {
> + virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true);
> + n->set = true;
> + }
> +}
> +
> +static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
> + int queue_idx)
> +{
> + struct vhost_user *u = dev->opaque;
> + VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
> + VirtIODevice *vdev = dev->vdev;
> +
> + if (n->addr && n->set) {
> + virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> + n->set = false;
> + }
> +}
> +
> static int vhost_user_set_vring_base(struct vhost_dev *dev,
> struct vhost_vring_state *ring)
> {
> + vhost_user_host_notifier_restore(dev, ring->index);
> +
> return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring);
> }
>
> @@ -674,6 +712,8 @@ static int vhost_user_get_vring_base(struct vhost_dev *dev,
> .hdr.size = sizeof(msg.payload.state),
> };
>
> + vhost_user_host_notifier_remove(dev, ring->index);
> +
> if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> return -1;
> }
> @@ -847,6 +887,76 @@ static int vhost_user_slave_handle_config_change(struct vhost_dev *dev)
> return ret;
> }
>
> +static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
> + VhostUserVringArea *area,
> + int fd)
> +{
> + int queue_idx = area->u64 & VHOST_USER_VRING_IDX_MASK;
> + size_t page_size = qemu_real_host_page_size;
> + struct vhost_user *u = dev->opaque;
> + VhostUserState *user = u->user;
> + VirtIODevice *vdev = dev->vdev;
> + VhostUserHostNotifier *n;
> + int ret = 0;
> + void *addr;
> + char *name;
> +
> + if (!virtio_has_feature(dev->protocol_features,
> + VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) ||
> + vdev == NULL || queue_idx >= virtio_get_num_queues(vdev)) {
> + ret = -1;
> + goto out;
> + }
> +
> + n = &user->notifier[queue_idx];
> +
> + if (n->addr) {
> + virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> + object_unparent(OBJECT(&n->mr));
> + munmap(n->addr, page_size);
> + n->addr = NULL;
> + }
> +
> + if (area->u64 & VHOST_USER_VRING_NOFD_MASK) {
> + goto out;
> + }
> +
> + /* Sanity check. */
> + if (area->size != page_size) {
> + ret = -1;
> + goto out;
> + }
> +
> + addr = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> + fd, area->offset);
> + if (addr == MAP_FAILED) {
> + ret = -1;
> + goto out;
> + }
> +
> + name = g_strdup_printf("vhost-user/host-notifier@%p mmaps[%d]",
> + user, queue_idx);
> + memory_region_init_ram_device_ptr(&n->mr, OBJECT(vdev), name,
> + page_size, addr);
> + g_free(name);
> +
> + if (virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true)) {
> + munmap(addr, page_size);
> + ret = -1;
> + goto out;
> + }
> +
> + n->addr = addr;
> + n->set = true;
> +
> +out:
> + /* Always close the fd. */
> + if (fd != -1) {
> + close(fd);
> + }
> + return ret;
> +}
> +
> static void slave_read(void *opaque)
> {
> struct vhost_dev *dev = opaque;
> @@ -913,6 +1023,10 @@ static void slave_read(void *opaque)
> case VHOST_USER_SLAVE_CONFIG_CHANGE_MSG :
> ret = vhost_user_slave_handle_config_change(dev);
> break;
> + case VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG:
> + ret = vhost_user_slave_handle_vring_host_notifier(dev, &payload.area,
> + fd);
> + break;
> default:
> error_report("Received unexpected msg type.");
> if (fd != -1) {
> @@ -1641,6 +1755,15 @@ VhostUserState *vhost_user_init(void)
>
> void vhost_user_cleanup(VhostUserState *user)
> {
> + int i;
> +
> + for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> + if (user->notifier[i].addr) {
> + object_unparent(OBJECT(&user->notifier[i].mr));
> + munmap(user->notifier[i].addr, qemu_real_host_page_size);
> + user->notifier[i].addr = NULL;
> + }
> + }
> }
>
> const VhostOps user_ops = {
> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
> index eb8bc0d90d..fd660393a0 100644
> --- a/include/hw/virtio/vhost-user.h
> +++ b/include/hw/virtio/vhost-user.h
> @@ -9,9 +9,17 @@
> #define HW_VIRTIO_VHOST_USER_H
>
> #include "chardev/char-fe.h"
> +#include "hw/virtio/virtio.h"
> +
> +typedef struct VhostUserHostNotifier {
> + MemoryRegion mr;
> + void *addr;
> + bool set;
> +} VhostUserHostNotifier;
>
> typedef struct VhostUserState {
> CharBackend *chr;
> + VhostUserHostNotifier notifier[VIRTIO_QUEUE_MAX];
> } VhostUserState;
>
> VhostUserState *vhost_user_init(void);
> --
> 2.11.0
next prev parent reply other threads:[~2018-04-18 16:34 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-12 15:12 [Qemu-devel] [PATCH v3 0/6] Extend vhost-user to support registering external host notifiers Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 1/6] vhost-user: add Net prefix to internal state structure Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 2/6] vhost-user: introduce shared vhost-user state Tiwei Bie
2018-05-23 13:44 ` Michael S. Tsirkin
2018-05-23 15:36 ` Michael S. Tsirkin
2018-05-23 15:43 ` Michael S. Tsirkin
2018-05-23 23:21 ` Tiwei Bie
2018-05-24 2:24 ` Tiwei Bie
2018-05-24 7:03 ` Tiwei Bie
2018-05-24 10:59 ` Tiwei Bie
2018-05-24 13:55 ` Michael S. Tsirkin
2018-05-24 14:54 ` [Qemu-devel] [virtio-dev] " Tiwei Bie
2018-05-24 14:30 ` [Qemu-devel] " Michael S. Tsirkin
2018-05-24 15:22 ` Tiwei Bie
2018-05-24 13:50 ` Michael S. Tsirkin
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 3/6] vhost-user: support receiving file descriptors in slave_read Tiwei Bie
2018-05-23 21:25 ` Michael S. Tsirkin
2018-05-23 23:12 ` [Qemu-devel] [virtio-dev] " Tiwei Bie
2018-05-24 13:48 ` Michael S. Tsirkin
2018-05-24 14:56 ` Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 4/6] virtio: support setting memory region based host notifier Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 5/6] vhost: allow backends to filter memory sections Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 6/6] vhost-user: support registering external host notifiers Tiwei Bie
2018-04-18 16:34 ` Michael S. Tsirkin [this message]
2018-04-19 11:14 ` Tiwei Bie
2018-04-19 12:43 ` Liang, Cunming
2018-04-19 13:02 ` [Qemu-devel] [virtio-dev] " Paolo Bonzini
2018-04-19 15:19 ` Michael S. Tsirkin
2018-04-19 15:51 ` Paolo Bonzini
2018-04-19 15:59 ` Michael S. Tsirkin
2018-04-19 16:07 ` Paolo Bonzini
2018-04-19 16:48 ` Michael S. Tsirkin
2018-04-19 16:24 ` Liang, Cunming
2018-04-19 16:55 ` Michael S. Tsirkin
2018-04-20 3:01 ` Liang, Cunming
2018-04-19 15:42 ` [Qemu-devel] " Michael S. Tsirkin
2018-04-19 15:52 ` Paolo Bonzini
2018-04-19 16:34 ` Michael S. Tsirkin
2018-04-19 16:52 ` Liang, Cunming
2018-04-19 16:59 ` [Qemu-devel] [virtio-dev] " Paolo Bonzini
2018-04-19 17:27 ` Michael S. Tsirkin
2018-04-19 17:35 ` Paolo Bonzini
2018-04-19 17:39 ` Michael S. Tsirkin
2018-04-19 17:00 ` [Qemu-devel] " Michael S. Tsirkin
2018-04-19 23:05 ` Liang, Cunming
2018-04-19 16:27 ` Liang, Cunming
2018-05-02 10:32 ` Tiwei Bie
2018-05-16 1:41 ` [Qemu-devel] [PATCH v3 0/6] Extend vhost-user to " Michael S. Tsirkin
2018-05-16 1:56 ` Tiwei Bie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180418192154-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=cunming.liang@intel.com \
--cc=dan.daly@intel.com \
--cc=jasowang@redhat.com \
--cc=jianfeng.tan@intel.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=tiwei.bie@intel.com \
--cc=virtio-dev@lists.oasis-open.org \
--cc=xiao.w.wang@intel.com \
--cc=zhihong.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).