qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Liang, Cunming" <cunming.liang@intel.com>
Cc: "Bie, Tiwei" <tiwei.bie@intel.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>,
	"Daly, Dan" <dan.daly@intel.com>,
	"Tan, Jianfeng" <jianfeng.tan@intel.com>,
	"Wang, Zhihong" <zhihong.wang@intel.com>,
	"Wang, Xiao W" <xiao.w.wang@intel.com>
Subject: Re: [Qemu-devel] [PATCH v3 6/6] vhost-user: support registering external host notifiers
Date: Thu, 19 Apr 2018 18:42:37 +0300	[thread overview]
Message-ID: <20180419182035-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <D0158A423229094DA7ABF71CF2FA0DA34E94E895@SHSMSX104.ccr.corp.intel.com>

On Thu, Apr 19, 2018 at 12:43:42PM +0000, Liang, Cunming wrote:
> 
> 
> > -----Original Message-----
> > From: Bie, Tiwei
> > Sent: Thursday, April 19, 2018 7:15 PM
> > To: Michael S. Tsirkin <mst@redhat.com>
> > Cc: jasowang@redhat.com; alex.williamson@redhat.com; pbonzini@redhat.com;
> > stefanha@redhat.com; qemu-devel@nongnu.org; virtio-dev@lists.oasis-
> > open.org; Liang, Cunming <cunming.liang@intel.com>; Daly, Dan
> > <dan.daly@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Wang, Zhihong
> > <zhihong.wang@intel.com>; Wang, Xiao W <xiao.w.wang@intel.com>
> > Subject: Re: [PATCH v3 6/6] vhost-user: support registering external host
> > notifiers
> > 
> > On Wed, Apr 18, 2018 at 07:34:06PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Apr 12, 2018 at 11:12:32PM +0800, Tiwei Bie wrote:
> > > > This patch introduces VHOST_USER_PROTOCOL_F_HOST_NOTIFIER.
> > > > With this feature negotiated, vhost-user backend can register memory
> > > > region based host notifiers. And it will allow the guest driver in
> > > > the VM to notify the hardware accelerator at the vhost-user backend
> > > > directly.
> > > >
> > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > >
> > > Overall I think we can merge this approach, but I have two main
> > > concerns about this:
> > >
> > > 1. Testing. Most people do not have the virtio hardware
> > >    so how to make sure this does not bit rot?
> > >
> > >    I have an idea: add an option like this to libvhost-user.
> > >    Naturally libvhost-user can not get notified about a write
> > >    to an mmapped area, but it can write a special value there
> > >    (e.g. all-ones?) and then poll it to detect VQ # writes.
> > >
> > >    Then include a vhost user bridge test with an option like this.
> > >
> > >    I'd like to see a patch doing this.
> > 
> > Sure, I'll do it. Thanks for the suggestion!
> > 
> > >
> > > 2. Memory barriers. Right now after updating the avail idx,
> > >    virtio does smp_wmb() and then the MMIO write.
> > >    Normal hardware drivers do wmb() which is an sfence.
> > >    Can a PCI device read bypass index write and see a stale
> > >    index value?
> A compiler barrier is enough on strongly-ordered memory platform. As it doesn't re-order store, PCI device won't see a stale index value. But a weakly-ordered memory needs sfence.


Oh you are right.

So it's only needed for non-intel platforms or when packets are in WC
memory then. And I don't know whether dpdk ever puts packets in WC memory.

I guess we'll cross this bridge when we get to it.


> > 
> > It depends on arch's memory model. Cunming will provide more details about
> > this later.
> > 
> > >    To make virtio pci do wmb() we would need a new feature bit.
> > >    Alternatively I guess we could maybe look at subsystem vendor/device id.
> > >
> > >    I'd like to see a patch doing one of these things.
> > 
> > We prefer to add a new feature bit as it's a more robust way to do this. I'll send
> > out some patches soon.
> > 
> > Thank you very much! :)
> > 
> > Best regards,
> > Tiwei Bie
> > 
> > >
> > > Thanks!
> > >
> > > > ---
> > > >  docs/interop/vhost-user.txt    |  33 +++++++++++
> > > >  hw/virtio/vhost-user.c         | 123
> > +++++++++++++++++++++++++++++++++++++++++
> > > >  include/hw/virtio/vhost-user.h |   8 +++
> > > >  3 files changed, 164 insertions(+)
> > > >
> > > > diff --git a/docs/interop/vhost-user.txt
> > > > b/docs/interop/vhost-user.txt index 534caab18a..9e57b36b20 100644
> > > > --- a/docs/interop/vhost-user.txt
> > > > +++ b/docs/interop/vhost-user.txt
> > > > @@ -132,6 +132,16 @@ Depending on the request type, payload can be:
> > > >     Payload: Size bytes array holding the contents of the virtio
> > > >         device's configuration space
> > > >
> > > > + * Vring area description
> > > > +   -----------------------
> > > > +   | u64 | size | offset |
> > > > +   -----------------------
> > > > +
> > > > +   u64: a 64-bit integer contains vring index and flags
> > > > +   Size: a 64-bit size of this area
> > > > +   Offset: a 64-bit offset of this area from the start of the
> > > > +       supplied file descriptor
> > > > +
> > > >  In QEMU the vhost-user message is implemented with the following struct:
> > > >
> > > >  typedef struct VhostUserMsg {
> > > > @@ -146,6 +156,7 @@ typedef struct VhostUserMsg {
> > > >          VhostUserLog log;
> > > >          struct vhost_iotlb_msg iotlb;
> > > >          VhostUserConfig config;
> > > > +        VhostUserVringArea area;
> > > >      };
> > > >  } QEMU_PACKED VhostUserMsg;
> > > >
> > > > @@ -380,6 +391,7 @@ Protocol features  #define
> > > > VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
> > > >  #define VHOST_USER_PROTOCOL_F_PAGEFAULT      8
> > > >  #define VHOST_USER_PROTOCOL_F_CONFIG         9
> > > > +#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  10
> > > >
> > > >  Master message types
> > > >  --------------------
> > > > @@ -777,6 +789,27 @@ Slave message types
> > > >       the VHOST_USER_NEED_REPLY flag, master must respond with zero when
> > > >       operation is successfully completed, or non-zero otherwise.
> > > >
> > > > + * VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG
> > > > +
> > > > +      Id: 3
> > > > +      Equivalent ioctl: N/A
> > > > +      Slave payload: vring area description
> > > > +      Master payload: N/A
> > > > +
> > > > +      Sets host notifier for a specified queue. The queue index is contained
> > > > +      in the u64 field of the vring area description. The host notifier is
> > > > +      described by the file descriptor (typically it's a VFIO device fd) which
> > > > +      is passed as ancillary data and the size (which is mmap size and should
> > > > +      be the same as host page size) and offset (which is mmap offset) carried
> > > > +      in the vring area description. QEMU can mmap the file descriptor based
> > > > +      on the size and offset to get a memory range. Registering a host notifier
> > > > +      means mapping this memory range to the VM as the specified queue's
> > notify
> > > > +      MMIO region. Slave sends this request to tell QEMU to de-register the
> > > > +      existing notifier if any and register the new notifier if the request is
> > > > +      sent with a file descriptor.
> > > > +      This request should be sent only when
> > VHOST_USER_PROTOCOL_F_HOST_NOTIFIER
> > > > +      protocol feature has been successfully negotiated.
> > > > +
> > > >  VHOST_USER_PROTOCOL_F_REPLY_ACK:
> > > >  -------------------------------
> > > >  The original vhost-user specification only demands replies for
> > > > certain diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > > index 791e0a4763..1cd9c7276b 100644
> > > > --- a/hw/virtio/vhost-user.c
> > > > +++ b/hw/virtio/vhost-user.c
> > > > @@ -13,6 +13,7 @@
> > > >  #include "hw/virtio/vhost.h"
> > > >  #include "hw/virtio/vhost-user.h"
> > > >  #include "hw/virtio/vhost-backend.h"
> > > > +#include "hw/virtio/virtio.h"
> > > >  #include "hw/virtio/virtio-net.h"
> > > >  #include "chardev/char-fe.h"
> > > >  #include "sysemu/kvm.h"
> > > > @@ -48,6 +49,7 @@ enum VhostUserProtocolFeature {
> > > >      VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
> > > >      VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
> > > >      VHOST_USER_PROTOCOL_F_CONFIG = 9,
> > > > +    VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 10,
> > > >      VHOST_USER_PROTOCOL_F_MAX
> > > >  };
> > > >
> > > > @@ -92,6 +94,7 @@ typedef enum VhostUserSlaveRequest {
> > > >      VHOST_USER_SLAVE_NONE = 0,
> > > >      VHOST_USER_SLAVE_IOTLB_MSG = 1,
> > > >      VHOST_USER_SLAVE_CONFIG_CHANGE_MSG = 2,
> > > > +    VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3,
> > > >      VHOST_USER_SLAVE_MAX
> > > >  }  VhostUserSlaveRequest;
> > > >
> > > > @@ -136,6 +139,12 @@ static VhostUserConfig c __attribute__ ((unused));
> > > >                                     + sizeof(c.size) \
> > > >                                     + sizeof(c.flags))
> > > >
> > > > +typedef struct VhostUserVringArea {
> > > > +    uint64_t u64;
> > > > +    uint64_t size;
> > > > +    uint64_t offset;
> > > > +} VhostUserVringArea;
> > > > +
> > > >  typedef struct {
> > > >      VhostUserRequest request;
> > > >
> > > > @@ -157,6 +166,7 @@ typedef union {
> > > >          struct vhost_iotlb_msg iotlb;
> > > >          VhostUserConfig config;
> > > >          VhostUserCryptoSession session;
> > > > +        VhostUserVringArea area;
> > > >  } VhostUserPayload;
> > > >
> > > >  typedef struct VhostUserMsg {
> > > > @@ -638,9 +648,37 @@ static int vhost_user_set_vring_num(struct
> > vhost_dev *dev,
> > > >      return vhost_set_vring(dev, VHOST_USER_SET_VRING_NUM, ring);  }
> > > >
> > > > +static void vhost_user_host_notifier_restore(struct vhost_dev *dev,
> > > > +                                             int queue_idx) {
> > > > +    struct vhost_user *u = dev->opaque;
> > > > +    VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
> > > > +    VirtIODevice *vdev = dev->vdev;
> > > > +
> > > > +    if (n->addr && !n->set) {
> > > > +        virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true);
> > > > +        n->set = true;
> > > > +    }
> > > > +}
> > > > +
> > > > +static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
> > > > +                                            int queue_idx) {
> > > > +    struct vhost_user *u = dev->opaque;
> > > > +    VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
> > > > +    VirtIODevice *vdev = dev->vdev;
> > > > +
> > > > +    if (n->addr && n->set) {
> > > > +        virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> > > > +        n->set = false;
> > > > +    }
> > > > +}
> > > > +
> > > >  static int vhost_user_set_vring_base(struct vhost_dev *dev,
> > > >                                       struct vhost_vring_state
> > > > *ring)  {
> > > > +    vhost_user_host_notifier_restore(dev, ring->index);
> > > > +
> > > >      return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring);
> > > > }
> > > >
> > > > @@ -674,6 +712,8 @@ static int vhost_user_get_vring_base(struct
> > vhost_dev *dev,
> > > >          .hdr.size = sizeof(msg.payload.state),
> > > >      };
> > > >
> > > > +    vhost_user_host_notifier_remove(dev, ring->index);
> > > > +
> > > >      if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> > > >          return -1;
> > > >      }
> > > > @@ -847,6 +887,76 @@ static int
> > vhost_user_slave_handle_config_change(struct vhost_dev *dev)
> > > >      return ret;
> > > >  }
> > > >
> > > > +static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev
> > *dev,
> > > > +                                                       VhostUserVringArea *area,
> > > > +                                                       int fd) {
> > > > +    int queue_idx = area->u64 & VHOST_USER_VRING_IDX_MASK;
> > > > +    size_t page_size = qemu_real_host_page_size;
> > > > +    struct vhost_user *u = dev->opaque;
> > > > +    VhostUserState *user = u->user;
> > > > +    VirtIODevice *vdev = dev->vdev;
> > > > +    VhostUserHostNotifier *n;
> > > > +    int ret = 0;
> > > > +    void *addr;
> > > > +    char *name;
> > > > +
> > > > +    if (!virtio_has_feature(dev->protocol_features,
> > > > +                            VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) ||
> > > > +        vdev == NULL || queue_idx >= virtio_get_num_queues(vdev)) {
> > > > +        ret = -1;
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    n = &user->notifier[queue_idx];
> > > > +
> > > > +    if (n->addr) {
> > > > +        virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> > > > +        object_unparent(OBJECT(&n->mr));
> > > > +        munmap(n->addr, page_size);
> > > > +        n->addr = NULL;
> > > > +    }
> > > > +
> > > > +    if (area->u64 & VHOST_USER_VRING_NOFD_MASK) {
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    /* Sanity check. */
> > > > +    if (area->size != page_size) {
> > > > +        ret = -1;
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    addr = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> > > > +                fd, area->offset);
> > > > +    if (addr == MAP_FAILED) {
> > > > +        ret = -1;
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    name = g_strdup_printf("vhost-user/host-notifier@%p mmaps[%d]",
> > > > +                           user, queue_idx);
> > > > +    memory_region_init_ram_device_ptr(&n->mr, OBJECT(vdev), name,
> > > > +                                      page_size, addr);
> > > > +    g_free(name);
> > > > +
> > > > +    if (virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true)) {
> > > > +        munmap(addr, page_size);
> > > > +        ret = -1;
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    n->addr = addr;
> > > > +    n->set = true;
> > > > +
> > > > +out:
> > > > +    /* Always close the fd. */
> > > > +    if (fd != -1) {
> > > > +        close(fd);
> > > > +    }
> > > > +    return ret;
> > > > +}
> > > > +
> > > >  static void slave_read(void *opaque)  {
> > > >      struct vhost_dev *dev = opaque; @@ -913,6 +1023,10 @@ static
> > > > void slave_read(void *opaque)
> > > >      case VHOST_USER_SLAVE_CONFIG_CHANGE_MSG :
> > > >          ret = vhost_user_slave_handle_config_change(dev);
> > > >          break;
> > > > +    case VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG:
> > > > +        ret = vhost_user_slave_handle_vring_host_notifier(dev, &payload.area,
> > > > +                                                          fd);
> > > > +        break;
> > > >      default:
> > > >          error_report("Received unexpected msg type.");
> > > >          if (fd != -1) {
> > > > @@ -1641,6 +1755,15 @@ VhostUserState *vhost_user_init(void)
> > > >
> > > >  void vhost_user_cleanup(VhostUserState *user)  {
> > > > +    int i;
> > > > +
> > > > +    for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> > > > +        if (user->notifier[i].addr) {
> > > > +            object_unparent(OBJECT(&user->notifier[i].mr));
> > > > +            munmap(user->notifier[i].addr, qemu_real_host_page_size);
> > > > +            user->notifier[i].addr = NULL;
> > > > +        }
> > > > +    }
> > > >  }
> > > >
> > > >  const VhostOps user_ops = {
> > > > diff --git a/include/hw/virtio/vhost-user.h
> > > > b/include/hw/virtio/vhost-user.h index eb8bc0d90d..fd660393a0 100644
> > > > --- a/include/hw/virtio/vhost-user.h
> > > > +++ b/include/hw/virtio/vhost-user.h
> > > > @@ -9,9 +9,17 @@
> > > >  #define HW_VIRTIO_VHOST_USER_H
> > > >
> > > >  #include "chardev/char-fe.h"
> > > > +#include "hw/virtio/virtio.h"
> > > > +
> > > > +typedef struct VhostUserHostNotifier {
> > > > +    MemoryRegion mr;
> > > > +    void *addr;
> > > > +    bool set;
> > > > +} VhostUserHostNotifier;
> > > >
> > > >  typedef struct VhostUserState {
> > > >      CharBackend *chr;
> > > > +    VhostUserHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > >  } VhostUserState;
> > > >
> > > >  VhostUserState *vhost_user_init(void);
> > > > --
> > > > 2.11.0

  parent reply	other threads:[~2018-04-19 15:42 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-12 15:12 [Qemu-devel] [PATCH v3 0/6] Extend vhost-user to support registering external host notifiers Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 1/6] vhost-user: add Net prefix to internal state structure Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 2/6] vhost-user: introduce shared vhost-user state Tiwei Bie
2018-05-23 13:44   ` Michael S. Tsirkin
2018-05-23 15:36     ` Michael S. Tsirkin
2018-05-23 15:43       ` Michael S. Tsirkin
2018-05-23 23:21         ` Tiwei Bie
2018-05-24  2:24           ` Tiwei Bie
2018-05-24  7:03             ` Tiwei Bie
2018-05-24 10:59             ` Tiwei Bie
2018-05-24 13:55               ` Michael S. Tsirkin
2018-05-24 14:54                 ` [Qemu-devel] [virtio-dev] " Tiwei Bie
2018-05-24 14:30               ` [Qemu-devel] " Michael S. Tsirkin
2018-05-24 15:22                 ` Tiwei Bie
2018-05-24 13:50             ` Michael S. Tsirkin
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 3/6] vhost-user: support receiving file descriptors in slave_read Tiwei Bie
2018-05-23 21:25   ` Michael S. Tsirkin
2018-05-23 23:12     ` [Qemu-devel] [virtio-dev] " Tiwei Bie
2018-05-24 13:48       ` Michael S. Tsirkin
2018-05-24 14:56         ` Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 4/6] virtio: support setting memory region based host notifier Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 5/6] vhost: allow backends to filter memory sections Tiwei Bie
2018-04-12 15:12 ` [Qemu-devel] [PATCH v3 6/6] vhost-user: support registering external host notifiers Tiwei Bie
2018-04-18 16:34   ` Michael S. Tsirkin
2018-04-19 11:14     ` Tiwei Bie
2018-04-19 12:43       ` Liang, Cunming
2018-04-19 13:02         ` [Qemu-devel] [virtio-dev] " Paolo Bonzini
2018-04-19 15:19           ` Michael S. Tsirkin
2018-04-19 15:51             ` Paolo Bonzini
2018-04-19 15:59               ` Michael S. Tsirkin
2018-04-19 16:07                 ` Paolo Bonzini
2018-04-19 16:48                   ` Michael S. Tsirkin
2018-04-19 16:24             ` Liang, Cunming
2018-04-19 16:55               ` Michael S. Tsirkin
2018-04-20  3:01                 ` Liang, Cunming
2018-04-19 15:42         ` Michael S. Tsirkin [this message]
2018-04-19 15:52           ` [Qemu-devel] " Paolo Bonzini
2018-04-19 16:34             ` Michael S. Tsirkin
2018-04-19 16:52             ` Liang, Cunming
2018-04-19 16:59               ` [Qemu-devel] [virtio-dev] " Paolo Bonzini
2018-04-19 17:27                 ` Michael S. Tsirkin
2018-04-19 17:35                   ` Paolo Bonzini
2018-04-19 17:39                     ` Michael S. Tsirkin
2018-04-19 17:00               ` [Qemu-devel] " Michael S. Tsirkin
2018-04-19 23:05                 ` Liang, Cunming
2018-04-19 16:27           ` Liang, Cunming
2018-05-02 10:32       ` Tiwei Bie
2018-05-16  1:41 ` [Qemu-devel] [PATCH v3 0/6] Extend vhost-user to " Michael S. Tsirkin
2018-05-16  1:56   ` Tiwei Bie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180419182035-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=cunming.liang@intel.com \
    --cc=dan.daly@intel.com \
    --cc=jasowang@redhat.com \
    --cc=jianfeng.tan@intel.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=tiwei.bie@intel.com \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=xiao.w.wang@intel.com \
    --cc=zhihong.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).