From: Yongji Xie <elohimes@gmail.com>
To: marcandre.lureau@gmail.com
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
nixun@baidu.com, qemu-devel@nongnu.org, lilin24@baidu.com,
zhangyu31@baidu.com, chaiwen@baidu.com,
Xie Yongji <xieyongji@baidu.com>
Subject: Re: [Qemu-devel] [PATCH for-4.0 2/6] vhost-user: Add shared memory to record inflight I/O
Date: Thu, 6 Dec 2018 15:22:46 +0800
Message-ID: <CAONzpcbvoSdOTssddYmr46OVO686Ruja2i+Fm60BL31y6LOw3w@mail.gmail.com>
In-Reply-To: <CAJ+F1CJtJi8GKOSFPr0M3j3ZCx2n3xR=Lmg=PAT+jJaDJkM_aw@mail.gmail.com>
On Thu, 6 Dec 2018 at 15:19, Marc-André Lureau
<marcandre.lureau@gmail.com> wrote:
>
> Hi
> On Thu, Dec 6, 2018 at 10:40 AM <elohimes@gmail.com> wrote:
> >
> > From: Xie Yongji <xieyongji@baidu.com>
> >
> > This introduces a new message, VHOST_USER_SET_VRING_INFLIGHT,
> > which offers shared memory to the backend so that it can record
> > its inflight I/O.
> >
> > With this new message, the backend is able to restart without
> > losing inflight I/O, which would otherwise cause an I/O hang on
> > the block device.
> >
> > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > Signed-off-by: Chai Wen <chaiwen@baidu.com>
> > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > ---
> > hw/virtio/vhost-user.c | 69 +++++++++++++++++++++++++++++++
> > hw/virtio/vhost.c | 8 ++++
> > include/hw/virtio/vhost-backend.h | 4 ++
> > include/hw/virtio/vhost-user.h | 8 ++++
>
> Please update docs/interop/vhost-user.txt to describe the new message
>
Will do it in v2.
Thanks,
Yongji
> > 4 files changed, 89 insertions(+)
> >
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index e09bed0e4a..4c0e64891d 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -19,6 +19,7 @@
> > #include "sysemu/kvm.h"
> > #include "qemu/error-report.h"
> > #include "qemu/sockets.h"
> > +#include "qemu/memfd.h"
> > #include "sysemu/cryptodev.h"
> > #include "migration/migration.h"
> > #include "migration/postcopy-ram.h"
> > @@ -52,6 +53,7 @@ enum VhostUserProtocolFeature {
> > VHOST_USER_PROTOCOL_F_CONFIG = 9,
> > VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD = 10,
> > VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 11,
> > + VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD = 12,
> > VHOST_USER_PROTOCOL_F_MAX
> > };
> >
> > @@ -89,6 +91,7 @@ typedef enum VhostUserRequest {
> > VHOST_USER_POSTCOPY_ADVISE = 28,
> > VHOST_USER_POSTCOPY_LISTEN = 29,
> > VHOST_USER_POSTCOPY_END = 30,
> > + VHOST_USER_SET_VRING_INFLIGHT = 31,
>
> Why VRING? It seems to be a free/arbitrary memory area.
>
> Oh, I see now that this has an explicit layout and behaviour,
> described later in "libvhost-user: Support recording inflight I/O in
> shared memory"
>
> Please update the vhost-user spec first to describe expected usage/behaviour.
>
>
> > VHOST_USER_MAX
> > } VhostUserRequest;
> >
> > @@ -147,6 +150,11 @@ typedef struct VhostUserVringArea {
> > uint64_t offset;
> > } VhostUserVringArea;
> >
> > +typedef struct VhostUserVringInflight {
> > + uint32_t size;
> > + uint32_t idx;
> > +} VhostUserVringInflight;
> > +
> > typedef struct {
> > VhostUserRequest request;
> >
> > @@ -169,6 +177,7 @@ typedef union {
> > VhostUserConfig config;
> > VhostUserCryptoSession session;
> > VhostUserVringArea area;
> > + VhostUserVringInflight inflight;
> > } VhostUserPayload;
> >
> > typedef struct VhostUserMsg {
> > @@ -1739,6 +1748,58 @@ static bool vhost_user_mem_section_filter(struct vhost_dev *dev,
> > return result;
> > }
> >
> > +static int vhost_user_set_vring_inflight(struct vhost_dev *dev, int idx)
> > +{
> > + struct vhost_user *u = dev->opaque;
> > +
> > + if (!virtio_has_feature(dev->protocol_features,
> > + VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) {
> > + return 0;
> > + }
> > +
> > + if (!u->user->inflight[idx].addr) {
> > + Error *err = NULL;
> > +
> > + u->user->inflight[idx].size = qemu_real_host_page_size;
> > + u->user->inflight[idx].addr = qemu_memfd_alloc("vhost-inflight",
> > + u->user->inflight[idx].size,
> > + F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL,
> > + &u->user->inflight[idx].fd, &err);
> > + if (err) {
> > + error_report_err(err);
> > + u->user->inflight[idx].addr = NULL;
> > + return -1;
> > + }
> > + }
> > +
> > + VhostUserMsg msg = {
> > + .hdr.request = VHOST_USER_SET_VRING_INFLIGHT,
> > + .hdr.flags = VHOST_USER_VERSION,
> > + .payload.inflight.size = u->user->inflight[idx].size,
> > + .payload.inflight.idx = idx,
> > + .hdr.size = sizeof(msg.payload.inflight),
> > + };
> > +
> > + if (vhost_user_write(dev, &msg, &u->user->inflight[idx].fd, 1) < 0) {
> > + return -1;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +void vhost_user_inflight_reset(VhostUserState *user)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> > + if (!user->inflight[i].addr) {
> > + continue;
> > + }
> > +
> > + memset(user->inflight[i].addr, 0, user->inflight[i].size);
> > + }
> > +}
> > +
> > VhostUserState *vhost_user_init(void)
> > {
> > VhostUserState *user = g_new0(struct VhostUserState, 1);
> > @@ -1756,6 +1817,13 @@ void vhost_user_cleanup(VhostUserState *user)
> > munmap(user->notifier[i].addr, qemu_real_host_page_size);
> > user->notifier[i].addr = NULL;
> > }
> > +
> > + if (user->inflight[i].addr) {
> > + munmap(user->inflight[i].addr, user->inflight[i].size);
> > + user->inflight[i].addr = NULL;
> > + close(user->inflight[i].fd);
> > + user->inflight[i].fd = -1;
> > + }
> > }
> > }
> >
> > @@ -1790,4 +1858,5 @@ const VhostOps user_ops = {
> > .vhost_crypto_create_session = vhost_user_crypto_create_session,
> > .vhost_crypto_close_session = vhost_user_crypto_close_session,
> > .vhost_backend_mem_section_filter = vhost_user_mem_section_filter,
> > + .vhost_set_vring_inflight = vhost_user_set_vring_inflight,
> > };
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 569c4053ea..2ca7b4e841 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -973,6 +973,14 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
> > return -errno;
> > }
> >
> > + if (dev->vhost_ops->vhost_set_vring_inflight) {
> > + r = dev->vhost_ops->vhost_set_vring_inflight(dev, vhost_vq_index);
> > + if (r) {
> > + VHOST_OPS_DEBUG("vhost_set_vring_inflight failed");
> > + return -errno;
> > + }
> > + }
> > +
> > state.num = virtio_queue_get_last_avail_idx(vdev, idx);
> > r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
> > if (r) {
> > diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> > index 81283ec50f..8110e09089 100644
> > --- a/include/hw/virtio/vhost-backend.h
> > +++ b/include/hw/virtio/vhost-backend.h
> > @@ -104,6 +104,9 @@ typedef int (*vhost_crypto_close_session_op)(struct vhost_dev *dev,
> > typedef bool (*vhost_backend_mem_section_filter_op)(struct vhost_dev *dev,
> > MemoryRegionSection *section);
> >
> > +typedef int (*vhost_set_vring_inflight_op)(struct vhost_dev *dev,
> > + int idx);
> > +
> > typedef struct VhostOps {
> > VhostBackendType backend_type;
> > vhost_backend_init vhost_backend_init;
> > @@ -142,6 +145,7 @@ typedef struct VhostOps {
> > vhost_crypto_create_session_op vhost_crypto_create_session;
> > vhost_crypto_close_session_op vhost_crypto_close_session;
> > vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter;
> > + vhost_set_vring_inflight_op vhost_set_vring_inflight;
> > } VhostOps;
> >
> > extern const VhostOps user_ops;
> > diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
> > index fd660393a0..ff13433153 100644
> > --- a/include/hw/virtio/vhost-user.h
> > +++ b/include/hw/virtio/vhost-user.h
> > @@ -17,11 +17,19 @@ typedef struct VhostUserHostNotifier {
> > bool set;
> > } VhostUserHostNotifier;
> >
> > +typedef struct VhostUserInflight {
> > + void *addr;
> > + uint32_t size;
> > + int fd;
> > +} VhostUserInflight;
> > +
> > typedef struct VhostUserState {
> > CharBackend *chr;
> > VhostUserHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > + VhostUserInflight inflight[VIRTIO_QUEUE_MAX];
> > } VhostUserState;
> >
> > +void vhost_user_inflight_reset(VhostUserState *user);
> > VhostUserState *vhost_user_init(void);
> > void vhost_user_cleanup(VhostUserState *user);
> >
> > --
> > 2.17.1
> >
> >
>
>
> --
> Marc-André Lureau
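
For readers following the thread, here is a minimal, hedged sketch of the mechanism the patch relies on: a page-sized, sealed memfd whose contents outlive any single mapping, so a restarted backend that is handed the same fd sees what the previous instance recorded. It uses plain Linux calls (memfd_create, fcntl, mmap) rather than QEMU's qemu_memfd_alloc(), and the helper names inflight_alloc() and backend_roundtrip() are hypothetical and do not appear in the patch. The actual region layout is defined by the libvhost-user patch in this series and is not modelled here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): create a memfd of `size` bytes,
 * seal its size, and map it shared. Returns the mapping or NULL on failure;
 * on success the fd is stored in *fd_out.
 */
static void *inflight_alloc(size_t size, int *fd_out)
{
    int fd = memfd_create("vhost-inflight", MFD_ALLOW_SEALING | MFD_CLOEXEC);
    if (fd < 0) {
        return NULL;
    }

    /* Fix the size first, then forbid growing, shrinking and further sealing. */
    if (ftruncate(fd, (off_t)size) < 0 ||
        fcntl(fd, F_ADD_SEALS,
              F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL) < 0) {
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return addr;
}

/*
 * Hypothetical stand-in for a backend that records state, goes away, and
 * comes back: map the fd, write a marker byte, unmap, then map again and
 * read the byte. Returns the byte seen after the "restart", or -1 on error.
 */
static int backend_roundtrip(int fd, size_t size, uint8_t marker)
{
    uint8_t *first = mmap(NULL, size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
    if (first == MAP_FAILED) {
        return -1;
    }
    first[0] = marker;          /* record "inflight" state */
    munmap(first, size);        /* the backend's mapping disappears */

    uint8_t *second = mmap(NULL, size, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
    if (second == MAP_FAILED) {
        return -1;
    }
    int seen = second[0];       /* the state is still in the memfd */
    munmap(second, size);
    return seen;
}
```

In this sketch the process that calls inflight_alloc() plays the role of QEMU: it keeps both the fd and its own mapping, which is why vhost_user_inflight_reset() in the patch can clear the region with a memset on the QEMU side. The size seals mean a peer that receives the fd cannot resize the region underneath the other side.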