* [PATCH 1/2] Reduce vdpa initialization / startup overhead @ 2023-04-18 22:56 peili.dev 2023-04-18 22:56 ` [PATCH 2/2] " peili.dev ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: peili.dev @ 2023-04-18 22:56 UTC (permalink / raw) To: qemu-devel; +Cc: eperezma, Pei Li From: Pei Li <peili.dev@gmail.com> Currently, part of the vdpa initialization / startup process needs to trigger many ioctls per vq, which is very inefficient and causes unnecessary context switches between user mode and kernel mode. This patch creates an additional ioctl() command, namely VHOST_VDPA_SET_VRING_ENABLE_BATCH, that batches commands of VHOST_VDPA_SET_VRING_ENABLE into a single ioctl() call. Signed-off-by: Pei Li <peili.dev@gmail.com> --- hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- include/standard-headers/linux/vhost_types.h | 3 ++ linux-headers/linux/vhost.h | 7 +++++ 3 files changed, 33 insertions(+), 8 deletions(-) diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index bc6bad23d5..6d45ff8539 100644 --- a/hw/virtio/vhost-vdpa.c +++ b/hw/virtio/vhost-vdpa.c @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | - 0x1ULL << VHOST_BACKEND_F_SUSPEND; + 0x1ULL << VHOST_BACKEND_F_SUSPEND | + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; int r; if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) { - int i; + int i, nvqs = dev->nvqs; + uint64_t backend_features = dev->backend_cap; + trace_vhost_vdpa_set_vring_ready(dev); - for (i = 0; i < dev->nvqs; ++i) { - struct vhost_vring_state state = { - .index = dev->vq_index + i, - .num = 1, - }; - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); + + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { + for (i = 0; i < nvqs; ++i) { + struct vhost_vring_state state = { + .index = dev->vq_index + i, + .num = 1, + }; + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); + } + } else { + struct vhost_vring_state states[nvqs + 1]; + states[0].num = nvqs; + for (i = 1; i <= nvqs; ++i) { + states[i].index = dev->vq_index + i - 1; + states[i].num = 1; + } + + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); } return 0; } diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h index c41a73fe36..068d0e1ceb 100644 --- a/include/standard-headers/linux/vhost_types.h +++ b/include/standard-headers/linux/vhost_types.h @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { /* Device can be suspended */ #define VHOST_BACKEND_F_SUSPEND 0x4 +/* IOCTL requests can be batched */ +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 + #endif diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h index f9f115a7c7..4c9ddd0a0e 100644 --- a/linux-headers/linux/vhost.h +++ b/linux-headers/linux/vhost.h @@ -180,4 +180,11 @@ */ #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE + * + * Enable/disable the ring while batching the commands. + */ +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ + struct vhost_vring_state) + #endif -- 2.25.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
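For reference, the calling convention introduced by this patch can be read directly from the hunk above: element 0 of the vhost_vring_state array carries the number of entries in its num field, and elements 1..nvqs each carry one index/num pair. A minimal userspace sketch of a caller, assuming only the VHOST_VDPA_SET_VRING_ENABLE_BATCH definition added by this series (the fd and error handling are illustrative, not part of the patch):

    /* Sketch: enable nvqs rings starting at vq_index with one ioctl(). */
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <linux/vhost.h>   /* assumes the header change from this series */

    static int enable_rings_batched(int vdpa_fd, unsigned vq_index, unsigned nvqs)
    {
        struct vhost_vring_state *states;
        int r;

        states = calloc(nvqs + 1, sizeof(*states));
        if (!states) {
            return -1;
        }
        states[0].num = nvqs;                    /* element 0: number of entries */
        for (unsigned i = 1; i <= nvqs; ++i) {
            states[i].index = vq_index + i - 1;  /* virtqueue to enable */
            states[i].num = 1;                   /* 1 == enabled */
        }
        r = ioctl(vdpa_fd, VHOST_VDPA_SET_VRING_ENABLE_BATCH, states);
        free(states);
        return r;
    }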
* [PATCH 2/2] Reduce vdpa initialization / startup overhead 2023-04-18 22:56 [PATCH 1/2] Reduce vdpa initialization / startup overhead peili.dev @ 2023-04-18 22:56 ` peili.dev 2023-04-19 15:33 ` [PATCH 1/2] " Eugenio Perez Martin 2023-07-18 10:53 ` [PATCH 1/2] Reduce vdpa initialization / startup overhead Michael S. Tsirkin 2 siblings, 0 replies; 13+ messages in thread From: peili.dev @ 2023-04-18 22:56 UTC (permalink / raw) To: qemu-devel; +Cc: eperezma, Pei Li From: Pei Li <peili.dev@gmail.com> Currently, part of the vdpa initialization / startup process needs to trigger many ioctls per vq, which is very inefficient and causes unnecessary context switches between user mode and kernel mode. This patch creates an additional ioctl() command, namely VHOST_VDPA_GET_VRING_GROUP_BATCH, that batches commands of VHOST_VDPA_GET_VRING_GROUP into a single ioctl() call. Signed-off-by: Pei Li <peili.dev@gmail.com> --- linux-headers/linux/vhost.h | 10 ++++++ net/vhost-vdpa.c | 70 +++++++++++++++++++++++++++++++------ 2 files changed, 70 insertions(+), 10 deletions(-) diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h index 4c9ddd0a0e..f7cfa324c4 100644 --- a/linux-headers/linux/vhost.h +++ b/linux-headers/linux/vhost.h @@ -187,4 +187,14 @@ #define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ struct vhost_vring_state) +/* Batch version of VHOST_VDPA_GET_VRING_GROUP + * + * Get the group for a virtqueue: read index, write group in num. + * The virtqueue index is stored in the index field of + * vhost_vring_state. The group for this specific virtqueue is + * returned via num field of vhost_vring_state while batching commands. + */ +#define VHOST_VDPA_GET_VRING_GROUP_BATCH _IOWR(VHOST_VIRTIO, 0x82, \ + struct vhost_vring_state) + #endif diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c index 99904a0da7..ed4f2d5c49 100644 --- a/net/vhost-vdpa.c +++ b/net/vhost-vdpa.c @@ -377,6 +377,47 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index) return state.num; } +static int64_t vhost_vdpa_get_vring_group_batch(int device_fd, unsigned vq_index) +{ + int r; + struct vhost_vring_state states[vq_index + 1]; + int64_t cvq_group; + + states[0].num = vq_index; + + for (int i = 1; i <= vq_index; ++i) { + states[i].index = i - 1; + } + + r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP_BATCH, &states[0]); + + if (unlikely(r < 0)) { + error_report("Cannot get VQ %d group: %s", vq_index - 1, + g_strerror(errno)); + return r; + } + + cvq_group = states[vq_index].num; + + if (unlikely(cvq_group < 0)) { + return cvq_group; + } + + for (int i = 1; i < vq_index; ++i) { + int64_t group = states[i].num; + + if (unlikely(group < 0)) { + return group; + } + + if (group == cvq_group) { + return 0; + } + } + + return vq_index; +} + static int vhost_vdpa_set_address_space_id(struct vhost_vdpa *v, unsigned vq_group, unsigned asid_num) @@ -512,19 +553,28 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc) * than the last vq. VQ group of last group passed in cvq_group. */ cvq_index = v->dev->vq_index_end - 1; - cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index); - if (unlikely(cvq_group < 0)) { - return cvq_group; - } - for (int i = 0; i < cvq_index; ++i) { - int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i); - if (unlikely(group < 0)) { - return group; + if (! 
(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { + cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index); + if (unlikely(cvq_group < 0)) { + return cvq_group; } + for (int i = 0; i < cvq_index; ++i) { + int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i); - if (group == cvq_group) { - return 0; + if (unlikely(group < 0)) { + return group; + } + + if (group == cvq_group) { + return 0; + } + } + } else { + cvq_group = vhost_vdpa_get_vring_group_batch(v->device_fd, cvq_index + 1); + + if (unlikely(cvq_group <= 0)) { + return cvq_group; } } -- 2.25.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-04-18 22:56 [PATCH 1/2] Reduce vdpa initialization / startup overhead peili.dev 2023-04-18 22:56 ` [PATCH 2/2] " peili.dev @ 2023-04-19 15:33 ` Eugenio Perez Martin 2023-04-20 4:18 ` Jason Wang 2023-07-18 10:53 ` [PATCH 1/2] Reduce vdpa initialization / startup overhead Michael S. Tsirkin 2 siblings, 1 reply; 13+ messages in thread From: Eugenio Perez Martin @ 2023-04-19 15:33 UTC (permalink / raw) To: peili.dev; +Cc: qemu-devel, Jason Wang, Michael Tsirkin On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: > > From: Pei Li <peili.dev@gmail.com> > > Currently, part of the vdpa initialization / startup process > needs to trigger many ioctls per vq, which is very inefficient > and causing unnecessary context switch between user mode and > kernel mode. > > This patch creates an additional ioctl() command, namely > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > commands of VHOST_VDPA_GET_VRING_GROUP into a single > ioctl() call. > It seems to me you forgot to send the 0/2 cover letter :). Please include the time we save thanks to avoiding the repetitive ioctls in each patch. CCing Jason and Michael. > Signed-off-by: Pei Li <peili.dev@gmail.com> > --- > hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- > include/standard-headers/linux/vhost_types.h | 3 ++ > linux-headers/linux/vhost.h | 7 +++++ > 3 files changed, 33 insertions(+), 8 deletions(-) > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > index bc6bad23d5..6d45ff8539 100644 > --- a/hw/virtio/vhost-vdpa.c > +++ b/hw/virtio/vhost-vdpa.c > @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) > uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | > 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | > 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | > - 0x1ULL << VHOST_BACKEND_F_SUSPEND; > + 0x1ULL << VHOST_BACKEND_F_SUSPEND | > + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; > int r; > > if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { > @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) > > static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) > { > - int i; > + int i, nvqs = dev->nvqs; > + uint64_t backend_features = dev->backend_cap; > + > trace_vhost_vdpa_set_vring_ready(dev); > - for (i = 0; i < dev->nvqs; ++i) { > - struct vhost_vring_state state = { > - .index = dev->vq_index + i, > - .num = 1, > - }; > - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > + > + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { > + for (i = 0; i < nvqs; ++i) { > + struct vhost_vring_state state = { > + .index = dev->vq_index + i, > + .num = 1, > + }; > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > + } > + } else { > + struct vhost_vring_state states[nvqs + 1]; > + states[0].num = nvqs; > + for (i = 1; i <= nvqs; ++i) { > + states[i].index = dev->vq_index + i - 1; > + states[i].num = 1; > + } > + I think it's more clear to share the array of vhost_vring_state states[nvqs + 1], and then fill it either in one shot with VHOST_VDPA_SET_VRING_ENABLE_BATCH or individually with VHOST_VDPA_SET_VRING_ENABLE. The same feedback goes for VHOST_VDPA_GET_VRING_GROUP_BATCH in the next patch. > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); > } > return 0; This comment is previous to your patch but I just realized we don't check the return value of vhost_vdpa_call. Maybe it is worth considering addressing that in this series too. 
> } > diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h > index c41a73fe36..068d0e1ceb 100644 > --- a/include/standard-headers/linux/vhost_types.h > +++ b/include/standard-headers/linux/vhost_types.h > @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { > /* Device can be suspended */ > #define VHOST_BACKEND_F_SUSPEND 0x4 > > +/* IOCTL requests can be batched */ > +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 > + > #endif > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h > index f9f115a7c7..4c9ddd0a0e 100644 > --- a/linux-headers/linux/vhost.h > +++ b/linux-headers/linux/vhost.h > @@ -180,4 +180,11 @@ > */ > #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) > > +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE > + * > + * Enable/disable the ring while batching the commands. > + */ > +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ > + struct vhost_vring_state) > + These headers should be updated with update-linux-headers.sh. To be done that way we need the changes merged in the kernel before. We can signal that the series is not ready for inclusion with the subject [RFC. PATCH], like [1], and note it in the commit message. That way, you can keep updating the header manually for demonstration purposes. Thanks! [1] https://lore.kernel.org/qemu-devel/20220413163206.1958254-23-eperezma@redhat.com/ ^ permalink raw reply [flat|nested] 13+ messages in thread
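One way to read the suggestion above is to build the states[] array once and only vary the dispatch; a rough sketch of how vhost_vdpa_set_vring_ready() could look under that reading (this is an interpretation of the review comment, not code from the series, and it also propagates the return value as suggested at the end):

    static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev)
    {
        int i, r, nvqs = dev->nvqs;
        struct vhost_vring_state states[nvqs + 1];

        trace_vhost_vdpa_set_vring_ready(dev);
        /* Shared layout: states[0] holds the count, states[1..nvqs] one entry per vq. */
        states[0].num = nvqs;
        for (i = 1; i <= nvqs; ++i) {
            states[i].index = dev->vq_index + i - 1;
            states[i].num = 1;
        }

        if (dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH)) {
            return vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, states);
        }
        for (i = 1; i <= nvqs; ++i) {
            r = vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &states[i]);
            if (unlikely(r < 0)) {
                return r;
            }
        }
        return 0;
    }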
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-04-19 15:33 ` [PATCH 1/2] " Eugenio Perez Martin @ 2023-04-20 4:18 ` Jason Wang 2023-04-20 5:25 ` Pei Li 2023-07-18 10:32 ` vdpa: use io_uring passthrough command for IOCTLs [was Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead] Stefano Garzarella 0 siblings, 2 replies; 13+ messages in thread From: Jason Wang @ 2023-04-20 4:18 UTC (permalink / raw) To: Eugenio Perez Martin; +Cc: peili.dev, qemu-devel, Michael Tsirkin On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin <eperezma@redhat.com> wrote: > > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: > > > > From: Pei Li <peili.dev@gmail.com> > > > > Currently, part of the vdpa initialization / startup process > > needs to trigger many ioctls per vq, which is very inefficient > > and causing unnecessary context switch between user mode and > > kernel mode. > > > > This patch creates an additional ioctl() command, namely > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > > commands of VHOST_VDPA_GET_VRING_GROUP into a single > > ioctl() call. I'd expect there's a kernel patch but I didn't see that? If we want to go this way. Why simply have a more generic way, that is introducing something like: VHOST_CMD_BATCH which did something like struct vhost_cmd_batch { int ncmds; struct vhost_ioctls[]; }; Then you can batch other ioctls other than GET_VRING_GROUP? Thanks > > > > It seems to me you forgot to send the 0/2 cover letter :). > > Please include the time we save thanks to avoiding the repetitive > ioctls in each patch. > > CCing Jason and Michael. > > > Signed-off-by: Pei Li <peili.dev@gmail.com> > > --- > > hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- > > include/standard-headers/linux/vhost_types.h | 3 ++ > > linux-headers/linux/vhost.h | 7 +++++ > > 3 files changed, 33 insertions(+), 8 deletions(-) > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > index bc6bad23d5..6d45ff8539 100644 > > --- a/hw/virtio/vhost-vdpa.c > > +++ b/hw/virtio/vhost-vdpa.c > > @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) > > uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | > > 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | > > 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | > > - 0x1ULL << VHOST_BACKEND_F_SUSPEND; > > + 0x1ULL << VHOST_BACKEND_F_SUSPEND | > > + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; > > int r; > > > > if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { > > @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) > > > > static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) > > { > > - int i; > > + int i, nvqs = dev->nvqs; > > + uint64_t backend_features = dev->backend_cap; > > + > > trace_vhost_vdpa_set_vring_ready(dev); > > - for (i = 0; i < dev->nvqs; ++i) { > > - struct vhost_vring_state state = { > > - .index = dev->vq_index + i, > > - .num = 1, > > - }; > > - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > > + > > + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { > > + for (i = 0; i < nvqs; ++i) { > > + struct vhost_vring_state state = { > > + .index = dev->vq_index + i, > > + .num = 1, > > + }; > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > > + } > > + } else { > > + struct vhost_vring_state states[nvqs + 1]; > > + states[0].num = nvqs; > > + for (i = 1; i <= nvqs; ++i) { > > + states[i].index = dev->vq_index + i - 1; > > + states[i].num = 1; > > + } > > + > > I think it's more clear 
to share the array of vhost_vring_state > states[nvqs + 1], and then fill it either in one shot with > VHOST_VDPA_SET_VRING_ENABLE_BATCH or individually with > VHOST_VDPA_SET_VRING_ENABLE. > > The same feedback goes for VHOST_VDPA_GET_VRING_GROUP_BATCH in the next patch. > > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); > > } > > return 0; > > This comment is previous to your patch but I just realized we don't > check the return value of vhost_vdpa_call. Maybe it is worth > considering addressing that in this series too. > > > } > > diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h > > index c41a73fe36..068d0e1ceb 100644 > > --- a/include/standard-headers/linux/vhost_types.h > > +++ b/include/standard-headers/linux/vhost_types.h > > @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { > > /* Device can be suspended */ > > #define VHOST_BACKEND_F_SUSPEND 0x4 > > > > +/* IOCTL requests can be batched */ > > +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 > > + > > #endif > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h > > index f9f115a7c7..4c9ddd0a0e 100644 > > --- a/linux-headers/linux/vhost.h > > +++ b/linux-headers/linux/vhost.h > > @@ -180,4 +180,11 @@ > > */ > > #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) > > > > +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE > > + * > > + * Enable/disable the ring while batching the commands. > > + */ > > +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ > > + struct vhost_vring_state) > > + > > These headers should be updated with update-linux-headers.sh. To be > done that way we need the changes merged in the kernel before. > > We can signal that the series is not ready for inclusion with the > subject [RFC. PATCH], like [1], and note it in the commit message. > That way, you can keep updating the header manually for demonstration > purposes. > > Thanks! > > [1] https://lore.kernel.org/qemu-devel/20220413163206.1958254-23-eperezma@redhat.com/ > ^ permalink raw reply [flat|nested] 13+ messages in thread
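The generic interface sketched above would need each entry to name the ioctl it stands in for plus its argument; one possible shape, with all field names and the fixed-size payload chosen here purely for illustration (nothing below exists in the kernel UAPI today):

    /* Hypothetical layout for a generic VHOST_CMD_BATCH payload. */
    struct vhost_batched_ioctl {
        unsigned int cmd;                /* e.g. VHOST_VDPA_SET_VRING_ENABLE */
        struct vhost_vring_state state;  /* argument for that command */
    };

    struct vhost_cmd_batch {
        unsigned int ncmds;                  /* number of entries that follow */
        struct vhost_batched_ioctl cmds[];   /* one entry per batched ioctl */
    };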
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-04-20 4:18 ` Jason Wang @ 2023-04-20 5:25 ` Pei Li 2023-04-20 8:14 ` Jason Wang 2023-04-20 8:59 ` Eugenio Perez Martin 2023-07-18 10:32 ` vdpa: use io_uring passthrough command for IOCTLs [was Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead] Stefano Garzarella 1 sibling, 2 replies; 13+ messages in thread From: Pei Li @ 2023-04-20 5:25 UTC (permalink / raw) To: Jason Wang; +Cc: Eugenio Perez Martin, peili.dev, qemu-devel, Michael Tsirkin [-- Attachment #1: Type: text/plain, Size: 6685 bytes --] Hi all, My bad, I just submitted the kernel patch. If we are passing some generic command, still we have to add an additional field in the structure to indicate what is the unbatched version of this command, and the struct vhost_ioctls would be some command specific structure. In summary, the structure would be something like struct vhost_cmd_batch { int ncmds; int cmd; struct vhost_ioctls[]; }; This is doable. Also, this is my first time submitting patches to open source, sorry about the mess in advance. That being said, feel free to throw questions / comments! Thanks and best regards, Pei On Wed, Apr 19, 2023 at 9:19 PM Jason Wang <jasowang@redhat.com> wrote: > On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin > <eperezma@redhat.com> wrote: > > > > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: > > > > > > From: Pei Li <peili.dev@gmail.com> > > > > > > Currently, part of the vdpa initialization / startup process > > > needs to trigger many ioctls per vq, which is very inefficient > > > and causing unnecessary context switch between user mode and > > > kernel mode. > > > > > > This patch creates an additional ioctl() command, namely > > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > > > commands of VHOST_VDPA_GET_VRING_GROUP into a single > > > ioctl() call. > > I'd expect there's a kernel patch but I didn't see that? > > If we want to go this way. Why simply have a more generic way, that is > introducing something like: > > VHOST_CMD_BATCH which did something like > > struct vhost_cmd_batch { > int ncmds; > struct vhost_ioctls[]; > }; > > Then you can batch other ioctls other than GET_VRING_GROUP? > > Thanks > > > > > > > > It seems to me you forgot to send the 0/2 cover letter :). > > > > Please include the time we save thanks to avoiding the repetitive > > ioctls in each patch. > > > > CCing Jason and Michael. 
> > > > > Signed-off-by: Pei Li <peili.dev@gmail.com> > > > --- > > > hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- > > > include/standard-headers/linux/vhost_types.h | 3 ++ > > > linux-headers/linux/vhost.h | 7 +++++ > > > 3 files changed, 33 insertions(+), 8 deletions(-) > > > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > > index bc6bad23d5..6d45ff8539 100644 > > > --- a/hw/virtio/vhost-vdpa.c > > > +++ b/hw/virtio/vhost-vdpa.c > > > @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct > vhost_dev *dev) > > > uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | > > > 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | > > > 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | > > > - 0x1ULL << VHOST_BACKEND_F_SUSPEND; > > > + 0x1ULL << VHOST_BACKEND_F_SUSPEND | > > > + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; > > > int r; > > > > > > if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { > > > @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct > vhost_dev *dev, int idx) > > > > > > static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) > > > { > > > - int i; > > > + int i, nvqs = dev->nvqs; > > > + uint64_t backend_features = dev->backend_cap; > > > + > > > trace_vhost_vdpa_set_vring_ready(dev); > > > - for (i = 0; i < dev->nvqs; ++i) { > > > - struct vhost_vring_state state = { > > > - .index = dev->vq_index + i, > > > - .num = 1, > > > - }; > > > - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > > > + > > > + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { > > > + for (i = 0; i < nvqs; ++i) { > > > + struct vhost_vring_state state = { > > > + .index = dev->vq_index + i, > > > + .num = 1, > > > + }; > > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > > > + } > > > + } else { > > > + struct vhost_vring_state states[nvqs + 1]; > > > + states[0].num = nvqs; > > > + for (i = 1; i <= nvqs; ++i) { > > > + states[i].index = dev->vq_index + i - 1; > > > + states[i].num = 1; > > > + } > > > + > > > > I think it's more clear to share the array of vhost_vring_state > > states[nvqs + 1], and then fill it either in one shot with > > VHOST_VDPA_SET_VRING_ENABLE_BATCH or individually with > > VHOST_VDPA_SET_VRING_ENABLE. > > > > The same feedback goes for VHOST_VDPA_GET_VRING_GROUP_BATCH in the next > patch. > > > > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, > &states[0]); > > > } > > > return 0; > > > > This comment is previous to your patch but I just realized we don't > > check the return value of vhost_vdpa_call. Maybe it is worth > > considering addressing that in this series too. 
> > > > > } > > > diff --git a/include/standard-headers/linux/vhost_types.h > b/include/standard-headers/linux/vhost_types.h > > > index c41a73fe36..068d0e1ceb 100644 > > > --- a/include/standard-headers/linux/vhost_types.h > > > +++ b/include/standard-headers/linux/vhost_types.h > > > @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { > > > /* Device can be suspended */ > > > #define VHOST_BACKEND_F_SUSPEND 0x4 > > > > > > +/* IOCTL requests can be batched */ > > > +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 > > > + > > > #endif > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h > > > index f9f115a7c7..4c9ddd0a0e 100644 > > > --- a/linux-headers/linux/vhost.h > > > +++ b/linux-headers/linux/vhost.h > > > @@ -180,4 +180,11 @@ > > > */ > > > #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) > > > > > > +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE > > > + * > > > + * Enable/disable the ring while batching the commands. > > > + */ > > > +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, > 0x7F, \ > > > + struct vhost_vring_state) > > > + > > > > These headers should be updated with update-linux-headers.sh. To be > > done that way we need the changes merged in the kernel before. > > > > We can signal that the series is not ready for inclusion with the > > subject [RFC. PATCH], like [1], and note it in the commit message. > > That way, you can keep updating the header manually for demonstration > > purposes. > > > > Thanks! > > > > [1] > https://lore.kernel.org/qemu-devel/20220413163206.1958254-23-eperezma@redhat.com/ > > > > > [-- Attachment #2: Type: text/html, Size: 8770 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-04-20 5:25 ` Pei Li @ 2023-04-20 8:14 ` Jason Wang 2023-04-20 8:59 ` Eugenio Perez Martin 1 sibling, 0 replies; 13+ messages in thread From: Jason Wang @ 2023-04-20 8:14 UTC (permalink / raw) To: Pei Li; +Cc: Eugenio Perez Martin, peili.dev, qemu-devel, Michael Tsirkin On Thu, Apr 20, 2023 at 1:25 PM Pei Li <peili@andrew.cmu.edu> wrote: > > Hi all, > > My bad, I just submitted the kernel patch. Please cc maintainers. You can get it via scripts/get_maintainer.pl > If we are passing some generic command, still we have to add an additional field in the structure to indicate what is the unbatched version of this command, Right. > and the struct vhost_ioctls would be some command specific structure. In summary, the structure would be something like > struct vhost_cmd_batch { > int ncmds; > int cmd; > struct vhost_ioctls[]; > }; > > This is doable. Let's try that? > Also, this is my first time submitting patches to open source, sorry about the mess in advance. That being said, feel free to throw questions / comments! You can get more instructions on how to contribute patches through Documentation/process/submitting-patches.rst. Happy hacking! Thanks > > Thanks and best regards, > Pei > > On Wed, Apr 19, 2023 at 9:19 PM Jason Wang <jasowang@redhat.com> wrote: >> >> On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin >> <eperezma@redhat.com> wrote: >> > >> > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: >> > > >> > > From: Pei Li <peili.dev@gmail.com> >> > > >> > > Currently, part of the vdpa initialization / startup process >> > > needs to trigger many ioctls per vq, which is very inefficient >> > > and causing unnecessary context switch between user mode and >> > > kernel mode. >> > > >> > > This patch creates an additional ioctl() command, namely >> > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching >> > > commands of VHOST_VDPA_GET_VRING_GROUP into a single >> > > ioctl() call. >> >> I'd expect there's a kernel patch but I didn't see that? >> >> If we want to go this way. Why simply have a more generic way, that is >> introducing something like: >> >> VHOST_CMD_BATCH which did something like >> >> struct vhost_cmd_batch { >> int ncmds; >> struct vhost_ioctls[]; >> }; >> >> Then you can batch other ioctls other than GET_VRING_GROUP? >> >> Thanks >> >> > > >> > >> > It seems to me you forgot to send the 0/2 cover letter :). >> > >> > Please include the time we save thanks to avoiding the repetitive >> > ioctls in each patch. >> > >> > CCing Jason and Michael. 
>> > >> > > Signed-off-by: Pei Li <peili.dev@gmail.com> >> > > --- >> > > hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- >> > > include/standard-headers/linux/vhost_types.h | 3 ++ >> > > linux-headers/linux/vhost.h | 7 +++++ >> > > 3 files changed, 33 insertions(+), 8 deletions(-) >> > > >> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c >> > > index bc6bad23d5..6d45ff8539 100644 >> > > --- a/hw/virtio/vhost-vdpa.c >> > > +++ b/hw/virtio/vhost-vdpa.c >> > > @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) >> > > uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | >> > > 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | >> > > 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | >> > > - 0x1ULL << VHOST_BACKEND_F_SUSPEND; >> > > + 0x1ULL << VHOST_BACKEND_F_SUSPEND | >> > > + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; >> > > int r; >> > > >> > > if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { >> > > @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) >> > > >> > > static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) >> > > { >> > > - int i; >> > > + int i, nvqs = dev->nvqs; >> > > + uint64_t backend_features = dev->backend_cap; >> > > + >> > > trace_vhost_vdpa_set_vring_ready(dev); >> > > - for (i = 0; i < dev->nvqs; ++i) { >> > > - struct vhost_vring_state state = { >> > > - .index = dev->vq_index + i, >> > > - .num = 1, >> > > - }; >> > > - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); >> > > + >> > > + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { >> > > + for (i = 0; i < nvqs; ++i) { >> > > + struct vhost_vring_state state = { >> > > + .index = dev->vq_index + i, >> > > + .num = 1, >> > > + }; >> > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); >> > > + } >> > > + } else { >> > > + struct vhost_vring_state states[nvqs + 1]; >> > > + states[0].num = nvqs; >> > > + for (i = 1; i <= nvqs; ++i) { >> > > + states[i].index = dev->vq_index + i - 1; >> > > + states[i].num = 1; >> > > + } >> > > + >> > >> > I think it's more clear to share the array of vhost_vring_state >> > states[nvqs + 1], and then fill it either in one shot with >> > VHOST_VDPA_SET_VRING_ENABLE_BATCH or individually with >> > VHOST_VDPA_SET_VRING_ENABLE. >> > >> > The same feedback goes for VHOST_VDPA_GET_VRING_GROUP_BATCH in the next patch. >> > >> > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); >> > > } >> > > return 0; >> > >> > This comment is previous to your patch but I just realized we don't >> > check the return value of vhost_vdpa_call. Maybe it is worth >> > considering addressing that in this series too. 
>> > >> > > } >> > > diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h >> > > index c41a73fe36..068d0e1ceb 100644 >> > > --- a/include/standard-headers/linux/vhost_types.h >> > > +++ b/include/standard-headers/linux/vhost_types.h >> > > @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { >> > > /* Device can be suspended */ >> > > #define VHOST_BACKEND_F_SUSPEND 0x4 >> > > >> > > +/* IOCTL requests can be batched */ >> > > +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 >> > > + >> > > #endif >> > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h >> > > index f9f115a7c7..4c9ddd0a0e 100644 >> > > --- a/linux-headers/linux/vhost.h >> > > +++ b/linux-headers/linux/vhost.h >> > > @@ -180,4 +180,11 @@ >> > > */ >> > > #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) >> > > >> > > +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE >> > > + * >> > > + * Enable/disable the ring while batching the commands. >> > > + */ >> > > +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ >> > > + struct vhost_vring_state) >> > > + >> > >> > These headers should be updated with update-linux-headers.sh. To be >> > done that way we need the changes merged in the kernel before. >> > >> > We can signal that the series is not ready for inclusion with the >> > subject [RFC. PATCH], like [1], and note it in the commit message. >> > That way, you can keep updating the header manually for demonstration >> > purposes. >> > >> > Thanks! >> > >> > [1] https://lore.kernel.org/qemu-devel/20220413163206.1958254-23-eperezma@redhat.com/ >> > >> >> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-04-20 5:25 ` Pei Li 2023-04-20 8:14 ` Jason Wang @ 2023-04-20 8:59 ` Eugenio Perez Martin 2023-07-18 10:55 ` Michael S. Tsirkin 1 sibling, 1 reply; 13+ messages in thread From: Eugenio Perez Martin @ 2023-04-20 8:59 UTC (permalink / raw) To: Pei Li; +Cc: Jason Wang, peili.dev, qemu-devel, Michael Tsirkin, Si-Wei Liu On Thu, Apr 20, 2023 at 7:25 AM Pei Li <peili@andrew.cmu.edu> wrote: > > Hi all, > > My bad, I just submitted the kernel patch. If we are passing some generic command, still we have to add an additional field in the structure to indicate what is the unbatched version of this command, and the struct vhost_ioctls would be some command specific structure. In summary, the structure would be something like > struct vhost_cmd_batch { > int ncmds; > int cmd; The unbatched version should go in each vhost_ioctls. That allows us to send many different commands in one ioctl instead of having to resort to many ioctls, each one for a different task. The problem with that is the size of that struct vhost_ioctl, so we can build an array. I think it should be enough with the biggest of them (vhost_vring_addr ?) for a long time, but I would like to know if anybody finds a drawback here. We could always resort to pointers if we find we need more space, or start with them from the beginning. CCing Si-Wei here too, as he is also interested in reducing the startup time. Thanks! > struct vhost_ioctls[]; > }; > > This is doable. Also, this is my first time submitting patches to open source, sorry about the mess in advance. That being said, feel free to throw questions / comments! > > Thanks and best regards, > Pei > > On Wed, Apr 19, 2023 at 9:19 PM Jason Wang <jasowang@redhat.com> wrote: >> >> On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin >> <eperezma@redhat.com> wrote: >> > >> > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: >> > > >> > > From: Pei Li <peili.dev@gmail.com> >> > > >> > > Currently, part of the vdpa initialization / startup process >> > > needs to trigger many ioctls per vq, which is very inefficient >> > > and causing unnecessary context switch between user mode and >> > > kernel mode. >> > > >> > > This patch creates an additional ioctl() command, namely >> > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching >> > > commands of VHOST_VDPA_GET_VRING_GROUP into a single >> > > ioctl() call. >> >> I'd expect there's a kernel patch but I didn't see that? >> >> If we want to go this way. Why simply have a more generic way, that is >> introducing something like: >> >> VHOST_CMD_BATCH which did something like >> >> struct vhost_cmd_batch { >> int ncmds; >> struct vhost_ioctls[]; >> }; >> >> Then you can batch other ioctls other than GET_VRING_GROUP? >> >> Thanks >> >> > > >> > >> > It seems to me you forgot to send the 0/2 cover letter :). >> > >> > Please include the time we save thanks to avoiding the repetitive >> > ioctls in each patch. >> > >> > CCing Jason and Michael. 
>> > >> > > Signed-off-by: Pei Li <peili.dev@gmail.com> >> > > --- >> > > hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- >> > > include/standard-headers/linux/vhost_types.h | 3 ++ >> > > linux-headers/linux/vhost.h | 7 +++++ >> > > 3 files changed, 33 insertions(+), 8 deletions(-) >> > > >> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c >> > > index bc6bad23d5..6d45ff8539 100644 >> > > --- a/hw/virtio/vhost-vdpa.c >> > > +++ b/hw/virtio/vhost-vdpa.c >> > > @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) >> > > uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | >> > > 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | >> > > 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | >> > > - 0x1ULL << VHOST_BACKEND_F_SUSPEND; >> > > + 0x1ULL << VHOST_BACKEND_F_SUSPEND | >> > > + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; >> > > int r; >> > > >> > > if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { >> > > @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) >> > > >> > > static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) >> > > { >> > > - int i; >> > > + int i, nvqs = dev->nvqs; >> > > + uint64_t backend_features = dev->backend_cap; >> > > + >> > > trace_vhost_vdpa_set_vring_ready(dev); >> > > - for (i = 0; i < dev->nvqs; ++i) { >> > > - struct vhost_vring_state state = { >> > > - .index = dev->vq_index + i, >> > > - .num = 1, >> > > - }; >> > > - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); >> > > + >> > > + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { >> > > + for (i = 0; i < nvqs; ++i) { >> > > + struct vhost_vring_state state = { >> > > + .index = dev->vq_index + i, >> > > + .num = 1, >> > > + }; >> > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); >> > > + } >> > > + } else { >> > > + struct vhost_vring_state states[nvqs + 1]; >> > > + states[0].num = nvqs; >> > > + for (i = 1; i <= nvqs; ++i) { >> > > + states[i].index = dev->vq_index + i - 1; >> > > + states[i].num = 1; >> > > + } >> > > + >> > >> > I think it's more clear to share the array of vhost_vring_state >> > states[nvqs + 1], and then fill it either in one shot with >> > VHOST_VDPA_SET_VRING_ENABLE_BATCH or individually with >> > VHOST_VDPA_SET_VRING_ENABLE. >> > >> > The same feedback goes for VHOST_VDPA_GET_VRING_GROUP_BATCH in the next patch. >> > >> > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); >> > > } >> > > return 0; >> > >> > This comment is previous to your patch but I just realized we don't >> > check the return value of vhost_vdpa_call. Maybe it is worth >> > considering addressing that in this series too. 
>> > >> > > } >> > > diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h >> > > index c41a73fe36..068d0e1ceb 100644 >> > > --- a/include/standard-headers/linux/vhost_types.h >> > > +++ b/include/standard-headers/linux/vhost_types.h >> > > @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { >> > > /* Device can be suspended */ >> > > #define VHOST_BACKEND_F_SUSPEND 0x4 >> > > >> > > +/* IOCTL requests can be batched */ >> > > +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 >> > > + >> > > #endif >> > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h >> > > index f9f115a7c7..4c9ddd0a0e 100644 >> > > --- a/linux-headers/linux/vhost.h >> > > +++ b/linux-headers/linux/vhost.h >> > > @@ -180,4 +180,11 @@ >> > > */ >> > > #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) >> > > >> > > +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE >> > > + * >> > > + * Enable/disable the ring while batching the commands. >> > > + */ >> > > +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ >> > > + struct vhost_vring_state) >> > > + >> > >> > These headers should be updated with update-linux-headers.sh. To be >> > done that way we need the changes merged in the kernel before. >> > >> > We can signal that the series is not ready for inclusion with the >> > subject [RFC. PATCH], like [1], and note it in the commit message. >> > That way, you can keep updating the header manually for demonstration >> > purposes. >> > >> > Thanks! >> > >> > [1] https://lore.kernel.org/qemu-devel/20220413163206.1958254-23-eperezma@redhat.com/ >> > >> >> ^ permalink raw reply [flat|nested] 13+ messages in thread
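Sizing each slot for the largest argument, as discussed above, could also be written as a union so that different command types share one fixed-size entry; again this is only an illustration of the idea, not a proposed UAPI:

    /* Hypothetical fixed-size batch entry, sized by its largest member. */
    struct vhost_batched_ioctl {
        unsigned int cmd;
        union {
            struct vhost_vring_state state;
            struct vhost_vring_addr addr;   /* largest of the common vring arguments */
            struct vhost_vring_file file;
        } arg;
    };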
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-04-20 8:59 ` Eugenio Perez Martin @ 2023-07-18 10:55 ` Michael S. Tsirkin 2023-07-21 10:39 ` Eugenio Perez Martin 0 siblings, 1 reply; 13+ messages in thread From: Michael S. Tsirkin @ 2023-07-18 10:55 UTC (permalink / raw) To: Eugenio Perez Martin Cc: Pei Li, Jason Wang, peili.dev, qemu-devel, Si-Wei Liu On Thu, Apr 20, 2023 at 10:59:56AM +0200, Eugenio Perez Martin wrote: > On Thu, Apr 20, 2023 at 7:25 AM Pei Li <peili@andrew.cmu.edu> wrote: > > > > Hi all, > > > > My bad, I just submitted the kernel patch. If we are passing some generic command, still we have to add an additional field in the structure to indicate what is the unbatched version of this command, and the struct vhost_ioctls would be some command specific structure. In summary, the structure would be something like > > struct vhost_cmd_batch { > > int ncmds; > > int cmd; > > The unbatched version should go in each vhost_ioctls. That allows us > to send many different commands in one ioctl instead of having to > resort to many ioctls, each one for a different task. > > The problem with that is the size of that struct vhost_ioctl, so we > can build an array. I think it should be enough with the biggest of > them (vhost_vring_addr ?) for a long time, but I would like to know if > anybody finds a drawback here. We could always resort to pointers if > we find we need more space, or start with them from the beginning. > > CCing Si-Wei here too, as he is also interested in reducing the startup time. > > Thanks! And copying my response too: This is all very exciting, but what exactly is the benefit? No optimization patches are going to be merged without numbers showing performance gains. In this case, can you show gains in process startup time? Are they significant enough to warrant adding new UAPI? > > struct vhost_ioctls[]; > > }; > > > > This is doable. Also, this is my first time submitting patches to open source, sorry about the mess in advance. That being said, feel free to throw questions / comments! > > > > Thanks and best regards, > > Pei > > > > On Wed, Apr 19, 2023 at 9:19 PM Jason Wang <jasowang@redhat.com> wrote: > >> > >> On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin > >> <eperezma@redhat.com> wrote: > >> > > >> > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: > >> > > > >> > > From: Pei Li <peili.dev@gmail.com> > >> > > > >> > > Currently, part of the vdpa initialization / startup process > >> > > needs to trigger many ioctls per vq, which is very inefficient > >> > > and causing unnecessary context switch between user mode and > >> > > kernel mode. > >> > > > >> > > This patch creates an additional ioctl() command, namely > >> > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > >> > > commands of VHOST_VDPA_GET_VRING_GROUP into a single > >> > > ioctl() call. > >> > >> I'd expect there's a kernel patch but I didn't see that? > >> > >> If we want to go this way. Why simply have a more generic way, that is > >> introducing something like: > >> > >> VHOST_CMD_BATCH which did something like > >> > >> struct vhost_cmd_batch { > >> int ncmds; > >> struct vhost_ioctls[]; > >> }; > >> > >> Then you can batch other ioctls other than GET_VRING_GROUP? > >> > >> Thanks > >> > >> > > > >> > > >> > It seems to me you forgot to send the 0/2 cover letter :). > >> > > >> > Please include the time we save thanks to avoiding the repetitive > >> > ioctls in each patch. > >> > > >> > CCing Jason and Michael. 
> >> > > >> > > Signed-off-by: Pei Li <peili.dev@gmail.com> > >> > > --- > >> > > hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- > >> > > include/standard-headers/linux/vhost_types.h | 3 ++ > >> > > linux-headers/linux/vhost.h | 7 +++++ > >> > > 3 files changed, 33 insertions(+), 8 deletions(-) > >> > > > >> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > >> > > index bc6bad23d5..6d45ff8539 100644 > >> > > --- a/hw/virtio/vhost-vdpa.c > >> > > +++ b/hw/virtio/vhost-vdpa.c > >> > > @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) > >> > > uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | > >> > > 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | > >> > > 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | > >> > > - 0x1ULL << VHOST_BACKEND_F_SUSPEND; > >> > > + 0x1ULL << VHOST_BACKEND_F_SUSPEND | > >> > > + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; > >> > > int r; > >> > > > >> > > if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { > >> > > @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) > >> > > > >> > > static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) > >> > > { > >> > > - int i; > >> > > + int i, nvqs = dev->nvqs; > >> > > + uint64_t backend_features = dev->backend_cap; > >> > > + > >> > > trace_vhost_vdpa_set_vring_ready(dev); > >> > > - for (i = 0; i < dev->nvqs; ++i) { > >> > > - struct vhost_vring_state state = { > >> > > - .index = dev->vq_index + i, > >> > > - .num = 1, > >> > > - }; > >> > > - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > >> > > + > >> > > + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { > >> > > + for (i = 0; i < nvqs; ++i) { > >> > > + struct vhost_vring_state state = { > >> > > + .index = dev->vq_index + i, > >> > > + .num = 1, > >> > > + }; > >> > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > >> > > + } > >> > > + } else { > >> > > + struct vhost_vring_state states[nvqs + 1]; > >> > > + states[0].num = nvqs; > >> > > + for (i = 1; i <= nvqs; ++i) { > >> > > + states[i].index = dev->vq_index + i - 1; > >> > > + states[i].num = 1; > >> > > + } > >> > > + > >> > > >> > I think it's more clear to share the array of vhost_vring_state > >> > states[nvqs + 1], and then fill it either in one shot with > >> > VHOST_VDPA_SET_VRING_ENABLE_BATCH or individually with > >> > VHOST_VDPA_SET_VRING_ENABLE. > >> > > >> > The same feedback goes for VHOST_VDPA_GET_VRING_GROUP_BATCH in the next patch. > >> > > >> > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); > >> > > } > >> > > return 0; > >> > > >> > This comment is previous to your patch but I just realized we don't > >> > check the return value of vhost_vdpa_call. Maybe it is worth > >> > considering addressing that in this series too. 
> >> > > >> > > } > >> > > diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h > >> > > index c41a73fe36..068d0e1ceb 100644 > >> > > --- a/include/standard-headers/linux/vhost_types.h > >> > > +++ b/include/standard-headers/linux/vhost_types.h > >> > > @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { > >> > > /* Device can be suspended */ > >> > > #define VHOST_BACKEND_F_SUSPEND 0x4 > >> > > > >> > > +/* IOCTL requests can be batched */ > >> > > +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 > >> > > + > >> > > #endif > >> > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h > >> > > index f9f115a7c7..4c9ddd0a0e 100644 > >> > > --- a/linux-headers/linux/vhost.h > >> > > +++ b/linux-headers/linux/vhost.h > >> > > @@ -180,4 +180,11 @@ > >> > > */ > >> > > #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) > >> > > > >> > > +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE > >> > > + * > >> > > + * Enable/disable the ring while batching the commands. > >> > > + */ > >> > > +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ > >> > > + struct vhost_vring_state) > >> > > + > >> > > >> > These headers should be updated with update-linux-headers.sh. To be > >> > done that way we need the changes merged in the kernel before. > >> > > >> > We can signal that the series is not ready for inclusion with the > >> > subject [RFC. PATCH], like [1], and note it in the commit message. > >> > That way, you can keep updating the header manually for demonstration > >> > purposes. > >> > > >> > Thanks! > >> > > >> > [1] https://lore.kernel.org/qemu-devel/20220413163206.1958254-23-eperezma@redhat.com/ > >> > > >> > >> > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-07-18 10:55 ` Michael S. Tsirkin @ 2023-07-21 10:39 ` Eugenio Perez Martin 2023-07-21 20:57 ` Si-Wei Liu 0 siblings, 1 reply; 13+ messages in thread From: Eugenio Perez Martin @ 2023-07-21 10:39 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Pei Li, Jason Wang, peili.dev, qemu-devel, Si-Wei Liu On Tue, Jul 18, 2023 at 12:55 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Thu, Apr 20, 2023 at 10:59:56AM +0200, Eugenio Perez Martin wrote: > > On Thu, Apr 20, 2023 at 7:25 AM Pei Li <peili@andrew.cmu.edu> wrote: > > > > > > Hi all, > > > > > > My bad, I just submitted the kernel patch. If we are passing some generic command, still we have to add an additional field in the structure to indicate what is the unbatched version of this command, and the struct vhost_ioctls would be some command specific structure. In summary, the structure would be something like > > > struct vhost_cmd_batch { > > > int ncmds; > > > int cmd; > > > > The unbatched version should go in each vhost_ioctls. That allows us > > to send many different commands in one ioctl instead of having to > > resort to many ioctls, each one for a different task. > > > > The problem with that is the size of that struct vhost_ioctl, so we > > can build an array. I think it should be enough with the biggest of > > them (vhost_vring_addr ?) for a long time, but I would like to know if > > anybody finds a drawback here. We could always resort to pointers if > > we find we need more space, or start with them from the beginning. > > > > CCing Si-Wei here too, as he is also interested in reducing the startup time. > > > > Thanks! > > And copying my response too: > This is all very exciting, but what exactly is the benefit? > No optimization patches are going to be merged without > numbers showing performance gains. > In this case, can you show gains in process startup time? > Are they significant enough to warrant adding new UAPI? > This should have been marked as RFC in that regard. When this was sent it was one of the planned actions to reduce overhead. After Si-Wei benchmarks, all the efforts will focus on reducing the pinning / maps for the moment. It is unlikely that this will be moved forward soon. Thanks! > > > > > struct vhost_ioctls[]; > > > }; > > > > > > This is doable. Also, this is my first time submitting patches to open source, sorry about the mess in advance. That being said, feel free to throw questions / comments! > > > > > > Thanks and best regards, > > > Pei > > > > > > On Wed, Apr 19, 2023 at 9:19 PM Jason Wang <jasowang@redhat.com> wrote: > > >> > > >> On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin > > >> <eperezma@redhat.com> wrote: > > >> > > > >> > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: > > >> > > > > >> > > From: Pei Li <peili.dev@gmail.com> > > >> > > > > >> > > Currently, part of the vdpa initialization / startup process > > >> > > needs to trigger many ioctls per vq, which is very inefficient > > >> > > and causing unnecessary context switch between user mode and > > >> > > kernel mode. > > >> > > > > >> > > This patch creates an additional ioctl() command, namely > > >> > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > > >> > > commands of VHOST_VDPA_GET_VRING_GROUP into a single > > >> > > ioctl() call. > > >> > > >> I'd expect there's a kernel patch but I didn't see that? > > >> > > >> If we want to go this way. 
Why simply have a more generic way, that is > > >> introducing something like: > > >> > > >> VHOST_CMD_BATCH which did something like > > >> > > >> struct vhost_cmd_batch { > > >> int ncmds; > > >> struct vhost_ioctls[]; > > >> }; > > >> > > >> Then you can batch other ioctls other than GET_VRING_GROUP? > > >> > > >> Thanks > > >> > > >> > > > > >> > > > >> > It seems to me you forgot to send the 0/2 cover letter :). > > >> > > > >> > Please include the time we save thanks to avoiding the repetitive > > >> > ioctls in each patch. > > >> > > > >> > CCing Jason and Michael. > > >> > > > >> > > Signed-off-by: Pei Li <peili.dev@gmail.com> > > >> > > --- > > >> > > hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- > > >> > > include/standard-headers/linux/vhost_types.h | 3 ++ > > >> > > linux-headers/linux/vhost.h | 7 +++++ > > >> > > 3 files changed, 33 insertions(+), 8 deletions(-) > > >> > > > > >> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > >> > > index bc6bad23d5..6d45ff8539 100644 > > >> > > --- a/hw/virtio/vhost-vdpa.c > > >> > > +++ b/hw/virtio/vhost-vdpa.c > > >> > > @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) > > >> > > uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | > > >> > > 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | > > >> > > 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | > > >> > > - 0x1ULL << VHOST_BACKEND_F_SUSPEND; > > >> > > + 0x1ULL << VHOST_BACKEND_F_SUSPEND | > > >> > > + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; > > >> > > int r; > > >> > > > > >> > > if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { > > >> > > @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) > > >> > > > > >> > > static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) > > >> > > { > > >> > > - int i; > > >> > > + int i, nvqs = dev->nvqs; > > >> > > + uint64_t backend_features = dev->backend_cap; > > >> > > + > > >> > > trace_vhost_vdpa_set_vring_ready(dev); > > >> > > - for (i = 0; i < dev->nvqs; ++i) { > > >> > > - struct vhost_vring_state state = { > > >> > > - .index = dev->vq_index + i, > > >> > > - .num = 1, > > >> > > - }; > > >> > > - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > > >> > > + > > >> > > + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { > > >> > > + for (i = 0; i < nvqs; ++i) { > > >> > > + struct vhost_vring_state state = { > > >> > > + .index = dev->vq_index + i, > > >> > > + .num = 1, > > >> > > + }; > > >> > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > > >> > > + } > > >> > > + } else { > > >> > > + struct vhost_vring_state states[nvqs + 1]; > > >> > > + states[0].num = nvqs; > > >> > > + for (i = 1; i <= nvqs; ++i) { > > >> > > + states[i].index = dev->vq_index + i - 1; > > >> > > + states[i].num = 1; > > >> > > + } > > >> > > + > > >> > > > >> > I think it's more clear to share the array of vhost_vring_state > > >> > states[nvqs + 1], and then fill it either in one shot with > > >> > VHOST_VDPA_SET_VRING_ENABLE_BATCH or individually with > > >> > VHOST_VDPA_SET_VRING_ENABLE. > > >> > > > >> > The same feedback goes for VHOST_VDPA_GET_VRING_GROUP_BATCH in the next patch. > > >> > > > >> > > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); > > >> > > } > > >> > > return 0; > > >> > > > >> > This comment is previous to your patch but I just realized we don't > > >> > check the return value of vhost_vdpa_call. Maybe it is worth > > >> > considering addressing that in this series too. 
> > >> > > > >> > > } > > >> > > diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h > > >> > > index c41a73fe36..068d0e1ceb 100644 > > >> > > --- a/include/standard-headers/linux/vhost_types.h > > >> > > +++ b/include/standard-headers/linux/vhost_types.h > > >> > > @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { > > >> > > /* Device can be suspended */ > > >> > > #define VHOST_BACKEND_F_SUSPEND 0x4 > > >> > > > > >> > > +/* IOCTL requests can be batched */ > > >> > > +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 > > >> > > + > > >> > > #endif > > >> > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h > > >> > > index f9f115a7c7..4c9ddd0a0e 100644 > > >> > > --- a/linux-headers/linux/vhost.h > > >> > > +++ b/linux-headers/linux/vhost.h > > >> > > @@ -180,4 +180,11 @@ > > >> > > */ > > >> > > #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) > > >> > > > > >> > > +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE > > >> > > + * > > >> > > + * Enable/disable the ring while batching the commands. > > >> > > + */ > > >> > > +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ > > >> > > + struct vhost_vring_state) > > >> > > + > > >> > > > >> > These headers should be updated with update-linux-headers.sh. To be > > >> > done that way we need the changes merged in the kernel before. > > >> > > > >> > We can signal that the series is not ready for inclusion with the > > >> > subject [RFC. PATCH], like [1], and note it in the commit message. > > >> > That way, you can keep updating the header manually for demonstration > > >> > purposes. > > >> > > > >> > Thanks! > > >> > > > >> > [1] https://lore.kernel.org/qemu-devel/20220413163206.1958254-23-eperezma@redhat.com/ > > >> > > > >> > > >> > > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-07-21 10:39 ` Eugenio Perez Martin @ 2023-07-21 20:57 ` Si-Wei Liu 0 siblings, 0 replies; 13+ messages in thread From: Si-Wei Liu @ 2023-07-21 20:57 UTC (permalink / raw) To: Eugenio Perez Martin, Michael S. Tsirkin Cc: Pei Li, Jason Wang, peili.dev, qemu-devel On 7/21/2023 3:39 AM, Eugenio Perez Martin wrote: > On Tue, Jul 18, 2023 at 12:55 PM Michael S. Tsirkin <mst@redhat.com> wrote: >> On Thu, Apr 20, 2023 at 10:59:56AM +0200, Eugenio Perez Martin wrote: >>> On Thu, Apr 20, 2023 at 7:25 AM Pei Li <peili@andrew.cmu.edu> wrote: >>>> Hi all, >>>> >>>> My bad, I just submitted the kernel patch. If we are passing some generic command, still we have to add an additional field in the structure to indicate what is the unbatched version of this command, and the struct vhost_ioctls would be some command specific structure. In summary, the structure would be something like >>>> struct vhost_cmd_batch { >>>> int ncmds; >>>> int cmd; >>> The unbatched version should go in each vhost_ioctls. That allows us >>> to send many different commands in one ioctl instead of having to >>> resort to many ioctls, each one for a different task. >>> >>> The problem with that is the size of that struct vhost_ioctl, so we >>> can build an array. I think it should be enough with the biggest of >>> them (vhost_vring_addr ?) for a long time, but I would like to know if >>> anybody finds a drawback here. We could always resort to pointers if >>> we find we need more space, or start with them from the beginning. >>> >>> CCing Si-Wei here too, as he is also interested in reducing the startup time. >>> >>> Thanks! >> And copying my response too: >> This is all very exciting, but what exactly is the benefit? >> No optimization patches are going to be merged without >> numbers showing performance gains. >> In this case, can you show gains in process startup time? >> Are they significant enough to warrant adding new UAPI? >> > This should have been marked as RFC in that regard. > > When this was sent it was one of the planned actions to reduce > overhead. After Si-Wei benchmarks, all the efforts will focus on > reducing the pinning / maps for the moment. It is unlikely that this > will be moved forward soon. Right, this work has comparatively lower priority in terms of significance of impact to migration downtime (to vdpa h/w device that does DMA), but after getting the pinning/map latency effect removed from the performance path, it'd be easier to see same scalability effect subjected to vq count as how software vp_vdpa performs today. I think in order to profile the vq scalability effect with large queue count, we first would need to have proper implementation of CVQ replay and multiqueue LM in place - I'm not sure if x-svq=on could be a good approximate, but maybe that can be used to collect some initial profiling data. Would this be sufficient to move this forward in parallel? Regards, -Siwei > > Thanks! > > >> >>>> struct vhost_ioctls[]; >>>> }; >>>> >>>> This is doable. Also, this is my first time submitting patches to open source, sorry about the mess in advance. That being said, feel free to throw questions / comments! 
>>>> >>>> Thanks and best regards, >>>> Pei >>>> >>>> On Wed, Apr 19, 2023 at 9:19 PM Jason Wang <jasowang@redhat.com> wrote: >>>>> On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin >>>>> <eperezma@redhat.com> wrote: >>>>>> On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: >>>>>>> From: Pei Li <peili.dev@gmail.com> >>>>>>> >>>>>>> Currently, part of the vdpa initialization / startup process >>>>>>> needs to trigger many ioctls per vq, which is very inefficient >>>>>>> and causing unnecessary context switch between user mode and >>>>>>> kernel mode. >>>>>>> >>>>>>> This patch creates an additional ioctl() command, namely >>>>>>> VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching >>>>>>> commands of VHOST_VDPA_GET_VRING_GROUP into a single >>>>>>> ioctl() call. >>>>> I'd expect there's a kernel patch but I didn't see that? >>>>> >>>>> If we want to go this way. Why simply have a more generic way, that is >>>>> introducing something like: >>>>> >>>>> VHOST_CMD_BATCH which did something like >>>>> >>>>> struct vhost_cmd_batch { >>>>> int ncmds; >>>>> struct vhost_ioctls[]; >>>>> }; >>>>> >>>>> Then you can batch other ioctls other than GET_VRING_GROUP? >>>>> >>>>> Thanks >>>>> >>>>>> It seems to me you forgot to send the 0/2 cover letter :). >>>>>> >>>>>> Please include the time we save thanks to avoiding the repetitive >>>>>> ioctls in each patch. >>>>>> >>>>>> CCing Jason and Michael. >>>>>> >>>>>>> Signed-off-by: Pei Li <peili.dev@gmail.com> >>>>>>> --- >>>>>>> hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- >>>>>>> include/standard-headers/linux/vhost_types.h | 3 ++ >>>>>>> linux-headers/linux/vhost.h | 7 +++++ >>>>>>> 3 files changed, 33 insertions(+), 8 deletions(-) >>>>>>> >>>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c >>>>>>> index bc6bad23d5..6d45ff8539 100644 >>>>>>> --- a/hw/virtio/vhost-vdpa.c >>>>>>> +++ b/hw/virtio/vhost-vdpa.c >>>>>>> @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) >>>>>>> uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | >>>>>>> 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | >>>>>>> 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | >>>>>>> - 0x1ULL << VHOST_BACKEND_F_SUSPEND; >>>>>>> + 0x1ULL << VHOST_BACKEND_F_SUSPEND | >>>>>>> + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; >>>>>>> int r; >>>>>>> >>>>>>> if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { >>>>>>> @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) >>>>>>> >>>>>>> static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) >>>>>>> { >>>>>>> - int i; >>>>>>> + int i, nvqs = dev->nvqs; >>>>>>> + uint64_t backend_features = dev->backend_cap; >>>>>>> + >>>>>>> trace_vhost_vdpa_set_vring_ready(dev); >>>>>>> - for (i = 0; i < dev->nvqs; ++i) { >>>>>>> - struct vhost_vring_state state = { >>>>>>> - .index = dev->vq_index + i, >>>>>>> - .num = 1, >>>>>>> - }; >>>>>>> - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); >>>>>>> + >>>>>>> + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { >>>>>>> + for (i = 0; i < nvqs; ++i) { >>>>>>> + struct vhost_vring_state state = { >>>>>>> + .index = dev->vq_index + i, >>>>>>> + .num = 1, >>>>>>> + }; >>>>>>> + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); >>>>>>> + } >>>>>>> + } else { >>>>>>> + struct vhost_vring_state states[nvqs + 1]; >>>>>>> + states[0].num = nvqs; >>>>>>> + for (i = 1; i <= nvqs; ++i) { >>>>>>> + states[i].index = dev->vq_index + i - 1; >>>>>>> + states[i].num = 1; >>>>>>> + } >>>>>>> + >>>>>> I 
think it's more clear to share the array of vhost_vring_state >>>>>> states[nvqs + 1], and then fill it either in one shot with >>>>>> VHOST_VDPA_SET_VRING_ENABLE_BATCH or individually with >>>>>> VHOST_VDPA_SET_VRING_ENABLE. >>>>>> >>>>>> The same feedback goes for VHOST_VDPA_GET_VRING_GROUP_BATCH in the next patch. >>>>>> >>>>>>> + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); >>>>>>> } >>>>>>> return 0; >>>>>> This comment is previous to your patch but I just realized we don't >>>>>> check the return value of vhost_vdpa_call. Maybe it is worth >>>>>> considering addressing that in this series too. >>>>>> >>>>>>> } >>>>>>> diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h >>>>>>> index c41a73fe36..068d0e1ceb 100644 >>>>>>> --- a/include/standard-headers/linux/vhost_types.h >>>>>>> +++ b/include/standard-headers/linux/vhost_types.h >>>>>>> @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { >>>>>>> /* Device can be suspended */ >>>>>>> #define VHOST_BACKEND_F_SUSPEND 0x4 >>>>>>> >>>>>>> +/* IOCTL requests can be batched */ >>>>>>> +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 >>>>>>> + >>>>>>> #endif >>>>>>> diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h >>>>>>> index f9f115a7c7..4c9ddd0a0e 100644 >>>>>>> --- a/linux-headers/linux/vhost.h >>>>>>> +++ b/linux-headers/linux/vhost.h >>>>>>> @@ -180,4 +180,11 @@ >>>>>>> */ >>>>>>> #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) >>>>>>> >>>>>>> +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE >>>>>>> + * >>>>>>> + * Enable/disable the ring while batching the commands. >>>>>>> + */ >>>>>>> +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ >>>>>>> + struct vhost_vring_state) >>>>>>> + >>>>>> These headers should be updated with update-linux-headers.sh. To be >>>>>> done that way we need the changes merged in the kernel before. >>>>>> >>>>>> We can signal that the series is not ready for inclusion with the >>>>>> subject [RFC. PATCH], like [1], and note it in the commit message. >>>>>> That way, you can keep updating the header manually for demonstration >>>>>> purposes. >>>>>> >>>>>> Thanks! >>>>>> >>>>>> [1] https://lore.kernel.org/qemu-devel/20220413163206.1958254-23-eperezma@redhat.com/ >>>>>> >>>>> >>> ^ permalink raw reply [flat|nested] 13+ messages in thread
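A minimal sketch of the restructuring suggested in the review above (sharing one vhost_vring_state array, filling it once, and then issuing either the batched or the per-vq ioctl), which also checks the return value of vhost_vdpa_call(). This is only an illustration built from the symbols already used in the patch, and it assumes the same VHOST_VDPA_SET_VRING_ENABLE_BATCH layout, where element 0 carries the queue count and elements 1..nvqs carry the per-vq state:

static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev)
{
    int i, r, nvqs = dev->nvqs;
    bool batch = dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH);
    /* states[0] is the batch header, states[1..nvqs] the per-vq entries */
    struct vhost_vring_state states[nvqs + 1];

    trace_vhost_vdpa_set_vring_ready(dev);

    states[0].num = nvqs;
    for (i = 1; i <= nvqs; ++i) {
        states[i].index = dev->vq_index + i - 1;
        states[i].num = 1;
    }

    if (batch) {
        /* one ioctl covers all vqs */
        return vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]);
    }

    /* fallback: one ioctl per vq, reusing the same array */
    for (i = 1; i <= nvqs; ++i) {
        r = vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &states[i]);
        if (unlikely(r)) {
            return r;
        }
    }
    return 0;
}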
* vdpa: use io_uring passthrough command for IOCTLs [was Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead] 2023-04-20 4:18 ` Jason Wang 2023-04-20 5:25 ` Pei Li @ 2023-07-18 10:32 ` Stefano Garzarella 2023-07-26 8:10 ` Jason Wang 1 sibling, 1 reply; 13+ messages in thread From: Stefano Garzarella @ 2023-07-18 10:32 UTC (permalink / raw) To: Jason Wang, Michael Tsirkin, Eugenio Perez Martin Cc: peili.dev, qemu devel list, Linux Virtualization On Thu, Apr 20, 2023 at 6:20 AM Jason Wang <jasowang@redhat.com> wrote: > > On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin > <eperezma@redhat.com> wrote: > > > > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: > > > > > > From: Pei Li <peili.dev@gmail.com> > > > > > > Currently, part of the vdpa initialization / startup process > > > needs to trigger many ioctls per vq, which is very inefficient > > > and causing unnecessary context switch between user mode and > > > kernel mode. > > > > > > This patch creates an additional ioctl() command, namely > > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > > > commands of VHOST_VDPA_GET_VRING_GROUP into a single > > > ioctl() call. > > I'd expect there's a kernel patch but I didn't see that? > > If we want to go this way. Why simply have a more generic way, that is > introducing something like: > > VHOST_CMD_BATCH which did something like > > struct vhost_cmd_batch { > int ncmds; > struct vhost_ioctls[]; > }; > > Then you can batch other ioctls other than GET_VRING_GROUP? > Just restarting this discussion, since I recently worked more with io_uring passthrough commands and I think it can help here. The NVMe guys had a similar problem (ioctl too slow for their use case)[1][2], so they developed a new feature in io_uring that basically allows you to do IOCTLs asynchronously and in batches using io_uring. The same feature is also used by ublk [3] and I recently talked about this at DevConf with German [4]. Basically, there's a new callback in fops (struct file_operations.uring_cmd). IIUC for NVMe (drivers/nvme/host/ioctl.c) they used exactly the same values used for IOCTLs also for the new uring_cmd callback. We could do the same. The changes in the vhost-vdpa kernel module should be simple, and we could share the code for handling ioctl and uring_cmd. That way any new command can be supported with both for compatibility. In QEMU then we can start using it to optimize the control path. What do you think? If it's interesting, I could throw down an RFC with the changes or if anyone is interested in working on it, I can help with the details. Thanks, Stefano [1] https://lpc.events/event/11/contributions/989/ [2] https://lpc.events/event/16/contributions/1382/ [3] https://lwn.net/Articles/903855/ [4] https://www.youtube.com/watch?v=6JqNPirreoY ^ permalink raw reply [flat|nested] 13+ messages in thread
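To make the io_uring passthrough idea a bit more concrete, below is a rough userspace sketch (C, liburing) of issuing one vhost-vdpa request through IORING_OP_URING_CMD. Everything kernel-side here is hypothetical: it assumes vhost-vdpa gained a fops->uring_cmd handler that accepts the same request codes as its ioctls, takes the argument inline in the SQE command area, and returns the result in cqe->res. The actual win would come from queueing one SQE per virtqueue and submitting them all with a single io_uring_submit() instead of one ioctl() per vq.

/* Hypothetical sketch: the vhost-vdpa uring_cmd handler assumed here does
 * not exist yet; the request code and payload reuse the existing ioctl ABI.
 */
#include <liburing.h>
#include <linux/vhost.h>
#include <stdint.h>
#include <string.h>

static int64_t get_vring_group_uring(int device_fd, unsigned vq_index)
{
    struct io_uring ring;
    struct io_uring_sqe *sqe;
    struct io_uring_cqe *cqe;
    struct vhost_vring_state state = { .index = vq_index };
    int r;

    /* big SQEs, so larger vhost payloads would also fit inline */
    r = io_uring_queue_init(8, &ring, IORING_SETUP_SQE128);
    if (r < 0) {
        return r;
    }

    sqe = io_uring_get_sqe(&ring);
    memset(sqe, 0, sizeof(*sqe));
    sqe->opcode = IORING_OP_URING_CMD;
    sqe->fd = device_fd;
    sqe->cmd_op = VHOST_VDPA_GET_VRING_GROUP;   /* same code as the ioctl */
    memcpy(sqe->cmd, &state, sizeof(state));    /* payload inline in the SQE */

    io_uring_submit(&ring);
    r = io_uring_wait_cqe(&ring, &cqe);
    if (r == 0) {
        r = cqe->res;   /* assumed: vq group number, or negative errno */
        io_uring_cqe_seen(&ring, cqe);
    }
    io_uring_queue_exit(&ring);
    return r;
}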
* Re: vdpa: use io_uring passthrough command for IOCTLs [was Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead] 2023-07-18 10:32 ` vdpa: use io_uring passthrough command for IOCTLs [was Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead] Stefano Garzarella @ 2023-07-26 8:10 ` Jason Wang 0 siblings, 0 replies; 13+ messages in thread From: Jason Wang @ 2023-07-26 8:10 UTC (permalink / raw) To: Stefano Garzarella Cc: Michael Tsirkin, Eugenio Perez Martin, peili.dev, qemu devel list, Linux Virtualization On Tue, Jul 18, 2023 at 6:32 PM Stefano Garzarella <sgarzare@redhat.com> wrote: > > On Thu, Apr 20, 2023 at 6:20 AM Jason Wang <jasowang@redhat.com> wrote: > > > > On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin > > <eperezma@redhat.com> wrote: > > > > > > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev@gmail.com> wrote: > > > > > > > > From: Pei Li <peili.dev@gmail.com> > > > > > > > > Currently, part of the vdpa initialization / startup process > > > > needs to trigger many ioctls per vq, which is very inefficient > > > > and causing unnecessary context switch between user mode and > > > > kernel mode. > > > > > > > > This patch creates an additional ioctl() command, namely > > > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > > > > commands of VHOST_VDPA_GET_VRING_GROUP into a single > > > > ioctl() call. > > > > I'd expect there's a kernel patch but I didn't see that? > > > > If we want to go this way. Why simply have a more generic way, that is > > introducing something like: > > > > VHOST_CMD_BATCH which did something like > > > > struct vhost_cmd_batch { > > int ncmds; > > struct vhost_ioctls[]; > > }; > > > > Then you can batch other ioctls other than GET_VRING_GROUP? > > > > Just restarting this discussion, since I recently worked more with > io_uring passthrough commands and I think it can help here. > > The NVMe guys had a similar problem (ioctl too slow for their use > case)[1][2], so they developed a new feature in io_uring that > basically allows you to do IOCTLs asynchronously and in batches using > io_uring. > > The same feature is also used by ublk [3] and I recently talked about > this at DevConf with German [4]. > > Basically, there's a new callback in fops (struct file_operations.uring_cmd). > IIUC for NVMe (drivers/nvme/host/ioctl.c) they used exactly the same > values used for IOCTLs also for the new uring_cmd callback. > > We could do the same. The changes in the vhost-vdpa kernel module > should be simple, and we could share the code for handling ioctl and > uring_cmd. > That way any new command can be supported with both for compatibility. > > In QEMU then we can start using it to optimize the control path. > > What do you think? This looks interesting. > > If it's interesting, I could throw down an RFC with the changes or if > anyone is interested in working on it, I can help with the details. Please do that. Thanks > > Thanks, > Stefano > > [1] https://lpc.events/event/11/contributions/989/ > [2] https://lpc.events/event/16/contributions/1382/ > [3] https://lwn.net/Articles/903855/ > [4] https://www.youtube.com/watch?v=6JqNPirreoY > ^ permalink raw reply [flat|nested] 13+ messages in thread
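For comparison with the io_uring route, here is a rough sketch of what the generic vhost_cmd_batch layout discussed earlier in the thread could look like, with each entry carrying its own unbatched request code and a fixed-size payload sized by the largest vhost argument in use (vhost_vring_addr, as suggested above). All of this is illustrative; no such UAPI exists, and the ioctl number is deliberately left out.

/* Hypothetical UAPI sketch; names and layout are placeholders. */
#include <linux/types.h>
#include <linux/vhost_types.h>

struct vhost_batched_cmd {
    __u32 cmd;      /* unbatched request code, e.g. VHOST_VDPA_GET_VRING_GROUP */
    __s32 ret;      /* per-command result, written back by the kernel */
    union {         /* entry size fixed by the largest payload */
        struct vhost_vring_state state;
        struct vhost_vring_addr addr;
    };
};

struct vhost_cmd_batch {
    __u32 ncmds;
    __u32 padding;
    struct vhost_batched_cmd cmds[];    /* ncmds entries follow */
};

/* A VHOST_CMD_BATCH ioctl taking struct vhost_cmd_batch would still have to
 * copy the ncmds trailing entries explicitly in the kernel, since the size
 * encoded in the ioctl number only covers the fixed header.
 */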
* Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead 2023-04-18 22:56 [PATCH 1/2] Reduce vdpa initialization / startup overhead peili.dev 2023-04-18 22:56 ` [PATCH 2/2] " peili.dev 2023-04-19 15:33 ` [PATCH 1/2] " Eugenio Perez Martin @ 2023-07-18 10:53 ` Michael S. Tsirkin 2 siblings, 0 replies; 13+ messages in thread From: Michael S. Tsirkin @ 2023-07-18 10:53 UTC (permalink / raw) To: peili.dev; +Cc: qemu-devel, eperezma On Tue, Apr 18, 2023 at 06:56:37PM -0400, peili.dev@gmail.com wrote: > From: Pei Li <peili.dev@gmail.com> > > Currently, part of the vdpa initialization / startup process > needs to trigger many ioctls per vq, which is very inefficient > and causing unnecessary context switch between user mode and > kernel mode. > > This patch creates an additional ioctl() command, namely > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > commands of VHOST_VDPA_GET_VRING_GROUP into a single > ioctl() call. > > Signed-off-by: Pei Li <peili.dev@gmail.com> This is all very exciting, but what exactly is the benefit? No optimization patches are going to be merged without numbers showing performance gains. In this case, can you show gains in process startup time? Are they significant enough to warrant adding new UAPI? > --- > hw/virtio/vhost-vdpa.c | 31 +++++++++++++++----- > include/standard-headers/linux/vhost_types.h | 3 ++ > linux-headers/linux/vhost.h | 7 +++++ > 3 files changed, 33 insertions(+), 8 deletions(-) > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > index bc6bad23d5..6d45ff8539 100644 > --- a/hw/virtio/vhost-vdpa.c > +++ b/hw/virtio/vhost-vdpa.c > @@ -679,7 +679,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev) > uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 | > 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH | > 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID | > - 0x1ULL << VHOST_BACKEND_F_SUSPEND; > + 0x1ULL << VHOST_BACKEND_F_SUSPEND | > + 0x1ULL << VHOST_BACKEND_F_IOCTL_BATCH; > int r; > > if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) { > @@ -731,14 +732,28 @@ static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx) > > static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev) > { > - int i; > + int i, nvqs = dev->nvqs; > + uint64_t backend_features = dev->backend_cap; > + > trace_vhost_vdpa_set_vring_ready(dev); > - for (i = 0; i < dev->nvqs; ++i) { > - struct vhost_vring_state state = { > - .index = dev->vq_index + i, > - .num = 1, > - }; > - vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > + > + if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOCTL_BATCH))) { > + for (i = 0; i < nvqs; ++i) { > + struct vhost_vring_state state = { > + .index = dev->vq_index + i, > + .num = 1, > + }; > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state); > + } > + } else { > + struct vhost_vring_state states[nvqs + 1]; > + states[0].num = nvqs; > + for (i = 1; i <= nvqs; ++i) { > + states[i].index = dev->vq_index + i - 1; > + states[i].num = 1; > + } > + > + vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE_BATCH, &states[0]); > } > return 0; > } > diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h > index c41a73fe36..068d0e1ceb 100644 > --- a/include/standard-headers/linux/vhost_types.h > +++ b/include/standard-headers/linux/vhost_types.h > @@ -164,4 +164,7 @@ struct vhost_vdpa_iova_range { > /* Device can be suspended */ > #define VHOST_BACKEND_F_SUSPEND 0x4 > > +/* IOCTL requests can be batched */ > +#define VHOST_BACKEND_F_IOCTL_BATCH 0x6 > + > 
#endif > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h > index f9f115a7c7..4c9ddd0a0e 100644 > --- a/linux-headers/linux/vhost.h > +++ b/linux-headers/linux/vhost.h > @@ -180,4 +180,11 @@ > */ > #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D) > > +/* Batch version of VHOST_VDPA_SET_VRING_ENABLE > + * > + * Enable/disable the ring while batching the commands. > + */ > +#define VHOST_VDPA_SET_VRING_ENABLE_BATCH _IOW(VHOST_VIRTIO, 0x7F, \ > + struct vhost_vring_state) > + > #endif > -- > 2.25.1 > > ^ permalink raw reply [flat|nested] 13+ messages in thread
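As one possible starting point for the numbers requested above, the cost of the unbatched path can be approximated from userspace by timing the existing per-vq loop against an already configured vhost-vdpa device. This is a sketch only; device_fd, vq_index and nvqs are assumed to come from the surrounding setup code, and real gains would of course have to be measured on QEMU's own startup path.

/* Illustrative micro-benchmark of the unbatched VHOST_VDPA_SET_VRING_ENABLE path. */
#include <stdio.h>
#include <time.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

static void time_vring_enable(int device_fd, unsigned vq_index, int nvqs)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nvqs; ++i) {
        struct vhost_vring_state state = { .index = vq_index + i, .num = 1 };
        if (ioctl(device_fd, VHOST_VDPA_SET_VRING_ENABLE, &state) < 0) {
            perror("VHOST_VDPA_SET_VRING_ENABLE");
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    fprintf(stderr, "enabling %d vqs took %ld ns\n", nvqs,
            (long)((t1.tv_sec - t0.tv_sec) * 1000000000L +
                   (t1.tv_nsec - t0.tv_nsec)));
}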