* Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer
[not found] <20220706050503.171-1-xieyongji@bytedance.com>
@ 2022-07-06 9:30 ` Jason Wang
[not found] ` <CACycT3u3kOzzQjKBYNAB5vtpgcmPg7FjJ5yTYMtQo0SJVrBmZg@mail.gmail.com>
[not found] ` <20220706050503.171-2-xieyongji@bytedance.com>
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2022-07-06 9:30 UTC (permalink / raw)
To: Xie Yongji
Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
Stefan Hajnoczi, songmuchun
On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@bytedance.com> wrote:
>
> Hi all,
>
> This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> registering and de-registering userspace memory for IOTLB
> as bounce buffer in virtio-vdpa case.
>
> The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOLTB
> information such as bounce buffer size. Then user can use
> those information on VDUSE_IOTLB_REG_UMEM and
> VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> userspace memory for IOTLB.
>
> During registering and de-registering, the DMA data in use
> would be copied from kernel bounce pages to userspace bounce
> pages and back.
>
> With this feature, some existing application such as SPDK
> and DPDK can leverage the datapath of VDUSE directly and
> efficiently as discussed before [1][2]. They can register
> some preallocated hugepages to VDUSE to avoid an extra
> memcpy from bounce-buffer to hugepages.
This is really interesting.
But a small concern on uAPI is that this seems to expose the VDUSE
internal implementation (bounce buffer) to userspace. We tried hard to
hide it via the GET_FD before. Anyway can we keep it?
Thanks
>
> The kernel and userspace codes could be found in github:
>
> https://github.com/bytedance/linux/tree/vduse-umem
> https://github.com/bytedance/qemu/tree/vduse-umem
>
> To test it with qemu-storage-daemon:
>
> $ qemu-storage-daemon \
> --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server=on,wait=off \
> --monitor chardev=charmonitor \
> --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
> --export type=vduse-blk,id=vduse-test,name=vduse-test,node-name=disk0,writable=on
>
> [1] https://lkml.org/lkml/2021/6/27/318
> [2] https://lkml.org/lkml/2022/7/4/246
>
> Please review, thanks!
>
> V1 to V2:
> - Drop the patch that updating API version [MST]
> - Replace unpin_user_pages() with unpin_user_pages_dirty_lock() [MST]
> - Use __vmalloc(__GFP_ACCOUNT) for memory accounting [MST]
>
> Xie Yongji (5):
> vduse: Remove unnecessary spin lock protection
> vduse: Use memcpy_{to,from}_page() in do_bounce()
> vduse: Support using userspace pages as bounce buffer
> vduse: Support querying IOLTB information
> vduse: Support registering userspace memory for IOTLB
>
> drivers/vdpa/vdpa_user/iova_domain.c | 134 ++++++++++++++++++++---
> drivers/vdpa/vdpa_user/iova_domain.h | 9 ++
> drivers/vdpa/vdpa_user/vduse_dev.c | 152 +++++++++++++++++++++++++++
> include/uapi/linux/vduse.h | 45 ++++++++
> 4 files changed, 327 insertions(+), 13 deletions(-)
>
> --
> 2.20.1
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer
[not found] ` <CACycT3u3kOzzQjKBYNAB5vtpgcmPg7FjJ5yTYMtQo0SJVrBmZg@mail.gmail.com>
@ 2022-07-08 8:38 ` Jason Wang
[not found] ` <CACycT3sNnmV8jrnjFkft6oST_6SGLc43f8Y4ZpomkPeOsvsorQ@mail.gmail.com>
0 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2022-07-08 8:38 UTC (permalink / raw)
To: Yongji Xie
Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
Stefan Hajnoczi, songmuchun
On Wed, Jul 6, 2022 at 6:16 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>
> On Wed, Jul 6, 2022 at 5:30 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@bytedance.com> wrote:
> > >
> > > Hi all,
> > >
> > > This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> > > VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> > > registering and de-registering userspace memory for IOTLB
> > > as bounce buffer in virtio-vdpa case.
> > >
> > > The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOLTB
> > > information such as bounce buffer size. Then user can use
> > > those information on VDUSE_IOTLB_REG_UMEM and
> > > VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> > > userspace memory for IOTLB.
> > >
> > > During registering and de-registering, the DMA data in use
> > > would be copied from kernel bounce pages to userspace bounce
> > > pages and back.
> > >
> > > With this feature, some existing application such as SPDK
> > > and DPDK can leverage the datapath of VDUSE directly and
> > > efficiently as discussed before [1][2]. They can register
> > > some preallocated hugepages to VDUSE to avoid an extra
> > > memcpy from bounce-buffer to hugepages.
> >
> > This is really interesting.
> >
> > But a small concern on uAPI is that this seems to expose the VDUSE
> > internal implementation (bounce buffer) to userspace. We tried hard to
> > hide it via the GET_FD before. Anyway can we keep it?
> >
>
> Another way is changing GET_FD ioctl to add a flag or reuse 'perm'
> field to indicate whether a IOVA region supports userspace memory
> registration. Then userspace can use
> VDUSE_IOTLB_REG_UMEM/VDUSE_IOTLB_DEREG_UMEM to register/deregister
> userspace memory for this IOVA region.
Looks better.
> Any suggestions?
I wonder what's the value of keeping the compatibility with the kernel
mmaped bounce buffer. It means we need to take extra care on e.g data
copying when reg/reg user space memory.
Can we simply allow the third kind of fd that only works for umem registration?
Thanks
>
> Thanks,
> Yongji
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer
[not found] ` <CACycT3sNnmV8jrnjFkft6oST_6SGLc43f8Y4ZpomkPeOsvsorQ@mail.gmail.com>
@ 2022-07-11 6:02 ` Jason Wang
[not found] ` <CACycT3vgaOrLVq+GDRK1PqqBRCkUAU0bYH=2CDvudsX0F9FBDA@mail.gmail.com>
0 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2022-07-11 6:02 UTC (permalink / raw)
To: Yongji Xie
Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
Stefan Hajnoczi, songmuchun
On Fri, Jul 8, 2022 at 5:53 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>
> On Fri, Jul 8, 2022 at 4:38 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jul 6, 2022 at 6:16 PM Yongji Xie <xieyongji@bytedance.com> wrote:
> > >
> > > On Wed, Jul 6, 2022 at 5:30 PM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@bytedance.com> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> > > > > VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> > > > > registering and de-registering userspace memory for IOTLB
> > > > > as bounce buffer in virtio-vdpa case.
> > > > >
> > > > > The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOLTB
> > > > > information such as bounce buffer size. Then user can use
> > > > > those information on VDUSE_IOTLB_REG_UMEM and
> > > > > VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> > > > > userspace memory for IOTLB.
> > > > >
> > > > > During registering and de-registering, the DMA data in use
> > > > > would be copied from kernel bounce pages to userspace bounce
> > > > > pages and back.
> > > > >
> > > > > With this feature, some existing application such as SPDK
> > > > > and DPDK can leverage the datapath of VDUSE directly and
> > > > > efficiently as discussed before [1][2]. They can register
> > > > > some preallocated hugepages to VDUSE to avoid an extra
> > > > > memcpy from bounce-buffer to hugepages.
> > > >
> > > > This is really interesting.
> > > >
> > > > But a small concern on uAPI is that this seems to expose the VDUSE
> > > > internal implementation (bounce buffer) to userspace. We tried hard to
> > > > hide it via the GET_FD before. Anyway can we keep it?
> > > >
> > >
> > > Another way is changing GET_FD ioctl to add a flag or reuse 'perm'
> > > field to indicate whether a IOVA region supports userspace memory
> > > registration. Then userspace can use
> > > VDUSE_IOTLB_REG_UMEM/VDUSE_IOTLB_DEREG_UMEM to register/deregister
> > > userspace memory for this IOVA region.
> >
> > Looks better.
> >
>
> OK.
>
> > > Any suggestions?
> >
> > I wonder what's the value of keeping the compatibility with the kernel
> > mmaped bounce buffer. It means we need to take extra care on e.g data
> > copying when reg/reg user space memory.
> >
>
> I'm not sure I get your point on the compatibility with the kernel
> bounce buffer. Do you mean they use the same iova region?
Yes.
>
> The userspace daemon might crash or reboot. In those cases, we still
> need a kernel buffer to store/recover the data.
Yes, this should be a good point.
>
> > Can we simply allow the third kind of fd that only works for umem registration?
> >
>
> Do you mean using another iova region for umem?
I meant having a new kind of fd that only allows umem registration.
>I think we don't need
> a fd in umem case since the userspace daemon can access the memory
> directly without using mmap() to map it into the address space in
> advance.
Ok, I will have a look at the code and get back.
Thanks
>
> Thanks,
> Yongji
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection
[not found] ` <20220706050503.171-2-xieyongji@bytedance.com>
@ 2022-07-13 5:43 ` Jason Wang
[not found] ` <CACycT3sNt_PcPQ__KVSqV6xd=+z5+gbMqWhu6H3vcj_fTGzUsw@mail.gmail.com>
2022-07-13 5:57 ` Michael S. Tsirkin
1 sibling, 1 reply; 9+ messages in thread
From: Jason Wang @ 2022-07-13 5:43 UTC (permalink / raw)
To: Xie Yongji, mst, xiaodong.liu, maxime.coquelin, stefanha
Cc: virtualization, linux-kernel, songmuchun
在 2022/7/6 13:04, Xie Yongji 写道:
> Taking iotlb lock to access bounce page in page fault
> handler is meaningless since vduse_domain_free_bounce_pages()
> would only be called during file release.
>
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> ---
> drivers/vdpa/vdpa_user/iova_domain.c | 7 ++-----
> 1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index 6daa3978d290..bca1f0b8850c 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -211,17 +211,14 @@ static struct page *
> vduse_domain_get_bounce_page(struct vduse_iova_domain *domain, u64 iova)
> {
> struct vduse_bounce_map *map;
> - struct page *page = NULL;
> + struct page *page;
>
> - spin_lock(&domain->iotlb_lock);
> map = &domain->bounce_maps[iova >> PAGE_SHIFT];
> if (!map->bounce_page)
> - goto out;
> + return NULL;
Interesting, I wonder why we don't serialize with
vduse_domain_map_bounce_page() with iotlb_lock?
Thanks
>
> page = map->bounce_page;
> get_page(page);
> -out:
> - spin_unlock(&domain->iotlb_lock);
>
> return page;
> }
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection
[not found] ` <20220706050503.171-2-xieyongji@bytedance.com>
2022-07-13 5:43 ` [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection Jason Wang
@ 2022-07-13 5:57 ` Michael S. Tsirkin
1 sibling, 0 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2022-07-13 5:57 UTC (permalink / raw)
To: Xie Yongji
Cc: xiaodong.liu, linux-kernel, virtualization, maxime.coquelin,
stefanha, songmuchun
On Wed, Jul 06, 2022 at 01:04:59PM +0800, Xie Yongji wrote:
> Taking iotlb lock to access bounce page in page fault
> handler is meaningless since vduse_domain_free_bounce_pages()
> would only be called during file release.
>
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
vduse_domain_free_bounce_pages is not the
only one taking this lock. This commit log needs more
analysis documenting all points of access to bounce_maps
and why vduse_domain_get_bounce_page and file
release are the only two.
> ---
> drivers/vdpa/vdpa_user/iova_domain.c | 7 ++-----
> 1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index 6daa3978d290..bca1f0b8850c 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -211,17 +211,14 @@ static struct page *
> vduse_domain_get_bounce_page(struct vduse_iova_domain *domain, u64 iova)
> {
> struct vduse_bounce_map *map;
> - struct page *page = NULL;
> + struct page *page;
>
> - spin_lock(&domain->iotlb_lock);
> map = &domain->bounce_maps[iova >> PAGE_SHIFT];
> if (!map->bounce_page)
> - goto out;
> + return NULL;
>
> page = map->bounce_page;
> get_page(page);
> -out:
> - spin_unlock(&domain->iotlb_lock);
>
> return page;
> }
> --
> 2.20.1
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/5] vduse: Use memcpy_{to,from}_page() in do_bounce()
[not found] ` <20220706050503.171-3-xieyongji@bytedance.com>
@ 2022-07-13 5:59 ` Jason Wang
0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2022-07-13 5:59 UTC (permalink / raw)
To: Xie Yongji, mst, xiaodong.liu, maxime.coquelin, stefanha
Cc: virtualization, linux-kernel, songmuchun
在 2022/7/6 13:05, Xie Yongji 写道:
> kmap_atomic() is being deprecated in favor of kmap_local_page().
>
> The use of kmap_atomic() in do_bounce() is all thread local therefore
> kmap_local_page() is a sufficient replacement.
>
> Convert to kmap_local_page() but, instead of open coding it,
> use the helpers memcpy_to_page() and memcpy_from_page().
>
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/vdpa/vdpa_user/iova_domain.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index bca1f0b8850c..50d7c08d5450 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -138,18 +138,17 @@ static void do_bounce(phys_addr_t orig, void *addr, size_t size,
> {
> unsigned long pfn = PFN_DOWN(orig);
> unsigned int offset = offset_in_page(orig);
> - char *buffer;
> + struct page *page;
> unsigned int sz = 0;
>
> while (size) {
> sz = min_t(size_t, PAGE_SIZE - offset, size);
>
> - buffer = kmap_atomic(pfn_to_page(pfn));
> + page = pfn_to_page(pfn);
> if (dir == DMA_TO_DEVICE)
> - memcpy(addr, buffer + offset, sz);
> + memcpy_from_page(addr, page, offset, sz);
> else
> - memcpy(buffer + offset, addr, sz);
> - kunmap_atomic(buffer);
> + memcpy_to_page(page, offset, addr, sz);
>
> size -= sz;
> pfn++;
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection
[not found] ` <CACycT3sNt_PcPQ__KVSqV6xd=+z5+gbMqWhu6H3vcj_fTGzUsw@mail.gmail.com>
@ 2022-07-14 2:27 ` Jason Wang
0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2022-07-14 2:27 UTC (permalink / raw)
To: Yongji Xie
Cc: Michael S. Tsirkin, Liu Xiaodong, linux-kernel, virtualization,
Maxime Coquelin, Stefan Hajnoczi, songmuchun
On Wed, Jul 13, 2022 at 7:09 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>
> On Wed, Jul 13, 2022 at 1:44 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > 在 2022/7/6 13:04, Xie Yongji 写道:
> > > Taking iotlb lock to access bounce page in page fault
> > > handler is meaningless since vduse_domain_free_bounce_pages()
> > > would only be called during file release.
> > >
> > > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> > > ---
> > > drivers/vdpa/vdpa_user/iova_domain.c | 7 ++-----
> > > 1 file changed, 2 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> > > index 6daa3978d290..bca1f0b8850c 100644
> > > --- a/drivers/vdpa/vdpa_user/iova_domain.c
> > > +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> > > @@ -211,17 +211,14 @@ static struct page *
> > > vduse_domain_get_bounce_page(struct vduse_iova_domain *domain, u64 iova)
> > > {
> > > struct vduse_bounce_map *map;
> > > - struct page *page = NULL;
> > > + struct page *page;
> > >
> > > - spin_lock(&domain->iotlb_lock);
> > > map = &domain->bounce_maps[iova >> PAGE_SHIFT];
> > > if (!map->bounce_page)
> > > - goto out;
> > > + return NULL;
> >
> >
> > Interesting, I wonder why we don't serialize with
> > vduse_domain_map_bounce_page() with iotlb_lock?
> >
>
> Userspace should only access the bounce page after we set up the dma
> mapping, so we don't need serialization from the iotlb_lock in this
> case.
What about the buggy/malicious user space that tries to access those
pages before or just in the middle of it has been mapped?
> And vduse_domain_map_bounce_page() only sets the
> map->bounce_page rather than clears the map->bounce_page, we would not
> have any problem without the lock protection.
Probably, I see an assignment of orig_phys after the alloc_page() but
it seems only used in bouncing which will only be called by dma ops.
At least we'd better have a comment to explain the synchronization
here.
Thanks
>
> Thanks,
> Yongji
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 4/5] vduse: Support querying IOLTB information
[not found] ` <20220706050503.171-5-xieyongji@bytedance.com>
@ 2022-07-14 2:51 ` Jason Wang
0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2022-07-14 2:51 UTC (permalink / raw)
To: Xie Yongji
Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
Stefan Hajnoczi, songmuchun
On Wed, Jul 6, 2022 at 1:06 PM Xie Yongji <xieyongji@bytedance.com> wrote:
>
> This introduces a new ioctl: VDUSE_IOTLB_GET_INFO to
> support querying IOLTB information such as bounce
> buffer size.
>
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> ---
> drivers/vdpa/vdpa_user/vduse_dev.c | 13 +++++++++++++
> include/uapi/linux/vduse.h | 17 +++++++++++++++++
> 2 files changed, 30 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> index 3bc27de58f46..c47a5d9765cf 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -1089,6 +1089,19 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
> ret = vduse_dev_queue_irq_work(dev, &dev->vqs[index].inject);
> break;
> }
> + case VDUSE_IOTLB_GET_INFO: {
As discussed, it's better not to expose the VDUSE internal structure
like "IOTLB" in the name.
We probably need to extend GET_FD or have a new ioctl like GET_FD_INFO.
Thanks
> + struct vduse_iotlb_info iotlb;
> +
> + iotlb.bounce_iova = 0;
> + iotlb.bounce_size = dev->domain->bounce_size;
> +
> + ret = -EFAULT;
> + if (copy_to_user(argp, &iotlb, sizeof(iotlb)))
> + break;
> +
> + ret = 0;
> + break;
> + }
> default:
> ret = -ENOIOCTLCMD;
> break;
> diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> index 7cfe1c1280c0..c201b7a77c2c 100644
> --- a/include/uapi/linux/vduse.h
> +++ b/include/uapi/linux/vduse.h
> @@ -210,6 +210,23 @@ struct vduse_vq_eventfd {
> */
> #define VDUSE_VQ_INJECT_IRQ _IOW(VDUSE_BASE, 0x17, __u32)
>
> +/**
> + * struct vduse_iotlb_info - IOTLB information
> + * @bounce_iova: start IOVA of bounce buffer
> + * @bounce_size: bounce buffer size
> + * @reserved: for future use, needs to be initialized to zero
> + *
> + * Structure used by VDUSE_IOTLB_GET_INFO ioctl to get IOTLB information.
> + */
> +struct vduse_iotlb_info {
> + __u64 bounce_iova;
> + __u64 bounce_size;
> + __u64 reserved[2];
> +};
> +
> +/* Get IOTLB information, e.g. bounce buffer size */
> +#define VDUSE_IOTLB_GET_INFO _IOR(VDUSE_BASE, 0x18, struct vduse_iotlb_info)
> +
> /* The control messages definition for read(2)/write(2) on /dev/vduse/$NAME */
>
> /**
> --
> 2.20.1
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer
[not found] ` <CACycT3vgaOrLVq+GDRK1PqqBRCkUAU0bYH=2CDvudsX0F9FBDA@mail.gmail.com>
@ 2022-07-14 2:59 ` Jason Wang
0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2022-07-14 2:59 UTC (permalink / raw)
To: Yongji Xie
Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
Stefan Hajnoczi, songmuchun
在 2022/7/11 15:24, Yongji Xie 写道:
> On Mon, Jul 11, 2022 at 2:02 PM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Jul 8, 2022 at 5:53 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>>> On Fri, Jul 8, 2022 at 4:38 PM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Wed, Jul 6, 2022 at 6:16 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>>>>> On Wed, Jul 6, 2022 at 5:30 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@bytedance.com> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
>>>>>>> VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
>>>>>>> registering and de-registering userspace memory for IOTLB
>>>>>>> as bounce buffer in virtio-vdpa case.
>>>>>>>
>>>>>>> The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOLTB
>>>>>>> information such as bounce buffer size. Then user can use
>>>>>>> those information on VDUSE_IOTLB_REG_UMEM and
>>>>>>> VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
>>>>>>> userspace memory for IOTLB.
>>>>>>>
>>>>>>> During registering and de-registering, the DMA data in use
>>>>>>> would be copied from kernel bounce pages to userspace bounce
>>>>>>> pages and back.
>>>>>>>
>>>>>>> With this feature, some existing application such as SPDK
>>>>>>> and DPDK can leverage the datapath of VDUSE directly and
>>>>>>> efficiently as discussed before [1][2]. They can register
>>>>>>> some preallocated hugepages to VDUSE to avoid an extra
>>>>>>> memcpy from bounce-buffer to hugepages.
>>>>>> This is really interesting.
>>>>>>
>>>>>> But a small concern on uAPI is that this seems to expose the VDUSE
>>>>>> internal implementation (bounce buffer) to userspace. We tried hard to
>>>>>> hide it via the GET_FD before. Anyway can we keep it?
>>>>>>
>>>>> Another way is changing GET_FD ioctl to add a flag or reuse 'perm'
>>>>> field to indicate whether a IOVA region supports userspace memory
>>>>> registration. Then userspace can use
>>>>> VDUSE_IOTLB_REG_UMEM/VDUSE_IOTLB_DEREG_UMEM to register/deregister
>>>>> userspace memory for this IOVA region.
>>>> Looks better.
>>>>
>>> OK.
>>>
>>>>> Any suggestions?
>>>> I wonder what's the value of keeping the compatibility with the kernel
>>>> mmaped bounce buffer. It means we need to take extra care on e.g data
>>>> copying when reg/reg user space memory.
>>>>
>>> I'm not sure I get your point on the compatibility with the kernel
>>> bounce buffer. Do you mean they use the same iova region?
>> Yes.
>>
>>> The userspace daemon might crash or reboot. In those cases, we still
>>> need a kernel buffer to store/recover the data.
>> Yes, this should be a good point.
>>
>>>> Can we simply allow the third kind of fd that only works for umem registration?
>>>>
>>> Do you mean using another iova region for umem?
>> I meant having a new kind of fd that only allows umem registration.
>>
> OK. It seems to be a little complicated to allow mapping a registered
> user memory via a new fd, e.g. how to handle the mapping if the
> userspace daemon exits but the fd is already passed to another
> process.
>
>>> I think we don't need
>>> a fd in umem case since the userspace daemon can access the memory
>>> directly without using mmap() to map it into the address space in
>>> advance.
>> Ok, I will have a look at the code and get back.
>>
> OK. Looking forward to your reply.
Looks good overall.
Just few comments.
Thanks
>
> Thanks,
> Yongji
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-07-14 2:59 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20220706050503.171-1-xieyongji@bytedance.com>
2022-07-06 9:30 ` [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer Jason Wang
[not found] ` <CACycT3u3kOzzQjKBYNAB5vtpgcmPg7FjJ5yTYMtQo0SJVrBmZg@mail.gmail.com>
2022-07-08 8:38 ` Jason Wang
[not found] ` <CACycT3sNnmV8jrnjFkft6oST_6SGLc43f8Y4ZpomkPeOsvsorQ@mail.gmail.com>
2022-07-11 6:02 ` Jason Wang
[not found] ` <CACycT3vgaOrLVq+GDRK1PqqBRCkUAU0bYH=2CDvudsX0F9FBDA@mail.gmail.com>
2022-07-14 2:59 ` Jason Wang
[not found] ` <20220706050503.171-2-xieyongji@bytedance.com>
2022-07-13 5:43 ` [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection Jason Wang
[not found] ` <CACycT3sNt_PcPQ__KVSqV6xd=+z5+gbMqWhu6H3vcj_fTGzUsw@mail.gmail.com>
2022-07-14 2:27 ` Jason Wang
2022-07-13 5:57 ` Michael S. Tsirkin
[not found] ` <20220706050503.171-3-xieyongji@bytedance.com>
2022-07-13 5:59 ` [PATCH v2 2/5] vduse: Use memcpy_{to,from}_page() in do_bounce() Jason Wang
[not found] ` <20220706050503.171-5-xieyongji@bytedance.com>
2022-07-14 2:51 ` [PATCH v2 4/5] vduse: Support querying IOLTB information Jason Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).