virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer
       [not found] <20220706050503.171-1-xieyongji@bytedance.com>
@ 2022-07-06  9:30 ` Jason Wang
       [not found]   ` <CACycT3u3kOzzQjKBYNAB5vtpgcmPg7FjJ5yTYMtQo0SJVrBmZg@mail.gmail.com>
       [not found] ` <20220706050503.171-2-xieyongji@bytedance.com>
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2022-07-06  9:30 UTC (permalink / raw)
  To: Xie Yongji
  Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
	Stefan Hajnoczi, songmuchun

On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@bytedance.com> wrote:
>
> Hi all,
>
> This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> registering and de-registering userspace memory for IOTLB
> as bounce buffer in virtio-vdpa case.
>
> The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOLTB
> information such as bounce buffer size. Then user can use
> those information on VDUSE_IOTLB_REG_UMEM and
> VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> userspace memory for IOTLB.
>
> During registering and de-registering, the DMA data in use
> would be copied from kernel bounce pages to userspace bounce
> pages and back.
>
> With this feature, some existing application such as SPDK
> and DPDK can leverage the datapath of VDUSE directly and
> efficiently as discussed before [1][2]. They can register
> some preallocated hugepages to VDUSE to avoid an extra
> memcpy from bounce-buffer to hugepages.

This is really interesting.

But a small concern on uAPI is that this seems to expose the VDUSE
internal implementation (bounce buffer) to userspace. We tried hard to
hide it via the GET_FD before. Anyway can we keep it?

Thanks

>
> The kernel and userspace codes could be found in github:
>
> https://github.com/bytedance/linux/tree/vduse-umem
> https://github.com/bytedance/qemu/tree/vduse-umem
>
> To test it with qemu-storage-daemon:
>
> $ qemu-storage-daemon \
>     --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server=on,wait=off \
>     --monitor chardev=charmonitor \
>     --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
>     --export type=vduse-blk,id=vduse-test,name=vduse-test,node-name=disk0,writable=on
>
> [1] https://lkml.org/lkml/2021/6/27/318
> [2] https://lkml.org/lkml/2022/7/4/246
>
> Please review, thanks!
>
> V1 to V2:
> - Drop the patch that updating API version [MST]
> - Replace unpin_user_pages() with unpin_user_pages_dirty_lock() [MST]
> - Use __vmalloc(__GFP_ACCOUNT) for memory accounting [MST]
>
> Xie Yongji (5):
>   vduse: Remove unnecessary spin lock protection
>   vduse: Use memcpy_{to,from}_page() in do_bounce()
>   vduse: Support using userspace pages as bounce buffer
>   vduse: Support querying IOLTB information
>   vduse: Support registering userspace memory for IOTLB
>
>  drivers/vdpa/vdpa_user/iova_domain.c | 134 ++++++++++++++++++++---
>  drivers/vdpa/vdpa_user/iova_domain.h |   9 ++
>  drivers/vdpa/vdpa_user/vduse_dev.c   | 152 +++++++++++++++++++++++++++
>  include/uapi/linux/vduse.h           |  45 ++++++++
>  4 files changed, 327 insertions(+), 13 deletions(-)
>
> --
> 2.20.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer
       [not found]   ` <CACycT3u3kOzzQjKBYNAB5vtpgcmPg7FjJ5yTYMtQo0SJVrBmZg@mail.gmail.com>
@ 2022-07-08  8:38     ` Jason Wang
       [not found]       ` <CACycT3sNnmV8jrnjFkft6oST_6SGLc43f8Y4ZpomkPeOsvsorQ@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2022-07-08  8:38 UTC (permalink / raw)
  To: Yongji Xie
  Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
	Stefan Hajnoczi, songmuchun

On Wed, Jul 6, 2022 at 6:16 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>
> On Wed, Jul 6, 2022 at 5:30 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@bytedance.com> wrote:
> > >
> > > Hi all,
> > >
> > > This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> > > VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> > > registering and de-registering userspace memory for IOTLB
> > > as bounce buffer in virtio-vdpa case.
> > >
> > > The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOLTB
> > > information such as bounce buffer size. Then user can use
> > > those information on VDUSE_IOTLB_REG_UMEM and
> > > VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> > > userspace memory for IOTLB.
> > >
> > > During registering and de-registering, the DMA data in use
> > > would be copied from kernel bounce pages to userspace bounce
> > > pages and back.
> > >
> > > With this feature, some existing application such as SPDK
> > > and DPDK can leverage the datapath of VDUSE directly and
> > > efficiently as discussed before [1][2]. They can register
> > > some preallocated hugepages to VDUSE to avoid an extra
> > > memcpy from bounce-buffer to hugepages.
> >
> > This is really interesting.
> >
> > But a small concern on uAPI is that this seems to expose the VDUSE
> > internal implementation (bounce buffer) to userspace. We tried hard to
> > hide it via the GET_FD before. Anyway can we keep it?
> >
>
> Another way is changing GET_FD ioctl to add a flag or reuse 'perm'
> field to indicate whether a IOVA region supports userspace memory
> registration. Then userspace can use
> VDUSE_IOTLB_REG_UMEM/VDUSE_IOTLB_DEREG_UMEM to register/deregister
> userspace memory for this IOVA region.

Looks better.

> Any suggestions?

I wonder what's the value of keeping the compatibility with the kernel
mmaped bounce buffer. It means we need to take extra care on e.g data
copying when reg/reg user space memory.

Can we simply allow the third kind of fd that only works for umem registration?

Thanks

>
> Thanks,
> Yongji
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer
       [not found]       ` <CACycT3sNnmV8jrnjFkft6oST_6SGLc43f8Y4ZpomkPeOsvsorQ@mail.gmail.com>
@ 2022-07-11  6:02         ` Jason Wang
       [not found]           ` <CACycT3vgaOrLVq+GDRK1PqqBRCkUAU0bYH=2CDvudsX0F9FBDA@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2022-07-11  6:02 UTC (permalink / raw)
  To: Yongji Xie
  Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
	Stefan Hajnoczi, songmuchun

On Fri, Jul 8, 2022 at 5:53 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>
> On Fri, Jul 8, 2022 at 4:38 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jul 6, 2022 at 6:16 PM Yongji Xie <xieyongji@bytedance.com> wrote:
> > >
> > > On Wed, Jul 6, 2022 at 5:30 PM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@bytedance.com> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> > > > > VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> > > > > registering and de-registering userspace memory for IOTLB
> > > > > as bounce buffer in virtio-vdpa case.
> > > > >
> > > > > The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOLTB
> > > > > information such as bounce buffer size. Then user can use
> > > > > those information on VDUSE_IOTLB_REG_UMEM and
> > > > > VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> > > > > userspace memory for IOTLB.
> > > > >
> > > > > During registering and de-registering, the DMA data in use
> > > > > would be copied from kernel bounce pages to userspace bounce
> > > > > pages and back.
> > > > >
> > > > > With this feature, some existing application such as SPDK
> > > > > and DPDK can leverage the datapath of VDUSE directly and
> > > > > efficiently as discussed before [1][2]. They can register
> > > > > some preallocated hugepages to VDUSE to avoid an extra
> > > > > memcpy from bounce-buffer to hugepages.
> > > >
> > > > This is really interesting.
> > > >
> > > > But a small concern on uAPI is that this seems to expose the VDUSE
> > > > internal implementation (bounce buffer) to userspace. We tried hard to
> > > > hide it via the GET_FD before. Anyway can we keep it?
> > > >
> > >
> > > Another way is changing GET_FD ioctl to add a flag or reuse 'perm'
> > > field to indicate whether a IOVA region supports userspace memory
> > > registration. Then userspace can use
> > > VDUSE_IOTLB_REG_UMEM/VDUSE_IOTLB_DEREG_UMEM to register/deregister
> > > userspace memory for this IOVA region.
> >
> > Looks better.
> >
>
> OK.
>
> > > Any suggestions?
> >
> > I wonder what's the value of keeping the compatibility with the kernel
> > mmaped bounce buffer. It means we need to take extra care on e.g data
> > copying when reg/reg user space memory.
> >
>
> I'm not sure I get your point on the compatibility with the kernel
> bounce buffer. Do you mean they use the same iova region?

Yes.

>
> The userspace daemon might crash or reboot. In those cases, we still
> need a kernel buffer to store/recover the data.

Yes, this should be a good point.

>
> > Can we simply allow the third kind of fd that only works for umem registration?
> >
>
> Do you mean using another iova region for umem?

I meant having a new kind of fd that only allows umem registration.

>I think we don't need
> a fd in umem case since the userspace daemon can access the memory
> directly without using mmap() to map it into the address space in
> advance.

Ok, I will have a look at the code and get back.

Thanks

>
> Thanks,
> Yongji
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection
       [not found] ` <20220706050503.171-2-xieyongji@bytedance.com>
@ 2022-07-13  5:43   ` Jason Wang
       [not found]     ` <CACycT3sNt_PcPQ__KVSqV6xd=+z5+gbMqWhu6H3vcj_fTGzUsw@mail.gmail.com>
  2022-07-13  5:57   ` Michael S. Tsirkin
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Wang @ 2022-07-13  5:43 UTC (permalink / raw)
  To: Xie Yongji, mst, xiaodong.liu, maxime.coquelin, stefanha
  Cc: virtualization, linux-kernel, songmuchun


在 2022/7/6 13:04, Xie Yongji 写道:
> Taking iotlb lock to access bounce page in page fault
> handler is meaningless since vduse_domain_free_bounce_pages()
> would only be called during file release.
>
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> ---
>   drivers/vdpa/vdpa_user/iova_domain.c | 7 ++-----
>   1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index 6daa3978d290..bca1f0b8850c 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -211,17 +211,14 @@ static struct page *
>   vduse_domain_get_bounce_page(struct vduse_iova_domain *domain, u64 iova)
>   {
>   	struct vduse_bounce_map *map;
> -	struct page *page = NULL;
> +	struct page *page;
>   
> -	spin_lock(&domain->iotlb_lock);
>   	map = &domain->bounce_maps[iova >> PAGE_SHIFT];
>   	if (!map->bounce_page)
> -		goto out;
> +		return NULL;


Interesting, I wonder why we don't serialize with 
vduse_domain_map_bounce_page() with iotlb_lock?

Thanks


>   
>   	page = map->bounce_page;
>   	get_page(page);
> -out:
> -	spin_unlock(&domain->iotlb_lock);
>   
>   	return page;
>   }

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection
       [not found] ` <20220706050503.171-2-xieyongji@bytedance.com>
  2022-07-13  5:43   ` [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection Jason Wang
@ 2022-07-13  5:57   ` Michael S. Tsirkin
  1 sibling, 0 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2022-07-13  5:57 UTC (permalink / raw)
  To: Xie Yongji
  Cc: xiaodong.liu, linux-kernel, virtualization, maxime.coquelin,
	stefanha, songmuchun

On Wed, Jul 06, 2022 at 01:04:59PM +0800, Xie Yongji wrote:
> Taking iotlb lock to access bounce page in page fault
> handler is meaningless since vduse_domain_free_bounce_pages()
> would only be called during file release.
> 
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>

vduse_domain_free_bounce_pages is not the
only one taking this lock. This commit log needs more
analysis documenting all points of access to bounce_maps
and why vduse_domain_get_bounce_page and file
release are the only two.

> ---
>  drivers/vdpa/vdpa_user/iova_domain.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index 6daa3978d290..bca1f0b8850c 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -211,17 +211,14 @@ static struct page *
>  vduse_domain_get_bounce_page(struct vduse_iova_domain *domain, u64 iova)
>  {
>  	struct vduse_bounce_map *map;
> -	struct page *page = NULL;
> +	struct page *page;
>  
> -	spin_lock(&domain->iotlb_lock);
>  	map = &domain->bounce_maps[iova >> PAGE_SHIFT];
>  	if (!map->bounce_page)
> -		goto out;
> +		return NULL;
>  
>  	page = map->bounce_page;
>  	get_page(page);
> -out:
> -	spin_unlock(&domain->iotlb_lock);
>  
>  	return page;
>  }
> -- 
> 2.20.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/5] vduse: Use memcpy_{to,from}_page() in do_bounce()
       [not found] ` <20220706050503.171-3-xieyongji@bytedance.com>
@ 2022-07-13  5:59   ` Jason Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2022-07-13  5:59 UTC (permalink / raw)
  To: Xie Yongji, mst, xiaodong.liu, maxime.coquelin, stefanha
  Cc: virtualization, linux-kernel, songmuchun


在 2022/7/6 13:05, Xie Yongji 写道:
> kmap_atomic() is being deprecated in favor of kmap_local_page().
>
> The use of kmap_atomic() in do_bounce() is all thread local therefore
> kmap_local_page() is a sufficient replacement.
>
> Convert to kmap_local_page() but, instead of open coding it,
> use the helpers memcpy_to_page() and memcpy_from_page().
>
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/vdpa_user/iova_domain.c | 9 ++++-----
>   1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index bca1f0b8850c..50d7c08d5450 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -138,18 +138,17 @@ static void do_bounce(phys_addr_t orig, void *addr, size_t size,
>   {
>   	unsigned long pfn = PFN_DOWN(orig);
>   	unsigned int offset = offset_in_page(orig);
> -	char *buffer;
> +	struct page *page;
>   	unsigned int sz = 0;
>   
>   	while (size) {
>   		sz = min_t(size_t, PAGE_SIZE - offset, size);
>   
> -		buffer = kmap_atomic(pfn_to_page(pfn));
> +		page = pfn_to_page(pfn);
>   		if (dir == DMA_TO_DEVICE)
> -			memcpy(addr, buffer + offset, sz);
> +			memcpy_from_page(addr, page, offset, sz);
>   		else
> -			memcpy(buffer + offset, addr, sz);
> -		kunmap_atomic(buffer);
> +			memcpy_to_page(page, offset, addr, sz);
>   
>   		size -= sz;
>   		pfn++;

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection
       [not found]     ` <CACycT3sNt_PcPQ__KVSqV6xd=+z5+gbMqWhu6H3vcj_fTGzUsw@mail.gmail.com>
@ 2022-07-14  2:27       ` Jason Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2022-07-14  2:27 UTC (permalink / raw)
  To: Yongji Xie
  Cc: Michael S. Tsirkin, Liu Xiaodong, linux-kernel, virtualization,
	Maxime Coquelin, Stefan Hajnoczi, songmuchun

On Wed, Jul 13, 2022 at 7:09 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>
> On Wed, Jul 13, 2022 at 1:44 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > 在 2022/7/6 13:04, Xie Yongji 写道:
> > > Taking iotlb lock to access bounce page in page fault
> > > handler is meaningless since vduse_domain_free_bounce_pages()
> > > would only be called during file release.
> > >
> > > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> > > ---
> > >   drivers/vdpa/vdpa_user/iova_domain.c | 7 ++-----
> > >   1 file changed, 2 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> > > index 6daa3978d290..bca1f0b8850c 100644
> > > --- a/drivers/vdpa/vdpa_user/iova_domain.c
> > > +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> > > @@ -211,17 +211,14 @@ static struct page *
> > >   vduse_domain_get_bounce_page(struct vduse_iova_domain *domain, u64 iova)
> > >   {
> > >       struct vduse_bounce_map *map;
> > > -     struct page *page = NULL;
> > > +     struct page *page;
> > >
> > > -     spin_lock(&domain->iotlb_lock);
> > >       map = &domain->bounce_maps[iova >> PAGE_SHIFT];
> > >       if (!map->bounce_page)
> > > -             goto out;
> > > +             return NULL;
> >
> >
> > Interesting, I wonder why we don't serialize with
> > vduse_domain_map_bounce_page() with iotlb_lock?
> >
>
> Userspace should only access the bounce page after we set up the dma
> mapping, so we don't need serialization from the iotlb_lock in this
> case.

What about the buggy/malicious user space that tries to access those
pages before or just in the middle of it has been mapped?

> And vduse_domain_map_bounce_page() only sets the
> map->bounce_page rather than clears the map->bounce_page, we would not
> have any problem without the lock protection.

Probably, I see an assignment of orig_phys after the alloc_page() but
it seems only used in bouncing which will only be called by dma ops.
At least we'd better have a comment to explain the synchronization
here.

Thanks

>
> Thanks,
> Yongji
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 4/5] vduse: Support querying IOLTB information
       [not found] ` <20220706050503.171-5-xieyongji@bytedance.com>
@ 2022-07-14  2:51   ` Jason Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2022-07-14  2:51 UTC (permalink / raw)
  To: Xie Yongji
  Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
	Stefan Hajnoczi, songmuchun

On Wed, Jul 6, 2022 at 1:06 PM Xie Yongji <xieyongji@bytedance.com> wrote:
>
> This introduces a new ioctl: VDUSE_IOTLB_GET_INFO to
> support querying IOLTB information such as bounce
> buffer size.
>
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> ---
>  drivers/vdpa/vdpa_user/vduse_dev.c | 13 +++++++++++++
>  include/uapi/linux/vduse.h         | 17 +++++++++++++++++
>  2 files changed, 30 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> index 3bc27de58f46..c47a5d9765cf 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -1089,6 +1089,19 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
>                 ret = vduse_dev_queue_irq_work(dev, &dev->vqs[index].inject);
>                 break;
>         }
> +       case VDUSE_IOTLB_GET_INFO: {

As discussed, it's better not to expose the VDUSE internal structure
like "IOTLB" in the name.

We probably need to extend GET_FD or have a new ioctl like GET_FD_INFO.

Thanks

> +               struct vduse_iotlb_info iotlb;
> +
> +               iotlb.bounce_iova = 0;
> +               iotlb.bounce_size = dev->domain->bounce_size;
> +
> +               ret = -EFAULT;
> +               if (copy_to_user(argp, &iotlb, sizeof(iotlb)))
> +                       break;
> +
> +               ret = 0;
> +               break;
> +       }
>         default:
>                 ret = -ENOIOCTLCMD;
>                 break;
> diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> index 7cfe1c1280c0..c201b7a77c2c 100644
> --- a/include/uapi/linux/vduse.h
> +++ b/include/uapi/linux/vduse.h
> @@ -210,6 +210,23 @@ struct vduse_vq_eventfd {
>   */
>  #define VDUSE_VQ_INJECT_IRQ    _IOW(VDUSE_BASE, 0x17, __u32)
>
> +/**
> + * struct vduse_iotlb_info - IOTLB information
> + * @bounce_iova: start IOVA of bounce buffer
> + * @bounce_size: bounce buffer size
> + * @reserved: for future use, needs to be initialized to zero
> + *
> + * Structure used by VDUSE_IOTLB_GET_INFO ioctl to get IOTLB information.
> + */
> +struct vduse_iotlb_info {
> +       __u64 bounce_iova;
> +       __u64 bounce_size;
> +       __u64 reserved[2];
> +};
> +
> +/* Get IOTLB information, e.g. bounce buffer size */
> +#define VDUSE_IOTLB_GET_INFO    _IOR(VDUSE_BASE, 0x18, struct vduse_iotlb_info)
> +
>  /* The control messages definition for read(2)/write(2) on /dev/vduse/$NAME */
>
>  /**
> --
> 2.20.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer
       [not found]           ` <CACycT3vgaOrLVq+GDRK1PqqBRCkUAU0bYH=2CDvudsX0F9FBDA@mail.gmail.com>
@ 2022-07-14  2:59             ` Jason Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2022-07-14  2:59 UTC (permalink / raw)
  To: Yongji Xie
  Cc: mst, Liu Xiaodong, linux-kernel, virtualization, Maxime Coquelin,
	Stefan Hajnoczi, songmuchun


在 2022/7/11 15:24, Yongji Xie 写道:
> On Mon, Jul 11, 2022 at 2:02 PM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Jul 8, 2022 at 5:53 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>>> On Fri, Jul 8, 2022 at 4:38 PM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Wed, Jul 6, 2022 at 6:16 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>>>>> On Wed, Jul 6, 2022 at 5:30 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@bytedance.com> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
>>>>>>> VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
>>>>>>> registering and de-registering userspace memory for IOTLB
>>>>>>> as bounce buffer in virtio-vdpa case.
>>>>>>>
>>>>>>> The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOLTB
>>>>>>> information such as bounce buffer size. Then user can use
>>>>>>> those information on VDUSE_IOTLB_REG_UMEM and
>>>>>>> VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
>>>>>>> userspace memory for IOTLB.
>>>>>>>
>>>>>>> During registering and de-registering, the DMA data in use
>>>>>>> would be copied from kernel bounce pages to userspace bounce
>>>>>>> pages and back.
>>>>>>>
>>>>>>> With this feature, some existing application such as SPDK
>>>>>>> and DPDK can leverage the datapath of VDUSE directly and
>>>>>>> efficiently as discussed before [1][2]. They can register
>>>>>>> some preallocated hugepages to VDUSE to avoid an extra
>>>>>>> memcpy from bounce-buffer to hugepages.
>>>>>> This is really interesting.
>>>>>>
>>>>>> But a small concern on uAPI is that this seems to expose the VDUSE
>>>>>> internal implementation (bounce buffer) to userspace. We tried hard to
>>>>>> hide it via the GET_FD before. Anyway can we keep it?
>>>>>>
>>>>> Another way is changing GET_FD ioctl to add a flag or reuse 'perm'
>>>>> field to indicate whether a IOVA region supports userspace memory
>>>>> registration. Then userspace can use
>>>>> VDUSE_IOTLB_REG_UMEM/VDUSE_IOTLB_DEREG_UMEM to register/deregister
>>>>> userspace memory for this IOVA region.
>>>> Looks better.
>>>>
>>> OK.
>>>
>>>>> Any suggestions?
>>>> I wonder what's the value of keeping the compatibility with the kernel
>>>> mmaped bounce buffer. It means we need to take extra care on e.g data
>>>> copying when reg/reg user space memory.
>>>>
>>> I'm not sure I get your point on the compatibility with the kernel
>>> bounce buffer. Do you mean they use the same iova region?
>> Yes.
>>
>>> The userspace daemon might crash or reboot. In those cases, we still
>>> need a kernel buffer to store/recover the data.
>> Yes, this should be a good point.
>>
>>>> Can we simply allow the third kind of fd that only works for umem registration?
>>>>
>>> Do you mean using another iova region for umem?
>> I meant having a new kind of fd that only allows umem registration.
>>
> OK. It seems to be a little complicated to allow mapping a registered
> user memory via a new fd, e.g. how to handle the mapping if the
> userspace daemon exits but the fd is already passed to another
> process.
>
>>> I think we don't need
>>> a fd in umem case since the userspace daemon can access the memory
>>> directly without using mmap() to map it into the address space in
>>> advance.
>> Ok, I will have a look at the code and get back.
>>
> OK. Looking forward to your reply.


Looks good overall.

Just few comments.

Thanks


>
> Thanks,
> Yongji
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-07-14  2:59 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20220706050503.171-1-xieyongji@bytedance.com>
2022-07-06  9:30 ` [PATCH v2 0/5] VDUSE: Support registering userspace memory as bounce buffer Jason Wang
     [not found]   ` <CACycT3u3kOzzQjKBYNAB5vtpgcmPg7FjJ5yTYMtQo0SJVrBmZg@mail.gmail.com>
2022-07-08  8:38     ` Jason Wang
     [not found]       ` <CACycT3sNnmV8jrnjFkft6oST_6SGLc43f8Y4ZpomkPeOsvsorQ@mail.gmail.com>
2022-07-11  6:02         ` Jason Wang
     [not found]           ` <CACycT3vgaOrLVq+GDRK1PqqBRCkUAU0bYH=2CDvudsX0F9FBDA@mail.gmail.com>
2022-07-14  2:59             ` Jason Wang
     [not found] ` <20220706050503.171-2-xieyongji@bytedance.com>
2022-07-13  5:43   ` [PATCH v2 1/5] vduse: Remove unnecessary spin lock protection Jason Wang
     [not found]     ` <CACycT3sNt_PcPQ__KVSqV6xd=+z5+gbMqWhu6H3vcj_fTGzUsw@mail.gmail.com>
2022-07-14  2:27       ` Jason Wang
2022-07-13  5:57   ` Michael S. Tsirkin
     [not found] ` <20220706050503.171-3-xieyongji@bytedance.com>
2022-07-13  5:59   ` [PATCH v2 2/5] vduse: Use memcpy_{to,from}_page() in do_bounce() Jason Wang
     [not found] ` <20220706050503.171-5-xieyongji@bytedance.com>
2022-07-14  2:51   ` [PATCH v2 4/5] vduse: Support querying IOLTB information Jason Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).