virtualization.lists.linux-foundation.org archive mirror
  • [parent not found: <20201019145623.671-3-xieyongji@bytedance.com>]
  • [parent not found: <20201019145623.671-2-xieyongji@bytedance.com>]
  • * Re: [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
           [not found] <20201019145623.671-1-xieyongji@bytedance.com>
                       ` (2 preceding siblings ...)
           [not found] ` <20201019145623.671-2-xieyongji@bytedance.com>
    @ 2020-10-19 17:16 ` Michael S. Tsirkin
           [not found]   ` <CACycT3vzpm_+v-DbqeVRMg8BRny_GoL2JxpbzYC3JYTMKGn_vg@mail.gmail.com>
      2020-10-20  3:20 ` Jason Wang
      4 siblings, 1 reply; 12+ messages in thread
    From: Michael S. Tsirkin @ 2020-10-19 17:16 UTC
      To: Xie Yongji; +Cc: linux-mm, akpm, virtualization
    
    On Mon, Oct 19, 2020 at 10:56:19PM +0800, Xie Yongji wrote:
    > This series introduces a framework, which can be used to implement
    > vDPA Devices in a userspace program. To implement it, the work
    > consists of two parts: control path emulating and data path offloading.
    > 
    > In the control path, the VDUSE driver will make use of a message
    > mechanism to forward the actions (get/set features, get/set status,
    > get/set config space and set virtqueue states) from the virtio-vdpa
    > driver to userspace. Userspace can use read()/write() to
    > receive/reply to those control messages.
    > 
    > In the data path, the VDUSE driver implements an MMU-based
    > on-chip IOMMU driver which supports both direct mapping and
    > indirect mapping with a bounce buffer. Userspace can then access
    > that IOVA space via mmap(). Besides, an eventfd mechanism is used to
    > trigger interrupts and forward virtqueue kicks.
    > 
    > The details and our use case are shown below:
    > 
    > ------------------------     -----------------------------------------------------------
    > |                  APP |     |                          QEMU                           |
    > |       ---------      |     | --------------------    -------------------+<-->+------ |
    > |       |dev/vdx|      |     | | device emulation |    | virtio dataplane |    | BDS | |
    > ------------+-----------     -----------+-----------------------+-----------------+-----
    >             |                           |                       |                 |
    >             |                           | emulating             | offloading      |
    > ------------+---------------------------+-----------------------+-----------------+------
    > |    | block device |           |  vduse driver |        |  vdpa device |    | TCP/IP | |
    > |    -------+--------           --------+--------        +------+-------     -----+---- |
    > |           |                           |                |      |                 |     |
    > |           |                           |                |      |                 |     |
    > | ----------+----------       ----------+-----------     |      |                 |     |
    > | | virtio-blk driver |       | virtio-vdpa driver |     |      |                 |     |
    > | ----------+----------       ----------+-----------     |      |                 |     |
    > |           |                           |                |      |                 |     |
    > |           |                           ------------------      |                 |     |
    > |           -----------------------------------------------------              ---+---  |
    > ------------------------------------------------------------------------------ | NIC |---
    >                                                                                ---+---
    >                                                                                   |
    >                                                                          ---------+---------
    >                                                                          | Remote Storages |
    >                                                                          -------------------
    > We make use of it to implement a block device connecting to
    > our distributed storage, which can be used in containers and
    > bare metal.
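
    [Editor's illustration] The control path quoted above (actions forwarded
    to userspace as messages, answered via read()/write()) can be pictured
    with a minimal sketch. Everything here is an assumption made for
    illustration: the 16-byte wire format, the message type numbers, the
    result codes and the names DeviceState, handle_request and serve_one are
    hypothetical and are not taken from include/uapi/linux/vduse.h.

```python
import os
import struct
from dataclasses import dataclass

# Hypothetical wire format (NOT the VDUSE uapi layout):
# u32 message type, u32 request id, u64 value.
MSG = struct.Struct("<IIQ")

# Hypothetical message types mirroring some actions named in the cover letter.
GET_FEATURES, SET_FEATURES, GET_STATUS, SET_STATUS = range(4)

# Hypothetical result codes, carried in the reply's type field.
OK, FAILED = 0, 1


@dataclass
class DeviceState:
    """State a userspace device might keep for the control path."""
    offered_features: int
    driver_features: int = 0
    status: int = 0


def handle_request(dev: DeviceState, raw: bytes) -> bytes:
    """Decode one control message and build a reply with the same request id."""
    msg_type, req_id, value = MSG.unpack(raw)
    if msg_type == GET_FEATURES:
        return MSG.pack(OK, req_id, dev.offered_features)
    if msg_type == SET_FEATURES:
        # Reject feature bits the device never offered.
        if value & ~dev.offered_features:
            return MSG.pack(FAILED, req_id, 0)
        dev.driver_features = value
        return MSG.pack(OK, req_id, 0)
    if msg_type == GET_STATUS:
        return MSG.pack(OK, req_id, dev.status)
    if msg_type == SET_STATUS:
        dev.status = value
        return MSG.pack(OK, req_id, 0)
    # Unknown message type.
    return MSG.pack(FAILED, req_id, 0)


def serve_one(dev: DeviceState, fd: int) -> None:
    """One loop iteration: read() a request from fd, write() the reply back."""
    raw = os.read(fd, MSG.size)
    os.write(fd, handle_request(dev, raw))
```

    In this sketch the reply reuses the request id so the sender can match
    it to the pending action; whether the actual driver pairs requests and
    replies this way is not stated in the cover letter.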
    
    What is not exactly clear is what the APP above is doing.
    
    Taking virtio blk requests and sending them over the network
    in some proprietary way?
    
    > Compared with the qemu-nbd solution, this solution has
    > higher performance, and we can have a unified technology stack
    > in VM and containers for remote storages.
    > 
    > To test it with a host disk (e.g. /dev/sdx):
    > 
    >   $ qemu-storage-daemon \
    >       --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
    >       --monitor chardev=charmonitor \
    >       --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/sdx,node-name=disk0 \
    >       --export vduse-blk,id=test,node-name=disk0,writable=on,vduse-id=1,num-queues=16,queue-size=128
    > 
    > The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse
    > 
    > Future work:
    >   - Improve performance (e.g. zero copy implementation in datapath)
    >   - Config interrupt support
    >   - Userspace library (find a way to reuse device emulation code in qemu/rust-vmm)
    
    
    How does this driver compare with vhost-user-blk (which doesn't need kernel support)?
    
    
    
    > Xie Yongji (4):
    >   mm: export zap_page_range() for driver use
    >   vduse: Introduce VDUSE - vDPA Device in Userspace
    >   vduse: grab the module's references until there is no vduse device
    >   vduse: Add memory shrinker to reclaim bounce pages
    > 
    >  drivers/vdpa/Kconfig                 |    8 +
    >  drivers/vdpa/Makefile                |    1 +
    >  drivers/vdpa/vdpa_user/Makefile      |    5 +
    >  drivers/vdpa/vdpa_user/eventfd.c     |  221 ++++++
    >  drivers/vdpa/vdpa_user/eventfd.h     |   48 ++
    >  drivers/vdpa/vdpa_user/iova_domain.c |  488 ++++++++++++
    >  drivers/vdpa/vdpa_user/iova_domain.h |  104 +++
    >  drivers/vdpa/vdpa_user/vduse.h       |   66 ++
    >  drivers/vdpa/vdpa_user/vduse_dev.c   | 1081 ++++++++++++++++++++++++++
    >  include/uapi/linux/vduse.h           |   85 ++
    >  mm/memory.c                          |    1 +
    >  11 files changed, 2108 insertions(+)
    >  create mode 100644 drivers/vdpa/vdpa_user/Makefile
    >  create mode 100644 drivers/vdpa/vdpa_user/eventfd.c
    >  create mode 100644 drivers/vdpa/vdpa_user/eventfd.h
    >  create mode 100644 drivers/vdpa/vdpa_user/iova_domain.c
    >  create mode 100644 drivers/vdpa/vdpa_user/iova_domain.h
    >  create mode 100644 drivers/vdpa/vdpa_user/vduse.h
    >  create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c
    >  create mode 100644 include/uapi/linux/vduse.h
    > 
    > -- 
    > 2.25.1
    
    _______________________________________________
    Virtualization mailing list
    Virtualization@lists.linux-foundation.org
    https://lists.linuxfoundation.org/mailman/listinfo/virtualization
    
  • * Re: [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
           [not found] <20201019145623.671-1-xieyongji@bytedance.com>
                       ` (3 preceding siblings ...)
      2020-10-19 17:16 ` [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace Michael S. Tsirkin
    @ 2020-10-20  3:20 ` Jason Wang
           [not found]   ` <CACycT3srzADF63rotgHwfsqn5GJOCbXx+19Dcnw8HLyTGY_7Eg@mail.gmail.com>
      4 siblings, 1 reply; 12+ messages in thread
    From: Jason Wang @ 2020-10-20  3:20 UTC
      To: Xie Yongji, mst, akpm; +Cc: linux-mm, virtualization
    
    
    On 2020/10/19 10:56 PM, Xie Yongji wrote:
    > This series introduces a framework, which can be used to implement
    > vDPA Devices in a userspace program. To implement it, the work
    > consists of two parts: control path emulating and data path offloading.
    >
    > In the control path, the VDUSE driver will make use of a message
    > mechanism to forward the actions (get/set features, get/set status,
    > get/set config space and set virtqueue states) from the virtio-vdpa
    > driver to userspace. Userspace can use read()/write() to
    > receive/reply to those control messages.
    >
    > In the data path, the VDUSE driver implements an MMU-based
    > on-chip IOMMU driver which supports both direct mapping and
    > indirect mapping with a bounce buffer. Userspace can then access
    > that IOVA space via mmap(). Besides, an eventfd mechanism is used to
    > trigger interrupts and forward virtqueue kicks.
    
    
    This is pretty interesting!
    
    For vhost-vdpa, it should work, but for virtio-vdpa, I think we should
    deal carefully with the IOMMU/DMA ops.

    I notice that neither dma_map nor set_map is implemented in
    vduse_vdpa_config_ops, which means you want to let vhost-vDPA deal
    with the IOMMU domains.  Any reason for doing that?

    The reasons for the question are:

    1) You've implemented an on-chip IOMMU driver but don't expose it to
    the generic IOMMU layer (or the generic IOMMU layer may need some
    extension to support this)
    2) We will probably remove the IOMMU domain management in vhost-vDPA,
    and move it to the device (parent).

    So if it's possible, please implement either set_map() or
    dma_map()/dma_unmap(); this may align with our future goal and may speed
    up the development.
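
    [Editor's illustration] To make the request concrete, here is a minimal
    sketch of the bookkeeping a device that handles its own mappings would
    need: an IOVA table with map, unmap and translate operations. It is a
    conceptual model only. The class IovaTable, the Mapping record, the
    permission constants and the method signatures are hypothetical and do
    not reproduce the kernel's vdpa_config_ops callbacks.

```python
from bisect import bisect_right
from dataclasses import dataclass

# Hypothetical permission bits for this sketch.
PERM_READ, PERM_WRITE = 1, 2


@dataclass(frozen=True)
class Mapping:
    iova: int   # start of the range in I/O virtual address space
    size: int   # length of the range in bytes
    addr: int   # address the range is backed by
    perm: int   # PERM_* bits allowed for accesses


class IovaTable:
    """Sorted, non-overlapping IOVA ranges owned by the device side."""

    def __init__(self) -> None:
        self._starts: list[int] = []    # sorted start IOVAs
        self._maps: list[Mapping] = []  # same order as _starts

    def dma_map(self, iova: int, size: int, addr: int, perm: int) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        i = bisect_right(self._starts, iova)
        # The range must not overlap the neighbour on either side.
        if i > 0 and self._maps[i - 1].iova + self._maps[i - 1].size > iova:
            raise ValueError("overlaps an existing mapping")
        if i < len(self._maps) and iova + size > self._maps[i].iova:
            raise ValueError("overlaps an existing mapping")
        self._starts.insert(i, iova)
        self._maps.insert(i, Mapping(iova, size, addr, perm))

    def dma_unmap(self, iova: int, size: int) -> None:
        # Drop every mapping fully contained in [iova, iova + size).
        end = iova + size
        kept = [m for m in self._maps
                if not (iova <= m.iova and m.iova + m.size <= end)]
        self._maps = kept
        self._starts = [m.iova for m in kept]

    def translate(self, iova: int, length: int, perm: int) -> int:
        """Return the backing address for an access, or raise."""
        i = bisect_right(self._starts, iova) - 1
        if i < 0:
            raise LookupError("IOVA not mapped")
        m = self._maps[i]
        if iova + length > m.iova + m.size:
            raise LookupError("access runs past the mapping")
        if perm & ~m.perm:
            raise PermissionError("access not permitted")
        return m.addr + (iova - m.iova)
```

    A set_map()-style interface would differ mainly in granularity: instead
    of incremental map and unmap calls, the whole table would be replaced in
    one step.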
    
    Btw, it would be helpful to give even more details on how the on-chip
    IOMMU driver is implemented.
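
    [Editor's illustration] The cover letter does not describe the
    implementation, so the following is only a generic sketch of what
    "indirect mapping with bounce buffer" usually means: data is copied into
    a shared region when a buffer is mapped for the device, and copied back
    when a device-written buffer is unmapped. The names BounceDomain,
    TO_DEVICE and FROM_DEVICE are hypothetical, each mapping is limited to
    one page for simplicity, and a bytearray stands in for the region that
    userspace would reach via mmap().

```python
PAGE_SIZE = 4096

# Hypothetical direction tags for this sketch.
TO_DEVICE, FROM_DEVICE = "to_device", "from_device"


class BounceDomain:
    """Toy bounce-buffer domain: one bounce page per mapped buffer."""

    def __init__(self, num_pages: int) -> None:
        # Stand-in for the shared region the device side would mmap().
        self.region = bytearray(num_pages * PAGE_SIZE)
        self._free = list(range(num_pages))
        self._inflight = {}  # iova -> (original buffer, direction)

    def map(self, buf: bytearray, direction: str) -> int:
        """Assign a bounce page to buf and return its IOVA."""
        if len(buf) > PAGE_SIZE:
            raise ValueError("sketch supports at most one page per mapping")
        if not self._free:
            raise MemoryError("no free bounce page")
        iova = self._free.pop() * PAGE_SIZE
        if direction == TO_DEVICE:
            # Driver data must be visible to the device: copy it in now.
            self.region[iova:iova + len(buf)] = buf
        self._inflight[iova] = (buf, direction)
        return iova

    def unmap(self, iova: int) -> None:
        """Release the bounce page, copying device-written data back."""
        buf, direction = self._inflight.pop(iova)
        if direction == FROM_DEVICE:
            buf[:] = self.region[iova:iova + len(buf)]
        self._free.append(iova // PAGE_SIZE)
```

    The extra copy in map() and unmap() is the cost of this scheme, which is
    consistent with the "zero copy implementation in datapath" item listed
    under future work in the cover letter.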
    
    
    >
    > The details and our use case are shown below:
    >
    > ------------------------     -----------------------------------------------------------
    > |                  APP |     |                          QEMU                           |
    > |       ---------      |     | --------------------    -------------------+<-->+------ |
    > |       |dev/vdx|      |     | | device emulation |    | virtio dataplane |    | BDS | |
    > ------------+-----------     -----------+-----------------------+-----------------+-----
    >              |                           |                       |                 |
    >              |                           | emulating             | offloading      |
    > ------------+---------------------------+-----------------------+-----------------+------
    > |    | block device |           |  vduse driver |        |  vdpa device |    | TCP/IP | |
    > |    -------+--------           --------+--------        +------+-------     -----+---- |
    > |           |                           |                |      |                 |     |
    > |           |                           |                |      |                 |     |
    > | ----------+----------       ----------+-----------     |      |                 |     |
    > | | virtio-blk driver |       | virtio-vdpa driver |     |      |                 |     |
    > | ----------+----------       ----------+-----------     |      |                 |     |
    > |           |                           |                |      |                 |     |
    > |           |                           ------------------      |                 |     |
    > |           -----------------------------------------------------              ---+---  |
    > ------------------------------------------------------------------------------ | NIC |---
    >                                                                                 ---+---
    >                                                                                    |
    >                                                                           ---------+---------
    >                                                                           | Remote Storages |
    >                                                                           -------------------
    
    
    The figure is not very clear to me on the following points:

    1) if the device emulation and the virtio dataplane are both implemented
    in QEMU, what's the point of doing this? I thought the device should be a
    remote process?
    2) it would be better to draw a vDPA bus somewhere to help people
    understand the architecture
    3) for the "offloading" I guess it should be done via vhost-vDPA, so
    it's better to draw a vhost-vDPA block there
    
    
    > We make use of it to implement a block device connecting to
    > our distributed storage, which can be used in containers and
    > bare metal. Compared with the qemu-nbd solution, this solution has
    > higher performance, and we can have a unified technology stack
    > in VM and containers for remote storages.
    >
    > To test it with a host disk (e.g. /dev/sdx):
    >
    >    $ qemu-storage-daemon \
    >        --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
    >        --monitor chardev=charmonitor \
    >        --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/sdx,node-name=disk0 \
    >        --export vduse-blk,id=test,node-name=disk0,writable=on,vduse-id=1,num-queues=16,queue-size=128
    >
    > The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse
    >
    > Future work:
    >    - Improve performance (e.g. zero copy implementation in datapath)
    >    - Config interrupt support
    >    - Userspace library (find a way to reuse device emulation code in qemu/rust-vmm)
    
    
    Right, a library will be very useful.
    
    Thanks
    
    
    
    

  • end of thread, other threads:[~2020-10-23  8:45 UTC | newest]
    
    Thread overview: 12+ messages
         [not found] <20201019145623.671-1-xieyongji@bytedance.com>
         [not found] ` <20201019145623.671-4-xieyongji@bytedance.com>
    2020-10-19 15:05   ` [RFC 3/4] vduse: grab the module's references until there is no vduse device Michael S. Tsirkin
         [not found]     ` <CACycT3sQ-rw+weEktyK5jQTfMNWYR6qSaD1vAUEyCP6x7C9rRQ@mail.gmail.com>
    2020-10-19 15:47       ` [External] " Michael S. Tsirkin
         [not found]         ` <CACycT3vG+ZEhn3SYy=7c5rkMz4XRbQZL21NdpPozPnH_x6Srhg@mail.gmail.com>
    2020-10-19 16:41           ` Michael S. Tsirkin
         [not found] ` <20201019145623.671-3-xieyongji@bytedance.com>
    2020-10-19 15:08   ` [RFC 2/4] vduse: Introduce VDUSE - vDPA Device in Userspace Michael S. Tsirkin
    2020-10-19 15:24     ` Randy Dunlap
         [not found] ` <20201019145623.671-2-xieyongji@bytedance.com>
    2020-10-19 15:14   ` [RFC 1/4] mm: export zap_page_range() for driver use Matthew Wilcox
    2020-10-19 17:16 ` [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace Michael S. Tsirkin
         [not found]   ` <CACycT3vzpm_+v-DbqeVRMg8BRny_GoL2JxpbzYC3JYTMKGn_vg@mail.gmail.com>
    2020-10-20  2:20     ` [External] " Jason Wang
    2020-10-20  3:20 ` Jason Wang
         [not found]   ` <CACycT3srzADF63rotgHwfsqn5GJOCbXx+19Dcnw8HLyTGY_7Eg@mail.gmail.com>
    2020-10-20  8:01     ` [External] " Jason Wang
         [not found]       ` <CACycT3ssE-iMquAmrrHGQyBCv7XkQ2WrinFMMPTTubxuuOQ92g@mail.gmail.com>
    2020-10-20  9:12         ` Jason Wang
         [not found]           ` <CACycT3s2GZ3yKP+Xn2V83_-=tXg342J4n91ZAb0c-+UD_+sFnA@mail.gmail.com>
    2020-10-23  8:44             ` Jason Wang
    
