From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: maciej.fijalkowski@intel.com,
virtualization@lists.linux-foundation.org, bjorn@kernel.org,
jonathan.lemon@gmail.com, magnus.karlsson@intel.com
Subject: Re: [PATCH vhost 07/10] virtio_ring: add api virtio_dma_map() for advance dma
Date: Thu, 2 Mar 2023 01:09:04 -0500 [thread overview]
Message-ID: <20230302005901-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACGkMEsXp2wvLJ=DcrVcKDQZWs8djf+iFKaPsX2-xp3vZa0TvA@mail.gmail.com>
On Thu, Mar 02, 2023 at 11:26:53AM +0800, Jason Wang wrote:
> On Thu, Mar 2, 2023 at 11:24 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Thu, 2 Mar 2023 11:05:26 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Thu, Mar 2, 2023 at 10:24 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > On Tue, 28 Feb 2023 19:15:23 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > On Tue, 21 Feb 2023 09:51:07 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > > On Mon, Feb 20, 2023 at 3:02 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > > >
> > > > > > > On Mon, 20 Feb 2023 13:38:24 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > > On Tue, Feb 14, 2023 at 3:27 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > > > > >
> > > > > > > > > Added virtio_dma_map() to map DMA addresses for virtual memory in
> > > > > > > > > advance. The purpose is to keep memory mapped across multiple add/get
> > > > > > > > > buf operations.
> > > > > > > >
> > > > > > > > I wonder if instead of exporting helpers like this, it might be simple
> > > > > > > > to just export dma_dev then the upper layer can use DMA API at will?
> > > > > > >
> > > > > > >
> > > > > > > The reason for not doing this, Virtio is not just using DMA_DEV to mapp, but
> > > > > > > also check whether DMA is used.
> > > > > >
> > > > > > We should let the DMA API decide by exporting a correct dma_dev. E.g
> > > > > > when ACCESS_PLATFORM is not negotiated, advertising a DMA dev without
> > > > > > dma_ops.
> > > > >
> > > > >
> > > > > Do you mean we provide this API?
> > > > >
> > > > > virtio_get_dma_dev()
> > > > >
> > > > > If it returns NULL, the caller will use the physical memory address directly. If
> > > > > this func return a dma_dev, the caller should use DMA API.
> > > >
> > > >
> > > > cc the XDP_SOCKET's maintainers.
> > > >
> > > > First of all, Jason does not want to encapsulate the API of DMA by Virtio. It is
> > > > best to pass DMA Device to XSK, XSK uses DMA API for DMA MAP operation directly.
> > > > I agree with this idea.
> > > >
> > > > However, there are several problems under Virtio:
> > > > 1. In some virtualization scenarios, we do not have to perform DMA operations,
> > > > just use the physical address directly.
> > >
> > > This is not a problem, we can simply return the virtio device itself
> > > as the DMA device in this case. Since there's no DMA ops attached, DMA
> > > API will use physical address in this case.
> >
> > Is this like this? So why do we have to deal with it in Virtio Ring? Let me
> > learn it.
>
> It has a long debate and I can't recall too many details. (You can
> search the archives). Michael may show more thoughts here.
>
> One concern is the overhead of the DMA API that needs to be benchmarked.
>
> Thanks
Concern with what? This patch does not change devices, they are using
the existing API. Xuan Zhuo already showed a benchmark result for AF_XDP.
> >
> >
> > >
> > > > 2. The latest Virtio Core supports each rx/tx queue with one DMA device.
> > > > Generally, the physical network card has only one device. All queues use
> > > > it for DMA operation.
> > >
> > > I'm not sure this is a big deal, we just need to use the per virtqueue
> > > dma device to use DMA API.
> >
> > Yes.
> >
> >
> > >
> > > >
> > > > So I consider this problem again, Virtio Core provides only one API.
> > > >
> > > > * virtio_get_dma_dev(queue)
> > > >
> > > > If the return value is NULL, it means that there is no DMA operation. If it is
> > > > not NULL, use DMA API for DMA operation.
> > > >
> > > > The modification of XSK is like this. We may pass NULL as dev to xp_dma_map().
> > > > If dev is NULL, then there is no need to perform DMA and Sync operations.
> > > > Otherwise, it will perform DMA operations like other devices.
> > >
> > > As discussed above, it might be easier:
> > >
> > > if (!virtio_has_feature(VIRTIO_F_ACCESS_PLATFORM))
> > > return virtio_device;
> > > else
> > > return vring_dma_dev();
> >
> > Yes, according to Jason's opinion, then XSK not need to do any modifications.
> >
> > Thanks.
Yes AF_XDP does not need the per VQ device hack.
We should probably rethink it.
But as far as implementation goes, poking at VIRTIO_F_ACCESS_PLATFORM
is wrong. Please use virtio_has_dma_quirk.
> > >
> > > >
> > > > And if the dma_dev of rx and tx is different, then we can only disable
> > > > XDP_SOCKET.
> > >
> > > We can start with this.
> > >
> > > Thanks
> > >
> > > >
> > > > Looking forward to everyone's reply.
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > >
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > (Otherwise the DMA helpers need to grow/shrink as the DMA API evolves?)
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Added virtio_dma_unmap() for unmap DMA address.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > ---
> > > > > > > > > drivers/virtio/virtio_ring.c | 92 ++++++++++++++++++++++++++++++++++++
> > > > > > > > > include/linux/virtio.h | 9 ++++
> > > > > > > > > 2 files changed, 101 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > index cd9364eb2345..855338609c7f 100644
> > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > @@ -3172,4 +3172,96 @@ const struct vring *virtqueue_get_vring(struct virtqueue *vq)
> > > > > > > > > }
> > > > > > > > > EXPORT_SYMBOL_GPL(virtqueue_get_vring);
> > > > > > > > >
> > > > > > > > > +/**
> > > > > > > > > + * virtio_dma_map_page - get the DMA addr of the memory for virtio device
> > > > > > > > > + * @dev: virtio device
> > > > > > > > > + * @page: the page of the memory to DMA
> > > > > > > > > + * @offset: the offset of the memory inside page
> > > > > > > > > + * @length: memory length
> > > > > > > > > + * @dir: DMA direction
> > > > > > > > > + *
> > > > > > > > > + * This API is only for pre-mapped buffers, for non premapped buffers virtio
> > > > > > > > > + * core handles DMA API internally.
> > > > > > > > > + *
> > > > > > > > > + * Returns the DMA addr. DMA_MAPPING_ERROR means error.
> > > > > > > > > + */
> > > > > > > > > +dma_addr_t virtio_dma_map_page(struct device *dev, struct page *page, size_t offset,
> > > > > > > > > + unsigned int length, enum dma_data_direction dir)
> > > > > > > > > +{
> > > > > > > >
> > > > > > > > This (and the reset) needs to be done per virtqueue instead per device
> > > > > > > > after b0e504e5505d184b0be248b7dcdbe50b79f03758 ("virtio_ring: per
> > > > > > > > virtqueue dma device").
> > > > > > >
> > > > > > >
> > > > > > > YES.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > + struct virtio_device *vdev = dev_to_virtio(dev);
> > > > > > > > > +
> > > > > > > > > + if (!vring_use_dma_api(vdev))
> > > > > > > > > + return page_to_phys(page) + offset;
> > > > > > > > > +
> > > > > > > > > + return dma_map_page(vdev->dev.parent, page, offset, length, dir);
> > > > > > > > > +}
> > > > > > > >
> > > > > > > > Need either inline or EXPORT_SYMBOL_GPL() here.
> > > > > > >
> > > > > > > Because I did not use this interface, I did not export it.
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > >
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > > + * virtio_dma_map - get the DMA addr of the memory for virtio device
> > > > > > > > > + * @dev: virtio device
> > > > > > > > > + * @addr: the addr to DMA
> > > > > > > > > + * @length: memory length
> > > > > > > > > + * @dir: DMA direction
> > > > > > > > > + *
> > > > > > > > > + * This API is only for pre-mapped buffers, for non premapped buffers virtio
> > > > > > > > > + * core handles DMA API internally.
> > > > > > > > > + *
> > > > > > > > > + * Returns the DMA addr.
> > > > > > > > > + */
> > > > > > > > > +dma_addr_t virtio_dma_map(struct device *dev, void *addr, unsigned int length,
> > > > > > > > > + enum dma_data_direction dir)
> > > > > > > > > +{
> > > > > > > > > + struct page *page;
> > > > > > > > > + size_t offset;
> > > > > > > > > +
> > > > > > > > > + page = virt_to_page(addr);
> > > > > > > > > + offset = offset_in_page(addr);
> > > > > > > > > +
> > > > > > > > > + return virtio_dma_map_page(dev, page, offset, length, dir);
> > > > > > > > > +}
> > > > > > > > > +EXPORT_SYMBOL_GPL(virtio_dma_map);
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > > + * virtio_dma_mapping_error - check dma address
> > > > > > > > > + * @dev: virtio device
> > > > > > > > > + * @addr: DMA address
> > > > > > > > > + *
> > > > > > > > > + * This API is only for pre-mapped buffers, for non premapped buffers virtio
> > > > > > > > > + * core handles DMA API internally.
> > > > > > > > > + *
> > > > > > > > > + * Returns 0 means dma valid. Other means invalid dma address.
> > > > > > > > > + */
> > > > > > > > > +int virtio_dma_mapping_error(struct device *dev, dma_addr_t addr)
> > > > > > > > > +{
> > > > > > > > > + struct virtio_device *vdev = dev_to_virtio(dev);
> > > > > > > > > +
> > > > > > > > > + if (!vring_use_dma_api(vdev))
> > > > > > > > > + return 0;
> > > > > > > > > +
> > > > > > > > > + return dma_mapping_error(vdev->dev.parent, addr);
> > > > > > > > > +}
> > > > > > > > > +EXPORT_SYMBOL_GPL(virtio_dma_mapping_error);
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > > + * virtio_dma_unmap - unmap DMA addr
> > > > > > > > > + * @dev: virtio device
> > > > > > > > > + * @dma: DMA address
> > > > > > > > > + * @length: memory length
> > > > > > > > > + * @dir: DMA direction
> > > > > > > > > + *
> > > > > > > > > + * This API is only for pre-mapped buffers, for non premapped buffers virtio
> > > > > > > > > + * core handles DMA API internally.
> > > > > > > > > + */
> > > > > > > > > +void virtio_dma_unmap(struct device *dev, dma_addr_t dma, unsigned int length,
> > > > > > > > > + enum dma_data_direction dir)
> > > > > > > > > +{
> > > > > > > > > + struct virtio_device *vdev = dev_to_virtio(dev);
> > > > > > > > > +
> > > > > > > > > + if (!vring_use_dma_api(vdev))
> > > > > > > > > + return;
> > > > > > > > > +
> > > > > > > > > + dma_unmap_page(vdev->dev.parent, dma, length, dir);
> > > > > > > > > +}
> > > > > > > > > +EXPORT_SYMBOL_GPL(virtio_dma_unmap);
> > > > > > > > > +
> > > > > > > > > MODULE_LICENSE("GPL");
> > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > index 3ebb346ebb7c..b5fa71476737 100644
> > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > @@ -9,6 +9,7 @@
> > > > > > > > > #include <linux/device.h>
> > > > > > > > > #include <linux/mod_devicetable.h>
> > > > > > > > > #include <linux/gfp.h>
> > > > > > > > > +#include <linux/dma-mapping.h>
> > > > > > > > >
> > > > > > > > > /**
> > > > > > > > > * struct virtqueue - a queue to register buffers for sending or receiving.
> > > > > > > > > @@ -216,4 +217,12 @@ void unregister_virtio_driver(struct virtio_driver *drv);
> > > > > > > > > #define module_virtio_driver(__virtio_driver) \
> > > > > > > > > module_driver(__virtio_driver, register_virtio_driver, \
> > > > > > > > > unregister_virtio_driver)
> > > > > > > > > +
> > > > > > > > > +dma_addr_t virtio_dma_map_page(struct device *dev, struct page *page, size_t offset,
> > > > > > > > > + unsigned int length, enum dma_data_direction dir);
> > > > > > > > > +dma_addr_t virtio_dma_map(struct device *dev, void *addr, unsigned int length,
> > > > > > > > > + enum dma_data_direction dir);
> > > > > > > > > +int virtio_dma_mapping_error(struct device *dev, dma_addr_t addr);
> > > > > > > > > +void virtio_dma_unmap(struct device *dev, dma_addr_t dma, unsigned int length,
> > > > > > > > > + enum dma_data_direction dir);
> > > > > > > > > #endif /* _LINUX_VIRTIO_H */
> > > > > > > > > --
> > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > _______________________________________________
> > > > > Virtualization mailing list
> > > > > Virtualization@lists.linux-foundation.org
> > > > > https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> > > >
> > >
> >
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
next prev parent reply other threads:[~2023-03-02 6:09 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-14 7:26 [PATCH vhost 00/10] virtio core prepares for AF_XDP Xuan Zhuo
2023-02-14 7:26 ` [PATCH vhost 01/10] virtio_ring: split: refactor virtqueue_add_split() for premapped Xuan Zhuo
2023-02-20 5:37 ` Jason Wang
2023-02-20 6:57 ` Xuan Zhuo
2023-02-20 12:12 ` Michael S. Tsirkin
2023-02-14 7:26 ` [PATCH vhost 02/10] virtio_ring: packed: separate prepare code from virtuque_add_indirect_packed() Xuan Zhuo
2023-02-20 5:37 ` Jason Wang
2023-02-20 6:56 ` Xuan Zhuo
2023-02-14 7:26 ` [PATCH vhost 03/10] virtio_ring: packed: refactor virtqueue_add_packed() for premapped Xuan Zhuo
2023-02-20 5:37 ` Jason Wang
2023-02-14 7:26 ` [PATCH vhost 04/10] virtio_ring: split: introduce virtqueue_add_split_premapped() Xuan Zhuo
2023-02-20 5:38 ` Jason Wang
2023-02-20 6:43 ` Xuan Zhuo
2023-02-21 1:49 ` Jason Wang
2023-02-14 7:26 ` [PATCH vhost 05/10] virtio_ring: packed: introduce virtqueue_add_packed_premapped() Xuan Zhuo
2023-02-14 7:27 ` [PATCH vhost 06/10] virtio_ring: introduce virtqueue_add_inbuf_premapped() Xuan Zhuo
2023-02-14 7:27 ` [PATCH vhost 07/10] virtio_ring: add api virtio_dma_map() for advance dma Xuan Zhuo
2023-02-20 5:38 ` Jason Wang
2023-02-20 6:59 ` Xuan Zhuo
2023-02-21 1:51 ` Jason Wang
2023-02-28 11:15 ` Xuan Zhuo
2023-03-02 2:04 ` Xuan Zhuo
2023-03-02 3:05 ` Jason Wang
2023-03-02 3:21 ` Xuan Zhuo
2023-03-02 3:26 ` Jason Wang
2023-03-02 6:09 ` Michael S. Tsirkin [this message]
2023-03-02 6:43 ` Xuan Zhuo
2023-03-02 6:56 ` Jason Wang
2023-03-02 7:31 ` Xuan Zhuo
2023-03-02 7:56 ` Jason Wang
2023-03-02 11:08 ` Xuan Zhuo
2023-03-01 11:47 ` Xuan Zhuo
2023-02-14 7:27 ` [PATCH vhost 08/10] virtio_ring: introduce dma sync api for virtio Xuan Zhuo
2023-02-20 5:38 ` Jason Wang
2023-02-20 7:04 ` Xuan Zhuo
2023-02-21 1:52 ` Jason Wang
2023-02-14 7:27 ` [PATCH vhost 09/10] virtio_ring: correct the expression of the description of virtqueue_resize() Xuan Zhuo
2023-02-20 5:38 ` Jason Wang
2023-02-14 7:27 ` [PATCH vhost 10/10] virtio_ring: introduce virtqueue_reset() Xuan Zhuo
2023-02-20 5:38 ` Jason Wang
2023-02-20 7:03 ` Xuan Zhuo
2023-02-21 1:51 ` Jason Wang
2023-02-16 5:27 ` [PATCH vhost 00/10] virtio core prepares for AF_XDP Jason Wang
2023-02-16 11:46 ` Xuan Zhuo
2023-02-17 5:23 ` Jason Wang
2023-02-17 9:02 ` Xuan Zhuo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230302005901-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=bjorn@kernel.org \
--cc=jasowang@redhat.com \
--cc=jonathan.lemon@gmail.com \
--cc=maciej.fijalkowski@intel.com \
--cc=magnus.karlsson@intel.com \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).