public inbox for dmaengine@vger.kernel.org
From: Andrew Davis <afd@ti.com>
To: "Paul Cercueil" <paul@crapouillou.net>,
	"Jonathan Cameron" <jic23@kernel.org>,
	"Lars-Peter Clausen" <lars@metafoo.de>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Vinod Koul" <vkoul@kernel.org>,
	"Jonathan Corbet" <corbet@lwn.net>
Cc: "Michael Hennerich" <Michael.Hennerich@analog.com>,
	linux-doc@vger.kernel.org, linux-iio@vger.kernel.org,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org, "Nuno Sá" <noname.nuno@gmail.com>,
	dmaengine@vger.kernel.org, linux-media@vger.kernel.org
Subject: Re: [PATCH v5 0/8] iio: new DMABUF based API, v5
Date: Mon, 8 Jan 2024 15:12:43 -0600	[thread overview]
Message-ID: <6ec8c7c4-588a-48b5-b0c5-56ca5216a757@ti.com> (raw)
In-Reply-To: <20231219175009.65482-1-paul@crapouillou.net>

On 12/19/23 11:50 AM, Paul Cercueil wrote:
> [V4 was: "iio: Add buffer write() support"][1]
> 
> Hi Jonathan,
> 
> This is a respin of the V3 of my patchset that introduced a new
> interface based on DMABUF objects [2].
> 
> The V4 was a split of the patchset, attempting to upstream buffer
> write() support first. But since it has no user upstream, it was not
> merged. This V5 does the opposite: it contains the new DMABUF
> interface, without the buffer write() support. It can already be used
> with the upstream adi-axi-adc driver.
> 
> In user-space, Libiio uses it to transfer blocks of samples back and
> forth between the hardware and the applications, without having to
> copy the data.
> 
> On a ZCU102 with a FMComms3 daughter board, running Libiio from the
> pcercuei/dev-new-dmabuf-api branch [3], compiled with
> WITH_LOCAL_DMABUF_API=OFF (so that it uses fileio):
>    sudo utils/iio_rwdev -b 4096 -B cf-ad9361-lpc
>    Throughput: 116 MiB/s
> 
> Same hardware, with the DMABUF API (WITH_LOCAL_DMABUF_API=ON):
>    sudo utils/iio_rwdev -b 4096 -B cf-ad9361-lpc
>    Throughput: 475 MiB/s
> 
> This benchmark only measures the speed at which the data can be fetched
> to iio_rwdev's internal buffers, and does not actually try to read the
> data (e.g. to pipe it to stdout). It shows that fetching the data is
> more than 4x faster using the new interface.
> 
> When actually reading the data, the performance difference isn't as
> impressive (maybe because, in the case of DMABUF, the data is not in
> the cache):
> 
> WITH_LOCAL_DMABUF_API=OFF (so that it uses fileio):
>    sudo utils/iio_rwdev -b 4096 cf-ad9361-lpc | dd of=/dev/zero status=progress
>    2446422528 bytes (2.4 GB, 2.3 GiB) copied, 22 s, 111 MB/s
> 
> WITH_LOCAL_DMABUF_API=ON:
>    sudo utils/iio_rwdev -b 4096 cf-ad9361-lpc | dd of=/dev/zero status=progress
>    2334388736 bytes (2.3 GB, 2.2 GiB) copied, 21 s, 114 MB/s
> 
> One interesting thing to note is that fileio is (currently) actually
> faster than the DMABUF interface if you increase the buffer size a
> lot. My explanation is that the cache invalidation routine takes
> longer the bigger the DMABUF gets. This is because the DMABUF is
> backed by small pages, so a 64 MiB DMABUF, for example, is backed by
> up to 16 thousand pages that have to be invalidated one by one. This
> could be addressed by using huge pages, but the udmabuf driver does
> not (yet) support creating DMABUFs backed by huge pages.
> 

Have you tried DMABUFs created with the DMABUF system heap exporter
(drivers/dma-buf/heaps/system_heap.c)? It should handle larger
allocations better here, and if you don't have any active mmaps or
vmaps it can skip CPU-side coherency maintenance (useful for
device-to-device transfers).

Allocating DMABUFs out of user pages also has a bunch of other issues
you might run into. I'd argue udmabuf is now completely superseded by
DMABUF system heaps. Try it out :)

Andrew

> Anyway, the real benefits appear when the DMABUFs are either shared
> between IIO devices, or between the IIO subsystem and another
> subsystem. In that case, the DMABUFs are simply passed between
> drivers, without the data being copied at any point.
> 
> We use that feature to transfer samples from our transceivers to USB,
> using a DMABUF interface to FunctionFS [4].
> 
> This drastically increases the throughput, to about 274 MiB/s over a
> USB3 link, vs. 127 MiB/s using IIO's fileio interface plus write() to
> the FunctionFS endpoints, with lower CPU usage (0.85 vs. 0.65 load
> avg.).
> 
> Based on linux-next/next-20231219.
> 
> Cheers,
> -Paul
> 
> [1] https://lore.kernel.org/all/20230807112113.47157-1-paul@crapouillou.net/
> [2] https://lore.kernel.org/all/20230403154800.215924-1-paul@crapouillou.net/
> [3] https://github.com/analogdevicesinc/libiio/tree/pcercuei/dev-new-dmabuf-api
> [4] https://lore.kernel.org/all/20230322092118.9213-1-paul@crapouillou.net/
> 
> ---
> Changelog:
> - [3/8]: Replace V3's dmaengine_prep_slave_dma_array() with a new
>    dmaengine_prep_slave_dma_vec(), which uses a new 'dma_vec' struct.
>    Note that at some point we will need to support cyclic transfers
>    using dmaengine_prep_slave_dma_vec(). Maybe with a new "flags"
>    parameter to the function?
> 
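
Reconstructed from the description above, the new primitive looks
roughly like this; the patch itself ([3/8]) is authoritative:

```c
/* Sketch of the [3/8] additions to include/linux/dmaengine.h,
 * reconstructed from the cover letter; consult the patch for the
 * real definitions. */
struct dma_vec {
	dma_addr_t addr;	/* bus address of one chunk */
	size_t len;		/* length of that chunk in bytes */
};

static inline struct dma_async_tx_descriptor *
dmaengine_prep_slave_dma_vec(struct dma_chan *chan,
			     const struct dma_vec *vecs, size_t nents,
			     enum dma_transfer_direction dir,
			     unsigned long flags)
{
	if (!chan || !chan->device ||
	    !chan->device->device_prep_slave_dma_vec)
		return NULL;

	return chan->device->device_prep_slave_dma_vec(chan, vecs, nents,
						       dir, flags);
}
```

The cyclic-transfer question raised above would then amount to
defining how a repeat request is expressed through the "flags"
argument.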
> - [4/8]: Implement .device_prep_slave_dma_vec() instead of V3's
>    .device_prep_slave_dma_array().
> 
>    @Vinod: this patch will cause a small conflict with my other
>    patchset adding scatter-gather support to the axi-dmac driver.
>    This patch adds a call to axi_dmac_alloc_desc(num_sgs), but the
>    prototype of this function changed in my other patchset - it now
>    has to be passed the "chan" variable. I don't know how you would
>    prefer to resolve it. In the worst case (and if @Jonathan is okay
>    with that), this one patch can be re-sent later, but that would
>    make the patchset less "atomic".
> 
> - [5/8]:
>    - Use dev_err() instead of pr_err()
>    - Inline to_iio_dma_fence()
>    - Add comment to explain why we unref twice when detaching dmabuf
>    - Remove TODO comment. It is actually safe to free the file's
>      private data even when transfers are still pending because it
>      won't be accessed.
>    - Fix documentation of new fields in struct iio_buffer_access_funcs
>    - iio_dma_resv_lock() does not need to be exported, make it static
> 
> - [7/8]:
>    - Use the new dmaengine_prep_slave_dma_vec().
>    - Restrict to input buffers, since output buffers are not yet
>      supported by IIO buffers.
> 
> - [8/8]:
>    Use description lists for the documentation of the three new IOCTLs
>    instead of abusing subsections.
> 
> ---
> Alexandru Ardelean (1):
>    iio: buffer-dma: split iio_dma_buffer_fileio_free() function
> 
> Paul Cercueil (7):
>    iio: buffer-dma: Get rid of outgoing queue
>    dmaengine: Add API function dmaengine_prep_slave_dma_vec()
>    dmaengine: dma-axi-dmac: Implement device_prep_slave_dma_vec
>    iio: core: Add new DMABUF interface infrastructure
>    iio: buffer-dma: Enable support for DMABUFs
>    iio: buffer-dmaengine: Support new DMABUF based userspace API
>    Documentation: iio: Document high-speed DMABUF based API
> 
>   Documentation/iio/dmabuf_api.rst              |  54 +++
>   Documentation/iio/index.rst                   |   2 +
>   drivers/dma/dma-axi-dmac.c                    |  40 ++
>   drivers/iio/buffer/industrialio-buffer-dma.c  | 242 ++++++++---
>   .../buffer/industrialio-buffer-dmaengine.c    |  52 ++-
>   drivers/iio/industrialio-buffer.c             | 402 ++++++++++++++++++
>   include/linux/dmaengine.h                     |  25 ++
>   include/linux/iio/buffer-dma.h                |  33 +-
>   include/linux/iio/buffer_impl.h               |  26 ++
>   include/uapi/linux/iio/buffer.h               |  22 +
>   10 files changed, 836 insertions(+), 62 deletions(-)
>   create mode 100644 Documentation/iio/dmabuf_api.rst
> 


Thread overview: 41+ messages
2023-12-19 17:50 [PATCH v5 0/8] iio: new DMABUF based API, v5 Paul Cercueil
2023-12-19 17:50 ` [PATCH v5 1/8] iio: buffer-dma: Get rid of outgoing queue Paul Cercueil
2023-12-21 11:28   ` Jonathan Cameron
2023-12-19 17:50 ` [PATCH v5 2/8] iio: buffer-dma: split iio_dma_buffer_fileio_free() function Paul Cercueil
2023-12-21 11:31   ` Jonathan Cameron
2023-12-19 17:50 ` [PATCH v5 3/8] dmaengine: Add API function dmaengine_prep_slave_dma_vec() Paul Cercueil
2023-12-21 11:40   ` Jonathan Cameron
2023-12-21 15:14   ` Vinod Koul
2023-12-21 15:29     ` Paul Cercueil
2024-01-08 12:20     ` Paul Cercueil
2024-01-22 11:06       ` [Linaro-mm-sig] " Vinod Koul
2023-12-19 17:50 ` [PATCH v5 4/8] dmaengine: dma-axi-dmac: Implement device_prep_slave_dma_vec Paul Cercueil
2023-12-19 17:50 ` [PATCH v5 5/8] iio: core: Add new DMABUF interface infrastructure Paul Cercueil
2023-12-21 12:06   ` Jonathan Cameron
2023-12-21 17:21     ` Paul Cercueil
2024-01-25 13:47     ` Paul Cercueil
2024-01-27 16:50       ` Jonathan Cameron
2024-01-29 12:52         ` Christian König
2024-01-29 13:06           ` Paul Cercueil
2024-01-29 13:17             ` Christian König
2024-01-29 13:32               ` Paul Cercueil
2024-01-29 14:15                 ` Paul Cercueil
2024-01-08 13:20   ` Daniel Vetter
2023-12-19 17:50 ` [PATCH v5 6/8] iio: buffer-dma: Enable support for DMABUFs Paul Cercueil
2023-12-21 16:04   ` Jonathan Cameron
2023-12-22  8:56     ` Nuno Sá
2023-12-26 15:30       ` Jonathan Cameron
2023-12-19 17:50 ` [PATCH v5 7/8] iio: buffer-dmaengine: Support new DMABUF based userspace API Paul Cercueil
2023-12-21 16:12   ` Jonathan Cameron
2023-12-21 17:30     ` Paul Cercueil
2023-12-22  8:58       ` Nuno Sá
2023-12-26 15:31         ` Jonathan Cameron
2023-12-19 17:50 ` [PATCH v5 8/8] Documentation: iio: Document high-speed DMABUF based API Paul Cercueil
2023-12-21 16:15   ` Jonathan Cameron
2023-12-21 16:30 ` [PATCH v5 0/8] iio: new DMABUF based API, v5 Jonathan Cameron
2023-12-21 17:56   ` Paul Cercueil
2023-12-26 15:37     ` Jonathan Cameron
2024-01-08 21:12 ` Andrew Davis [this message]
2024-01-11  9:20   ` Paul Cercueil
2024-01-11 17:30     ` Andrew Davis
2024-01-12 11:33       ` Paul Cercueil
