From: Leon Romanovsky <leon@kernel.org>
To: Barry Song <21cnbao@gmail.com>
Cc: Juergen Gross <jgross@suse.com>,
Tangquan Zheng <zhengtangquan@oppo.com>,
Barry Song <baohua@kernel.org>,
Stefano Stabellini <sstabellini@kernel.org>,
Ryan Roberts <ryan.roberts@arm.com>,
will@kernel.org, Anshuman Khandual <anshuman.khandual@arm.com>,
catalin.marinas@arm.com, Joerg Roedel <joro@8bytes.org>,
linux-kernel@vger.kernel.org,
Suren Baghdasaryan <surenb@google.com>,
iommu@lists.linux.dev, Marc Zyngier <maz@kernel.org>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
xen-devel@lists.xenproject.org, robin.murphy@arm.com,
Ard Biesheuvel <ardb@kernel.org>,
linux-arm-kernel@lists.infradead.org, m.szyprowski@samsung.com
Subject: Re: [PATCH v2 4/8] dma-mapping: Separate DMA sync issuing and completion waiting
Date: Sat, 27 Dec 2025 22:07:06 +0200
Message-ID: <20251227200706.GN11869@unreal>
In-Reply-To: <20251226225254.46197-5-21cnbao@gmail.com>
On Sat, Dec 27, 2025 at 11:52:44AM +1300, Barry Song wrote:
> From: Barry Song <baohua@kernel.org>
>
> Currently, arch_sync_dma_for_cpu and arch_sync_dma_for_device
> always wait for the completion of each DMA buffer. That is,
> issuing the DMA sync and waiting for completion is done in a
> single API call.
>
> For scatter-gather lists with multiple entries, this means
> issuing and waiting is repeated for each entry, which can hurt
> performance. Architectures like ARM64 may be able to issue all
> DMA sync operations for all entries first and then wait for
> completion together.
>
> To address this, arch_sync_dma_for_* now issues DMA operations in
> batch, followed by a flush. On ARM64, the flush is implemented
> using a dsb instruction within arch_sync_dma_flush().
>
> For now, add arch_sync_dma_flush() after each
> arch_sync_dma_for_*() call. arch_sync_dma_flush() is defined as a
> no-op on all architectures except arm64, so this patch does not
> change existing behavior. Subsequent patches will introduce true
> batching for SG DMA buffers.
>
> Cc: Leon Romanovsky <leon@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ada Couprie Diaz <ada.coupriediaz@arm.com>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> Signed-off-by: Barry Song <baohua@kernel.org>
> ---
> arch/arm64/include/asm/cache.h | 6 ++++++
> arch/arm64/mm/dma-mapping.c | 4 ++--
> drivers/iommu/dma-iommu.c | 37 +++++++++++++++++++++++++---------
> drivers/xen/swiotlb-xen.c | 24 ++++++++++++++--------
> include/linux/dma-map-ops.h | 6 ++++++
> kernel/dma/direct.c | 8 ++++++--
> kernel/dma/direct.h | 9 +++++++--
> kernel/dma/swiotlb.c | 4 +++-
> 8 files changed, 73 insertions(+), 25 deletions(-)
<...>
> +#ifndef arch_sync_dma_flush
> +static inline void arch_sync_dma_flush(void)
> +{
> +}
> +#endif
Over the weekend I realized a useful advantage of the ARCH_HAVE_* config
options: they make it straightforward to inspect the entire DMA path simply
by looking at the .config.
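
To make the trade-off concrete, here is a rough userspace sketch, not kernel
code. It models the split between issuing sync operations and waiting for
them, with the wait gated by a config-style macro rather than an #ifndef on
the function name. ARCH_HAVE_SYNC_DMA_FLUSH, sync_issue(), sync_flush() and
the two counters are hypothetical names made up for illustration; none of
them come from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Kconfig symbol; not a name from the patch. */
#define ARCH_HAVE_SYNC_DMA_FLUSH 1

static unsigned int issued;   /* cache-maintenance operations issued */
static unsigned int barriers; /* completion waits (a dsb on arm64) */

/* Models the "issue only" half: starts maintenance, does not wait. */
static void sync_issue(size_t entry)
{
	(void)entry;
	issued++;
}

#if ARCH_HAVE_SYNC_DMA_FLUSH
/* Models the "wait for completion" half. */
static void sync_flush(void)
{
	barriers++;
}
#else
/* Architectures without the option get a no-op. */
static inline void sync_flush(void)
{
}
#endif

/* What this patch keeps: one flush after every issued sync. */
static void sync_sg_unbatched(size_t nents)
{
	for (size_t i = 0; i < nents; i++) {
		sync_issue(i);
		sync_flush();
	}
}

/* What the later patches aim for: issue everything, then flush once. */
static void sync_sg_batched(size_t nents)
{
	for (size_t i = 0; i < nents; i++)
		sync_issue(i);
	sync_flush();
}
```

With the macro set to 1, sync_sg_unbatched(8) counts eight barriers while
sync_sg_batched(8) counts one, for the same eight issued operations. With it
set to 0 the flush compiles away, and that choice would be visible from the
configuration alone.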
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>