linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH 0/5] dma-mapping: arm64: support batched cache sync
@ 2025-10-29  2:31 Barry Song
  2025-10-29  2:31 ` [RFC PATCH 1/5] arm64: Provide dcache_by_myline_op_nosync helper Barry Song
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Barry Song @ 2025-10-29  2:31 UTC
  To: Catalin Marinas, Will Deacon, Marek Szyprowski, Robin Murphy
  Cc: Ryan Roberts, iommu, Anshuman Khandual, Marc Zyngier,
	Tangquan Zheng, linux-kernel, Barry Song, Suren Baghdasaryan,
	Ard Biesheuvel, linux-arm-kernel

From: Barry Song <v-songbaohua@oppo.com>

Many embedded ARM64 SoCs still lack hardware cache coherency support, which
causes DMA mapping operations to appear as hotspots in on-CPU flame graphs.

For an SG list with *nents* entries, the current dma_map/unmap_sg() and DMA
sync APIs perform cache maintenance one entry at a time. After each entry,
the implementation synchronously waits for the corresponding region’s
D-cache operations to complete. On architectures like arm64, efficiency can
be improved by issuing all entries’ operations first and then performing a
single batched wait for completion.
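
The difference between the two strategies can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the code
from this series: `struct sg_entry`, `cache_op_range_nosync()`,
`cache_op_wait()` and the 64-byte line size are hypothetical stand-ins. On
arm64 the wait would correspond to a DSB barrier issued after the per-line
cache maintenance instructions; here both are replaced by counters so the
model can run anywhere.

```c
#include <stddef.h>

/* Hypothetical model of one scatter-gather entry (not the kernel's struct scatterlist). */
struct sg_entry {
	size_t addr;
	size_t len;
};

#define CACHE_LINE 64		/* assumed line size, for illustration only */

static unsigned long line_ops;	/* per-line maintenance operations issued */
static unsigned long barriers;	/* completion waits performed */

static void reset_counters(void)
{
	line_ops = 0;
	barriers = 0;
}

/* Issue maintenance for every line covering [addr, addr + len), without waiting. */
static void cache_op_range_nosync(size_t addr, size_t len)
{
	size_t p = addr & ~(size_t)(CACHE_LINE - 1);
	size_t end = addr + len;

	for (; p < end; p += CACHE_LINE)
		line_ops++;
}

/* Stand-in for the completion wait (a DSB on arm64). */
static void cache_op_wait(void)
{
	barriers++;
}

/* Current behaviour as described above: one wait after each entry. */
static void sync_sg_per_entry(const struct sg_entry *sg, int nents)
{
	for (int i = 0; i < nents; i++) {
		cache_op_range_nosync(sg[i].addr, sg[i].len);
		cache_op_wait();
	}
}

/* Batched behaviour: issue all entries first, then wait once. */
static void sync_sg_batched(const struct sg_entry *sg, int nents)
{
	for (int i = 0; i < nents; i++)
		cache_op_range_nosync(sg[i].addr, sg[i].len);
	cache_op_wait();
}
```

In this model, four 4 KB entries produce 256 line operations with either
function, but sync_sg_per_entry() performs four waits while
sync_sg_batched() performs one. The number of waits saved grows linearly
with *nents*.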

Tangquan's initial results show that batched synchronization can reduce
dma_map_sg() time by 64.61% and dma_unmap_sg() time by 66.60% on an MTK
phone platform (MediaTek Dimensity 9500). The tests were performed by
pinning the task to CPU7 and fixing the CPU frequency at 2.6 GHz,
running dma_map_sg() and dma_unmap_sg() on 10 MB buffers (10 MB / 4 KB
= 2560 sg entries per buffer) for 200 iterations and then averaging the
results.

Barry Song (5):
  arm64: Provide dcache_by_myline_op_nosync helper
  arm64: Provide dcache_clean_poc_nosync helper
  arm64: Provide dcache_inval_poc_nosync helper
  arm64: Provide arch_sync_dma_ batched helpers
  dma-mapping: Allow batched DMA sync operations if supported by the
    arch

 arch/arm64/Kconfig                  |  1 +
 arch/arm64/include/asm/assembler.h  | 79 +++++++++++++++++++-------
 arch/arm64/include/asm/cacheflush.h |  2 +
 arch/arm64/mm/cache.S               | 58 +++++++++++++++----
 arch/arm64/mm/dma-mapping.c         | 24 ++++++++
 include/linux/dma-map-ops.h         |  8 +++
 kernel/dma/Kconfig                  |  3 +
 kernel/dma/direct.c                 | 53 ++++++++++++++++--
 kernel/dma/direct.h                 | 86 +++++++++++++++++++++++++----
 9 files changed, 267 insertions(+), 47 deletions(-)

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tangquan Zheng <zhengtangquan@oppo.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: iommu@lists.linux.dev

-- 
2.39.3 (Apple Git-146)




end of thread, other threads:[~2025-11-24 18:20 UTC | newest]

Thread overview: 12+ messages
2025-10-29  2:31 [RFC PATCH 0/5] dma-mapping: arm64: support batched cache sync Barry Song
2025-10-29  2:31 ` [RFC PATCH 1/5] arm64: Provide dcache_by_myline_op_nosync helper Barry Song
2025-10-29  2:31 ` [RFC PATCH 2/5] arm64: Provide dcache_clean_poc_nosync helper Barry Song
2025-10-29  2:31 ` [RFC PATCH 3/5] arm64: Provide dcache_inval_poc_nosync helper Barry Song
2025-10-29  2:31 ` [RFC PATCH 4/5] arm64: Provide arch_sync_dma_ batched helpers Barry Song
2025-10-29  2:31 ` [RFC PATCH 5/5] dma-mapping: Allow batched DMA sync operations if supported by the arch Barry Song
2025-11-13 18:19   ` Catalin Marinas
2025-11-17 21:12     ` Barry Song
2025-11-21 16:09       ` Marek Szyprowski
2025-11-21 23:28         ` Barry Song
2025-11-24 18:11           ` Marek Szyprowski
2025-11-06 20:44 ` [RFC PATCH 0/5] dma-mapping: arm64: support batched cache sync Barry Song
