* [PATCH 0/3] RDMA: Enable runs with DMA debug enabled
@ 2026-03-07 16:49 Leon Romanovsky
2026-03-07 16:49 ` [PATCH 1/3] dma-debug: Allow multiple invocations of overlapping entries Leon Romanovsky
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Leon Romanovsky @ 2026-03-07 16:49 UTC (permalink / raw)
To: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Jason Gunthorpe, Leon Romanovsky
Cc: iommu, linux-kernel, linux-doc, virtualization, linux-rdma
Fix dma-debug to allow RDMA to run in that mode too.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
Leon Romanovsky (3):
dma-debug: Allow multiple invocations of overlapping entries
dma-mapping: Clarify valid conditions for CPU cache line overlap
RDMA/umem: Tell DMA debug that cacheline overlap is expected
Documentation/core-api/dma-attributes.rst | 26 ++++++++++++++++++--------
drivers/infiniband/core/umem.c | 2 +-
drivers/virtio/virtio_ring.c | 4 ++--
include/linux/dma-mapping.h | 8 ++++----
kernel/dma/debug.c | 8 ++++----
5 files changed, 29 insertions(+), 19 deletions(-)
---
base-commit: 11439c4635edd669ae435eec308f4ab8a0804808
change-id: 20260305-dma-debug-overlap-21487c3fa02c
Best regards,
--
Leon Romanovsky <leonro@nvidia.com>
* [PATCH 1/3] dma-debug: Allow multiple invocations of overlapping entries
2026-03-07 16:49 [PATCH 0/3] RDMA: Enable runs with DMA debug enabled Leon Romanovsky
@ 2026-03-07 16:49 ` Leon Romanovsky
2026-03-07 16:49 ` [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap Leon Romanovsky
2026-03-07 16:49 ` [PATCH 3/3] RDMA/umem: Tell DMA debug that cacheline overlap is expected Leon Romanovsky
2 siblings, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2026-03-07 16:49 UTC (permalink / raw)
To: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Jason Gunthorpe, Leon Romanovsky
Cc: iommu, linux-kernel, linux-doc, virtualization, linux-rdma
From: Leon Romanovsky <leonro@nvidia.com>
Repeated DMA mappings with DMA_ATTR_CPU_CACHE_CLEAN trigger the
following splat. This prevents using the attribute in cases where a DMA
region is shared and reused more than seven times.
------------[ cut here ]------------
DMA-API: exceeded 7 overlapping mappings of cacheline 0x000000000438c440
WARNING: kernel/dma/debug.c:467 at add_dma_entry+0x219/0x280, CPU#4: ibv_rc_pingpong/1644
Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay mlx5_fwctl zram zsmalloc mlx5_ib fuse rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_core ib_core
CPU: 4 UID: 2733 PID: 1644 Comm: ibv_rc_pingpong Not tainted 6.19.0+ #129 PREEMPT
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:add_dma_entry+0x221/0x280
Code: c0 0f 84 f2 fe ff ff 83 e8 01 89 05 6d 99 11 01 e9 e4 fe ff ff 0f 8e 1f ff ff ff 48 8d 3d 07 ef 2d 01 be 07 00 00 00 48 89 e2 <67> 48 0f b9 3a e9 06 ff ff ff 48 c7 c7 98 05 2b 82 c6 05 72 92 28
RSP: 0018:ff1100010e657970 EFLAGS: 00010002
RAX: 0000000000000007 RBX: ff1100010234eb00 RCX: 0000000000000000
RDX: ff1100010e657970 RSI: 0000000000000007 RDI: ffffffff82678660
RBP: 000000000438c440 R08: 0000000000000228 R09: 0000000000000000
R10: 00000000000001be R11: 000000000000089d R12: 0000000000000800
R13: 00000000ffffffef R14: 0000000000000202 R15: ff1100010234eb00
FS: 00007fb15f3f6740(0000) GS:ff110008dcc19000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb15f32d3a0 CR3: 0000000116f59001 CR4: 0000000000373eb0
Call Trace:
<TASK>
debug_dma_map_sg+0x1b4/0x390
__dma_map_sg_attrs+0x6d/0x1a0
dma_map_sgtable+0x19/0x30
ib_umem_get+0x284/0x3b0 [ib_uverbs]
mlx5_ib_reg_user_mr+0x68/0x2a0 [mlx5_ib]
ib_uverbs_reg_mr+0x17f/0x2a0 [ib_uverbs]
ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xc2/0x130 [ib_uverbs]
ib_uverbs_cmd_verbs+0xa0b/0xae0 [ib_uverbs]
? ib_uverbs_handler_UVERBS_METHOD_QUERY_PORT_SPEED+0xe0/0xe0 [ib_uverbs]
? mmap_region+0x7a/0xb0
? do_mmap+0x3b8/0x5c0
ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs]
__x64_sys_ioctl+0x14f/0x8b0
? ksys_mmap_pgoff+0xc5/0x190
do_syscall_64+0x8c/0xbf0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fb15f5e4eed
Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
RSP: 002b:00007ffe09a5c540 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffe09a5c5d0 RCX: 00007fb15f5e4eed
RDX: 00007ffe09a5c5f0 RSI: 00000000c0181b01 RDI: 0000000000000003
RBP: 00007ffe09a5c590 R08: 0000000000000028 R09: 00007ffe09a5c794
R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffe09a5c794
R13: 000000000000000c R14: 0000000025a49170 R15: 000000000000000c
</TASK>
---[ end trace 0000000000000000 ]---
Fixes: 61868dc55a11 ("dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
kernel/dma/debug.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 86f87e43438c3..be207be749968 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -453,7 +453,7 @@ static int active_cacheline_set_overlap(phys_addr_t cln, int overlap)
return overlap;
}
-static void active_cacheline_inc_overlap(phys_addr_t cln)
+static void active_cacheline_inc_overlap(phys_addr_t cln, bool is_cache_clean)
{
int overlap = active_cacheline_read_overlap(cln);
@@ -462,7 +462,7 @@ static void active_cacheline_inc_overlap(phys_addr_t cln)
/* If we overflowed the overlap counter then we're potentially
* leaking dma-mappings.
*/
- WARN_ONCE(overlap > ACTIVE_CACHELINE_MAX_OVERLAP,
+ WARN_ONCE(!is_cache_clean && overlap > ACTIVE_CACHELINE_MAX_OVERLAP,
pr_fmt("exceeded %d overlapping mappings of cacheline %pa\n"),
ACTIVE_CACHELINE_MAX_OVERLAP, &cln);
}
@@ -495,7 +495,7 @@ static int active_cacheline_insert(struct dma_debug_entry *entry,
if (rc == -EEXIST) {
struct dma_debug_entry *existing;
- active_cacheline_inc_overlap(cln);
+ active_cacheline_inc_overlap(cln, entry->is_cache_clean);
existing = radix_tree_lookup(&dma_active_cacheline, cln);
/* A lookup failure here after we got -EEXIST is unexpected. */
WARN_ON(!existing);
--
2.53.0
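
A minimal userspace sketch of the counter logic touched by this patch, assuming the overlap count saturates at ACTIVE_CACHELINE_MAX_OVERLAP (7, going by the splat above). The names cacheline_model, model_inc_overlap and model_map_n are hypothetical and do not exist in kernel/dma/debug.c; only the gating condition is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed equal to ACTIVE_CACHELINE_MAX_OVERLAP; the splat above reports 7. */
#define MODEL_MAX_OVERLAP 7

/* Hypothetical stand-in for the per-cacheline state dma-debug tracks. */
struct cacheline_model {
	int overlap;	/* saturating count of extra mappings on this line */
	bool warned;	/* models WARN_ONCE() having fired */
};

/* Models active_cacheline_inc_overlap() with the new is_cache_clean gate. */
static void model_inc_overlap(struct cacheline_model *cl, bool is_cache_clean)
{
	int overlap = cl->overlap + 1;

	/* The stored count never grows past the maximum. */
	if (overlap <= MODEL_MAX_OVERLAP)
		cl->overlap = overlap;

	/* After the patch, only mappings without the attribute can warn. */
	if (!is_cache_clean && overlap > MODEL_MAX_OVERLAP)
		cl->warned = true;
}

/* Maps the same cacheline n times; returns true if the model warned. */
static bool model_map_n(int n, bool is_cache_clean)
{
	struct cacheline_model cl = { 0, false };

	assert(n >= 1);
	/* The first mapping inserts the cacheline; each later one overlaps. */
	for (int i = 1; i < n; i++)
		model_inc_overlap(&cl, is_cache_clean);
	return cl.warned;
}
```

Under these assumptions, nine plain mappings of one cacheline warn, while any number of mappings carrying the attribute stay silent.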
* [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-07 16:49 [PATCH 0/3] RDMA: Enable runs with DMA debug enabled Leon Romanovsky
2026-03-07 16:49 ` [PATCH 1/3] dma-debug: Allow multiple invocations of overlapping entries Leon Romanovsky
@ 2026-03-07 16:49 ` Leon Romanovsky
2026-03-08 18:19 ` Jason Gunthorpe
2026-03-07 16:49 ` [PATCH 3/3] RDMA/umem: Tell DMA debug that cacheline overlap is expected Leon Romanovsky
2 siblings, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2026-03-07 16:49 UTC (permalink / raw)
To: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Jason Gunthorpe, Leon Romanovsky
Cc: iommu, linux-kernel, linux-doc, virtualization, linux-rdma
From: Leon Romanovsky <leonro@nvidia.com>
Rename the DMA_ATTR_CPU_CACHE_CLEAN attribute to reflect that it allows
CPU cache overlaps to exist, and document a slightly different but still
valid use case involving overlapping CPU cache lines.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
Documentation/core-api/dma-attributes.rst | 26 ++++++++++++++++++--------
drivers/virtio/virtio_ring.c | 4 ++--
include/linux/dma-mapping.h | 8 ++++----
kernel/dma/debug.c | 2 +-
4 files changed, 25 insertions(+), 15 deletions(-)
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1d7bfad73b1c7..6b73d92c62721 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -149,11 +149,21 @@ For architectures that require cache flushing for DMA coherence
DMA_ATTR_MMIO will not perform any cache flushing. The address
provided must never be mapped cacheable into the CPU.
-DMA_ATTR_CPU_CACHE_CLEAN
-------------------------
-
-This attribute indicates the CPU will not dirty any cacheline overlapping this
-DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
-multiple small buffers to safely share a cacheline without risk of data
-corruption, suppressing DMA debug warnings about overlapping mappings.
-All mappings sharing a cacheline should have this attribute.
+DMA_ATTR_CPU_CACHE_OVERLAP
+--------------------------
+
+This attribute indicates that CPU cache lines may overlap for buffers mapped
+with DMA_FROM_DEVICE or DMA_BIDIRECTIONAL.
+
+Such overlap may occur when callers map multiple small buffers that reside
+within the same cache line. In this case, callers must guarantee that the CPU
+will not dirty these cache lines after the mappings are established. When this
+condition is met, multiple buffers can safely share a cache line without risking
+data corruption.
+
+Another valid use case is on systems that are CPU-coherent and do not use
+SWIOTLB, where the caller can guarantee that no cache maintenance operations
+(such as flushes) will be performed that could overwrite shared cache lines.
+
+All mappings that share a cache line must set this attribute to suppress DMA
+debug warnings about overlapping mappings.
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 335692d41617a..bf51ae9a39169 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2912,7 +2912,7 @@ EXPORT_SYMBOL_GPL(virtqueue_add_inbuf);
* @data: the token identifying the buffer.
* @gfp: how to do memory allocations (if necessary).
*
- * Same as virtqueue_add_inbuf but passes DMA_ATTR_CPU_CACHE_CLEAN to indicate
+ * Same as virtqueue_add_inbuf but passes DMA_ATTR_CPU_CACHE_OVERLAP to indicate
* that the CPU will not dirty any cacheline overlapping this buffer while it
* is available, and to suppress overlapping cacheline warnings in DMA debug
* builds.
@@ -2928,7 +2928,7 @@ int virtqueue_add_inbuf_cache_clean(struct virtqueue *vq,
gfp_t gfp)
{
return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp,
- DMA_ATTR_CPU_CACHE_CLEAN);
+ DMA_ATTR_CPU_CACHE_OVERLAP);
}
EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_cache_clean);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 29973baa05816..45efede1a6cce 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -80,11 +80,11 @@
#define DMA_ATTR_MMIO (1UL << 10)
/*
- * DMA_ATTR_CPU_CACHE_CLEAN: Indicates the CPU will not dirty any cacheline
- * overlapping this buffer while it is mapped for DMA. All mappings sharing
- * a cacheline must have this attribute for this to be considered safe.
+ * DMA_ATTR_CPU_CACHE_OVERLAP: Indicates the CPU cache line can be overlapped.
+ * All mappings sharing a cacheline must have this attribute for this
+ * to be considered safe.
*/
-#define DMA_ATTR_CPU_CACHE_CLEAN (1UL << 11)
+#define DMA_ATTR_CPU_CACHE_OVERLAP (1UL << 11)
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index be207be749968..603be342063f1 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -601,7 +601,7 @@ static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
unsigned long flags;
int rc;
- entry->is_cache_clean = !!(attrs & DMA_ATTR_CPU_CACHE_CLEAN);
+ entry->is_cache_clean = attrs & DMA_ATTR_CPU_CACHE_OVERLAP;
bucket = get_hash_bucket(entry, &flags);
hash_bucket_add(bucket, entry);
--
2.53.0
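
A rough userspace sketch of the rule "all mappings that share a cache line must set this attribute", as dma-debug might evaluate it when an insert collides with an existing entry. MODEL_ATTR_CPU_CACHE_OVERLAP, model_entry and both helpers are hypothetical names; the sketch also assumes is_cache_clean is a C bool, in which case the conversion from the masked attrs value yields 0 or 1 even without the "!!" that the hunk above drops.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical copy of the flag value shown in the patch: (1UL << 11). */
#define MODEL_ATTR_CPU_CACHE_OVERLAP (1UL << 11)

/* Hypothetical, reduced version of a dma-debug entry. */
struct model_entry {
	bool is_cache_clean;
};

/* Converting a non-zero masked value to bool gives true, so no "!!" needed. */
static struct model_entry model_make_entry(unsigned long attrs)
{
	struct model_entry e = {
		.is_cache_clean = attrs & MODEL_ATTR_CPU_CACHE_OVERLAP,
	};
	return e;
}

/*
 * Returns true if a new mapping landing on an already tracked cacheline
 * should be reported. The overlap is accepted only when both sides carry
 * the attribute.
 */
static bool model_overlap_warns(const struct model_entry *existing,
				const struct model_entry *incoming)
{
	return !(existing->is_cache_clean && incoming->is_cache_clean);
}
```

If the field were a narrower integer type rather than bool, the masked value (bit 11) would be truncated to zero, so the assumption about the field type matters.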
* [PATCH 3/3] RDMA/umem: Tell DMA debug that cacheline overlap is expected
2026-03-07 16:49 [PATCH 0/3] RDMA: Enable runs with DMA debug enabled Leon Romanovsky
2026-03-07 16:49 ` [PATCH 1/3] dma-debug: Allow multiple invocations of overlapping entries Leon Romanovsky
2026-03-07 16:49 ` [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap Leon Romanovsky
@ 2026-03-07 16:49 ` Leon Romanovsky
2 siblings, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2026-03-07 16:49 UTC (permalink / raw)
To: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Jason Gunthorpe, Leon Romanovsky
Cc: iommu, linux-kernel, linux-doc, virtualization, linux-rdma
From: Leon Romanovsky <leonro@nvidia.com>
The RDMA subsystem exposes DMA regions through the verbs interface. A
given region can be exported multiple times, which can trigger warnings
about cacheline overlaps. In this case the warnings are false positives,
because RDMA does not use SWIOTLB and uverbs operate only on CPU-coherent
architectures.
infiniband rocep8s0f0: mlx5_ib_reg_user_mr:1592:(pid 5812): start 0x2b28c000, iova 0x2b28c000, length 0x1000, access_flags 0x1
infiniband rocep8s0f0: mlx5_ib_reg_user_mr:1592:(pid 5812): start 0x2b28c001, iova 0x2b28c001, length 0xfff, access_flags 0x1
------------[ cut here ]------------
DMA-API: mlx5_core 0000:08:00.0: cacheline tracking EEXIST, overlapping mappings aren't supported
WARNING: kernel/dma/debug.c:620 at add_dma_entry+0x1bb/0x280, CPU#6: ibv_rc_pingpong/5812
Modules linked in: veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay mlx5_fwctl zram zsmalloc mlx5_ib fuse rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_core ib_core
CPU: 6 UID: 2733 PID: 5812 Comm: ibv_rc_pingpong Tainted: G W 6.19.0+ #129 PREEMPT
Tainted: [W]=WARN
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:add_dma_entry+0x1be/0x280
Code: 8b 7b 10 48 85 ff 0f 84 c3 00 00 00 48 8b 6f 50 48 85 ed 75 03 48 8b 2f e8 ff 8e 6a 00 48 89 c6 48 8d 3d 55 ef 2d 01 48 89 ea <67> 48 0f b9 3a 48 85 db 74 1a 48 c7 c7 b0 00 2b 82 e8 9c 25 fd ff
RSP: 0018:ff11000138717978 EFLAGS: 00010286
RAX: ffffffffa02d7831 RBX: ff1100010246de00 RCX: 0000000000000000
RDX: ff110001036fac30 RSI: ffffffffa02d7831 RDI: ffffffff82678650
RBP: ff110001036fac30 R08: ff11000110dcb4a0 R09: ff11000110dcb478
R10: 0000000000000000 R11: ffffffff824b30a8 R12: 0000000000000000
R13: 00000000ffffffef R14: 0000000000000202 R15: ff1100010246de00
FS: 00007f59b411c740(0000) GS:ff110008dcc99000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe538f7000 CR3: 000000010e066005 CR4: 0000000000373eb0
Call Trace:
<TASK>
debug_dma_map_sg+0x1b4/0x390
__dma_map_sg_attrs+0x6d/0x1a0
dma_map_sgtable+0x19/0x30
ib_umem_get+0x254/0x380 [ib_uverbs]
mlx5_ib_reg_user_mr+0x68/0x2a0 [mlx5_ib]
ib_uverbs_reg_mr+0x17f/0x2a0 [ib_uverbs]
ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xc2/0x130 [ib_uverbs]
ib_uverbs_cmd_verbs+0xa0b/0xae0 [ib_uverbs]
? ib_uverbs_handler_UVERBS_METHOD_QUERY_PORT_SPEED+0xe0/0xe0 [ib_uverbs]
? mmap_region+0x7a/0xb0
? do_mmap+0x3b8/0x5c0
ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs]
__x64_sys_ioctl+0x14f/0x8b0
? ksys_mmap_pgoff+0xc5/0x190
do_syscall_64+0x8c/0xbf0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f59b430aeed
Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
RSP: 002b:00007ffe538f9430 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffe538f94c0 RCX: 00007f59b430aeed
RDX: 00007ffe538f94e0 RSI: 00000000c0181b01 RDI: 0000000000000003
RBP: 00007ffe538f9480 R08: 0000000000000028 R09: 00007ffe538f9684
R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffe538f9684
R13: 000000000000000c R14: 000000002b28d170 R15: 000000000000000c
</TASK>
---[ end trace 0000000000000000 ]---
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/infiniband/core/umem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index cff4fcca2c345..4ae04b6e6927c 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -169,7 +169,7 @@ struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
unsigned long lock_limit;
unsigned long new_pinned;
unsigned long cur_base;
- unsigned long dma_attr = 0;
+ unsigned long dma_attr = DMA_ATTR_CPU_CACHE_OVERLAP;
struct mm_struct *mm;
unsigned long npages;
int pinned, ret;
--
2.53.0
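
To see why the two registrations logged above collide, the arithmetic can be sketched in userspace. This assumes 64-byte cache lines and that both registrations pin the same physical page, so that offsets within the page carry over; the logged values are user virtual addresses, whereas dma-debug tracks physical cachelines. All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define MODEL_CACHELINE_SHIFT 6

struct model_range {
	uint64_t start;
	uint64_t length;
};

/* Index of the first cacheline touched by the range. */
static uint64_t model_first_line(struct model_range r)
{
	return r.start >> MODEL_CACHELINE_SHIFT;
}

/* Index of the last cacheline touched by the range (length must be > 0). */
static uint64_t model_last_line(struct model_range r)
{
	return (r.start + r.length - 1) >> MODEL_CACHELINE_SHIFT;
}

/* Non-zero if the two ranges touch at least one common cacheline. */
static int model_ranges_share_line(struct model_range a, struct model_range b)
{
	return model_first_line(a) <= model_last_line(b) &&
	       model_first_line(b) <= model_last_line(a);
}

/* The two mlx5_ib_reg_user_mr() calls from the log above. */
static const struct model_range model_mr1 = { 0x2b28c000, 0x1000 };
static const struct model_range model_mr2 = { 0x2b28c001, 0xfff };
```

Both ranges start and end in the same cachelines, so the second registration collides with the first on every line of the page.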
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-07 16:49 ` [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap Leon Romanovsky
@ 2026-03-08 18:19 ` Jason Gunthorpe
2026-03-08 18:49 ` Leon Romanovsky
0 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2026-03-08 18:19 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
> -This attribute indicates the CPU will not dirty any cacheline overlapping this
> -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> -multiple small buffers to safely share a cacheline without risk of data
> -corruption, suppressing DMA debug warnings about overlapping mappings.
> -All mappings sharing a cacheline should have this attribute.
> +DMA_ATTR_CPU_CACHE_OVERLAP
This is a very specific and well defined use case that allows some cache
flushing behaviors to work only under the promise that the CPU doesn't
touch the memory to cause cache inconsistencies.
> +Another valid use case is on systems that are CPU-coherent and do not use
> +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> +(such as flushes) will be performed that could overwrite shared cache lines.
This is something completely unrelated.
What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
fails any mapping requests that would use any SWIOTLB or cache
flushing.
It should only be used by callers like RDMA/DRM/etc where they have
historical uAPI that has never supported incoherent DMA operation and
are an exception to the normal DMA API requirements.
The problem is to limit the use of that flag to only a few approved
places. I fear adding such a flag wide open would open the door to
widespread driver abuse. These days we have 'export symbol for module'
so maybe there is a way to do it with safety?
I'd really like this right now because CC systems are forcing SWIOTLB
and things like RDMA userspace are unfixably broken with SWIOTLB. The
uAPI it has simply cannot work with it. I'd much rather fail
immediately than suffer data corruption. Jiri was looking at adding some
hacky "is cc" check, but I'd far prefer a proper flag that covered all
the uAPI breaking cases.
Jason
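
One way to picture the proposed semantics is the userspace sketch below. It is only an illustration of "fail any mapping that would use SWIOTLB or cache flushing": the flag value, the model_device fields, model_dma_map and the choice of -EIO are all assumptions, not an existing or agreed kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag; no such attribute exists in the patches above. */
#define MODEL_ATTR_REQUIRE_COHERENT (1UL << 0)

/* Hypothetical summary of how a device's mappings would be handled. */
struct model_device {
	bool dma_coherent;	/* no cache maintenance needed for DMA */
	bool needs_bounce;	/* mapping would be bounced through SWIOTLB */
};

/*
 * Returns 0 if the mapping may proceed, or a negative errno if the caller
 * demanded coherence and the mapping could not provide it.
 */
static int model_dma_map(const struct model_device *dev, unsigned long attrs)
{
	if (attrs & MODEL_ATTR_REQUIRE_COHERENT) {
		/* Refuse instead of silently bouncing or flushing caches. */
		if (dev->needs_bounce || !dev->dma_coherent)
			return -EIO;
	}
	return 0;
}
```

Callers that do not pass the flag behave as before in this model; only the opt-in callers get a hard failure.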
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-08 18:19 ` Jason Gunthorpe
@ 2026-03-08 18:49 ` Leon Romanovsky
2026-03-08 23:09 ` Jason Gunthorpe
0 siblings, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2026-03-08 18:49 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
> On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
>
> > -This attribute indicates the CPU will not dirty any cacheline overlapping this
> > -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> > -multiple small buffers to safely share a cacheline without risk of data
> > -corruption, suppressing DMA debug warnings about overlapping mappings.
> > -All mappings sharing a cacheline should have this attribute.
> > +DMA_ATTR_CPU_CACHE_OVERLAP
>
> This is a very specific and well defined use case that allows some cache
> flushing behaviors to work only under the promise that the CPU doesn't
> touch the memory to cause cache inconsistencies.
>
> > +Another valid use case is on systems that are CPU-coherent and do not use
> > +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> > +(such as flushes) will be performed that could overwrite shared cache lines.
>
> This is something completely unrelated.
I disagree. The situation is equivalent in that callers guarantee the
CPU cache will not be overwritten. For the RDMA case, this results in
the same behavior as with virtio. For our case, it addresses and
clears the debug warnings.
>
> What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
> fails any mapping requests that would use any SWIOTLB or cache
> flushing.
You are proposing something orthogonal that operates at a different layer
(DMA mapping). However, for DMA debugging, your new attribute will be
equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.
Thanks
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-08 18:49 ` Leon Romanovsky
@ 2026-03-08 23:09 ` Jason Gunthorpe
2026-03-09 9:03 ` Leon Romanovsky
0 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2026-03-08 23:09 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Sun, Mar 08, 2026 at 08:49:02PM +0200, Leon Romanovsky wrote:
> On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
> > On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
> >
> > > -This attribute indicates the CPU will not dirty any cacheline overlapping this
> > > -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> > > -multiple small buffers to safely share a cacheline without risk of data
> > > -corruption, suppressing DMA debug warnings about overlapping mappings.
> > > -All mappings sharing a cacheline should have this attribute.
> > > +DMA_ATTR_CPU_CACHE_OVERLAP
> >
> > This is a very specific and well defined use case that allows some cache
> > flushing behaviors to work only under the promise that the CPU doesn't
> > touch the memory to cause cache inconsistencies.
> >
> > > +Another valid use case is on systems that are CPU-coherent and do not use
> > > +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> > > +(such as flushes) will be performed that could overwrite shared cache lines.
> >
> > This is something completely unrelated.
>
> I disagree. The situation is equivalent in that callers guarantee the
> CPU cache will not be overwritten.
The RDMA callers do no such thing, they just don't work at all if
there is non-coherence in the mapping which is why it is not a bug.
virtio looks like it does actually keep the caches clean for different
mappings (and probably also in practice forced coherent as well given
qemu is coherent with the VM and VFIO doesn't allow non-coherent DMA
devices)
> > What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
> > fails any mapping requests that would use any SWIOTLB or cache
> > flushing.
>
> You are proposing something orthogonal that operates at a different layer
> (DMA mapping). However, for DMA debugging, your new attribute will be
> equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.
DMA_ATTR is a dma mapping flag, if you want some weird dma debugging
flag it should be called DMA_ATTR_DEBUGGING_IGNORE_CACHELINES with
some kind of statement at the user why it is OK.
Jason
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-08 23:09 ` Jason Gunthorpe
@ 2026-03-09 9:03 ` Leon Romanovsky
2026-03-09 12:30 ` Marek Szyprowski
0 siblings, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2026-03-09 9:03 UTC (permalink / raw)
To: Jason Gunthorpe, Marek Szyprowski
Cc: Robin Murphy, Michael S. Tsirkin, Petr Tesarik, Jonathan Corbet,
Shuah Khan, Jason Wang, Xuan Zhuo, Eugenio Pérez, iommu,
linux-kernel, linux-doc, virtualization, linux-rdma
On Sun, Mar 08, 2026 at 08:09:16PM -0300, Jason Gunthorpe wrote:
> On Sun, Mar 08, 2026 at 08:49:02PM +0200, Leon Romanovsky wrote:
> > On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
> > > On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
> > >
> > > > -This attribute indicates the CPU will not dirty any cacheline overlapping this
> > > > -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> > > > -multiple small buffers to safely share a cacheline without risk of data
> > > > -corruption, suppressing DMA debug warnings about overlapping mappings.
> > > > -All mappings sharing a cacheline should have this attribute.
> > > > +DMA_ATTR_CPU_CACHE_OVERLAP
> > >
> > > This is a very specific and well defined use case that allows some cache
> > > flushing behaviors to work only under the promise that the CPU doesn't
> > > touch the memory to cause cache inconsistencies.
> > >
> > > > +Another valid use case is on systems that are CPU-coherent and do not use
> > > > +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> > > > +(such as flushes) will be performed that could overwrite shared cache lines.
> > >
> > > This is something completely unrelated.
> >
> > I disagree. The situation is equivalent in that callers guarantee the
> > CPU cache will not be overwritten.
>
> The RDMA callers do no such thing, they just don't work at all if
> there is non-coherence in the mapping which is why it is not a bug.
>
> virtio looks like it does actually keep the caches clean for different
> mappings (and probably also in practice forced coherent as well given
> qemu is coherent with the VM and VFIO doesn't allow non-coherent DMA
> devices)
>
> > > What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
> > > fails any mapping requests that would use any SWIOTLB or cache
> > > flushing.
> >
> > You are proposing something orthogonal that operates at a different layer
> > (DMA mapping). However, for DMA debugging, your new attribute will be
> > equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.
>
> DMA_ATTR is a dma mapping flag, if you want some weird dma debugging
> flag it should be called DMA_ATTR_DEBUGGING_IGNORE_CACHELINES with
> some kind of statement at the user why it is OK.
And this is the issue: the existing DMA_ATTR_CPU_CACHE_CLEAN is essentially
a debug-oriented attribute. The upper layers are already handled through
__dma_from_device_group_begin()/end(), which pad cache lines on
non-coherent systems.
Marek,
What do you see as the right path forward here? RDMA has a legitimate use
case where CPU cache lines may overlap. The underlying reason differs from
VirtIO, but the outcome is the same. Should I keep the current name? Should
we rename it to the proposed DMA_ATTR_CPU_CACHE_OVERLAP or
DMA_ATTR_DEBUGGING_IGNORE_CACHELINES? Should we introduce a new
DMA_ATTR_REQUIRE_COHERENT attribute instead? Or do you have another
recommendation?
Thanks
>
> Jason
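
The padding idea mentioned above can be illustrated with plain C11. This is not the definition of __dma_from_device_group_begin()/end(); it is a sketch that uses alignas and an assumed 64-byte cache line to show the intended effect, with a hypothetical struct model_request.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. */
#define MODEL_CACHELINE 64

/* Hypothetical driver structure mixing CPU-written and device-written data. */
struct model_request {
	int state;	/* CPU-written bookkeeping */

	/* Device-written field, pushed onto its own cacheline. */
	alignas(MODEL_CACHELINE) unsigned char rx_status[4];

	/* The next CPU-written field starts on a fresh cacheline again. */
	alignas(MODEL_CACHELINE) int next_cpu_field;
};

/* Cacheline index of a byte offset within the structure. */
static size_t model_line_of(size_t offset)
{
	return offset / MODEL_CACHELINE;
}
```

With this layout, a CPU write to state or next_cpu_field cannot dirty the cacheline that holds rx_status, at the cost of a larger structure.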
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-09 9:03 ` Leon Romanovsky
@ 2026-03-09 12:30 ` Marek Szyprowski
2026-03-09 13:20 ` Jason Gunthorpe
2026-03-09 15:05 ` Leon Romanovsky
0 siblings, 2 replies; 17+ messages in thread
From: Marek Szyprowski @ 2026-03-09 12:30 UTC (permalink / raw)
To: Leon Romanovsky, Jason Gunthorpe
Cc: Robin Murphy, Michael S. Tsirkin, Petr Tesarik, Jonathan Corbet,
Shuah Khan, Jason Wang, Xuan Zhuo, Eugenio Pérez, iommu,
linux-kernel, linux-doc, virtualization, linux-rdma
On 09.03.2026 10:03, Leon Romanovsky wrote:
> On Sun, Mar 08, 2026 at 08:09:16PM -0300, Jason Gunthorpe wrote:
>> On Sun, Mar 08, 2026 at 08:49:02PM +0200, Leon Romanovsky wrote:
>>> On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
>>>> On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
>>>>
>>>>> -This attribute indicates the CPU will not dirty any cacheline overlapping this
>>>>> -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
>>>>> -multiple small buffers to safely share a cacheline without risk of data
>>>>> -corruption, suppressing DMA debug warnings about overlapping mappings.
>>>>> -All mappings sharing a cacheline should have this attribute.
>>>>> +DMA_ATTR_CPU_CACHE_OVERLAP
>>>> This is a very specific and well defined use case that allows some cache
>>>> flushing behaviors to work only under the promise that the CPU doesn't
>>>> touch the memory to cause cache inconsistencies.
>>>>
>>>>> +Another valid use case is on systems that are CPU-coherent and do not use
>>>>> +SWIOTLB, where the caller can guarantee that no cache maintenance operations
>>>>> +(such as flushes) will be performed that could overwrite shared cache lines.
>>>> This is something completely unrelated.
>>> I disagree. The situation is equivalent in that callers guarantee the
>>> CPU cache will not be overwritten.
>> The RDMA callers do no such thing, they just don't work at all if
>> there is non-coherence in the mapping which is why it is not a bug.
>>
>> virtio looks like it does actually keep the caches clean for different
>> mappings (and probably also in practice forced coherent as well given
>> qemu is coherent with the VM and VFIO doesn't allow non-coherent DMA
>> devices)
>>
>>>> What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
>>>> fails any mapping requests that would use any SWIOTLB or cache
>>>> flushing.
>>> You are proposing something orthogonal that operates at a different layer
>>> (DMA mapping). However, for DMA debugging, your new attribute will be
>>> equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.
>> DMA_ATTR is a dma mapping flag, if you want some weird dma debugging
>> flag it should be called DMA_ATTR_DEBUGGING_IGNORE_CACHELINES with
>> some kind of statement at the user why it is OK.
> And this is the issue: the existing DMA_ATTR_CPU_CACHE_CLEAN is essentially
> a debug-oriented attribute. The upper layers are already handled through
> __dma_from_device_group_begin()/end(), which pad cache lines on
> non-coherent systems.
>
> Marek,
>
> What do you see as the right path forward here? RDMA has a legitimate use
> case where CPU cache lines may overlap. The underlying reason differs from
> VirtIO, but the outcome is the same. Should I keep the current name? Should
> we rename it to the proposed DMA_ATTR_CPU_CACHE_OVERLAP or
> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES? Should we introduce a new
> DMA_ATTR_REQUIRE_COHERENT attribute instead? Or do you have another
> recommendation?
My question here is if RDMA works on any non-coherent DMA systems? If
not then it should fail early (during init or probe?) to avoid potential
data corruption and new DMA attributes won't help it. On the other hand,
the DMA_ATTR_CPU_CACHE_OVERLAP attribute is a bit more descriptive to me
than DMA_ATTR_CPU_CACHE_CLEAN, but this indeed looks like a separate
issue from the RDMA case.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-09 12:30 ` Marek Szyprowski
@ 2026-03-09 13:20 ` Jason Gunthorpe
2026-03-09 15:05 ` Leon Romanovsky
1 sibling, 0 replies; 17+ messages in thread
From: Jason Gunthorpe @ 2026-03-09 13:20 UTC (permalink / raw)
To: Marek Szyprowski
Cc: Leon Romanovsky, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Mon, Mar 09, 2026 at 01:30:24PM +0100, Marek Szyprowski wrote:
> My question here is if RDMA works on any non-coherent DMA systems?
The in kernel components do work, like storage, nvme over fabrics, netdev.
The user API (uverbs) does not work at all, and has never worked.
I think DRM has similar issues too where most of their DMA API usage
is OK but some places where they interact with pin_user_pages() have
the same issues as RDMA.
This is why I'd like a new attribute DMA_ATTR_REQUIRE_COHERENCE that
these special cases can use to fail instead of corrupting data.
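
For illustration only, here is a minimal userspace C model of the behavior being asked for. It is a sketch, not kernel code: the bit value, `struct model_dev` and `model_dma_map()` are hypothetical names made up for this example, and only the attribute name DMA_ATTR_REQUIRE_COHERENCE comes from this thread.

```c
#include <stdbool.h>
#include <errno.h>

/* Hypothetical bit value; the thread only proposes the attribute name. */
#define MODEL_DMA_ATTR_REQUIRE_COHERENCE (1UL << 0)

/* Hypothetical description of how a device's mappings would be handled. */
struct model_dev {
	bool dma_coherent;   /* device DMA is coherent with the CPU caches */
	bool needs_swiotlb;  /* mapping would bounce through SWIOTLB */
};

/*
 * Model of a map call: returns 0 on success, or a negative errno when the
 * caller required coherence but the mapping would need cache flushing or
 * bounce buffering. Callers that do not set the attribute are unaffected.
 */
static int model_dma_map(const struct model_dev *dev, unsigned long attrs)
{
	if ((attrs & MODEL_DMA_ATTR_REQUIRE_COHERENCE) &&
	    (!dev->dma_coherent || dev->needs_swiotlb))
		return -EOPNOTSUPP;
	return 0;
}
```

In this model the failure happens at map time, so a caller such as uverbs would get an error back instead of silently running on a system where it cannot work.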
Jason
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-09 12:30 ` Marek Szyprowski
2026-03-09 13:20 ` Jason Gunthorpe
@ 2026-03-09 15:05 ` Leon Romanovsky
2026-03-09 15:13 ` Jason Gunthorpe
1 sibling, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2026-03-09 15:05 UTC (permalink / raw)
To: Marek Szyprowski
Cc: Jason Gunthorpe, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Mon, Mar 09, 2026 at 01:30:24PM +0100, Marek Szyprowski wrote:
> On 09.03.2026 10:03, Leon Romanovsky wrote:
> > On Sun, Mar 08, 2026 at 08:09:16PM -0300, Jason Gunthorpe wrote:
> >> On Sun, Mar 08, 2026 at 08:49:02PM +0200, Leon Romanovsky wrote:
> >>> On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
> >>>> On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
> >>>>
> >>>>> -This attribute indicates the CPU will not dirty any cacheline overlapping this
> >>>>> -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> >>>>> -multiple small buffers to safely share a cacheline without risk of data
> >>>>> -corruption, suppressing DMA debug warnings about overlapping mappings.
> >>>>> -All mappings sharing a cacheline should have this attribute.
> >>>>> +DMA_ATTR_CPU_CACHE_OVERLAP
> >>>> This is a very specific and well defined use case that allows some cache
> >>>> flushing behaviors to work only under the promise that the CPU doesn't
> >>>> touch the memory to cause cache inconsistencies.
> >>>>
> >>>>> +Another valid use case is on systems that are CPU-coherent and do not use
> >>>>> +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> >>>>> +(such as flushes) will be performed that could overwrite shared cache lines.
> >>>> This is something completely unrelated.
> >>> I disagree. The situation is equivalent in that callers guarantee the
> >>> CPU cache will not be overwritten.
> >> The RDMA callers do no such thing, they just don't work at all if
> >> there is non-coherence in the mapping which is why it is not a bug.
> >>
> >> virtio looks like it does actually keep the caches clean for different
> >> mappings (and probably also in practice forced coherent as well given
> >> qemu is coherent with the VM and VFIO doesn't allow non-coherent DMA
> >> devices)
> >>
> >>>> What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
> >>>> fails any mappings requests that would use any SWIOTLB or cache
> >>>> flushing.
> >>> You are proposing something orthogonal that operates at a different layer
> >>> (DMA mapping). However, for DMA debugging, your new attribute will be
> >>> equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.
> >> DMA_ATTR is a dma mapping flag, if you want some weird dma debugging
> >> flag it should be called DMA_ATTR_DEBUGGING_IGNORE_CACHELINES with
> >> some kind of statement at the user why it is OK.
> > And this is the issue: the existing DMA_ATTR_CPU_CACHE_CLEAN is essentially
> > a debug-oriented attribute. The upper layers are already handled through
> > __dma_from_device_group_begin()/end(), which pad cache lines on
> > non-coherent systems.
> >
> > Marek,
> >
> > What do you see as the right path forward here? RDMA has a legitimate use
> > case where CPU cache lines may overlap. The underlying reason differs from
> > VirtIO, but the outcome is the same. Should I keep the current name? Should
> > we rename it to the proposed DMA_ATTR_CPU_CACHE_OVERLAP or
> > DMA_ATTR_DEBUGGING_IGNORE_CACHELINES? Should we introduce a new
> > DMA_ATTR_REQUIRE_COHERENT attribute instead? Or do you have another
> > recommendation?
>
> My question here is if RDMA works on any non-coherent DMA systems? If
> not then it should fail early (during init or probe?) to avoid potential
> data corruption and new DMA attributes won't help it.
Like Jason wrote, our user‑visible API does not work on non‑coherent
systems, and this is where I'm using the DMA_ATTR_CPU_CACHE_OVERLAP
attribute.
Regarding failure on unsupported systems, I have tried more than once to
make the RDMA fail when the device is known to take the SWIOTLB path
in RDMA and cannot operate correctly, but each attempt was met with a
cold reception:
https://lore.kernel.org/all/d18c454636bf3cfdba9b66b7cc794d713eadc4a5.1719909395.git.leon@kernel.org/
I'm afraid the outcome will be the same this time as well.
> On the other hand, the DMA_ATTR_CPU_CACHE_OVERLAP attribute is a bit more
> descriptive to me than DMA_ATTR_CPU_CACHE_CLEAN, but this indeed looks
> like a separate issue from the RDMA case.
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-09 15:05 ` Leon Romanovsky
@ 2026-03-09 15:13 ` Jason Gunthorpe
2026-03-10 9:45 ` Marek Szyprowski
0 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2026-03-09 15:13 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Mon, Mar 09, 2026 at 05:05:02PM +0200, Leon Romanovsky wrote:
> Regarding failure on unsupported systems, I have tried more than once to
> make the RDMA fail when the device is known to take the SWIOTLB path
> in RDMA and cannot operate correctly, but each attempt was met with a
> cold reception:
> https://lore.kernel.org/all/d18c454636bf3cfdba9b66b7cc794d713eadc4a5.1719909395.git.leon@kernel.org/
I think a lot of that is the APIs used there. It is hard to determine
if SWIOTLB is possible or coherent is possible, I've also hit these
things in VFIO and gave up.
However, DMA_ATTR_REQUIRE_COHERENCE can be done properly and not leak
a lot of dangerous APIs to drivers (beyond itself).
It is also more important now with CC systems, I think.
Jason
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-09 15:13 ` Jason Gunthorpe
@ 2026-03-10 9:45 ` Marek Szyprowski
2026-03-10 12:34 ` Jason Gunthorpe
0 siblings, 1 reply; 17+ messages in thread
From: Marek Szyprowski @ 2026-03-10 9:45 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky
Cc: Robin Murphy, Michael S. Tsirkin, Petr Tesarik, Jonathan Corbet,
Shuah Khan, Jason Wang, Xuan Zhuo, Eugenio Pérez, iommu,
linux-kernel, linux-doc, virtualization, linux-rdma
On 09.03.2026 16:13, Jason Gunthorpe wrote:
> On Mon, Mar 09, 2026 at 05:05:02PM +0200, Leon Romanovsky wrote:
>> Regarding failure on unsupported systems, I have tried more than once to
>> make the RDMA fail when the device is known to take the SWIOTLB path
>> in RDMA and cannot operate correctly, but each attempt was met with a
>> cold reception:
>> https://lore.kernel.org/all/d18c454636bf3cfdba9b66b7cc794d713eadc4a5.1719909395.git.leon@kernel.org/
> I think a lot of that is the APIs used there. It is hard to determine
> if SWIOTLB is possible or coherent is possible, I've also hit these
> things in VFIO and gave up.
>
> However, DMA_ATTR_REQUIRE_COHERENCE can be done properly and not leak
> a lot of dangerous APIs to drivers (beyond itself).
>
> It is also more important now with CC systems, I think.
Jason is right. Indeed the rdma/uverbs case needs some extension to
ensure that the coherent mapping is used, what is not possible now. This
however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
for that use case too. I'm open to accept both. The only question I have
is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
while DMA_ATTR_CPU_CACHE_OVERLAP and
DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
to be most descriptive.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-10 9:45 ` Marek Szyprowski
@ 2026-03-10 12:34 ` Jason Gunthorpe
2026-03-10 18:27 ` Leon Romanovsky
2026-03-10 21:08 ` Marek Szyprowski
0 siblings, 2 replies; 17+ messages in thread
From: Jason Gunthorpe @ 2026-03-10 12:34 UTC (permalink / raw)
To: Marek Szyprowski
Cc: Leon Romanovsky, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Tue, Mar 10, 2026 at 10:45:38AM +0100, Marek Szyprowski wrote:
> Jason is right. Indeed the rdma/uverbs case needs some extension to
> ensure that the coherent mapping is used, what is not possible now. This
> however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
> for that use case too. I'm open to accept both. The only question I have
> is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
> while DMA_ATTR_CPU_CACHE_OVERLAP and
> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
> to be most descriptive.
If we do DMA_ATTR_REQUIRE_COHERENCE then I imagine it would internally
also set DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, but I'd prefer that
detail not leak into the callers.
Jason
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-10 12:34 ` Jason Gunthorpe
@ 2026-03-10 18:27 ` Leon Romanovsky
2026-03-10 21:08 ` Marek Szyprowski
1 sibling, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2026-03-10 18:27 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Marek Szyprowski, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Tue, Mar 10, 2026 at 09:34:05AM -0300, Jason Gunthorpe wrote:
> On Tue, Mar 10, 2026 at 10:45:38AM +0100, Marek Szyprowski wrote:
> > Jason is right. Indeed the rdma/uverbs case needs some extension to
> > ensure that the coherent mapping is used, what is not possible now. This
> > however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
> > for that use case too. I'm open to accept both. The only question I have
> > is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
> > while DMA_ATTR_CPU_CACHE_OVERLAP and
> > DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
> > to be most descriptive.
>
> If we do DMA_ATTR_REQUIRE_COHERENCE then I imagine it would internally
> also set DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, but I'd prefer that
> detail not leak into the callers.
Yes, this is how I implemented it in my v2, which I haven't sent yet :).
Thanks
>
> Jason
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-10 12:34 ` Jason Gunthorpe
2026-03-10 18:27 ` Leon Romanovsky
@ 2026-03-10 21:08 ` Marek Szyprowski
2026-03-10 23:34 ` Jason Gunthorpe
1 sibling, 1 reply; 17+ messages in thread
From: Marek Szyprowski @ 2026-03-10 21:08 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On 10.03.2026 13:34, Jason Gunthorpe wrote:
> On Tue, Mar 10, 2026 at 10:45:38AM +0100, Marek Szyprowski wrote:
>> Jason is right. Indeed the rdma/uverbs case needs some extension to
>> ensure that the coherent mapping is used, what is not possible now. This
>> however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
>> for that use case too. I'm open to accept both. The only question I have
>> is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
>> while DMA_ATTR_CPU_CACHE_OVERLAP and
>> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
>> to be most descriptive.
> If we do DMA_ATTR_REQUIRE_COHERENCE then I imagine it would internally
> also set DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, but I'd prefer that
> detail not leak into the callers.
Why DMA_ATTR_REQUIRE_COHERENCE should imply
DMA_ATTR_DEBUGGING_IGNORE_CACHELINES?
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap
2026-03-10 21:08 ` Marek Szyprowski
@ 2026-03-10 23:34 ` Jason Gunthorpe
0 siblings, 0 replies; 17+ messages in thread
From: Jason Gunthorpe @ 2026-03-10 23:34 UTC (permalink / raw)
To: Marek Szyprowski
Cc: Leon Romanovsky, Robin Murphy, Michael S. Tsirkin, Petr Tesarik,
Jonathan Corbet, Shuah Khan, Jason Wang, Xuan Zhuo,
Eugenio Pérez, iommu, linux-kernel, linux-doc,
virtualization, linux-rdma
On Tue, Mar 10, 2026 at 10:08:38PM +0100, Marek Szyprowski wrote:
> On 10.03.2026 13:34, Jason Gunthorpe wrote:
> > On Tue, Mar 10, 2026 at 10:45:38AM +0100, Marek Szyprowski wrote:
> >> Jason is right. Indeed the rdma/uverbs case needs some extension to
> >> ensure that the coherent mapping is used, what is not possible now. This
> >> however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
> >> for that use case too. I'm open to accept both. The only question I have
> >> is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
> >> while DMA_ATTR_CPU_CACHE_OVERLAP and
> >> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
> >> to be most descriptive.
> > If we do DMA_ATTR_REQUIRE_COHERENCE then I imagine it would internally
> > also set DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, but I'd prefer that
> > detail not leak into the callers.
>
> Why DMA_ATTR_REQUIRE_COHERENCE should imply
> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES?
AFAICT the purpose of the DMA API debugging cacheline tracking is to
ensure that drivers are mapping things properly such that the cache
flushing in incoherent systems can properly cache flush them without
creating bugs (i.e. a dirty line overwriting DMA'd data or something).
If the mapping is REQUIRE_COHERENCE then it is prevented from running
on systems where these cache artifacts can cause corruption, so we
don't need to track them and we don't need the strict restrictions on
what can be mapped.
That tracking trips up and gives false positives for cases like RDMA,
DRM, etc. that allow userspace to multi-map userspace memory.
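
As a rough sketch of that reasoning, the toy userspace C model below counts active mappings per cacheline and reports when a second mapping lands on an already-mapped line, unless the mapping requires coherence. It is only an assumption-laden illustration and not how kernel/dma/debug.c is written: the table size, the bit value and `model_debug_map()` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CACHELINE_SIZE 64u
#define MODEL_MAX_LINES 1024u
/* Hypothetical bit value; only the attribute name appears in the thread. */
#define MODEL_ATTR_REQUIRE_COHERENCE (1UL << 0)

/* Number of active mappings touching each cacheline (toy-sized table). */
static unsigned int model_line_refs[MODEL_MAX_LINES];

/*
 * Record a mapping of [addr, addr + len). Returns true if a debug warning
 * would fire because a touched cacheline already has an active mapping.
 * Mappings that require coherence are not tracked at all, since no cache
 * maintenance can ever be performed on them.
 */
static bool model_debug_map(size_t addr, size_t len, unsigned long attrs)
{
	bool warn = false;
	size_t first, last, i;

	if (len == 0 || (attrs & MODEL_ATTR_REQUIRE_COHERENCE))
		return false;

	first = addr / MODEL_CACHELINE_SIZE;
	last = (addr + len - 1) / MODEL_CACHELINE_SIZE;
	for (i = first; i <= last && i < MODEL_MAX_LINES; i++) {
		/* Post-increment: a non-zero old count means overlap. */
		if (model_line_refs[i]++ > 0)
			warn = true;
	}
	return warn;
}
```

Two untagged mappings inside the same 64-byte line trigger the warning in this model, while the same pair tagged as requiring coherence does not, which is the false positive that multi-mapped userspace memory runs into.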
Jason
^ permalink raw reply [flat|nested] 17+ messages in thread
Thread overview: 17+ messages
2026-03-07 16:49 [PATCH 0/3] RDMA: Enable runs with DMA debug enabled Leon Romanovsky
2026-03-07 16:49 ` [PATCH 1/3] dma-debug: Allow multiple invocations of overlapping entries Leon Romanovsky
2026-03-07 16:49 ` [PATCH 2/3] dma-mapping: Clarify valid conditions for CPU cache line overlap Leon Romanovsky
2026-03-08 18:19 ` Jason Gunthorpe
2026-03-08 18:49 ` Leon Romanovsky
2026-03-08 23:09 ` Jason Gunthorpe
2026-03-09 9:03 ` Leon Romanovsky
2026-03-09 12:30 ` Marek Szyprowski
2026-03-09 13:20 ` Jason Gunthorpe
2026-03-09 15:05 ` Leon Romanovsky
2026-03-09 15:13 ` Jason Gunthorpe
2026-03-10 9:45 ` Marek Szyprowski
2026-03-10 12:34 ` Jason Gunthorpe
2026-03-10 18:27 ` Leon Romanovsky
2026-03-10 21:08 ` Marek Szyprowski
2026-03-10 23:34 ` Jason Gunthorpe
2026-03-07 16:49 ` [PATCH 3/3] RDMA/umem: Tell DMA debug that cacheline overlap is expected Leon Romanovsky