Linux RDMA and InfiniBand development
* [PATCH rdma-next v3 0/2] RDMA: detect and handle CoCo DMA bounce buffering
@ 2026-05-17 14:13 Jiri Pirko
  2026-05-17 14:13 ` [PATCH rdma-next v3 1/2] RDMA/uverbs: expose CoCo DMA bounce requirement to userspace Jiri Pirko
  2026-05-17 14:13 ` [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce Jiri Pirko
From: Jiri Pirko @ 2026-05-17 14:13 UTC
  To: linux-rdma
  Cc: jgg, leon, edwards, kees, parav, mbloch, yishaih, lirongqing,
	huangjunxian6, liuy22, jmoroni

From: Jiri Pirko <jiri@nvidia.com>

In Confidential Computing (CoCo) guests, the DMA mapping layer
redirects all device DMA through swiotlb bounce buffers to keep guest
memory encrypted. This is transparent for regular devices because the
CPU copies data between the bounce buffer and the real buffer on every
DMA map/unmap cycle.

RDMA breaks this model. Once a memory region is registered, the device
accesses the underlying pages directly for an extended period without
CPU involvement. The swiotlb layer never gets a chance to synchronize,
so the device operates on bounce buffer memory while the application
works with its own pages, and neither sees the other's updates.

This series adds detection and handling of this condition. A new
IB_UVERBS_DEVICE_CC_DMA_BOUNCE flag is exposed in device_cap_flags_ex
so userspace libraries can detect the situation and switch to
dmabuf-based memory registration using the "system_cc_shared" heap
where available. Plain ib_umem_get() is made to fail early with
-EOPNOTSUPP to prevent silent malfunction.

---
See individual patches for changelog.

v2: https://lore.kernel.org/all/20260506111447.2697789-1-jiri@resnulli.us/
v1: https://lore.kernel.org/all/20260505061149.2361536-1-jiri@resnulli.us/

based on top of:
https://lore.kernel.org/all/20260517063006.2200680-1-jiri@resnulli.us/

Jiri Pirko (2):
  RDMA/uverbs: expose CoCo DMA bounce requirement to userspace
  RDMA/umem: block plain userspace memory registration under CoCo bounce

 drivers/infiniband/core/device.c     | 9 +++++++++
 drivers/infiniband/core/umem.c       | 3 +++
 drivers/infiniband/core/uverbs_cmd.c | 2 ++
 include/rdma/ib_verbs.h              | 3 +++
 include/uapi/rdma/ib_user_verbs.h    | 2 ++
 5 files changed, 19 insertions(+)

-- 
2.54.0
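
The failure mode described in the cover letter can be illustrated with a
small userspace model. This is only a sketch, not kernel code: the toy_*
names are hypothetical, and the real swiotlb logic is considerably more
involved. It assumes a mapping that copies into a bounce buffer at map
time and copies back only at unmap time.

```c
#include <string.h>

#define BUF_LEN 16

/* Toy model of a bounced mapping: the device only ever sees `bounce`. */
struct toy_mapping {
	char *orig;            /* the application's own buffer */
	char bounce[BUF_LEN];  /* the only memory the device can reach */
};

/* Map: copy the application's data into the bounce buffer. */
static void toy_map(struct toy_mapping *m, char *orig)
{
	m->orig = orig;
	memcpy(m->bounce, orig, BUF_LEN);
}

/* Unmap: copy whatever the device wrote back to the application. */
static void toy_unmap(struct toy_mapping *m)
{
	memcpy(m->orig, m->bounce, BUF_LEN);
}

/* The "device" writes into the bounce buffer, never into `orig`. */
static void toy_device_write(struct toy_mapping *m, const char *data)
{
	strncpy(m->bounce, data, BUF_LEN - 1);
	m->bounce[BUF_LEN - 1] = '\0';
}

/* Returns 1 if the application buffer holds `data`, 0 otherwise. */
static int toy_app_sees(const char *app_buf, const char *data)
{
	return strcmp(app_buf, data) == 0;
}
```

In this model, a device write becomes visible to the application only
after toy_unmap(). A registered memory region behaves like a mapping
that stays mapped for its whole lifetime, so the copy-back step never
runs while the region is in use.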



* [PATCH rdma-next v3 1/2] RDMA/uverbs: expose CoCo DMA bounce requirement to userspace
  2026-05-17 14:13 [PATCH rdma-next v3 0/2] RDMA: detect and handle CoCo DMA bounce buffering Jiri Pirko
@ 2026-05-17 14:13 ` Jiri Pirko
  2026-05-17 14:13 ` [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce Jiri Pirko
From: Jiri Pirko @ 2026-05-17 14:13 UTC
  To: linux-rdma
  Cc: jgg, leon, edwards, kees, parav, mbloch, yishaih, lirongqing,
	huangjunxian6, liuy22, jmoroni

From: Jiri Pirko <jiri@nvidia.com>

In CoCo guests, guest memory is encrypted and untrusted (T=0) devices
cannot DMA to it directly; such transfers must go through unencrypted
bounce buffers. RDMA registers user pages for direct device access,
bypassing the DMA layer and thus any bouncing, so registered memory
does not work in this configuration.

Until trusted (T=1) device detection is available, conservatively
flag every device attached to a CoCo guest. Expose the condition to
userspace as IB_UVERBS_DEVICE_CC_DMA_BOUNCE in device_cap_flags_ex so
applications can avoid memory registration and fall back to copying
buffers through send/recv.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
v2->v3:
- Dropped is_swiotlb_force_bounce()/swiotlb.h; keyed off CC_ATTR_GUEST_MEM_ENCRYPT alone.
- Added comment noting T=1 detection should narrow the check.
- Rewrote log: dropped SWIOTLB rationale, explained T=0 assumption; fixed device_cap_flags_ex typo.
---
 drivers/infiniband/core/device.c     | 9 +++++++++
 drivers/infiniband/core/uverbs_cmd.c | 2 ++
 include/rdma/ib_verbs.h              | 3 +++
 include/uapi/rdma/ib_user_verbs.h    | 2 ++
 4 files changed, 16 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index b89efaaa81ec..21ada0fe9059 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -42,6 +42,7 @@
 #include <linux/security.h>
 #include <linux/notifier.h>
 #include <linux/hashtable.h>
+#include <linux/cc_platform.h>
 #include <rdma/rdma_netlink.h>
 #include <rdma/ib_addr.h>
 #include <rdma/ib_cache.h>
@@ -1419,6 +1420,14 @@ int ib_register_device(struct ib_device *device, const char *name,
 	 */
 	WARN_ON(dma_device && !dma_device->dma_parms);
 	device->dma_device = dma_device;
+	/*
+	 * In a CoCo guest every device is currently assumed to be untrusted
+	 * (T=0) and therefore subject to DMA bouncing. Once trusted (T=1)
+	 * device detection is wired up, narrow this check to exclude such
+	 * devices.
+	 */
+	if (dma_device && cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+		device->cc_dma_bounce = 1;
 
 	ret = setup_device(device);
 	if (ret)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 8eed017091b0..2269f636bf58 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3579,6 +3579,8 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
 	resp.timestamp_mask = attr.timestamp_mask;
 	resp.hca_core_clock = attr.hca_core_clock;
 	resp.device_cap_flags_ex = attr.device_cap_flags;
+	if (ib_dev->cc_dma_bounce)
+		resp.device_cap_flags_ex |= IB_UVERBS_DEVICE_CC_DMA_BOUNCE;
 	resp.rss_caps.supported_qpts = attr.rss_caps.supported_qpts;
 	resp.rss_caps.max_rwq_indirection_tables =
 		attr.rss_caps.max_rwq_indirection_tables;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 167fb924f0cf..d06071b87d96 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -275,6 +275,7 @@ enum ib_device_cap_flags {
 	IB_DEVICE_FLUSH_GLOBAL = IB_UVERBS_DEVICE_FLUSH_GLOBAL,
 	IB_DEVICE_FLUSH_PERSISTENT = IB_UVERBS_DEVICE_FLUSH_PERSISTENT,
 	IB_DEVICE_ATOMIC_WRITE = IB_UVERBS_DEVICE_ATOMIC_WRITE,
+	IB_DEVICE_CC_DMA_BOUNCE = IB_UVERBS_DEVICE_CC_DMA_BOUNCE,
 };
 
 enum ib_kernel_cap_flags {
@@ -2950,6 +2951,8 @@ struct ib_device {
 	u16                          kverbs_provider:1;
 	/* CQ adaptive moderation (RDMA DIM) */
 	u16                          use_cq_dim:1;
+	/* CoCo guest with DMA bounce buffering required */
+	u16                          cc_dma_bounce:1;
 	u8                           node_type;
 	u32			     phys_port_cnt;
 	struct ib_device_attr        attrs;
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 3b7bd99813e9..d2aeadb6d2f9 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -1368,6 +1368,8 @@ enum ib_uverbs_device_cap_flags {
 	IB_UVERBS_DEVICE_FLUSH_PERSISTENT = 1ULL << 39,
 	/* Atomic write attributes */
 	IB_UVERBS_DEVICE_ATOMIC_WRITE = 1ULL << 40,
+	/* CoCo guest with DMA bounce buffering required */
+	IB_UVERBS_DEVICE_CC_DMA_BOUNCE = 1ULL << 41,
 };
 
 enum ib_uverbs_raw_packet_caps {
-- 
2.54.0
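
On the userspace side, a consumer of the new flag might look roughly
like the sketch below. The bit value is taken from the uapi hunk above
and defined locally, since installed headers may not carry it yet. The
helper names and the reg_strategy enum are hypothetical. The flags value
is assumed to come from the extended device query (device_cap_flags_ex);
the choice of dmabuf as the alternative path follows the cover letter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors IB_UVERBS_DEVICE_CC_DMA_BOUNCE from the patch (bit 41). */
#define CC_DMA_BOUNCE_BIT (1ULL << 41)

/* Hypothetical registration strategies an application could choose. */
enum reg_strategy {
	REG_PLAIN_VA,  /* ordinary registration of user virtual memory */
	REG_DMABUF,    /* dmabuf-based registration */
};

/* True if the extended capability flags report CoCo DMA bouncing. */
static bool needs_cc_bounce(uint64_t device_cap_flags_ex)
{
	return (device_cap_flags_ex & CC_DMA_BOUNCE_BIT) != 0;
}

/* Pick a registration strategy from the extended capability flags. */
static enum reg_strategy pick_reg_strategy(uint64_t device_cap_flags_ex)
{
	return needs_cc_bounce(device_cap_flags_ex) ? REG_DMABUF
						    : REG_PLAIN_VA;
}

/* What survives if the flags are truncated to a 32-bit field. */
static uint32_t truncate_cap_flags(uint64_t device_cap_flags_ex)
{
	return (uint32_t)device_cap_flags_ex;
}
```

Because the flag sits at bit 41, it cannot be represented in a 32-bit
capability field, which is consistent with the patch setting it only in
device_cap_flags_ex.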



* [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce
  2026-05-17 14:13 [PATCH rdma-next v3 0/2] RDMA: detect and handle CoCo DMA bounce buffering Jiri Pirko
  2026-05-17 14:13 ` [PATCH rdma-next v3 1/2] RDMA/uverbs: expose CoCo DMA bounce requirement to userspace Jiri Pirko
@ 2026-05-17 14:13 ` Jiri Pirko
  2026-05-17 16:17   ` Leon Romanovsky
From: Jiri Pirko @ 2026-05-17 14:13 UTC
  To: linux-rdma
  Cc: jgg, leon, edwards, kees, parav, mbloch, yishaih, lirongqing,
	huangjunxian6, liuy22, jmoroni

From: Jiri Pirko <jiri@nvidia.com>

When a device requires DMA bounce buffering inside a Confidential
Computing guest, __ib_umem_get_va() cannot work. The DMA mapping layer
redirects all mappings through swiotlb bounce buffers, so the device
receives DMA addresses pointing to bounce buffer memory rather than
the user's pages. Since RDMA devices access registered memory directly
without CPU involvement, there is no opportunity for swiotlb to
synchronize between the bounce buffer and the original pages.

The registration would already fail later on, since the umem mapping
is requested with DMA_ATTR_REQUIRE_COHERENT and gets rejected under
is_swiotlb_force_bounce() with -EIO. Fail early with -EOPNOTSUPP
instead, so the user gets a specific error code to react to.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
v1->v2:
- updated patch description with mention of DMA_ATTR_REQUIRE_COHERENT
---
 drivers/infiniband/core/umem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index eb1de32bab9d..b32bc2a5d7d0 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -167,6 +167,9 @@ static struct ib_umem *__ib_umem_get_va(struct ib_device *device,
 	int pinned, ret;
 	unsigned int gup_flags = FOLL_LONGTERM;
 
+	if (device->cc_dma_bounce)
+		return ERR_PTR(-EOPNOTSUPP);
+
 	/*
 	 * If the combination of the addr and size requested for this memory
 	 * region causes an integer overflow, return error.
-- 
2.54.0
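
The point of returning a distinct error code is that a caller can branch
on it. A minimal sketch of such handling follows, assuming the kernel's
-EOPNOTSUPP reaches the application unchanged as errno from the
registration call. The enum and function names are hypothetical. Other
layers may return EOPNOTSUPP for unrelated reasons, so a real caller
would probably also check the capability flag from patch 1.

```c
#include <errno.h>

/* Hypothetical follow-up actions after a registration attempt. */
enum reg_action {
	REG_OK,            /* registration succeeded */
	REG_RETRY_DMABUF,  /* plain registration is blocked, try dmabuf */
	REG_FAIL,          /* any other error: give up */
};

/*
 * Classify the outcome of a memory registration attempt.
 * `err` is 0 on success, otherwise the positive errno value.
 */
static enum reg_action classify_reg_errno(int err)
{
	if (err == 0)
		return REG_OK;
	if (err == EOPNOTSUPP)
		return REG_RETRY_DMABUF;
	return REG_FAIL;
}
```

With the previous behaviour described in the commit message, the same
attempt would surface as EIO and land in the generic failure branch.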



* Re: [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce
  2026-05-17 14:13 ` [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce Jiri Pirko
@ 2026-05-17 16:17   ` Leon Romanovsky
  2026-05-17 18:03     ` Jiri Pirko
From: Leon Romanovsky @ 2026-05-17 16:17 UTC
  To: Jiri Pirko
  Cc: linux-rdma, jgg, edwards, kees, parav, mbloch, yishaih,
	lirongqing, huangjunxian6, liuy22, jmoroni

On Sun, May 17, 2026 at 04:13:11PM +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@nvidia.com>
> 
> When a device requires DMA bounce buffering inside a Confidential
> Computing guest, __ib_umem_get_va() cannot work. The DMA mapping layer
> redirects all mappings through swiotlb bounce buffers, so the device
> receives DMA addresses pointing to bounce buffer memory rather than
> the user's pages. Since RDMA devices access registered memory directly
> without CPU involvement, there is no opportunity for swiotlb to
> synchronize between the bounce buffer and the original pages.
> 
> The registration would already fail later on, since the umem mapping
> is requested with DMA_ATTR_REQUIRE_COHERENT and gets rejected under
> is_swiotlb_force_bounce() with -EIO. Fail early with -EOPNOTSUPP
> instead, so the user gets a specific error code to react to.
> 
> Signed-off-by: Jiri Pirko <jiri@nvidia.com>
> ---
> v1->v2:
> - updated patch description with mention of DMA_ATTR_REQUIRE_COHERENT
> ---
>  drivers/infiniband/core/umem.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index eb1de32bab9d..b32bc2a5d7d0 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -167,6 +167,9 @@ static struct ib_umem *__ib_umem_get_va(struct ib_device *device,
>  	int pinned, ret;
>  	unsigned int gup_flags = FOLL_LONGTERM;
>  
> +	if (device->cc_dma_bounce)
> +		return ERR_PTR(-EOPNOTSUPP);
> +

The series looks reasonable, but I cannot apply it yet because
`__ib_umem_get_va()` has not been merged.

Thanks

>  	/*
>  	 * If the combination of the addr and size requested for this memory
>  	 * region causes an integer overflow, return error.
> -- 
> 2.54.0
> 


* Re: [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce
  2026-05-17 16:17   ` Leon Romanovsky
@ 2026-05-17 18:03     ` Jiri Pirko
From: Jiri Pirko @ 2026-05-17 18:03 UTC
  To: Leon Romanovsky
  Cc: linux-rdma, jgg, edwards, kees, parav, mbloch, yishaih,
	lirongqing, huangjunxian6, liuy22, jmoroni

Sun, May 17, 2026 at 06:17:12PM +0200, leon@kernel.org wrote:
>On Sun, May 17, 2026 at 04:13:11PM +0200, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@nvidia.com>
>> 
>> When a device requires DMA bounce buffering inside a Confidential
>> Computing guest, __ib_umem_get_va() cannot work. The DMA mapping layer
>> redirects all mappings through swiotlb bounce buffers, so the device
>> receives DMA addresses pointing to bounce buffer memory rather than
>> the user's pages. Since RDMA devices access registered memory directly
>> without CPU involvement, there is no opportunity for swiotlb to
>> synchronize between the bounce buffer and the original pages.
>> 
>> The registration would already fail later on, since the umem mapping
>> is requested with DMA_ATTR_REQUIRE_COHERENT and gets rejected under
>> is_swiotlb_force_bounce() with -EIO. Fail early with -EOPNOTSUPP
>> instead, so the user gets a specific error code to react to.
>> 
>> Signed-off-by: Jiri Pirko <jiri@nvidia.com>
>> ---
>> v1->v2:
>> - updated patch description with mention of DMA_ATTR_REQUIRE_COHERENT
>> ---
>>  drivers/infiniband/core/umem.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
>> index eb1de32bab9d..b32bc2a5d7d0 100644
>> --- a/drivers/infiniband/core/umem.c
>> +++ b/drivers/infiniband/core/umem.c
>> @@ -167,6 +167,9 @@ static struct ib_umem *__ib_umem_get_va(struct ib_device *device,
>>  	int pinned, ret;
>>  	unsigned int gup_flags = FOLL_LONGTERM;
>>  
>> +	if (device->cc_dma_bounce)
>> +		return ERR_PTR(-EOPNOTSUPP);
>> +
>
>The series looks reasonable, but I cannot apply it yet because
>`__ib_umem_get_va()` has not been merged.

Correct. Cover letter says:
based on top of:
https://lore.kernel.org/all/20260517063006.2200680-1-jiri@resnulli.us/

