* [PATCH rdma-next v3 0/2] RDMA: detect and handle CoCo DMA bounce buffering
@ 2026-05-17 14:13 Jiri Pirko
  2026-05-17 14:13 ` [PATCH rdma-next v3 1/2] RDMA/uverbs: expose CoCo DMA bounce requirement to userspace Jiri Pirko
  2026-05-17 14:13 ` [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce Jiri Pirko
  0 siblings, 2 replies; 5+ messages in thread
From: Jiri Pirko @ 2026-05-17 14:13 UTC
To: linux-rdma
Cc: jgg, leon, edwards, kees, parav, mbloch, yishaih, lirongqing,
	huangjunxian6, liuy22, jmoroni

From: Jiri Pirko <jiri@nvidia.com>

In Confidential Computing (CoCo) guests, the DMA mapping layer redirects
all device DMA through swiotlb bounce buffers to keep guest memory
encrypted. This is transparent for regular devices because the CPU
copies data between the bounce buffer and the real buffer on every DMA
map/unmap cycle.

RDMA breaks this model. Once a memory region is registered, the device
accesses the underlying pages directly for an extended period without
CPU involvement. The swiotlb layer never gets a chance to synchronize,
so the device operates on bounce buffer memory while the application
works with its own pages - the two never see each other's updates.

This series adds detection and handling of this condition. A new
IB_UVERBS_DEVICE_CC_DMA_BOUNCE flag is exposed in device_cap_flags_ex
so userspace libraries can detect the situation and switch to
dmabuf-based memory registration using "system_cc_shared" heap where
available. Plain ib_umem_get() is made to fail early with -EOPNOTSUPP
to prevent silent misfunction.

---
See individual patches for changelog.
v2: https://lore.kernel.org/all/20260506111447.2697789-1-jiri@resnulli.us/
v1: https://lore.kernel.org/all/20260505061149.2361536-1-jiri@resnulli.us/

based on top of:
https://lore.kernel.org/all/20260517063006.2200680-1-jiri@resnulli.us/

Jiri Pirko (2):
  RDMA/uverbs: expose CoCo DMA bounce requirement to userspace
  RDMA/umem: block plain userspace memory registration under CoCo bounce

 drivers/infiniband/core/device.c     | 9 +++++++++
 drivers/infiniband/core/umem.c       | 3 +++
 drivers/infiniband/core/uverbs_cmd.c | 2 ++
 include/rdma/ib_verbs.h              | 3 +++
 include/uapi/rdma/ib_user_verbs.h    | 2 ++
 5 files changed, 19 insertions(+)

-- 
2.54.0
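The synchronization problem the cover letter describes can be illustrated with a small userspace model. This is only a sketch: `bounce_map()`, `bounce_unmap()` and the fixed-size `bounce` array are hypothetical stand-ins for the copy that swiotlb performs on DMA map/unmap. They are not kernel APIs, and the model ignores encryption entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 64

/* Hypothetical stand-in for a swiotlb slot: shared (unencrypted) memory. */
static unsigned char bounce[BOUNCE_SIZE];

/* Model of a DMA map: the CPU copies the private buffer into the bounce slot. */
static unsigned char *bounce_map(const unsigned char *priv, size_t len)
{
	assert(len <= BOUNCE_SIZE);
	memcpy(bounce, priv, len);
	return bounce;		/* the "DMA address" handed to the device */
}

/* Model of a DMA unmap: the CPU copies the bounce slot back to the private buffer. */
static void bounce_unmap(unsigned char *priv, size_t len)
{
	assert(len <= BOUNCE_SIZE);
	memcpy(priv, bounce, len);
}

/*
 * Short-lived DMA: the transfer is bracketed by map/unmap, so the
 * application buffer and the bounce slot are synchronized.
 * Returns 1 if the application observes the device's write.
 */
static int short_lived_dma_sees_update(void)
{
	unsigned char app[4] = { 0 };
	unsigned char *dev = bounce_map(app, sizeof(app));

	dev[0] = 0xab;			/* device writes into the bounce slot */
	bounce_unmap(app, sizeof(app));	/* sync point */
	return app[0] == 0xab;
}

/*
 * RDMA-style registration: mapped once, then the device writes later
 * with no unmap in between, so no copy back ever happens.
 */
static int long_lived_registration_sees_update(void)
{
	unsigned char app[4] = { 0 };
	unsigned char *dev = bounce_map(app, sizeof(app));

	dev[0] = 0xab;			/* remote write lands in bounce memory */
	return app[0] == 0xab;		/* application buffer is unchanged */
}
```

In this model the first function observes the device's write and the second does not, which matches the failure mode the series sets out to detect.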
* [PATCH rdma-next v3 1/2] RDMA/uverbs: expose CoCo DMA bounce requirement to userspace
  2026-05-17 14:13 [PATCH rdma-next v3 0/2] RDMA: detect and handle CoCo DMA bounce buffering Jiri Pirko
@ 2026-05-17 14:13 ` Jiri Pirko
  2026-05-17 14:13 ` [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce Jiri Pirko
  1 sibling, 0 replies; 5+ messages in thread
From: Jiri Pirko @ 2026-05-17 14:13 UTC
To: linux-rdma
Cc: jgg, leon, edwards, kees, parav, mbloch, yishaih, lirongqing,
	huangjunxian6, liuy22, jmoroni

From: Jiri Pirko <jiri@nvidia.com>

In CoCo guests, guest memory is encrypted and untrusted (T=0) devices
cannot DMA to it directly; such transfers must go through unencrypted
bounce buffers. RDMA registers user pages for direct device access,
bypassing the DMA layer and thus any bouncing, so registered memory
does not work in this configuration.

Until trusted (T=1) device detection is available, conservatively flag
every device attached to a CoCo guest. Expose the condition to
userspace as IB_UVERBS_DEVICE_CC_DMA_BOUNCE in device_cap_flags_ex so
applications can avoid memory registration and fall back to copying
buffers through send/recv.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
v2->v3:
- Dropped is_swiotlb_force_bounce()/swiotlb.h; keyed off
  CC_ATTR_GUEST_MEM_ENCRYPT alone.
- Added comment noting T=1 detection should narrow the check.
- Rewrote log: dropped SWIOTLB rationale, explained T=0 assumption;
  fixed device_cap_flags_ex typo.
---
 drivers/infiniband/core/device.c     | 9 +++++++++
 drivers/infiniband/core/uverbs_cmd.c | 2 ++
 include/rdma/ib_verbs.h              | 3 +++
 include/uapi/rdma/ib_user_verbs.h    | 2 ++
 4 files changed, 16 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index b89efaaa81ec..21ada0fe9059 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -42,6 +42,7 @@
 #include <linux/security.h>
 #include <linux/notifier.h>
 #include <linux/hashtable.h>
+#include <linux/cc_platform.h>
 #include <rdma/rdma_netlink.h>
 #include <rdma/ib_addr.h>
 #include <rdma/ib_cache.h>
@@ -1419,6 +1420,14 @@ int ib_register_device(struct ib_device *device, const char *name,
 	 */
 	WARN_ON(dma_device && !dma_device->dma_parms);
 	device->dma_device = dma_device;
+	/*
+	 * In a CoCo guest every device is currently assumed to be untrusted
+	 * (T=0) and therefore subject to DMA bouncing. Once trusted (T=1)
+	 * device detection is wired up, narrow this check to exclude such
+	 * devices.
+	 */
+	if (dma_device && cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+		device->cc_dma_bounce = 1;
 
 	ret = setup_device(device);
 	if (ret)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 8eed017091b0..2269f636bf58 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3579,6 +3579,8 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
 	resp.timestamp_mask = attr.timestamp_mask;
 	resp.hca_core_clock = attr.hca_core_clock;
 	resp.device_cap_flags_ex = attr.device_cap_flags;
+	if (ib_dev->cc_dma_bounce)
+		resp.device_cap_flags_ex |= IB_UVERBS_DEVICE_CC_DMA_BOUNCE;
 	resp.rss_caps.supported_qpts = attr.rss_caps.supported_qpts;
 	resp.rss_caps.max_rwq_indirection_tables =
 		attr.rss_caps.max_rwq_indirection_tables;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 167fb924f0cf..d06071b87d96 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -275,6 +275,7 @@ enum ib_device_cap_flags {
 	IB_DEVICE_FLUSH_GLOBAL = IB_UVERBS_DEVICE_FLUSH_GLOBAL,
 	IB_DEVICE_FLUSH_PERSISTENT = IB_UVERBS_DEVICE_FLUSH_PERSISTENT,
 	IB_DEVICE_ATOMIC_WRITE = IB_UVERBS_DEVICE_ATOMIC_WRITE,
+	IB_DEVICE_CC_DMA_BOUNCE = IB_UVERBS_DEVICE_CC_DMA_BOUNCE,
 };
 
 enum ib_kernel_cap_flags {
@@ -2950,6 +2951,8 @@ struct ib_device {
 	u16 kverbs_provider:1;
 	/* CQ adaptive moderation (RDMA DIM) */
 	u16 use_cq_dim:1;
+	/* CoCo guest with DMA bounce buffering required */
+	u16 cc_dma_bounce:1;
 	u8 node_type;
 	u32 phys_port_cnt;
 	struct ib_device_attr attrs;
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 3b7bd99813e9..d2aeadb6d2f9 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -1368,6 +1368,8 @@ enum ib_uverbs_device_cap_flags {
 	IB_UVERBS_DEVICE_FLUSH_PERSISTENT = 1ULL << 39,
 	/* Atomic write attributes */
 	IB_UVERBS_DEVICE_ATOMIC_WRITE = 1ULL << 40,
+	/* CoCo guest with DMA bounce buffering required */
+	IB_UVERBS_DEVICE_CC_DMA_BOUNCE = 1ULL << 41,
 };
 
 enum ib_uverbs_raw_packet_caps {
-- 
2.54.0
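For readers following along from the userspace side, a minimal sketch of testing the new capability bit might look as follows. The bit position (41) is taken from the uapi hunk in this patch. The macro name `CC_DMA_BOUNCE_BIT` and the helper `needs_cc_bounce_handling()` are hypothetical, and how the 64-bit `device_cap_flags_ex` value is obtained (for example from an extended device query) is left out and assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit value as proposed in this patch; the local macro name is illustrative. */
#define CC_DMA_BOUNCE_BIT (1ULL << 41)

/*
 * Hypothetical helper: caps_ex is the 64-bit device_cap_flags_ex value
 * reported by the kernel. Returns true when plain memory registration
 * should be avoided on this device.
 */
static bool needs_cc_bounce_handling(uint64_t caps_ex)
{
	return (caps_ex & CC_DMA_BOUNCE_BIT) != 0;
}
```

Bit 41 does not fit in a 32-bit value, so a caller that stores the capability flags in a 32-bit variable would silently lose it. The flags need to stay in a 64-bit type end to end.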
* [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce
  2026-05-17 14:13 [PATCH rdma-next v3 0/2] RDMA: detect and handle CoCo DMA bounce buffering Jiri Pirko
  2026-05-17 14:13 ` [PATCH rdma-next v3 1/2] RDMA/uverbs: expose CoCo DMA bounce requirement to userspace Jiri Pirko
@ 2026-05-17 14:13 ` Jiri Pirko
  2026-05-17 16:17   ` Leon Romanovsky
  1 sibling, 1 reply; 5+ messages in thread
From: Jiri Pirko @ 2026-05-17 14:13 UTC
To: linux-rdma
Cc: jgg, leon, edwards, kees, parav, mbloch, yishaih, lirongqing,
	huangjunxian6, liuy22, jmoroni

From: Jiri Pirko <jiri@nvidia.com>

When a device requires DMA bounce buffering inside a Confidential
Computing guest, __ib_umem_get_va() cannot work. The DMA mapping layer
redirects all mappings through swiotlb bounce buffers, so the device
receives DMA addresses pointing to bounce buffer memory rather than
the user's pages. Since RDMA devices access registered memory directly
without CPU involvement, there is no opportunity for swiotlb to
synchronize between the bounce buffer and the original pages.

The registration would already fail later on, since the umem mapping
is requested with DMA_ATTR_REQUIRE_COHERENT and gets rejected under
is_swiotlb_force_bounce() with -EIO. Fail early with -EOPNOTSUPP
instead, so the user gets a specific error code to react to.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
v1->v2:
- updated patch description with mention of DMA_ATTR_REQUIRE_COHERENT
---
 drivers/infiniband/core/umem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index eb1de32bab9d..b32bc2a5d7d0 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -167,6 +167,9 @@ static struct ib_umem *__ib_umem_get_va(struct ib_device *device,
 	int pinned, ret;
 	unsigned int gup_flags = FOLL_LONGTERM;
 
+	if (device->cc_dma_bounce)
+		return ERR_PTR(-EOPNOTSUPP);
+
 	/*
 	 * If the combination of the addr and size requested for this memory
 	 * region causes an integer overflow, return error.
-- 
2.54.0
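To make the point about a distinguishable error code concrete, here is a small sketch of how a caller might branch on the result of a failed registration. The enum and `classify_reg_result()` are hypothetical and not part of any library. The sketch assumes `err` is the positive errno value seen in userspace and that the kernel's -EOPNOTSUPP reaches the caller unchanged.

```c
#include <errno.h>

/* Hypothetical outcome of a plain memory registration attempt. */
enum reg_action {
	REG_OK,			/* registration succeeded */
	REG_USE_ALTERNATIVE,	/* plain registration is not supported here */
	REG_FAIL,		/* any other error: report it to the caller */
};

/*
 * Map an errno value from a registration attempt to an action.
 * err == 0 means success; otherwise err is a positive errno value.
 */
static enum reg_action classify_reg_result(int err)
{
	if (err == 0)
		return REG_OK;
	if (err == EOPNOTSUPP)
		return REG_USE_ALTERNATIVE;
	return REG_FAIL;
}
```

With the previous behaviour (-EIO from the mapping layer) the failure would fall into the generic `REG_FAIL` branch. The alternatives named in this thread are dmabuf-based registration and copying through send/recv. Since EOPNOTSUPP can have other causes, a caller would probably also want to check the capability flag before choosing one.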
* Re: [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce
  2026-05-17 14:13 ` [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce Jiri Pirko
@ 2026-05-17 16:17   ` Leon Romanovsky
  2026-05-17 18:03     ` Jiri Pirko
  0 siblings, 1 reply; 5+ messages in thread
From: Leon Romanovsky @ 2026-05-17 16:17 UTC
To: Jiri Pirko
Cc: linux-rdma, jgg, edwards, kees, parav, mbloch, yishaih, lirongqing,
	huangjunxian6, liuy22, jmoroni

On Sun, May 17, 2026 at 04:13:11PM +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@nvidia.com>
>
> When a device requires DMA bounce buffering inside a Confidential
> Computing guest, __ib_umem_get_va() cannot work. The DMA mapping layer
> redirects all mappings through swiotlb bounce buffers, so the device
> receives DMA addresses pointing to bounce buffer memory rather than
> the user's pages. Since RDMA devices access registered memory directly
> without CPU involvement, there is no opportunity for swiotlb to
> synchronize between the bounce buffer and the original pages.
>
> The registration would already fail later on, since the umem mapping
> is requested with DMA_ATTR_REQUIRE_COHERENT and gets rejected under
> is_swiotlb_force_bounce() with -EIO. Fail early with -EOPNOTSUPP
> instead, so the user gets a specific error code to react to.
>
> Signed-off-by: Jiri Pirko <jiri@nvidia.com>
> ---
> v1->v2:
> - updated patch description with mention of DMA_ATTR_REQUIRE_COHERENT
> ---
>  drivers/infiniband/core/umem.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index eb1de32bab9d..b32bc2a5d7d0 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -167,6 +167,9 @@ static struct ib_umem *__ib_umem_get_va(struct ib_device *device,
>  	int pinned, ret;
>  	unsigned int gup_flags = FOLL_LONGTERM;
>
> +	if (device->cc_dma_bounce)
> +		return ERR_PTR(-EOPNOTSUPP);
> +

The series looks reasonable, but I cannot apply it yet because
`__ib_umem_get_va()` has not been merged.

Thanks

>  	/*
>  	 * If the combination of the addr and size requested for this memory
>  	 * region causes an integer overflow, return error.
> --
> 2.54.0
>
* Re: [PATCH rdma-next v3 2/2] RDMA/umem: block plain userspace memory registration under CoCo bounce
  2026-05-17 16:17   ` Leon Romanovsky
@ 2026-05-17 18:03     ` Jiri Pirko
  0 siblings, 0 replies; 5+ messages in thread
From: Jiri Pirko @ 2026-05-17 18:03 UTC
To: Leon Romanovsky
Cc: linux-rdma, jgg, edwards, kees, parav, mbloch, yishaih, lirongqing,
	huangjunxian6, liuy22, jmoroni

Sun, May 17, 2026 at 06:17:12PM +0200, leon@kernel.org wrote:
>On Sun, May 17, 2026 at 04:13:11PM +0200, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@nvidia.com>
>>
>> When a device requires DMA bounce buffering inside a Confidential
>> Computing guest, __ib_umem_get_va() cannot work. The DMA mapping layer
>> redirects all mappings through swiotlb bounce buffers, so the device
>> receives DMA addresses pointing to bounce buffer memory rather than
>> the user's pages. Since RDMA devices access registered memory directly
>> without CPU involvement, there is no opportunity for swiotlb to
>> synchronize between the bounce buffer and the original pages.
>>
>> The registration would already fail later on, since the umem mapping
>> is requested with DMA_ATTR_REQUIRE_COHERENT and gets rejected under
>> is_swiotlb_force_bounce() with -EIO. Fail early with -EOPNOTSUPP
>> instead, so the user gets a specific error code to react to.
>>
>> Signed-off-by: Jiri Pirko <jiri@nvidia.com>
>> ---
>> v1->v2:
>> - updated patch description with mention of DMA_ATTR_REQUIRE_COHERENT
>> ---
>>  drivers/infiniband/core/umem.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
>> index eb1de32bab9d..b32bc2a5d7d0 100644
>> --- a/drivers/infiniband/core/umem.c
>> +++ b/drivers/infiniband/core/umem.c
>> @@ -167,6 +167,9 @@ static struct ib_umem *__ib_umem_get_va(struct ib_device *device,
>>  	int pinned, ret;
>>  	unsigned int gup_flags = FOLL_LONGTERM;
>>
>> +	if (device->cc_dma_bounce)
>> +		return ERR_PTR(-EOPNOTSUPP);
>> +
>
>The series looks reasonable, but I cannot apply it yet because
>`__ib_umem_get_va()` has not been merged.

Correct. Cover letter says:

based on top of:
https://lore.kernel.org/all/20260517063006.2200680-1-jiri@resnulli.us/