* [PATCH rdma-next v2 0/5] Add pinned revocable dmabuf import interface
@ 2026-03-02 0:15 Jacob Moroni
2026-03-02 0:15 ` [PATCH rdma-next v2 1/5] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper Jacob Moroni
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Jacob Moroni @ 2026-03-02 0:15 UTC (permalink / raw)
To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni
Some dmabuf exporters (such as VFIO) require that pinned importers
support revocation. In order to support this for non-ODP drivers/devices,
a new interface is required. This new interface implements a two-step
process where the driver performs a sequence like:
ib_umem_dmabuf_get_pinned_revocable_and_lock()
... Driver MR allocation/initialization/registration/etc
ib_umem_dmabuf_set_revoke_locked()
ib_umem_dmabuf_revoke_unlock()
This allows the driver to provide a callback that performs the actual
invalidation in a way that is safe against races from concurrent
revocations during initialization.
The driver must ensure that the HW will no longer access the region
before the revoke callback returns. For MRs, this can be achieved
by using the rereg capability to set the region length to 0, or
perhaps by moving the region to a new quarantine PD. For HW that
allows the driver to manage the keys (like irdma), this can be
achieved by deregistering the region in HW but not freeing the key
until the region is truly deregistered via ibv_dereg_mr.
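As an illustration, the sequence might look like this in a driver (a
sketch only; all mydrv_* names are hypothetical and not part of this
series — only the ib_umem_dmabuf_* calls come from it):

```c
static void mydrv_revoke(void *priv)
{
	struct mydrv_mr *mr = priv;

	/* HW must no longer access the region once this returns, e.g.
	 * invalidate the key in HW while keeping it reserved.
	 */
	mydrv_hw_invalidate(mr);
}

	...
	umem_dmabuf = ib_umem_dmabuf_get_pinned_revocable_and_lock(
				pd->device, start, len, fd, access);
	if (IS_ERR(umem_dmabuf))
		return ERR_CAST(umem_dmabuf);

	/* Driver MR allocation/initialization/registration/etc. */
	mr = mydrv_alloc_mr(pd, umem_dmabuf, len, access);
	if (IS_ERR(mr)) {
		ib_umem_dmabuf_revoke_unlock(umem_dmabuf);
		/* Triggers revoke, but the callback is unset, so it is
		 * skipped.
		 */
		ib_umem_release(&umem_dmabuf->umem);
		return ERR_CAST(mr);
	}

	/* Concurrent revocations are excluded until the unlock below. */
	ib_umem_dmabuf_set_revoke_locked(umem_dmabuf, mydrv_revoke, mr);
	ib_umem_dmabuf_revoke_unlock(umem_dmabuf);
```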
Dependencies:
Please note that this series targets the `for-next` branch, but it depends
on the following commit currently in the `leon-for-rc` branch:
Commit [104016eb671e] ("RDMA/umem: Fix double dma_buf_unpin in failure path")
Changes in v2:
* Created helpers for acquiring/releasing the umem_dmabuf revoke lock.
* Fixed rereg_user_mr handling in irdma to account for async revoke
and used new revoke lock/unlock helper to simplify the dereg_mr path.
* Dropped unnecessary <linux/dma-resv.h> inclusion in irdma/main.h.
Changes in v1 (since RFCs):
* Break the interface into a two-step process to avoid needing
extra state in the driver.
* Move the majority of the functionality into the core.
v1: https://lore.kernel.org/linux-rdma/20260225210705.373126-1-jmoroni@google.com/T/#t
RFC v2: https://lore.kernel.org/linux-rdma/20260223195333.438492-1-jmoroni@google.com/T/#t
RFC v1: https://lore.kernel.org/linux-rdma/CAHYDg1TB1Xa+D700WrvrcQVdgZFE5f8iWp48EmQM9XjK9xJdew@mail.gmail.com/T/#t
Jacob Moroni (5):
RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper
RDMA/umem: Move umem dmabuf revoke logic into helper function
RDMA/umem: Add pinned revocable dmabuf import interface
RDMA/umem: Add helpers for umem dmabuf revoke lock
RDMA/irdma: Add support for revocable pinned dmabuf import
drivers/infiniband/core/umem_dmabuf.c | 138 ++++++++++++++++++++++----
drivers/infiniband/hw/irdma/verbs.c | 105 +++++++++++++++++---
include/rdma/ib_umem.h | 24 +++++
3 files changed, 237 insertions(+), 30 deletions(-)
--
2.53.0.473.g4a7958ca14-goog
* [PATCH rdma-next v2 1/5] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper
2026-03-02 0:15 [PATCH rdma-next v2 0/5] Add pinned revocable dmabuf import interface Jacob Moroni
@ 2026-03-02 0:15 ` Jacob Moroni
2026-03-02 0:15 ` [PATCH rdma-next v2 2/5] RDMA/umem: Move umem dmabuf revoke logic into helper function Jacob Moroni
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Jacob Moroni @ 2026-03-02 0:15 UTC (permalink / raw)
To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni
Move the inner logic of ib_umem_dmabuf_get_pinned_with_dma_device()
to a new static function that returns with the lock held upon success.
The intent is to allow reuse for the future get_pinned_revocable_and_lock
function.
Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
drivers/infiniband/core/umem_dmabuf.c | 35 ++++++++++++++++++++-------
1 file changed, 26 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index d30f24b90..0c0098285 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -195,18 +195,19 @@ static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
.move_notify = ib_umem_dmabuf_unsupported_move_notify,
};
-struct ib_umem_dmabuf *
-ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
- struct device *dma_device,
- unsigned long offset, size_t size,
- int fd, int access)
+static struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_and_lock(struct ib_device *device,
+ struct device *dma_device,
+ unsigned long offset,
+ size_t size, int fd, int access,
+ const struct dma_buf_attach_ops *ops)
{
struct ib_umem_dmabuf *umem_dmabuf;
int err;
- umem_dmabuf = ib_umem_dmabuf_get_with_dma_device(device, dma_device, offset,
- size, fd, access,
- &ib_umem_dmabuf_attach_pinned_ops);
+ umem_dmabuf =
+ ib_umem_dmabuf_get_with_dma_device(device, dma_device, offset,
+ size, fd, access, ops);
if (IS_ERR(umem_dmabuf))
return umem_dmabuf;
@@ -219,7 +220,6 @@ ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
err = ib_umem_dmabuf_map_pages(umem_dmabuf);
if (err)
goto err_release;
- dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
return umem_dmabuf;
@@ -228,6 +228,23 @@ ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
ib_umem_release(&umem_dmabuf->umem);
return ERR_PTR(err);
}
+
+struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
+ struct device *dma_device,
+ unsigned long offset, size_t size,
+ int fd, int access)
+{
+ struct ib_umem_dmabuf *umem_dmabuf =
+ ib_umem_dmabuf_get_pinned_and_lock(device, dma_device, offset,
+ size, fd, access,
+ &ib_umem_dmabuf_attach_pinned_ops);
+ if (IS_ERR(umem_dmabuf))
+ return umem_dmabuf;
+
+ dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+ return umem_dmabuf;
+}
EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned_with_dma_device);
struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device,
--
2.53.0.473.g4a7958ca14-goog
* [PATCH rdma-next v2 2/5] RDMA/umem: Move umem dmabuf revoke logic into helper function
2026-03-02 0:15 [PATCH rdma-next v2 0/5] Add pinned revocable dmabuf import interface Jacob Moroni
2026-03-02 0:15 ` [PATCH rdma-next v2 1/5] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper Jacob Moroni
@ 2026-03-02 0:15 ` Jacob Moroni
2026-03-02 0:15 ` [PATCH rdma-next v2 3/5] RDMA/umem: Add pinned revocable dmabuf import interface Jacob Moroni
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Jacob Moroni @ 2026-03-02 0:15 UTC (permalink / raw)
To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni
This same logic will eventually be reused from within the
invalidate_mappings callback, which already holds the dma_resv_lock,
so break it out into a separate helper function.
Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
drivers/infiniband/core/umem_dmabuf.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index 0c0098285..9cf9cfc93 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -195,6 +195,22 @@ static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
.move_notify = ib_umem_dmabuf_unsupported_move_notify,
};
+static void ib_umem_dmabuf_revoke_locked(struct dma_buf_attachment *attach)
+{
+ struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
+
+ dma_resv_assert_held(attach->dmabuf->resv);
+
+ if (umem_dmabuf->revoked)
+ return;
+ ib_umem_dmabuf_unmap_pages(umem_dmabuf);
+ if (umem_dmabuf->pinned) {
+ dma_buf_unpin(umem_dmabuf->attach);
+ umem_dmabuf->pinned = 0;
+ }
+ umem_dmabuf->revoked = 1;
+}
+
static struct ib_umem_dmabuf *
ib_umem_dmabuf_get_pinned_and_lock(struct ib_device *device,
struct device *dma_device,
@@ -262,15 +278,7 @@ void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf)
struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf;
dma_resv_lock(dmabuf->resv, NULL);
- if (umem_dmabuf->revoked)
- goto end;
- ib_umem_dmabuf_unmap_pages(umem_dmabuf);
- if (umem_dmabuf->pinned) {
- dma_buf_unpin(umem_dmabuf->attach);
- umem_dmabuf->pinned = 0;
- }
- umem_dmabuf->revoked = 1;
-end:
+ ib_umem_dmabuf_revoke_locked(umem_dmabuf->attach);
dma_resv_unlock(dmabuf->resv);
}
EXPORT_SYMBOL(ib_umem_dmabuf_revoke);
--
2.53.0.473.g4a7958ca14-goog
* [PATCH rdma-next v2 3/5] RDMA/umem: Add pinned revocable dmabuf import interface
2026-03-02 0:15 [PATCH rdma-next v2 0/5] Add pinned revocable dmabuf import interface Jacob Moroni
2026-03-02 0:15 ` [PATCH rdma-next v2 1/5] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper Jacob Moroni
2026-03-02 0:15 ` [PATCH rdma-next v2 2/5] RDMA/umem: Move umem dmabuf revoke logic into helper function Jacob Moroni
@ 2026-03-02 0:15 ` Jacob Moroni
2026-03-04 14:46 ` Leon Romanovsky
2026-03-02 0:15 ` [PATCH rdma-next v2 4/5] RDMA/umem: Add helpers for umem dmabuf revoke lock Jacob Moroni
2026-03-02 0:15 ` [PATCH rdma-next v2 5/5] RDMA/irdma: Add support for revocable pinned dmabuf import Jacob Moroni
4 siblings, 1 reply; 7+ messages in thread
From: Jacob Moroni @ 2026-03-02 0:15 UTC (permalink / raw)
To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni
Add an interface for importing a pinned but revocable dmabuf.
This interface can be used by drivers that are capable of revocation
so that they can import dmabufs from exporters that may require it,
such as VFIO.
This interface implements a two-step process: drivers first call
ib_umem_dmabuf_get_pinned_revocable_and_lock(), which pins and maps
the dmabuf (and installs a functional move_notify/invalidate_mappings
callback) but returns with the lock still held, so that the driver
can then populate the revoke callback via
ib_umem_dmabuf_set_revoke_locked() without racing concurrent
revocations. This scheme also eases integration with drivers that
have not yet allocated their internal MR objects at the time
of the get_pinned_revocable* call.
Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
drivers/infiniband/core/umem_dmabuf.c | 61 +++++++++++++++++++++++++++
include/rdma/ib_umem.h | 20 +++++++++
2 files changed, 81 insertions(+)
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index 9cf9cfc93..7892e33be 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -203,6 +203,10 @@ static void ib_umem_dmabuf_revoke_locked(struct dma_buf_attachment *attach)
if (umem_dmabuf->revoked)
return;
+
+ if (umem_dmabuf->pinned_revoke)
+ umem_dmabuf->pinned_revoke(umem_dmabuf->pinned_revoke_priv);
+
ib_umem_dmabuf_unmap_pages(umem_dmabuf);
if (umem_dmabuf->pinned) {
dma_buf_unpin(umem_dmabuf->attach);
@@ -211,6 +215,11 @@ static void ib_umem_dmabuf_revoke_locked(struct dma_buf_attachment *attach)
umem_dmabuf->revoked = 1;
}
+static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_revocable_ops = {
+ .allow_peer2peer = true,
+ .move_notify = ib_umem_dmabuf_revoke_locked,
+};
+
static struct ib_umem_dmabuf *
ib_umem_dmabuf_get_pinned_and_lock(struct ib_device *device,
struct device *dma_device,
@@ -263,6 +272,58 @@ ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
}
EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned_with_dma_device);
+/**
+ * ib_umem_dmabuf_get_pinned_revocable_and_lock - Map & pin a revocable dmabuf
+ * @device: IB device.
+ * @offset: Start offset.
+ * @size: Length.
+ * @fd: dmabuf fd.
+ * @access: Access flags.
+ *
+ * Obtains a umem from a dmabuf for drivers/devices that can support revocation.
+ *
+ * Returns with dma_resv_lock held upon success. The driver must set the revoke
+ * callback prior to unlock by calling ib_umem_dmabuf_set_revoke_locked().
+ *
+ * When a revocation occurs, the revoke callback will be called. The driver must
+ * ensure that the region is no longer accessed when the callback returns. Any
+ * subsequent access attempts should also probably cause an AE for MRs.
+ *
+ * If the umem is used for an MR, the driver must ensure that the key remains in
+ * use such that it cannot be obtained by a new region until this region is
+ * fully deregistered (i.e., ibv_dereg_mr). If a driver needs to serialize with
+ * revoke calls, it can use dma_resv_lock.
+ *
+ * If successful, then the revoke callback may be called at any time and will
+ * also be called automatically upon ib_umem_release (serialized). The revoke
+ * callback will be called one time at most.
+ *
+ * Return: A pointer to ib_umem_dmabuf on success, or an ERR_PTR on failure.
+ */
+struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_revocable_and_lock(struct ib_device *device,
+ unsigned long offset, size_t size,
+ int fd, int access)
+{
+ const struct dma_buf_attach_ops *ops =
+ &ib_umem_dmabuf_attach_pinned_revocable_ops;
+
+ return ib_umem_dmabuf_get_pinned_and_lock(device, device->dma_device,
+ offset, size, fd, access,
+ ops);
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned_revocable_and_lock);
+
+void ib_umem_dmabuf_set_revoke_locked(struct ib_umem_dmabuf *umem_dmabuf,
+ void (*revoke)(void *priv), void *priv)
+{
+ dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
+
+ umem_dmabuf->pinned_revoke = revoke;
+ umem_dmabuf->pinned_revoke_priv = priv;
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_set_revoke_locked);
+
struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device,
unsigned long offset,
size_t size, int fd,
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 1cc1d4077..0fed9435d 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -32,6 +32,8 @@ struct ib_umem_dmabuf {
struct scatterlist *last_sg;
unsigned long first_sg_offset;
unsigned long last_sg_trim;
+ void (*pinned_revoke)(void *priv);
+ void *pinned_revoke_priv;
void *private;
u8 pinned : 1;
u8 revoked : 1;
@@ -137,6 +139,12 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device,
size_t size, int fd,
int access);
struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_revocable_and_lock(struct ib_device *device,
+ unsigned long offset, size_t size,
+ int fd, int access);
+void ib_umem_dmabuf_set_revoke_locked(struct ib_umem_dmabuf *umem_dmabuf,
+ void (*revoke)(void *priv), void *priv);
+struct ib_umem_dmabuf *
ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
struct device *dma_device,
unsigned long offset, size_t size,
@@ -189,6 +197,18 @@ ib_umem_dmabuf_get_pinned(struct ib_device *device, unsigned long offset,
return ERR_PTR(-EOPNOTSUPP);
}
+static inline struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_revocable_and_lock(struct ib_device *device,
+ unsigned long offset, size_t size,
+ int fd, int access)
+{
+ return ERR_PTR(-EOPNOTSUPP);
+}
+
+static inline void
+ib_umem_dmabuf_set_revoke_locked(struct ib_umem_dmabuf *umem_dmabuf,
+ void (*revoke)(void *priv), void *priv) {}
+
static inline struct ib_umem_dmabuf *
ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
struct device *dma_device,
--
2.53.0.473.g4a7958ca14-goog
* [PATCH rdma-next v2 4/5] RDMA/umem: Add helpers for umem dmabuf revoke lock
2026-03-02 0:15 [PATCH rdma-next v2 0/5] Add pinned revocable dmabuf import interface Jacob Moroni
` (2 preceding siblings ...)
2026-03-02 0:15 ` [PATCH rdma-next v2 3/5] RDMA/umem: Add pinned revocable dmabuf import interface Jacob Moroni
@ 2026-03-02 0:15 ` Jacob Moroni
2026-03-02 0:15 ` [PATCH rdma-next v2 5/5] RDMA/irdma: Add support for revocable pinned dmabuf import Jacob Moroni
4 siblings, 0 replies; 7+ messages in thread
From: Jacob Moroni @ 2026-03-02 0:15 UTC (permalink / raw)
To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni
Add helpers to acquire and release the umem dmabuf revoke
lock. The intent is to avoid the need for drivers to peek
into ib_umem_dmabuf internals to reach the dma_resv lock,
bringing us one step closer to abstracting ib_umem_dmabuf
away from drivers in general.
Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
drivers/infiniband/core/umem_dmabuf.c | 16 ++++++++++++++++
include/rdma/ib_umem.h | 4 ++++
2 files changed, 20 insertions(+)
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index 7892e33be..4c21191dd 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -334,6 +334,22 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device,
}
EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned);
+void ib_umem_dmabuf_revoke_lock(struct ib_umem_dmabuf *umem_dmabuf)
+{
+ struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf;
+
+ dma_resv_lock(dmabuf->resv, NULL);
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_revoke_lock);
+
+void ib_umem_dmabuf_revoke_unlock(struct ib_umem_dmabuf *umem_dmabuf)
+{
+ struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf;
+
+ dma_resv_unlock(dmabuf->resv);
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_revoke_unlock);
+
void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf)
{
struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf;
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 0fed9435d..8cc48ec4c 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -152,6 +152,8 @@ ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf);
void ib_umem_dmabuf_unmap_pages(struct ib_umem_dmabuf *umem_dmabuf);
void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf);
+void ib_umem_dmabuf_revoke_lock(struct ib_umem_dmabuf *umem_dmabuf);
+void ib_umem_dmabuf_revoke_unlock(struct ib_umem_dmabuf *umem_dmabuf);
void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf);
#else /* CONFIG_INFINIBAND_USER_MEM */
@@ -224,6 +226,8 @@ static inline int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf)
}
static inline void ib_umem_dmabuf_unmap_pages(struct ib_umem_dmabuf *umem_dmabuf) { }
static inline void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf) { }
+static inline void ib_umem_dmabuf_revoke_lock(struct ib_umem_dmabuf *umem_dmabuf) {}
+static inline void ib_umem_dmabuf_revoke_unlock(struct ib_umem_dmabuf *umem_dmabuf) {}
static inline void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf) {}
#endif /* CONFIG_INFINIBAND_USER_MEM */
--
2.53.0.473.g4a7958ca14-goog
* [PATCH rdma-next v2 5/5] RDMA/irdma: Add support for revocable pinned dmabuf import
2026-03-02 0:15 [PATCH rdma-next v2 0/5] Add pinned revocable dmabuf import interface Jacob Moroni
` (3 preceding siblings ...)
2026-03-02 0:15 ` [PATCH rdma-next v2 4/5] RDMA/umem: Add helpers for umem dmabuf revoke lock Jacob Moroni
@ 2026-03-02 0:15 ` Jacob Moroni
4 siblings, 0 replies; 7+ messages in thread
From: Jacob Moroni @ 2026-03-02 0:15 UTC (permalink / raw)
To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni
Use the new API to support importing pinned dmabufs from exporters
that require revocation, such as VFIO. The revoke semantics are
implemented by issuing a HW invalidation command without freeing
the key. This prevents further accesses to the region (they will
result in an invalid key AE) while keeping the key reserved
until the region is actually deregistered (i.e., ibv_dereg_mr),
so that a new MR registration cannot acquire the same key.
Tested with lockdep+KASAN and a memfd-backed dmabuf.
The rereg_mr path is explicitly blocked in libibverbs for dmabuf MRs
(more specifically, any MR not of type IBV_MR_TYPE_MR), so the rereg_mr
path for dmabufs was tested with a modified libibverbs.
Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
drivers/infiniband/hw/irdma/verbs.c | 105 ++++++++++++++++++++++++----
1 file changed, 93 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 15af53237..a0e0b3e39 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -3590,6 +3590,36 @@ static struct ib_mr *irdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 len,
return ERR_PTR(err);
}
+static int irdma_hwdereg_mr(struct ib_mr *ib_mr);
+
+static void irdma_umem_dmabuf_revoke(void *priv)
+{
+ /* priv is guaranteed to be valid any time this callback is invoked
+ * because we do not set the callback until after successful iwmr
+ * allocation and initialization.
+ */
+ struct irdma_mr *iwmr = priv;
+ int err;
+
+ /* Invalidate the key in hardware. This does not actually release the
+ * key for potential reuse - that only occurs when the region is fully
+ * deregistered.
+ *
+ * The irdma_hwdereg_mr call is a no-op if the region is not currently
+ * registered with hardware.
+ */
+ err = irdma_hwdereg_mr(&iwmr->ibmr);
+ if (err) {
+ struct irdma_device *iwdev = to_iwdev(iwmr->ibmr.device);
+
+ ibdev_err(&iwdev->ibdev, "dmabuf mr revoke failed %d", err);
+ if (!iwdev->rf->reset) {
+ iwdev->rf->reset = true;
+ iwdev->rf->gen_ops.request_reset(iwdev->rf);
+ }
+ }
+}
+
static struct ib_mr *irdma_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start,
u64 len, u64 virt,
int fd, int access,
@@ -3607,7 +3637,9 @@ static struct ib_mr *irdma_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start,
if (len > iwdev->rf->sc_dev.hw_attrs.max_mr_size)
return ERR_PTR(-EINVAL);
- umem_dmabuf = ib_umem_dmabuf_get_pinned(pd->device, start, len, fd, access);
+ umem_dmabuf =
+ ib_umem_dmabuf_get_pinned_revocable_and_lock(pd->device, start,
+ len, fd, access);
if (IS_ERR(umem_dmabuf)) {
ibdev_dbg(&iwdev->ibdev, "Failed to get dmabuf umem[%pe]\n",
umem_dmabuf);
@@ -3624,12 +3656,20 @@ static struct ib_mr *irdma_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start,
if (err)
goto err_iwmr;
+ ib_umem_dmabuf_set_revoke_locked(umem_dmabuf, irdma_umem_dmabuf_revoke,
+ iwmr);
+ ib_umem_dmabuf_revoke_unlock(umem_dmabuf);
return &iwmr->ibmr;
err_iwmr:
irdma_free_iwmr(iwmr);
err_release:
+ ib_umem_dmabuf_revoke_unlock(umem_dmabuf);
+
+ /* Will result in a call to revoke, but driver callback is not set and
+ * is therefore skipped.
+ */
ib_umem_release(&umem_dmabuf->umem);
return ERR_PTR(err);
@@ -3749,6 +3789,8 @@ static struct ib_mr *irdma_rereg_user_mr(struct ib_mr *ib_mr, int flags,
struct irdma_device *iwdev = to_iwdev(ib_mr->device);
struct irdma_mr *iwmr = to_iwmr(ib_mr);
struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+ bool dmabuf_revocable = iwmr->region && iwmr->region->is_dmabuf;
+ struct ib_umem_dmabuf *umem_dmabuf;
int ret;
if (len > iwdev->rf->sc_dev.hw_attrs.max_mr_size)
@@ -3757,9 +3799,26 @@ static struct ib_mr *irdma_rereg_user_mr(struct ib_mr *ib_mr, int flags,
if (flags & ~(IB_MR_REREG_TRANS | IB_MR_REREG_PD | IB_MR_REREG_ACCESS))
return ERR_PTR(-EOPNOTSUPP);
+ if (dmabuf_revocable) {
+ umem_dmabuf = to_ib_umem_dmabuf(iwmr->region);
+
+ ib_umem_dmabuf_revoke_lock(umem_dmabuf);
+
+ /* If the dmabuf has been revoked, it means that the region has
+ * been invalidated in HW. We must not allow it to become valid
+ * again unless the user is requesting a change in translation
+ * which will end up dropping the umem dmabuf and allocating an
+ * entirely new umem anyway.
+ */
+ if (umem_dmabuf->revoked && !(flags & IB_MR_REREG_TRANS)) {
+ ret = -EINVAL;
+ goto err_unlock;
+ }
+ }
+
ret = irdma_hwdereg_mr(ib_mr);
if (ret)
- return ERR_PTR(ret);
+ goto err_unlock;
if (flags & IB_MR_REREG_ACCESS)
iwmr->access = new_access;
@@ -3775,18 +3834,28 @@ static struct ib_mr *irdma_rereg_user_mr(struct ib_mr *ib_mr, int flags,
&iwpbl->pble_alloc);
iwpbl->pbl_allocated = false;
}
+
+ if (dmabuf_revocable) {
+ /* Must unlock before release to prevent deadlock */
+ ib_umem_dmabuf_revoke_unlock(umem_dmabuf);
+ dmabuf_revocable = false;
+ }
+
if (iwmr->region) {
ib_umem_release(iwmr->region);
iwmr->region = NULL;
}
ret = irdma_rereg_mr_trans(iwmr, start, len, virt);
- } else
+ } else {
ret = irdma_hwreg_mr(iwdev, iwmr, iwmr->access);
- if (ret)
- return ERR_PTR(ret);
+ }
- return NULL;
+err_unlock:
+ if (dmabuf_revocable)
+ ib_umem_dmabuf_revoke_unlock(umem_dmabuf);
+
+ return ret ? ERR_PTR(ret) : NULL;
}
/**
@@ -3909,6 +3978,7 @@ static int irdma_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
struct irdma_mr *iwmr = to_iwmr(ib_mr);
struct irdma_device *iwdev = to_iwdev(ib_mr->device);
struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+ bool dmabuf_revocable = iwmr->region && iwmr->region->is_dmabuf;
int ret;
if (iwmr->type != IRDMA_MEMREG_TYPE_MEM) {
@@ -3923,17 +3993,28 @@ static int irdma_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
goto done;
}
- ret = irdma_hwdereg_mr(ib_mr);
- if (ret)
- return ret;
+ if (!dmabuf_revocable) {
+ ret = irdma_hwdereg_mr(ib_mr);
+ if (ret)
+ return ret;
- irdma_free_stag(iwdev, iwmr->stag);
+ irdma_free_stag(iwdev, iwmr->stag);
+ }
done:
+ if (iwmr->region)
+ /* For dmabuf MRs, ib_umem_release will trigger a synchronous
+ * call to the revoke callback which will perform the actual HW
+ * invalidation via irdma_hwdereg_mr. We rely on this for its
+ * implicit serialization w.r.t. concurrent revocations. This
+ * must be done before freeing the PBLEs.
+ */
+ ib_umem_release(iwmr->region);
+
if (iwpbl->pbl_allocated)
irdma_free_pble(iwdev->rf->pble_rsrc, &iwpbl->pble_alloc);
- if (iwmr->region)
- ib_umem_release(iwmr->region);
+ if (dmabuf_revocable)
+ irdma_free_stag(iwdev, iwmr->stag);
kfree(iwmr);
--
2.53.0.473.g4a7958ca14-goog
* Re: [PATCH rdma-next v2 3/5] RDMA/umem: Add pinned revocable dmabuf import interface
2026-03-02 0:15 ` [PATCH rdma-next v2 3/5] RDMA/umem: Add pinned revocable dmabuf import interface Jacob Moroni
@ 2026-03-04 14:46 ` Leon Romanovsky
0 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2026-03-04 14:46 UTC (permalink / raw)
To: Jacob Moroni; +Cc: tatyana.e.nikolova, krzysztof.czurylo, jgg, linux-rdma
On Mon, Mar 02, 2026 at 12:15:37AM +0000, Jacob Moroni wrote:
> Added an interface for importing a pinned but revocable dmabuf.
> This interface can be used by drivers that are capable of revocation
> so that they can import dmabufs from exporters that may require it,
> such as VFIO.
>
> This interface implements a two step process, where drivers will first
> call ib_umem_dmabuf_get_pinned_revocable_and_lock() which will pin and
> map the dmabuf (and provide a functional move_notify/invalidate_mappings
> callback), but will return with the lock still held so that the
> driver can then populate the callback via
> ib_umem_dmabuf_set_revoke_locked() without races from concurrent
> revocations. This scheme also allows for easier integration with drivers
> that may not have actually allocated their internal MR objects at the time
> of the get_pinned_revocable* call.
>
> Signed-off-by: Jacob Moroni <jmoroni@google.com>
> ---
> drivers/infiniband/core/umem_dmabuf.c | 61 +++++++++++++++++++++++++++
> include/rdma/ib_umem.h | 20 +++++++++
> 2 files changed, 81 insertions(+)
<...>
> +void ib_umem_dmabuf_set_revoke_locked(struct ib_umem_dmabuf *umem_dmabuf,
> + void (*revoke)(void *priv), void *priv)
> +{
> + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
> +
> + umem_dmabuf->pinned_revoke = revoke;
> + umem_dmabuf->pinned_revoke_priv = priv;
I don't think that you need special priv, you can use umem_dmabuf->private instead.
See reg_user_mr_dmabuf():
1680 umem_dmabuf->private = mr;
Everything else looks reasonable to me.
Thanks