public inbox for linux-rdma@vger.kernel.org
* [PATCH rdma-next 0/4] Add pinned revocable dmabuf import interface
@ 2026-02-25 21:07 Jacob Moroni
  2026-02-25 21:07 ` [PATCH rdma-next 1/4] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper Jacob Moroni
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jacob Moroni @ 2026-02-25 21:07 UTC (permalink / raw)
  To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni

Some dmabuf exporters (e.g., VFIO) will require that pinned importers support
revocation. In order to support this for non-ODP drivers/devices, a new
interface is required. This new interface implements a two-step process
in which the driver performs a sequence like:

    ib_umem_dmabuf_get_pinned_revocable_and_lock()
    
        ... Driver MR allocation/initialization/registration/etc
        
    ib_umem_dmabuf_set_revoke_locked()
    dma_resv_unlock();
    
This allows the driver to provide a callback that can be used to
perform the actual invalidation in a way that is safe against races
from concurrent revocations during initialization.

The driver must ensure that the HW will no longer access the region
before the revoke callback returns. For MRs, this can be achieved
by using the rereg capability to set the region length to 0, or
perhaps by moving the region to a new quarantine PD. For HW that
allows the driver to manage the keys (like irdma), this can be
achieved by deregistering the region in HW but not freeing the key
until the region is truly deregistered via ibv_dereg_mr.

Changes since RFC(s):
* Break the interface into a two step process to avoid needing
  extra state in the driver.
* Move the majority of the functionality into the core.

RFC threads:
https://lore.kernel.org/linux-rdma/20260223195333.438492-1-jmoroni@google.com/T/#t
https://lore.kernel.org/linux-rdma/CAHYDg1TB1Xa+D700WrvrcQVdgZFE5f8iWp48EmQM9XjK9xJdew@mail.gmail.com/T/#t

Jacob Moroni (4):
  RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper
  RDMA/umem: Move umem dmabuf revoke logic into helper function
  RDMA/umem: Add pinned revocable dmabuf import interface
  RDMA/irdma: Add support for revocable pinned dmabuf import

 drivers/infiniband/core/umem_dmabuf.c | 122 ++++++++++++++++++++++----
 drivers/infiniband/hw/irdma/main.h    |   1 +
 drivers/infiniband/hw/irdma/verbs.c   |  71 ++++++++++++++-
 include/rdma/ib_umem.h                |  20 +++++
 4 files changed, 195 insertions(+), 19 deletions(-)

-- 
2.53.0.414.gf7e9f6c205-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH rdma-next 1/4] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper
  2026-02-25 21:07 [PATCH rdma-next 0/4] Add pinned revocable dmabuf import interface Jacob Moroni
@ 2026-02-25 21:07 ` Jacob Moroni
  2026-02-25 21:07 ` [PATCH rdma-next 2/4] RDMA/umem: Move umem dmabuf revoke logic into helper function Jacob Moroni
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Jacob Moroni @ 2026-02-25 21:07 UTC (permalink / raw)
  To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni

Move the inner logic of ib_umem_dmabuf_get_pinned_with_dma_device()
to a new static function that returns with the lock held upon success.

The intent is to allow reuse for the future get_pinned_revocable_and_lock
function.

Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
 drivers/infiniband/core/umem_dmabuf.c | 35 ++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index d30f24b90..0c0098285 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -195,18 +195,19 @@ static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
 	.move_notify = ib_umem_dmabuf_unsupported_move_notify,
 };
 
-struct ib_umem_dmabuf *
-ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
-					  struct device *dma_device,
-					  unsigned long offset, size_t size,
-					  int fd, int access)
+static struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_and_lock(struct ib_device *device,
+				   struct device *dma_device,
+				   unsigned long offset,
+				   size_t size, int fd, int access,
+				   const struct dma_buf_attach_ops *ops)
 {
 	struct ib_umem_dmabuf *umem_dmabuf;
 	int err;
 
-	umem_dmabuf = ib_umem_dmabuf_get_with_dma_device(device, dma_device, offset,
-							 size, fd, access,
-							 &ib_umem_dmabuf_attach_pinned_ops);
+	umem_dmabuf =
+		ib_umem_dmabuf_get_with_dma_device(device, dma_device, offset,
+						   size, fd, access, ops);
 	if (IS_ERR(umem_dmabuf))
 		return umem_dmabuf;
 
@@ -219,7 +220,6 @@ ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
 	err = ib_umem_dmabuf_map_pages(umem_dmabuf);
 	if (err)
 		goto err_release;
-	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
 
 	return umem_dmabuf;
 
@@ -228,6 +228,23 @@ ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
 	ib_umem_release(&umem_dmabuf->umem);
 	return ERR_PTR(err);
 }
+
+struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
+					  struct device *dma_device,
+					  unsigned long offset, size_t size,
+					  int fd, int access)
+{
+	struct ib_umem_dmabuf *umem_dmabuf =
+		ib_umem_dmabuf_get_pinned_and_lock(device, dma_device, offset,
+						   size, fd, access,
+						   &ib_umem_dmabuf_attach_pinned_ops);
+	if (IS_ERR(umem_dmabuf))
+		return umem_dmabuf;
+
+	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+	return umem_dmabuf;
+}
 EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned_with_dma_device);
 
 struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device,
-- 
2.53.0.414.gf7e9f6c205-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH rdma-next 2/4] RDMA/umem: Move umem dmabuf revoke logic into helper function
  2026-02-25 21:07 [PATCH rdma-next 0/4] Add pinned revocable dmabuf import interface Jacob Moroni
  2026-02-25 21:07 ` [PATCH rdma-next 1/4] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper Jacob Moroni
@ 2026-02-25 21:07 ` Jacob Moroni
  2026-02-25 21:07 ` [PATCH rdma-next 3/4] RDMA/umem: Add pinned revocable dmabuf import interface Jacob Moroni
  2026-02-25 21:07 ` [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import Jacob Moroni
  3 siblings, 0 replies; 12+ messages in thread
From: Jacob Moroni @ 2026-02-25 21:07 UTC (permalink / raw)
  To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni

This same logic will eventually be reused from within the
invalidate_mappings callback, which already has the dma_resv_lock
held, so break it out into a separate helper function.

Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
 drivers/infiniband/core/umem_dmabuf.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index 0c0098285..9cf9cfc93 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -195,6 +195,22 @@ static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
 	.move_notify = ib_umem_dmabuf_unsupported_move_notify,
 };
 
+static void ib_umem_dmabuf_revoke_locked(struct dma_buf_attachment *attach)
+{
+	struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
+
+	dma_resv_assert_held(attach->dmabuf->resv);
+
+	if (umem_dmabuf->revoked)
+		return;
+	ib_umem_dmabuf_unmap_pages(umem_dmabuf);
+	if (umem_dmabuf->pinned) {
+		dma_buf_unpin(umem_dmabuf->attach);
+		umem_dmabuf->pinned = 0;
+	}
+	umem_dmabuf->revoked = 1;
+}
+
 static struct ib_umem_dmabuf *
 ib_umem_dmabuf_get_pinned_and_lock(struct ib_device *device,
 				   struct device *dma_device,
@@ -262,15 +278,7 @@ void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf)
 	struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf;
 
 	dma_resv_lock(dmabuf->resv, NULL);
-	if (umem_dmabuf->revoked)
-		goto end;
-	ib_umem_dmabuf_unmap_pages(umem_dmabuf);
-	if (umem_dmabuf->pinned) {
-		dma_buf_unpin(umem_dmabuf->attach);
-		umem_dmabuf->pinned = 0;
-	}
-	umem_dmabuf->revoked = 1;
-end:
+	ib_umem_dmabuf_revoke_locked(umem_dmabuf->attach);
 	dma_resv_unlock(dmabuf->resv);
 }
 EXPORT_SYMBOL(ib_umem_dmabuf_revoke);
-- 
2.53.0.414.gf7e9f6c205-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH rdma-next 3/4] RDMA/umem: Add pinned revocable dmabuf import interface
  2026-02-25 21:07 [PATCH rdma-next 0/4] Add pinned revocable dmabuf import interface Jacob Moroni
  2026-02-25 21:07 ` [PATCH rdma-next 1/4] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper Jacob Moroni
  2026-02-25 21:07 ` [PATCH rdma-next 2/4] RDMA/umem: Move umem dmabuf revoke logic into helper function Jacob Moroni
@ 2026-02-25 21:07 ` Jacob Moroni
  2026-02-25 21:07 ` [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import Jacob Moroni
  3 siblings, 0 replies; 12+ messages in thread
From: Jacob Moroni @ 2026-02-25 21:07 UTC (permalink / raw)
  To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni

Add an interface for importing a pinned but revocable dmabuf.
This interface can be used by drivers that are capable of revocation
so that they can import dmabufs from exporters that may require it,
such as VFIO.

This interface implements a two-step process: the driver first calls
ib_umem_dmabuf_get_pinned_revocable_and_lock(), which pins and maps
the dmabuf (and provides a functional move_notify/invalidate_mappings
callback) but returns with the lock still held, so that the driver
can then populate the callback via ib_umem_dmabuf_set_revoke_locked()
without races from concurrent revocations. This scheme also allows
for easier integration with drivers that may not have allocated their
internal MR objects yet at the time of the get_pinned_revocable* call.

Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
 drivers/infiniband/core/umem_dmabuf.c | 61 +++++++++++++++++++++++++++
 include/rdma/ib_umem.h                | 20 +++++++++
 2 files changed, 81 insertions(+)

diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index 9cf9cfc93..7892e33be 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -203,6 +203,10 @@ static void ib_umem_dmabuf_revoke_locked(struct dma_buf_attachment *attach)
 
 	if (umem_dmabuf->revoked)
 		return;
+
+	if (umem_dmabuf->pinned_revoke)
+		umem_dmabuf->pinned_revoke(umem_dmabuf->pinned_revoke_priv);
+
 	ib_umem_dmabuf_unmap_pages(umem_dmabuf);
 	if (umem_dmabuf->pinned) {
 		dma_buf_unpin(umem_dmabuf->attach);
@@ -211,6 +215,11 @@ static void ib_umem_dmabuf_revoke_locked(struct dma_buf_attachment *attach)
 	umem_dmabuf->revoked = 1;
 }
 
+static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_revocable_ops = {
+	.allow_peer2peer = true,
+	.move_notify = ib_umem_dmabuf_revoke_locked,
+};
+
 static struct ib_umem_dmabuf *
 ib_umem_dmabuf_get_pinned_and_lock(struct ib_device *device,
 				   struct device *dma_device,
@@ -263,6 +272,58 @@ ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
 }
 EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned_with_dma_device);
 
+/**
+ * ib_umem_dmabuf_get_pinned_revocable_and_lock - Map & pin a revocable dmabuf
+ * @device: IB device.
+ * @offset: Start offset.
+ * @size: Length.
+ * @fd: dmabuf fd.
+ * @access: Access flags.
+ *
+ * Obtains a umem from a dmabuf for drivers/devices that can support revocation.
+ *
+ * Returns with dma_resv_lock held upon success. The driver must set the revoke
+ * callback prior to unlock by calling ib_umem_dmabuf_set_revoke_locked().
+ *
+ * When a revocation occurs, the revoke callback will be called. The driver must
+ * ensure that the region is no longer accessed when the callback returns. Any
+ * subsequent access attempts should also probably cause an AE for MRs.
+ *
+ * If the umem is used for an MR, the driver must ensure that the key remains in
+ * use such that it cannot be obtained by a new region until this region is
+ * fully deregistered (i.e., ibv_dereg_mr). If a driver needs to serialize with
+ * revoke calls, it can use dma_resv_lock.
+ *
+ * If successful, then the revoke callback may be called at any time and will
+ * also be called automatically upon ib_umem_release (serialized). The revoke
+ * callback will be called one time at most.
+ *
+ * Return: A pointer to ib_umem_dmabuf on success, or an ERR_PTR on failure.
+ */
+struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_revocable_and_lock(struct ib_device *device,
+					     unsigned long offset, size_t size,
+					     int fd, int access)
+{
+	const struct dma_buf_attach_ops *ops =
+		&ib_umem_dmabuf_attach_pinned_revocable_ops;
+
+	return ib_umem_dmabuf_get_pinned_and_lock(device, device->dma_device,
+						  offset, size, fd, access,
+						  ops);
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned_revocable_and_lock);
+
+void ib_umem_dmabuf_set_revoke_locked(struct ib_umem_dmabuf *umem_dmabuf,
+				      void (*revoke)(void *priv), void *priv)
+{
+	dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
+
+	umem_dmabuf->pinned_revoke = revoke;
+	umem_dmabuf->pinned_revoke_priv = priv;
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_set_revoke_locked);
+
 struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device,
 						 unsigned long offset,
 						 size_t size, int fd,
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 0a8e092c0..3de8c2efc 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -36,6 +36,8 @@ struct ib_umem_dmabuf {
 	struct scatterlist *last_sg;
 	unsigned long first_sg_offset;
 	unsigned long last_sg_trim;
+	void (*pinned_revoke)(void *priv);
+	void *pinned_revoke_priv;
 	void *private;
 	u8 pinned : 1;
 	u8 revoked : 1;
@@ -169,6 +171,12 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device,
 						 size_t size, int fd,
 						 int access);
 struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_revocable_and_lock(struct ib_device *device,
+					     unsigned long offset, size_t size,
+					     int fd, int access);
+void ib_umem_dmabuf_set_revoke_locked(struct ib_umem_dmabuf *umem_dmabuf,
+				      void (*revoke)(void *priv), void *priv);
+struct ib_umem_dmabuf *
 ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
 					  struct device *dma_device,
 					  unsigned long offset, size_t size,
@@ -221,6 +229,18 @@ ib_umem_dmabuf_get_pinned(struct ib_device *device, unsigned long offset,
 	return ERR_PTR(-EOPNOTSUPP);
 }
 
+static inline struct ib_umem_dmabuf *
+ib_umem_dmabuf_get_pinned_revocable_and_lock(struct ib_device *device,
+					     unsigned long offset, size_t size,
+					     int fd, int access)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+
+static inline void
+ib_umem_dmabuf_set_revoke_locked(struct ib_umem_dmabuf *umem_dmabuf,
+				 void (*revoke)(void *priv), void *priv) {}
+
 static inline struct ib_umem_dmabuf *
 ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device,
 					  struct device *dma_device,
-- 
2.53.0.414.gf7e9f6c205-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import
  2026-02-25 21:07 [PATCH rdma-next 0/4] Add pinned revocable dmabuf import interface Jacob Moroni
                   ` (2 preceding siblings ...)
  2026-02-25 21:07 ` [PATCH rdma-next 3/4] RDMA/umem: Add pinned revocable dmabuf import interface Jacob Moroni
@ 2026-02-25 21:07 ` Jacob Moroni
  2026-02-26  8:55   ` Leon Romanovsky
  3 siblings, 1 reply; 12+ messages in thread
From: Jacob Moroni @ 2026-02-25 21:07 UTC (permalink / raw)
  To: tatyana.e.nikolova, krzysztof.czurylo, jgg, leon; +Cc: linux-rdma, Jacob Moroni

Use the new API to support importing pinned dmabufs from exporters
that require revocation, such as VFIO. The revoke semantic is
achieved by issuing a HW invalidation command but not freeing
the key. This prevents further accesses to the region (they will
result in an invalid key AE), but also keeps the key reserved
until the region is actually deregistered (i.e., ibv_dereg_mr)
so that a new MR registration cannot acquire the same key.

Tested with lockdep+kasan and a memfd backed dmabuf.

Signed-off-by: Jacob Moroni <jmoroni@google.com>
---
 drivers/infiniband/hw/irdma/main.h  |  1 +
 drivers/infiniband/hw/irdma/verbs.c | 71 ++++++++++++++++++++++++++++-
 2 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h
index d320d1a22..240c79779 100644
--- a/drivers/infiniband/hw/irdma/main.h
+++ b/drivers/infiniband/hw/irdma/main.h
@@ -20,6 +20,7 @@
 #include <linux/delay.h>
 #include <linux/pci.h>
 #include <linux/dma-mapping.h>
+#include <linux/dma-resv.h>
 #include <linux/workqueue.h>
 #include <linux/slab.h>
 #include <linux/io.h>
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 15af53237..a92fe3cac 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -3590,6 +3590,36 @@ static struct ib_mr *irdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 len,
 	return ERR_PTR(err);
 }
 
+static int irdma_hwdereg_mr(struct ib_mr *ib_mr);
+
+static void irdma_umem_dmabuf_revoke(void *priv)
+{
+	/* priv is guaranteed to be valid any time this callback is invoked
+	 * because we do not set the callback until after successful iwmr
+	 * allocation and initialization.
+	 */
+	struct irdma_mr *iwmr = priv;
+	int err;
+
+	/* Invalidate the key in hardware. This does not actually release the
+	 * key for potential reuse - that only occurs when the region is fully
+	 * deregistered.
+	 *
+	 * The irdma_hwdereg_mr call is a no-op if the region is not currently
+	 * registered with hardware.
+	 */
+	err = irdma_hwdereg_mr(&iwmr->ibmr);
+	if (err) {
+		struct irdma_device *iwdev = to_iwdev(iwmr->ibmr.device);
+
+		ibdev_err(&iwdev->ibdev, "dmabuf mr revoke failed %d", err);
+		if (!iwdev->rf->reset) {
+			iwdev->rf->reset = true;
+			iwdev->rf->gen_ops.request_reset(iwdev->rf);
+		}
+	}
+}
+
 static struct ib_mr *irdma_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start,
 					      u64 len, u64 virt,
 					      int fd, int access,
@@ -3607,7 +3637,9 @@ static struct ib_mr *irdma_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start,
 	if (len > iwdev->rf->sc_dev.hw_attrs.max_mr_size)
 		return ERR_PTR(-EINVAL);
 
-	umem_dmabuf = ib_umem_dmabuf_get_pinned(pd->device, start, len, fd, access);
+	umem_dmabuf =
+		ib_umem_dmabuf_get_pinned_revocable_and_lock(pd->device, start,
+							     len, fd, access);
 	if (IS_ERR(umem_dmabuf)) {
 		ibdev_dbg(&iwdev->ibdev, "Failed to get dmabuf umem[%pe]\n",
 			  umem_dmabuf);
@@ -3624,12 +3656,20 @@ static struct ib_mr *irdma_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start,
 	if (err)
 		goto err_iwmr;
 
+	ib_umem_dmabuf_set_revoke_locked(umem_dmabuf, irdma_umem_dmabuf_revoke,
+					 iwmr);
+	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
 	return &iwmr->ibmr;
 
 err_iwmr:
 	irdma_free_iwmr(iwmr);
 
 err_release:
+	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+
+	/* Will result in a call to revoke, but driver callback is not set and
+	 * is therefore skipped.
+	 */
 	ib_umem_release(&umem_dmabuf->umem);
 
 	return ERR_PTR(err);
@@ -3899,6 +3939,32 @@ static void irdma_del_memlist(struct irdma_mr *iwmr,
 	}
 }
 
+/**
+ * irdma_dereg_mr_dmabuf - deregister a dmabuf mr
+ * @iwdev: iwarp device
+ * @iwmr: mr
+ *
+ * dmabuf deregistration requires a slightly different sequence since it relies
+ * on the umem release to invalidate the region in hardware via the revoke
+ * callback. This ensures serialization w.r.t. concurrent revocations.
+ */
+static int irdma_dereg_mr_dmabuf(struct irdma_device *iwdev,
+				 struct irdma_mr *iwmr)
+{
+	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+
+	/* Triggers a synchronous call to the revoke callback. */
+	ib_umem_release(iwmr->region);
+
+	irdma_free_stag(iwdev, iwmr->stag);
+
+	if (iwpbl->pbl_allocated)
+		irdma_free_pble(iwdev->rf->pble_rsrc, &iwpbl->pble_alloc);
+
+	kfree(iwmr);
+	return 0;
+}
+
 /**
  * irdma_dereg_mr - deregister mr
  * @ib_mr: mr ptr for dereg
@@ -3911,6 +3977,9 @@ static int irdma_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
 	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
 	int ret;
 
+	if (iwmr->region && iwmr->region->is_dmabuf)
+		return irdma_dereg_mr_dmabuf(iwdev, iwmr);
+
 	if (iwmr->type != IRDMA_MEMREG_TYPE_MEM) {
 		if (iwmr->region) {
 			struct irdma_ucontext *ucontext;
-- 
2.53.0.414.gf7e9f6c205-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import
  2026-02-25 21:07 ` [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import Jacob Moroni
@ 2026-02-26  8:55   ` Leon Romanovsky
  2026-02-26 19:22     ` Jacob Moroni
  0 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2026-02-26  8:55 UTC (permalink / raw)
  To: Jacob Moroni; +Cc: tatyana.e.nikolova, krzysztof.czurylo, jgg, linux-rdma

On Wed, Feb 25, 2026 at 09:07:05PM +0000, Jacob Moroni wrote:
> Use the new API to support importing pinned dmabufs from exporters
> that require revocation, such as VFIO. The revoke semantic is
> achieved by issuing a HW invalidation command but not freeing
> the key. This prevents further accesses to the region (they will
> result in an invalid key AE), but also keeps the key reserved
> until the region is actually deregistered (i.e., ibv_dereg_mr)
> so that a new MR registration cannot acquire the same key.
> 
> Tested with lockdep+kasan and a memfd backed dmabuf.
> 
> Signed-off-by: Jacob Moroni <jmoroni@google.com>
> ---
>  drivers/infiniband/hw/irdma/main.h  |  1 +
>  drivers/infiniband/hw/irdma/verbs.c | 71 ++++++++++++++++++++++++++++-
>  2 files changed, 71 insertions(+), 1 deletion(-)

<...>

>   * irdma_dereg_mr - deregister mr
>   * @ib_mr: mr ptr for dereg
> @@ -3911,6 +3977,9 @@ static int irdma_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
>  	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
>  	int ret;
>  
> +	if (iwmr->region && iwmr->region->is_dmabuf)
> +		return irdma_dereg_mr_dmabuf(iwdev, iwmr);

I wonder if you really need to leak umem properties and can't use
existing irdma_dereg_mr(). ib_umem_release() handles both regular and dmabuf correctly.

Thanks

> +
>  	if (iwmr->type != IRDMA_MEMREG_TYPE_MEM) {
>  		if (iwmr->region) {
>  			struct irdma_ucontext *ucontext;
> -- 
> 2.53.0.414.gf7e9f6c205-goog
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import
  2026-02-26  8:55   ` Leon Romanovsky
@ 2026-02-26 19:22     ` Jacob Moroni
  2026-02-26 19:41       ` Leon Romanovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Moroni @ 2026-02-26 19:22 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: tatyana.e.nikolova, krzysztof.czurylo, jgg, linux-rdma

Hi,

> I wonder if you really need to leak umem properties and can't use
> existing irdma_dereg_mr(). ib_umem_release() handles both regular and dmabuf correctly.

For dmabuf MRs, we need to protect against async/concurrent revocations. I am
currently relying on ib_umem_release to do this, since it causes a synchronous
revoke (with dma_resv_lock held) and also ensures that no revoke callbacks
will occur after return.

In the normal irdma_dereg_mr flow, irdma_hwdereg_mr is called prior to umem
release and isn't protected.

One solution may be to wrap the irdma_hwdereg_mr call with dma_resv_lock/unlock
if it is a dmabuf MR, but that still requires a bit of special handling
compared to normal MRs. That said, I could mark this state in the internal
iwmr rather than peeking into the umem like this. Then, the rest of the
routine would be identical for normal and dmabuf MRs. It is worth noting that
the ib_umem_release would still result in a revoke callback, but that callback
would be a no-op because the MR is already deregistered in HW at that point.

WDYT?

Thanks,
- Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import
  2026-02-26 19:22     ` Jacob Moroni
@ 2026-02-26 19:41       ` Leon Romanovsky
  2026-02-26 21:38         ` Jacob Moroni
  2026-02-27 14:53         ` Jason Gunthorpe
  0 siblings, 2 replies; 12+ messages in thread
From: Leon Romanovsky @ 2026-02-26 19:41 UTC (permalink / raw)
  To: Jacob Moroni; +Cc: tatyana.e.nikolova, krzysztof.czurylo, jgg, linux-rdma

On Thu, Feb 26, 2026 at 02:22:09PM -0500, Jacob Moroni wrote:
> Hi,
> 
> > I wonder if you really need to leak umem properties and can't use
> > existing irdma_dereg_mr(). ib_umem_release() handles both regular and dmabuf correctly.
> 
> For dmabuf MRs, we need to protect against async/concurrent revocations. I am
> currently relying on the ib_umem_release to do this since it causes a
> synchronous revoke (with dma_resv_lock held) and also ensures that no revoke
> callbacks will occur after return.
> 
> In the normal irdma_dereg_mr flow, irdma_hwdereg_mr is called prior to umem
> release and isn't protected.
> 
> One solution may be to wrap the irdma_hwdereg_mr call with dma_resv_lock/unlock
> if it is a dmabuf MR, but still requires a bit of special handling
> compared to normal
> MRs. That said, I could mark this state in the internal iwmr rather than peeking
> into the umem like this. Then, the rest of the routine would be
> identical for normal
> and dmabuf MRs. It is worth noting that the ib_umem_release would
> still result in
> a revoke callback, but this callback would be a no-op because the MR is already
> deregistered in HW at that point.

I asked because I'm working on a series that removes direct access to
umem_dmabuf from drivers and instead provides them with a valid ib_umem, which
may be of a different type (regular, dmabuf, or other).

So, in my view, all drivers should follow the same flow, allowing us to
perform lock and unlock operations in the core during the call to
irdma_dereg_mr(). However, it is not clear whether this can actually be
achieved.

Thanks

> 
> WDYT?
> 
> Thanks,
> - Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import
  2026-02-26 19:41       ` Leon Romanovsky
@ 2026-02-26 21:38         ` Jacob Moroni
  2026-02-27 14:44           ` Jacob Moroni
  2026-02-27 14:53         ` Jason Gunthorpe
  1 sibling, 1 reply; 12+ messages in thread
From: Jacob Moroni @ 2026-02-26 21:38 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: tatyana.e.nikolova, krzysztof.czurylo, jgg, linux-rdma

I see. Thanks for the context.

It may be hard to totally hide the fact that the umem is now revocable from
the drivers, but we may still be able to mostly hide the umem type if we make
"revocable" a property of the general umem. Then, there could be something
like an "ib_umem_revoke_lock/unlock" helper, which would be a no-op for most
umems but would allow drivers to have the same dereg path, at least?

Thanks,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import
  2026-02-26 21:38         ` Jacob Moroni
@ 2026-02-27 14:44           ` Jacob Moroni
  2026-02-27 14:50             ` Jason Gunthorpe
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Moroni @ 2026-02-27 14:44 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: tatyana.e.nikolova, krzysztof.czurylo, jgg, linux-rdma

Hi,

I will have to send a v2 anyway to address the rereg_mr path. This does raise a
question though: What is the expected behavior for rereg_mr on a dmabuf mr?

There is no rereg_dmabuf_mr, so my read of the irdma driver is that rereg_mr
will work as expected as long as you don't specify IB_MR_REREG_TRANS.

If IB_MR_REREG_TRANS is specified, then it will drop the umem_dmabuf and
get a new normal umem based on whatever arguments are provided, which I
guess makes sense, and I assume I will need to preserve this behavior since
it's user facing.

In the case where it is only a PD or access change, I will need to also deny
the rereg if the umem has been revoked. The buffer has been invalidated,
unpinned, and unmapped at that point, and the rereg would have the side
effect of re-validating it in HW, which I don't think can be allowed.

- Jake


On Thu, Feb 26, 2026 at 4:38 PM Jacob Moroni <jmoroni@google.com> wrote:
>
> I see. Thanks for the context.
>
> It may be hard to totally hide the fact that the umem is now revocable from
> the drivers, but we may still be able to mostly hide the umem type if we make
> "revocable" a property of the general umem. Then, there could be something
> like an "ib_umem_revoke_lock/unlock" helper, which would be a no-op for most
> umems but would allow drivers to have the same dereg path, at least?
>
> Thanks,
> Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import
  2026-02-27 14:44           ` Jacob Moroni
@ 2026-02-27 14:50             ` Jason Gunthorpe
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2026-02-27 14:50 UTC (permalink / raw)
  To: Jacob Moroni
  Cc: Leon Romanovsky, tatyana.e.nikolova, krzysztof.czurylo,
	linux-rdma

On Fri, Feb 27, 2026 at 09:44:02AM -0500, Jacob Moroni wrote:
> If IB_MR_REREG_TRANS is specified, then it will drop the umem_dmabuf and
> get a new normal umem based on whatever arguments are provided, which I
> guess makes sense, and I assume I will need to preserve this behavior since
> it's user facing.

Yeah, I think that is where we are at today. You can't rereg into a
new umem that has a dmabuf yet.

> In the case where it is only a PD or access change, I will need to also deny
> the rereg if the umem has been revoked. 

> The buffer has been invalidated,
> unpinned, and unmapped at that point and the rereg would have the side
> effect of re-validating it in HW which I don't think can be allowed.

Right, once revoked the rkey has to remain inoperable and anything
that undoes that is a serious security hole.

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import
  2026-02-26 19:41       ` Leon Romanovsky
  2026-02-26 21:38         ` Jacob Moroni
@ 2026-02-27 14:53         ` Jason Gunthorpe
  1 sibling, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2026-02-27 14:53 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jacob Moroni, tatyana.e.nikolova, krzysztof.czurylo, linux-rdma

On Thu, Feb 26, 2026 at 09:41:49PM +0200, Leon Romanovsky wrote:

> So, in my view, all drivers should follow the same flow, allowing us to
> perform lock and unlock operations in the core during the call to
> irdma_dereg_mr(). However, it is not clear whether this can actually be
> achieved.

I don't know about this, interlocking with the revoke requires access
to the dma reservation lock in the driver in some way.

Many of the existing drivers don't support revoke so they don't expose
this problem.

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-02-27 14:53 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-25 21:07 [PATCH rdma-next 0/4] Add pinned revocable dmabuf import interface Jacob Moroni
2026-02-25 21:07 ` [PATCH rdma-next 1/4] RDMA/umem: Add ib_umem_dmabuf_get_pinned_and_lock helper Jacob Moroni
2026-02-25 21:07 ` [PATCH rdma-next 2/4] RDMA/umem: Move umem dmabuf revoke logic into helper function Jacob Moroni
2026-02-25 21:07 ` [PATCH rdma-next 3/4] RDMA/umem: Add pinned revocable dmabuf import interface Jacob Moroni
2026-02-25 21:07 ` [PATCH rdma-next 4/4] RDMA/irdma: Add support for revocable pinned dmabuf import Jacob Moroni
2026-02-26  8:55   ` Leon Romanovsky
2026-02-26 19:22     ` Jacob Moroni
2026-02-26 19:41       ` Leon Romanovsky
2026-02-26 21:38         ` Jacob Moroni
2026-02-27 14:44           ` Jacob Moroni
2026-02-27 14:50             ` Jason Gunthorpe
2026-02-27 14:53         ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox