* [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers
@ 2026-01-20 14:07 Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 1/7] dma-buf: Rename .move_notify() callback to a clearer identifier Leon Romanovsky
` (6 more replies)
0 siblings, 7 replies; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-20 14:07 UTC (permalink / raw)
To: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Leon Romanovsky, Kevin Tian, Joerg Roedel,
Will Deacon, Robin Murphy, Felix Kuehling, Alex Williamson,
Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
Changelog:
v3:
* Used Jason's wording for the commits and cover letter.
* Removed IOMMUFD patch.
* Renamed dma_buf_attachment_is_revoke() to dma_buf_attach_revocable().
* Added patch to remove CONFIG_DMABUF_MOVE_NOTIFY.
* Added Reviewed-by tags.
* Called dma_resv_wait_timeout() after dma_buf_move_notify() in VFIO.
* Added dma_buf_attach_revocable() check to VFIO DMABUF attach function.
* Slightly changed commit messages.
v2: https://patch.msgid.link/20260118-dmabuf-revoke-v2-0-a03bb27c0875@nvidia.com
* Changed series to document the revoke semantics instead of
implementing it.
v1: https://patch.msgid.link/20260111-dmabuf-revoke-v1-0-fb4bcc8c259b@nvidia.com
-------------------------------------------------------------------------
This series documents a dma-buf "revoke" mechanism: it allows a dma-buf
exporter to explicitly invalidate ("kill") a shared buffer after it has
been distributed to importers, so that further CPU and device access is
prevented and importers reliably observe failure.
The change in this series is to properly document and use the existing
core "revoked" state on the dma-buf object and the corresponding
exporter-triggered revoke operation.
dma-buf has quietly allowed calling move_notify on pinned dma-bufs, even
though legacy importers using dma_buf_attach() would simply ignore
these calls.
RDMA saw this and needed to use allow_peer2peer=true, so it implemented
a new-style pinned importer with an explicitly non-working move_notify()
callback.
This has been tolerable because the existing exporters are thought to
only call move_notify() on a pinned DMABUF under RAS events, and we
have been willing to accept the UAF that results from allowing the
importer to continue using the mapping in this rare case.
VFIO wants to implement a pin-supporting exporter that will issue a
revoking move_notify() around FLRs and a few other user-triggerable
operations. Since this is much more common, we are not willing to
tolerate the security UAF caused by interworking with drivers that do
not support move_notify(). Thus, until now, VFIO has required dynamic
importers, even though it never actually moves the buffer location.
To allow VFIO to work with pinned importers, as dma-buf was intended
to be used, we need to allow VFIO to detect whether an importer is
legacy or RDMA and does not actually implement move_notify().
Introduce a new function that exporters can call to detect these less
capable importers. VFIO can then refuse to accept them during attach.
In theory, all exporters that call move_notify() on pinned dma-bufs
should call this function; however, that would break a number of widely
used NIC/GPU flows. Thus, for now, do not spread this further than VFIO
until we can understand how much of RDMA can implement the full
semantics.
In the process, clarify how move_notify() is intended to be used with
pinned dma-bufs.
Thanks
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
Leon Romanovsky (7):
dma-buf: Rename .move_notify() callback to a clearer identifier
dma-buf: Always build with DMABUF_MOVE_NOTIFY
dma-buf: Document RDMA non-ODP invalidate_mapping() special case
dma-buf: Add check function for revoke semantics
iommufd: Pin dma-buf importer for revoke semantics
vfio: Wait for dma-buf invalidation to complete
vfio: Validate dma-buf revocation semantics
drivers/dma-buf/Kconfig | 12 -----
drivers/dma-buf/dma-buf.c | 69 +++++++++++++++++++++++------
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 14 +++---
drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
drivers/gpu/drm/virtio/virtgpu_prime.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 7 ++-
drivers/gpu/drm/xe/xe_dma_buf.c | 14 +++---
drivers/infiniband/core/umem_dmabuf.c | 13 +-----
drivers/infiniband/hw/mlx5/mr.c | 2 +-
drivers/iommu/iommufd/pages.c | 11 ++++-
drivers/vfio/pci/vfio_pci_dmabuf.c | 8 ++++
include/linux/dma-buf.h | 9 ++--
12 files changed, 96 insertions(+), 67 deletions(-)
---
base-commit: 9ace4753a5202b02191d54e9fdf7f9e3d02b85eb
change-id: 20251221-dmabuf-revoke-b90ef16e4236
Best regards,
--
Leon Romanovsky <leonro@nvidia.com>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v3 1/7] dma-buf: Rename .move_notify() callback to a clearer identifier
2026-01-20 14:07 [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers Leon Romanovsky
@ 2026-01-20 14:07 ` Leon Romanovsky
2026-01-21 8:33 ` Christian König
2026-01-20 14:07 ` [PATCH v3 2/7] dma-buf: Always build with DMABUF_MOVE_NOTIFY Leon Romanovsky
` (5 subsequent siblings)
6 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-20 14:07 UTC (permalink / raw)
To: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Leon Romanovsky, Kevin Tian, Joerg Roedel,
Will Deacon, Robin Murphy, Felix Kuehling, Alex Williamson,
Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
From: Leon Romanovsky <leonro@nvidia.com>
Rename the .move_notify() callback to .invalidate_mappings() to make its
purpose explicit and highlight that it is responsible for invalidating
existing mappings.
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/dma-buf/dma-buf.c | 6 +++---
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 ++--
drivers/gpu/drm/virtio/virtgpu_prime.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 6 +++---
drivers/gpu/drm/xe/xe_dma_buf.c | 2 +-
drivers/infiniband/core/umem_dmabuf.c | 4 ++--
drivers/infiniband/hw/mlx5/mr.c | 2 +-
drivers/iommu/iommufd/pages.c | 2 +-
include/linux/dma-buf.h | 6 +++---
9 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index edaa9e4ee4ae..59cc647bf40e 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -948,7 +948,7 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
if (WARN_ON(!dmabuf || !dev))
return ERR_PTR(-EINVAL);
- if (WARN_ON(importer_ops && !importer_ops->move_notify))
+ if (WARN_ON(importer_ops && !importer_ops->invalidate_mappings))
return ERR_PTR(-EINVAL);
attach = kzalloc(sizeof(*attach), GFP_KERNEL);
@@ -1055,7 +1055,7 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_pin, "DMA_BUF");
*
* This unpins a buffer pinned by dma_buf_pin() and allows the exporter to move
* any mapping of @attach again and inform the importer through
- * &dma_buf_attach_ops.move_notify.
+ * &dma_buf_attach_ops.invalidate_mappings.
*/
void dma_buf_unpin(struct dma_buf_attachment *attach)
{
@@ -1262,7 +1262,7 @@ void dma_buf_move_notify(struct dma_buf *dmabuf)
list_for_each_entry(attach, &dmabuf->attachments, node)
if (attach->importer_ops)
- attach->importer_ops->move_notify(attach);
+ attach->importer_ops->invalidate_mappings(attach);
}
EXPORT_SYMBOL_NS_GPL(dma_buf_move_notify, "DMA_BUF");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index e22cfa7c6d32..863454148b28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -450,7 +450,7 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
}
/**
- * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
+ * amdgpu_dma_buf_move_notify - &attach.invalidate_mappings implementation
*
* @attach: the DMA-buf attachment
*
@@ -521,7 +521,7 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
.allow_peer2peer = true,
- .move_notify = amdgpu_dma_buf_move_notify
+ .invalidate_mappings = amdgpu_dma_buf_move_notify
};
/**
diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
index ce49282198cb..19c78dd2ca77 100644
--- a/drivers/gpu/drm/virtio/virtgpu_prime.c
+++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
@@ -288,7 +288,7 @@ static void virtgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
static const struct dma_buf_attach_ops virtgpu_dma_buf_attach_ops = {
.allow_peer2peer = true,
- .move_notify = virtgpu_dma_buf_move_notify
+ .invalidate_mappings = virtgpu_dma_buf_move_notify
};
struct drm_gem_object *virtgpu_gem_prime_import(struct drm_device *dev,
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 5df98de5ba3c..1f2cca5c2f81 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -23,7 +23,7 @@ static bool p2p_enabled(struct dma_buf_test_params *params)
static bool is_dynamic(struct dma_buf_test_params *params)
{
return IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) && params->attach_ops &&
- params->attach_ops->move_notify;
+ params->attach_ops->invalidate_mappings;
}
static void check_residency(struct kunit *test, struct xe_bo *exported,
@@ -60,7 +60,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
/*
* Evict exporter. Evicting the exported bo will
- * evict also the imported bo through the move_notify() functionality if
+ * evict also the imported bo through the invalidate_mappings() functionality if
* importer is on a different device. If they're on the same device,
* the exporter and the importer should be the same bo.
*/
@@ -198,7 +198,7 @@ static void xe_test_dmabuf_import_same_driver(struct xe_device *xe)
static const struct dma_buf_attach_ops nop2p_attach_ops = {
.allow_peer2peer = false,
- .move_notify = xe_dma_buf_move_notify
+ .invalidate_mappings = xe_dma_buf_move_notify
};
/*
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 7c74a31d4486..1b9cd043e517 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -287,7 +287,7 @@ static void xe_dma_buf_move_notify(struct dma_buf_attachment *attach)
static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
.allow_peer2peer = true,
- .move_notify = xe_dma_buf_move_notify
+ .invalidate_mappings = xe_dma_buf_move_notify
};
#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index 0ec2e4120cc9..d77a739cfe7a 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -129,7 +129,7 @@ ib_umem_dmabuf_get_with_dma_device(struct ib_device *device,
if (check_add_overflow(offset, (unsigned long)size, &end))
return ret;
- if (unlikely(!ops || !ops->move_notify))
+ if (unlikely(!ops || !ops->invalidate_mappings))
return ret;
dmabuf = dma_buf_get(fd);
@@ -195,7 +195,7 @@ ib_umem_dmabuf_unsupported_move_notify(struct dma_buf_attachment *attach)
static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
.allow_peer2peer = true,
- .move_notify = ib_umem_dmabuf_unsupported_move_notify,
+ .invalidate_mappings = ib_umem_dmabuf_unsupported_move_notify,
};
struct ib_umem_dmabuf *
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 325fa04cbe8a..97099d3b1688 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1620,7 +1620,7 @@ static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = {
.allow_peer2peer = 1,
- .move_notify = mlx5_ib_dmabuf_invalidate_cb,
+ .invalidate_mappings = mlx5_ib_dmabuf_invalidate_cb,
};
static struct ib_mr *
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index dbe51ecb9a20..76f900fa1687 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1451,7 +1451,7 @@ static void iopt_revoke_notify(struct dma_buf_attachment *attach)
static struct dma_buf_attach_ops iopt_dmabuf_attach_revoke_ops = {
.allow_peer2peer = true,
- .move_notify = iopt_revoke_notify,
+ .invalidate_mappings = iopt_revoke_notify,
};
/*
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 0bc492090237..1b397635c793 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -407,7 +407,7 @@ struct dma_buf {
* through the device.
*
* - Dynamic importers should set fences for any access that they can't
- * disable immediately from their &dma_buf_attach_ops.move_notify
+ * disable immediately from their &dma_buf_attach_ops.invalidate_mappings
* callback.
*
* IMPORTANT:
@@ -458,7 +458,7 @@ struct dma_buf_attach_ops {
bool allow_peer2peer;
/**
- * @move_notify: [optional] notification that the DMA-buf is moving
+ * @invalidate_mappings: [optional] notification that the DMA-buf is moving
*
* If this callback is provided the framework can avoid pinning the
* backing store while mappings exists.
@@ -475,7 +475,7 @@ struct dma_buf_attach_ops {
* New mappings can be created after this callback returns, and will
* point to the new location of the DMA-buf.
*/
- void (*move_notify)(struct dma_buf_attachment *attach);
+ void (*invalidate_mappings)(struct dma_buf_attachment *attach);
};
/**
--
2.52.0
* [PATCH v3 2/7] dma-buf: Always build with DMABUF_MOVE_NOTIFY
2026-01-20 14:07 [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 1/7] dma-buf: Rename .move_notify() callback to a clearer identifier Leon Romanovsky
@ 2026-01-20 14:07 ` Leon Romanovsky
2026-01-21 8:55 ` Christian König
2026-01-20 14:07 ` [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case Leon Romanovsky
` (4 subsequent siblings)
6 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-20 14:07 UTC (permalink / raw)
To: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Leon Romanovsky, Kevin Tian, Joerg Roedel,
Will Deacon, Robin Murphy, Felix Kuehling, Alex Williamson,
Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
From: Leon Romanovsky <leonro@nvidia.com>
DMABUF_MOVE_NOTIFY was introduced in 2020 and has been marked as
experimental and disabled by default ever since. Six years later,
all new importers implement this callback.
It is therefore reasonable to drop CONFIG_DMABUF_MOVE_NOTIFY and
always build DMABUF with support for it enabled.
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/dma-buf/Kconfig | 12 ------------
drivers/dma-buf/dma-buf.c | 12 ++----------
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 10 +++-------
drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 3 +--
drivers/gpu/drm/xe/xe_dma_buf.c | 12 ++++--------
6 files changed, 11 insertions(+), 40 deletions(-)
diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig
index b46eb8a552d7..84d5e9b24e20 100644
--- a/drivers/dma-buf/Kconfig
+++ b/drivers/dma-buf/Kconfig
@@ -40,18 +40,6 @@ config UDMABUF
A driver to let userspace turn memfd regions into dma-bufs.
Qemu can use this to create host dmabufs for guest framebuffers.
-config DMABUF_MOVE_NOTIFY
- bool "Move notify between drivers (EXPERIMENTAL)"
- default n
- depends on DMA_SHARED_BUFFER
- help
- Don't pin buffers if the dynamic DMA-buf interface is available on
- both the exporter as well as the importer. This fixes a security
- problem where userspace is able to pin unrestricted amounts of memory
- through DMA-buf.
- This is marked experimental because we don't yet have a consistent
- execution context and memory management between drivers.
-
config DMABUF_DEBUG
bool "DMA-BUF debug checks"
depends on DMA_SHARED_BUFFER
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 59cc647bf40e..cd3b60ce4863 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -837,18 +837,10 @@ static void mangle_sg_table(struct sg_table *sg_table)
}
-static inline bool
-dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
-{
- return !!attach->importer_ops;
-}
-
static bool
dma_buf_pin_on_map(struct dma_buf_attachment *attach)
{
- return attach->dmabuf->ops->pin &&
- (!dma_buf_attachment_is_dynamic(attach) ||
- !IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY));
+ return attach->dmabuf->ops->pin && !attach->importer_ops;
}
/**
@@ -1124,7 +1116,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
/*
* Importers with static attachments don't wait for fences.
*/
- if (!dma_buf_attachment_is_dynamic(attach)) {
+ if (!attach->importer_ops) {
ret = dma_resv_wait_timeout(attach->dmabuf->resv,
DMA_RESV_USAGE_KERNEL, true,
MAX_SCHEDULE_TIMEOUT);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 863454148b28..349215549e8f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -145,13 +145,9 @@ static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
* notifiers are disabled, only allow pinning in VRAM when move
* notiers are enabled.
*/
- if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
- domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
- } else {
- list_for_each_entry(attach, &dmabuf->attachments, node)
- if (!attach->peer2peer)
- domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
- }
+ list_for_each_entry(attach, &dmabuf->attachments, node)
+ if (!attach->peer2peer)
+ domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
if (domains & AMDGPU_GEM_DOMAIN_VRAM)
bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
index 16e12c9913f9..a5d7467c2f34 100644
--- a/drivers/gpu/drm/amd/amdkfd/Kconfig
+++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
@@ -27,7 +27,7 @@ config HSA_AMD_SVM
config HSA_AMD_P2P
bool "HSA kernel driver support for peer-to-peer for AMD GPU devices"
- depends on HSA_AMD && PCI_P2PDMA && DMABUF_MOVE_NOTIFY
+ depends on HSA_AMD && PCI_P2PDMA
help
Enable peer-to-peer (P2P) communication between AMD GPUs over
the PCIe bus. This can improve performance of multi-GPU compute
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 1f2cca5c2f81..c107687ef3c0 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -22,8 +22,7 @@ static bool p2p_enabled(struct dma_buf_test_params *params)
static bool is_dynamic(struct dma_buf_test_params *params)
{
- return IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) && params->attach_ops &&
- params->attach_ops->invalidate_mappings;
+ return params->attach_ops && params->attach_ops->invalidate_mappings;
}
static void check_residency(struct kunit *test, struct xe_bo *exported,
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 1b9cd043e517..ea370cd373e9 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -56,14 +56,10 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
bool allow_vram = true;
int ret;
- if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
- allow_vram = false;
- } else {
- list_for_each_entry(attach, &dmabuf->attachments, node) {
- if (!attach->peer2peer) {
- allow_vram = false;
- break;
- }
+ list_for_each_entry(attach, &dmabuf->attachments, node) {
+ if (!attach->peer2peer) {
+ allow_vram = false;
+ break;
}
}
--
2.52.0
* [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-20 14:07 [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 1/7] dma-buf: Rename .move_notify() callback to a clearer identifier Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 2/7] dma-buf: Always build with DMABUF_MOVE_NOTIFY Leon Romanovsky
@ 2026-01-20 14:07 ` Leon Romanovsky
2026-01-21 8:59 ` Christian König
2026-01-20 14:07 ` [PATCH v3 4/7] dma-buf: Add check function for revoke semantics Leon Romanovsky
` (3 subsequent siblings)
6 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-20 14:07 UTC (permalink / raw)
To: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Leon Romanovsky, Kevin Tian, Joerg Roedel,
Will Deacon, Robin Murphy, Felix Kuehling, Alex Williamson,
Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
From: Leon Romanovsky <leonro@nvidia.com>
The .invalidate_mappings() callback is documented as optional, yet it
effectively became mandatory whenever importer_ops were provided. This
led to cases where RDMA non-ODP code had to supply an empty stub just to
provide allow_peer2peer.
Document this behavior by creating a dedicated export for the
dma_buf_unsupported_invalidate_mappings() function. This function is
intended solely for the RDMA non-ODP case and must not be used by any
other dma-buf importer.
This makes it possible to rely on a valid .invalidate_mappings()
callback to determine whether an importer supports revocation.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/dma-buf/dma-buf.c | 14 ++++++++++++++
drivers/infiniband/core/umem_dmabuf.c | 11 +----------
include/linux/dma-buf.h | 4 +++-
3 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index cd3b60ce4863..c4fa35034b92 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1238,6 +1238,20 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach,
}
EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF");
+/*
+ * This function shouldn't be used by anything except the RDMA non-ODP
+ * case. The reason is a UAPI mistake: dma-buf was exported to userspace
+ * without knowing that .invalidate_mappings() can be called for pinned
+ * memory too.
+ *
+ * This warning shouldn't be seen in a real production scenario.
+ */
+void dma_buf_unsupported_invalidate_mappings(struct dma_buf_attachment *attach)
+{
+ pr_warn("Invalidate callback should not be called when memory is pinned\n");
+}
+EXPORT_SYMBOL_FOR_MODULES(dma_buf_unsupported_invalidate_mappings, "ib_uverbs");
+
/**
* dma_buf_move_notify - notify attachments that DMA-buf is moving
*
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index d77a739cfe7a..81442a887b48 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -184,18 +184,9 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get(struct ib_device *device,
}
EXPORT_SYMBOL(ib_umem_dmabuf_get);
-static void
-ib_umem_dmabuf_unsupported_move_notify(struct dma_buf_attachment *attach)
-{
- struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
-
- ibdev_warn_ratelimited(umem_dmabuf->umem.ibdev,
- "Invalidate callback should not be called when memory is pinned\n");
-}
-
static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
.allow_peer2peer = true,
- .invalidate_mappings = ib_umem_dmabuf_unsupported_move_notify,
+ .invalidate_mappings = dma_buf_unsupported_invalidate_mappings,
};
struct ib_umem_dmabuf *
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 1b397635c793..7d7d0a4fb762 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -458,7 +458,7 @@ struct dma_buf_attach_ops {
bool allow_peer2peer;
/**
- * @invalidate_mappings: [optional] notification that the DMA-buf is moving
+ * @invalidate_mappings: notification that the DMA-buf is moving
*
* If this callback is provided the framework can avoid pinning the
* backing store while mappings exists.
@@ -601,6 +601,8 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
enum dma_data_direction);
void dma_buf_move_notify(struct dma_buf *dma_buf);
+void dma_buf_unsupported_invalidate_mappings(struct dma_buf_attachment *attach);
+
int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
enum dma_data_direction dir);
int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
--
2.52.0
* [PATCH v3 4/7] dma-buf: Add check function for revoke semantics
2026-01-20 14:07 [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers Leon Romanovsky
` (2 preceding siblings ...)
2026-01-20 14:07 ` [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case Leon Romanovsky
@ 2026-01-20 14:07 ` Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 5/7] iommufd: Pin dma-buf importer " Leon Romanovsky
` (2 subsequent siblings)
6 siblings, 0 replies; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-20 14:07 UTC (permalink / raw)
To: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Leon Romanovsky, Kevin Tian, Joerg Roedel,
Will Deacon, Robin Murphy, Felix Kuehling, Alex Williamson,
Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
From: Leon Romanovsky <leonro@nvidia.com>
Add a check function for the DMA-buf revoke mechanism, which allows an
exporter to explicitly invalidate ("kill") a shared buffer after it has
been handed out to importers. Once revoked, all further CPU and device
access is blocked, and importers consistently observe failure.
This requires both importers and exporters to honor the revoke contract.
For importers, this means implementing .invalidate_mappings(). For
exporters, it means a .pin() and/or .attach() callback that checks the
dma-buf attachment for a valid revoke implementation.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/dma-buf/dma-buf.c | 37 ++++++++++++++++++++++++++++++++++++-
include/linux/dma-buf.h | 1 +
2 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index c4fa35034b92..c048c822c3e9 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1252,13 +1252,48 @@ void dma_buf_unsupported_invalidate_mappings(struct dma_buf_attachment *attach)
}
EXPORT_SYMBOL_FOR_MODULES(dma_buf_unsupported_invalidate_mappings, "ib_uverbs");
+/**
+ * dma_buf_attach_revocable - check if a DMA-buf importer implements
+ * revoke semantics.
+ * @attach: the DMA-buf attachment to check
+ *
+ * Returns true if the DMA-buf importer can handle invalidating its mappings
+ * at any time, even after pinning a buffer.
+ */
+bool dma_buf_attach_revocable(struct dma_buf_attachment *attach)
+{
+ /*
+ * There is no need to check existence of .invalidate_mappings() as
+ * it always exists when importer_ops is set in dma_buf_dynamic_attach().
+ */
+ return attach->importer_ops &&
+ (attach->importer_ops->invalidate_mappings !=
+ &dma_buf_unsupported_invalidate_mappings);
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_attach_revocable, "DMA_BUF");
+
/**
* dma_buf_move_notify - notify attachments that DMA-buf is moving
*
* @dmabuf: [in] buffer which is moving
*
* Informs all attachments that they need to destroy and recreate all their
- * mappings.
+ * mappings. If the attachment is dynamic then the dynamic importer is expected
+ * to invalidate any caches it has of the mapping result and perform a new
+ * mapping request before allowing HW to do any further DMA.
+ *
+ * If the attachment is pinned then this informs the pinned importer that
+ * the underlying mapping is no longer available. Pinned importers may take
+ * this as a permanent revocation, so exporters should not trigger it
+ * lightly.
+ *
+ * For legacy pinned importers that cannot support invalidation this is a NOP.
+ * Drivers can call dma_buf_attach_revocable() to determine if the importer
+ * supports this.
+ *
+ * NOTE: The invalidation triggers an asynchronous HW operation, and
+ * callers need to wait for this operation to complete by calling
+ * dma_resv_wait_timeout().
*/
void dma_buf_move_notify(struct dma_buf *dmabuf)
{
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 7d7d0a4fb762..ac2ce1273b4c 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -602,6 +602,7 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
enum dma_data_direction);
void dma_buf_move_notify(struct dma_buf *dma_buf);
void dma_buf_unsupported_invalidate_mappings(struct dma_buf_attachment *attach);
+bool dma_buf_attach_revocable(struct dma_buf_attachment *attach);
int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
enum dma_data_direction dir);
--
2.52.0
* [PATCH v3 5/7] iommufd: Pin dma-buf importer for revoke semantics
2026-01-20 14:07 [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers Leon Romanovsky
` (3 preceding siblings ...)
2026-01-20 14:07 ` [PATCH v3 4/7] dma-buf: Add check function for revoke semantics Leon Romanovsky
@ 2026-01-20 14:07 ` Leon Romanovsky
2026-01-21 9:01 ` Christian König
2026-01-20 14:07 ` [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 7/7] vfio: Validate dma-buf revocation semantics Leon Romanovsky
6 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-20 14:07 UTC (permalink / raw)
To: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Leon Romanovsky, Kevin Tian, Joerg Roedel,
Will Deacon, Robin Murphy, Felix Kuehling, Alex Williamson,
Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
From: Leon Romanovsky <leonro@nvidia.com>
IOMMUFD does not support page fault handling, and after a call to
.invalidate_mappings() all mappings become invalid. Ensure that
the IOMMUFD dma-buf importer is bound to a revoke-aware dma-buf
exporter (for example, VFIO).
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/iommu/iommufd/pages.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 76f900fa1687..a5eb2bc4ef48 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1501,16 +1501,22 @@ static int iopt_map_dmabuf(struct iommufd_ctx *ictx, struct iopt_pages *pages,
mutex_unlock(&pages->mutex);
}
- rc = sym_vfio_pci_dma_buf_iommufd_map(attach, &pages->dmabuf.phys);
+ rc = dma_buf_pin(attach);
if (rc)
goto err_detach;
+ rc = sym_vfio_pci_dma_buf_iommufd_map(attach, &pages->dmabuf.phys);
+ if (rc)
+ goto err_unpin;
+
dma_resv_unlock(dmabuf->resv);
/* On success iopt_release_pages() will detach and put the dmabuf. */
pages->dmabuf.attach = attach;
return 0;
+err_unpin:
+ dma_buf_unpin(attach);
err_detach:
dma_resv_unlock(dmabuf->resv);
dma_buf_detach(dmabuf, attach);
@@ -1656,6 +1662,7 @@ void iopt_release_pages(struct kref *kref)
if (iopt_is_dmabuf(pages) && pages->dmabuf.attach) {
struct dma_buf *dmabuf = pages->dmabuf.attach->dmabuf;
+ dma_buf_unpin(pages->dmabuf.attach);
dma_buf_detach(dmabuf, pages->dmabuf.attach);
dma_buf_put(dmabuf);
WARN_ON(!list_empty(&pages->dmabuf.tracker));
--
2.52.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
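The error unwind added by this hunk follows the usual kernel pattern: acquire in order (attach, pin, map) and release in reverse on failure. A plain user-space C sketch of that ordering — every name below is a stand-in for illustration, not the real dma-buf/IOMMUFD API:

```c
#include <stdio.h>

/* Stand-ins for the kernel calls; each logs its name, and the map
 * step can be forced to fail via the fail_map knob. */
static int fail_map;
static char log_buf[128];
static int log_len;

static void logstep(const char *s)
{
	log_len += snprintf(log_buf + log_len, sizeof(log_buf) - log_len,
			    "%s;", s);
}

static int buf_pin(void)    { logstep("pin");    return 0; }
static void buf_unpin(void) { logstep("unpin"); }
static int buf_map(void)    { logstep("map");    return fail_map ? -1 : 0; }
static void buf_detach(void){ logstep("detach"); }

/* Mirrors the iopt_map_dmabuf() flow above: pin first, then map;
 * on failure, unwind strictly in reverse order. */
static int setup(void)
{
	int rc;

	rc = buf_pin();
	if (rc)
		goto err_detach;

	rc = buf_map();
	if (rc)
		goto err_unpin;

	return 0;

err_unpin:
	buf_unpin();
err_detach:
	buf_detach();
	return rc;
}
```

If the map step fails, the log reads `pin;map;unpin;detach;` — the pin taken before the failing call is dropped before the attachment is torn down, matching the `err_unpin`/`err_detach` labels in the patch.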
* [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-20 14:07 [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers Leon Romanovsky
` (4 preceding siblings ...)
2026-01-20 14:07 ` [PATCH v3 5/7] iommufd: Pin dma-buf importer " Leon Romanovsky
@ 2026-01-20 14:07 ` Leon Romanovsky
2026-01-20 20:44 ` Matthew Brost
2026-01-21 9:20 ` Christian König
2026-01-20 14:07 ` [PATCH v3 7/7] vfio: Validate dma-buf revocation semantics Leon Romanovsky
6 siblings, 2 replies; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-20 14:07 UTC (permalink / raw)
To: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Leon Romanovsky, Kevin Tian, Joerg Roedel,
Will Deacon, Robin Murphy, Felix Kuehling, Alex Williamson,
Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
From: Leon Romanovsky <leonro@nvidia.com>
dma-buf invalidation is performed asynchronously by hardware, so VFIO must
wait until all affected objects have been fully invalidated.
Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index d4d0f7d08c53..33bc6a1909dd 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
dma_resv_lock(priv->dmabuf->resv, NULL);
priv->revoked = revoked;
dma_buf_move_notify(priv->dmabuf);
+ dma_resv_wait_timeout(priv->dmabuf->resv,
+ DMA_RESV_USAGE_KERNEL, false,
+ MAX_SCHEDULE_TIMEOUT);
dma_resv_unlock(priv->dmabuf->resv);
}
fput(priv->dmabuf->file);
@@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
priv->vdev = NULL;
priv->revoked = true;
dma_buf_move_notify(priv->dmabuf);
+ dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
+ false, MAX_SCHEDULE_TIMEOUT);
dma_resv_unlock(priv->dmabuf->resv);
vfio_device_put_registration(&vdev->vdev);
fput(priv->dmabuf->file);
--
2.52.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
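The ordering this patch enforces — flip the revoked flag under the reservation lock, notify importers, then block on the fences they left behind — can be sketched as a toy user-space model. Nothing here is the real dma-buf API; the "fence" is just a counter standing in for asynchronous HW teardown:

```c
/* Toy model of the VFIO revoke path in the hunks above. */
static int revoked;
static int pending_fences;

/* Importer callback: it cannot stop DMA instantly, so it leaves a
 * fence behind (here: bumps a pending counter). */
static void importer_invalidate(void)
{
	pending_fences++;	/* async HW teardown now in flight */
}

/* Stand-in for dma_buf_move_notify(): walk the importers. */
static void move_notify(void)
{
	importer_invalidate();
}

/* Stand-in for dma_resv_wait_timeout(): block until every
 * outstanding teardown fence has signalled. */
static void resv_wait(void)
{
	while (pending_fences > 0)
		pending_fences--;	/* HW signals completion */
}

static void dma_buf_revoke(void)
{
	revoked = 1;	/* under the reservation lock in the real code */
	move_notify();	/* importers queue their teardown fences       */
	resv_wait();	/* only now is the revocation actually done    */
}
```

The point of the patch is the third step: without the wait, the exporter would release the buffer while importer DMA teardown was still in flight.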
* [PATCH v3 7/7] vfio: Validate dma-buf revocation semantics
2026-01-20 14:07 [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers Leon Romanovsky
` (5 preceding siblings ...)
2026-01-20 14:07 ` [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete Leon Romanovsky
@ 2026-01-20 14:07 ` Leon Romanovsky
6 siblings, 0 replies; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-20 14:07 UTC (permalink / raw)
To: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Leon Romanovsky, Kevin Tian, Joerg Roedel,
Will Deacon, Robin Murphy, Felix Kuehling, Alex Williamson,
Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
From: Leon Romanovsky <leonro@nvidia.com>
Use the new dma_buf_attach_revocable() helper to restrict attachments to
importers that support mapping invalidation.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/vfio/pci/vfio_pci_dmabuf.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index 33bc6a1909dd..0c7782a51912 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -31,6 +31,9 @@ static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
if (priv->revoked)
return -ENODEV;
+ if (!dma_buf_attach_revocable(attachment))
+ return -EOPNOTSUPP;
+
return 0;
}
--
2.52.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
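The attach-time check above boils down to "does this importer have an invalidation callback at all". A minimal user-space sketch of that predicate and the VFIO attach gate — struct and function names here are simplified stand-ins, not the kernel definitions:

```c
#include <stddef.h>

#define EOPNOTSUPP 95

/* Minimal stand-ins for the attachment and its importer ops. */
struct attach_ops {
	void (*invalidate_mappings)(void);
};

struct attachment {
	const struct attach_ops *importer_ops;
};

/* Sketch of dma_buf_attach_revocable(): an importer can honour a
 * revoke only if it registered an invalidation callback. */
static int attach_revocable(const struct attachment *a)
{
	return a->importer_ops && a->importer_ops->invalidate_mappings;
}

/* Mirrors vfio_pci_dma_buf_attach(): refuse importers that cannot be
 * revoked, since VFIO may need to kill the mapping at any time. */
static int vfio_attach(const struct attachment *a)
{
	if (!attach_revocable(a))
		return -EOPNOTSUPP;
	return 0;
}
```

A static importer (no `importer_ops`) or a dynamic one without the callback both get `-EOPNOTSUPP`; only revocation-capable importers may attach.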
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-20 14:07 ` [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete Leon Romanovsky
@ 2026-01-20 20:44 ` Matthew Brost
2026-01-21 7:59 ` Leon Romanovsky
2026-01-21 10:41 ` Christian König
2026-01-21 9:20 ` Christian König
1 sibling, 2 replies; 40+ messages in thread
From: Matthew Brost @ 2026-01-20 20:44 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On Tue, Jan 20, 2026 at 04:07:06PM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> wait until all affected objects have been fully invalidated.
>
> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> index d4d0f7d08c53..33bc6a1909dd 100644
> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
> dma_resv_lock(priv->dmabuf->resv, NULL);
> priv->revoked = revoked;
> dma_buf_move_notify(priv->dmabuf);
> + dma_resv_wait_timeout(priv->dmabuf->resv,
> + DMA_RESV_USAGE_KERNEL, false,
> + MAX_SCHEDULE_TIMEOUT);
Should we explicitly call out in the dma_buf_move_notify() /
invalidate_mappings kernel-doc that KERNEL slots are the mechanism
for communicating asynchronous dma_buf_move_notify /
invalidate_mappings events via fences?
Yes, this is probably implied, but it wouldn’t hurt to state this
explicitly as part of the cross-driver contract.
Here is what we have now:
* - Dynamic importers should set fences for any access that they can't
* disable immediately from their &dma_buf_attach_ops.invalidate_mappings
* callback.
Matt
> dma_resv_unlock(priv->dmabuf->resv);
> }
> fput(priv->dmabuf->file);
> @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> priv->vdev = NULL;
> priv->revoked = true;
> dma_buf_move_notify(priv->dmabuf);
> + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
> + false, MAX_SCHEDULE_TIMEOUT);
> dma_resv_unlock(priv->dmabuf->resv);
> vfio_device_put_registration(&vdev->vdev);
> fput(priv->dmabuf->file);
>
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-20 20:44 ` Matthew Brost
@ 2026-01-21 7:59 ` Leon Romanovsky
2026-01-21 10:41 ` Christian König
1 sibling, 0 replies; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-21 7:59 UTC (permalink / raw)
To: Matthew Brost
Cc: Sumit Semwal, Christian König, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On Tue, Jan 20, 2026 at 12:44:50PM -0800, Matthew Brost wrote:
> On Tue, Jan 20, 2026 at 04:07:06PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> > wait until all affected objects have been fully invalidated.
> >
> > Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> > drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > index d4d0f7d08c53..33bc6a1909dd 100644
> > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
> > dma_resv_lock(priv->dmabuf->resv, NULL);
> > priv->revoked = revoked;
> > dma_buf_move_notify(priv->dmabuf);
> > + dma_resv_wait_timeout(priv->dmabuf->resv,
> > + DMA_RESV_USAGE_KERNEL, false,
> > + MAX_SCHEDULE_TIMEOUT);
>
> Should we explicitly call out in the dma_buf_move_notify() /
> invalidate_mappings kernel-doc that KERNEL slots are the mechanism
> for communicating asynchronous dma_buf_move_notify /
> invalidate_mappings events via fences?
>
> Yes, this is probably implied, but it wouldn’t hurt to state this
> explicitly as part of the cross-driver contract.
>
> Here is what we have now:
>
> * - Dynamic importers should set fences for any access that they can't
> * disable immediately from their &dma_buf_attach_ops.invalidate_mappings
> * callback.
I believe I documented this in patch 4:
https://lore.kernel.org/all/20260120-dmabuf-revoke-v3-4-b7e0b07b8214@nvidia.com/
Is there anything else that should be added?
1275 /**
1276 * dma_buf_move_notify - notify attachments that DMA-buf is moving
1277 *
1278 * @dmabuf: [in] buffer which is moving
1279 *
1280 * Informs all attachments that they need to destroy and recreate all their
1281 * mappings. If the attachment is dynamic then the dynamic importer is expected
1282 * to invalidate any caches it has of the mapping result and perform a new
1283 * mapping request before allowing HW to do any further DMA.
1284 *
1285 * If the attachment is pinned then this informs the pinned importer that
1286 * the underlying mapping is no longer available. Pinned importers may take
1287 * this as a permanent revocation, so exporters should not trigger it
1288 * lightly.
1289 *
1290 * For legacy pinned importers that cannot support invalidation this is a NOP.
1291 * Drivers can call dma_buf_attach_revocable() to determine if the importer
1292 * supports this.
1293 *
1294 * NOTE: The invalidation triggers an asynchronous HW operation, and
1295 * callers need to wait for this operation to complete by calling
1296 * dma_resv_wait_timeout().
1297 */
Thanks
>
> Matt
>
> > dma_resv_unlock(priv->dmabuf->resv);
> > }
> > fput(priv->dmabuf->file);
> > @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> > priv->vdev = NULL;
> > priv->revoked = true;
> > dma_buf_move_notify(priv->dmabuf);
> > + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
> > + false, MAX_SCHEDULE_TIMEOUT);
> > dma_resv_unlock(priv->dmabuf->resv);
> > vfio_device_put_registration(&vdev->vdev);
> > fput(priv->dmabuf->file);
> >
> > --
> > 2.52.0
> >
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 1/7] dma-buf: Rename .move_notify() callback to a clearer identifier
2026-01-20 14:07 ` [PATCH v3 1/7] dma-buf: Rename .move_notify() callback to a clearer identifier Leon Romanovsky
@ 2026-01-21 8:33 ` Christian König
2026-01-21 8:41 ` Leon Romanovsky
0 siblings, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-21 8:33 UTC (permalink / raw)
To: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
On 1/20/26 15:07, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> Rename the .move_notify() callback to .invalidate_mappings() to make its
> purpose explicit and highlight that it is responsible for invalidating
> existing mappings.
>
> Suggested-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> drivers/dma-buf/dma-buf.c | 6 +++---
> drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 ++--
> drivers/gpu/drm/virtio/virtgpu_prime.c | 2 +-
> drivers/gpu/drm/xe/tests/xe_dma_buf.c | 6 +++---
> drivers/gpu/drm/xe/xe_dma_buf.c | 2 +-
> drivers/infiniband/core/umem_dmabuf.c | 4 ++--
> drivers/infiniband/hw/mlx5/mr.c | 2 +-
> drivers/iommu/iommufd/pages.c | 2 +-
> include/linux/dma-buf.h | 6 +++---
> 9 files changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index edaa9e4ee4ae..59cc647bf40e 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -948,7 +948,7 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> if (WARN_ON(!dmabuf || !dev))
> return ERR_PTR(-EINVAL);
>
> - if (WARN_ON(importer_ops && !importer_ops->move_notify))
> + if (WARN_ON(importer_ops && !importer_ops->invalidate_mappings))
> return ERR_PTR(-EINVAL);
>
> attach = kzalloc(sizeof(*attach), GFP_KERNEL);
> @@ -1055,7 +1055,7 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_pin, "DMA_BUF");
> *
> * This unpins a buffer pinned by dma_buf_pin() and allows the exporter to move
> * any mapping of @attach again and inform the importer through
> - * &dma_buf_attach_ops.move_notify.
> + * &dma_buf_attach_ops.invalidate_mappings.
> */
> void dma_buf_unpin(struct dma_buf_attachment *attach)
> {
> @@ -1262,7 +1262,7 @@ void dma_buf_move_notify(struct dma_buf *dmabuf)
Thinking more about it, we can keep the function names as they are in the importers, but renaming this framework function as well would be really nice to have.
Regards,
Christian.
>
> list_for_each_entry(attach, &dmabuf->attachments, node)
> if (attach->importer_ops)
> - attach->importer_ops->move_notify(attach);
> + attach->importer_ops->invalidate_mappings(attach);
> }
> EXPORT_SYMBOL_NS_GPL(dma_buf_move_notify, "DMA_BUF");
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index e22cfa7c6d32..863454148b28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -450,7 +450,7 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
> }
>
> /**
> - * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
> + * amdgpu_dma_buf_move_notify - &attach.invalidate_mappings implementation
> *
> * @attach: the DMA-buf attachment
> *
> @@ -521,7 +521,7 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
>
> static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
> .allow_peer2peer = true,
> - .move_notify = amdgpu_dma_buf_move_notify
> + .invalidate_mappings = amdgpu_dma_buf_move_notify
> };
>
> /**
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index ce49282198cb..19c78dd2ca77 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -288,7 +288,7 @@ static void virtgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
>
> static const struct dma_buf_attach_ops virtgpu_dma_buf_attach_ops = {
> .allow_peer2peer = true,
> - .move_notify = virtgpu_dma_buf_move_notify
> + .invalidate_mappings = virtgpu_dma_buf_move_notify
> };
>
> struct drm_gem_object *virtgpu_gem_prime_import(struct drm_device *dev,
> diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> index 5df98de5ba3c..1f2cca5c2f81 100644
> --- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> +++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> @@ -23,7 +23,7 @@ static bool p2p_enabled(struct dma_buf_test_params *params)
> static bool is_dynamic(struct dma_buf_test_params *params)
> {
> return IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) && params->attach_ops &&
> - params->attach_ops->move_notify;
> + params->attach_ops->invalidate_mappings;
> }
>
> static void check_residency(struct kunit *test, struct xe_bo *exported,
> @@ -60,7 +60,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
>
> /*
> * Evict exporter. Evicting the exported bo will
> - * evict also the imported bo through the move_notify() functionality if
> + * evict also the imported bo through the invalidate_mappings() functionality if
> * importer is on a different device. If they're on the same device,
> * the exporter and the importer should be the same bo.
> */
> @@ -198,7 +198,7 @@ static void xe_test_dmabuf_import_same_driver(struct xe_device *xe)
>
> static const struct dma_buf_attach_ops nop2p_attach_ops = {
> .allow_peer2peer = false,
> - .move_notify = xe_dma_buf_move_notify
> + .invalidate_mappings = xe_dma_buf_move_notify
> };
>
> /*
> diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
> index 7c74a31d4486..1b9cd043e517 100644
> --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> @@ -287,7 +287,7 @@ static void xe_dma_buf_move_notify(struct dma_buf_attachment *attach)
>
> static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
> .allow_peer2peer = true,
> - .move_notify = xe_dma_buf_move_notify
> + .invalidate_mappings = xe_dma_buf_move_notify
> };
>
> #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
> diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
> index 0ec2e4120cc9..d77a739cfe7a 100644
> --- a/drivers/infiniband/core/umem_dmabuf.c
> +++ b/drivers/infiniband/core/umem_dmabuf.c
> @@ -129,7 +129,7 @@ ib_umem_dmabuf_get_with_dma_device(struct ib_device *device,
> if (check_add_overflow(offset, (unsigned long)size, &end))
> return ret;
>
> - if (unlikely(!ops || !ops->move_notify))
> + if (unlikely(!ops || !ops->invalidate_mappings))
> return ret;
>
> dmabuf = dma_buf_get(fd);
> @@ -195,7 +195,7 @@ ib_umem_dmabuf_unsupported_move_notify(struct dma_buf_attachment *attach)
>
> static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
> .allow_peer2peer = true,
> - .move_notify = ib_umem_dmabuf_unsupported_move_notify,
> + .invalidate_mappings = ib_umem_dmabuf_unsupported_move_notify,
> };
>
> struct ib_umem_dmabuf *
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index 325fa04cbe8a..97099d3b1688 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1620,7 +1620,7 @@ static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
>
> static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = {
> .allow_peer2peer = 1,
> - .move_notify = mlx5_ib_dmabuf_invalidate_cb,
> + .invalidate_mappings = mlx5_ib_dmabuf_invalidate_cb,
> };
>
> static struct ib_mr *
> diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
> index dbe51ecb9a20..76f900fa1687 100644
> --- a/drivers/iommu/iommufd/pages.c
> +++ b/drivers/iommu/iommufd/pages.c
> @@ -1451,7 +1451,7 @@ static void iopt_revoke_notify(struct dma_buf_attachment *attach)
>
> static struct dma_buf_attach_ops iopt_dmabuf_attach_revoke_ops = {
> .allow_peer2peer = true,
> - .move_notify = iopt_revoke_notify,
> + .invalidate_mappings = iopt_revoke_notify,
> };
>
> /*
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 0bc492090237..1b397635c793 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -407,7 +407,7 @@ struct dma_buf {
> * through the device.
> *
> * - Dynamic importers should set fences for any access that they can't
> - * disable immediately from their &dma_buf_attach_ops.move_notify
> + * disable immediately from their &dma_buf_attach_ops.invalidate_mappings
> * callback.
> *
> * IMPORTANT:
> @@ -458,7 +458,7 @@ struct dma_buf_attach_ops {
> bool allow_peer2peer;
>
> /**
> - * @move_notify: [optional] notification that the DMA-buf is moving
> + * @invalidate_mappings: [optional] notification that the DMA-buf is moving
> *
> * If this callback is provided the framework can avoid pinning the
> * backing store while mappings exists.
> @@ -475,7 +475,7 @@ struct dma_buf_attach_ops {
> * New mappings can be created after this callback returns, and will
> * point to the new location of the DMA-buf.
> */
> - void (*move_notify)(struct dma_buf_attachment *attach);
> + void (*invalidate_mappings)(struct dma_buf_attachment *attach);
> };
>
> /**
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 1/7] dma-buf: Rename .move_notify() callback to a clearer identifier
2026-01-21 8:33 ` Christian König
@ 2026-01-21 8:41 ` Leon Romanovsky
0 siblings, 0 replies; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-21 8:41 UTC (permalink / raw)
To: Christian König
Cc: Sumit Semwal, Alex Deucher, David Airlie, Simona Vetter,
Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh, Chia-I Wu,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On Wed, Jan 21, 2026 at 09:33:27AM +0100, Christian König wrote:
> On 1/20/26 15:07, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > Rename the .move_notify() callback to .invalidate_mappings() to make its
> > purpose explicit and highlight that it is responsible for invalidating
> > existing mappings.
> >
> > Suggested-by: Christian König <christian.koenig@amd.com>
> > Reviewed-by: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> > drivers/dma-buf/dma-buf.c | 6 +++---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 ++--
> > drivers/gpu/drm/virtio/virtgpu_prime.c | 2 +-
> > drivers/gpu/drm/xe/tests/xe_dma_buf.c | 6 +++---
> > drivers/gpu/drm/xe/xe_dma_buf.c | 2 +-
> > drivers/infiniband/core/umem_dmabuf.c | 4 ++--
> > drivers/infiniband/hw/mlx5/mr.c | 2 +-
> > drivers/iommu/iommufd/pages.c | 2 +-
> > include/linux/dma-buf.h | 6 +++---
> > 9 files changed, 17 insertions(+), 17 deletions(-)
<...>
> > attach = kzalloc(sizeof(*attach), GFP_KERNEL);
> > @@ -1055,7 +1055,7 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_pin, "DMA_BUF");
> > *
> > * This unpins a buffer pinned by dma_buf_pin() and allows the exporter to move
> > * any mapping of @attach again and inform the importer through
> > - * &dma_buf_attach_ops.move_notify.
> > + * &dma_buf_attach_ops.invalidate_mappings.
> > */
> > void dma_buf_unpin(struct dma_buf_attachment *attach)
> > {
> > @@ -1262,7 +1262,7 @@ void dma_buf_move_notify(struct dma_buf *dmabuf)
>
> Thinking more about it, we can keep the function names as they are in the importers, but renaming this framework function as well would be really nice to have.
Let me prepare an additional patch on top of this series. I'd prefer to
avoid unnecessary resubmissions caused solely by renaming.
Thanks
>
> Regards,
> Christian.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 2/7] dma-buf: Always build with DMABUF_MOVE_NOTIFY
2026-01-20 14:07 ` [PATCH v3 2/7] dma-buf: Always build with DMABUF_MOVE_NOTIFY Leon Romanovsky
@ 2026-01-21 8:55 ` Christian König
2026-01-21 10:14 ` Leon Romanovsky
0 siblings, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-21 8:55 UTC (permalink / raw)
To: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
On 1/20/26 15:07, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> DMABUF_MOVE_NOTIFY was introduced in 2018 and has been marked as
> experimental and disabled by default ever since. Six years later,
> all new importers implement this callback.
>
> It is therefore reasonable to drop CONFIG_DMABUF_MOVE_NOTIFY and
> always build DMABUF with support for it enabled.
>
> Suggested-by: Christian König <christian.koenig@amd.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> drivers/dma-buf/Kconfig | 12 ------------
> drivers/dma-buf/dma-buf.c | 12 ++----------
> drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 10 +++-------
> drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
> drivers/gpu/drm/xe/tests/xe_dma_buf.c | 3 +--
> drivers/gpu/drm/xe/xe_dma_buf.c | 12 ++++--------
> 6 files changed, 11 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig
> index b46eb8a552d7..84d5e9b24e20 100644
> --- a/drivers/dma-buf/Kconfig
> +++ b/drivers/dma-buf/Kconfig
> @@ -40,18 +40,6 @@ config UDMABUF
> A driver to let userspace turn memfd regions into dma-bufs.
> Qemu can use this to create host dmabufs for guest framebuffers.
>
> -config DMABUF_MOVE_NOTIFY
> - bool "Move notify between drivers (EXPERIMENTAL)"
> - default n
> - depends on DMA_SHARED_BUFFER
> - help
> - Don't pin buffers if the dynamic DMA-buf interface is available on
> - both the exporter as well as the importer. This fixes a security
> - problem where userspace is able to pin unrestricted amounts of memory
> - through DMA-buf.
> - This is marked experimental because we don't yet have a consistent
> - execution context and memory management between drivers.
> -
> config DMABUF_DEBUG
> bool "DMA-BUF debug checks"
> depends on DMA_SHARED_BUFFER
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 59cc647bf40e..cd3b60ce4863 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -837,18 +837,10 @@ static void mangle_sg_table(struct sg_table *sg_table)
>
> }
>
> -static inline bool
> -dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
I would rather like to keep the wrapper and even add some explanation of what it means when true is returned.
Apart from that looks good to me.
Regards,
Christian.
> -{
> - return !!attach->importer_ops;
> -}
> -
> static bool
> dma_buf_pin_on_map(struct dma_buf_attachment *attach)
> {
> - return attach->dmabuf->ops->pin &&
> - (!dma_buf_attachment_is_dynamic(attach) ||
> - !IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY));
> + return attach->dmabuf->ops->pin && !attach->importer_ops;
> }
>
> /**
> @@ -1124,7 +1116,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> /*
> * Importers with static attachments don't wait for fences.
> */
> - if (!dma_buf_attachment_is_dynamic(attach)) {
> + if (!attach->importer_ops) {
> ret = dma_resv_wait_timeout(attach->dmabuf->resv,
> DMA_RESV_USAGE_KERNEL, true,
> MAX_SCHEDULE_TIMEOUT);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index 863454148b28..349215549e8f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -145,13 +145,9 @@ static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
> * notifiers are disabled, only allow pinning in VRAM when move
> * notiers are enabled.
> */
> - if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
> - domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
> - } else {
> - list_for_each_entry(attach, &dmabuf->attachments, node)
> - if (!attach->peer2peer)
> - domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
> - }
> + list_for_each_entry(attach, &dmabuf->attachments, node)
> + if (!attach->peer2peer)
> + domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
>
> if (domains & AMDGPU_GEM_DOMAIN_VRAM)
> bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
> index 16e12c9913f9..a5d7467c2f34 100644
> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
> @@ -27,7 +27,7 @@ config HSA_AMD_SVM
>
> config HSA_AMD_P2P
> bool "HSA kernel driver support for peer-to-peer for AMD GPU devices"
> - depends on HSA_AMD && PCI_P2PDMA && DMABUF_MOVE_NOTIFY
> + depends on HSA_AMD && PCI_P2PDMA
> help
> Enable peer-to-peer (P2P) communication between AMD GPUs over
> the PCIe bus. This can improve performance of multi-GPU compute
> diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> index 1f2cca5c2f81..c107687ef3c0 100644
> --- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> +++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> @@ -22,8 +22,7 @@ static bool p2p_enabled(struct dma_buf_test_params *params)
>
> static bool is_dynamic(struct dma_buf_test_params *params)
> {
> - return IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) && params->attach_ops &&
> - params->attach_ops->invalidate_mappings;
> + return params->attach_ops && params->attach_ops->invalidate_mappings;
> }
>
> static void check_residency(struct kunit *test, struct xe_bo *exported,
> diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
> index 1b9cd043e517..ea370cd373e9 100644
> --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> @@ -56,14 +56,10 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
> bool allow_vram = true;
> int ret;
>
> - if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
> - allow_vram = false;
> - } else {
> - list_for_each_entry(attach, &dmabuf->attachments, node) {
> - if (!attach->peer2peer) {
> - allow_vram = false;
> - break;
> - }
> + list_for_each_entry(attach, &dmabuf->attachments, node) {
> + if (!attach->peer2peer) {
> + allow_vram = false;
> + break;
> }
> }
>
>
^ permalink raw reply [flat|nested] 40+ messages in thread
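With CONFIG_DMABUF_MOVE_NOTIFY gone, the pin-on-map predicate collapses to a single condition — and for kernels that had the option enabled, the old and new predicates agree on every input. A toy truth-table check (field names are illustrative stand-ins, not the kernel structs):

```c
#include <stdbool.h>

/* Stand-in attachment: does the exporter have a pin op, and did the
 * importer register dynamic importer_ops? */
struct attach {
	bool exporter_has_pin;
	bool has_importer_ops;
};

/* New predicate after this patch: pin on map only for static
 * importers of pin-capable exporters. */
static bool pin_on_map(const struct attach *a)
{
	return a->exporter_has_pin && !a->has_importer_ops;
}

/* Old predicate, parameterised on the removed config option. */
static bool pin_on_map_old(const struct attach *a, bool move_notify_enabled)
{
	return a->exporter_has_pin &&
	       (!a->has_importer_ops || !move_notify_enabled);
}
```

With `move_notify_enabled == true` (the state this series makes unconditional) the two functions are identical, which is why the patch is behavior-preserving for configs that already enabled move notify.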
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-20 14:07 ` [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case Leon Romanovsky
@ 2026-01-21 8:59 ` Christian König
2026-01-21 9:14 ` Leon Romanovsky
0 siblings, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-21 8:59 UTC (permalink / raw)
To: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
On 1/20/26 15:07, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> The .invalidate_mappings() callback is documented as optional, yet it
> effectively became mandatory whenever importer_ops were provided. This
> led to cases where the RDMA non-ODP code had to supply an empty stub just
> to be able to set allow_peer2peer.
>
> Document this behavior by creating a dedicated export for the
> dma_buf_unsupported_invalidate_mappings() function. This function is
> intended solely for the RDMA non-ODP case and must not be used by any
> other dma-buf importer.
>
> This makes it possible to rely on a valid .invalidate_mappings()
> callback to determine whether an importer supports revocation.
>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> drivers/dma-buf/dma-buf.c | 14 ++++++++++++++
> drivers/infiniband/core/umem_dmabuf.c | 11 +----------
> include/linux/dma-buf.h | 4 +++-
> 3 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index cd3b60ce4863..c4fa35034b92 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1238,6 +1238,20 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach,
> }
> EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF");
>
> +/*
> + * This function shouldn't be used by anyone except the RDMA non-ODP case.
> + * The reason for it is a UAPI mistake: a dma-buf was exported to
> + * userspace without knowing that .invalidate_mappings() can be called
> + * for pinned memory too.
> + *
> + * This warning shouldn't be seen in real production scenarios.
> + */
> +void dma_buf_unsupported_invalidate_mappings(struct dma_buf_attachment *attach)
> +{
> + pr_warn("Invalidate callback should not be called when memory is pinned\n");
> +}
> +EXPORT_SYMBOL_FOR_MODULES(dma_buf_unsupported_invalidate_mappings, "ib_uverbs");
> +
Well that is exactly the opposite of what I had in mind.
The RDMA non-ODP case should explicitly not provide an invalidate_mappings callback, but only the dma_buf_attach_ops with allow_peer2peer set to true.
This is done to explicitly note that RDMA non-ODP can't do invalidations.
Regards,
Christian.
> /**
> * dma_buf_move_notify - notify attachments that DMA-buf is moving
> *
> diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
> index d77a739cfe7a..81442a887b48 100644
> --- a/drivers/infiniband/core/umem_dmabuf.c
> +++ b/drivers/infiniband/core/umem_dmabuf.c
> @@ -184,18 +184,9 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get(struct ib_device *device,
> }
> EXPORT_SYMBOL(ib_umem_dmabuf_get);
>
> -static void
> -ib_umem_dmabuf_unsupported_move_notify(struct dma_buf_attachment *attach)
> -{
> - struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
> -
> - ibdev_warn_ratelimited(umem_dmabuf->umem.ibdev,
> - "Invalidate callback should not be called when memory is pinned\n");
> -}
> -
> static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
> .allow_peer2peer = true,
> - .invalidate_mappings = ib_umem_dmabuf_unsupported_move_notify,
> + .invalidate_mappings = dma_buf_unsupported_invalidate_mappings,
> };
>
> struct ib_umem_dmabuf *
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 1b397635c793..7d7d0a4fb762 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -458,7 +458,7 @@ struct dma_buf_attach_ops {
> bool allow_peer2peer;
>
> /**
> - * @invalidate_mappings: [optional] notification that the DMA-buf is moving
> + * @invalidate_mappings: notification that the DMA-buf is moving
> *
> * If this callback is provided the framework can avoid pinning the
> * backing store while mappings exists.
> @@ -601,6 +601,8 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
> void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
> enum dma_data_direction);
> void dma_buf_move_notify(struct dma_buf *dma_buf);
> +void dma_buf_unsupported_invalidate_mappings(struct dma_buf_attachment *attach);
> +
> int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
> enum dma_data_direction dir);
> int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 5/7] iommufd: Pin dma-buf importer for revoke semantics
2026-01-20 14:07 ` [PATCH v3 5/7] iommufd: Pin dma-buf importer " Leon Romanovsky
@ 2026-01-21 9:01 ` Christian König
0 siblings, 0 replies; 40+ messages in thread
From: Christian König @ 2026-01-21 9:01 UTC (permalink / raw)
To: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
On 1/20/26 15:07, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> IOMMUFD does not support page fault handling, and after a call to
> .invalidate_mappings() all mappings become invalid. Ensure that
> the IOMMUFD dma-buf importer is bound to a revoke-aware dma-buf
> exporter (for example, VFIO).
>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
I don't know the code well enough for a review, but that looks totally reasonable to me.
Acked-by: Christian König <christian.koenig@amd.com>
> ---
> drivers/iommu/iommufd/pages.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
> index 76f900fa1687..a5eb2bc4ef48 100644
> --- a/drivers/iommu/iommufd/pages.c
> +++ b/drivers/iommu/iommufd/pages.c
> @@ -1501,16 +1501,22 @@ static int iopt_map_dmabuf(struct iommufd_ctx *ictx, struct iopt_pages *pages,
> mutex_unlock(&pages->mutex);
> }
>
> - rc = sym_vfio_pci_dma_buf_iommufd_map(attach, &pages->dmabuf.phys);
> + rc = dma_buf_pin(attach);
> if (rc)
> goto err_detach;
>
> + rc = sym_vfio_pci_dma_buf_iommufd_map(attach, &pages->dmabuf.phys);
> + if (rc)
> + goto err_unpin;
> +
> dma_resv_unlock(dmabuf->resv);
>
> /* On success iopt_release_pages() will detach and put the dmabuf. */
> pages->dmabuf.attach = attach;
> return 0;
>
> +err_unpin:
> + dma_buf_unpin(attach);
> err_detach:
> dma_resv_unlock(dmabuf->resv);
> dma_buf_detach(dmabuf, attach);
> @@ -1656,6 +1662,7 @@ void iopt_release_pages(struct kref *kref)
> if (iopt_is_dmabuf(pages) && pages->dmabuf.attach) {
> struct dma_buf *dmabuf = pages->dmabuf.attach->dmabuf;
>
> + dma_buf_unpin(pages->dmabuf.attach);
> dma_buf_detach(dmabuf, pages->dmabuf.attach);
> dma_buf_put(dmabuf);
> WARN_ON(!list_empty(&pages->dmabuf.tracker));
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-21 8:59 ` Christian König
@ 2026-01-21 9:14 ` Leon Romanovsky
2026-01-21 9:17 ` Christian König
0 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-21 9:14 UTC (permalink / raw)
To: Christian König
Cc: Sumit Semwal, Alex Deucher, David Airlie, Simona Vetter,
Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh, Chia-I Wu,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On Wed, Jan 21, 2026 at 09:59:59AM +0100, Christian König wrote:
> On 1/20/26 15:07, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > The .invalidate_mappings() callback is documented as optional, yet it
> > effectively became mandatory whenever importer_ops were provided. This
> > led to cases where the RDMA non-ODP code had to supply an empty stub just
> > to be able to set allow_peer2peer.
> >
> > Document this behavior by creating a dedicated export for the
> > dma_buf_unsupported_invalidate_mappings() function. This function is
> > intended solely for the RDMA non-ODP case and must not be used by any
> > other dma-buf importer.
> >
> > This makes it possible to rely on a valid .invalidate_mappings()
> > callback to determine whether an importer supports revocation.
> >
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> > drivers/dma-buf/dma-buf.c | 14 ++++++++++++++
> > drivers/infiniband/core/umem_dmabuf.c | 11 +----------
> > include/linux/dma-buf.h | 4 +++-
> > 3 files changed, 18 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index cd3b60ce4863..c4fa35034b92 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -1238,6 +1238,20 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach,
> > }
> > EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF");
> >
> > +/*
> > + * This function shouldn't be used by anyone except the RDMA non-ODP case.
> > + * The reason for it is a UAPI mistake: a dma-buf was exported to
> > + * userspace without knowing that .invalidate_mappings() can be called
> > + * for pinned memory too.
> > + *
> > + * This warning shouldn't be seen in real production scenarios.
> > + */
> > +void dma_buf_unsupported_invalidate_mappings(struct dma_buf_attachment *attach)
> > +{
> > + pr_warn("Invalidate callback should not be called when memory is pinned\n");
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(dma_buf_unsupported_invalidate_mappings, "ib_uverbs");
> > +
>
> Well that is exactly the opposite of what I had in mind.
>
> The RDMA non-ODP case should explicitly not provide an invalidate_mappings callback, but only the dma_buf_attach_ops with allow_peer2peer set to true.
>
> This is done to explicitly note that RDMA non-ODP can't do invalidations.
We want to achieve two goals:
1. Provide a meaningful warning to developers, rather than failing later
because dma_buf_move_notify() was called on this problematic imported dma-buf.
2. Require all users to supply a valid .invalidate_mappings().
If I allow an empty .invalidate_mappings(), this check will have to go too:
932 struct dma_buf_attachment *
933 dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
934 const struct dma_buf_attach_ops *importer_ops,
935 void *importer_priv)
...
943 if (WARN_ON(importer_ops && !importer_ops->invalidate_mappings))
944 return ERR_PTR(-EINVAL);
And it is an important part of dma-buf.
Thanks
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-21 9:14 ` Leon Romanovsky
@ 2026-01-21 9:17 ` Christian König
2026-01-21 13:18 ` Jason Gunthorpe
0 siblings, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-21 9:17 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Sumit Semwal, Alex Deucher, David Airlie, Simona Vetter,
Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh, Chia-I Wu,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On 1/21/26 10:14, Leon Romanovsky wrote:
> On Wed, Jan 21, 2026 at 09:59:59AM +0100, Christian König wrote:
>> On 1/20/26 15:07, Leon Romanovsky wrote:
>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>
>>> The .invalidate_mappings() callback is documented as optional, yet it
>>> effectively became mandatory whenever importer_ops were provided. This
>>> led to cases where the RDMA non-ODP code had to supply an empty stub just
>>> to be able to set allow_peer2peer.
>>>
>>> Document this behavior by creating a dedicated export for the
>>> dma_buf_unsupported_invalidate_mappings() function. This function is
>>> intended solely for the RDMA non-ODP case and must not be used by any
>>> other dma-buf importer.
>>>
>>> This makes it possible to rely on a valid .invalidate_mappings()
>>> callback to determine whether an importer supports revocation.
>>>
>>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>>> ---
>>> drivers/dma-buf/dma-buf.c | 14 ++++++++++++++
>>> drivers/infiniband/core/umem_dmabuf.c | 11 +----------
>>> include/linux/dma-buf.h | 4 +++-
>>> 3 files changed, 18 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index cd3b60ce4863..c4fa35034b92 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -1238,6 +1238,20 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach,
>>> }
>>> EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF");
>>>
>>> +/*
>>> + * This function shouldn't be used by anyone except the RDMA non-ODP case.
>>> + * The reason for it is a UAPI mistake: a dma-buf was exported to
>>> + * userspace without knowing that .invalidate_mappings() can be called
>>> + * for pinned memory too.
>>> + *
>>> + * This warning shouldn't be seen in real production scenarios.
>>> + */
>>> +void dma_buf_unsupported_invalidate_mappings(struct dma_buf_attachment *attach)
>>> +{
>>> + pr_warn("Invalidate callback should not be called when memory is pinned\n");
>>> +}
>>> +EXPORT_SYMBOL_FOR_MODULES(dma_buf_unsupported_invalidate_mappings, "ib_uverbs");
>>> +
>>
>> Well that is exactly the opposite of what I had in mind.
>>
>> The RDMA non-ODP case should explicitly not provide an invalidate_mappings callback, but only the dma_buf_attach_ops with allow_peer2peer set to true.
>>
>> This is done to explicitly note that RDMA non-ODP can't do invalidations.
>
> We want to achieve two goals:
> 1. Provide a meaningful warning to developers, rather than failing later
> because dma_buf_move_notify() was called on this problematic imported dma-buf.
> 2. Require all users to supply a valid .invalidate_mappings().
Nope, that is something I would reject. invalidate_mappings must stay optional.
>
> If I allow an empty .invalidate_mappings(), this check will have to go too:
Correct, that is the whole idea.
> 932 struct dma_buf_attachment *
> 933 dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> 934 const struct dma_buf_attach_ops *importer_ops,
> 935 void *importer_priv)
> ...
> 943 if (WARN_ON(importer_ops && !importer_ops->invalidate_mappings))
> 944 return ERR_PTR(-EINVAL);
>
> And it is an important part of dma-buf.
No, as far as I can see that is what we try to avoid.
The whole idea is to make invalidate_mappings truly optional.
Regards,
Christian.
>
> Thanks
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-20 14:07 ` [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete Leon Romanovsky
2026-01-20 20:44 ` Matthew Brost
@ 2026-01-21 9:20 ` Christian König
2026-01-21 9:36 ` Thomas Hellström
2026-01-21 13:31 ` Jason Gunthorpe
1 sibling, 2 replies; 40+ messages in thread
From: Christian König @ 2026-01-21 9:20 UTC (permalink / raw)
To: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
On 1/20/26 15:07, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> wait until all affected objects have been fully invalidated.
>
> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Please also keep in mind that while this waits for all fences for correctness, you also need to keep the mapping valid until dma_buf_unmap_attachment() has been called.
In other words, you can only redirect the DMA addresses previously given out into nirvana (or dummy memory or similar), but you still need to avoid re-using them for something else.
Regards,
Christian.
> ---
> drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> index d4d0f7d08c53..33bc6a1909dd 100644
> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
> dma_resv_lock(priv->dmabuf->resv, NULL);
> priv->revoked = revoked;
> dma_buf_move_notify(priv->dmabuf);
> + dma_resv_wait_timeout(priv->dmabuf->resv,
> + DMA_RESV_USAGE_KERNEL, false,
> + MAX_SCHEDULE_TIMEOUT);
> dma_resv_unlock(priv->dmabuf->resv);
> }
> fput(priv->dmabuf->file);
> @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> priv->vdev = NULL;
> priv->revoked = true;
> dma_buf_move_notify(priv->dmabuf);
> + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
> + false, MAX_SCHEDULE_TIMEOUT);
> dma_resv_unlock(priv->dmabuf->resv);
> vfio_device_put_registration(&vdev->vdev);
> fput(priv->dmabuf->file);
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 9:20 ` Christian König
@ 2026-01-21 9:36 ` Thomas Hellström
2026-01-21 10:55 ` Christian König
2026-01-21 13:31 ` Jason Gunthorpe
1 sibling, 1 reply; 40+ messages in thread
From: Thomas Hellström @ 2026-01-21 9:36 UTC (permalink / raw)
To: Christian König, Leon Romanovsky, Sumit Semwal, Alex Deucher,
David Airlie, Simona Vetter, Gerd Hoffmann, Dmitry Osipenko,
Gurchetan Singh, Chia-I Wu, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Lucas De Marchi, Rodrigo Vivi, Jason Gunthorpe,
Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy,
Felix Kuehling, Alex Williamson, Ankit Agrawal, Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
Hi, Christian,
On Wed, 2026-01-21 at 10:20 +0100, Christian König wrote:
> On 1/20/26 15:07, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > dma-buf invalidation is performed asynchronously by hardware, so
> > VFIO must
> > wait until all affected objects have been fully invalidated.
> >
> > Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO
> > regions")
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Please also keep in mind that while this waits for all fences for
> correctness, you also need to keep the mapping valid until
> dma_buf_unmap_attachment() has been called.
I'm wondering: shouldn't we require DMA_RESV_USAGE_BOOKKEEP here, as
*any* unsignaled fence could indicate access through the map?
/Thomas
>
> In other words, you can only redirect the DMA addresses previously
> given out into nirvana (or dummy memory or similar), but you still
> need to avoid re-using them for something else.
>
> Regards,
> Christian.
>
> > ---
> > drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > index d4d0f7d08c53..33bc6a1909dd 100644
> > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct
> > vfio_pci_core_device *vdev, bool revoked)
> > dma_resv_lock(priv->dmabuf->resv, NULL);
> > priv->revoked = revoked;
> > dma_buf_move_notify(priv->dmabuf);
> > + dma_resv_wait_timeout(priv->dmabuf->resv,
> > +
> > DMA_RESV_USAGE_KERNEL, false,
> > +
> > MAX_SCHEDULE_TIMEOUT);
> > dma_resv_unlock(priv->dmabuf->resv);
> > }
> > fput(priv->dmabuf->file);
> > @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct
> > vfio_pci_core_device *vdev)
> > priv->vdev = NULL;
> > priv->revoked = true;
> > dma_buf_move_notify(priv->dmabuf);
> > + dma_resv_wait_timeout(priv->dmabuf->resv,
> > DMA_RESV_USAGE_KERNEL,
> > + false,
> > MAX_SCHEDULE_TIMEOUT);
> > dma_resv_unlock(priv->dmabuf->resv);
> > vfio_device_put_registration(&vdev->vdev);
> > fput(priv->dmabuf->file);
> >
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 2/7] dma-buf: Always build with DMABUF_MOVE_NOTIFY
2026-01-21 8:55 ` Christian König
@ 2026-01-21 10:14 ` Leon Romanovsky
2026-01-21 10:57 ` Christian König
0 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-21 10:14 UTC (permalink / raw)
To: Christian König
Cc: Sumit Semwal, Alex Deucher, David Airlie, Simona Vetter,
Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh, Chia-I Wu,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On Wed, Jan 21, 2026 at 09:55:38AM +0100, Christian König wrote:
> On 1/20/26 15:07, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > DMABUF_MOVE_NOTIFY was introduced in 2018 and has been marked as
> > experimental and disabled by default ever since. Six years later,
> > all new importers implement this callback.
> >
> > It is therefore reasonable to drop CONFIG_DMABUF_MOVE_NOTIFY and
> > always build DMABUF with support for it enabled.
> >
> > Suggested-by: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> > drivers/dma-buf/Kconfig | 12 ------------
> > drivers/dma-buf/dma-buf.c | 12 ++----------
> > drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 10 +++-------
> > drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
> > drivers/gpu/drm/xe/tests/xe_dma_buf.c | 3 +--
> > drivers/gpu/drm/xe/xe_dma_buf.c | 12 ++++--------
> > 6 files changed, 11 insertions(+), 40 deletions(-)
> >
> > diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig
> > index b46eb8a552d7..84d5e9b24e20 100644
> > --- a/drivers/dma-buf/Kconfig
> > +++ b/drivers/dma-buf/Kconfig
> > @@ -40,18 +40,6 @@ config UDMABUF
> > A driver to let userspace turn memfd regions into dma-bufs.
> > Qemu can use this to create host dmabufs for guest framebuffers.
> >
> > -config DMABUF_MOVE_NOTIFY
> > - bool "Move notify between drivers (EXPERIMENTAL)"
> > - default n
> > - depends on DMA_SHARED_BUFFER
> > - help
> > - Don't pin buffers if the dynamic DMA-buf interface is available on
> > - both the exporter as well as the importer. This fixes a security
> > - problem where userspace is able to pin unrestricted amounts of memory
> > - through DMA-buf.
> > - This is marked experimental because we don't yet have a consistent
> > - execution context and memory management between drivers.
> > -
> > config DMABUF_DEBUG
> > bool "DMA-BUF debug checks"
> > depends on DMA_SHARED_BUFFER
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index 59cc647bf40e..cd3b60ce4863 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -837,18 +837,10 @@ static void mangle_sg_table(struct sg_table *sg_table)
> >
> > }
> >
> > -static inline bool
> > -dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
>
> I would rather keep the wrapper and even add an explanation of what it means when true is returned.
We have different opinions here. I don't like single-line functions that
are called only twice, but I'll keep this function to keep the series
moving.
Thanks
>
> Apart from that looks good to me.
>
> Regards,
> Christian.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-20 20:44 ` Matthew Brost
2026-01-21 7:59 ` Leon Romanovsky
@ 2026-01-21 10:41 ` Christian König
2026-01-21 10:44 ` Leon Romanovsky
1 sibling, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-21 10:41 UTC (permalink / raw)
To: Matthew Brost, Leon Romanovsky
Cc: Sumit Semwal, Alex Deucher, David Airlie, Simona Vetter,
Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh, Chia-I Wu,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On 1/20/26 21:44, Matthew Brost wrote:
> On Tue, Jan 20, 2026 at 04:07:06PM +0200, Leon Romanovsky wrote:
>> From: Leon Romanovsky <leonro@nvidia.com>
>>
>> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
>> wait until all affected objects have been fully invalidated.
>>
>> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>> ---
>> drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
>> index d4d0f7d08c53..33bc6a1909dd 100644
>> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
>> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
>> @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
>> dma_resv_lock(priv->dmabuf->resv, NULL);
>> priv->revoked = revoked;
>> dma_buf_move_notify(priv->dmabuf);
>> + dma_resv_wait_timeout(priv->dmabuf->resv,
>> + DMA_RESV_USAGE_KERNEL, false,
>> + MAX_SCHEDULE_TIMEOUT);
>
> Should we explicitly call out in the dma_buf_move_notify() /
> invalidate_mappings kernel-doc that KERNEL slots are the mechanism
> for communicating asynchronous dma_buf_move_notify /
> invalidate_mappings events via fences?
Oh, I missed that! And no, that is not correct.
This should be DMA_RESV_USAGE_BOOKKEEP so that we wait for everything.
Regards,
Christian.
>
> Yes, this is probably implied, but it wouldn’t hurt to state this
> explicitly as part of the cross-driver contract.
>
> Here is what we have now:
>
> * - Dynamic importers should set fences for any access that they can't
> * disable immediately from their &dma_buf_attach_ops.invalidate_mappings
> * callback.
>
> Matt
>
>> dma_resv_unlock(priv->dmabuf->resv);
>> }
>> fput(priv->dmabuf->file);
>> @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
>> priv->vdev = NULL;
>> priv->revoked = true;
>> dma_buf_move_notify(priv->dmabuf);
>> + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
>> + false, MAX_SCHEDULE_TIMEOUT);
>> dma_resv_unlock(priv->dmabuf->resv);
>> vfio_device_put_registration(&vdev->vdev);
>> fput(priv->dmabuf->file);
>>
>> --
>> 2.52.0
>>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 10:41 ` Christian König
@ 2026-01-21 10:44 ` Leon Romanovsky
2026-01-21 17:18 ` Matthew Brost
0 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-21 10:44 UTC (permalink / raw)
To: Christian König
Cc: Matthew Brost, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On Wed, Jan 21, 2026 at 11:41:48AM +0100, Christian König wrote:
> On 1/20/26 21:44, Matthew Brost wrote:
> > On Tue, Jan 20, 2026 at 04:07:06PM +0200, Leon Romanovsky wrote:
> >> From: Leon Romanovsky <leonro@nvidia.com>
> >>
> >> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> >> wait until all affected objects have been fully invalidated.
> >>
> >> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> >> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> >> ---
> >> drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
> >> 1 file changed, 5 insertions(+)
> >>
> >> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> >> index d4d0f7d08c53..33bc6a1909dd 100644
> >> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> >> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> >> @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
> >> dma_resv_lock(priv->dmabuf->resv, NULL);
> >> priv->revoked = revoked;
> >> dma_buf_move_notify(priv->dmabuf);
> >> + dma_resv_wait_timeout(priv->dmabuf->resv,
> >> + DMA_RESV_USAGE_KERNEL, false,
> >> + MAX_SCHEDULE_TIMEOUT);
> >
> > Should we explicitly call out in the dma_buf_move_notify() /
> > invalidate_mappings kernel-doc that KERNEL slots are the mechanism
> > for communicating asynchronous dma_buf_move_notify /
> > invalidate_mappings events via fences?
>
> Oh, I missed that! And no that is not correct.
>
> This should be DMA_RESV_USAGE_BOOKKEEP so that we wait for everything.
Will change.
>
> Regards,
> Christian.
>
> >
> > Yes, this is probably implied, but it wouldn’t hurt to state this
> > explicitly as part of the cross-driver contract.
> >
> > Here is what we have now:
> >
> > * - Dynamic importers should set fences for any access that they can't
> > * disable immediately from their &dma_buf_attach_ops.invalidate_mappings
> > * callback.
> >
> > Matt
> >
> >> dma_resv_unlock(priv->dmabuf->resv);
> >> }
> >> fput(priv->dmabuf->file);
> >> @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> >> priv->vdev = NULL;
> >> priv->revoked = true;
> >> dma_buf_move_notify(priv->dmabuf);
> >> + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
> >> + false, MAX_SCHEDULE_TIMEOUT);
> >> dma_resv_unlock(priv->dmabuf->resv);
> >> vfio_device_put_registration(&vdev->vdev);
> >> fput(priv->dmabuf->file);
> >>
> >> --
> >> 2.52.0
> >>
>
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 9:36 ` Thomas Hellström
@ 2026-01-21 10:55 ` Christian König
0 siblings, 0 replies; 40+ messages in thread
From: Christian König @ 2026-01-21 10:55 UTC (permalink / raw)
To: Thomas Hellström, Leon Romanovsky, Sumit Semwal,
Alex Deucher, David Airlie, Simona Vetter, Gerd Hoffmann,
Dmitry Osipenko, Gurchetan Singh, Chia-I Wu, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, Lucas De Marchi, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy
Cc: linux-media, dri-devel, linaro-mm-sig, linux-kernel, amd-gfx,
virtualization, intel-xe, linux-rdma, iommu, kvm
On 1/21/26 10:36, Thomas Hellström wrote:
> Hi, Christian,
>
> On Wed, 2026-01-21 at 10:20 +0100, Christian König wrote:
>> On 1/20/26 15:07, Leon Romanovsky wrote:
>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>
>>> dma-buf invalidation is performed asynchronously by hardware, so
>>> VFIO must
>>> wait until all affected objects have been fully invalidated.
>>>
>>> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO
>>> regions")
>>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>
>> Please also keep in mind that while this waits for all fences for
>> correctness, you also need to keep the mapping valid until
>> dma_buf_unmap_attachment() has been called.
>
> I'm wondering shouldn't we require DMA_RESV_USAGE_BOOKKEEP here, as
> *any* unsignaled fence could indicate access through the map?
Yes, exactly that. I totally missed this detail.
Thanks a lot to Matthew and you for pointing this out.
Regards,
Christian.
>
> /Thomas
>
>>
>> In other words you can only redirect the DMA-addresses previously
>> given out into nirvana (or a dummy memory or similar), but you still
>> need to avoid re-using them for something else.
>>
>> Regards,
>> Christian.
>>
>>> ---
>>> drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
>>> 1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
>>> index d4d0f7d08c53..33bc6a1909dd 100644
>>> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
>>> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
>>> @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
>>> dma_resv_lock(priv->dmabuf->resv, NULL);
>>> priv->revoked = revoked;
>>> dma_buf_move_notify(priv->dmabuf);
>>> + dma_resv_wait_timeout(priv->dmabuf->resv,
>>> + DMA_RESV_USAGE_KERNEL, false,
>>> + MAX_SCHEDULE_TIMEOUT);
>>> dma_resv_unlock(priv->dmabuf->resv);
>>> }
>>> fput(priv->dmabuf->file);
>>> @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
>>> priv->vdev = NULL;
>>> priv->revoked = true;
>>> dma_buf_move_notify(priv->dmabuf);
>>> + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
>>> + false, MAX_SCHEDULE_TIMEOUT);
>>> dma_resv_unlock(priv->dmabuf->resv);
>>> vfio_device_put_registration(&vdev->vdev);
>>> fput(priv->dmabuf->file);
>>> dma_resv_unlock(priv->dmabuf->resv);
>>> vfio_device_put_registration(&vdev->vdev);
>>> fput(priv->dmabuf->file);
>>>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 2/7] dma-buf: Always build with DMABUF_MOVE_NOTIFY
2026-01-21 10:14 ` Leon Romanovsky
@ 2026-01-21 10:57 ` Christian König
0 siblings, 0 replies; 40+ messages in thread
From: Christian König @ 2026-01-21 10:57 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Sumit Semwal, Alex Deucher, David Airlie, Simona Vetter,
Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh, Chia-I Wu,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On 1/21/26 11:14, Leon Romanovsky wrote:
> On Wed, Jan 21, 2026 at 09:55:38AM +0100, Christian König wrote:
>> On 1/20/26 15:07, Leon Romanovsky wrote:
>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>
>>> DMABUF_MOVE_NOTIFY was introduced in 2018 and has been marked as
>>> experimental and disabled by default ever since. Six years later,
>>> all new importers implement this callback.
>>>
>>> It is therefore reasonable to drop CONFIG_DMABUF_MOVE_NOTIFY and
>>> always build DMABUF with support for it enabled.
>>>
>>> Suggested-by: Christian König <christian.koenig@amd.com>
>>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>>> ---
>>> drivers/dma-buf/Kconfig | 12 ------------
>>> drivers/dma-buf/dma-buf.c | 12 ++----------
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 10 +++-------
>>> drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
>>> drivers/gpu/drm/xe/tests/xe_dma_buf.c | 3 +--
>>> drivers/gpu/drm/xe/xe_dma_buf.c | 12 ++++--------
>>> 6 files changed, 11 insertions(+), 40 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig
>>> index b46eb8a552d7..84d5e9b24e20 100644
>>> --- a/drivers/dma-buf/Kconfig
>>> +++ b/drivers/dma-buf/Kconfig
>>> @@ -40,18 +40,6 @@ config UDMABUF
>>> A driver to let userspace turn memfd regions into dma-bufs.
>>> Qemu can use this to create host dmabufs for guest framebuffers.
>>>
>>> -config DMABUF_MOVE_NOTIFY
>>> - bool "Move notify between drivers (EXPERIMENTAL)"
>>> - default n
>>> - depends on DMA_SHARED_BUFFER
>>> - help
>>> - Don't pin buffers if the dynamic DMA-buf interface is available on
>>> - both the exporter as well as the importer. This fixes a security
>>> - problem where userspace is able to pin unrestricted amounts of memory
>>> - through DMA-buf.
>>> - This is marked experimental because we don't yet have a consistent
>>> - execution context and memory management between drivers.
>>> -
>>> config DMABUF_DEBUG
>>> bool "DMA-BUF debug checks"
>>> depends on DMA_SHARED_BUFFER
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index 59cc647bf40e..cd3b60ce4863 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -837,18 +837,10 @@ static void mangle_sg_table(struct sg_table *sg_table)
>>>
>>> }
>>>
>>> -static inline bool
>>> -dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
>>
>> I would rather like to keep the wrapper and even add some explanation of what it means when true is returned.
>
> We have a different opinion here. I don't like single-line functions which
> are called only twice. I'll keep this function to ensure the series
> makes progress.
Yeah, I agree with that, but I'd like to have the opportunity to document things.
Especially since the meaning has changed over time.
Thanks,
Christian.
>
> Thanks
>
>>
>> Apart from that looks good to me.
>>
>> Regards,
>> Christian.
^ permalink raw reply [flat|nested] 40+ messages in thread
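A documented version of the wrapper Christian wants to keep could look like the sketch below. It is a compilable userspace model, not the kernel code: the `_stub` types stand in for the real `struct dma_buf_attachment`, and the body mirrors the kernel's check against `attach->importer_ops`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stubbed-out stand-ins so the sketch compiles outside the kernel. */
struct dma_buf_attach_ops_stub {
	void (*invalidate_mappings)(void *attach);
};

struct dma_buf_attachment_stub {
	const struct dma_buf_attach_ops_stub *importer_ops;
};

/*
 * Returns true when the importer registered importer_ops, i.e. it can
 * react to invalidate_mappings() and the exporter may move (or revoke)
 * the backing storage while mappings exist. When false, the importer
 * relies on pinning and the mapping stays valid until it unmaps.
 */
static inline bool
dma_buf_attachment_is_dynamic(const struct dma_buf_attachment_stub *attach)
{
	return attach->importer_ops != NULL;
}
```

The point of keeping the one-liner is exactly the comment block: the meaning of "dynamic" has shifted over time, and the wrapper gives it one documented home.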
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-21 9:17 ` Christian König
@ 2026-01-21 13:18 ` Jason Gunthorpe
2026-01-21 13:52 ` Christian König
0 siblings, 1 reply; 40+ messages in thread
From: Jason Gunthorpe @ 2026-01-21 13:18 UTC (permalink / raw)
To: Christian König
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Wed, Jan 21, 2026 at 10:17:16AM +0100, Christian König wrote:
> The whole idea is to make invalidate_mappings truly optional.
But it's not really optional! Its absence means we are ignoring UAF
security issues when the exporters do their move_notify() and nothing
happens.
Given this I don't want to lose the warning log either; the situation
needs to be reported.
Jason
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 9:20 ` Christian König
2026-01-21 9:36 ` Thomas Hellström
@ 2026-01-21 13:31 ` Jason Gunthorpe
2026-01-21 15:28 ` Christian König
1 sibling, 1 reply; 40+ messages in thread
From: Jason Gunthorpe @ 2026-01-21 13:31 UTC (permalink / raw)
To: Christian König
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Wed, Jan 21, 2026 at 10:20:51AM +0100, Christian König wrote:
> On 1/20/26 15:07, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> > wait until all affected objects have been fully invalidated.
> >
> > Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Please also keep in mind that while this waits for all fences for
> correctness, you also need to keep the mapping valid until
> dma_buf_unmap_attachment() has been called.
Can you elaborate on this more?
I think what we want for dma_buf_attach_revocable() is the strong
guarantee that the importer stops doing all access to the memory once
this sequence is completed and the exporter can rely on it. I don't
think this works any other way.
This is already true for dynamic move capable importers, right?
For the non-revocable importers I can see the invalidate sequence is
more of an advisory thing and you can't know the access is gone until
the map is undone.
> In other words you can only redirect the DMA-addresses previously
> given out into nirvana (or a dummy memory or similar), but you still
> need to avoid re-using them for something else.
Does any driver do this? If you unload/reload a GPU driver, is it
going to re-use the addresses handed out?
Jason
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-21 13:18 ` Jason Gunthorpe
@ 2026-01-21 13:52 ` Christian König
2026-01-21 13:59 ` Jason Gunthorpe
0 siblings, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-21 13:52 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On 1/21/26 14:18, Jason Gunthorpe wrote:
> On Wed, Jan 21, 2026 at 10:17:16AM +0100, Christian König wrote:
>> The whole idea is to make invalidate_mappings truly optional.
>
> But it's not really optional! It's absence means we are ignoring UAF
> security issues when the exporters do their move_notify() and nothing
> happens.
No, that is unproblematic.
See, the invalidate_mappings callback just tells the importer that the mapping in question can't be relied on any more.
But the mapping is truly freed only by the importer calling dma_buf_unmap_attachment().
In other words, invalidate_mappings gives the signal to the importer to disable all operations, and dma_buf_unmap_attachment() is the signal from the importer that the housekeeping structures can be freed and the underlying address space or backing object re-used.
Regards,
Christian.
>
> Given this I don't want to lose the warning log either; the situation
> needs to be reported.
>
> Jason
^ permalink raw reply [flat|nested] 40+ messages in thread
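The two-phase contract Christian describes can be modeled in plain C. This is a userspace sketch under stated assumptions: the struct and function names below are illustrative stand-ins, not the actual dma-buf API; the two phases correspond to invalidate_mappings() and dma_buf_unmap_attachment().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one importer's view of an attachment. */
struct attachment_model {
	bool access_enabled;   /* importer may issue new DMA */
	bool mapping_valid;    /* exporter must keep the addresses reserved */
};

/* Phase 1: exporter calls invalidate_mappings() -> importer disables access. */
static void importer_invalidate(struct attachment_model *a)
{
	a->access_enabled = false;
	/* mapping_valid stays true: stray reads may still hit these addresses */
}

/* Phase 2: importer calls dma_buf_unmap_attachment() -> addresses reusable. */
static void importer_unmap(struct attachment_model *a)
{
	assert(!a->access_enabled); /* must have been invalidated first */
	a->mapping_valid = false;
}

/* Exporter may only recycle the address range once both phases have run. */
static bool exporter_may_reuse(const struct attachment_model *a)
{
	return !a->access_enabled && !a->mapping_valid;
}
```

After importer_invalidate() alone, exporter_may_reuse() is still false, which captures the point above: invalidation disables operations, but only the unmap releases the underlying address space or backing object.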
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-21 13:52 ` Christian König
@ 2026-01-21 13:59 ` Jason Gunthorpe
2026-01-21 14:15 ` Christian König
0 siblings, 1 reply; 40+ messages in thread
From: Jason Gunthorpe @ 2026-01-21 13:59 UTC (permalink / raw)
To: Christian König
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Wed, Jan 21, 2026 at 02:52:53PM +0100, Christian König wrote:
> On 1/21/26 14:18, Jason Gunthorpe wrote:
> > On Wed, Jan 21, 2026 at 10:17:16AM +0100, Christian König wrote:
> >> The whole idea is to make invalidate_mappings truly optional.
> >
> > But it's not really optional! Its absence means we are ignoring UAF
> > security issues when the exporters do their move_notify() and nothing
> > happens.
>
> No that is unproblematic.
>
> See the invalidate_mappings callback just tells the importer that
> the mapping in question can't be relied on any more.
>
> But the mapping is truly freed only by the importer calling
> dma_buf_unmap_attachment().
>
> In other words the invalidate_mappings gives the signal to the
> importer to disable all operations and the
> dma_buf_unmap_attachment() is the signal from the importer that the
> housekeeping structures can be freed and the underlying address
> space or backing object re-used.
I see.
Can we document this please? I haven't seen this scheme described
anywhere.
And let's clarify what I said in my other email that this new revoke
semantic is not just a signal to maybe someday unmap but a hard
barrier that it must be done once the fences complete, similar to
non-pinned importers.
The cover letter should be clarified with this understanding too.
Jason
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-21 13:59 ` Jason Gunthorpe
@ 2026-01-21 14:15 ` Christian König
2026-01-21 14:31 ` Leon Romanovsky
2026-01-21 15:39 ` Jason Gunthorpe
0 siblings, 2 replies; 40+ messages in thread
From: Christian König @ 2026-01-21 14:15 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On 1/21/26 14:59, Jason Gunthorpe wrote:
> On Wed, Jan 21, 2026 at 02:52:53PM +0100, Christian König wrote:
>> On 1/21/26 14:18, Jason Gunthorpe wrote:
>>> On Wed, Jan 21, 2026 at 10:17:16AM +0100, Christian König wrote:
>>>> The whole idea is to make invalidate_mappings truly optional.
>>>
>>> But it's not really optional! Its absence means we are ignoring UAF
>>> security issues when the exporters do their move_notify() and nothing
>>> happens.
>>
>> No that is unproblematic.
>>
>> See the invalidate_mappings callback just tells the importer that
>> the mapping in question can't be relied on any more.
>>
>> But the mapping is truly freed only by the importer calling
>> dma_buf_unmap_attachment().
>>
>> In other words the invalidate_mappings gives the signal to the
>> importer to disable all operations and the
>> dma_buf_unmap_attachment() is the signal from the importer that the
>> housekeeping structures can be freed and the underlying address
>> space or backing object re-used.
>
> I see
>
> Can we document this please? I haven't seen this scheme described
> anywhere.
>
> And let's clarify what I said in my other email that this new revoke
> semantic is not just a signal to maybe someday unmap but a hard
> barrier that it must be done once the fences complete, similar to
> non-pinned importers.
Well, I would avoid those semantics.
Even when the exporter requests the mapping to be invalidated, it does not mean that the mapping can go away immediately.
It's fine when accesses initiated after an invalidation, and then waiting for fences, go into nirvana and have undefined results, but they should not trigger PCI AER, warnings from the IOMMU or, even worse, end up in some MMIO BAR of a newly attached device.
So if the exporter wants to be 100% sure that nobody is using the mapping any more, it needs to wait for the importer to call dma_buf_unmap_attachment().
> The cover letter should be clarified with this understanding too.
Yeah, completely agree. We really need to flesh out those semantics in the documentation.
Regards,
Christian.
>
> Jason
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-21 14:15 ` Christian König
@ 2026-01-21 14:31 ` Leon Romanovsky
2026-01-21 15:39 ` Jason Gunthorpe
1 sibling, 0 replies; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-21 14:31 UTC (permalink / raw)
To: Christian König
Cc: Jason Gunthorpe, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Wed, Jan 21, 2026 at 03:15:46PM +0100, Christian König wrote:
> On 1/21/26 14:59, Jason Gunthorpe wrote:
> > On Wed, Jan 21, 2026 at 02:52:53PM +0100, Christian König wrote:
> >> On 1/21/26 14:18, Jason Gunthorpe wrote:
> >>> On Wed, Jan 21, 2026 at 10:17:16AM +0100, Christian König wrote:
> >>>> The whole idea is to make invalidate_mappings truly optional.
> >>>
> >>> But it's not really optional! Its absence means we are ignoring UAF
> >>> security issues when the exporters do their move_notify() and nothing
> >>> happens.
> >>
> >> No that is unproblematic.
> >>
> >> See the invalidate_mappings callback just tells the importer that
> >> the mapping in question can't be relied on any more.
> >>
> >> But the mapping is truly freed only by the importer calling
> >> dma_buf_unmap_attachment().
> >>
> >> In other words the invalidate_mappings gives the signal to the
> >> importer to disable all operations and the
> >> dma_buf_unmap_attachment() is the signal from the importer that the
> >> housekeeping structures can be freed and the underlying address
> >> space or backing object re-used.
> >
> > I see
> >
> > Can we document this please? I haven't seen this scheme described
> > anywhere.
> >
> > And let's clarify what I said in my other email that this new revoke
> > semantic is not just a signal to maybe someday unmap but a hard
> > barrier that it must be done once the fences complete, similar to
> > non-pinned importers.
>
> Well, I would avoid those semantics.
>
> Even when the exporter requests the mapping to be invalidated, it does not mean that the mapping can go away immediately.
>
> It's fine when accesses initiated after an invalidation, and then waiting for fences, go into nirvana and have undefined results, but they should not trigger PCI AER, warnings from the IOMMU or, even worse, end up in some MMIO BAR of a newly attached device.
>
> So if the exporter wants to be 100% sure that nobody is using the mapping any more, it needs to wait for the importer to call dma_buf_unmap_attachment().
>
> > The cover letter should be clarified with this understanding too.
>
> Yeah, completely agree. We really need to flesh out those semantics in the documentation.
Someone knowledgeable needs to document this properly, either in the code
or in the official documentation. A cover letter is not the right place for
subtle design decisions.
Thanks
>
> Regards,
> Christian.
>
> >
> > Jason
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 13:31 ` Jason Gunthorpe
@ 2026-01-21 15:28 ` Christian König
2026-01-21 16:01 ` Jason Gunthorpe
0 siblings, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-21 15:28 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On 1/21/26 14:31, Jason Gunthorpe wrote:
> On Wed, Jan 21, 2026 at 10:20:51AM +0100, Christian König wrote:
>> On 1/20/26 15:07, Leon Romanovsky wrote:
>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>
>>> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
>>> wait until all affected objects have been fully invalidated.
>>>
>>> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
>>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>
>> Please also keep in mind that while this waits for all fences for
>> correctness, you also need to keep the mapping valid until
>> dma_buf_unmap_attachment() has been called.
>
> Can you elaborate on this more?
>
> I think what we want for dma_buf_attach_revocable() is the strong
> guarantee that the importer stops doing all access to the memory once
> this sequence is completed and the exporter can rely on it. I don't
> think this works any other way.
>
> This is already true for dynamic move capable importers, right?
Not quite, no.
> For the non-revocable importers I can see the invalidate sequence is
> more of an advisory thing and you can't know the access is gone until
> the map is undone.
>
>> In other words you can only redirect the DMA-addresses previously
>> given out into nirvana (or a dummy memory or similar), but you still
>> need to avoid re-using them for something else.
>
> Does any driver do this? If you unload/reload a GPU driver, is it
> going to re-use the addresses handed out?
I never fully read through all the source code, but if I'm not completely mistaken that is enforced for all GPU drivers through the DMA-buf and DRM layer lifetime handling, and I think even in other in-kernel frameworks like V4L, ALSA, etc.
What roughly happens is that each DMA-buf mapping, through a couple of hoops, keeps a reference on the device, so even after a hotplug event the device can only fully go away after all housekeeping structures are destroyed and buffers freed.
Background is that a lot of devices still issue reads even after you have invalidated a mapping, but then discard the result.
So when you don't have that same grace period you end up with PCI AER, warnings from the IOMMU, random accesses to PCI BARs which just happen to be in the old location of something, etc.
I would rather like to keep those semantics even for forceful shootdowns since they have proved to be rather reliable.
Regards,
Christian.
>
> Jason
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case
2026-01-21 14:15 ` Christian König
2026-01-21 14:31 ` Leon Romanovsky
@ 2026-01-21 15:39 ` Jason Gunthorpe
1 sibling, 0 replies; 40+ messages in thread
From: Jason Gunthorpe @ 2026-01-21 15:39 UTC (permalink / raw)
To: Christian König
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Wed, Jan 21, 2026 at 03:15:46PM +0100, Christian König wrote:
> > And let's clarify what I said in my other email that this new revoke
> > semantic is not just a signal to maybe someday unmap but a hard
> > barrier that it must be done once the fences complete, similar to
> > non-pinned importers.
>
> Well, I would avoid that semantics.
>
> Even when the exporter requests the mapping to be invalidated it
> does not mean that the mapping can go away immediately.
>
> It's fine when accesses initiated after an invalidation and then
> waiting for fences go into nirvana and have undefined results, but
> they should not trigger PCI AER, warnings from the IOMMU or even
> worse end up in some MMIO BAR of a newly attached devices.
So what's the purpose of the fence if accesses can continue after
waiting for fences?
If we always have to wait for the unmap call, is the importer allowed
to call unmap while its own fences are outstanding?
> So if the exporter wants to be 100% sure that nobody is using the
> mapping any more then it needs to wait for the importer to call
> dma_buf_unmap_attachment().
We are trying to introduce this new idea called "revoke".
Revoke means the exporter does some defined sequence and after the end
of that sequence it knows there are no further DMA or CPU accesses to
its memory at all.
It has to happen in bounded time, so it can't get entangled with
waiting for userspace to do something (e.g. importer unmap via an ioctl).
It has to be an absolute statement because the VFIO and RDMA exporter
use cases can trigger UAFs and AERs if importers keep accessing.
So, what exactly should the export sequence be? We were proposing to
call invalidate_mapping() and when it returns there is no access.
The fence is missing, so now the sequences includes wait for the
fences.
And now you are saying we have to wait for all unmaps? Not only wait
for the unmaps, but the importers now also must call unmap as part of
their invalidate_mapping() callback. Is that OK? Do existing
importers do that?
If all the above are yes, then let's explicitly document that this is the
required sequence and we can try to make it work. Please say, because
we just don't know and keep getting surprised :)
Thanks,
Jason
^ permalink raw reply [flat|nested] 40+ messages in thread
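The exporter sequence Jason is asking to pin down can be sketched as a userspace model. The names here are illustrative assumptions: in the kernel the two steps would be dma_buf_move_notify() and dma_resv_wait_timeout() under the resv lock, and the "drain" fence stands in for whatever fences the importer installs for in-flight DMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fence: signaled once the importer's in-flight DMA drains. */
struct fence_model {
	bool signaled;
};

struct importer_model {
	bool accessing;            /* importer still issuing new DMA */
	struct fence_model drain;  /* covers DMA already in flight */
};

/* Importer's invalidate callback: stop new access, leave a fence for old. */
static void invalidate_cb(struct importer_model *imp)
{
	imp->accessing = false;
}

/* Hardware eventually drains outstanding work and signals the fence. */
static void hw_drain(struct importer_model *imp)
{
	imp->drain.signaled = true;
}

/*
 * Exporter revoke sequence: notify every importer, then check all fences.
 * Only when both steps complete is "no further access" guaranteed; a real
 * exporter would block on the fences instead of returning false.
 */
static bool exporter_revoke(struct importer_model *imps, size_t n)
{
	for (size_t i = 0; i < n; i++)
		invalidate_cb(&imps[i]);        /* ~ dma_buf_move_notify() */
	for (size_t i = 0; i < n; i++)
		if (!imps[i].drain.signaled)    /* ~ dma_resv_wait_timeout() */
			return false;
	return true;                            /* revoke complete */
}
```

Christian's objection above is precisely that this model is incomplete: even after both steps, the address range must stay reserved until the importer's unmap, so the fence wait bounds access, not address reuse.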
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 15:28 ` Christian König
@ 2026-01-21 16:01 ` Jason Gunthorpe
2026-01-21 19:45 ` Leon Romanovsky
2026-01-22 11:32 ` Christian König
0 siblings, 2 replies; 40+ messages in thread
From: Jason Gunthorpe @ 2026-01-21 16:01 UTC (permalink / raw)
To: Christian König
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Wed, Jan 21, 2026 at 04:28:17PM +0100, Christian König wrote:
> On 1/21/26 14:31, Jason Gunthorpe wrote:
> > On Wed, Jan 21, 2026 at 10:20:51AM +0100, Christian König wrote:
> >> On 1/20/26 15:07, Leon Romanovsky wrote:
> >>> From: Leon Romanovsky <leonro@nvidia.com>
> >>>
> >>> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> >>> wait until all affected objects have been fully invalidated.
> >>>
> >>> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> >>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> >>
> >> Reviewed-by: Christian König <christian.koenig@amd.com>
> >>
> >> Please also keep in mind that while this waits for all fences for
> >> correctness, you also need to keep the mapping valid until
> >> dma_buf_unmap_attachment() has been called.
> >
> > Can you elaborate on this more?
> >
> > I think what we want for dma_buf_attach_revocable() is the strong
> > guarantee that the importer stops doing all access to the memory once
> > this sequence is completed and the exporter can rely on it. I don't
> > think this works any other way.
> >
> > This is already true for dynamic move capable importers, right?
>
> Not quite, no.
:(
It is kind of shocking to hear these APIs work like this with such a
loose lifetime definition. Leon, can you include some of these details
in the new comments?
> >> In other words you can only redirect the DMA-addresses previously
> >> given out into nirvana (or a dummy memory or similar), but you still
> >> need to avoid re-using them for something else.
> >
> > Does any driver do this? If you unload/reload a GPU driver, is it
> > going to re-use the addresses handed out?
>
> I never fully read through all the source code, but if I'm not
> completely mistaken that is enforced for all GPU drivers through the
> DMA-buf and DRM layer lifetime handling, and I think even in other
> in-kernel frameworks like V4L, ALSA, etc.
> What roughly happens is that each DMA-buf mapping through a couple
> of hoops keeps a reference on the device, so even after a hotplug
> event the device can only fully go away after all housekeeping
> structures are destroyed and buffers freed.
A simple reference on the device means nothing for these kinds of
questions. It does not stop unloading and reloading a driver.
Obviously if the driver is loaded fresh it will reallocate.
To do what you are saying the DRM drivers would have to block during
driver remove until all unmaps happen.
> Background is that a lot of devices still issue reads even after you
> have invalidated a mapping, but then discard the result.
And they also don't insert fences to conclude that?
> So when you don't have that same grace period you end up with PCI AER,
> warnings from IOMMU, random accesses to PCI BARs which just happen
> to be in the old location of something etc...
Yes, definitely. It is very important to have a definitive point in
the API where all accesses stop. While "read but discard" seems
harmless on the surface, there are corner cases where it is not OK.
Am I understanding right that these devices must finish their reads
before doing unmap?
> I would rather like to keep those semantics even for forceful
> shootdowns since they have proved to be rather reliable.
We can investigate making unmap the barrier point if this is the case.
Jason
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 10:44 ` Leon Romanovsky
@ 2026-01-21 17:18 ` Matthew Brost
0 siblings, 0 replies; 40+ messages in thread
From: Matthew Brost @ 2026-01-21 17:18 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Christian König, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
Robin Murphy, Felix Kuehling, Alex Williamson, Ankit Agrawal,
Vivek Kasireddy, linux-media, dri-devel, linaro-mm-sig,
linux-kernel, amd-gfx, virtualization, intel-xe, linux-rdma,
iommu, kvm
On Wed, Jan 21, 2026 at 12:44:51PM +0200, Leon Romanovsky wrote:
> On Wed, Jan 21, 2026 at 11:41:48AM +0100, Christian König wrote:
> > On 1/20/26 21:44, Matthew Brost wrote:
> > > On Tue, Jan 20, 2026 at 04:07:06PM +0200, Leon Romanovsky wrote:
> > >> From: Leon Romanovsky <leonro@nvidia.com>
> > >>
> > >> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> > >> wait until all affected objects have been fully invalidated.
> > >>
> > >> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> > >> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > >> ---
> > >> drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
> > >> 1 file changed, 5 insertions(+)
> > >>
> > >> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > >> index d4d0f7d08c53..33bc6a1909dd 100644
> > >> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > >> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > >> @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
> > >> dma_resv_lock(priv->dmabuf->resv, NULL);
> > >> priv->revoked = revoked;
> > >> dma_buf_move_notify(priv->dmabuf);
> > >> + dma_resv_wait_timeout(priv->dmabuf->resv,
> > >> + DMA_RESV_USAGE_KERNEL, false,
> > >> + MAX_SCHEDULE_TIMEOUT);
> > >
> > > Should we explicitly call out in the dma_buf_move_notify() /
> > > invalidate_mappings kernel-doc that KERNEL slots are the mechanism
> > > for communicating asynchronous dma_buf_move_notify /
> > > invalidate_mappings events via fences?
> >
> > Oh, I missed that! And no, that is not correct.
> >
+1 on DMA_RESV_USAGE_BOOKKEEP. After I typed my original response, I
reasoned that we have to wait for all fences. For example, preempt fences
in GPU drivers are in BOOKKEEP, and you'd certainly have to wait on those
for move notify to be considered complete. Likewise, a user-issued unbind
or TLB invalidation fence would typically be in BOOKKEEP as well, which
again would need to be waited on.
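The BOOKKEEP-vs-KERNEL distinction can be illustrated with a small
userspace toy model (this is not kernel code; toy_fence, toy_usage and
toy_resv_all_signaled are invented names for illustration). dma_resv
usage classes are ordered, so a wait with DMA_RESV_USAGE_BOOKKEEP, the
widest class, also covers KERNEL/WRITE/READ fences, while a KERNEL-only
wait misses BOOKKEEP fences such as preempt or unbind fences:

```c
/* Toy userspace model of dma_resv usage classes -- NOT kernel code. */
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the ordering of enum dma_resv_usage: KERNEL is the
 * narrowest class, BOOKKEEP the widest. */
enum toy_usage {
	TOY_USAGE_KERNEL = 0,
	TOY_USAGE_WRITE,
	TOY_USAGE_READ,
	TOY_USAGE_BOOKKEEP,
};

struct toy_fence {
	enum toy_usage usage;
	bool signaled;
};

/* "Wait" with a given usage class: every fence whose class is at
 * most 'usage' must be signaled.  Waiting with TOY_USAGE_BOOKKEEP
 * therefore requires *all* fences to be signaled. */
static bool toy_resv_all_signaled(const struct toy_fence *f, size_t n,
				  enum toy_usage usage)
{
	for (size_t i = 0; i < n; i++)
		if (f[i].usage <= usage && !f[i].signaled)
			return false;
	return true;
}
```

With a signaled KERNEL fence and an unsignaled BOOKKEEP fence, a
KERNEL-only wait reports completion while a BOOKKEEP wait correctly
does not, which is exactly the bug the review caught.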
Matt
> > This should be DMA_RESV_USAGE_BOOKKEEP so that we wait for everything.
>
> Will change.
>
> >
> > Regards,
> > Christian.
> >
> > >
> > > Yes, this is probably implied, but it wouldn’t hurt to state this
> > > explicitly as part of the cross-driver contract.
> > >
> > > Here is what we have now:
> > >
> > > * - Dynamic importers should set fences for any access that they can't
> > > * disable immediately from their &dma_buf_attach_ops.invalidate_mappings
> > > * callback.
> > >
> > > Matt
> > >
> > >> dma_resv_unlock(priv->dmabuf->resv);
> > >> }
> > >> fput(priv->dmabuf->file);
> > >> @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> > >> priv->vdev = NULL;
> > >> priv->revoked = true;
> > >> dma_buf_move_notify(priv->dmabuf);
> > >> + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
> > >> + false, MAX_SCHEDULE_TIMEOUT);
> > >> dma_resv_unlock(priv->dmabuf->resv);
> > >> vfio_device_put_registration(&vdev->vdev);
> > >> fput(priv->dmabuf->file);
> > >>
> > >> --
> > >> 2.52.0
> > >>
> >
> >
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 16:01 ` Jason Gunthorpe
@ 2026-01-21 19:45 ` Leon Romanovsky
2026-01-22 11:32 ` Christian König
1 sibling, 0 replies; 40+ messages in thread
From: Leon Romanovsky @ 2026-01-21 19:45 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Christian König, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Wed, Jan 21, 2026 at 12:01:40PM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 21, 2026 at 04:28:17PM +0100, Christian König wrote:
> > On 1/21/26 14:31, Jason Gunthorpe wrote:
> > > On Wed, Jan 21, 2026 at 10:20:51AM +0100, Christian König wrote:
> > >> On 1/20/26 15:07, Leon Romanovsky wrote:
> > >>> From: Leon Romanovsky <leonro@nvidia.com>
> > >>>
> > >>> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> > >>> wait until all affected objects have been fully invalidated.
> > >>>
> > >>> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> > >>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > >>
> > >> Reviewed-by: Christian König <christian.koenig@amd.com>
> > >>
> > >> Please also keep in mind that while this waits for all fences for
> > >> correctness, you also need to keep the mapping valid until
> > >> dma_buf_unmap_attachment() has been called.
> > >
> > > Can you elaborate on this more?
> > >
> > > I think what we want for dma_buf_attach_revocable() is the strong
> > > guarantee that the importer stops doing all access to the memory once
> > > this sequence is completed and the exporter can rely on it. I don't
> > > think this works any other way.
> > >
> > > This is already true for dynamic move capable importers, right?
> >
> > Not quite, no.
>
> :(
>
> It is kind of shocking to hear these APIs work like this with such a
> loose lifetime definition. Leon, can you include some of these details
> in the new comments?
If we can clarify what needs to be addressed for v5, I will proceed.
At the moment, it's still unclear what is missing in v4.
Thanks
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-21 16:01 ` Jason Gunthorpe
2026-01-21 19:45 ` Leon Romanovsky
@ 2026-01-22 11:32 ` Christian König
2026-01-22 23:44 ` Jason Gunthorpe
1 sibling, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-22 11:32 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On 1/21/26 17:01, Jason Gunthorpe wrote:
> On Wed, Jan 21, 2026 at 04:28:17PM +0100, Christian König wrote:
>> On 1/21/26 14:31, Jason Gunthorpe wrote:
>>> On Wed, Jan 21, 2026 at 10:20:51AM +0100, Christian König wrote:
>>>> On 1/20/26 15:07, Leon Romanovsky wrote:
>>>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>>>
>>>>> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
>>>>> wait until all affected objects have been fully invalidated.
>>>>>
>>>>> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
>>>>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>>>>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>
>>>> Please also keep in mind that while this waits for all fences for
>>>> correctness, you also need to keep the mapping valid until
>>>> dma_buf_unmap_attachment() has been called.
>>>
>>> Can you elaborate on this more?
>>>
>>> I think what we want for dma_buf_attach_revocable() is the strong
>>> guarantee that the importer stops doing all access to the memory once
>>> this sequence is completed and the exporter can rely on it. I don't
>>> think this works any other way.
>>>
>>> This is already true for dynamic move capable importers, right?
>>
>> Not quite, no.
>
> :(
>
> It is kind of shocking to hear these APIs work like this with such a
> loose lifetime definition. Leon, can you include some of these details
> in the new comments?
Yeah, when the API was designed we intentionally said that waiting for the fences means waiting for all operations to finish.
But then came reality, where HW just does stuff like speculatively reading ahead... and with that all the nice design goes into the trash bin.
>>>> In other words you can only redirect the DMA-addresses previously
>>>> given out into nirvana (or a dummy memory or similar), but you still
>>>> need to avoid re-using them for something else.
>>>
>>> Does any driver do this? If you unload/reload a GPU driver it is
>>> going to re-use the addresses handed out?
>>
>> I never fully read through all the source code, but if I'm not
>> completely mistaken that is enforced for all GPU drivers through the
>> DMA-buf and DRM layer lifetime handling and I think even in other in
>> kernel frameworks like V4L, alsa etc...
>
>> What roughly happens is that each DMA-buf mapping through a couple
>> of hoops keeps a reference on the device, so even after a hotplug
>> event the device can only fully go away after all housekeeping
>> structures are destroyed and buffers freed.
>
> A simple reference on the device means nothing for these kinds of
> questions. It does not stop unloading and reloading a driver.
Well, as far as I know it stops the PCIe address space from being re-used.
So when you do an "echo 1 > remove" and then a re-scan on the upstream bridge, that works, but you get different addresses for your MMIO BARs!
> Obviously if the driver is loaded fresh it will reallocate.
>
> To do what you are saying the DRM drivers would have to block during
> driver remove until all unmaps happen.
Oh, well, I never looked too deeply into that.
As far as I know it doesn't block, but rather the last drm_dev_put() just cleans things up.
And we have a CI test system which exercises that stuff over and over again because we have a big customer depending on that.
>> Background is that a lot of device still make reads even after you
>> have invalidated a mapping, but then discard the result.
>
> And they also don't insert fences to conclude that?
Nope, that is just speculative read-ahead from other operations which actually don't have anything to do with our buffer.
>> So when you don't have same grace period you end up with PCI AER,
>> warnings from IOMMU, random accesses to PCI BARs which just happen
>> to be in the old location of something etc...
>
> Yes, definitely. It is very important to have a definitive point in
> the API where all accesses stop. While "read but discard" seems
> harmless on the surface, there are corner cases where it is not OK.
>
> Am I understanding right that these devices must finish their reads
> before doing unmap?
Yes, and that is a big one. Otherwise we basically lose any chance of sanely handling this.
>> I would rather like to keep those semantics even for forceful
>> shootdowns since they proved to be rather reliable.
>
> We can investigate making unmap the barrier point if this is the case.
I mean, when you absolutely can't do it otherwise, just make sure that a speculative read doesn't result in any form of error message, triggered action or similar. That approach works as well.
And yes we absolutely have to document all those findings and behavior in the DMA-buf API.
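The "redirect into nirvana but don't reuse" rule can be sketched as a
userspace toy model (invented names throughout; a real exporter would
repoint IOMMU PTEs at dummy backing rather than track slot indices).
After revoke, a stale address resolves to harmlessly discardable dummy
data, and it only becomes eligible for reuse once the importer has
unmapped:

```c
/* Toy model: the exporter hands out "bus addresses" (slot indices).
 * Revoke redirects a slot to dummy backing but keeps the address
 * reserved; only unmap releases it for reuse.  NOT kernel code. */
#include <stdbool.h>

#define TOY_SLOTS 4

enum toy_state { TOY_FREE, TOY_MAPPED, TOY_REVOKED };

struct toy_exporter {
	enum toy_state slot[TOY_SLOTS];
};

/* Hand out the lowest free "bus address", or -1 if none. */
static int toy_map(struct toy_exporter *e)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		if (e->slot[i] == TOY_FREE) {
			e->slot[i] = TOY_MAPPED;
			return i;
		}
	return -1;
}

/* Revoke: speculative reads now hit dummy backing, but the address
 * stays reserved so it cannot alias a new mapping. */
static void toy_revoke(struct toy_exporter *e, int addr)
{
	if (e->slot[addr] == TOY_MAPPED)
		e->slot[addr] = TOY_REVOKED;
}

/* A late read against a revoked address is discarded, not aliased. */
static bool toy_read_hits_dummy(const struct toy_exporter *e, int addr)
{
	return e->slot[addr] == TOY_REVOKED;
}

/* Only the dma_buf_unmap_attachment() equivalent frees the slot. */
static void toy_unmap(struct toy_exporter *e, int addr)
{
	e->slot[addr] = TOY_FREE;
}
```

The key property is that toy_map() skips TOY_REVOKED slots, so a
speculative read after revoke can never land on memory handed to a new
user of the same address.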
Regards,
Christian.
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-22 11:32 ` Christian König
@ 2026-01-22 23:44 ` Jason Gunthorpe
2026-01-23 14:11 ` Jason Gunthorpe
0 siblings, 1 reply; 40+ messages in thread
From: Jason Gunthorpe @ 2026-01-22 23:44 UTC (permalink / raw)
To: Christian König
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Thu, Jan 22, 2026 at 12:32:03PM +0100, Christian König wrote:
> >> What roughly happens is that each DMA-buf mapping through a couple
> >> of hoops keeps a reference on the device, so even after a hotplug
> >> event the device can only fully go away after all housekeeping
> >> structures are destroyed and buffers freed.
> >
> > A simple reference on the device means nothing for these kinds of
> > questions. It does not stop unloading and reloading a driver.
>
> Well as far as I know it stops the PCIe address space from being re-used.
>
> So when you do an "echo 1 > remove" and then a re-scan on the
> upstream bridge, that works, but you get different addresses for your
> MMIO BARs!
That's a pretty niche scenario... Most people don't rescan their PCI
bus. If you just do rmmod/insmod then it will be re-used; there is no
rescan to move the MMIO around in that case.
> Oh, well I never looked too deeply into that.
>
> As far as I know it doesn't block, but rather the last drm_dev_put()
> just cleans things up.
>
> And we have a CI test system which exercises that stuff over and
> over again because we have a big customer depending on that.
I doubt a CI would detect a UAF like the one we are discussing here...
Connect an RDMA pinned importer. Do rmmod. If rmmod doesn't hang, the
driver has a UAF in some RAS cases. Not great, but it is unlikely to
actually trouble any real user.
Jason
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-22 23:44 ` Jason Gunthorpe
@ 2026-01-23 14:11 ` Jason Gunthorpe
2026-01-23 16:23 ` Christian König
0 siblings, 1 reply; 40+ messages in thread
From: Jason Gunthorpe @ 2026-01-23 14:11 UTC (permalink / raw)
To: Christian König
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Thu, Jan 22, 2026 at 07:44:04PM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 22, 2026 at 12:32:03PM +0100, Christian König wrote:
> > >> What roughly happens is that each DMA-buf mapping through a couple
> > >> of hoops keeps a reference on the device, so even after a hotplug
> > >> event the device can only fully go away after all housekeeping
> > >> structures are destroyed and buffers freed.
> > >
> > > A simple reference on the device means nothing for these kinds of
> > > questions. It does not stop unloading and reloading a driver.
> >
> > Well as far as I know it stops the PCIe address space from being re-used.
> >
> > So when you do an "echo 1 > remove" and then a re-scan on the
> > upstream bridge, that works, but you get different addresses for your
> > MMIO BARs!
>
> That's a pretty niche scenario... Most people don't rescan their PCI
> bus. If you just do rmmod/insmod then it will be re-used; there is no
> rescan to move the MMIO around in that case.
Ah I just remembered there is another important detail here.
It is illegal to call the DMA API after your driver is unprobed. The
kernel can oops. So if a driver is allowing remove() to complete
before all the dma_buf_unmaps have been called, it is buggy and risks
an oops.
https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/#m0c7dda0fb5981240879c5ca489176987d688844c
Calling dma_buf_unmap() -> dma_unmap_sg() after remove() returns
is not allowed.
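That rule implies the exporter's remove path must not complete while
mappings are live. A toy refcount sketch of this, not kernel code (a
real driver would block on a completion rather than poll, and the names
here are invented):

```c
/* Toy model: the exporter counts live dma-buf mappings; remove()
 * may only complete once the count hits zero.  NOT kernel code. */
#include <stdbool.h>

struct toy_exporter {
	int live_maps;	/* map_attachment()s minus unmaps */
	bool removed;	/* remove() has completed */
};

static void toy_map(struct toy_exporter *e)
{
	e->live_maps++;
}

static void toy_unmap(struct toy_exporter *e)
{
	if (e->live_maps > 0)
		e->live_maps--;
}

/* remove(): may only complete once no importer can still reach the
 * DMA API through this device.  Returns false to model "still
 * blocked waiting for importers to unmap". */
static bool toy_try_remove(struct toy_exporter *e)
{
	if (e->live_maps > 0)
		return false;
	e->removed = true;
	return true;
}
```

In this model remove() is held off while any mapping is outstanding,
which is the behavior an exporting driver needs so that the
dma_unmap_sg() calls issued from dma_buf_unmap() always land before
remove() returns.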
Jason
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-23 14:11 ` Jason Gunthorpe
@ 2026-01-23 16:23 ` Christian König
2026-01-23 16:31 ` Jason Gunthorpe
0 siblings, 1 reply; 40+ messages in thread
From: Christian König @ 2026-01-23 16:23 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On 1/23/26 15:11, Jason Gunthorpe wrote:
> On Thu, Jan 22, 2026 at 07:44:04PM -0400, Jason Gunthorpe wrote:
>> On Thu, Jan 22, 2026 at 12:32:03PM +0100, Christian König wrote:
>>>>> What roughly happens is that each DMA-buf mapping through a couple
>>>>> of hoops keeps a reference on the device, so even after a hotplug
>>>>> event the device can only fully go away after all housekeeping
>>>>> structures are destroyed and buffers freed.
>>>>
>>>> A simple reference on the device means nothing for these kinds of
>>>> questions. It does not stop unloading and reloading a driver.
>>>
>>> Well as far as I know it stops the PCIe address space from being re-used.
>>>
>>> So when you do an "echo 1 > remove" and then a re-scan on the
>>> upstream bridge, that works, but you get different addresses for your
>>> MMIO BARs!
>>
>> That's a pretty niche scenario... Most people don't rescan their PCI
>> bus. If you just do rmmod/insmod then it will be re-used; there is no
>> rescan to move the MMIO around in that case.
>
> Ah I just remembered there is another important detail here.
>
> It is illegal to call the DMA API after your driver is unprobed. The
> kernel can oops. So if a driver is allowing remove() to complete
> before all the dma_buf_unmaps have been called it is buggy and risks
> an oops.
>
> https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/#m0c7dda0fb5981240879c5ca489176987d688844c
>
> Calling dma_buf_unmap() -> dma_unmap_sg() after remove() returns
> is not allowed.
That is not even in the hands of the driver. The DMA-buf framework itself does a module_get() on the exporter.
So as long as a DMA-buf exists you *can't* rmmod the module which provides the exporting driver (except, of course, for force unloading).
Revoking the DMA mappings won't change anything about that; the importer needs to stop using the DMA-buf and drop all its references.
Christian.
>
> Jason
* Re: [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete
2026-01-23 16:23 ` Christian König
@ 2026-01-23 16:31 ` Jason Gunthorpe
0 siblings, 0 replies; 40+ messages in thread
From: Jason Gunthorpe @ 2026-01-23 16:31 UTC (permalink / raw)
To: Christian König
Cc: Leon Romanovsky, Sumit Semwal, Alex Deucher, David Airlie,
Simona Vetter, Gerd Hoffmann, Dmitry Osipenko, Gurchetan Singh,
Chia-I Wu, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Kevin Tian,
Joerg Roedel, Will Deacon, Robin Murphy, Felix Kuehling,
Alex Williamson, Ankit Agrawal, Vivek Kasireddy, linux-media,
dri-devel, linaro-mm-sig, linux-kernel, amd-gfx, virtualization,
intel-xe, linux-rdma, iommu, kvm
On Fri, Jan 23, 2026 at 05:23:34PM +0100, Christian König wrote:
> > It is illegal to call the DMA API after your driver is unprobed. The
> > kernel can oops. So if a driver is allowing remove() to complete
> > before all the dma_buf_unmaps have been called it is buggy and risks
> > an oops.
> >
> > https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/#m0c7dda0fb5981240879c5ca489176987d688844c
> >
> > Calling dma_buf_unmap() -> dma_unmap_sg() after remove() returns
> > is not allowed.
>
> That is not even in the hands of the driver. The DMA-buf framework
> itself does a module_get() on the exporter.
module_get() prevents the module from being unloaded. It does not
prevent the user from using /sys/../unbind or various other ways to
remove the driver from the device.
rmmod is a popular way to trigger remove() on a driver but not the
only way, and you can't point to a module_get() to dismiss issues with
driver remove() correctness.
> Revoking the DMA mappings won't change anything on that, the
> importer needs to stop using the DMA-buf and drop all their
> references.
And to be correct, an exporting driver needs to wait in its remove()
function until all the unmaps are done.
Jason
end of thread, other threads:[~2026-01-23 16:31 UTC | newest]
Thread overview: 40+ messages
2026-01-20 14:07 [PATCH v3 0/7] dma-buf: Use revoke mechanism to invalidate shared buffers Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 1/7] dma-buf: Rename .move_notify() callback to a clearer identifier Leon Romanovsky
2026-01-21 8:33 ` Christian König
2026-01-21 8:41 ` Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 2/7] dma-buf: Always build with DMABUF_MOVE_NOTIFY Leon Romanovsky
2026-01-21 8:55 ` Christian König
2026-01-21 10:14 ` Leon Romanovsky
2026-01-21 10:57 ` Christian König
2026-01-20 14:07 ` [PATCH v3 3/7] dma-buf: Document RDMA non-ODP invalidate_mapping() special case Leon Romanovsky
2026-01-21 8:59 ` Christian König
2026-01-21 9:14 ` Leon Romanovsky
2026-01-21 9:17 ` Christian König
2026-01-21 13:18 ` Jason Gunthorpe
2026-01-21 13:52 ` Christian König
2026-01-21 13:59 ` Jason Gunthorpe
2026-01-21 14:15 ` Christian König
2026-01-21 14:31 ` Leon Romanovsky
2026-01-21 15:39 ` Jason Gunthorpe
2026-01-20 14:07 ` [PATCH v3 4/7] dma-buf: Add check function for revoke semantics Leon Romanovsky
2026-01-20 14:07 ` [PATCH v3 5/7] iommufd: Pin dma-buf importer " Leon Romanovsky
2026-01-21 9:01 ` Christian König
2026-01-20 14:07 ` [PATCH v3 6/7] vfio: Wait for dma-buf invalidation to complete Leon Romanovsky
2026-01-20 20:44 ` Matthew Brost
2026-01-21 7:59 ` Leon Romanovsky
2026-01-21 10:41 ` Christian König
2026-01-21 10:44 ` Leon Romanovsky
2026-01-21 17:18 ` Matthew Brost
2026-01-21 9:20 ` Christian König
2026-01-21 9:36 ` Thomas Hellström
2026-01-21 10:55 ` Christian König
2026-01-21 13:31 ` Jason Gunthorpe
2026-01-21 15:28 ` Christian König
2026-01-21 16:01 ` Jason Gunthorpe
2026-01-21 19:45 ` Leon Romanovsky
2026-01-22 11:32 ` Christian König
2026-01-22 23:44 ` Jason Gunthorpe
2026-01-23 14:11 ` Jason Gunthorpe
2026-01-23 16:23 ` Christian König
2026-01-23 16:31 ` Jason Gunthorpe
2026-01-20 14:07 ` [PATCH v3 7/7] vfio: Validate dma-buf revocation semantics Leon Romanovsky