* [PATCH v2 1/2] drm/xe/dma-buf: handle empty bo and UAF races
@ 2026-05-07 11:55 Matthew Auld
2026-05-07 11:55 ` [PATCH v2 2/2] drm/xe/dma-buf: fix UAF with retry loop Matthew Auld
0 siblings, 1 reply; 4+ messages in thread
From: Matthew Auld @ 2026-05-07 11:55 UTC (permalink / raw)
To: intel-xe; +Cc: Thomas Hellström, Matthew Brost, stable
There appear to be some nasty races here when triggering the
invalidate_mappings hook:

1) We do xe_bo_alloc() followed by the attach, before the actual full bo
init step in xe_dma_buf_init_obj(). However, the bo is already visible
on the attachments list after the attach. This is bad, since the
exporter driver, say amdgpu, can at any time call back into our
invalidate_mappings hook with an empty/bogus bo, leading to potential
bugs/crashes.
2) Similar to 1), but here we get a UAF when the invalidate_mappings
hook is triggered. For example, we get as far as xe_bo_init_locked(),
but this fails in some way. The bo is then freed on the error path, yet
it is still attached from the dma-buf pov, so if invalidate_mappings is
now triggered the bo we access is gone, triggering a UAF and more
bugs/crashes.
To fix this, move the attach step until after we actually have a fully
set up buffer object. Note that the bo is not published to userspace
until later, so it is not clear what the comment "Don't publish the bo
until we have a valid attachment" is referring to.
We have at least two different customers reporting a NULL ptr deref in
evict_flags when importing something from amdgpu and then triggering
the evict flow. The hit rate is also pretty low, which would hint at
some kind of race, so something like 1) or 2) might explain this.
v2:
- Shuffle the order of the ops slightly (no functional change)
- Improve the comment to better explain the ordering (Matt B)
Assisted-by: Gemini:gemini-3 #debug
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7903
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/4055
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
drivers/gpu/drm/xe/xe_dma_buf.c | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index b9828da15897..2332db502c8b 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -357,15 +357,25 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
}
}
- /*
- * Don't publish the bo until we have a valid attachment, and a
- * valid attachment needs the bo address. So pre-create a bo before
- * creating the attachment and publish.
- */
bo = xe_bo_alloc();
if (IS_ERR(bo))
return ERR_CAST(bo);
+ /*
+ * xe_dma_buf_init_obj() takes ownership of the raw bo, so do not touch
+ * on fail, since it will already take care of cleanup. On success we
+ * still need to drop the ref, if something later fails.
+ *
+ * In addition this needs to happen before the attach, since
+ * it will create a new attachment for this, and add it to the list of
+ * attachments, at which point it is globally visible, and at any point
+ * the export side can call into on invalidate_mappings callback, which
+ * require a working object.
+ */
+ obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
+ if (IS_ERR(obj))
+ return obj;
+
attach_ops = &xe_dma_buf_attach_ops;
#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
if (test)
@@ -378,21 +388,12 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
goto out_err;
}
- /*
- * xe_dma_buf_init_obj() takes ownership of bo on both success
- * and failure, so we must not touch bo after this call.
- */
- obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
- if (IS_ERR(obj)) {
- dma_buf_detach(dma_buf, attach);
- return obj;
- }
get_dma_buf(dma_buf);
obj->import_attach = attach;
return obj;
out_err:
- xe_bo_free(bo);
+ xe_bo_put(bo);
return obj;
}
--
2.53.0
^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 2/2] drm/xe/dma-buf: fix UAF with retry loop
  2026-05-07 11:55 [PATCH v2 1/2] drm/xe/dma-buf: handle empty bo and UAF races Matthew Auld
@ 2026-05-07 11:55 ` Matthew Auld
  2026-05-07 12:22   ` Thomas Hellström
  0 siblings, 1 reply; 4+ messages in thread
From: Matthew Auld @ 2026-05-07 11:55 UTC (permalink / raw)
To: intel-xe; +Cc: Thomas Hellström, Matthew Brost, stable

Retry doesn't work here, since bo will be freed on error, leading to
UAF. However, now that we do the alloc & init before the attach, we can
now combine this as one unit and have the init do the alloc for us.
This should make the retry safe.

Reported by Sashiko.

Closes: https://sashiko.dev/#/patchset/20260506184332.86743-2-matthew.auld%40intel.com
Fixes: eb289a5f6cc6 ("drm/xe: Convert xe_dma_buf.c for exhaustive eviction")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.18+
---
 drivers/gpu/drm/xe/xe_dma_buf.c | 42 +++++++++------------------------
 1 file changed, 11 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 2332db502c8b..a54ec278a915 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -258,16 +258,8 @@ struct dma_buf *xe_gem_prime_export(struct drm_gem_object *obj, int flags)
 	return ERR_PTR(ret);
 }

-/*
- * Takes ownership of @storage: on success it is transferred to the returned
- * drm_gem_object; on failure it is freed before returning the error.
- * This matches the contract of xe_bo_init_locked() which frees @storage on
- * its error paths, so callers need not (and must not) free @storage after
- * this call.
- */
 static struct drm_gem_object *
-xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
-		    struct dma_buf *dma_buf)
+xe_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
 {
 	struct dma_resv *resv = dma_buf->resv;
 	struct xe_device *xe = to_xe_device(dev);
@@ -278,10 +270,8 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
 	int ret = 0;

 	dummy_obj = drm_gpuvm_resv_object_alloc(&xe->drm);
-	if (!dummy_obj) {
-		xe_bo_free(storage);
+	if (!dummy_obj)
 		return ERR_PTR(-ENOMEM);
-	}

 	dummy_obj->resv = resv;
 	xe_validation_guard(&ctx, &xe->val, &exec, (struct xe_val_flags) {}, ret) {
@@ -290,8 +280,7 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
 		if (ret)
 			break;

-		/* xe_bo_init_locked() frees storage on error */
-		bo = xe_bo_init_locked(xe, storage, NULL, resv, NULL, dma_buf->size,
+		bo = xe_bo_init_locked(xe, NULL, NULL, resv, NULL, dma_buf->size,
				       0, /* Will require 1way or 2way for vm_bind */
				       ttm_bo_type_sg, XE_BO_FLAG_SYSTEM, &exec);
 		drm_exec_retry_on_contention(&exec);
@@ -342,7 +331,6 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
 	const struct dma_buf_attach_ops *attach_ops;
 	struct dma_buf_attachment *attach;
 	struct drm_gem_object *obj;
-	struct xe_bo *bo;

 	if (dma_buf->ops == &xe_dmabuf_ops) {
 		obj = dma_buf->priv;
@@ -357,22 +345,14 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
 		}
 	}

-	bo = xe_bo_alloc();
-	if (IS_ERR(bo))
-		return ERR_CAST(bo);
-
 	/*
-	 * xe_dma_buf_init_obj() takes ownership of the raw bo, so do not touch
-	 * on fail, since it will already take care of cleanup. On success we
-	 * still need to drop the ref, if something later fails.
-	 *
-	 * In addition this needs to happen before the attach, since
-	 * it will create a new attachment for this, and add it to the list of
-	 * attachments, at which point it is globally visible, and at any point
-	 * the export side can call into on invalidate_mappings callback, which
-	 * require a working object.
+	 * This needs to happen before the attach, since it will create a new
+	 * attachment for this, and add it to the list of attachments, at which
+	 * point it is globally visible, and at any point the export side can
+	 * call into on invalidate_mappings callback, which require a working
+	 * object.
 	 */
-	obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
+	obj = xe_dma_buf_create_obj(dev, dma_buf);
 	if (IS_ERR(obj))
 		return obj;

@@ -382,7 +362,7 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
 	attach_ops = test->attach_ops;
 #endif

-	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, &bo->ttm.base);
+	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, obj);
 	if (IS_ERR(attach)) {
 		obj = ERR_CAST(attach);
 		goto out_err;
@@ -393,7 +373,7 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
 	return obj;

 out_err:
-	xe_bo_put(bo);
+	xe_bo_put(gem_to_xe_bo(obj));

 	return obj;
 }
--
2.53.0
^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 2/2] drm/xe/dma-buf: fix UAF with retry loop
  2026-05-07 11:55 ` [PATCH v2 2/2] drm/xe/dma-buf: fix UAF with retry loop Matthew Auld
@ 2026-05-07 12:22 ` Thomas Hellström
  2026-05-07 12:35   ` Matthew Auld
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Hellström @ 2026-05-07 12:22 UTC (permalink / raw)
To: Matthew Auld, intel-xe; +Cc: Matthew Brost, stable

On Thu, 2026-05-07 at 12:55 +0100, Matthew Auld wrote:
> Retry doesn't work here, since bo will be freed on error, leading to
> UAF. However, now that we do the alloc & init before the attach, we
> can now combine this as one unit and have the init do the alloc for
> us. This should make the retry safe.
>
> Reported by Sashiko.
>
> Closes: https://sashiko.dev/#/patchset/20260506184332.86743-2-matthew.auld%40intel.com
> Fixes: eb289a5f6cc6 ("drm/xe: Convert xe_dma_buf.c for exhaustive eviction")
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: <stable@vger.kernel.org> # v6.18+

lgtm.
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

> ---
>  drivers/gpu/drm/xe/xe_dma_buf.c | 42 +++++++++------------------------
>  1 file changed, 11 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
> index 2332db502c8b..a54ec278a915 100644
> --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> @@ -258,16 +258,8 @@ struct dma_buf *xe_gem_prime_export(struct drm_gem_object *obj, int flags)
>  	return ERR_PTR(ret);
>  }
>
> -/*
> - * Takes ownership of @storage: on success it is transferred to the returned
> - * drm_gem_object; on failure it is freed before returning the error.
> - * This matches the contract of xe_bo_init_locked() which frees @storage on
> - * its error paths, so callers need not (and must not) free @storage after
> - * this call.
> - */
>  static struct drm_gem_object *
> -xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
> -		    struct dma_buf *dma_buf)
> +xe_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
>  {
>  	struct dma_resv *resv = dma_buf->resv;
>  	struct xe_device *xe = to_xe_device(dev);
> @@ -278,10 +270,8 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
>  	int ret = 0;
>
>  	dummy_obj = drm_gpuvm_resv_object_alloc(&xe->drm);
> -	if (!dummy_obj) {
> -		xe_bo_free(storage);
> +	if (!dummy_obj)
>  		return ERR_PTR(-ENOMEM);
> -	}
>
>  	dummy_obj->resv = resv;
>  	xe_validation_guard(&ctx, &xe->val, &exec, (struct xe_val_flags) {}, ret) {
> @@ -290,8 +280,7 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
>  		if (ret)
>  			break;
>
> -		/* xe_bo_init_locked() frees storage on error */
> -		bo = xe_bo_init_locked(xe, storage, NULL, resv, NULL, dma_buf->size,
> +		bo = xe_bo_init_locked(xe, NULL, NULL, resv, NULL, dma_buf->size,
>  				       0, /* Will require 1way or 2way for vm_bind */
>  				       ttm_bo_type_sg, XE_BO_FLAG_SYSTEM, &exec);
>  		drm_exec_retry_on_contention(&exec);
> @@ -342,7 +331,6 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
>  	const struct dma_buf_attach_ops *attach_ops;
>  	struct dma_buf_attachment *attach;
>  	struct drm_gem_object *obj;
> -	struct xe_bo *bo;
>
>  	if (dma_buf->ops == &xe_dmabuf_ops) {
>  		obj = dma_buf->priv;
> @@ -357,22 +345,14 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
>  		}
>  	}
>
> -	bo = xe_bo_alloc();
> -	if (IS_ERR(bo))
> -		return ERR_CAST(bo);
> -
>  	/*
> -	 * xe_dma_buf_init_obj() takes ownership of the raw bo, so do not touch
> -	 * on fail, since it will already take care of cleanup. On success we
> -	 * still need to drop the ref, if something later fails.
> -	 *
> -	 * In addition this needs to happen before the attach, since
> -	 * it will create a new attachment for this, and add it to the list of
> -	 * attachments, at which point it is globally visible, and at any point
> -	 * the export side can call into on invalidate_mappings callback, which
> -	 * require a working object.
> +	 * This needs to happen before the attach, since it will create a new
> +	 * attachment for this, and add it to the list of attachments, at which
> +	 * point it is globally visible, and at any point the export side can
> +	 * call into on invalidate_mappings callback, which require a working
> +	 * object.
>  	 */
> -	obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
> +	obj = xe_dma_buf_create_obj(dev, dma_buf);

Can't you just use xe_bo_create_novm() here?

>  	if (IS_ERR(obj))
>  		return obj;
>
> @@ -382,7 +362,7 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
>  	attach_ops = test->attach_ops;
>  #endif
>
> -	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, &bo->ttm.base);
> +	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, obj);
>  	if (IS_ERR(attach)) {
>  		obj = ERR_CAST(attach);
>  		goto out_err;
> @@ -393,7 +373,7 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
>  	return obj;
>
>  out_err:
> -	xe_bo_put(bo);
> +	xe_bo_put(gem_to_xe_bo(obj));
>
>  	return obj;
>  }

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [PATCH v2 2/2] drm/xe/dma-buf: fix UAF with retry loop
  2026-05-07 12:22 ` Thomas Hellström
@ 2026-05-07 12:35 ` Matthew Auld
  0 siblings, 0 replies; 4+ messages in thread
From: Matthew Auld @ 2026-05-07 12:35 UTC (permalink / raw)
To: Thomas Hellström, intel-xe; +Cc: Matthew Brost, stable

On 07/05/2026 13:22, Thomas Hellström wrote:
> On Thu, 2026-05-07 at 12:55 +0100, Matthew Auld wrote:
>> Retry doesn't work here, since bo will be freed on error, leading to
>> UAF. However, now that we do the alloc & init before the attach, we
>> can now combine this as one unit and have the init do the alloc for
>> us. This should make the retry safe.
>>
>> Reported by Sashiko.
>>
>> Closes: https://sashiko.dev/#/patchset/20260506184332.86743-2-matthew.auld%40intel.com
>> Fixes: eb289a5f6cc6 ("drm/xe: Convert xe_dma_buf.c for exhaustive eviction")
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: <stable@vger.kernel.org> # v6.18+
>
> lgtm.
> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>
>> ---
>>  drivers/gpu/drm/xe/xe_dma_buf.c | 42 +++++++++------------------------
>>  1 file changed, 11 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
>> index 2332db502c8b..a54ec278a915 100644
>> --- a/drivers/gpu/drm/xe/xe_dma_buf.c
>> +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
>> @@ -258,16 +258,8 @@ struct dma_buf *xe_gem_prime_export(struct drm_gem_object *obj, int flags)
>>  	return ERR_PTR(ret);
>>  }
>>
>> -/*
>> - * Takes ownership of @storage: on success it is transferred to the returned
>> - * drm_gem_object; on failure it is freed before returning the error.
>> - * This matches the contract of xe_bo_init_locked() which frees @storage on
>> - * its error paths, so callers need not (and must not) free @storage after
>> - * this call.
>> - */
>>  static struct drm_gem_object *
>> -xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
>> -		    struct dma_buf *dma_buf)
>> +xe_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
>>  {
>>  	struct dma_resv *resv = dma_buf->resv;
>>  	struct xe_device *xe = to_xe_device(dev);
>> @@ -278,10 +270,8 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
>>  	int ret = 0;
>>
>>  	dummy_obj = drm_gpuvm_resv_object_alloc(&xe->drm);
>> -	if (!dummy_obj) {
>> -		xe_bo_free(storage);
>> +	if (!dummy_obj)
>>  		return ERR_PTR(-ENOMEM);
>> -	}
>>
>>  	dummy_obj->resv = resv;
>>  	xe_validation_guard(&ctx, &xe->val, &exec, (struct xe_val_flags) {}, ret) {
>> @@ -290,8 +280,7 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
>>  		if (ret)
>>  			break;
>>
>> -		/* xe_bo_init_locked() frees storage on error */
>> -		bo = xe_bo_init_locked(xe, storage, NULL, resv, NULL, dma_buf->size,
>> +		bo = xe_bo_init_locked(xe, NULL, NULL, resv, NULL, dma_buf->size,
>>  				       0, /* Will require 1way or 2way for vm_bind */
>>  				       ttm_bo_type_sg, XE_BO_FLAG_SYSTEM, &exec);
>>  		drm_exec_retry_on_contention(&exec);
>> @@ -342,7 +331,6 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
>>  	const struct dma_buf_attach_ops *attach_ops;
>>  	struct dma_buf_attachment *attach;
>>  	struct drm_gem_object *obj;
>> -	struct xe_bo *bo;
>>
>>  	if (dma_buf->ops == &xe_dmabuf_ops) {
>>  		obj = dma_buf->priv;
>> @@ -357,22 +345,14 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
>>  		}
>>  	}
>>
>> -	bo = xe_bo_alloc();
>> -	if (IS_ERR(bo))
>> -		return ERR_CAST(bo);
>> -
>>  	/*
>> -	 * xe_dma_buf_init_obj() takes ownership of the raw bo, so do not touch
>> -	 * on fail, since it will already take care of cleanup. On success we
>> -	 * still need to drop the ref, if something later fails.
>> -	 *
>> -	 * In addition this needs to happen before the attach, since
>> -	 * it will create a new attachment for this, and add it to the list of
>> -	 * attachments, at which point it is globally visible, and at any point
>> -	 * the export side can call into on invalidate_mappings callback, which
>> -	 * require a working object.
>> +	 * This needs to happen before the attach, since it will create a new
>> +	 * attachment for this, and add it to the list of attachments, at which
>> +	 * point it is globally visible, and at any point the export side can
>> +	 * call into on invalidate_mappings callback, which require a working
>> +	 * object.
>>  	 */
>> -	obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
>> +	obj = xe_dma_buf_create_obj(dev, dma_buf);
>
> Can't you just use xe_bo_create_novm() here?

Yeah, I tried this out, but it got a bit ugly with the shared dma-resv,
since I both need to add a new param to novm(), and also tweak the
lower levels to accept it, which felt a bit too much for a backport.
Hmm, okay maybe just another patch on top to do that.

>
>>  	if (IS_ERR(obj))
>>  		return obj;
>>
>> @@ -382,7 +362,7 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
>>  	attach_ops = test->attach_ops;
>>  #endif
>>
>> -	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, &bo->ttm.base);
>> +	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, obj);
>>  	if (IS_ERR(attach)) {
>>  		obj = ERR_CAST(attach);
>>  		goto out_err;
>> @@ -393,7 +373,7 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
>>  	return obj;
>>
>>  out_err:
>> -	xe_bo_put(bo);
>> +	xe_bo_put(gem_to_xe_bo(obj));
>>
>>  	return obj;
>>  }

^ permalink raw reply	[flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-07 12:35 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-05-07 11:55 [PATCH v2 1/2] drm/xe/dma-buf: handle empty bo and UAF races Matthew Auld
2026-05-07 11:55 ` [PATCH v2 2/2] drm/xe/dma-buf: fix UAF with retry loop Matthew Auld
2026-05-07 12:22   ` Thomas Hellström
2026-05-07 12:35     ` Matthew Auld