Linux kernel -stable discussions
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Auld <matthew.auld@intel.com>, intel-xe@lists.freedesktop.org
Cc: Matthew Brost <matthew.brost@intel.com>, stable@vger.kernel.org
Subject: Re: [PATCH] drm/xe/dma-buf: handle empty bo and UAF races
Date: Thu, 07 May 2026 13:14:49 +0200	[thread overview]
Message-ID: <87936849a77c61c8f9ab15e31be16ea1e22e479c.camel@linux.intel.com> (raw)
In-Reply-To: <f4957d06-3c0b-4ffe-928c-e82bf7615d75@intel.com>

On Thu, 2026-05-07 at 10:03 +0100, Matthew Auld wrote:
> On 06/05/2026 20:59, Thomas Hellström wrote:
> > On Wed, 2026-05-06 at 19:43 +0100, Matthew Auld wrote:
> > > There look to be some nasty races here when triggering the
> > > invalidate_mappings hook:
> > > 
> > > 1) We do xe_bo_alloc() followed by the attach, before the actual
> > > full
> > > bo
> > >     init step in xe_dma_buf_init_obj(). However the bo is visible
> > > on
> > > the
> > >     attachments list after the attach.  This is bad since
> > > exporter
> > > driver,
> > >     say amdgpu, can at any time call back into our
> > > invalidate_mappings
> > > hook,
> > >     with an empty/bogus bo, leading to potential bugs/crashes.
> > > 
> > > 2) Similar to 1) but here we get a UAF, when the
> > > invalidate_mappings
> > >     hook is triggered. For example, we get as far as
> > > xe_bo_init_locked()
> > >     but this fails in some way. But here the bo will be freed on
> > > error, but
> > >     we still have it attached from dma-buf pov, so if the
> > >     invalidate_mappings is now triggered then the bo we access is
> > > gone
> > > and
> > >     we trigger UAF and more bugs/crashes.
> > > 
> > > To fix this, move the attach step until after we actually have a
> > > fully
> > > set up buffer object. Note that the bo is not published to
> > > userspace
> > > until later, so not sure what the comment "Don't publish the bo
> > > until we have a valid attachment", is referring to.
> > > 
> > > We have at least two different customers reporting hitting a NULL
> > > ptr
> > > deref in evict_flags when importing something from amdgpu,
> > > followed
> > > by
> > > triggering the evict flow. Hit rate is also pretty low, which
> > > would
> > > hint at some kind of race, so something like 1) or 2) might
> > > explain
> > > this.
> > > 
> > > Assisted-by: Gemini:gemini-3 #debug
> > > Link:
> > > https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7903
> > > Link:
> > > https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/4055
> > > Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for
> > > Intel
> > > GPUs")
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > Cc: <stable@vger.kernel.org> # v6.8+
> > > ---
> > >   drivers/gpu/drm/xe/xe_dma_buf.c | 23 ++++++++---------------
> > >   1 file changed, 8 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c
> > > b/drivers/gpu/drm/xe/xe_dma_buf.c
> > > index b9828da15897..e6c2f7d30abb 100644
> > > --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> > > +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> > > @@ -357,11 +357,6 @@ struct drm_gem_object
> > > *xe_gem_prime_import(struct drm_device *dev,
> > >   		}
> > >   	}
> > >   
> > > -	/*
> > > -	 * Don't publish the bo until we have a valid
> > > attachment,
> > > and a
> > > -	 * valid attachment needs the bo address. So pre-create
> > > a bo
> > > before
> > > -	 * creating the attachment and publish.
> > > -	 */
> > >   	bo = xe_bo_alloc();
> > >   	if (IS_ERR(bo))
> > >   		return ERR_CAST(bo);
> > > @@ -371,6 +366,13 @@ struct drm_gem_object
> > > *xe_gem_prime_import(struct drm_device *dev,
> > >   	if (test)
> > >   		attach_ops = test->attach_ops;
> > >   #endif
> > > +	/*
> > > +	 * xe_dma_buf_init_obj() takes ownership of bo on both
> > > success
> > > +	 * and failure, so we must not touch bo after this call.
> > > +	 */
> > > +	obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
> > > +	if (IS_ERR(obj))
> > > +		return obj;
> > 
> > IIRC this publishes the bo on the LRUs, as per the removed comment.
> > What happens if, for example, the shrinker kicks in and shrinks it?
> > But
> > similarly perhaps we should have obj->import_attach set already at
> > publish time?
> 
> I don't think anything bad will happen? I would view it as an sg
> object 
> without any real backing store or attachment. Trying to 
> move/shrink/evict should be a noop, like moving from SYS -> SYS 
> (starting placement for type_sg). But since this a type_sg bo, I
> think 
> shrinker will already ignore it right?
> 
>  From user POV, the handle is only published until much later at the
> end 
> of drm_gem_prime_fd_to_handle(), after our import callback here,
> AFAICT. 
> So I don't think there is a risk of the user somehow using the
> imported 
> bo in an ioctl, before we have done the attach etc.
> 
> One other data point is perhaps amdgpu, which does seem to do the
> create 
> + attach as normal steps.

OK, Yeah if there's no way the bo can see other unexpected callbacks
after initializing it, then

Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>


> 
> > 
> > If this is indeed the case we might have to revert to some
> > trickery.
> > Like invalidate_mappings() returning early if the init is not
> > complete,
> > and set obj->import_attach under the lock in xe_dma_buf_init_obj?
> > 
> > Also I think IIRC xe_bo_alloc() was created specifically for this
> > situation, so unless there are more users of that, and the ordering
> > in
> > this patch is indeed correct, we might be able to get rid of the
> > two-
> > step bo creation here.
> > 
> > /Thomas
> > 
> > 
> > >   
> > >   	attach = dma_buf_dynamic_attach(dma_buf, dev->dev,
> > > attach_ops, &bo->ttm.base);
> > >   	if (IS_ERR(attach)) {
> > > @@ -378,21 +380,12 @@ struct drm_gem_object
> > > *xe_gem_prime_import(struct drm_device *dev,
> > >   		goto out_err;
> > >   	}
> > >   
> > > -	/*
> > > -	 * xe_dma_buf_init_obj() takes ownership of bo on both
> > > success
> > > -	 * and failure, so we must not touch bo after this call.
> > > -	 */
> > > -	obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
> > > -	if (IS_ERR(obj)) {
> > > -		dma_buf_detach(dma_buf, attach);
> > > -		return obj;
> > > -	}
> > >   	get_dma_buf(dma_buf);
> > >   	obj->import_attach = attach;
> > >   	return obj;
> > >   
> > >   out_err:
> > > -	xe_bo_free(bo);
> > > +	xe_bo_put(bo);
> > >   
> > >   	return obj;
> > >   }

      reply	other threads:[~2026-05-07 11:14 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-06 18:43 [PATCH] drm/xe/dma-buf: handle empty bo and UAF races Matthew Auld
2026-05-06 19:32 ` Matthew Brost
2026-05-06 19:59 ` Thomas Hellström
2026-05-07  9:03   ` Matthew Auld
2026-05-07 11:14     ` Thomas Hellström [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87936849a77c61c8f9ab15e31be16ea1e22e479c.camel@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.auld@intel.com \
    --cc=matthew.brost@intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox