All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boris Brezillon <boris.brezillon@collabora.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>,
	dri-devel@lists.freedesktop.org,
	"Steven Price" <steven.price@arm.com>,
	"Liviu Dudau" <liviu.dudau@arm.com>,
	"Adrián Larumbe" <adrian.larumbe@collabora.com>,
	kernel@collabora.com, "Luben Tuikov" <ltuikov89@gmail.com>,
	"Danilo Krummrich" <dakr@redhat.com>
Subject: Re: [RFC PATCH] drm/sched: Fix a UAF on drm_sched_fence::sched
Date: Sat, 31 Aug 2024 09:25:35 +0200	[thread overview]
Message-ID: <20240831092535.3ae06c19@collabora.com> (raw)
In-Reply-To: <ZtI9EMzHZW3DkHw/@DUT025-TGLU.fm.intel.com>

On Fri, 30 Aug 2024 21:43:44 +0000
Matthew Brost <matthew.brost@intel.com> wrote:

> On Fri, Aug 30, 2024 at 10:14:18AM +0200, Christian König wrote:
> > Am 29.08.24 um 19:12 schrieb Boris Brezillon:  
> > > dma_fence objects created by an entity might outlive the
> > > drm_gpu_scheduler this entity was bound to if those fences are retained
> > > by other other objects, like a dma_buf resv. This means that
> > > drm_sched_fence::sched might be invalid when the resv is walked, which
> > > in turn leads to a UAF when dma_fence_ops::get_timeline_name() is called.
> > > 
> > > This probably went unnoticed so far, because the drm_gpu_scheduler had
> > > the lifetime of the drm_device, so, unless you were removing the device,
> > > there were no reasons for the scheduler to be gone before its fences.  
> > 
> > Nope, that is intentional design. get_timeline_name() is not safe to be
> > called after the fence signaled because that would causes circular
> > dependency problems.
> >  
> 
> I'm quite sure happens, ftrace for example can and will call
> get_timeline_name in trace_dma_fence_destroy which is certainly after
> the fence is signaled. There are likely other cases too - this just
> quickly came to mind.
>  
> > E.g. when you have hardware fences it can happen that fences reference a
> > driver module (for the function printing the name) and the module in turn
> > keeps fences around.
> >   
> 
> I am almost positive without this patch this problematic in Xe or any
> driver in which schedulers are tied to IOCTLs rather than kernel module.
> 
> In Xe 'fence->sched' maps to an xe_exec_queue which can be freed once
> the destroy exec queue IOCTL is called and all jobs are free'd (i.e.
> 'fence' signals). The fence could be live on after in dma-resv objects,
> drm syncobjs, etc...
> 
> I know this issue has been raised before and basically NACK'd but I have
> a strong opinion this is valid and in fact required.

IIUC, Christian recognized that it's more problematic now that
schedulers lifetime is no longer bound to the device lifetime but
instead the GPU context lifetime. So I think we all agree that this
needs fixing :-).

How about I send a v2 of this patch, as it seems Christian was more or
less happy with the approach, baring the allocation scheme. And then
we can discuss how we want to fix the module-unload issue.

  reply	other threads:[~2024-08-31  7:25 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-29 17:12 [RFC PATCH] drm/sched: Fix a UAF on drm_sched_fence::sched Boris Brezillon
2024-08-30  8:14 ` Christian König
2024-08-30  9:37   ` Boris Brezillon
2024-08-30 10:44     ` Boris Brezillon
2024-08-30 12:57     ` Christian König
2024-08-30 21:43   ` Matthew Brost
2024-08-31  7:25     ` Boris Brezillon [this message]
2024-09-02 10:43     ` Christian König
2024-09-02 13:23       ` Daniel Vetter
2024-09-02 14:18         ` Christian König
2024-09-03  8:13           ` Simona Vetter
2024-09-04  7:40             ` Christian König
2024-09-04  9:46               ` Simona Vetter
2024-09-04 10:03                 ` Simona Vetter
2024-09-04 10:26                   ` Boris Brezillon
2024-09-04 10:23                 ` Boris Brezillon
2024-09-01 22:39 ` kernel test robot
2024-09-02  3:14 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240831092535.3ae06c19@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=adrian.larumbe@collabora.com \
    --cc=christian.koenig@amd.com \
    --cc=dakr@redhat.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=kernel@collabora.com \
    --cc=liviu.dudau@arm.com \
    --cc=ltuikov89@gmail.com \
    --cc=matthew.brost@intel.com \
    --cc=steven.price@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.