All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boris Brezillon <boris.brezillon@collabora.com>
To: Simona Vetter <simona.vetter@ffwll.ch>
Cc: "Christian König" <christian.koenig@amd.com>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Matthew Brost" <matthew.brost@intel.com>,
	dri-devel@lists.freedesktop.org,
	"Steven Price" <steven.price@arm.com>,
	"Liviu Dudau" <liviu.dudau@arm.com>,
	"Adrián Larumbe" <adrian.larumbe@collabora.com>,
	kernel@collabora.com, "Luben Tuikov" <ltuikov89@gmail.com>,
	"Danilo Krummrich" <dakr@redhat.com>
Subject: Re: [RFC PATCH] drm/sched: Fix a UAF on drm_sched_fence::sched
Date: Wed, 4 Sep 2024 12:23:57 +0200	[thread overview]
Message-ID: <20240904122357.4c2022e2@collabora.com> (raw)
In-Reply-To: <ZtgsjoCWI4aDWpSi@phenom.ffwll.local>

On Wed, 4 Sep 2024 11:46:54 +0200
Simona Vetter <simona.vetter@ffwll.ch> wrote:

> On Wed, Sep 04, 2024 at 09:40:36AM +0200, Christian König wrote:
> > Am 03.09.24 um 10:13 schrieb Simona Vetter:  
> > > [SNIP]  
> > > > > So I think the issue is much, much bigger, and there's more. And the
> > > > > issue is I think a fundamental design issue of dma_fence itself, not
> > > > > individual users.  
> > > > IIRC both Alex and me pointed out this issue on the very first dma_fence
> > > > code and nobody really cared.  
> > > I guess way back then we didn't really sort out any of the hotunplug
> > > issues, and there wasn't any fw ctx schedulers at least on our horizons
> > > yet. Thin excuse, I know ...  
> > 
> > Well it's just when you have a bee string and a broken leg, what do you
> > attend first? :)  
> 
> Yeah ...
> 
> > > > >    I think at the core it's two constraints:
> > > > > 
> > > > > - dma_fence can stick around practically forever in varios container
> > > > >     objects. We only garbage collect when someone looks, and not even then
> > > > >     consistently.
> > > > > 
> > > > > - fences are meant to be cheap, so they do not have the big refcount going
> > > > >     on like other shared objects like dma_buf
> > > > > 
> > > > > Specifically there's also no refcounting on the module itself with the  
> > > > > ->owner and try_module_get stuff. So even if we fix all these issues on  
> > > > > the data structure lifetime side of things, you might still oops calling
> > > > > into dma_fence->ops->release.
> > > > > 
> > > > > Oops.  
> > > > Yes, exactly that. I'm a bit surprised that you realize that only now :)
> > > > 
> > > > We have the issue for at least 10 years or so and it pops up every now and
> > > > then on my desk because people complain that unloading amdgpu crashes.  
> > > Yeah I knew about the issue. The new idea that popped into my mind is that
> > > I think we cannot plug this properly unless we do it in dma_fence.c for
> > > everyone, and essentially reshape the lifetime rules for that from yolo
> > > to something actually well-defined.
> > > 
> > > Kinda similar work to how dma_resv locking rules and fence book-keeping
> > > were unified to something that actually works across drivers ...  
> > 
> > Well sounds like I've just got more items on my TODO list.
> > 
> > I have patches waiting to be send out going into this direction anyway, will
> > try to get them out by the end of the week and then we can discuss what's
> > still missing.  
> 
> Quick addition, another motivator from the panthor userspace submit
> discussion: If the preempt ctx fence concept spreads, that's another
> non-drm_sched fence that drivers will need and are pretty much guaranteed
> to get wrong.
> 
> Also maybe Boris volunteers to help out with some of the work here?

Sure, I can review/test what Christian comes up with, since he already
seems to have a draft for the new implementation.

  parent reply	other threads:[~2024-09-04 10:24 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-29 17:12 [RFC PATCH] drm/sched: Fix a UAF on drm_sched_fence::sched Boris Brezillon
2024-08-30  8:14 ` Christian König
2024-08-30  9:37   ` Boris Brezillon
2024-08-30 10:44     ` Boris Brezillon
2024-08-30 12:57     ` Christian König
2024-08-30 21:43   ` Matthew Brost
2024-08-31  7:25     ` Boris Brezillon
2024-09-02 10:43     ` Christian König
2024-09-02 13:23       ` Daniel Vetter
2024-09-02 14:18         ` Christian König
2024-09-03  8:13           ` Simona Vetter
2024-09-04  7:40             ` Christian König
2024-09-04  9:46               ` Simona Vetter
2024-09-04 10:03                 ` Simona Vetter
2024-09-04 10:26                   ` Boris Brezillon
2024-09-04 10:23                 ` Boris Brezillon [this message]
2024-09-01 22:39 ` kernel test robot
2024-09-02  3:14 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240904122357.4c2022e2@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=adrian.larumbe@collabora.com \
    --cc=christian.koenig@amd.com \
    --cc=dakr@redhat.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=kernel@collabora.com \
    --cc=liviu.dudau@arm.com \
    --cc=ltuikov89@gmail.com \
    --cc=matthew.brost@intel.com \
    --cc=simona.vetter@ffwll.ch \
    --cc=steven.price@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.