Re: [PATCH 1/2] drm/gpuvm: add deferred vm_bo cleanup

rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Boris Brezillon <boris.brezillon@collabora.com>
To: Alice Ryhl <aliceryhl@google.com>
Cc: "Danilo Krummrich" <dakr@kernel.org>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Steven Price" <steven.price@arm.com>,
	"Daniel Almeida" <daniel.almeida@collabora.com>,
	"Liviu Dudau" <liviu.dudau@arm.com>,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	rust-for-linux@vger.kernel.org
Subject: Re: [PATCH 1/2] drm/gpuvm: add deferred vm_bo cleanup
Date: Mon, 8 Sep 2025 09:22:53 +0200	[thread overview]
Message-ID: <20250908092253.52cd4df0@fedora> (raw)
In-Reply-To: <aL1pSFB9iBsfHFM_@google.com>

On Sun, 7 Sep 2025 11:15:20 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> On Sat, Sep 06, 2025 at 12:47:36AM +0200, Danilo Krummrich wrote:
> > On Fri Sep 5, 2025 at 8:18 PM CEST, Alice Ryhl wrote:  
> > > On Fri, Sep 5, 2025 at 3:25 PM Boris Brezillon
> > > <boris.brezillon@collabora.com> wrote:  
> > >> On Fri, 05 Sep 2025 12:11:28 +0000
> > >> Alice Ryhl <aliceryhl@google.com> wrote:  
> > >> > +static bool
> > >> > +drm_gpuvm_bo_is_dead(struct drm_gpuvm_bo *vm_bo)
> > >> > +{
> > >> > +     return !kref_read(&vm_bo->kref);  
> > >>
> > >> I'm not too sure I like the idea of [ab]using vm_bo::kref to defer the
> > >> vm_bo release. I get why it's done like that, but I'm wondering why we
> > >> don't defer the release of drm_gpuva objects instead (which is really
> > >> what's being released in va_unlink()). I can imagine drivers wanting to
> > >> attach resources to the gpuva that can't be released in the
> > >> dma-signalling path in the future, and if we're doing that at the gpuva
> > >> level, we also get rid of this kref dance, since the va will hold a
> > >> vm_bo ref until it's destroyed.
> > >>
> > >> Any particular reason you went for vm_bo destruction deferral instead
> > >> of gpuva?  
> > >
> > > All of the things that were unsafe to release in the signalling path
> > > were tied to the vm_bo, so that is why I went for vm_bo cleanup.
> > > Another advantage is that it lets us use the same deferred logic for
> > > the vm_bo_put() call that drops the refcount from vm_bo_obtain().
> > >
> > > Of course if gpuvas might have resources that need deferred cleanup,
> > > that might change the situation somewhat.  
> > 
> > I think we want to track PT(E) allocations, or rather reference counts of page
> > table structures carried by the drm_gpuva, but we don't need to release them on
> > drm_gpuva_unlink(), which is where we drop the reference count of the vm_bo.
> > 
> > Deferring drm_gpuva_unlink() isn't really an option I think, the GEMs list of
> > VM_BOs and the VM_BOs list of VAs is usually used in ttm_device_funcs::move to
> > map or unmap all VAs associated with a GEM object.
> > 
> > I think PT(E) reference counts etc. should be rather released when the drm_gpuva
> > is freed, i.e. page table allocations can be bound to the lifetime of a
> > drm_gpuva. Given that, I think that eventually we'll need a cleanup list for
> > those as well, since once they're removed from the VM tree (in the fence
> > signalling critical path), we loose access otherwise.  
> 
> Hmm. Another more conceptual issue with deferring gpuva is that
> "immediate mode" is defined as having the GPUVM match the GPU's actual
> address space at all times, which deferred gpuva cleanup would go
> against.
> 
> Deferring vm_bo cleanup doesn't have this issue because even though the
> vm_bo isn't kfreed immediately, all GPUVM apis still treat it as-if it
> isn't there anymore.
> 
> > >> > +static void
> > >> > +drm_gpuvm_bo_defer_locked(struct kref *kref)
> > >> > +{
> > >> > +     struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
> > >> > +                                               kref);
> > >> > +     struct drm_gpuvm *gpuvm = vm_bo->vm;
> > >> > +
> > >> > +     if (!drm_gpuvm_resv_protected(gpuvm)) {
> > >> > +             drm_gpuvm_bo_list_del(vm_bo, extobj, true);
> > >> > +             drm_gpuvm_bo_list_del(vm_bo, evict, true);
> > >> > +     }
> > >> > +
> > >> > +     list_del(&vm_bo->list.entry.gem);
> > >> > +     mutex_unlock(&vm_bo->obj->gpuva.lock);  
> > >>
> > >> I got tricked by this implicit unlock, and the conditional unlocks it
> > >> creates in drm_gpuva_unlink_defer(). Honestly, I'd rather see this
> > >> unlocked moved to drm_gpuva_unlink_defer() and a conditional unlock
> > >> added to drm_gpuvm_bo_put_deferred(), because it's easier to reason
> > >> about when the lock/unlock calls are in the same function
> > >> (kref_put_mutex() being the equivalent of a conditional lock).  
> > >
> > > Ok. I followed the docs of kref_put_mutex() that say to unlock it from
> > > the function.  
> > 
> > Yes, please keep it the way it is, I don't want to deviate from what is
> > documented and everyone else does. Besides that, I also think it's a little
> > less error prone.  
> 
> I gave it a try:
> 
> bool
> drm_gpuvm_bo_put_deferred(struct drm_gpuvm_bo *vm_bo)
> {
> 	drm_WARN_ON(vm_bo->vm->drm, !drm_gpuvm_immediate_mode(vm_bo->vm));
> 
> 	if (!vm_bo)
> 		return false;
> 
> 	if (kref_put_mutex(&vm_bo->kref, drm_gpuvm_bo_defer_locked,
> 			   &vm_bo->obj->gpuva.lock)) {
> 		/*
> 		 * It's important that the GEM stays alive for the duration in which
> 		 * drm_gpuvm_bo_defer_locked() holds the mutex, but the instant we add
> 		 * the vm_bo to bo_defer, another thread might call
> 		 * drm_gpuvm_bo_deferred_cleanup() and put the GEM. For this reason, we
> 		 * add the vm_bo to bo_defer *after* releasing the GEM's mutex.
> 		 */
> 		mutex_unlock(&vm_bo->obj->gpuva.lock);
> 		drm_gpuvm_bo_list_add(vm_bo, bo_defer, true);
> 		return true;
> 	}
> 
> 	return false;
> }
> 
> void
> drm_gpuva_unlink_defer(struct drm_gpuva *va)
> {
> 	struct drm_gem_object *obj = va->gem.obj;
> 	struct drm_gpuvm_bo *vm_bo = va->vm_bo;
> 	bool should_defer_bo;
> 
> 	if (unlikely(!obj))
> 		return;
> 
> 	drm_WARN_ON(vm_bo->vm->drm, !drm_gpuvm_immediate_mode(vm_bo->vm));
> 
> 	mutex_lock(&obj->gpuva.lock);
> 	list_del_init(&va->gem.entry);
> 
> 	/*
> 	 * This is drm_gpuvm_bo_put_deferred() slightly modified since we
> 	 * already hold the mutex. It's important that we add the vm_bo to
> 	 * bo_defer after releasing the mutex for the same reason as in
> 	 * drm_gpuvm_bo_put_deferred().
> 	 */
> 	should_defer_bo = kref_put(&vm_bo->kref, drm_gpuvm_bo_defer_locked);
> 	mutex_unlock(&obj->gpuva.lock);
> 	if (should_defer_bo)
> 		drm_gpuvm_bo_list_add(vm_bo, bo_defer, true);
> 
> 	va->vm_bo = NULL;
> }
> 
> I do think it looks relatively nice like this, particularly
> drm_gpuva_unlink_defer().

I agree.

> But that's also the one not using
> kref_put_mutex().

Yeah, but that's the thing. I guess if drm_gpuvm_bo_defer_locked() was
only called from kref_put_mutex() this would be okay (though I still
have a hard time with those functions taking locks that have to be
released by the caller, but at least that's a well-known/documented
pattern). But it's also currently called from drm_gpuva_unlink_defer()
where the lock is taken but not released. I guess if the function name
was reflecting that (drm_gpuvm_bo_defer_locked_and_unlock()?), and with
a comment explaining why the lock is conditionally released in the
caller that would be acceptable, but I still find this locking scheme
quite confusing...

next prev parent reply	other threads:[~2025-09-08  7:23 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-05 12:11 [PATCH 0/2] Defer vm_bo cleanup in GPUVM with DRM_GPUVM_IMMEDIATE_MODE Alice Ryhl
2025-09-05 12:11 ` [PATCH 1/2] drm/gpuvm: add deferred vm_bo cleanup Alice Ryhl
2025-09-05 13:25   ` Boris Brezillon
2025-09-05 18:18     ` Alice Ryhl
2025-09-05 22:47       ` Danilo Krummrich
2025-09-07 11:15         ` Alice Ryhl
2025-09-07 11:28           ` Danilo Krummrich
2025-09-07 11:39             ` Alice Ryhl
2025-09-07 11:44               ` Danilo Krummrich
2025-09-08  7:11               ` Boris Brezillon
2025-09-08  8:26                 ` Alice Ryhl
2025-09-08  8:47                   ` Danilo Krummrich
2025-09-08 10:20                     ` Boris Brezillon
2025-09-08 11:11                       ` Danilo Krummrich
2025-09-08 12:11                         ` Boris Brezillon
2025-09-08 12:20                           ` Danilo Krummrich
2025-09-09 10:39                             ` Thomas Hellström
2025-09-09 10:47                               ` Danilo Krummrich
2025-09-09 11:10                                 ` Thomas Hellström
2025-09-09 11:24                                   ` Alice Ryhl
2025-09-09 11:28                                     ` Danilo Krummrich
2025-09-09 11:46                                       ` Thomas Hellström
2025-09-08  9:37                   ` Boris Brezillon
2025-09-08  7:22           ` Boris Brezillon [this message]
2025-09-05 12:11 ` [PATCH 2/2] panthor: use drm_gpuva_unlink_defer() Alice Ryhl
2025-09-05 12:52   ` Boris Brezillon
2025-09-05 13:01     ` Alice Ryhl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250908092253.52cd4df0@fedora \
    --to=boris.brezillon@collabora.com \
    --cc=airlied@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=dakr@kernel.org \
    --cc=daniel.almeida@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liviu.dudau@arm.com \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=matthew.brost@intel.com \
    --cc=mripard@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=simona@ffwll.ch \
    --cc=steven.price@arm.com \
    --cc=thomas.hellstrom@linux.intel.com \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).