public inbox for linux-kernel@vger.kernel.org
From: Boris Brezillon <boris.brezillon@collabora.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Danilo Krummrich <dakr@redhat.com>,
	airlied@gmail.com, daniel@ffwll.ch, matthew.brost@intel.com,
	sarah.walker@imgtec.com, donald.robson@imgtec.com,
	christian.koenig@amd.com, faith@gfxstrand.net,
	dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH drm-misc-next v5 4/6] drm/gpuvm: track/lock/validate external/evicted objects
Date: Tue, 3 Oct 2023 12:05:54 +0200	[thread overview]
Message-ID: <20231003120554.547090bc@collabora.com> (raw)
In-Reply-To: <e4e68970-c7c9-55e2-9483-01252f38c956@linux.intel.com>

Hello Thomas,

On Tue, 3 Oct 2023 10:36:10 +0200
Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:

> > +/**
> > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > + * @__gpuvm: The GPU VM
> > + * @__list_name: The name of the list we're iterating on
> > + * @__local_list: A pointer to the local list used to store already iterated items
> > + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> > + *
> > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > + * iterator releases the lock immediately after picking the first element from
> > + * the list, so list insertion/deletion can happen concurrently.
> > + *
> > + * Elements popped from the original list are kept in a local list, so removal
> > + * and is_empty checks can still happen while we're iterating the list.
> > + */
> > +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
> > +	({										\
> > +		struct drm_gpuvm_bo *__vm_bo = NULL;					\
> > +											\
> > +		drm_gpuvm_bo_put(__prev_vm_bo);						\
> > +											\
> > +		spin_lock(&(__gpuvm)->__list_name.lock);				\  
> 
> Here we unconditionally take the spinlocks while iterating, and the main 
> point of DRM_GPUVM_RESV_PROTECTED was really to avoid that?
> 
> 
> > +		if (!(__gpuvm)->__list_name.local_list)					\
> > +			(__gpuvm)->__list_name.local_list = __local_list;		\
> > +		else									\
> > +			WARN_ON((__gpuvm)->__list_name.local_list != __local_list);	\
> > +											\
> > +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
> > +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
> > +						   struct drm_gpuvm_bo,			\
> > +						   list.entry.__list_name);		\
> > +			if (kref_get_unless_zero(&__vm_bo->kref)) {  
> And unnecessarily grab a reference in the RESV_PROTECTED case.
> > 			\
> > +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
> > +					       __local_list);				\
> > +				break;							\
> > +			} else {							\
> > +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> > +				__vm_bo = NULL;						\
> > +			}								\
> > +		}									\
> > +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> > +											\
> > +		__vm_bo;								\
> > +	})  
> 
> IMHO this lockless list iteration looks very complex and should be 
> pretty difficult to maintain while moving forward, also since it pulls 
> the gpuvm_bos off the list, list iteration needs to be protected by an 
> outer lock anyway.

Being partly responsible for this convoluted list iterator, I must
say I agree with you. There are so many ways this can go wrong if the
user doesn't call it the right way, or doesn't protect concurrent list
iterations with a separate lock (luckily, this is a private iterator). I
mean, it works, so there's certainly a way to get it right, but gosh,
this is so far from the simple API I had hoped for.

> Also from what I understand from Boris, the extobj 
> list would typically not need the fine-grained locking; only the evict 
> list?

Right, I'm adding the gpuvm_bo to extobj list in the ioctl path, when
the GEM and VM resvs are held, and I'm deferring the drm_gpuvm_bo_put()
call to a work that's not in the dma-signalling path. This being said,
I'm still not comfortable with the

gem = drm_gem_object_get(vm_bo->gem);
dma_resv_lock(gem->resv);
drm_gpuvm_bo_put(vm_bo);
dma_resv_unlock(gem->resv);
drm_gem_object_put(gem);

dance that's needed to avoid a UAF when the gpuvm_bo is the last GEM
owner, not to mention that drm_gpuva_unlink() calls drm_gpuvm_bo_put()
after making sure the GEM gpuvm_list lock is held, but this lock might
differ from the resv lock (custom locking so we can call
gpuvm_unlink() in the dma-signalling path). So we now have paths where
drm_gpuvm_bo_put() is called with the resv lock held, and others where
it is not, and that only works because we're relying on the fact that
those drm_gpuvm_bo_put() calls won't make the refcount drop to zero,
because the deferred vm_bo_put() work still owns a vm_bo ref.
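To make that ordering concrete, here's a minimal userspace sketch of the
deferral scheme (all names are made up for illustration, not the gpuvm
API; real code would use a kernel llist plus a workqueue, and the global
list below is not thread-safe as written):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct obj {
	atomic_int refcount;
	bool released;
	struct obj *defer_next;
};

/* Pending final puts, drained from a context that can take locks. */
static struct obj *defer_list;

static void obj_init(struct obj *o)
{
	atomic_init(&o->refcount, 1);
	o->released = false;
	o->defer_next = NULL;
}

static void obj_release(struct obj *o)
{
	/* This is where the resv lock would have to be held. */
	o->released = true;
}

static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) == 1)
		obj_release(o);
}

/*
 * Called from contexts (e.g. dma-signalling) that may not take the
 * resv lock: the caller's reference is transferred to the deferred
 * list instead of being dropped, so the refcount cannot reach zero
 * here, and no concurrent obj_put() can be the final one either.
 */
static void obj_put_deferred(struct obj *o)
{
	o->defer_next = defer_list;
	defer_list = o;
}

/* The "work" side: drop the deferred references, locking allowed. */
static void obj_put_deferred_drain(void)
{
	while (defer_list) {
		struct obj *o = defer_list;

		defer_list = o->defer_next;
		obj_put(o); /* may be the final put */
	}
}
```

The point of the sketch is only the reference transfer: as long as the
deferred list owns a reference, every put done in a restricted context
is guaranteed non-final.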

All these tiny details add to the overall complexity of this common
layer, and to me, that's not any better than the
get_next_vm_bo_from_list() complexity you were complaining about (it
might even be worse, because this sort of thing leaks out to users).

Having an internal lock partly solves that, in that the locking of the
extobj list is now entirely orthogonal to the GEM that's being removed
from this list, and we can lock/unlock internally without forcing the
caller to take weird actions to make sure things don't explode. Don't
get me wrong, I get that this locking overhead is not acceptable for
Xe, but I feel like we're turning drm_gpuvm into a white elephant that
only a few people will get right.

This is just my personal view on this, and I certainly don't want to
block or delay the merging of this patchset, but I thought I'd share my
concerns. As someone who's been following the evolution of this
drm_gpuva/vm series for weeks, and who's still sometimes getting lost,
I can't imagine how new drm_gpuvm users would feel...

> Also it seems that if we are to maintain two modes here, for 
> reasonably clean code we'd need two separate instances of 
> get_next_bo_from_list().
> 
> For the !RESV_PROTECTED case, perhaps one would want to consider the 
> solution used currently in xe, where the VM maintains two evict lists. 
> One protected by a spinlock and one protected by the VM resv. When the 
> VM resv is locked to begin list traversal, the spinlock is locked *once* 
> and the spinlock-protected list is looped over and copied into the resv 
> protected one. For traversal, the resv protected one is used.

Oh, so you do have the same sort of trick where you move the entire
list to another list, such that you can let other paths update the list
while you're iterating your own snapshot. That's interesting...
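For reference, that snapshot trick can be sketched in plain userspace C
(illustrative names only, not the actual Xe code; a kernel version would
use list_head and list_splice_init): writers push onto a
spinlock-protected staging list from any context, while the traversal
path takes the spinlock exactly once, splices staging into a list
protected by the outer (resv-like) lock, and then iterates that snapshot
without touching the spinlock again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
};

struct evict_lists {
	atomic_flag lock;       /* stands in for the spinlock */
	struct node *staging;   /* filled from restricted contexts */
	struct node *traversal; /* only touched under the outer lock */
};

static void spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		; /* spin */
}

static void spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void evict_lists_init(struct evict_lists *el)
{
	atomic_flag_clear(&el->lock);
	el->staging = NULL;
	el->traversal = NULL;
}

/* Callable from contexts that cannot take the outer lock. */
static void evict_add(struct evict_lists *el, struct node *n)
{
	spin_lock(&el->lock);
	n->next = el->staging;
	el->staging = n;
	spin_unlock(&el->lock);
}

/*
 * Called with the outer lock held: one lock/unlock pair, after which
 * the caller iterates el->traversal with no further spinlocking.
 */
static void evict_snapshot(struct evict_lists *el)
{
	struct node *head;

	spin_lock(&el->lock);
	head = el->staging;
	el->staging = NULL;
	spin_unlock(&el->lock);

	/* prepend the detached entries to the traversal list */
	while (head) {
		struct node *next = head->next;

		head->next = el->traversal;
		el->traversal = head;
		head = next;
	}
}

static int evict_count(const struct evict_lists *el)
{
	int n = 0;
	const struct node *it;

	for (it = el->traversal; it; it = it->next)
		n++;
	return n;
}
```

Concurrent adds during traversal simply accumulate on the staging list
and are picked up by the next snapshot, which is what keeps the
traversal itself stable.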

Regards,

Boris


Thread overview: 28+ messages
2023-09-28 19:16 [PATCH drm-misc-next v5 0/6] [RFC] DRM GPUVM features Danilo Krummrich
2023-09-28 19:16 ` [PATCH drm-misc-next v5 1/6] drm/gpuvm: add common dma-resv per struct drm_gpuvm Danilo Krummrich
2023-09-28 19:16 ` [PATCH drm-misc-next v5 2/6] drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm Danilo Krummrich
2023-09-28 19:16 ` [PATCH drm-misc-next v5 3/6] drm/gpuvm: add an abstraction for a VM / BO combination Danilo Krummrich
2023-10-02  6:34   ` kernel test robot
2023-10-05 11:51   ` Thomas Hellström
2023-10-08 23:08     ` Danilo Krummrich
2023-09-28 19:16 ` [PATCH drm-misc-next v5 4/6] drm/gpuvm: track/lock/validate external/evicted objects Danilo Krummrich
2023-10-02 15:40   ` kernel test robot
2023-10-03  8:36   ` Thomas Hellström
2023-10-03  9:11     ` Thomas Hellström
2023-10-04 12:57       ` Danilo Krummrich
2023-10-04 15:29         ` Thomas Hellström
2023-10-04 17:17           ` Danilo Krummrich
2023-10-04 17:57             ` Thomas Hellström
2023-10-04 18:24               ` Danilo Krummrich
2023-10-03 10:05     ` Boris Brezillon [this message]
2023-10-03 12:25       ` Thomas Hellström
2023-10-03 14:21         ` Boris Brezillon
2023-10-03 16:55           ` Danilo Krummrich
2023-10-03 17:37             ` Thomas Hellström
2023-10-04 13:35               ` Danilo Krummrich
2023-10-03 18:57             ` Thomas Hellström
2023-10-05 11:55   ` Thomas Hellström
2023-09-28 19:16 ` [PATCH drm-misc-next v5 5/6] drm/nouveau: make use of the GPUVM's shared dma-resv Danilo Krummrich
2023-09-28 19:16 ` [PATCH drm-misc-next v5 6/6] drm/nouveau: use GPUVM common infrastructure Danilo Krummrich
2023-10-05  9:35 ` [PATCH drm-misc-next v5 0/6] [RFC] DRM GPUVM features Thomas Hellström
2023-10-08 22:48   ` Danilo Krummrich
