Re: [PATCH drm-next v6 02/13] drm: manager to keep track of GPUs VA mappings

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Boris Brezillon <boris.brezillon@collabora.com>
To: Danilo Krummrich <dakr@redhat.com>
Cc: airlied@gmail.com, daniel@ffwll.ch, tzimmermann@suse.de,
	mripard@kernel.org, corbet@lwn.net, christian.koenig@amd.com,
	bskeggs@redhat.com, Liam.Howlett@oracle.com,
	matthew.brost@intel.com, alexdeucher@gmail.com,
	ogabbay@kernel.org, bagasdotme@gmail.com, willy@infradead.org,
	jason@jlekstrand.net, dri-devel@lists.freedesktop.org,
	nouveau@lists.freedesktop.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Donald Robson <donald.robson@imgtec.com>,
	Dave Airlie <airlied@redhat.com>
Subject: Re: [PATCH drm-next v6 02/13] drm: manager to keep track of GPUs VA mappings
Date: Thu, 6 Jul 2023 18:17:39 +0200	[thread overview]
Message-ID: <20230706181739.57afbcfa@collabora.com> (raw)
In-Reply-To: <755b3aeb-8067-2fa5-5173-d889811e954a@redhat.com>

On Thu, 6 Jul 2023 17:06:08 +0200
Danilo Krummrich <dakr@redhat.com> wrote:

> Hi Boris,
> 
> On 6/30/23 10:02, Boris Brezillon wrote:
> > Hi Danilo,
> > 
> > On Fri, 30 Jun 2023 00:25:18 +0200
> > Danilo Krummrich <dakr@redhat.com> wrote:
> >   
> >> + *	int driver_gpuva_remap(struct drm_gpuva_op *op, void *__ctx)
> >> + *	{
> >> + *		struct driver_context *ctx = __ctx;
> >> + *
> >> + *		drm_gpuva_remap(ctx->prev_va, ctx->next_va, &op->remap);
> >> + *
> >> + *		drm_gpuva_unlink(op->remap.unmap->va);
> >> + *		kfree(op->remap.unmap->va);
> >> + *
> >> + *		if (op->remap.prev) {
> >> + *			drm_gpuva_link(ctx->prev_va);  
> > 
> > I ended up switching to dma_resv-based locking for the GEMs and I
> > wonder what the locking is supposed to look like in the async-mapping
> > case, where we insert/remove the VA nodes in the drm_sched::run_job()
> > path.  
> 
> If you decide to pick the interface where you just call 
> drm_gpuva_sm_[un]map() and receive a callback for each operation it 
> takes to fulfill the request, you probably do this because you want to 
> do everything one shot, updating the VA space, link/unlink GPUVAs 
> to/from its corresponding backing GEMs, do the actual GPU mappings.
> 
> This has a few advantages over generating a list of operations when the 
> job is submitted. You've pointed out one of them, when you noticed that 
> with a list of operations one can't sneak in a synchronous job between 
> already queued up asynchronous jobs.
> 
> However, for the asynchronous path it has the limitation that the 
> dma-resv lock can't be used to link/unlink GPUVAs to/from its 
> corresponding backing GEMs, since this would happen in the fence 
> signalling critical path and we're not allowed to hold the dma-resv lock 
> there. Hence, as we discussed I added the option for drivers to provide 
> an external lock for that, just to be able to keep some lockdep checks.

Uh, okay, I guess that means I need to go back to a custom lock for VM
operations then.

> 
> > 
> > What I have right now is something like:
> > 
> > 	dma_resv_lock(vm->resv);
> > 
> > 	// split done in drm_gpuva_sm_map(), each iteration
> > 	// of the loop is a call to the driver ->[re,un]map()
> > 	// hook
> > 	for_each_sub_op() {
> > 		
> > 		// Private BOs have their resv field pointing to the
> > 		// VM resv and we take the VM resv lock before calling
> > 		// drm_gpuva_sm_map()
> > 		if (vm->resv != gem->resv)
> > 			dma_resv_lock(gem->resv);
> > 
> > 		drm_gpuva_[un]link(va);
> > 		gem_[un]pin(gem);
> > 
> > 		if (vm->resv != gem->resv)
> > 			dma_resv_unlock(gem->resv);
> > 	}
> > 
> > 	dma_resv_unlock(vm->resv);
> >   
> 
> I'm not sure I get this code right, reading "for_each_sub_op()" and 
> "drm_gpuva_sm_map()" looks a bit like things are mixed up?
> 
> Or do you mean to represent the sum of all callbacks with 
> "for_each_sub_op()"?

That ^.

> In this case I assume this code runs in 
> drm_sched::run_job() and hence isn't allowed to take the dma-resv lock.

Yeah, I didn't realize that taking the dma-resv lock in the
dma-signaling path was forbidden. I think it's fine for the drm_gpuva
destroy operation (which calls drm_gem_shmem_unpin(), which in turns
acquires the resv lock) because I can move that to a worker and get it
out of the dma-signaling path. The problem remains for remap operations
though. I need to call drm_gem_shmem_pin() so we retain the pages even
after the unmapped gpuva object that's in the middle of a mapping is
released. I guess one option would be to use an atomic_t for
drm_shmem_gem_object::pages_use_count, and
have something like:

int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
{
	int ret;

	if (atomic_inc_not_zero(&shmem->pages_use_count))
		return 0;

	dma_resv_lock(shmem->base.resv, NULL);
	ret = drm_gem_shmem_pin_locked(shmem);
	dma_resv_unlock(shmem->base.resv);

	return ret;
}

Given the object already had its pages pinned when we remap, we're sure
the fast path will be taken, and no dma-resv lock aquired.

> 
> > In practice, I don't expect things to deadlock, because the VM resv is
> > not supposed to be taken outside the VM context and the locking order
> > is always the same (VM lock first, and then each shared BO
> > taken/released independently), but I'm not super thrilled by this
> > nested lock, and I'm wondering if we shouldn't have a pass collecting
> > locks in a drm_exec context first, and then have
> > the operations executed. IOW, something like that:
> > 
> > 	drm_exec_init(exec, DRM_EXEC_IGNORE_DUPLICATES)
> > 	drm_exec_until_all_locked(exec) {
> > 		// Dummy GEM is the dummy GEM object I use to make the VM
> > 		// participate in the locking without having to teach
> > 		// drm_exec how to deal with raw dma_resv objects.
> > 		ret = drm_exec_lock_obj(exec, vm->dummy_gem);
> > 		drm_exec_retry_on_contention(exec);
> > 		if (ret)
> > 			return ret;
> > 
> > 		// Could take the form of drm_gpuva_sm_[un]map_acquire_locks()
> > 		// helpers
> > 		for_each_sub_op() {
> > 			ret = drm_exec_lock_obj(exec, gem);
> > 			if (ret)
> > 				return ret;
> > 		}
> > 	}
> > 
> > 	// each iteration of the loop is a call to the driver
> > 	// ->[re,un]map() hook
> > 	for_each_sub_op() {
> > 		...
> > 		gem_[un]pin_locked(gem);
> > 		drm_gpuva_[un]link(va);
> > 		...
> > 	}
> > 
> > 	drm_exec_fini(exec);  
> 
> I have a follow-up patch (still WIP) in the queue to generalize dma-resv 
> handling, fence handling and GEM validation within the GPUVA manager as 
> optional helper functions: 
> https://gitlab.freedesktop.org/nouvelles/kernel/-/commit/a5fc29f3b1edbf3f96fb5a21b858ffe00a3f2584

Thanks for the heads-up. That's more or less what I have, except I'm
attaching a dummy_gem object to the VM so it can be passed to drm_exec
directly (instead of having a separate ww_acquire_ctx).

WARNING: multiple messages have this Message-ID (diff)

From: Boris Brezillon <boris.brezillon@collabora.com>
To: Danilo Krummrich <dakr@redhat.com>
Cc: matthew.brost@intel.com, willy@infradead.org,
	dri-devel@lists.freedesktop.org, corbet@lwn.net,
	nouveau@lists.freedesktop.org, ogabbay@kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	mripard@kernel.org, bskeggs@redhat.com, tzimmermann@suse.de,
	Liam.Howlett@oracle.com, Dave Airlie <airlied@redhat.com>,
	bagasdotme@gmail.com, christian.koenig@amd.com,
	jason@jlekstrand.net, Donald Robson <donald.robson@imgtec.com>
Subject: Re: [PATCH drm-next v6 02/13] drm: manager to keep track of GPUs VA mappings
Date: Thu, 6 Jul 2023 18:17:39 +0200	[thread overview]
Message-ID: <20230706181739.57afbcfa@collabora.com> (raw)
In-Reply-To: <755b3aeb-8067-2fa5-5173-d889811e954a@redhat.com>

On Thu, 6 Jul 2023 17:06:08 +0200
Danilo Krummrich <dakr@redhat.com> wrote:

> Hi Boris,
> 
> On 6/30/23 10:02, Boris Brezillon wrote:
> > Hi Danilo,
> > 
> > On Fri, 30 Jun 2023 00:25:18 +0200
> > Danilo Krummrich <dakr@redhat.com> wrote:
> >   
> >> + *	int driver_gpuva_remap(struct drm_gpuva_op *op, void *__ctx)
> >> + *	{
> >> + *		struct driver_context *ctx = __ctx;
> >> + *
> >> + *		drm_gpuva_remap(ctx->prev_va, ctx->next_va, &op->remap);
> >> + *
> >> + *		drm_gpuva_unlink(op->remap.unmap->va);
> >> + *		kfree(op->remap.unmap->va);
> >> + *
> >> + *		if (op->remap.prev) {
> >> + *			drm_gpuva_link(ctx->prev_va);  
> > 
> > I ended up switching to dma_resv-based locking for the GEMs and I
> > wonder what the locking is supposed to look like in the async-mapping
> > case, where we insert/remove the VA nodes in the drm_sched::run_job()
> > path.  
> 
> If you decide to pick the interface where you just call 
> drm_gpuva_sm_[un]map() and receive a callback for each operation it 
> takes to fulfill the request, you probably do this because you want to 
> do everything one shot, updating the VA space, link/unlink GPUVAs 
> to/from its corresponding backing GEMs, do the actual GPU mappings.
> 
> This has a few advantages over generating a list of operations when the 
> job is submitted. You've pointed out one of them, when you noticed that 
> with a list of operations one can't sneak in a synchronous job between 
> already queued up asynchronous jobs.
> 
> However, for the asynchronous path it has the limitation that the 
> dma-resv lock can't be used to link/unlink GPUVAs to/from its 
> corresponding backing GEMs, since this would happen in the fence 
> signalling critical path and we're not allowed to hold the dma-resv lock 
> there. Hence, as we discussed I added the option for drivers to provide 
> an external lock for that, just to be able to keep some lockdep checks.

Uh, okay, I guess that means I need to go back to a custom lock for VM
operations then.

> 
> > 
> > What I have right now is something like:
> > 
> > 	dma_resv_lock(vm->resv);
> > 
> > 	// split done in drm_gpuva_sm_map(), each iteration
> > 	// of the loop is a call to the driver ->[re,un]map()
> > 	// hook
> > 	for_each_sub_op() {
> > 		
> > 		// Private BOs have their resv field pointing to the
> > 		// VM resv and we take the VM resv lock before calling
> > 		// drm_gpuva_sm_map()
> > 		if (vm->resv != gem->resv)
> > 			dma_resv_lock(gem->resv);
> > 
> > 		drm_gpuva_[un]link(va);
> > 		gem_[un]pin(gem);
> > 
> > 		if (vm->resv != gem->resv)
> > 			dma_resv_unlock(gem->resv);
> > 	}
> > 
> > 	dma_resv_unlock(vm->resv);
> >   
> 
> I'm not sure I get this code right, reading "for_each_sub_op()" and 
> "drm_gpuva_sm_map()" looks a bit like things are mixed up?
> 
> Or do you mean to represent the sum of all callbacks with 
> "for_each_sub_op()"?

That ^.

> In this case I assume this code runs in 
> drm_sched::run_job() and hence isn't allowed to take the dma-resv lock.

Yeah, I didn't realize that taking the dma-resv lock in the
dma-signaling path was forbidden. I think it's fine for the drm_gpuva
destroy operation (which calls drm_gem_shmem_unpin(), which in turns
acquires the resv lock) because I can move that to a worker and get it
out of the dma-signaling path. The problem remains for remap operations
though. I need to call drm_gem_shmem_pin() so we retain the pages even
after the unmapped gpuva object that's in the middle of a mapping is
released. I guess one option would be to use an atomic_t for
drm_shmem_gem_object::pages_use_count, and
have something like:

int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
{
	int ret;

	if (atomic_inc_not_zero(&shmem->pages_use_count))
		return 0;

	dma_resv_lock(shmem->base.resv, NULL);
	ret = drm_gem_shmem_pin_locked(shmem);
	dma_resv_unlock(shmem->base.resv);

	return ret;
}

Given the object already had its pages pinned when we remap, we're sure
the fast path will be taken, and no dma-resv lock aquired.

> 
> > In practice, I don't expect things to deadlock, because the VM resv is
> > not supposed to be taken outside the VM context and the locking order
> > is always the same (VM lock first, and then each shared BO
> > taken/released independently), but I'm not super thrilled by this
> > nested lock, and I'm wondering if we shouldn't have a pass collecting
> > locks in a drm_exec context first, and then have
> > the operations executed. IOW, something like that:
> > 
> > 	drm_exec_init(exec, DRM_EXEC_IGNORE_DUPLICATES)
> > 	drm_exec_until_all_locked(exec) {
> > 		// Dummy GEM is the dummy GEM object I use to make the VM
> > 		// participate in the locking without having to teach
> > 		// drm_exec how to deal with raw dma_resv objects.
> > 		ret = drm_exec_lock_obj(exec, vm->dummy_gem);
> > 		drm_exec_retry_on_contention(exec);
> > 		if (ret)
> > 			return ret;
> > 
> > 		// Could take the form of drm_gpuva_sm_[un]map_acquire_locks()
> > 		// helpers
> > 		for_each_sub_op() {
> > 			ret = drm_exec_lock_obj(exec, gem);
> > 			if (ret)
> > 				return ret;
> > 		}
> > 	}
> > 
> > 	// each iteration of the loop is a call to the driver
> > 	// ->[re,un]map() hook
> > 	for_each_sub_op() {
> > 		...
> > 		gem_[un]pin_locked(gem);
> > 		drm_gpuva_[un]link(va);
> > 		...
> > 	}
> > 
> > 	drm_exec_fini(exec);  
> 
> I have a follow-up patch (still WIP) in the queue to generalize dma-resv 
> handling, fence handling and GEM validation within the GPUVA manager as 
> optional helper functions: 
> https://gitlab.freedesktop.org/nouvelles/kernel/-/commit/a5fc29f3b1edbf3f96fb5a21b858ffe00a3f2584

Thanks for the heads-up. That's more or less what I have, except I'm
attaching a dummy_gem object to the VM so it can be passed to drm_exec
directly (instead of having a separate ww_acquire_ctx).

next prev parent reply	other threads:[~2023-07-06 16:17 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-29 22:25 [PATCH drm-next v6 00/13] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI Danilo Krummrich
2023-06-29 22:25 ` Danilo Krummrich
2023-06-29 22:25 ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 01/13] drm: execution context for GEM buffers v5 Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 02/13] drm: manager to keep track of GPUs VA mappings Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-30  8:02   ` Boris Brezillon
2023-06-30  8:02     ` Boris Brezillon
2023-06-30  8:09     ` Boris Brezillon
2023-06-30  8:09       ` Boris Brezillon
2023-06-30  9:40     ` Boris Brezillon
2023-06-30  9:40       ` Boris Brezillon
2023-07-06 15:06     ` Danilo Krummrich
2023-07-06 15:06       ` Danilo Krummrich
2023-07-06 15:06       ` [Nouveau] " Danilo Krummrich
2023-07-06 16:17       ` Boris Brezillon [this message]
2023-07-06 16:17         ` Boris Brezillon
2023-07-06  8:49   ` Thomas Hellström (Intel)
2023-07-06  8:49     ` [Nouveau] " Thomas Hellström (Intel)
2023-07-06 15:48     ` Danilo Krummrich
2023-07-06 15:48       ` Danilo Krummrich
2023-07-06 15:48       ` Danilo Krummrich
2023-07-06 17:30       ` [Nouveau] " Thomas Hellström (Intel)
2023-07-06 17:30         ` Thomas Hellström (Intel)
2023-07-06 17:30         ` Thomas Hellström (Intel)
2023-07-06 15:45   ` Donald Robson
2023-07-06 15:45     ` Donald Robson
2023-07-06 15:45     ` [Nouveau] " Donald Robson
2023-07-06 15:52     ` Danilo Krummrich
2023-07-06 15:52       ` Danilo Krummrich
2023-07-06 15:52       ` [Nouveau] " Danilo Krummrich
2023-07-06 18:26   ` Boris Brezillon
2023-07-06 18:26     ` Boris Brezillon
2023-07-07  7:57     ` Boris Brezillon
2023-07-07  7:57       ` Boris Brezillon
2023-07-07 12:32       ` Danilo Krummrich
2023-07-07 12:32         ` Danilo Krummrich
2023-07-07 12:32         ` [Nouveau] " Danilo Krummrich
2023-07-07 11:00   ` Boris Brezillon
2023-07-07 11:00     ` Boris Brezillon
2023-07-07 12:41     ` Danilo Krummrich
2023-07-07 12:41       ` Danilo Krummrich
2023-07-07 12:41       ` [Nouveau] " Danilo Krummrich
2023-07-07 12:52       ` Boris Brezillon
2023-07-07 12:52         ` Boris Brezillon
2023-07-08  6:39         ` Matthew Brost
2023-07-08  6:39           ` Matthew Brost
2023-07-08  6:39           ` [Nouveau] " Matthew Brost
2023-06-29 22:25 ` [PATCH drm-next v6 03/13] drm: debugfs: provide infrastructure to dump a DRM GPU VA space Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 04/13] drm/nouveau: new VM_BIND uapi interfaces Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 05/13] drm/nouveau: get vmm via nouveau_cli_vmm() Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 06/13] drm/nouveau: bo: initialize GEM GPU VA interface Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 07/13] drm/nouveau: move usercopy helpers to nouveau_drv.h Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 08/13] drm/nouveau: fence: separate fence alloc and emit Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 09/13] drm/nouveau: fence: fail to emit when fence context is killed Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 10/13] drm/nouveau: chan: provide nouveau_channel_kill() Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 11/13] drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 12/13] drm/nouveau: implement new VM_BIND uAPI Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich
2023-06-29 22:25 ` [PATCH drm-next v6 13/13] drm/nouveau: debugfs: implement DRM GPU VA debugfs Danilo Krummrich
2023-06-29 22:25   ` Danilo Krummrich
2023-06-29 22:25   ` [Nouveau] " Danilo Krummrich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230706181739.57afbcfa@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=airlied@gmail.com \
    --cc=airlied@redhat.com \
    --cc=alexdeucher@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=bskeggs@redhat.com \
    --cc=christian.koenig@amd.com \
    --cc=corbet@lwn.net \
    --cc=dakr@redhat.com \
    --cc=daniel@ffwll.ch \
    --cc=donald.robson@imgtec.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jason@jlekstrand.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew.brost@intel.com \
    --cc=mripard@kernel.org \
    --cc=nouveau@lists.freedesktop.org \
    --cc=ogabbay@kernel.org \
    --cc=tzimmermann@suse.de \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.