From: Danilo Krummrich <dakr@redhat.com>
To: Boris Brezillon <boris.brezillon@collabora.com>
Cc: matthew.brost@intel.com, airlied@gmail.com, daniel@ffwll.ch,
tzimmermann@suse.de, mripard@kernel.org, corbet@lwn.net,
christian.koenig@amd.com, bskeggs@redhat.com,
Liam.Howlett@oracle.com, alexdeucher@gmail.com,
ogabbay@kernel.org, bagasdotme@gmail.com, willy@infradead.org,
jason@jlekstrand.net, dri-devel@lists.freedesktop.org,
nouveau@lists.freedesktop.org, linux-doc@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
Date: Thu, 22 Jun 2023 15:58:23 +0200
Message-ID: <b04b3dbb-0509-fec1-4e8e-90f724443836@redhat.com>
In-Reply-To: <20230622150101.229391e5@collabora.com>
Hi Boris,
On 6/22/23 15:01, Boris Brezillon wrote:
> Hi Danilo,
>
> On Tue, 20 Jun 2023 14:46:07 +0200
> Danilo Krummrich <dakr@redhat.com> wrote:
>
>>> The only thing I'm worried about is the 'sync mapping requests have to
>>> go through the async path and wait for all previous async requests to
>>> be processed' problem I mentioned in one of your previous submissions,
>>> but I'm happy to leave that for later.
>>
>> Yes, I'm aware of this limitation.
>>
>> Let me quickly try to explain where this limitation comes from and how I
>> intend to address it.
>>
>> In order to be able to allocate the required page tables for a mapping
>> request, and in order to free the corresponding page tables once the
>> (async) job has finished, I need to know the corresponding sequence of
>> operations (drm_gpuva_ops) to fulfill the mapping request.
>>
>> This requires me to update the GPUVA space in the ioctl() rather than in
>> the async stage: calculating the sequence of operations for a mapping
>> request requires an up-to-date view of the GPUVA space, so doing it in
>> the async stage would mean waiting for previous jobs to finish before
>> being able to submit subsequent jobs to the job queue.
>>
>> As a consequence, all jobs need to be processed in the order they were
>> submitted, including synchronous jobs.
>>
>> @Matt: I think you will have the same limitation with synchronous jobs,
>> since your implementation in XE should be similar?
>>
>> In order to address this, I want to switch to using callbacks rather than
>> 'pre-allocated' drm_gpuva_ops and update the GPUVA space within the
>> asynchronous stage. This would allow me to 'fit' synchronous jobs
>> between jobs waiting in the async job queue. However, to do this I have
>> to re-work how page table handling is implemented in Nouveau, since
>> this would require me to be able to manage the page tables without
>> knowing the exact sequence of operations needed to fulfill a mapping
>> request.
>
> Ok, so I think that's more or less what we're trying to do right
> now in PowerVR.
>
> - First, we make sure we reserve enough MMU page tables for a given map
> operation to succeed no matter the VM state in the VM_BIND job
> submission path (our VM_BIND ioctl). That means we're always
> over-provisioning and returning unused memory back when the operation
> is done if we end up using less memory.
> - We pre-allocate for the maple-tree insertions.
> - Then we map using drm_gpuva_sm_map() and the callbacks we provided in
> the drm_sched::run_job() path. We guarantee that no memory is
> allocated in that path thanks to the pre-allocation/reservation we've
> done at VM_BIND job submission time.
>
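That is pretty much the split I have in mind as well. Just to make sure
we're talking about the same thing, here is a minimal sketch of how I
picture it - the drm_gpuva_sm_map() call reflects how I understand the
interface in this series, everything prefixed with my_* is made-up
driver glue:

/* VM_BIND ioctl / job submission path: may allocate and sleep. */
static int my_vm_bind_job_submit(struct my_vm *vm, struct my_bind_job *job)
{
	int ret;

	/*
	 * Over-provision page table memory for the worst case; whatever
	 * ends up unused is returned once the job finished.
	 */
	ret = my_vm_reserve_pt_memory(vm, job->addr, job->range);
	if (ret)
		return ret;

	/* Pre-allocate the nodes for the later GPUVA space insertion. */
	return my_vm_prealloc_va_space(vm, job);
}

/* drm_sched::run_job() / fence signalling path: must not allocate. */
static int my_vm_bind_job_run(struct my_bind_job *job)
{
	/* The callbacks only consume the pre-allocated memory. */
	return drm_gpuva_sm_map(&job->vm->mgr, job, job->addr,
				job->range, job->obj, job->offset);
}
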
> The problem I see with this v5 is that:
>
> 1/ We now have a dma_resv_lock_held() in drm_gpuva_{link,unlink}(),
> which, in our case, is called in the async drm_sched::run_job() path,
> and we don't hold the lock in that path (it's been released just
> after the job submission).
My solution to this, as of now, is - in the same way we pre-allocate -
to just pre-link and pre-unlink, and then fix things up in the cleanup
path.
However, depending on the driver, this might require you to set a flag
in the driver-specific structure (embedding struct drm_gpuva) indicating
whether the gpuva is actually mapped (as in, has active page table
entries).
Maybe we could also just add such a flag to struct drm_gpuva. But yeah,
doesn't sound too nice to be honest...
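To make that a bit more concrete, a rough sketch for the map case (the
unmap case would be analogous) - struct my_gpuva and the 'mapped' flag
are made up, drm_gpuva_link()/drm_gpuva_unlink() are meant to be the
helpers from this series:

struct my_gpuva {
	struct drm_gpuva base;
	/*
	 * Set once the page table entries for this mapping actually
	 * exist; entries that were pre-linked but never made it to the
	 * page tables are dropped again in the cleanup path.
	 */
	bool mapped;
};

/*
 * Job submission (ioctl): the dma-resv lock is held anyway, hence
 * pre-link the new mapping even though no page tables are touched yet.
 */
static void my_gpuva_prelink(struct my_gpuva *va)
{
	va->mapped = false;
	drm_gpuva_link(&va->base);
}

/* run_job(): set va->mapped = true once the page tables are in place. */

/* Cleanup path: unlink whatever never became a real mapping. */
static void my_gpuva_cleanup(struct my_gpuva *va)
{
	if (!va->mapped)
		drm_gpuva_unlink(&va->base);
}
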
> 2/ I'm worried that Liam's plan to only reserve what's actually needed
> based on the maple tree state is going to play against us, because
> the maple tree is only modified at job exec time, and we might have
> several unmaps happening between the moment we created and queued the
> jobs, and the moment they actually get executed, meaning the
> maple-tree reservation might no longer fit the bill.
Yes, I'm aware and I explained to Liam in detail why we need the
mas_preallocate_worst_case() way of doing it.
See this mail:
https://lore.kernel.org/nouveau/68cd25de-e767-725e-2e7b-703217230bb0@redhat.com/T/#ma326e200b1de1e3c9df4e9fcb3bf243061fee8b5
He hasn't answered yet, but I hope we can just get (or actually keep)
such a function (hopefully with better naming), since it shouldn't
interfere with anything else.
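Just to illustrate how I'd like to use it - mas_preallocate_worst_case()
is only the proposed name from the mail above and doesn't exist (yet),
so its signature here is simply what I'd imagine it to look like;
mas_store_prealloc() and mas_destroy() are the existing maple tree
helpers:

/*
 * ioctl / job submission: the tree may still change before the job
 * runs, so reserve nodes for the worst case, independent of the
 * current tree state.
 */
static int my_va_insert_prepare(struct ma_state *mas, unsigned int nr_entries)
{
	return mas_preallocate_worst_case(mas, nr_entries, GFP_KERNEL);
}

/*
 * run_job() / fence signalling path: only consume what has been
 * pre-allocated, no allocations allowed here.
 */
static void my_va_insert_commit(struct ma_state *mas, void *va)
{
	mas_store_prealloc(mas, va);
}

/* free_job(): hand back the nodes we did not end up using. */
static void my_va_insert_finish(struct ma_state *mas)
{
	mas_destroy(mas);
}
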
>
> For issue #1, it shouldn't be too problematic if we use a regular lock to
> insert to/remove from the GEM gpuva list.
Yes, that's why I had a separate GEM gpuva list lock in the first place.
However, this doesn't really work when generating ops rather than using
the callback interface.
Have a look at drm_gpuva_gem_unmap_ops_create(), requested by Matt for
XE. This function generates drm_gpuva_ops to unmap all mappings of a
given GEM. In order to do that, the function must iterate the GEM's gpuva
list and allocate operations for each mapping. As a consequence, a
separate gpuva list lock could no longer be taken in the fence signalling
path (run_job()). Hence, we can just protect the list with the GEM's
dma-resv lock instead.
However, I can understand that this might be inconvenient for the
callback interface, and admittedly my solution to it isn't that nice
either. Hence the following idea:
For drivers to be able to use their own lock for this, it would be enough
to get rid of the lockdep checks. We could just add a flag to the GPUVA
manager to let the driver indicate it wants to do its own locking for
the GPUVA list, and skip the lockdep checks for the dma-resv lock in that
case.
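Roughly like this - the flag name is made up, the flags field in the
manager would be new as well, and I'm paraphrasing the link() helper
from memory, so don't take the field names too literally:

/* Made-up flag name, set by the driver at manager init time. */
#define DRM_GPUVA_MANAGER_EXTERNAL_GEM_LIST_LOCK	BIT(0)

void drm_gpuva_link(struct drm_gpuva *va)
{
	struct drm_gem_object *obj = va->gem.obj;

	if (!obj)
		return;

	/*
	 * Only insist on the dma-resv lock if the driver didn't opt in
	 * to protect the GEM's gpuva list with its own lock.
	 */
	if (!(va->mgr->flags & DRM_GPUVA_MANAGER_EXTERNAL_GEM_LIST_LOCK))
		dma_resv_assert_held(obj->resv);

	list_add_tail(&va->gem.entry, &obj->gpuva.list);
}
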
>
> For issue #2, I can see a way out if, instead of freeing gpuva nodes,
> we flag those as unused when we see that something happening later in
> the queue is going to map a section being unmapped. All of this implies
> keeping access to already queued VM_BIND jobs (using the spsc queue at
> the entity level is not practical), and iterating over them every time
> a new sync or async job is queued to flag what needs to be retained. It
> would obviously be easier if we could tell the maple-tree API
> 'provision as if the tree was empty', so all we have to do is just
> over-provision for both the page tables and maple-tree insertion, and
> free the unused mem when the operation is done.
>
> Don't know if you already thought about that and/or have solutions to
> solve these issues.
As already mentioned above, I'd just expect we can keep doing it the
over-provision way, as you say. I think it's a legit use case to not
know the state of the maple tree at the time the pre-allocated nodes
will be used, and keeping that should not interfere with Liam's plan to
(hopefully separately) optimize for the pre-allocation use case they
have within -mm.
But let's wait for his take on that.
- Danilo
>
> Regards,
>
> Boris
>