All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boris Brezillon <boris.brezillon@collabora.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: Sarah Walker <Sarah.Walker@imgtec.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"luben.tuikov@amd.com" <luben.tuikov@amd.com>,
	Donald Robson <Donald.Robson@imgtec.com>,
	"sumit.semwal@linaro.org" <sumit.semwal@linaro.org>
Subject: Re: [PATCH] drm/sched: Add native dependency support to drm_sched
Date: Mon, 12 Jun 2023 16:59:02 +0200	[thread overview]
Message-ID: <20230612165902.437345c4@collabora.com> (raw)
In-Reply-To: <5fcf04fa-1f45-30dd-1a37-cb0eba6d5f4a@amd.com>

Hi Christian,

On Mon, 12 Jun 2023 15:16:03 +0200
Christian König <christian.koenig@amd.com> wrote:

> Am 08.06.23 um 15:23 schrieb Donald Robson:
> > This patch adds support for 'native' dependencies to DRM scheduler.  In
> > drivers that use a firmware based scheduler there are performance gains
> > to be had by allowing waits to happen in the firmware, as this reduces
> > the latency between signalling and job submission.  
> 
> Well, if I'm not completely mistaken this patch here is superfluous 
> since we already use that functionality.
> 
> This strongly sounds like the HW dependencies we have in amdgpu. See 
> AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES.

I'll look at it in more details. Thanks for the pointer.

> 
> Basically userspace can instead of giving a hard dependency to finish 
> something before the current submission starts also give a soft 
> dependency and only let the other submission be scheduled.
> 
> This way you can then wait for the firmware for certain operations of 
> the previous submission to complete by comparing memory or registers.
> 
> You don't necessarily need to give control over this to userspace, if 
> your kernel driver can determine a fw assisted wait by itself that 
> should also work fine.

That's what we did initially. We had a separate 'native_deps' xarray in
pvr_job that we were managing ourselves, and that worked fine, except
for the kill_entity() stuff. If you don't wait for those
'native-fences', you're potentially signaling the job finished fence
earlier than it should.

Just to make sure we're on the same page, the native fences we
have here are really dma_fences that can be waited upon FW side:
they're not exposed to userspace, the GPU can't access the memory
region containing the counter (it's visible to the FW VM only, and
a kernel side CPU mapping), and we do make sure they signal in finite
time thanks to the job timeout. Feels a bit different compared to
USER_FENCEs most GPUs have nowadays, on which you don't have this
isolation guarantee, and which, AFAIU, are currently used to do some
advanced userspace driven scheduling between queues belonging to the
same context. My understanding, after discussing it with Daniel a few
weeks back, was that exposing USER_FENCEs as dma_fences was risky,
especially if they're used to do inter-context synchronization,
but the FW-visible-only ones were okay to expose as dma_fences. Maybe I
misunderstood what he suggested.

I'm done with this digression, now back to the original topic: we can of
course wait for all those native fences before calling
drm_sched_enity_destroy(), but that's a bit weird to do some partial
wait in the driver while the entity is still active (pretty sure that's
racy anyway), and then delegate the rest to the core.

If we decide we don't care about waiting for native fences when
killing jobs in the kill_entity() path, because we assume drm_resv is
covering us, that's fine, but then I don't really see why
drm_sched_kill_entity() should wait at all, because this 'should wait,
but maybe not for all your deps' behavior is quite confusing.

Regards,

Boris

  reply	other threads:[~2023-06-12 14:59 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-08 13:23 [PATCH] drm/sched: Add native dependency support to drm_sched Donald Robson
2023-06-09  9:29 ` Donald Robson
2023-06-12 13:05 ` Boris Brezillon
2023-06-12 13:16 ` Christian König
2023-06-12 14:59   ` Boris Brezillon [this message]
2023-06-12 16:25     ` Boris Brezillon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230612165902.437345c4@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=Donald.Robson@imgtec.com \
    --cc=Sarah.Walker@imgtec.com \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=luben.tuikov@amd.com \
    --cc=sumit.semwal@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.