From: Boris Brezillon <boris.brezillon@collabora.com>
To: Liviu Dudau <liviu.dudau@arm.com>
Cc: "Steven Price" <steven.price@arm.com>,
"Adrián Larumbe" <adrian.larumbe@collabora.com>,
dri-devel@lists.freedesktop.org, kernel@collabora.com
Subject: Re: [PATCH 2/2] drm/panthor: Fix sync-only jobs
Date: Mon, 1 Jul 2024 08:59:53 +0200 [thread overview]
Message-ID: <20240701085953.32f17425@collabora.com> (raw)
In-Reply-To: <Zn_5uyOcGc8bz-D0@e110455-lin.cambridge.arm.com>
On Sat, 29 Jun 2024 13:10:35 +0100
Liviu Dudau <liviu.dudau@arm.com> wrote:
> On Sat, Jun 29, 2024 at 10:52:56AM +0200, Boris Brezillon wrote:
> > On Fri, 28 Jun 2024 22:34:57 +0100
> > Liviu Dudau <liviu.dudau@arm.com> wrote:
> >
> > > On Fri, Jun 28, 2024 at 04:55:36PM +0200, Boris Brezillon wrote:
> > > > A sync-only job is meant to provide a synchronization point on a
> > > > queue, so we can't return a NULL fence there, we have to add a signal
> > > > operation to the command stream which executes after all other
> > > > previously submitted jobs are done.
> > > >
> > > > Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
> > > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > >
> > > Took me a bit longer to read, lets blame Friday.
> > >
> > > > ---
> > > > drivers/gpu/drm/panthor/panthor_sched.c | 41 ++++++++++++++++++++-----
> > > > include/uapi/drm/panthor_drm.h | 5 +++
> > > > 2 files changed, 38 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> > > > index 79ffcbc41d78..951ff7e63ea8 100644
> > > > --- a/drivers/gpu/drm/panthor/panthor_sched.c
> > > > +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> > > > @@ -458,6 +458,16 @@ struct panthor_queue {
> > > > /** @seqno: Sequence number of the last initialized fence. */
> > > > atomic64_t seqno;
> > > >
> > > > + /**
> > > > + * @last_fence: Fence of the last submitted job.
> > > > + *
> > > > + * We return this fence when we get an empty command stream.
> > > > + * This way, we are guaranteed that all earlier jobs have completed
> > > > + * when drm_sched_job::s_fence::finished without having to feed
> > > > + * the CS ring buffer with a dummy job that only signals the fence.
> > > > + */
> > > > + struct dma_fence *last_fence;
> > > > +
> > > > /**
> > > > * @in_flight_jobs: List containing all in-flight jobs.
> > > > *
> > > > @@ -829,6 +839,9 @@ static void group_free_queue(struct panthor_group *group, struct panthor_queue *
> > > > panthor_kernel_bo_destroy(queue->ringbuf);
> > > > panthor_kernel_bo_destroy(queue->iface.mem);
> > > >
> > > > + /* Release the last_fence we were holding, if any. */
> > > > + dma_fence_put(queue->fence_ctx.last_fence);
> > > > +
> > > > kfree(queue);
> > > > }
> > > >
> > > > @@ -2865,11 +2878,14 @@ queue_run_job(struct drm_sched_job *sched_job)
> > > > static_assert(sizeof(call_instrs) % 64 == 0,
> > > > "call_instrs is not aligned on a cacheline");
> > > >
> > > > - /* Stream size is zero, nothing to do => return a NULL fence and let
> > > > - * drm_sched signal the parent.
> > > > + /* Stream size is zero, nothing to do except making sure all previously
> > > > + * submitted jobs are done before we signal the
> > > > + * drm_sched_job::s_fence::finished fence.
> > > > */
> > > > - if (!job->call_info.size)
> > > > - return NULL;
> > > > + if (!job->call_info.size) {
> > > > + job->done_fence = dma_fence_get(queue->fence_ctx.last_fence);
> > > > + return job->done_fence;
> > >
> > > What happens if the last job's done_fence was cancelled or timed out? Is the
> > > sync job's done_fence going to be signalled with the same error?
> >
> > It's the same object, so yes, the job will also be considered faulty
> > (the error propagated to the job::s_fence::finished fence). I guess
> > synchronization jobs are not supposed to fail/timeout in theory, because
> > they don't do anything, but I don't think that's an issue in
> > practice, because dma_fence errors are never propagated to user-space
> > (only the queue status is).
> >
> > >
> > > Now that we're returning a fence here, should the job be also added into the
> > > in_flight_jobs?
> >
> > Yeah, that's done on purpose, such that we don't end up signalling the
> > same dma_fence object twice (which is forbidden).
>
> That's the thing I was trying to figure out on Friday: how do we stop the fence
> returned as last_fence for the sync job to be signalled after the job's done_fence
> has also been signalled. I can't say that I found a good answer, if you can point
> me to what I've missed it will be appreciated.
Well, there's only just one job that's added to the in_flight for a
given done_fence object, so there's no risk of signaling such a fence
more than once (even in case of a timeout). Now, let's say the
previous 'real' job is done when this 'dummy/sync' job queued, the
queue::fence_ctx::last_fence object will be signalled too, since this
is essentially the same object, and when we return a signalled fence to
the drm_sched layer in our ::run_job() callback, it just signals the
finished drm_sched_job::s_fence::finished immediately, just like when
you return a NULL fence.
next prev parent reply other threads:[~2024-07-01 7:00 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-28 14:55 [PATCH 0/2] drm/panthor: Fix support for sync-only jobs Boris Brezillon
2024-06-28 14:55 ` [PATCH 1/2] drm/panthor: Don't check the array stride on empty uobj arrays Boris Brezillon
2024-06-28 19:19 ` Liviu Dudau
2024-07-01 9:03 ` Steven Price
2024-06-28 14:55 ` [PATCH 2/2] drm/panthor: Fix sync-only jobs Boris Brezillon
2024-06-28 21:34 ` Liviu Dudau
2024-06-29 8:52 ` Boris Brezillon
2024-06-29 12:10 ` Liviu Dudau
2024-07-01 6:59 ` Boris Brezillon [this message]
2024-07-01 9:03 ` Steven Price
2024-07-01 11:02 ` kernel test robot
2024-07-02 15:13 ` Boris Brezillon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240701085953.32f17425@collabora.com \
--to=boris.brezillon@collabora.com \
--cc=adrian.larumbe@collabora.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=kernel@collabora.com \
--cc=liviu.dudau@arm.com \
--cc=steven.price@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.