From: Lucas Stach <l.stach-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
To: "Christian König" <christian.koenig-5C7GfCeVMHo@public.gmane.org>,
	"Andrey Grodzovsky" <Andrey.Grodzovsky-5C7GfCeVMHo@public.gmane.org>,
	"Zhou, David(ChunMing)" <David1.Zhou-5C7GfCeVMHo@public.gmane.org>,
	"amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org" <amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>,
	"dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org" <dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Subject: Re: [PATCH 2/2] drm/scheduler: Remove obsolete spinlock.
Date: Wed, 16 May 2018 14:28:33 +0200 [thread overview]
Message-ID: <1526473713.10230.1.camel@pengutronix.de> (raw)
In-Reply-To: <0be3bb9a-004a-7390-469c-3f974ed44055-5C7GfCeVMHo@public.gmane.org>
On Wednesday, 16 May 2018 at 14:08 +0200, Christian König wrote:
> Yes, exactly.
> 
> For normal user space command submission we should have tons of locks
> guaranteeing that (e.g. just the VM lock should do).
> 
> For kernel moves we have the mutex for the GTT windows which protects
> it.
> 
> There could be problems with the UVD/VCE queues to clean up the
> handles when an application crashes.
FWIW, etnaviv is currently completely unlocked in this path, but I
believe that this isn't an issue, as the sched fence seqnos are per
entity. So to actually end up with reversed seqnos, one context would
have to preempt itself to do another submit while the current one
hasn't yet returned from kernel space, which I believe is a fairly
theoretical issue. Is my understanding correct?
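The reversal in question can be sketched with a toy simulation (plain
Python, not kernel code; fence_create/push_job merely mirror the roles
of drm_sched_fence_create() and drm_sched_entity_push_job(), and the
bad interleaving of two concurrent submitters is forced
deterministically rather than left to real threads):

```python
# Toy model: seqno is assigned at fence creation time, but the job
# only becomes visible to the scheduler when it is pushed. Without a
# lock spanning both steps, the push order can invert the seqno order.

class Entity:
    def __init__(self):
        self.fence_seq = 0   # atomic counter in the real code
        self.job_queue = []  # spsc queue in the real code

def fence_create(entity):
    entity.fence_seq += 1    # seqno handed out here ...
    return entity.fence_seq

def push_job(entity, seqno):
    entity.job_queue.append(seqno)  # ... but only queued here

entity = Entity()
# Forced interleaving of two submitters A and B on the same entity:
seq_a = fence_create(entity)  # A gets seqno 1
seq_b = fence_create(entity)  # B gets seqno 2
push_job(entity, seq_b)       # B reaches the queue first ...
push_job(entity, seq_a)       # ... so the queue holds reversed seqnos

print(entity.job_queue)  # [2, 1] -- completion order is now wrong
```

With a single submitter per entity (the per-entity argument above),
the two steps can never interleave this way, which is why the
spinlock around the push alone would not have fixed anything anyway.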
Regards,
Lucas
>
> On 16.05.2018 13:47, Andrey Grodzovsky wrote:
> > So are you saying that you expect this to already be in the code
> > for any usage of drm_sched_fence_create and
> > drm_sched_entity_push_job?
> > 
> > lock()
> > drm_sched_fence_create()
> > ... (some code)
> > drm_sched_entity_push_job()
> > unlock()
> >
> >
> > Andrey
> >
> > On 05/16/2018 07:23 AM, Christian König wrote:
> > > drm_sched_fence_create() assigns a sequence number to the fence
> > > it creates.
> > > 
> > > Now drm_sched_fence_create() is called by drm_sched_job_init() to
> > > initialize the jobs we want to push on the scheduler queue.
> > > 
> > > When you now call drm_sched_entity_push_job() without a
> > > protecting lock, it can happen that you push two (or even more)
> > > jobs with reversed sequence numbers.
> > > 
> > > Since the sequence numbers are used to determine job completion
> > > order, reversing them can seriously mess things up.
> > > 
> > > So the spinlock should be superfluous; if it isn't, we have a
> > > much larger bug we need to fix.
> > > 
> > > Christian.
> > >
> > > On 16.05.2018 13:15, Andrey Grodzovsky wrote:
> > > > Can you please elaborate more, maybe give an example? I don't
> > > > yet understand the problematic scenario.
> > > >
> > > > Andrey
> > > >
> > > >
> > > > On 05/16/2018 02:50 AM, Christian König wrote:
> > > > > No, that spinlock is indeed incorrect.
> > > > > 
> > > > > See, even when we protect the spsc queue with a spinlock,
> > > > > that doesn't make it correct. It can happen that the jobs
> > > > > pushed to the queue are reversed in their sequence order, and
> > > > > that can cause severe problems in the memory management.
> > > > > 
> > > > > Christian.
> > > > >
> > > > > On 16.05.2018 05:33, Grodzovsky, Andrey wrote:
> > > > > > Yeah, that's what I am not sure about... It's lockless in
> > > > > > the sense of single producer, single consumer, but not for
> > > > > > multiple concurrent producers... So now I think this
> > > > > > spinlock should stay there... It just looked useless to me
> > > > > > at first sight...
> > > > > >
> > > > > > Andrey
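The single-producer limitation Andrey refers to can be illustrated
with a toy queue (deliberately not the kernel's spsc_queue, whose push
path may well use atomic operations; this naive array version only
shows the classic lost-update that an SPSC design does not guard
against, with the bad interleaving forced deterministically):

```python
# Toy SPSC-style queue: the tail index is written by exactly one
# producer by design, so its push is a plain read-modify-write.

class NaiveSPSCQueue:
    def __init__(self, size=8):
        self.buf = [None] * size
        self.tail = 0  # owned by the single producer

    def push(self, item, staged_tail=None):
        # A push is "read tail, store item, bump tail". The optional
        # staged_tail freezes the read step so we can model a second
        # producer racing in between.
        t = self.tail if staged_tail is None else staged_tail
        self.buf[t] = item
        self.tail = t + 1

q = NaiveSPSCQueue()
t = q.tail                   # producer A reads tail (0) ...
q.push("B")                  # ... producer B pushes into slot 0 ...
q.push("A", staged_tail=t)   # ... A pushes into its stale slot 0

print(q.buf[:2], q.tail)  # ['A', None] 1 -- B's item was overwritten
```

With a single producer the read and the bump can never be separated by
another push, so no synchronization is needed on the producer side.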
> > > > > >
> > > > > > ________________________________________
> > > > > > From: Zhou, David(ChunMing)
> > > > > > Sent: 15 May 2018 23:04:44
> > > > > > To: Grodzovsky, Andrey; amd-gfx@lists.freedesktop.org;
> > > > > > dri-devel@lists.freedesktop.org
> > > > > > Cc: Koenig, Christian
> > > > > > Subject: Re: [PATCH 2/2] drm/scheduler: Remove obsolete
> > > > > > spinlock.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 2018-05-16 03:31, Andrey Grodzovsky wrote:
> > > > > > > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > > > > > ---
> > > > > > >   drivers/gpu/drm/scheduler/gpu_scheduler.c | 4 ----
> > > > > > >   include/drm/gpu_scheduler.h               | 1 -
> > > > > > >   2 files changed, 5 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > > > > index 1f1dd70..2569a63 100644
> > > > > > > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > > > > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > > > > @@ -140,7 +140,6 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
> > > > > > >  	entity->last_scheduled = NULL;
> > > > > > >  
> > > > > > >  	spin_lock_init(&entity->rq_lock);
> > > > > > > -	spin_lock_init(&entity->queue_lock);
> > > > > > >  	spsc_queue_init(&entity->job_queue);
> > > > > > >  
> > > > > > >  	atomic_set(&entity->fence_seq, 0);
> > > > > > > @@ -424,11 +423,8 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
> > > > > > >  
> > > > > > >  	trace_drm_sched_job(sched_job, entity);
> > > > > > >  
> > > > > > > -	spin_lock(&entity->queue_lock);
> > > > > > >  	first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
> > > > > > > -	spin_unlock(&entity->queue_lock);
> > > > > >
> > > > > > Is it safe to push to your spsc queue concurrently?
> > > > > >
> > > > > > Regards,
> > > > > > David Zhou
> > > > > > > -
> > > > > > >  	/* first job wakes up scheduler */
> > > > > > >  	if (first) {
> > > > > > >  		/* Add the entity to the run queue */
> > > > > > > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > > > > > > index 350a62c..683eb65 100644
> > > > > > > --- a/include/drm/gpu_scheduler.h
> > > > > > > +++ b/include/drm/gpu_scheduler.h
> > > > > > > @@ -56,7 +56,6 @@ struct drm_sched_entity {
> > > > > > >  	spinlock_t			rq_lock;
> > > > > > >  	struct drm_gpu_scheduler	*sched;
> > > > > > >  
> > > > > > > -	spinlock_t			queue_lock;
> > > > > > >  	struct spsc_queue		job_queue;
> > > > > > >  
> > > > > > >  	atomic_t			fence_seq;
> > > >
> > > > _______________________________________________
> > > > amd-gfx mailing list
> > > > amd-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel