[PATCH 0/4] fixes for Adreno A5Xx preemption

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/4] fixes for Adreno A5Xx preemption
@ 2024-07-11 10:00 Vladimir Lypak
  2024-07-11 10:00 ` [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default Vladimir Lypak
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Vladimir Lypak @ 2024-07-11 10:00 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

There are several issues with preemption on Adreno A5XX GPUs which
render system unusable if more than one priority level is used. Those
issues include persistent GPU faults and hangs, full UI lockups with
idling GPU.

---
Vladimir Lypak (4):
  drm/msm/a5xx: disable preemption in submits by default
  drm/msm/a5xx: properly clear preemption records on resume
  drm/msm/a5xx: fix races in preemption evaluation stage
  drm/msm/a5xx: workaround early ring-buffer emptiness check

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 18 ++++++++++----
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 12 ++++++---
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 ++++++++++++++++++++---
 3 files changed, 47 insertions(+), 13 deletions(-)
---
base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88
-- 
2.45.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default
  2024-07-11 10:00 [PATCH 0/4] fixes for Adreno A5Xx preemption Vladimir Lypak
@ 2024-07-11 10:00 ` Vladimir Lypak
  2024-07-15 21:00   ` Rob Clark
  2024-07-11 10:00 ` [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume Vladimir Lypak
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Vladimir Lypak @ 2024-07-11 10:00 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

Fine grain preemption (switching from/to points within submits)
requires extra handling in command stream of those submits, especially
when rendering with tiling (using GMEM). However this handling is
missing at this point in mesa (and always was). For this reason we get
random GPU faults and hangs if more than one priority level is used
because local preemption is enabled prior to executing command stream
from submit.
With that said it was ahead of time to enable local preemption by
default considering the fact that even on downstream kernel it is only
enabled if requested via UAPI.

Fixes: a7a4c19c36de ("drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL register")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index c0b5373e90d7..6c80d3003966 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -150,9 +150,13 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
 	OUT_RING(ring, 1);
 
-	/* Enable local preemption for finegrain preemption */
+	/*
+	 * Disable local preemption by default because it requires
+	 * user-space to be aware of it and provide additional handling
+	 * to restore rendering state or do various flushes on switch.
+	 */
 	OUT_PKT7(ring, CP_PREEMPT_ENABLE_LOCAL, 1);
-	OUT_RING(ring, 0x1);
+	OUT_RING(ring, 0x0);
 
 	/* Allow CP_CONTEXT_SWITCH_YIELD packets in the IB2 */
 	OUT_PKT7(ring, CP_YIELD_ENABLE, 1);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default
  2024-07-11 10:00 ` [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default Vladimir Lypak
@ 2024-07-15 21:00   ` Rob Clark
  2024-08-01 12:38     ` Akhil P Oommen
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Clark @ 2024-07-15 21:00 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Sean Paul, Konrad Dybcio, Abhinav Kumar, Dmitry Baryshkov,
	Marijn Suijten, David Airlie, Daniel Vetter, Jordan Crouse,
	linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Jul 11, 2024 at 3:02 AM Vladimir Lypak <vladimir.lypak@gmail.com> wrote:
>
> Fine grain preemption (switching from/to points within submits)
> requires extra handling in command stream of those submits, especially
> when rendering with tiling (using GMEM). However this handling is
> missing at this point in mesa (and always was). For this reason we get
> random GPU faults and hangs if more than one priority level is used
> because local preemption is enabled prior to executing command stream
> from submit.
> With that said it was ahead of time to enable local preemption by
> default considering the fact that even on downstream kernel it is only
> enabled if requested via UAPI.
>
> Fixes: a7a4c19c36de ("drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL register")
> Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index c0b5373e90d7..6c80d3003966 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -150,9 +150,13 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>         OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
>         OUT_RING(ring, 1);
>
> -       /* Enable local preemption for finegrain preemption */
> +       /*
> +        * Disable local preemption by default because it requires
> +        * user-space to be aware of it and provide additional handling
> +        * to restore rendering state or do various flushes on switch.
> +        */
>         OUT_PKT7(ring, CP_PREEMPT_ENABLE_LOCAL, 1);
> -       OUT_RING(ring, 0x1);
> +       OUT_RING(ring, 0x0);

From a quick look at the a530 pfp fw, it looks like
CP_PREEMPT_ENABLE_LOCAL is allowed in IB1/IB2 (ie. not restricted to
kernel RB).  So we should just disable it in the kernel, and let
userspace send a CP_PREEMPT_ENABLE_LOCAL to enable local preemption.

BR,
-R

>         /* Allow CP_CONTEXT_SWITCH_YIELD packets in the IB2 */
>         OUT_PKT7(ring, CP_YIELD_ENABLE, 1);
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default
  2024-07-15 21:00   ` Rob Clark
@ 2024-08-01 12:38     ` Akhil P Oommen
  0 siblings, 0 replies; 21+ messages in thread
From: Akhil P Oommen @ 2024-08-01 12:38 UTC (permalink / raw)
  To: Rob Clark
  Cc: Vladimir Lypak, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Mon, Jul 15, 2024 at 02:00:10PM -0700, Rob Clark wrote:
> On Thu, Jul 11, 2024 at 3:02 AM Vladimir Lypak <vladimir.lypak@gmail.com> wrote:
> >
> > Fine grain preemption (switching from/to points within submits)
> > requires extra handling in command stream of those submits, especially
> > when rendering with tiling (using GMEM). However this handling is
> > missing at this point in mesa (and always was). For this reason we get
> > random GPU faults and hangs if more than one priority level is used
> > because local preemption is enabled prior to executing command stream
> > from submit.
> > With that said it was ahead of time to enable local preemption by
> > default considering the fact that even on downstream kernel it is only
> > enabled if requested via UAPI.
> >
> > Fixes: a7a4c19c36de ("drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL register")
> > Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> > ---
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index c0b5373e90d7..6c80d3003966 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -150,9 +150,13 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> >         OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
> >         OUT_RING(ring, 1);
> >
> > -       /* Enable local preemption for finegrain preemption */
> > +       /*
> > +        * Disable local preemption by default because it requires
> > +        * user-space to be aware of it and provide additional handling
> > +        * to restore rendering state or do various flushes on switch.
> > +        */
> >         OUT_PKT7(ring, CP_PREEMPT_ENABLE_LOCAL, 1);
> > -       OUT_RING(ring, 0x1);
> > +       OUT_RING(ring, 0x0);
> 
> From a quick look at the a530 pfp fw, it looks like
> CP_PREEMPT_ENABLE_LOCAL is allowed in IB1/IB2 (ie. not restricted to
> kernel RB).  So we should just disable it in the kernel, and let
> userspace send a CP_PREEMPT_ENABLE_LOCAL to enable local preemption.

Ack. AFAIU about a5x preemption, this should work.

-Akhil

> 
> BR,
> -R
> 
> >         /* Allow CP_CONTEXT_SWITCH_YIELD packets in the IB2 */
> >         OUT_PKT7(ring, CP_YIELD_ENABLE, 1);
> > --
> > 2.45.2
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume
  2024-07-11 10:00 [PATCH 0/4] fixes for Adreno A5Xx preemption Vladimir Lypak
  2024-07-11 10:00 ` [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default Vladimir Lypak
@ 2024-07-11 10:00 ` Vladimir Lypak
  2024-07-11 10:42   ` Konrad Dybcio
  2024-08-01 13:16   ` Akhil P Oommen
  2024-07-11 10:00 ` [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage Vladimir Lypak
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: Vladimir Lypak @ 2024-07-11 10:00 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

Two fields of preempt_record which are used by CP aren't reset on
resume: "data" and "info". This is the reason behind faults which happen
when we try to switch to the ring that was active last before suspend.
In addition those faults can't be recovered from because we use suspend
and resume to do so (keeping values of those fields again).

Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
---
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index f58dd564d122..67a8ef4adf6b 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -204,6 +204,8 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
 		return;
 
 	for (i = 0; i < gpu->nr_rings; i++) {
+		a5xx_gpu->preempt[i]->data = 0;
+		a5xx_gpu->preempt[i]->info = 0;
 		a5xx_gpu->preempt[i]->wptr = 0;
 		a5xx_gpu->preempt[i]->rptr = 0;
 		a5xx_gpu->preempt[i]->rbase = gpu->rb[i]->iova;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume
  2024-07-11 10:00 ` [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume Vladimir Lypak
@ 2024-07-11 10:42   ` Konrad Dybcio
  2024-08-01 13:16   ` Akhil P Oommen
  1 sibling, 0 replies; 21+ messages in thread
From: Konrad Dybcio @ 2024-07-11 10:42 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Abhinav Kumar, Dmitry Baryshkov,
	Marijn Suijten, David Airlie, Daniel Vetter, Jordan Crouse,
	linux-arm-msm, dri-devel, freedreno, linux-kernel

On 11.07.2024 12:00 PM, Vladimir Lypak wrote:
> Two fields of preempt_record which are used by CP aren't reset on
> resume: "data" and "info". This is the reason behind faults which happen
> when we try to switch to the ring that was active last before suspend.
> In addition those faults can't be recovered from because we use suspend
> and resume to do so (keeping values of those fields again).
> 
> Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> ---

Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>

Konrad

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume
  2024-07-11 10:00 ` [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume Vladimir Lypak
  2024-07-11 10:42   ` Konrad Dybcio
@ 2024-08-01 13:16   ` Akhil P Oommen
  2024-08-02 13:41     ` Vladimir Lypak
  1 sibling, 1 reply; 21+ messages in thread
From: Akhil P Oommen @ 2024-08-01 13:16 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Jul 11, 2024 at 10:00:19AM +0000, Vladimir Lypak wrote:
> Two fields of preempt_record which are used by CP aren't reset on
> resume: "data" and "info". This is the reason behind faults which happen
> when we try to switch to the ring that was active last before suspend.
> In addition those faults can't be recovered from because we use suspend
> and resume to do so (keeping values of those fields again).
> 
> Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> index f58dd564d122..67a8ef4adf6b 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> @@ -204,6 +204,8 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
>  		return;
>  
>  	for (i = 0; i < gpu->nr_rings; i++) {
> +		a5xx_gpu->preempt[i]->data = 0;
> +		a5xx_gpu->preempt[i]->info = 0;

I don't see this bit in the downstream driver. Just curious, do we need
to clear both fields to avoid the gpu faults?

-Akhil
>  		a5xx_gpu->preempt[i]->wptr = 0;
>  		a5xx_gpu->preempt[i]->rptr = 0;
>  		a5xx_gpu->preempt[i]->rbase = gpu->rb[i]->iova;
> -- 
> 2.45.2
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume
  2024-08-01 13:16   ` Akhil P Oommen
@ 2024-08-02 13:41     ` Vladimir Lypak
  2024-08-05 19:07       ` Akhil P Oommen
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Lypak @ 2024-08-02 13:41 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Aug 01, 2024 at 06:46:10PM +0530, Akhil P Oommen wrote:
> On Thu, Jul 11, 2024 at 10:00:19AM +0000, Vladimir Lypak wrote:
> > Two fields of preempt_record which are used by CP aren't reset on
> > resume: "data" and "info". This is the reason behind faults which happen
> > when we try to switch to the ring that was active last before suspend.
> > In addition those faults can't be recovered from because we use suspend
> > and resume to do so (keeping values of those fields again).
> > 
> > Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> > Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> > ---
> >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > index f58dd564d122..67a8ef4adf6b 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > @@ -204,6 +204,8 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> >  		return;
> >  
> >  	for (i = 0; i < gpu->nr_rings; i++) {
> > +		a5xx_gpu->preempt[i]->data = 0;
> > +		a5xx_gpu->preempt[i]->info = 0;
> 
> I don't see this bit in the downstream driver. Just curious, do we need
> to clear both fields to avoid the gpu faults?

Downstream gets away without doing so because it resumes on the same
ring that it suspended on. On mainline we always do GPU resume on first
ring. It was enough to zero info field to avoid faults but clearing
both shouldn't hurt.

I have tried to replicate faults again with local preemption disabled
and unmodified mesa and couldn't do so. It only happens when fine-grain
preemption is used and there was a switch from IB1.
This made me come up with explanation of what could be happening.
If preemption switch is initiated on a some ring at checkpoint in IB1,
CP should save position of that checkpoint in the preemption record and
set some flag in "info" field which will tell it to continue from that
checkpoint when switching back.
When switching back to that ring we program address of its preemption
record to CP_CONTEXT_SWITCH_RESTORE_ADDR. Apparently this won't remove
the flag from "info" field because the preemption record is only being
read from. This leaves preemption record outdated on that ring until
next switch will override it. This doesn't cause issues on downstream
because it won't try to restore from that record since it's ignored
during GPU power-up.

Vladimir

> 
> -Akhil
> >  		a5xx_gpu->preempt[i]->wptr = 0;
> >  		a5xx_gpu->preempt[i]->rptr = 0;
> >  		a5xx_gpu->preempt[i]->rbase = gpu->rb[i]->iova;
> > -- 
> > 2.45.2
> > 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume
  2024-08-02 13:41     ` Vladimir Lypak
@ 2024-08-05 19:07       ` Akhil P Oommen
  0 siblings, 0 replies; 21+ messages in thread
From: Akhil P Oommen @ 2024-08-05 19:07 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Fri, Aug 02, 2024 at 01:41:32PM +0000, Vladimir Lypak wrote:
> On Thu, Aug 01, 2024 at 06:46:10PM +0530, Akhil P Oommen wrote:
> > On Thu, Jul 11, 2024 at 10:00:19AM +0000, Vladimir Lypak wrote:
> > > Two fields of preempt_record which are used by CP aren't reset on
> > > resume: "data" and "info". This is the reason behind faults which happen
> > > when we try to switch to the ring that was active last before suspend.
> > > In addition those faults can't be recovered from because we use suspend
> > > and resume to do so (keeping values of those fields again).
> > > 
> > > Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> > > Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > index f58dd564d122..67a8ef4adf6b 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > @@ -204,6 +204,8 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> > >  		return;
> > >  
> > >  	for (i = 0; i < gpu->nr_rings; i++) {
> > > +		a5xx_gpu->preempt[i]->data = 0;
> > > +		a5xx_gpu->preempt[i]->info = 0;
> > 
> > I don't see this bit in the downstream driver. Just curious, do we need
> > to clear both fields to avoid the gpu faults?
> 
> Downstream gets away without doing so because it resumes on the same
> ring that it suspended on. On mainline we always do GPU resume on first
> ring. It was enough to zero info field to avoid faults but clearing
> both shouldn't hurt.
> 
> I have tried to replicate faults again with local preemption disabled
> and unmodified mesa and couldn't do so. It only happens when fine-grain
> preemption is used and there was a switch from IB1.

So, I guess gpu is going to rpm suspend while there is pending
(preempted) submits present in another ringbuffer. Probably the other
fixes you have in this series make this not necessary during rpm suspend.
But we can keep as it is harmless and might help during gpu recovery.

> This made me come up with explanation of what could be happening.
> If preemption switch is initiated on a some ring at checkpoint in IB1,
> CP should save position of that checkpoint in the preemption record and
> set some flag in "info" field which will tell it to continue from that
> checkpoint when switching back.
> When switching back to that ring we program address of its preemption
> record to CP_CONTEXT_SWITCH_RESTORE_ADDR. Apparently this won't remove
> the flag from "info" field because the preemption record is only being
> read from. This leaves preemption record outdated on that ring until
> next switch will override it. This doesn't cause issues on downstream
> because it won't try to restore from that record since it's ignored
> during GPU power-up.

I guess it is fine if you never go to rpm suspend without idling all
RBs!

-Akhil

> 
> Vladimir
> 
> > 
> > -Akhil
> > >  		a5xx_gpu->preempt[i]->wptr = 0;
> > >  		a5xx_gpu->preempt[i]->rptr = 0;
> > >  		a5xx_gpu->preempt[i]->rbase = gpu->rb[i]->iova;
> > > -- 
> > > 2.45.2
> > > 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage
  2024-07-11 10:00 [PATCH 0/4] fixes for Adreno A5Xx preemption Vladimir Lypak
  2024-07-11 10:00 ` [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default Vladimir Lypak
  2024-07-11 10:00 ` [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume Vladimir Lypak
@ 2024-07-11 10:00 ` Vladimir Lypak
  2024-07-29 17:26   ` Connor Abbott
  2024-08-05 19:29   ` Akhil P Oommen
  2024-07-11 10:00 ` [PATCH 4/4] drm/msm/a5xx: workaround early ring-buffer emptiness check Vladimir Lypak
  2024-07-17  9:40 ` [PATCH 0/4] fixes for Adreno A5Xx preemption Connor Abbott
  4 siblings, 2 replies; 21+ messages in thread
From: Vladimir Lypak @ 2024-07-11 10:00 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On A5XX GPUs when preemption is used it's invietable to enter a soft
lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
This appears as full UI lockup and not detected as GPU hang (because
it's not). This happens due to not triggering preemption when it was
needed. Sometimes this state can be recovered by some new submit but
generally it won't happen because applications are waiting for old
submits to retire.

One of the reasons why this happens is a race between a5xx_submit and
a5xx_preempt_trigger called from IRQ during submit retire. Former thread
updates ring->cur of previously empty and not current ring right after
latter checks it for emptiness. Then both threads can just exit because
for first one preempt_state wasn't NONE yet and for second one all rings
appeared to be empty.

To prevent such situations from happening we need to establish guarantee
for preempt_trigger to be called after each submit. To implement it this
patch adds trigger call at the end of a5xx_preempt_irq to re-check if we
should switch to non-empty or higher priority ring. Also we find next
ring in new preemption state "EVALUATE". If the thread that updated some
ring with new submit sees this state it should wait until it passes.

Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  6 +++---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 11 +++++++----
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 24 +++++++++++++++++++----
 3 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 6c80d3003966..266744ee1d5f 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -110,7 +110,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
 	}
 
 	a5xx_flush(gpu, ring, true);
-	a5xx_preempt_trigger(gpu);
+	a5xx_preempt_trigger(gpu, true);
 
 	/* we might not necessarily have a cmd from userspace to
 	 * trigger an event to know that submit has completed, so
@@ -240,7 +240,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	a5xx_flush(gpu, ring, false);
 
 	/* Check to see if we need to start preemption */
-	a5xx_preempt_trigger(gpu);
+	a5xx_preempt_trigger(gpu, true);
 }
 
 static const struct adreno_five_hwcg_regs {
@@ -1296,7 +1296,7 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
 		a5xx_gpmu_err_irq(gpu);
 
 	if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
-		a5xx_preempt_trigger(gpu);
+		a5xx_preempt_trigger(gpu, false);
 		msm_gpu_retire(gpu);
 	}
 
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
index c7187bcc5e90..1120824853d4 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -57,10 +57,12 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
  * through the process.
  *
  * PREEMPT_NONE - no preemption in progress.  Next state START.
- * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
- * states: TRIGGERED, NONE
+ * PREEMPT_EVALUATE - The trigger is evaulating if preemption is possible. Next
+ * states: START, ABORT
  * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
  * state: NONE.
+ * PREEMPT_START - The trigger is preparing for preemption. Next state:
+ * TRIGGERED
  * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
  * states: FAULTED, PENDING
  * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
@@ -71,8 +73,9 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
 
 enum preempt_state {
 	PREEMPT_NONE = 0,
-	PREEMPT_START,
+	PREEMPT_EVALUATE,
 	PREEMPT_ABORT,
+	PREEMPT_START,
 	PREEMPT_TRIGGERED,
 	PREEMPT_FAULTED,
 	PREEMPT_PENDING,
@@ -156,7 +159,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state);
 
 void a5xx_preempt_init(struct msm_gpu *gpu);
 void a5xx_preempt_hw_init(struct msm_gpu *gpu);
-void a5xx_preempt_trigger(struct msm_gpu *gpu);
+void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit);
 void a5xx_preempt_irq(struct msm_gpu *gpu);
 void a5xx_preempt_fini(struct msm_gpu *gpu);
 
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 67a8ef4adf6b..f8d09a83c5ae 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -87,21 +87,33 @@ static void a5xx_preempt_timer(struct timer_list *t)
 }
 
 /* Try to trigger a preemption switch */
-void a5xx_preempt_trigger(struct msm_gpu *gpu)
+void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
 	unsigned long flags;
 	struct msm_ringbuffer *ring;
+	enum preempt_state state;
 
 	if (gpu->nr_rings == 1)
 		return;
 
 	/*
-	 * Try to start preemption by moving from NONE to START. If
-	 * unsuccessful, a preemption is already in flight
+	 * Try to start preemption by moving from NONE to EVALUATE. If current
+	 * state is EVALUATE/ABORT we can't just quit because then we can't
+	 * guarantee that preempt_trigger will be called after ring is updated
+	 * by new submit.
 	 */
-	if (!try_preempt_state(a5xx_gpu, PREEMPT_NONE, PREEMPT_START))
+	state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
+			       PREEMPT_EVALUATE);
+	while (new_submit && (state == PREEMPT_EVALUATE ||
+			      state == PREEMPT_ABORT)) {
+		cpu_relax();
+		state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
+				       PREEMPT_EVALUATE);
+	}
+
+	if (state != PREEMPT_NONE)
 		return;
 
 	/* Get the next ring to preempt to */
@@ -130,6 +142,8 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
 		return;
 	}
 
+	set_preempt_state(a5xx_gpu, PREEMPT_START);
+
 	/* Make sure the wptr doesn't update while we're in motion */
 	spin_lock_irqsave(&ring->preempt_lock, flags);
 	a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
@@ -188,6 +202,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
 	update_wptr(gpu, a5xx_gpu->cur_ring);
 
 	set_preempt_state(a5xx_gpu, PREEMPT_NONE);
+
+	a5xx_preempt_trigger(gpu, false);
 }
 
 void a5xx_preempt_hw_init(struct msm_gpu *gpu)
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage
  2024-07-11 10:00 ` [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage Vladimir Lypak
@ 2024-07-29 17:26   ` Connor Abbott
  2024-08-01 12:22     ` Vladimir Lypak
  2024-08-05 19:29   ` Akhil P Oommen
  1 sibling, 1 reply; 21+ messages in thread
From: Connor Abbott @ 2024-07-29 17:26 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
<vladimir.lypak@gmail.com> wrote:
>
> On A5XX GPUs when preemption is used it's invietable to enter a soft
> lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
> This appears as full UI lockup and not detected as GPU hang (because
> it's not). This happens due to not triggering preemption when it was
> needed. Sometimes this state can be recovered by some new submit but
> generally it won't happen because applications are waiting for old
> submits to retire.
>
> One of the reasons why this happens is a race between a5xx_submit and
> a5xx_preempt_trigger called from IRQ during submit retire. Former thread
> updates ring->cur of previously empty and not current ring right after
> latter checks it for emptiness. Then both threads can just exit because
> for first one preempt_state wasn't NONE yet and for second one all rings
> appeared to be empty.
>
> To prevent such situations from happening we need to establish guarantee
> for preempt_trigger to be called after each submit. To implement it this
> patch adds trigger call at the end of a5xx_preempt_irq to re-check if we
> should switch to non-empty or higher priority ring. Also we find next
> ring in new preemption state "EVALUATE". If the thread that updated some
> ring with new submit sees this state it should wait until it passes.
>
> Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  6 +++---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 11 +++++++----
>  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 24 +++++++++++++++++++----
>  3 files changed, 30 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index 6c80d3003966..266744ee1d5f 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -110,7 +110,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
>         }
>
>         a5xx_flush(gpu, ring, true);
> -       a5xx_preempt_trigger(gpu);
> +       a5xx_preempt_trigger(gpu, true);
>
>         /* we might not necessarily have a cmd from userspace to
>          * trigger an event to know that submit has completed, so
> @@ -240,7 +240,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>         a5xx_flush(gpu, ring, false);
>
>         /* Check to see if we need to start preemption */
> -       a5xx_preempt_trigger(gpu);
> +       a5xx_preempt_trigger(gpu, true);
>  }
>
>  static const struct adreno_five_hwcg_regs {
> @@ -1296,7 +1296,7 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
>                 a5xx_gpmu_err_irq(gpu);
>
>         if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
> -               a5xx_preempt_trigger(gpu);
> +               a5xx_preempt_trigger(gpu, false);
>                 msm_gpu_retire(gpu);
>         }
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> index c7187bcc5e90..1120824853d4 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> @@ -57,10 +57,12 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
>   * through the process.
>   *
>   * PREEMPT_NONE - no preemption in progress.  Next state START.
> - * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
> - * states: TRIGGERED, NONE
> + * PREEMPT_EVALUATE - The trigger is evaulating if preemption is possible. Next
> + * states: START, ABORT
>   * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
>   * state: NONE.
> + * PREEMPT_START - The trigger is preparing for preemption. Next state:
> + * TRIGGERED
>   * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
>   * states: FAULTED, PENDING
>   * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
> @@ -71,8 +73,9 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
>
>  enum preempt_state {
>         PREEMPT_NONE = 0,
> -       PREEMPT_START,
> +       PREEMPT_EVALUATE,
>         PREEMPT_ABORT,
> +       PREEMPT_START,
>         PREEMPT_TRIGGERED,
>         PREEMPT_FAULTED,
>         PREEMPT_PENDING,
> @@ -156,7 +159,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state);
>
>  void a5xx_preempt_init(struct msm_gpu *gpu);
>  void a5xx_preempt_hw_init(struct msm_gpu *gpu);
> -void a5xx_preempt_trigger(struct msm_gpu *gpu);
> +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit);
>  void a5xx_preempt_irq(struct msm_gpu *gpu);
>  void a5xx_preempt_fini(struct msm_gpu *gpu);
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> index 67a8ef4adf6b..f8d09a83c5ae 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> @@ -87,21 +87,33 @@ static void a5xx_preempt_timer(struct timer_list *t)
>  }
>
>  /* Try to trigger a preemption switch */
> -void a5xx_preempt_trigger(struct msm_gpu *gpu)
> +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit)
>  {
>         struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>         struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
>         unsigned long flags;
>         struct msm_ringbuffer *ring;
> +       enum preempt_state state;
>
>         if (gpu->nr_rings == 1)
>                 return;
>
>         /*
> -        * Try to start preemption by moving from NONE to START. If
> -        * unsuccessful, a preemption is already in flight
> +        * Try to start preemption by moving from NONE to EVALUATE. If current
> +        * state is EVALUATE/ABORT we can't just quit because then we can't
> +        * guarantee that preempt_trigger will be called after ring is updated
> +        * by new submit.
>          */
> -       if (!try_preempt_state(a5xx_gpu, PREEMPT_NONE, PREEMPT_START))
> +       state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> +                              PREEMPT_EVALUATE);
> +       while (new_submit && (state == PREEMPT_EVALUATE ||
> +                             state == PREEMPT_ABORT)) {

This isn't enough because even if new_submit is false then we may
still need to guarantee that evaluation happens. We've seen a hang in
a scenario like:

1. A job is submitted and executed on ring 0.
2. A job is submitted on ring 2 while ring 0 is still active but
almost finished.
3. The submission thread starts evaluating and sees that ring 0 is still busy.
4. The job on ring 0 finishes and a CACHE_FLUSH IRQ is raised.
5. The IRQ tries to trigger a preemption but the state is still
PREEMPT_EVALUATE or PREEMPT_ABORT and exits.
6. The submission thread finishes update_wptr() and finally sets the
state to PREEMPT_NONE too late.

Then we never preempt to ring 2 and there's a soft lockup.

Connor

> +               cpu_relax();
> +               state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> +                                      PREEMPT_EVALUATE);
> +       }
> +
> +       if (state != PREEMPT_NONE)
>                 return;
>
>         /* Get the next ring to preempt to */
> @@ -130,6 +142,8 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
>                 return;
>         }
>
> +       set_preempt_state(a5xx_gpu, PREEMPT_START);
> +
>         /* Make sure the wptr doesn't update while we're in motion */
>         spin_lock_irqsave(&ring->preempt_lock, flags);
>         a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
> @@ -188,6 +202,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
>         update_wptr(gpu, a5xx_gpu->cur_ring);
>
>         set_preempt_state(a5xx_gpu, PREEMPT_NONE);
> +
> +       a5xx_preempt_trigger(gpu, false);
>  }
>
>  void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage
  2024-07-29 17:26   ` Connor Abbott
@ 2024-08-01 12:22     ` Vladimir Lypak
  2024-08-01 12:52       ` Connor Abbott
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Lypak @ 2024-08-01 12:22 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Mon, Jul 29, 2024 at 06:26:45PM +0100, Connor Abbott wrote:
> On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
> <vladimir.lypak@gmail.com> wrote:
> >
> > On A5XX GPUs when preemption is used it's invietable to enter a soft
> > lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
> > This appears as full UI lockup and not detected as GPU hang (because
> > it's not). This happens due to not triggering preemption when it was
> > needed. Sometimes this state can be recovered by some new submit but
> > generally it won't happen because applications are waiting for old
> > submits to retire.
> >
> > One of the reasons why this happens is a race between a5xx_submit and
> > a5xx_preempt_trigger called from IRQ during submit retire. Former thread
> > updates ring->cur of previously empty and not current ring right after
> > latter checks it for emptiness. Then both threads can just exit because
> > for first one preempt_state wasn't NONE yet and for second one all rings
> > appeared to be empty.
> >
> > To prevent such situations from happening we need to establish guarantee
> > for preempt_trigger to be called after each submit. To implement it this
> > patch adds trigger call at the end of a5xx_preempt_irq to re-check if we
> > should switch to non-empty or higher priority ring. Also we find next
> > ring in new preemption state "EVALUATE". If the thread that updated some
> > ring with new submit sees this state it should wait until it passes.
> >
> > Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> > Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> > ---
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  6 +++---
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 11 +++++++----
> >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 24 +++++++++++++++++++----
> >  3 files changed, 30 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index 6c80d3003966..266744ee1d5f 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -110,7 +110,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
> >         }
> >
> >         a5xx_flush(gpu, ring, true);
> > -       a5xx_preempt_trigger(gpu);
> > +       a5xx_preempt_trigger(gpu, true);
> >
> >         /* we might not necessarily have a cmd from userspace to
> >          * trigger an event to know that submit has completed, so
> > @@ -240,7 +240,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> >         a5xx_flush(gpu, ring, false);
> >
> >         /* Check to see if we need to start preemption */
> > -       a5xx_preempt_trigger(gpu);
> > +       a5xx_preempt_trigger(gpu, true);
> >  }
> >
> >  static const struct adreno_five_hwcg_regs {
> > @@ -1296,7 +1296,7 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
> >                 a5xx_gpmu_err_irq(gpu);
> >
> >         if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
> > -               a5xx_preempt_trigger(gpu);
> > +               a5xx_preempt_trigger(gpu, false);
> >                 msm_gpu_retire(gpu);
> >         }
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > index c7187bcc5e90..1120824853d4 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > @@ -57,10 +57,12 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> >   * through the process.
> >   *
> >   * PREEMPT_NONE - no preemption in progress.  Next state START.
> > - * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
> > - * states: TRIGGERED, NONE
> > + * PREEMPT_EVALUATE - The trigger is evaulating if preemption is possible. Next
> > + * states: START, ABORT
> >   * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
> >   * state: NONE.
> > + * PREEMPT_START - The trigger is preparing for preemption. Next state:
> > + * TRIGGERED
> >   * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
> >   * states: FAULTED, PENDING
> >   * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
> > @@ -71,8 +73,9 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> >
> >  enum preempt_state {
> >         PREEMPT_NONE = 0,
> > -       PREEMPT_START,
> > +       PREEMPT_EVALUATE,
> >         PREEMPT_ABORT,
> > +       PREEMPT_START,
> >         PREEMPT_TRIGGERED,
> >         PREEMPT_FAULTED,
> >         PREEMPT_PENDING,
> > @@ -156,7 +159,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state);
> >
> >  void a5xx_preempt_init(struct msm_gpu *gpu);
> >  void a5xx_preempt_hw_init(struct msm_gpu *gpu);
> > -void a5xx_preempt_trigger(struct msm_gpu *gpu);
> > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit);
> >  void a5xx_preempt_irq(struct msm_gpu *gpu);
> >  void a5xx_preempt_fini(struct msm_gpu *gpu);
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > index 67a8ef4adf6b..f8d09a83c5ae 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > @@ -87,21 +87,33 @@ static void a5xx_preempt_timer(struct timer_list *t)
> >  }
> >
> >  /* Try to trigger a preemption switch */
> > -void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit)
> >  {
> >         struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >         struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> >         unsigned long flags;
> >         struct msm_ringbuffer *ring;
> > +       enum preempt_state state;
> >
> >         if (gpu->nr_rings == 1)
> >                 return;
> >
> >         /*
> > -        * Try to start preemption by moving from NONE to START. If
> > -        * unsuccessful, a preemption is already in flight
> > +        * Try to start preemption by moving from NONE to EVALUATE. If current
> > +        * state is EVALUATE/ABORT we can't just quit because then we can't
> > +        * guarantee that preempt_trigger will be called after ring is updated
> > +        * by new submit.
> >          */
> > -       if (!try_preempt_state(a5xx_gpu, PREEMPT_NONE, PREEMPT_START))
> > +       state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > +                              PREEMPT_EVALUATE);
> > +       while (new_submit && (state == PREEMPT_EVALUATE ||
> > +                             state == PREEMPT_ABORT)) {
> 
> This isn't enough because even if new_submit is false then we may
> still need to guarantee that evaluation happens. We've seen a hang in
> a scenario like:
> 
> 1. A job is submitted and executed on ring 0.
> 2. A job is submitted on ring 2 while ring 0 is still active but
> almost finished.
> 3. The submission thread starts evaluating and sees that ring 0 is still busy.
> 4. The job on ring 0 finishes and a CACHE_FLUSH IRQ is raised.
> 5. The IRQ tries to trigger a preemption but the state is still
> PREEMPT_EVALUATE or PREEMPT_ABORT and exits.
> 6. The submission thread finishes update_wptr() and finally sets the
> state to PREEMPT_NONE too late.
> 
> Then we never preempt to ring 2 and there's a soft lockup.

Thanks, i've missed that. It would need to always wait to prevent such
scenario. The next patch prevented this from happening for me so i have
overlooked it.

Alternatively there is another approach which should perform better: to
let evaluation stage run in parallel.

Also i've tried serializing preemption handling on ordered workqueue and
GPU kthread worker. It's a lot simpler but latency from IRQ doesn't look
good:

           flush-trigger    SW_IRQ-pending   flush_IRQ-trigger
    uSecs    1    2    3       1    2    3       1    2    3
     0-10 1515   43   65    4423   39   24     647    0    2
    10-20 1484  453  103     446  414  309     399    1    1
    20-40  827 1802  358      19  819  587       2   21    6
    40-60    7 1264  397       1  368  329       0   30   14
    60-80    4  311  115       0  181  178       0   24   12
   80-120    2   36  251       0  250  188       0    9   13
  120-160    0    4  244       0  176  248       0  226  150
  160-200    0    1  278       0  221  235       0   86   78
  200-400    0    2 1266       0 1318 1186       0  476  688
  400-700    0    0  553       0  745 1028       0  150  106
 700-1000    0    0  121       0  264  366       0   65   28
1000-1500    0    0   61       0  160  205       0   21    8
    >2000    0    0   12       0   71   48       0    0    0

1 - current implementation but with evaluation in parallel.
2 - serialized on ordered workqueue.
3 - serialized on GPU kthread_worker.

Vladimir

> 
> Connor
> 
> > +               cpu_relax();
> > +               state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > +                                      PREEMPT_EVALUATE);
> > +       }
> > +
> > +       if (state != PREEMPT_NONE)
> >                 return;
> >
> >         /* Get the next ring to preempt to */
> > @@ -130,6 +142,8 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
> >                 return;
> >         }
> >
> > +       set_preempt_state(a5xx_gpu, PREEMPT_START);
> > +
> >         /* Make sure the wptr doesn't update while we're in motion */
> >         spin_lock_irqsave(&ring->preempt_lock, flags);
> >         a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
> > @@ -188,6 +202,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
> >         update_wptr(gpu, a5xx_gpu->cur_ring);
> >
> >         set_preempt_state(a5xx_gpu, PREEMPT_NONE);
> > +
> > +       a5xx_preempt_trigger(gpu, false);
> >  }
> >
> >  void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> > --
> > 2.45.2
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage
  2024-08-01 12:22     ` Vladimir Lypak
@ 2024-08-01 12:52       ` Connor Abbott
  2024-08-01 14:23         ` Vladimir Lypak
  0 siblings, 1 reply; 21+ messages in thread
From: Connor Abbott @ 2024-08-01 12:52 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Aug 1, 2024 at 1:25 PM Vladimir Lypak <vladimir.lypak@gmail.com> wrote:
>
> On Mon, Jul 29, 2024 at 06:26:45PM +0100, Connor Abbott wrote:
> > On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
> > <vladimir.lypak@gmail.com> wrote:
> > >
> > > On A5XX GPUs when preemption is used it's invietable to enter a soft
> > > lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
> > > This appears as full UI lockup and not detected as GPU hang (because
> > > it's not). This happens due to not triggering preemption when it was
> > > needed. Sometimes this state can be recovered by some new submit but
> > > generally it won't happen because applications are waiting for old
> > > submits to retire.
> > >
> > > One of the reasons why this happens is a race between a5xx_submit and
> > > a5xx_preempt_trigger called from IRQ during submit retire. Former thread
> > > updates ring->cur of previously empty and not current ring right after
> > > latter checks it for emptiness. Then both threads can just exit because
> > > for first one preempt_state wasn't NONE yet and for second one all rings
> > > appeared to be empty.
> > >
> > > To prevent such situations from happening we need to establish guarantee
> > > for preempt_trigger to be called after each submit. To implement it this
> > > patch adds trigger call at the end of a5xx_preempt_irq to re-check if we
> > > should switch to non-empty or higher priority ring. Also we find next
> > > ring in new preemption state "EVALUATE". If the thread that updated some
> > > ring with new submit sees this state it should wait until it passes.
> > >
> > > Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> > > Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  6 +++---
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 11 +++++++----
> > >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 24 +++++++++++++++++++----
> > >  3 files changed, 30 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > index 6c80d3003966..266744ee1d5f 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > @@ -110,7 +110,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
> > >         }
> > >
> > >         a5xx_flush(gpu, ring, true);
> > > -       a5xx_preempt_trigger(gpu);
> > > +       a5xx_preempt_trigger(gpu, true);
> > >
> > >         /* we might not necessarily have a cmd from userspace to
> > >          * trigger an event to know that submit has completed, so
> > > @@ -240,7 +240,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> > >         a5xx_flush(gpu, ring, false);
> > >
> > >         /* Check to see if we need to start preemption */
> > > -       a5xx_preempt_trigger(gpu);
> > > +       a5xx_preempt_trigger(gpu, true);
> > >  }
> > >
> > >  static const struct adreno_five_hwcg_regs {
> > > @@ -1296,7 +1296,7 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
> > >                 a5xx_gpmu_err_irq(gpu);
> > >
> > >         if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
> > > -               a5xx_preempt_trigger(gpu);
> > > +               a5xx_preempt_trigger(gpu, false);
> > >                 msm_gpu_retire(gpu);
> > >         }
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > index c7187bcc5e90..1120824853d4 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > @@ -57,10 +57,12 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> > >   * through the process.
> > >   *
> > >   * PREEMPT_NONE - no preemption in progress.  Next state START.
> > > - * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
> > > - * states: TRIGGERED, NONE
> > > + * PREEMPT_EVALUATE - The trigger is evaulating if preemption is possible. Next
> > > + * states: START, ABORT
> > >   * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
> > >   * state: NONE.
> > > + * PREEMPT_START - The trigger is preparing for preemption. Next state:
> > > + * TRIGGERED
> > >   * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
> > >   * states: FAULTED, PENDING
> > >   * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
> > > @@ -71,8 +73,9 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> > >
> > >  enum preempt_state {
> > >         PREEMPT_NONE = 0,
> > > -       PREEMPT_START,
> > > +       PREEMPT_EVALUATE,
> > >         PREEMPT_ABORT,
> > > +       PREEMPT_START,
> > >         PREEMPT_TRIGGERED,
> > >         PREEMPT_FAULTED,
> > >         PREEMPT_PENDING,
> > > @@ -156,7 +159,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state);
> > >
> > >  void a5xx_preempt_init(struct msm_gpu *gpu);
> > >  void a5xx_preempt_hw_init(struct msm_gpu *gpu);
> > > -void a5xx_preempt_trigger(struct msm_gpu *gpu);
> > > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit);
> > >  void a5xx_preempt_irq(struct msm_gpu *gpu);
> > >  void a5xx_preempt_fini(struct msm_gpu *gpu);
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > index 67a8ef4adf6b..f8d09a83c5ae 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > @@ -87,21 +87,33 @@ static void a5xx_preempt_timer(struct timer_list *t)
> > >  }
> > >
> > >  /* Try to trigger a preemption switch */
> > > -void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit)
> > >  {
> > >         struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > >         struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > >         unsigned long flags;
> > >         struct msm_ringbuffer *ring;
> > > +       enum preempt_state state;
> > >
> > >         if (gpu->nr_rings == 1)
> > >                 return;
> > >
> > >         /*
> > > -        * Try to start preemption by moving from NONE to START. If
> > > -        * unsuccessful, a preemption is already in flight
> > > +        * Try to start preemption by moving from NONE to EVALUATE. If current
> > > +        * state is EVALUATE/ABORT we can't just quit because then we can't
> > > +        * guarantee that preempt_trigger will be called after ring is updated
> > > +        * by new submit.
> > >          */
> > > -       if (!try_preempt_state(a5xx_gpu, PREEMPT_NONE, PREEMPT_START))
> > > +       state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > > +                              PREEMPT_EVALUATE);
> > > +       while (new_submit && (state == PREEMPT_EVALUATE ||
> > > +                             state == PREEMPT_ABORT)) {
> >
> > This isn't enough because even if new_submit is false then we may
> > still need to guarantee that evaluation happens. We've seen a hang in
> > a scenario like:
> >
> > 1. A job is submitted and executed on ring 0.
> > 2. A job is submitted on ring 2 while ring 0 is still active but
> > almost finished.
> > 3. The submission thread starts evaluating and sees that ring 0 is still busy.
> > 4. The job on ring 0 finishes and a CACHE_FLUSH IRQ is raised.
> > 5. The IRQ tries to trigger a preemption but the state is still
> > PREEMPT_EVALUATE or PREEMPT_ABORT and exits.
> > 6. The submission thread finishes update_wptr() and finally sets the
> > state to PREEMPT_NONE too late.
> >
> > Then we never preempt to ring 2 and there's a soft lockup.
>
> Thanks, i've missed that. It would need to always wait to prevent such
> scenario. The next patch prevented this from happening for me so i have
> overlooked it.
>
> Alternatively there is another approach which should perform better: to
> let evaluation stage run in parallel.
>
> Also i've tried serializing preemption handling on ordered workqueue and
> GPU kthread worker. It's a lot simpler but latency from IRQ doesn't look
> good:
>
>            flush-trigger    SW_IRQ-pending   flush_IRQ-trigger
>     uSecs    1    2    3       1    2    3       1    2    3
>      0-10 1515   43   65    4423   39   24     647    0    2
>     10-20 1484  453  103     446  414  309     399    1    1
>     20-40  827 1802  358      19  819  587       2   21    6
>     40-60    7 1264  397       1  368  329       0   30   14
>     60-80    4  311  115       0  181  178       0   24   12
>    80-120    2   36  251       0  250  188       0    9   13
>   120-160    0    4  244       0  176  248       0  226  150
>   160-200    0    1  278       0  221  235       0   86   78
>   200-400    0    2 1266       0 1318 1186       0  476  688
>   400-700    0    0  553       0  745 1028       0  150  106
>  700-1000    0    0  121       0  264  366       0   65   28
> 1000-1500    0    0   61       0  160  205       0   21    8
>     >2000    0    0   12       0   71   48       0    0    0
>
> 1 - current implementation but with evaluation in parallel.
> 2 - serialized on ordered workqueue.
> 3 - serialized on GPU kthread_worker.

The problem with evaluating in parallel is that evaluating can race
with the rest of the process - it's possible for the thread evaluating
to go to be interrupted just before moving the state to PREEMPT_START
and in the meantime an entire preemption has happened and it's out of
date.

What we did was to put a spinlock around the entire evaluation stage,
effectively replacing the busy loop and the EVALUATE stage. It unlocks
only after moving the state to NONE or knowing for certain that we're
starting a preemption. That should be lower latency than a workqueue
while the critical section shouldn't be that large (it's one atomic
operation and checking each ring), and in any case that's what the
spinning loop was doing anyway.

Connor

>
> Vladimir
>
> >
> > Connor
> >
> > > +               cpu_relax();
> > > +               state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > > +                                      PREEMPT_EVALUATE);
> > > +       }
> > > +
> > > +       if (state != PREEMPT_NONE)
> > >                 return;
> > >
> > >         /* Get the next ring to preempt to */
> > > @@ -130,6 +142,8 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > >                 return;
> > >         }
> > >
> > > +       set_preempt_state(a5xx_gpu, PREEMPT_START);
> > > +
> > >         /* Make sure the wptr doesn't update while we're in motion */
> > >         spin_lock_irqsave(&ring->preempt_lock, flags);
> > >         a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
> > > @@ -188,6 +202,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
> > >         update_wptr(gpu, a5xx_gpu->cur_ring);
> > >
> > >         set_preempt_state(a5xx_gpu, PREEMPT_NONE);
> > > +
> > > +       a5xx_preempt_trigger(gpu, false);
> > >  }
> > >
> > >  void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> > > --
> > > 2.45.2
> > >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage
  2024-08-01 12:52       ` Connor Abbott
@ 2024-08-01 14:23         ` Vladimir Lypak
  2024-08-01 15:57           ` Connor Abbott
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Lypak @ 2024-08-01 14:23 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Aug 01, 2024 at 01:52:32PM +0100, Connor Abbott wrote:
> On Thu, Aug 1, 2024 at 1:25 PM Vladimir Lypak <vladimir.lypak@gmail.com> wrote:
> >
> > On Mon, Jul 29, 2024 at 06:26:45PM +0100, Connor Abbott wrote:
> > > On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
> > > <vladimir.lypak@gmail.com> wrote:
> > > >
> > > > On A5XX GPUs when preemption is used it's invietable to enter a soft
> > > > lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
> > > > This appears as full UI lockup and not detected as GPU hang (because
> > > > it's not). This happens due to not triggering preemption when it was
> > > > needed. Sometimes this state can be recovered by some new submit but
> > > > generally it won't happen because applications are waiting for old
> > > > submits to retire.
> > > >
> > > > One of the reasons why this happens is a race between a5xx_submit and
> > > > a5xx_preempt_trigger called from IRQ during submit retire. Former thread
> > > > updates ring->cur of previously empty and not current ring right after
> > > > latter checks it for emptiness. Then both threads can just exit because
> > > > for first one preempt_state wasn't NONE yet and for second one all rings
> > > > appeared to be empty.
> > > >
> > > > To prevent such situations from happening we need to establish guarantee
> > > > for preempt_trigger to be called after each submit. To implement it this
> > > > patch adds trigger call at the end of a5xx_preempt_irq to re-check if we
> > > > should switch to non-empty or higher priority ring. Also we find next
> > > > ring in new preemption state "EVALUATE". If the thread that updated some
> > > > ring with new submit sees this state it should wait until it passes.
> > > >
> > > > Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> > > > Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  6 +++---
> > > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 11 +++++++----
> > > >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 24 +++++++++++++++++++----
> > > >  3 files changed, 30 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > index 6c80d3003966..266744ee1d5f 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > @@ -110,7 +110,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
> > > >         }
> > > >
> > > >         a5xx_flush(gpu, ring, true);
> > > > -       a5xx_preempt_trigger(gpu);
> > > > +       a5xx_preempt_trigger(gpu, true);
> > > >
> > > >         /* we might not necessarily have a cmd from userspace to
> > > >          * trigger an event to know that submit has completed, so
> > > > @@ -240,7 +240,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> > > >         a5xx_flush(gpu, ring, false);
> > > >
> > > >         /* Check to see if we need to start preemption */
> > > > -       a5xx_preempt_trigger(gpu);
> > > > +       a5xx_preempt_trigger(gpu, true);
> > > >  }
> > > >
> > > >  static const struct adreno_five_hwcg_regs {
> > > > @@ -1296,7 +1296,7 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
> > > >                 a5xx_gpmu_err_irq(gpu);
> > > >
> > > >         if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
> > > > -               a5xx_preempt_trigger(gpu);
> > > > +               a5xx_preempt_trigger(gpu, false);
> > > >                 msm_gpu_retire(gpu);
> > > >         }
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > index c7187bcc5e90..1120824853d4 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > @@ -57,10 +57,12 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> > > >   * through the process.
> > > >   *
> > > >   * PREEMPT_NONE - no preemption in progress.  Next state START.
> > > > - * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
> > > > - * states: TRIGGERED, NONE
> > > > + * PREEMPT_EVALUATE - The trigger is evaulating if preemption is possible. Next
> > > > + * states: START, ABORT
> > > >   * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
> > > >   * state: NONE.
> > > > + * PREEMPT_START - The trigger is preparing for preemption. Next state:
> > > > + * TRIGGERED
> > > >   * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
> > > >   * states: FAULTED, PENDING
> > > >   * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
> > > > @@ -71,8 +73,9 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> > > >
> > > >  enum preempt_state {
> > > >         PREEMPT_NONE = 0,
> > > > -       PREEMPT_START,
> > > > +       PREEMPT_EVALUATE,
> > > >         PREEMPT_ABORT,
> > > > +       PREEMPT_START,
> > > >         PREEMPT_TRIGGERED,
> > > >         PREEMPT_FAULTED,
> > > >         PREEMPT_PENDING,
> > > > @@ -156,7 +159,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state);
> > > >
> > > >  void a5xx_preempt_init(struct msm_gpu *gpu);
> > > >  void a5xx_preempt_hw_init(struct msm_gpu *gpu);
> > > > -void a5xx_preempt_trigger(struct msm_gpu *gpu);
> > > > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit);
> > > >  void a5xx_preempt_irq(struct msm_gpu *gpu);
> > > >  void a5xx_preempt_fini(struct msm_gpu *gpu);
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > index 67a8ef4adf6b..f8d09a83c5ae 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > @@ -87,21 +87,33 @@ static void a5xx_preempt_timer(struct timer_list *t)
> > > >  }
> > > >
> > > >  /* Try to trigger a preemption switch */
> > > > -void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > > > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit)
> > > >  {
> > > >         struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > > >         struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > > >         unsigned long flags;
> > > >         struct msm_ringbuffer *ring;
> > > > +       enum preempt_state state;
> > > >
> > > >         if (gpu->nr_rings == 1)
> > > >                 return;
> > > >
> > > >         /*
> > > > -        * Try to start preemption by moving from NONE to START. If
> > > > -        * unsuccessful, a preemption is already in flight
> > > > +        * Try to start preemption by moving from NONE to EVALUATE. If current
> > > > +        * state is EVALUATE/ABORT we can't just quit because then we can't
> > > > +        * guarantee that preempt_trigger will be called after ring is updated
> > > > +        * by new submit.
> > > >          */
> > > > -       if (!try_preempt_state(a5xx_gpu, PREEMPT_NONE, PREEMPT_START))
> > > > +       state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > > > +                              PREEMPT_EVALUATE);
> > > > +       while (new_submit && (state == PREEMPT_EVALUATE ||
> > > > +                             state == PREEMPT_ABORT)) {
> > >
> > > This isn't enough because even if new_submit is false then we may
> > > still need to guarantee that evaluation happens. We've seen a hang in
> > > a scenario like:
> > >
> > > 1. A job is submitted and executed on ring 0.
> > > 2. A job is submitted on ring 2 while ring 0 is still active but
> > > almost finished.
> > > 3. The submission thread starts evaluating and sees that ring 0 is still busy.
> > > 4. The job on ring 0 finishes and a CACHE_FLUSH IRQ is raised.
> > > 5. The IRQ tries to trigger a preemption but the state is still
> > > PREEMPT_EVALUATE or PREEMPT_ABORT and exits.
> > > 6. The submission thread finishes update_wptr() and finally sets the
> > > state to PREEMPT_NONE too late.
> > >
> > > Then we never preempt to ring 2 and there's a soft lockup.
> >
> > Thanks, i've missed that. It would need to always wait to prevent such
> > scenario. The next patch prevented this from happening for me so i have
> > overlooked it.
> >
> > Alternatively there is another approach which should perform better: to
> > let evaluation stage run in parallel.
> >
> > Also i've tried serializing preemption handling on ordered workqueue and
> > GPU kthread worker. It's a lot simpler but latency from IRQ doesn't look
> > good:
> >
> >            flush-trigger    SW_IRQ-pending   flush_IRQ-trigger
> >     uSecs    1    2    3       1    2    3       1    2    3
> >      0-10 1515   43   65    4423   39   24     647    0    2
> >     10-20 1484  453  103     446  414  309     399    1    1
> >     20-40  827 1802  358      19  819  587       2   21    6
> >     40-60    7 1264  397       1  368  329       0   30   14
> >     60-80    4  311  115       0  181  178       0   24   12
> >    80-120    2   36  251       0  250  188       0    9   13
> >   120-160    0    4  244       0  176  248       0  226  150
> >   160-200    0    1  278       0  221  235       0   86   78
> >   200-400    0    2 1266       0 1318 1186       0  476  688
> >   400-700    0    0  553       0  745 1028       0  150  106
> >  700-1000    0    0  121       0  264  366       0   65   28
> > 1000-1500    0    0   61       0  160  205       0   21    8
> >     >2000    0    0   12       0   71   48       0    0    0
> >
> > 1 - current implementation but with evaluation in parallel.
> > 2 - serialized on ordered workqueue.
> > 3 - serialized on GPU kthread_worker.
> 
> The problem with evaluating in parallel is that evaluating can race
> with the rest of the process - it's possible for the thread evaluating
> to go to be interrupted just before moving the state to PREEMPT_START
> and in the meantime an entire preemption has happened and it's out of
> date.

Right. This gets complicated to fix for sure.

> 
> What we did was to put a spinlock around the entire evaluation stage,
> effectively replacing the busy loop and the EVALUATE stage. It unlocks
> only after moving the state to NONE or knowing for certain that we're
> starting a preemption. That should be lower latency than a workqueue
> while the critical section shouldn't be that large (it's one atomic
> operation and checking each ring), and in any case that's what the
> spinning loop was doing anyway.

Actually this is what i've done initially but i didn't want to introduce
another spinlock.

Another thing to consider is: to disable IRQs for entire trigger routine
and use regular spin_lock instead so once it's decided to do a switch
it won't get interrupted (when called from submit).

Vladimir

> 
> Connor
> 
> >
> > Vladimir
> >
> > >
> > > Connor
> > >
> > > > +               cpu_relax();
> > > > +               state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > > > +                                      PREEMPT_EVALUATE);
> > > > +       }
> > > > +
> > > > +       if (state != PREEMPT_NONE)
> > > >                 return;
> > > >
> > > >         /* Get the next ring to preempt to */
> > > > @@ -130,6 +142,8 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > > >                 return;
> > > >         }
> > > >
> > > > +       set_preempt_state(a5xx_gpu, PREEMPT_START);
> > > > +
> > > >         /* Make sure the wptr doesn't update while we're in motion */
> > > >         spin_lock_irqsave(&ring->preempt_lock, flags);
> > > >         a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
> > > > @@ -188,6 +202,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
> > > >         update_wptr(gpu, a5xx_gpu->cur_ring);
> > > >
> > > >         set_preempt_state(a5xx_gpu, PREEMPT_NONE);
> > > > +
> > > > +       a5xx_preempt_trigger(gpu, false);
> > > >  }
> > > >
> > > >  void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> > > > --
> > > > 2.45.2
> > > >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage
  2024-08-01 14:23         ` Vladimir Lypak
@ 2024-08-01 15:57           ` Connor Abbott
  0 siblings, 0 replies; 21+ messages in thread
From: Connor Abbott @ 2024-08-01 15:57 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Aug 1, 2024 at 3:26 PM Vladimir Lypak <vladimir.lypak@gmail.com> wrote:
>
> On Thu, Aug 01, 2024 at 01:52:32PM +0100, Connor Abbott wrote:
> > On Thu, Aug 1, 2024 at 1:25 PM Vladimir Lypak <vladimir.lypak@gmail.com> wrote:
> > >
> > > On Mon, Jul 29, 2024 at 06:26:45PM +0100, Connor Abbott wrote:
> > > > On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
> > > > <vladimir.lypak@gmail.com> wrote:
> > > > >
> > > > > On A5XX GPUs when preemption is used it's invietable to enter a soft
> > > > > lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
> > > > > This appears as full UI lockup and not detected as GPU hang (because
> > > > > it's not). This happens due to not triggering preemption when it was
> > > > > needed. Sometimes this state can be recovered by some new submit but
> > > > > generally it won't happen because applications are waiting for old
> > > > > submits to retire.
> > > > >
> > > > > One of the reasons why this happens is a race between a5xx_submit and
> > > > > a5xx_preempt_trigger called from IRQ during submit retire. Former thread
> > > > > updates ring->cur of previously empty and not current ring right after
> > > > > latter checks it for emptiness. Then both threads can just exit because
> > > > > for first one preempt_state wasn't NONE yet and for second one all rings
> > > > > appeared to be empty.
> > > > >
> > > > > To prevent such situations from happening we need to establish guarantee
> > > > > for preempt_trigger to be called after each submit. To implement it this
> > > > > patch adds trigger call at the end of a5xx_preempt_irq to re-check if we
> > > > > should switch to non-empty or higher priority ring. Also we find next
> > > > > ring in new preemption state "EVALUATE". If the thread that updated some
> > > > > ring with new submit sees this state it should wait until it passes.
> > > > >
> > > > > Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> > > > > Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> > > > > ---
> > > > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  6 +++---
> > > > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 11 +++++++----
> > > > >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 24 +++++++++++++++++++----
> > > > >  3 files changed, 30 insertions(+), 11 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > > index 6c80d3003966..266744ee1d5f 100644
> > > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > > @@ -110,7 +110,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
> > > > >         }
> > > > >
> > > > >         a5xx_flush(gpu, ring, true);
> > > > > -       a5xx_preempt_trigger(gpu);
> > > > > +       a5xx_preempt_trigger(gpu, true);
> > > > >
> > > > >         /* we might not necessarily have a cmd from userspace to
> > > > >          * trigger an event to know that submit has completed, so
> > > > > @@ -240,7 +240,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> > > > >         a5xx_flush(gpu, ring, false);
> > > > >
> > > > >         /* Check to see if we need to start preemption */
> > > > > -       a5xx_preempt_trigger(gpu);
> > > > > +       a5xx_preempt_trigger(gpu, true);
> > > > >  }
> > > > >
> > > > >  static const struct adreno_five_hwcg_regs {
> > > > > @@ -1296,7 +1296,7 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
> > > > >                 a5xx_gpmu_err_irq(gpu);
> > > > >
> > > > >         if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
> > > > > -               a5xx_preempt_trigger(gpu);
> > > > > +               a5xx_preempt_trigger(gpu, false);
> > > > >                 msm_gpu_retire(gpu);
> > > > >         }
> > > > >
> > > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > > index c7187bcc5e90..1120824853d4 100644
> > > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > > @@ -57,10 +57,12 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> > > > >   * through the process.
> > > > >   *
> > > > >   * PREEMPT_NONE - no preemption in progress.  Next state START.
> > > > > - * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
> > > > > - * states: TRIGGERED, NONE
> > > > > + * PREEMPT_EVALUATE - The trigger is evaulating if preemption is possible. Next
> > > > > + * states: START, ABORT
> > > > >   * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
> > > > >   * state: NONE.
> > > > > + * PREEMPT_START - The trigger is preparing for preemption. Next state:
> > > > > + * TRIGGERED
> > > > >   * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
> > > > >   * states: FAULTED, PENDING
> > > > >   * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
> > > > > @@ -71,8 +73,9 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> > > > >
> > > > >  enum preempt_state {
> > > > >         PREEMPT_NONE = 0,
> > > > > -       PREEMPT_START,
> > > > > +       PREEMPT_EVALUATE,
> > > > >         PREEMPT_ABORT,
> > > > > +       PREEMPT_START,
> > > > >         PREEMPT_TRIGGERED,
> > > > >         PREEMPT_FAULTED,
> > > > >         PREEMPT_PENDING,
> > > > > @@ -156,7 +159,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state);
> > > > >
> > > > >  void a5xx_preempt_init(struct msm_gpu *gpu);
> > > > >  void a5xx_preempt_hw_init(struct msm_gpu *gpu);
> > > > > -void a5xx_preempt_trigger(struct msm_gpu *gpu);
> > > > > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit);
> > > > >  void a5xx_preempt_irq(struct msm_gpu *gpu);
> > > > >  void a5xx_preempt_fini(struct msm_gpu *gpu);
> > > > >
> > > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > > index 67a8ef4adf6b..f8d09a83c5ae 100644
> > > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > > @@ -87,21 +87,33 @@ static void a5xx_preempt_timer(struct timer_list *t)
> > > > >  }
> > > > >
> > > > >  /* Try to trigger a preemption switch */
> > > > > -void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > > > > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit)
> > > > >  {
> > > > >         struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > > > >         struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > > > >         unsigned long flags;
> > > > >         struct msm_ringbuffer *ring;
> > > > > +       enum preempt_state state;
> > > > >
> > > > >         if (gpu->nr_rings == 1)
> > > > >                 return;
> > > > >
> > > > >         /*
> > > > > -        * Try to start preemption by moving from NONE to START. If
> > > > > -        * unsuccessful, a preemption is already in flight
> > > > > +        * Try to start preemption by moving from NONE to EVALUATE. If current
> > > > > +        * state is EVALUATE/ABORT we can't just quit because then we can't
> > > > > +        * guarantee that preempt_trigger will be called after ring is updated
> > > > > +        * by new submit.
> > > > >          */
> > > > > -       if (!try_preempt_state(a5xx_gpu, PREEMPT_NONE, PREEMPT_START))
> > > > > +       state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > > > > +                              PREEMPT_EVALUATE);
> > > > > +       while (new_submit && (state == PREEMPT_EVALUATE ||
> > > > > +                             state == PREEMPT_ABORT)) {
> > > >
> > > > This isn't enough because even if new_submit is false then we may
> > > > still need to guarantee that evaluation happens. We've seen a hang in
> > > > a scenario like:
> > > >
> > > > 1. A job is submitted and executed on ring 0.
> > > > 2. A job is submitted on ring 2 while ring 0 is still active but
> > > > almost finished.
> > > > 3. The submission thread starts evaluating and sees that ring 0 is still busy.
> > > > 4. The job on ring 0 finishes and a CACHE_FLUSH IRQ is raised.
> > > > 5. The IRQ tries to trigger a preemption but the state is still
> > > > PREEMPT_EVALUATE or PREEMPT_ABORT and exits.
> > > > 6. The submission thread finishes update_wptr() and finally sets the
> > > > state to PREEMPT_NONE too late.
> > > >
> > > > Then we never preempt to ring 2 and there's a soft lockup.
> > >
> > > Thanks, i've missed that. It would need to always wait to prevent such
> > > scenario. The next patch prevented this from happening for me so i have
> > > overlooked it.
> > >
> > > Alternatively there is another approach which should perform better: to
> > > let evaluation stage run in parallel.
> > >
> > > Also i've tried serializing preemption handling on ordered workqueue and
> > > GPU kthread worker. It's a lot simpler but latency from IRQ doesn't look
> > > good:
> > >
> > >            flush-trigger    SW_IRQ-pending   flush_IRQ-trigger
> > >     uSecs    1    2    3       1    2    3       1    2    3
> > >      0-10 1515   43   65    4423   39   24     647    0    2
> > >     10-20 1484  453  103     446  414  309     399    1    1
> > >     20-40  827 1802  358      19  819  587       2   21    6
> > >     40-60    7 1264  397       1  368  329       0   30   14
> > >     60-80    4  311  115       0  181  178       0   24   12
> > >    80-120    2   36  251       0  250  188       0    9   13
> > >   120-160    0    4  244       0  176  248       0  226  150
> > >   160-200    0    1  278       0  221  235       0   86   78
> > >   200-400    0    2 1266       0 1318 1186       0  476  688
> > >   400-700    0    0  553       0  745 1028       0  150  106
> > >  700-1000    0    0  121       0  264  366       0   65   28
> > > 1000-1500    0    0   61       0  160  205       0   21    8
> > >     >2000    0    0   12       0   71   48       0    0    0
> > >
> > > 1 - current implementation but with evaluation in parallel.
> > > 2 - serialized on ordered workqueue.
> > > 3 - serialized on GPU kthread_worker.
> >
> > The problem with evaluating in parallel is that evaluating can race
> > with the rest of the process - it's possible for the thread evaluating
> > to go to be interrupted just before moving the state to PREEMPT_START
> > and in the meantime an entire preemption has happened and it's out of
> > date.
>
> Right. This gets complicated to fix for sure.
>
> >
> > What we did was to put a spinlock around the entire evaluation stage,
> > effectively replacing the busy loop and the EVALUATE stage. It unlocks
> > only after moving the state to NONE or knowing for certain that we're
> > starting a preemption. That should be lower latency than a workqueue
> > while the critical section shouldn't be that large (it's one atomic
> > operation and checking each ring), and in any case that's what the
> > spinning loop was doing anyway.
>
> Actually this is what i've done initially but i didn't want to introduce
> another spinlock.
>
> Another thing to consider is: to disable IRQs for entire trigger routine
> and use regular spin_lock instead so once it's decided to do a switch
> it won't get interrupted (when called from submit).

Isn't disabling IRQs and a regular spinlock the same as
spin_lock_irqsave()? We just used the latter, and it seems to work.
The pseudocode is something like:

spin_lock_irqsave();
if (!try_preempt_state(NONE, START)) { unlock(); return; }
if (!evaluate()) {
    set_preempt_state(ABORT);
    update_wptr();
    set_preempt_state(NONE);
    unlock();
    return;
}
unlock();
... // trigger the preemption

If the point is to have interrupts disabled for the entire time,
that's not necessary - once you unlock the state will be START and any
IRQ calling preempt_trigger() will immediately exit.

As for introducing another spinlock, this is no different than what
you're doing already with the spin loop. It's already basically a
(simple) spinlock. We're just replacing this with a real spinlock
that's simpler to reason about. However, if the spinlock is somehow
measurably slower than manually spinning, you can always keep it how
it is now with the EVALUATE state and manually add the irq
save/restore to make it safe to spin in the submit thread.

Connor

>
> Vladimir
>
> >
> > Connor
> >
> > >
> > > Vladimir
> > >
> > > >
> > > > Connor
> > > >
> > > > > +               cpu_relax();
> > > > > +               state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > > > > +                                      PREEMPT_EVALUATE);
> > > > > +       }
> > > > > +
> > > > > +       if (state != PREEMPT_NONE)
> > > > >                 return;
> > > > >
> > > > >         /* Get the next ring to preempt to */
> > > > > @@ -130,6 +142,8 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > > > >                 return;
> > > > >         }
> > > > >
> > > > > +       set_preempt_state(a5xx_gpu, PREEMPT_START);
> > > > > +
> > > > >         /* Make sure the wptr doesn't update while we're in motion */
> > > > >         spin_lock_irqsave(&ring->preempt_lock, flags);
> > > > >         a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
> > > > > @@ -188,6 +202,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
> > > > >         update_wptr(gpu, a5xx_gpu->cur_ring);
> > > > >
> > > > >         set_preempt_state(a5xx_gpu, PREEMPT_NONE);
> > > > > +
> > > > > +       a5xx_preempt_trigger(gpu, false);
> > > > >  }
> > > > >
> > > > >  void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> > > > > --
> > > > > 2.45.2
> > > > >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage
  2024-07-11 10:00 ` [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage Vladimir Lypak
  2024-07-29 17:26   ` Connor Abbott
@ 2024-08-05 19:29   ` Akhil P Oommen
  1 sibling, 0 replies; 21+ messages in thread
From: Akhil P Oommen @ 2024-08-05 19:29 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Jul 11, 2024 at 10:00:20AM +0000, Vladimir Lypak wrote:
> On A5XX GPUs when preemption is used it's invietable to enter a soft
> lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
> This appears as full UI lockup and not detected as GPU hang (because
> it's not). This happens due to not triggering preemption when it was
> needed. Sometimes this state can be recovered by some new submit but
> generally it won't happen because applications are waiting for old
> submits to retire.
> 
> One of the reasons why this happens is a race between a5xx_submit and
> a5xx_preempt_trigger called from IRQ during submit retire. Former thread
> updates ring->cur of previously empty and not current ring right after
> latter checks it for emptiness. Then both threads can just exit because
> for first one preempt_state wasn't NONE yet and for second one all rings
> appeared to be empty.
> 
> To prevent such situations from happening we need to establish guarantee
> for preempt_trigger to be called after each submit. To implement it this
> patch adds trigger call at the end of a5xx_preempt_irq to re-check if we
> should switch to non-empty or higher priority ring. Also we find next
> ring in new preemption state "EVALUATE". If the thread that updated some
> ring with new submit sees this state it should wait until it passes.
> 
> Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>

I didn't go through the other thread with Connor completely, but can you
please check if the below chunk is enough instead of this patch?

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index f58dd564d122..d69b14ebbe44 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -47,9 +47,8 @@ static inline void update_wptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)

        spin_lock_irqsave(&ring->preempt_lock, flags);
        wptr = get_wptr(ring);
-       spin_unlock_irqrestore(&ring->preempt_lock, flags);
-
        gpu_write(gpu, REG_A5XX_CP_RB_WPTR, wptr);
+       spin_unlock_irqrestore(&ring->preempt_lock, flags);
 }

 /* Return the highest priority ringbuffer with something in it */
@@ -188,6 +187,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
        update_wptr(gpu, a5xx_gpu->cur_ring);

        set_preempt_state(a5xx_gpu, PREEMPT_NONE);
+
+       a5xx_preempt_trigger(gpu);
 }

-Akhil

> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  6 +++---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 11 +++++++----
>  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 24 +++++++++++++++++++----
>  3 files changed, 30 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index 6c80d3003966..266744ee1d5f 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -110,7 +110,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
>  	}
>  
>  	a5xx_flush(gpu, ring, true);
> -	a5xx_preempt_trigger(gpu);
> +	a5xx_preempt_trigger(gpu, true);
>  
>  	/* we might not necessarily have a cmd from userspace to
>  	 * trigger an event to know that submit has completed, so
> @@ -240,7 +240,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>  	a5xx_flush(gpu, ring, false);
>  
>  	/* Check to see if we need to start preemption */
> -	a5xx_preempt_trigger(gpu);
> +	a5xx_preempt_trigger(gpu, true);
>  }
>  
>  static const struct adreno_five_hwcg_regs {
> @@ -1296,7 +1296,7 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
>  		a5xx_gpmu_err_irq(gpu);
>  
>  	if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
> -		a5xx_preempt_trigger(gpu);
> +		a5xx_preempt_trigger(gpu, false);
>  		msm_gpu_retire(gpu);
>  	}
>  
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> index c7187bcc5e90..1120824853d4 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> @@ -57,10 +57,12 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
>   * through the process.
>   *
>   * PREEMPT_NONE - no preemption in progress.  Next state START.
> - * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
> - * states: TRIGGERED, NONE
> + * PREEMPT_EVALUATE - The trigger is evaulating if preemption is possible. Next
> + * states: START, ABORT
>   * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
>   * state: NONE.
> + * PREEMPT_START - The trigger is preparing for preemption. Next state:
> + * TRIGGERED
>   * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
>   * states: FAULTED, PENDING
>   * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
> @@ -71,8 +73,9 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
>  
>  enum preempt_state {
>  	PREEMPT_NONE = 0,
> -	PREEMPT_START,
> +	PREEMPT_EVALUATE,
>  	PREEMPT_ABORT,
> +	PREEMPT_START,
>  	PREEMPT_TRIGGERED,
>  	PREEMPT_FAULTED,
>  	PREEMPT_PENDING,
> @@ -156,7 +159,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state);
>  
>  void a5xx_preempt_init(struct msm_gpu *gpu);
>  void a5xx_preempt_hw_init(struct msm_gpu *gpu);
> -void a5xx_preempt_trigger(struct msm_gpu *gpu);
> +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit);
>  void a5xx_preempt_irq(struct msm_gpu *gpu);
>  void a5xx_preempt_fini(struct msm_gpu *gpu);
>  
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> index 67a8ef4adf6b..f8d09a83c5ae 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> @@ -87,21 +87,33 @@ static void a5xx_preempt_timer(struct timer_list *t)
>  }
>  
>  /* Try to trigger a preemption switch */
> -void a5xx_preempt_trigger(struct msm_gpu *gpu)
> +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit)
>  {
>  	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>  	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
>  	unsigned long flags;
>  	struct msm_ringbuffer *ring;
> +	enum preempt_state state;
>  
>  	if (gpu->nr_rings == 1)
>  		return;
>  
>  	/*
> -	 * Try to start preemption by moving from NONE to START. If
> -	 * unsuccessful, a preemption is already in flight
> +	 * Try to start preemption by moving from NONE to EVALUATE. If current
> +	 * state is EVALUATE/ABORT we can't just quit because then we can't
> +	 * guarantee that preempt_trigger will be called after ring is updated
> +	 * by new submit.
>  	 */
> -	if (!try_preempt_state(a5xx_gpu, PREEMPT_NONE, PREEMPT_START))
> +	state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> +			       PREEMPT_EVALUATE);
> +	while (new_submit && (state == PREEMPT_EVALUATE ||
> +			      state == PREEMPT_ABORT)) {
> +		cpu_relax();
> +		state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> +				       PREEMPT_EVALUATE);
> +	}
> +
> +	if (state != PREEMPT_NONE)
>  		return;
>  
>  	/* Get the next ring to preempt to */
> @@ -130,6 +142,8 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
>  		return;
>  	}
>  
> +	set_preempt_state(a5xx_gpu, PREEMPT_START);
> +
>  	/* Make sure the wptr doesn't update while we're in motion */
>  	spin_lock_irqsave(&ring->preempt_lock, flags);
>  	a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
> @@ -188,6 +202,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
>  	update_wptr(gpu, a5xx_gpu->cur_ring);
>  
>  	set_preempt_state(a5xx_gpu, PREEMPT_NONE);
> +
> +	a5xx_preempt_trigger(gpu, false);
>  }
>  
>  void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> -- 
> 2.45.2
> 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/4] drm/msm/a5xx: workaround early ring-buffer emptiness check
  2024-07-11 10:00 [PATCH 0/4] fixes for Adreno A5Xx preemption Vladimir Lypak
                   ` (2 preceding siblings ...)
  2024-07-11 10:00 ` [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage Vladimir Lypak
@ 2024-07-11 10:00 ` Vladimir Lypak
  2024-08-05 19:58   ` Akhil P Oommen
  2024-07-17  9:40 ` [PATCH 0/4] fixes for Adreno A5Xx preemption Connor Abbott
  4 siblings, 1 reply; 21+ messages in thread
From: Vladimir Lypak @ 2024-07-11 10:00 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

There is another cause for soft lock-up of GPU in empty ring-buffer:
race between GPU executing last commands and CPU checking ring for
emptiness. On GPU side IRQ for retire is triggered by CACHE_FLUSH_TS
event and RPTR shadow (which is used to check ring emptiness) is updated
a bit later from CP_CONTEXT_SWITCH_YIELD. Thus if GPU is executing its
last commands slow enough or we check that ring too fast we will miss a
chance to trigger switch to lower priority ring because current ring isn't
empty just yet. This can escalate to lock-up situation described in
previous patch.
To work-around this issue we keep track of last submit sequence number
for each ring and compare it with one written to memptrs from GPU during
execution of CACHE_FLUSH_TS event.

Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 4 ++++
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 1 +
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 4 ++++
 3 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 266744ee1d5f..001f11f5febc 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -65,6 +65,8 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 
 static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
 	struct msm_ringbuffer *ring = submit->ring;
 	struct drm_gem_object *obj;
 	uint32_t *ptr, dwords;
@@ -109,6 +111,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
 		}
 	}
 
+	a5xx_gpu->last_seqno[ring->id] = submit->seqno;
 	a5xx_flush(gpu, ring, true);
 	a5xx_preempt_trigger(gpu, true);
 
@@ -210,6 +213,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	/* Write the fence to the scratch register */
 	OUT_PKT4(ring, REG_A5XX_CP_SCRATCH_REG(2), 1);
 	OUT_RING(ring, submit->seqno);
+	a5xx_gpu->last_seqno[ring->id] = submit->seqno;
 
 	/*
 	 * Execute a CACHE_FLUSH_TS event. This will ensure that the
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
index 1120824853d4..7269eaab9a7a 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -34,6 +34,7 @@ struct a5xx_gpu {
 	struct drm_gem_object *preempt_counters_bo[MSM_GPU_MAX_RINGS];
 	struct a5xx_preempt_record *preempt[MSM_GPU_MAX_RINGS];
 	uint64_t preempt_iova[MSM_GPU_MAX_RINGS];
+	uint32_t last_seqno[MSM_GPU_MAX_RINGS];
 
 	atomic_t preempt_state;
 	struct timer_list preempt_timer;
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index f8d09a83c5ae..6bd92f9b2338 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -55,6 +55,8 @@ static inline void update_wptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 /* Return the highest priority ringbuffer with something in it */
 static struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
 {
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
 	unsigned long flags;
 	int i;
 
@@ -64,6 +66,8 @@ static struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
 
 		spin_lock_irqsave(&ring->preempt_lock, flags);
 		empty = (get_wptr(ring) == gpu->funcs->get_rptr(gpu, ring));
+		if (!empty && ring == a5xx_gpu->cur_ring)
+			empty = ring->memptrs->fence == a5xx_gpu->last_seqno[i];
 		spin_unlock_irqrestore(&ring->preempt_lock, flags);
 
 		if (!empty)
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/4] drm/msm/a5xx: workaround early ring-buffer emptiness check
  2024-07-11 10:00 ` [PATCH 4/4] drm/msm/a5xx: workaround early ring-buffer emptiness check Vladimir Lypak
@ 2024-08-05 19:58   ` Akhil P Oommen
  0 siblings, 0 replies; 21+ messages in thread
From: Akhil P Oommen @ 2024-08-05 19:58 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Jul 11, 2024 at 10:00:21AM +0000, Vladimir Lypak wrote:
> There is another cause for soft lock-up of GPU in empty ring-buffer:
> race between GPU executing last commands and CPU checking ring for
> emptiness. On GPU side IRQ for retire is triggered by CACHE_FLUSH_TS
> event and RPTR shadow (which is used to check ring emptiness) is updated
> a bit later from CP_CONTEXT_SWITCH_YIELD. Thus if GPU is executing its
> last commands slow enough or we check that ring too fast we will miss a
> chance to trigger switch to lower priority ring because current ring isn't
> empty just yet. This can escalate to lock-up situation described in
> previous patch.
> To work-around this issue we keep track of last submit sequence number
> for each ring and compare it with one written to memptrs from GPU during
> execution of CACHE_FLUSH_TS event.
> 

This is interesting! Is this just theoretical or are you able to hit
this race on your device (after picking other fixes in this series)?

-Akhil.

> Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 4 ++++
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 1 +
>  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 4 ++++
>  3 files changed, 9 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index 266744ee1d5f..001f11f5febc 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -65,6 +65,8 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
>  
>  static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>  {
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
>  	struct msm_ringbuffer *ring = submit->ring;
>  	struct drm_gem_object *obj;
>  	uint32_t *ptr, dwords;
> @@ -109,6 +111,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
>  		}
>  	}
>  
> +	a5xx_gpu->last_seqno[ring->id] = submit->seqno;
>  	a5xx_flush(gpu, ring, true);
>  	a5xx_preempt_trigger(gpu, true);
>  
> @@ -210,6 +213,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>  	/* Write the fence to the scratch register */
>  	OUT_PKT4(ring, REG_A5XX_CP_SCRATCH_REG(2), 1);
>  	OUT_RING(ring, submit->seqno);
> +	a5xx_gpu->last_seqno[ring->id] = submit->seqno;
>  
>  	/*
>  	 * Execute a CACHE_FLUSH_TS event. This will ensure that the
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> index 1120824853d4..7269eaab9a7a 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> @@ -34,6 +34,7 @@ struct a5xx_gpu {
>  	struct drm_gem_object *preempt_counters_bo[MSM_GPU_MAX_RINGS];
>  	struct a5xx_preempt_record *preempt[MSM_GPU_MAX_RINGS];
>  	uint64_t preempt_iova[MSM_GPU_MAX_RINGS];
> +	uint32_t last_seqno[MSM_GPU_MAX_RINGS];
>  
>  	atomic_t preempt_state;
>  	struct timer_list preempt_timer;
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> index f8d09a83c5ae..6bd92f9b2338 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> @@ -55,6 +55,8 @@ static inline void update_wptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
>  /* Return the highest priority ringbuffer with something in it */
>  static struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
>  {
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
>  	unsigned long flags;
>  	int i;
>  
> @@ -64,6 +66,8 @@ static struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
>  
>  		spin_lock_irqsave(&ring->preempt_lock, flags);
>  		empty = (get_wptr(ring) == gpu->funcs->get_rptr(gpu, ring));
> +		if (!empty && ring == a5xx_gpu->cur_ring)
> +			empty = ring->memptrs->fence == a5xx_gpu->last_seqno[i];
>  		spin_unlock_irqrestore(&ring->preempt_lock, flags);
>  
>  		if (!empty)
> -- 
> 2.45.2
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] fixes for Adreno A5Xx preemption
  2024-07-11 10:00 [PATCH 0/4] fixes for Adreno A5Xx preemption Vladimir Lypak
                   ` (3 preceding siblings ...)
  2024-07-11 10:00 ` [PATCH 4/4] drm/msm/a5xx: workaround early ring-buffer emptiness check Vladimir Lypak
@ 2024-07-17  9:40 ` Connor Abbott
  2024-07-17 16:31   ` Vladimir Lypak
  4 siblings, 1 reply; 21+ messages in thread
From: Connor Abbott @ 2024-07-17  9:40 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
<vladimir.lypak@gmail.com> wrote:
>
> There are several issues with preemption on Adreno A5XX GPUs which
> render system unusable if more than one priority level is used. Those
> issues include persistent GPU faults and hangs, full UI lockups with
> idling GPU.
>
> ---
> Vladimir Lypak (4):
>   drm/msm/a5xx: disable preemption in submits by default
>   drm/msm/a5xx: properly clear preemption records on resume
>   drm/msm/a5xx: fix races in preemption evaluation stage
>   drm/msm/a5xx: workaround early ring-buffer emptiness check
>
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 18 ++++++++++----
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 12 ++++++---
>  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 ++++++++++++++++++++---
>  3 files changed, 47 insertions(+), 13 deletions(-)
> ---
> base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88
> --
> 2.45.2
>

Hi Vladimir,

While looking at preemption on a7xx, where the overall logic is pretty
much the same, and I've been seeing the same "soft lockups". However
even after porting patch 3, it turns out that's not enough because
there's a different race. The sequence of events is something like
this:

1. Medium-prio app A submits to ring 2.
2. Ring 0 retires its last job and we start a preemption to ring 2.
3. High-prio app B submits to ring 0. It sees the preemption from step
2 ongoing and doesn't write the WTPR register or try to preempt.
4. The preemption finishes and we update WPTR.
5. App A's submit retires. We try to preempt, but the submit and
ring->cur write from step 3 happened on a different CPU and the write
hasn't landed yet so we skip it.

It's a bit tricky because write reordering is involved, but this seems
to be what's happening - everything except my speculation about the
delayed write to ring->cur being the problem comes straight from a
trace of this happening.

Rob suggested on IRC that we make retire handling happen on the same
workqueue as submissions, so that preempt_trigger is always
serialized, which IIUC would also make patch 3 unnecessary. What do
you think?

Best regards,

Connor

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] fixes for Adreno A5Xx preemption
  2024-07-17  9:40 ` [PATCH 0/4] fixes for Adreno A5Xx preemption Connor Abbott
@ 2024-07-17 16:31   ` Vladimir Lypak
  2024-07-17 17:52     ` Connor Abbott
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Lypak @ 2024-07-17 16:31 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Wed, Jul 17, 2024 at 10:40:26AM +0100, Connor Abbott wrote:
> On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
> <vladimir.lypak@gmail.com> wrote:
> >
> > There are several issues with preemption on Adreno A5XX GPUs which
> > render system unusable if more than one priority level is used. Those
> > issues include persistent GPU faults and hangs, full UI lockups with
> > idling GPU.
> >
> > ---
> > Vladimir Lypak (4):
> >   drm/msm/a5xx: disable preemption in submits by default
> >   drm/msm/a5xx: properly clear preemption records on resume
> >   drm/msm/a5xx: fix races in preemption evaluation stage
> >   drm/msm/a5xx: workaround early ring-buffer emptiness check
> >
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 18 ++++++++++----
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 12 ++++++---
> >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 ++++++++++++++++++++---
> >  3 files changed, 47 insertions(+), 13 deletions(-)
> > ---
> > base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88
> > --
> > 2.45.2
> >
> 
> Hi Vladimir,

Hi Connor!

> 
> While looking at preemption on a7xx, where the overall logic is pretty
> much the same, and I've been seeing the same "soft lockups". However
> even after porting patch 3, it turns out that's not enough because
> there's a different race. The sequence of events is something like
> this:
> 
> 1. Medium-prio app A submits to ring 2.
> 2. Ring 0 retires its last job and we start a preemption to ring 2.
> 3. High-prio app B submits to ring 0. It sees the preemption from step
> 2 ongoing and doesn't write the WTPR register or try to preempt.
> 4. The preemption finishes and we update WPTR.
At this point with patch 3 applied it should switch to ring 0 right away
because of trigger call in the end of a5xx_preempt_irq. Didn't you
forget it? Downstream has such call too. Even though it makes preemption
a little more aggressive it doesn't work without it.

> 5. App A's submit retires. We try to preempt, but the submit and
> ring->cur write from step 3 happened on a different CPU and the write
> hasn't landed yet so we skip it.

I don't think this is possible on modern CPUs. Could it be that retire
IRQ appeared earlier (while it was switching from 0 to 2) and you are
looking at msm_gpu_submit_retired trace event which is called from
retire work later.

> 
> It's a bit tricky because write reordering is involved, but this seems
> to be what's happening - everything except my speculation about the
> delayed write to ring->cur being the problem comes straight from a
> trace of this happening.
> 
> Rob suggested on IRC that we make retire handling happen on the same
> workqueue as submissions, so that preempt_trigger is always
> serialized, which IIUC would also make patch 3 unnecessary. What do
> you think?

In this patch series i have tried to do least amount of changes so it
could be easily back-ported. It isn't pretty but it works reliably for
me. Otherwise it would be fine to just disable preemption like it's done
on LTS before 5.4 and rework preemption in new kernel releases.

Kind regards,

Vladimir

> 
> Best regards,
> 
> Connor

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] fixes for Adreno A5Xx preemption
  2024-07-17 16:31   ` Vladimir Lypak
@ 2024-07-17 17:52     ` Connor Abbott
  0 siblings, 0 replies; 21+ messages in thread
From: Connor Abbott @ 2024-07-17 17:52 UTC (permalink / raw)
  To: Vladimir Lypak
  Cc: Rob Clark, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Dmitry Baryshkov, Marijn Suijten, David Airlie, Daniel Vetter,
	Jordan Crouse, linux-arm-msm, dri-devel, freedreno, linux-kernel

On Wed, Jul 17, 2024 at 5:33 PM Vladimir Lypak <vladimir.lypak@gmail.com> wrote:
>
> On Wed, Jul 17, 2024 at 10:40:26AM +0100, Connor Abbott wrote:
> > On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
> > <vladimir.lypak@gmail.com> wrote:
> > >
> > > There are several issues with preemption on Adreno A5XX GPUs which
> > > render system unusable if more than one priority level is used. Those
> > > issues include persistent GPU faults and hangs, full UI lockups with
> > > idling GPU.
> > >
> > > ---
> > > Vladimir Lypak (4):
> > >   drm/msm/a5xx: disable preemption in submits by default
> > >   drm/msm/a5xx: properly clear preemption records on resume
> > >   drm/msm/a5xx: fix races in preemption evaluation stage
> > >   drm/msm/a5xx: workaround early ring-buffer emptiness check
> > >
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 18 ++++++++++----
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 12 ++++++---
> > >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 ++++++++++++++++++++---
> > >  3 files changed, 47 insertions(+), 13 deletions(-)
> > > ---
> > > base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88
> > > --
> > > 2.45.2
> > >
> >
> > Hi Vladimir,
>
> Hi Connor!
>
> >
> > While looking at preemption on a7xx, where the overall logic is pretty
> > much the same, and I've been seeing the same "soft lockups". However
> > even after porting patch 3, it turns out that's not enough because
> > there's a different race. The sequence of events is something like
> > this:
> >
> > 1. Medium-prio app A submits to ring 2.
> > 2. Ring 0 retires its last job and we start a preemption to ring 2.
> > 3. High-prio app B submits to ring 0. It sees the preemption from step
> > 2 ongoing and doesn't write the WTPR register or try to preempt.
> > 4. The preemption finishes and we update WPTR.
> At this point with patch 3 applied it should switch to ring 0 right away
> because of trigger call in the end of a5xx_preempt_irq. Didn't you
> forget it? Downstream has such call too. Even though it makes preemption
> a little more aggressive it doesn't work without it.

Yes, I didn't apply that bit because it didn't seem necessary to fix
the original issue you described and it seemed like just an
optimization to make preemption more aggressive, however it does seem
to fix the issue. I can't verify that the issue you describe (the
retire IRQ arriving before preemption IRQ) is what's actually
happening because adding a tracepoint on retire seems to change the
timing enough so that the lockup doesn't happen, though. So I guess
I'll just have to assume that's what it was.

Given how subtle this is, enough that I missed it, maybe it's worth a
comment and an extra commit.

Also, I forgot to mention that while I was reading this over I found
another (theoretical) race - we could flush a submit in between
calling update_wptr() and set_preempt_state(PREEMPT_NONE) in
a5xx_preempt_irq() and never update wptr. I would fix it by renaming
PREEMPT_ABORT to PREEMPT_FINISH and pulling out the ABORT ->
update_wptr() -> NONE sequence in a5xx_preempt_trigger() into a
separate function that also gets called in a5xx_preempt_irq().

Connor

>
> > 5. App A's submit retires. We try to preempt, but the submit and
> > ring->cur write from step 3 happened on a different CPU and the write
> > hasn't landed yet so we skip it.
>
> I don't think this is possible on modern CPUs. Could it be that retire
> IRQ appeared earlier (while it was switching from 0 to 2) and you are
> looking at msm_gpu_submit_retired trace event which is called from
> retire work later.
>
> >
> > It's a bit tricky because write reordering is involved, but this seems
> > to be what's happening - everything except my speculation about the
> > delayed write to ring->cur being the problem comes straight from a
> > trace of this happening.
> >
> > Rob suggested on IRC that we make retire handling happen on the same
> > workqueue as submissions, so that preempt_trigger is always
> > serialized, which IIUC would also make patch 3 unnecessary. What do
> > you think?
>
> In this patch series i have tried to do least amount of changes so it
> could be easily back-ported. It isn't pretty but it works reliably for
> me. Otherwise it would be fine to just disable preemption like it's done
> on LTS before 5.4 and rework preemption in new kernel releases.
>
> Kind regards,
>
> Vladimir
>
> >
> > Best regards,
> >
> > Connor

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-08-05 19:59 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-11 10:00 [PATCH 0/4] fixes for Adreno A5Xx preemption Vladimir Lypak
2024-07-11 10:00 ` [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default Vladimir Lypak
2024-07-15 21:00   ` Rob Clark
2024-08-01 12:38     ` Akhil P Oommen
2024-07-11 10:00 ` [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume Vladimir Lypak
2024-07-11 10:42   ` Konrad Dybcio
2024-08-01 13:16   ` Akhil P Oommen
2024-08-02 13:41     ` Vladimir Lypak
2024-08-05 19:07       ` Akhil P Oommen
2024-07-11 10:00 ` [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage Vladimir Lypak
2024-07-29 17:26   ` Connor Abbott
2024-08-01 12:22     ` Vladimir Lypak
2024-08-01 12:52       ` Connor Abbott
2024-08-01 14:23         ` Vladimir Lypak
2024-08-01 15:57           ` Connor Abbott
2024-08-05 19:29   ` Akhil P Oommen
2024-07-11 10:00 ` [PATCH 4/4] drm/msm/a5xx: workaround early ring-buffer emptiness check Vladimir Lypak
2024-08-05 19:58   ` Akhil P Oommen
2024-07-17  9:40 ` [PATCH 0/4] fixes for Adreno A5Xx preemption Connor Abbott
2024-07-17 16:31   ` Vladimir Lypak
2024-07-17 17:52     ` Connor Abbott

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox