public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Lypak <vladimir.lypak@gmail.com>
To: Connor Abbott <cwabbott0@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>, Sean Paul <sean@poorly.run>,
	Konrad Dybcio <konrad.dybcio@linaro.org>,
	Abhinav Kumar <quic_abhinavk@quicinc.com>,
	Dmitry Baryshkov <dmitry.baryshkov@linaro.org>,
	Marijn Suijten <marijn.suijten@somainline.org>,
	David Airlie <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>,
	Jordan Crouse <jordan@cosmicpenguin.net>,
	linux-arm-msm@vger.kernel.org, dri-devel@lists.freedesktop.org,
	freedreno@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage
Date: Thu, 1 Aug 2024 14:23:58 +0000	[thread overview]
Message-ID: <ZquafkbK67ecsp99@trashcan> (raw)
In-Reply-To: <CACu1E7GrWj1EiTgov6f_nUkUR3WPWD6zs4H7OPL7Maw3i2-WQg@mail.gmail.com>

On Thu, Aug 01, 2024 at 01:52:32PM +0100, Connor Abbott wrote:
> On Thu, Aug 1, 2024 at 1:25 PM Vladimir Lypak <vladimir.lypak@gmail.com> wrote:
> >
> > On Mon, Jul 29, 2024 at 06:26:45PM +0100, Connor Abbott wrote:
> > > On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
> > > <vladimir.lypak@gmail.com> wrote:
> > > >
> > > > On A5XX GPUs when preemption is used it's invietable to enter a soft
> > > > lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
> > > > This appears as full UI lockup and not detected as GPU hang (because
> > > > it's not). This happens due to not triggering preemption when it was
> > > > needed. Sometimes this state can be recovered by some new submit but
> > > > generally it won't happen because applications are waiting for old
> > > > submits to retire.
> > > >
> > > > One of the reasons why this happens is a race between a5xx_submit and
> > > > a5xx_preempt_trigger called from IRQ during submit retire. Former thread
> > > > updates ring->cur of previously empty and not current ring right after
> > > > latter checks it for emptiness. Then both threads can just exit because
> > > > for first one preempt_state wasn't NONE yet and for second one all rings
> > > > appeared to be empty.
> > > >
> > > > To prevent such situations from happening we need to establish guarantee
> > > > for preempt_trigger to be called after each submit. To implement it this
> > > > patch adds trigger call at the end of a5xx_preempt_irq to re-check if we
> > > > should switch to non-empty or higher priority ring. Also we find next
> > > > ring in new preemption state "EVALUATE". If the thread that updated some
> > > > ring with new submit sees this state it should wait until it passes.
> > > >
> > > > Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
> > > > Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  6 +++---
> > > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 11 +++++++----
> > > >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 24 +++++++++++++++++++----
> > > >  3 files changed, 30 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > index 6c80d3003966..266744ee1d5f 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > @@ -110,7 +110,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
> > > >         }
> > > >
> > > >         a5xx_flush(gpu, ring, true);
> > > > -       a5xx_preempt_trigger(gpu);
> > > > +       a5xx_preempt_trigger(gpu, true);
> > > >
> > > >         /* we might not necessarily have a cmd from userspace to
> > > >          * trigger an event to know that submit has completed, so
> > > > @@ -240,7 +240,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> > > >         a5xx_flush(gpu, ring, false);
> > > >
> > > >         /* Check to see if we need to start preemption */
> > > > -       a5xx_preempt_trigger(gpu);
> > > > +       a5xx_preempt_trigger(gpu, true);
> > > >  }
> > > >
> > > >  static const struct adreno_five_hwcg_regs {
> > > > @@ -1296,7 +1296,7 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
> > > >                 a5xx_gpmu_err_irq(gpu);
> > > >
> > > >         if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
> > > > -               a5xx_preempt_trigger(gpu);
> > > > +               a5xx_preempt_trigger(gpu, false);
> > > >                 msm_gpu_retire(gpu);
> > > >         }
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > index c7187bcc5e90..1120824853d4 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
> > > > @@ -57,10 +57,12 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> > > >   * through the process.
> > > >   *
> > > >   * PREEMPT_NONE - no preemption in progress.  Next state START.
> > > > - * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
> > > > - * states: TRIGGERED, NONE
> > > > + * PREEMPT_EVALUATE - The trigger is evaulating if preemption is possible. Next
> > > > + * states: START, ABORT
> > > >   * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
> > > >   * state: NONE.
> > > > + * PREEMPT_START - The trigger is preparing for preemption. Next state:
> > > > + * TRIGGERED
> > > >   * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
> > > >   * states: FAULTED, PENDING
> > > >   * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
> > > > @@ -71,8 +73,9 @@ void a5xx_debugfs_init(struct msm_gpu *gpu, struct drm_minor *minor);
> > > >
> > > >  enum preempt_state {
> > > >         PREEMPT_NONE = 0,
> > > > -       PREEMPT_START,
> > > > +       PREEMPT_EVALUATE,
> > > >         PREEMPT_ABORT,
> > > > +       PREEMPT_START,
> > > >         PREEMPT_TRIGGERED,
> > > >         PREEMPT_FAULTED,
> > > >         PREEMPT_PENDING,
> > > > @@ -156,7 +159,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state);
> > > >
> > > >  void a5xx_preempt_init(struct msm_gpu *gpu);
> > > >  void a5xx_preempt_hw_init(struct msm_gpu *gpu);
> > > > -void a5xx_preempt_trigger(struct msm_gpu *gpu);
> > > > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit);
> > > >  void a5xx_preempt_irq(struct msm_gpu *gpu);
> > > >  void a5xx_preempt_fini(struct msm_gpu *gpu);
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > index 67a8ef4adf6b..f8d09a83c5ae 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> > > > @@ -87,21 +87,33 @@ static void a5xx_preempt_timer(struct timer_list *t)
> > > >  }
> > > >
> > > >  /* Try to trigger a preemption switch */
> > > > -void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > > > +void a5xx_preempt_trigger(struct msm_gpu *gpu, bool new_submit)
> > > >  {
> > > >         struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > > >         struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > > >         unsigned long flags;
> > > >         struct msm_ringbuffer *ring;
> > > > +       enum preempt_state state;
> > > >
> > > >         if (gpu->nr_rings == 1)
> > > >                 return;
> > > >
> > > >         /*
> > > > -        * Try to start preemption by moving from NONE to START. If
> > > > -        * unsuccessful, a preemption is already in flight
> > > > +        * Try to start preemption by moving from NONE to EVALUATE. If current
> > > > +        * state is EVALUATE/ABORT we can't just quit because then we can't
> > > > +        * guarantee that preempt_trigger will be called after ring is updated
> > > > +        * by new submit.
> > > >          */
> > > > -       if (!try_preempt_state(a5xx_gpu, PREEMPT_NONE, PREEMPT_START))
> > > > +       state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > > > +                              PREEMPT_EVALUATE);
> > > > +       while (new_submit && (state == PREEMPT_EVALUATE ||
> > > > +                             state == PREEMPT_ABORT)) {
> > >
> > > This isn't enough because even if new_submit is false then we may
> > > still need to guarantee that evaluation happens. We've seen a hang in
> > > a scenario like:
> > >
> > > 1. A job is submitted and executed on ring 0.
> > > 2. A job is submitted on ring 2 while ring 0 is still active but
> > > almost finished.
> > > 3. The submission thread starts evaluating and sees that ring 0 is still busy.
> > > 4. The job on ring 0 finishes and a CACHE_FLUSH IRQ is raised.
> > > 5. The IRQ tries to trigger a preemption but the state is still
> > > PREEMPT_EVALUATE or PREEMPT_ABORT and exits.
> > > 6. The submission thread finishes update_wptr() and finally sets the
> > > state to PREEMPT_NONE too late.
> > >
> > > Then we never preempt to ring 2 and there's a soft lockup.
> >
> > Thanks, i've missed that. It would need to always wait to prevent such
> > scenario. The next patch prevented this from happening for me so i have
> > overlooked it.
> >
> > Alternatively there is another approach which should perform better: to
> > let evaluation stage run in parallel.
> >
> > Also i've tried serializing preemption handling on ordered workqueue and
> > GPU kthread worker. It's a lot simpler but latency from IRQ doesn't look
> > good:
> >
> >            flush-trigger    SW_IRQ-pending   flush_IRQ-trigger
> >     uSecs    1    2    3       1    2    3       1    2    3
> >      0-10 1515   43   65    4423   39   24     647    0    2
> >     10-20 1484  453  103     446  414  309     399    1    1
> >     20-40  827 1802  358      19  819  587       2   21    6
> >     40-60    7 1264  397       1  368  329       0   30   14
> >     60-80    4  311  115       0  181  178       0   24   12
> >    80-120    2   36  251       0  250  188       0    9   13
> >   120-160    0    4  244       0  176  248       0  226  150
> >   160-200    0    1  278       0  221  235       0   86   78
> >   200-400    0    2 1266       0 1318 1186       0  476  688
> >   400-700    0    0  553       0  745 1028       0  150  106
> >  700-1000    0    0  121       0  264  366       0   65   28
> > 1000-1500    0    0   61       0  160  205       0   21    8
> >     >2000    0    0   12       0   71   48       0    0    0
> >
> > 1 - current implementation but with evaluation in parallel.
> > 2 - serialized on ordered workqueue.
> > 3 - serialized on GPU kthread_worker.
> 
> The problem with evaluating in parallel is that evaluating can race
> with the rest of the process - it's possible for the thread evaluating
> to go to be interrupted just before moving the state to PREEMPT_START
> and in the meantime an entire preemption has happened and it's out of
> date.

Right. This gets complicated to fix for sure.

> 
> What we did was to put a spinlock around the entire evaluation stage,
> effectively replacing the busy loop and the EVALUATE stage. It unlocks
> only after moving the state to NONE or knowing for certain that we're
> starting a preemption. That should be lower latency than a workqueue
> while the critical section shouldn't be that large (it's one atomic
> operation and checking each ring), and in any case that's what the
> spinning loop was doing anyway.

Actually this is what i've done initially but i didn't want to introduce
another spinlock.

Another thing to consider is: to disable IRQs for entire trigger routine
and use regular spin_lock instead so once it's decided to do a switch
it won't get interrupted (when called from submit).

Vladimir

> 
> Connor
> 
> >
> > Vladimir
> >
> > >
> > > Connor
> > >
> > > > +               cpu_relax();
> > > > +               state = atomic_cmpxchg(&a5xx_gpu->preempt_state, PREEMPT_NONE,
> > > > +                                      PREEMPT_EVALUATE);
> > > > +       }
> > > > +
> > > > +       if (state != PREEMPT_NONE)
> > > >                 return;
> > > >
> > > >         /* Get the next ring to preempt to */
> > > > @@ -130,6 +142,8 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
> > > >                 return;
> > > >         }
> > > >
> > > > +       set_preempt_state(a5xx_gpu, PREEMPT_START);
> > > > +
> > > >         /* Make sure the wptr doesn't update while we're in motion */
> > > >         spin_lock_irqsave(&ring->preempt_lock, flags);
> > > >         a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
> > > > @@ -188,6 +202,8 @@ void a5xx_preempt_irq(struct msm_gpu *gpu)
> > > >         update_wptr(gpu, a5xx_gpu->cur_ring);
> > > >
> > > >         set_preempt_state(a5xx_gpu, PREEMPT_NONE);
> > > > +
> > > > +       a5xx_preempt_trigger(gpu, false);
> > > >  }
> > > >
> > > >  void a5xx_preempt_hw_init(struct msm_gpu *gpu)
> > > > --
> > > > 2.45.2
> > > >

  reply	other threads:[~2024-08-01 14:26 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-11 10:00 [PATCH 0/4] fixes for Adreno A5Xx preemption Vladimir Lypak
2024-07-11 10:00 ` [PATCH 1/4] drm/msm/a5xx: disable preemption in submits by default Vladimir Lypak
2024-07-15 21:00   ` Rob Clark
2024-08-01 12:38     ` Akhil P Oommen
2024-07-11 10:00 ` [PATCH 2/4] drm/msm/a5xx: properly clear preemption records on resume Vladimir Lypak
2024-07-11 10:42   ` Konrad Dybcio
2024-08-01 13:16   ` Akhil P Oommen
2024-08-02 13:41     ` Vladimir Lypak
2024-08-05 19:07       ` Akhil P Oommen
2024-07-11 10:00 ` [PATCH 3/4] drm/msm/a5xx: fix races in preemption evaluation stage Vladimir Lypak
2024-07-29 17:26   ` Connor Abbott
2024-08-01 12:22     ` Vladimir Lypak
2024-08-01 12:52       ` Connor Abbott
2024-08-01 14:23         ` Vladimir Lypak [this message]
2024-08-01 15:57           ` Connor Abbott
2024-08-05 19:29   ` Akhil P Oommen
2024-07-11 10:00 ` [PATCH 4/4] drm/msm/a5xx: workaround early ring-buffer emptiness check Vladimir Lypak
2024-08-05 19:58   ` Akhil P Oommen
2024-07-17  9:40 ` [PATCH 0/4] fixes for Adreno A5Xx preemption Connor Abbott
2024-07-17 16:31   ` Vladimir Lypak
2024-07-17 17:52     ` Connor Abbott

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZquafkbK67ecsp99@trashcan \
    --to=vladimir.lypak@gmail.com \
    --cc=airlied@gmail.com \
    --cc=cwabbott0@gmail.com \
    --cc=daniel@ffwll.ch \
    --cc=dmitry.baryshkov@linaro.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=freedreno@lists.freedesktop.org \
    --cc=jordan@cosmicpenguin.net \
    --cc=konrad.dybcio@linaro.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marijn.suijten@somainline.org \
    --cc=quic_abhinavk@quicinc.com \
    --cc=robdclark@gmail.com \
    --cc=sean@poorly.run \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox