Re: [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Timur Kristóf" <timur.kristof@gmail.com>
To: Alex Deucher <alexdeucher@gmail.com>
Cc: amd-gfx@lists.freedesktop.org, Alex Deucher <alexander.deucher@amd.com>
Subject: Re: [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset
Date: Thu, 18 Dec 2025 11:03:15 -0600	[thread overview]
Message-ID: <98350545.IzOArtZ34E@timur-max> (raw)
In-Reply-To: <CADnq5_OdBWFJJEAH9_YikzVXw-JcGCF1E6OnUhruw+_F0H9E3w@mail.gmail.com>

On 2025. december 18., csütörtök 9:58:45 középső államokbeli zónaidő Alex 
Deucher wrote:
> On Thu, Dec 18, 2025 at 12:21 AM Timur Kristóf <timur.kristof@gmail.com> 
wrote:
> > On 2025. december 15., hétfő 10:07:11 középső államokbeli zónaidő Alex
> > Deucher> 
> > wrote:
> > > GFX ring resets work differently on pre-GFX10 hardware since
> > > there is no MQD managed by the scheduler.
> > > For ring reset, you need issue the reset via CP_VMID_RESET
> > > via KIQ or MMIO and submit the following to the gfx ring to
> > > complete the reset:
> > > 1. EOP packet with EXEC bit set
> > > 2. WAIT_REG_MEM to wait for the fence
> > > 3. Clear CP_VMID_RESET to 0
> > > 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE
> > > 5. EOP packet with EXEC bit set
> > > 6. WAIT_REG_MEM to wait for the fence
> > > Once those commands have completed the reset should
> > > be complete and the ring can accept new packets.
> > > 
> > > However, because we have a pipeline sync between jobs,
> > > the PFP is waiting on the fence from the bad job to signal so
> > > it can't process any of the packets in the reset sequence
> > > until that pipeline sync clears.  To unblock the PFP, we
> > > use the KIQ to signal the fence after we reset the queue.
> > > 
> > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > > ---
> > > 
> > >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++++++++++++++++++++++++-
> > >  1 file changed, 101 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index
> > > bb1465a98c7ca..9b7073650315e
> > > 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > @@ -2410,8 +2410,10 @@ static int gfx_v9_0_sw_init(struct
> > > amdgpu_ip_block
> > > *ip_block) amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]);
> > > 
> > >       adev->gfx.compute_supported_reset =
> > >       
> > >               amdgpu_get_soft_full_reset_mask(&adev-
> > >
> > >gfx.compute_ring[0]);
> > >
> > > -     if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset)
> > > +     if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset)
> > 
> > {
> > 
> > >               adev->gfx.compute_supported_reset |=
> > 
> > AMDGPU_RESET_TYPE_PER_QUEUE;
> > 
> > > +             adev->gfx.gfx_supported_reset |=
> > 
> > AMDGPU_RESET_TYPE_PER_QUEUE;
> > 
> > > +     }
> > > 
> > >       r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0);
> > >       if (r) {
> > > 
> > > @@ -7163,6 +7165,103 @@ static void gfx_v9_ring_insert_nop(struct
> > > amdgpu_ring *ring, uint32_t num_nop) amdgpu_ring_insert_nop(ring,
> > > num_nop -
> > > 1);
> > > 
> > >  }
> > > 
> > > +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring,
> > > +                                    uint32_t reg,
> > > +                                    uint32_t val)
> > > +{
> > > +     uint32_t cmd = 0;
> > > +
> > > +     switch (ring->funcs->type) {
> > > +     case AMDGPU_RING_TYPE_KIQ:
> > > +             cmd = (1 << 16); /* no inc addr */
> > 
> > What do you mean by "inc addr" in this context?
> 
> It's part of the packet.  bit 16 controls whether the address is
> incremented or not.  This function is basically the same as
> gfx_v9_0_ring_emit_wreg(), but uses the ME to do the wait rather than
> the PFP.  I could have alternatively added a new parameter to
> gfx_v9_0_ring_emit_wreg() to select between PFP and ME.

Thanks. I was just not familiar with this field of the packet.

> 
> > > +             break;
> > > +     default:
> > > +             cmd = WR_CONFIRM;
> > > +             break;
> > > +     }
> > > +     amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
> > > +     amdgpu_ring_write(ring, cmd);
> > > +     amdgpu_ring_write(ring, reg);
> > > +     amdgpu_ring_write(ring, 0);
> > > +     amdgpu_ring_write(ring, val);
> > > +}
> > > +
> > > +static void gfx_v9_0_ring_emit_event_write(struct amdgpu_ring *ring,
> > > +                                        uint32_t event_type,
> > > +                                        uint32_t
> > 
> > event_index)
> > 
> > > +{
> > > +     amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0));
> > > +     amdgpu_ring_write(ring, EVENT_TYPE(event_type) |
> > > +                       EVENT_INDEX(event_index));
> > > +}
> > > +
> > > +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring,
> > > +                           unsigned int vmid,
> > > +                           struct amdgpu_fence *timedout_fence)
> > > +{
> > > +     struct amdgpu_device *adev = ring->adev;
> > > +     struct amdgpu_kiq *kiq = &adev->gfx.kiq[0];
> > > +     struct amdgpu_ring *kiq_ring = &kiq->ring;
> > > +     unsigned long flags;
> > > +     u32 tmp;
> > > +     int r;
> > > +
> > > +     amdgpu_ring_reset_helper_begin(ring, timedout_fence);
> > > +
> > > +     spin_lock_irqsave(&kiq->ring_lock, flags);
> > > +
> > > +     if (amdgpu_ring_alloc(kiq_ring, 5 + 5)) {
> > > +             spin_unlock_irqrestore(&kiq->ring_lock, flags);
> > > +             return -ENOMEM;
> > > +     }
> > > +
> > > +     /* send the reset - 5 */
> > > +     tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid);
> > > +     gfx_v9_0_ring_emit_wreg(kiq_ring,
> > > +                             SOC15_REG_OFFSET(GC, 0,
> > 
> > mmCP_VMID_RESET), tmp);
> > 
> > > +     /* emit the fence to clear the pipeline sync - 5 */
> > > +     gfx_v9_0_ring_emit_fence_kiq(kiq_ring, ring->fence_drv.gpu_addr,
> > > +                                  timedout_fence->base.seqno,
> > 
> > 0);
> > 
> > As far as I see, this isn't going to work when sched_hw_submission > 2 and
> > there are more than two jobs (from various different userspace processes)
> > emitted in the ring.
> > 
> > I can think of two possible solutons:
> > - Emit each fence individually, with a short delay in between to give a
> > chance to the GFX ring to catch up with the KIQ.
> > - Change the wait_reg_mem command used for the pipeline sync to allow
> > greater than equal instead of just equal. Then it's enough to signal just
> > the last fence on the KIQ ring.
> 
> That won't work.  The signalling patch is asynchronous so if we signal
> additional fences other than the bad one, those jobs will end up being
> seen as successfully completed.

Yeah, I already realized that after I sent the email, and sent a follow-up 
email with a different suggestion, that is to change the pipeline sync to not 
rely on the fence of the previous submission. eg. the pipeline sync could emit 
a separate fence, or even simpler, an ACQUIRE_MEM (gfx9+) / SURFACE_SYNC 
(gfx6-8). Then this becomes a non-issue.

What do you think about that?

> That said, there is a change coming
> for the firmware to fix this. 

That's very nice to hear that AMD is still willing to fix firmware for old chips 
like Vega. However,

- Would be nice to solve the problem for users who are still running old 
firmware, as it doesn't seem too difficult to do so.
- AFAIK gfx7-8 also use the same queue reset mechanism so I think they may be 
susceptible to the same issue (and I don't think anyone's gonna release new FW 
for those).

> I'd suggest we just limit the queue
> depth to 2 until the new firmware is available.

Last I heard of it, the SteamOS kernel set it to 4 due to "CPU bubbles" that 
they observed with the default setting of 2 on the Steam Deck. I think on the 
long term we should seriously consider increasing the default in upstream as 
well.

> 
> > > +     amdgpu_ring_commit(kiq_ring);
> > > +     r = amdgpu_ring_test_ring(kiq_ring);
> > > +     spin_unlock_irqrestore(&kiq->ring_lock, flags);
> > > +     if (r)
> > > +             return r;
> > > +
> > > +     if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7))
> > > +             return -ENOMEM;
> > > +     /* emit the fence to finish the reset - 8 */
> > > +     ring->trail_seq++;
> > > +     gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr,
> > > +                              ring->trail_seq,
> > 
> > AMDGPU_FENCE_FLAG_EXEC);
> > 
> > > +     /* wait for the fence - 7 */
> > > +     gfx_v9_0_wait_reg_mem(ring, 0, 1, 0,
> > > +                           lower_32_bits(ring-
> > >
> > >trail_fence_gpu_addr),
> > >
> > > +                           upper_32_bits(ring-
> > >
> > >trail_fence_gpu_addr),
> > >
> > > +                           ring->trail_seq, 0xffffffff, 4);
> > > +     /* clear mmCP_VMID_RESET - 5 */
> > > +     gfx_v9_0_ring_emit_wreg_me(ring,
> > > +                                SOC15_REG_OFFSET(GC, 0,
> > 
> > mmCP_VMID_RESET), 0);
> > 
> > > +     /* event write ENABLE_LEGACY_PIPELINE - 2 */
> > > +     gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0);
> > > +     /* emit a regular fence - 8 */
> > > +     ring->trail_seq++;
> > > +     gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr,
> > > +                              ring->trail_seq,
> > 
> > AMDGPU_FENCE_FLAG_EXEC);
> > 
> > > +     /* wait for the fence - 7 */
> > > +     gfx_v9_0_wait_reg_mem(ring, 1, 1, 0,
> > > +                           lower_32_bits(ring-
> > >
> > >trail_fence_gpu_addr),
> > >
> > > +                           upper_32_bits(ring-
> > >
> > >trail_fence_gpu_addr),
> > >
> > > +                           ring->trail_seq, 0xffffffff, 4);
> > 
> > Why is it necessary to emit (and wait for) a regular fence here?
> > I'm not against it, just curious why it's needed.
> 
> It's part of the recovery sequence to make sure the
> ENABLE_LEGACY_PIPELINE event has completed successfully.

Ah, okay. I didn't realize that we needed fences after EVENT_WRITE.

> 
> > > +     amdgpu_ring_commit(ring);
> > > +     /* wait for the commands to complete */
> > > +     r = amdgpu_ring_test_ring(ring);
> > > +     if (r)
> > > +             return r;
> > > +
> > > +     return amdgpu_ring_reset_helper_end(ring, timedout_fence);
> > > +}
> > > +
> > > 
> > >  static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring,
> > >  
> > >                             unsigned int vmid,
> > >                             struct amdgpu_fence *timedout_fence)
> > > 
> > > @@ -7441,9 +7540,9 @@ static const struct amdgpu_ring_funcs
> > > gfx_v9_0_ring_funcs_gfx = { .emit_wreg = gfx_v9_0_ring_emit_wreg,
> > > 
> > >       .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
> > >       .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait,
> > > 
> > > -     .soft_recovery = gfx_v9_0_ring_soft_recovery,
> > 
> > Can you please split removing the soft recovery into a separate patch?
> > 
> > Can we talk about removing the soft recovery? For the other chips where it
> > has already been removed, it is percieved by users as a regression.
> 
> Queue reset is superset of soft recovery.  There's no need for soft
> recovery when queue reset is available.
> 

I'm not arguing for or against either approach, just relaying how the change 
is percieved by users who tell their experience.

From a user's perspective:
- queue reset just kills the entire job and the offending process, ie. the user 
sees the game crash
- soft recovery sometimes allows the guilty process to move on intact, ie. the 
user would see a "hitch" in the game but then it would keep working


> 
> > >       .emit_mem_sync = gfx_v9_0_emit_mem_sync,
> > >       .emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader,
> > > 
> > > +     .reset = gfx_v9_0_reset_kgq,
> > > 
> > >       .begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use,
> > >       .end_use = amdgpu_gfx_enforce_isolation_ring_end_use,
> > >  
> > >  };
> > > 
> > > @@ -7542,7 +7641,6 @@ static const struct amdgpu_ring_funcs
> > > gfx_v9_0_ring_funcs_compute = { .emit_wreg = gfx_v9_0_ring_emit_wreg,
> > > 
> > >       .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
> > >       .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait,
> > > 
> > > -     .soft_recovery = gfx_v9_0_ring_soft_recovery,
> > > 
> > >       .emit_mem_sync = gfx_v9_0_emit_mem_sync,
> > >       .emit_wave_limit = gfx_v9_0_emit_wave_limit,
> > >       .reset = gfx_v9_0_reset_kcq,

next prev parent reply	other threads:[~2025-12-18 17:03 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher
2025-12-15 16:07 ` [PATCH 2/6] drm/amdgpu: always backup and reemit fences Alex Deucher
2025-12-18  5:17   ` Timur Kristóf
2025-12-15 16:07 ` [PATCH 3/6] drm/amdgpu: use dma_fence_get_status() for adapter reset Alex Deucher
2025-12-15 16:07 ` [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler Alex Deucher
2025-12-18  5:15   ` Timur Kristóf
2025-12-18 15:41     ` Alex Deucher
2025-12-18 15:49       ` Timur Kristóf
2025-12-15 16:07 ` [PATCH 5/6] drm/amdgpu: mark fences with errors before ring reset Alex Deucher
2025-12-15 16:07 ` [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ " Alex Deucher
2025-12-18  5:11   ` Timur Kristóf
2025-12-18  5:28     ` Timur Kristóf
2025-12-18 15:58     ` Alex Deucher
2025-12-18 17:03       ` Timur Kristóf [this message]
2025-12-18 19:02         ` Alex Deucher
2025-12-18  5:20 ` [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Timur Kristóf
2025-12-18 15:36   ` Alex Deucher
2025-12-18 15:47     ` Timur Kristóf
     [not found] <BL1PR12MB5849EF718ACDD3FB753F8D1BE7A8A@BL1PR12MB5849.namprd12.prod.outlook.com>
2025-12-18  2:56 ` [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset Chen, Jiqian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98350545.IzOArtZ34E@timur-max \
    --to=timur.kristof@gmail.com \
    --cc=alexander.deucher@amd.com \
    --cc=alexdeucher@gmail.com \
    --cc=amd-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.