From: Amber Lin <Amber.Lin@amd.com>
To: <amd-gfx@lists.freedesktop.org>
Cc: <Shaoyun.Liu@amd.com>, <Michael.Chen@amd.com>,
<Jesse.Zhang@amd.com>, Amber Lin <Amber.Lin@amd.com>
Subject: [PATCH 0/8] Support compute queue/pipe reset on gfx 12.1
Date: Fri, 20 Mar 2026 16:02:00 -0400
Message-ID: <20260320200208.1188307-1-Amber.Lin@amd.com>
Instead of having MES do the detection and the driver do the reset, this
series implements compute queue/pipe reset with both detection and reset
done in MES.

When REMOVE_QUEUE fails, the driver treats it as a sign that at least one
queue has hung. The driver sends SUSPEND to suspend all queues, then RESET
to reset the hung queues. MES unmaps the hung queues and stores their
information in the doorbell array and hqd_info for the driver. The driver
finds each valid doorbell offset in the doorbell array and looks up
hqd_info for the corresponding hung queue's information. Next, the driver
cleans up the hung queues and sends RESUME to resume the healthy queues.
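
As a rough illustration of the lookup step described above (not code from
this series), the sketch below scans a doorbell array for valid offsets and
pairs each one with an hqd_info entry. Every name in it is hypothetical,
and it assumes that an all-ones value marks an unused doorbell slot and
that hqd_info is indexed the same way as the doorbell array; the actual
layout is defined by the patches themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; they are not the amdgpu/amdkfd API. */
#define SKETCH_MAX_HUNG    8
#define SKETCH_INVALID_DB  0xFFFFFFFFu  /* assumed "unused slot" marker */

struct sketch_hqd_info {
	uint32_t pipe;
	uint32_t queue;
};

struct sketch_hung_queue {
	uint32_t doorbell_offset;
	struct sketch_hqd_info hqd;
};

/*
 * Walk the doorbell array that MES filled in after RESET. Each slot that
 * holds a valid doorbell offset names one hung queue; the hqd_info entry
 * at the same index (an assumption of this sketch) describes it.
 * Returns the number of hung queues copied to out, at most out_cap.
 */
static size_t sketch_collect_hung_queues(const uint32_t *doorbells,
					 const struct sketch_hqd_info *hqd_info,
					 size_t slots,
					 struct sketch_hung_queue *out,
					 size_t out_cap)
{
	size_t found = 0;

	assert(doorbells && hqd_info && out);

	for (size_t i = 0; i < slots && found < out_cap; i++) {
		if (doorbells[i] == SKETCH_INVALID_DB)
			continue;	/* slot not used by MES */

		out[found].doorbell_offset = doorbells[i];
		out[found].hqd = hqd_info[i];
		found++;
	}

	return found;
}
```

In the flow described above, a caller would run a scan like this after
RESET completes, clean up each returned queue, and only then send RESUME
for the remaining healthy queues.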

Amber Lin (8):
  drm/amdgpu: Fix gfx_hqd_mask in mes 12.1
  drm/amdgpu: Fixup boost mes detect hang array size
  drm/amdgpu: Fixup detect and reset
  drm/amdgpu: Create hqd info structure
  drm/amdgpu: Missing multi-XCC support in MES
  drm/amdgpu: Enable suspend/resume gang in mes 12.1
  drm/amdkfd: Add detect+reset hangs to GC 12.1
  drm/amdkfd: Reset queue/pipe in MES

 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c       |  89 ++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h       |  23 ++-
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c        |   2 +-
 drivers/gpu/drm/amd/amdgpu/mes_userqueue.c    |   2 +-
 drivers/gpu/drm/amd/amdgpu/mes_v12_1.c        |  98 ++++++++----
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 151 +++++++++++++++++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c     |   1 +
 8 files changed, 306 insertions(+), 61 deletions(-)

--
2.43.0