* [PATCH 0/2] Support SDMA queue reset
From: Amber Lin @ 2026-03-23 18:44 UTC
To: amd-gfx; +Cc: Michael.Chen, Shaoyun.Liu, Amber Lin

This series follows the compute queue/pipe reset series of patches to enable
SDMA queue reset on gfx v12.1

Amber Lin (2):
  drm/amdgpu: Support MES suspend_all_sdma_gangs
  drm/amdkfd: Enable SDMA queue reset on GC v12

 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c   | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h   | 1 +
 drivers/gpu/drm/amd/amdgpu/mes_v12_1.c    | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++-
 4 files changed, 7 insertions(+), 1 deletion(-)

-- 
2.43.0

* [PATCH 1/2] drm/amdgpu: Support MES suspend_all_sdma_gangs
From: Amber Lin @ 2026-03-23 18:44 UTC
To: amd-gfx; +Cc: Michael.Chen, Shaoyun.Liu, Amber Lin

suspend_all_sdma_gangs is supported in new MES firmware for gfx 12.1

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 +
 drivers/gpu/drm/amd/amdgpu/mes_v12_1.c  | 1 +
 3 files changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index fd6b40d9da58..dfe18f0a3501 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -311,6 +311,9 @@ int amdgpu_mes_suspend(struct amdgpu_device *adev, uint32_t xcc_id)
 	memset(&input, 0x0, sizeof(struct mes_suspend_gang_input));
 	input.suspend_all_gangs = 1;
 	input.xcc_id = xcc_id;
+	if ((amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(12, 1, 0)) &&
+	    ((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 0x71))
+		input.suspend_all_sdma_gangs = 1;
 
 	/*
 	 * Avoid taking any other locks under MES lock to avoid circular
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 44fa4d73bce8..2d08e33eb1e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -298,6 +298,7 @@ struct mes_unmap_legacy_queue_input {
 struct mes_suspend_gang_input {
 	uint32_t xcc_id;
 	bool suspend_all_gangs;
+	bool suspend_all_sdma_gangs;
 	uint64_t gang_context_addr;
 	uint64_t suspend_fence_addr;
 	uint32_t suspend_fence_value;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_1.c
index ac9e26b8bb52..8a0c3dc0ecb7 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v12_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_1.c
@@ -479,6 +479,7 @@ static int mes_v12_1_suspend_gang(struct amdgpu_mes *mes,
 	mes_suspend_gang_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
 
 	mes_suspend_gang_pkt.suspend_all_gangs = input->suspend_all_gangs;
+	mes_suspend_gang_pkt.suspend_all_sdma_gangs = input->suspend_all_sdma_gangs;
 	mes_suspend_gang_pkt.gang_context_addr = input->gang_context_addr;
 	mes_suspend_gang_pkt.suspend_fence_addr = input->suspend_fence_addr;
 	mes_suspend_gang_pkt.suspend_fence_value = input->suspend_fence_value;
-- 
2.43.0

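The gate that patch 1/2 adds in amdgpu_mes_suspend() can be read as a pure predicate on two values: the GC IP version and the masked MES scheduler firmware version. The following is a minimal user-space sketch of that predicate, not the driver code. Every SKETCH_* name and the function name are hypothetical stand-ins, and the version encoding and the mask value are assumptions made only so the example compiles outside the kernel; only the 12.1.0 comparison and the 0x71 threshold come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for IP_VERSION() and AMDGPU_MES_VERSION_MASK.
 * The bit layout and the mask value are assumptions for this sketch and
 * may differ from the kernel's definitions.
 */
#define SKETCH_IP_VERSION(mj, mn, rv) \
	(((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))
#define SKETCH_MES_VERSION_MASK 0x00000fffu

/* Minimum scheduler firmware version taken from the patch (0x71). */
#define SKETCH_MIN_SDMA_SUSPEND_FW 0x71u

/*
 * Returns true when the suspend request should also set
 * suspend_all_sdma_gangs: GC must be exactly 12.1.0 and the masked
 * scheduler version must be at least 0x71.
 */
static bool sketch_wants_sdma_gang_suspend(uint32_t gc_ip_version,
					   uint32_t sched_version)
{
	return gc_ip_version == SKETCH_IP_VERSION(12, 1, 0) &&
	       (sched_version & SKETCH_MES_VERSION_MASK) >=
		       SKETCH_MIN_SDMA_SUSPEND_FW;
}
```

Because the comparison is made on the masked value, any upper bits in sched_version do not affect the result in this sketch; older firmware or any other GC version leaves the new flag at zero, which matches the memset() of the input structure in the patch.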
* Re: [PATCH 1/2] drm/amdgpu: Support MES suspend_all_sdma_gangs
From: Alex Deucher @ 2026-03-23 19:25 UTC
To: Amber Lin; +Cc: amd-gfx, Michael.Chen, Shaoyun.Liu

On Mon, Mar 23, 2026 at 2:54 PM Amber Lin <Amber.Lin@amd.com> wrote:
>
> suspend_all_sdma_gangs is supported in new MES firmware for gfx 12.1
>
> Signed-off-by: Amber Lin <Amber.Lin@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 3 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 +
>  drivers/gpu/drm/amd/amdgpu/mes_v12_1.c  | 1 +
>  3 files changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> index fd6b40d9da58..dfe18f0a3501 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> @@ -311,6 +311,9 @@ int amdgpu_mes_suspend(struct amdgpu_device *adev, uint32_t xcc_id)
>  	memset(&input, 0x0, sizeof(struct mes_suspend_gang_input));
>  	input.suspend_all_gangs = 1;
>  	input.xcc_id = xcc_id;
> +	if ((amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(12, 1, 0)) &&
> +	    ((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 0x71))
> +		input.suspend_all_sdma_gangs = 1;
>
>  	/*
>  	 * Avoid taking any other locks under MES lock to avoid circular
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> index 44fa4d73bce8..2d08e33eb1e9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> @@ -298,6 +298,7 @@ struct mes_unmap_legacy_queue_input {
>  struct mes_suspend_gang_input {
>  	uint32_t xcc_id;
>  	bool suspend_all_gangs;
> +	bool suspend_all_sdma_gangs;
>  	uint64_t gang_context_addr;
>  	uint64_t suspend_fence_addr;
>  	uint32_t suspend_fence_value;
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_1.c
> index ac9e26b8bb52..8a0c3dc0ecb7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_1.c
> @@ -479,6 +479,7 @@ static int mes_v12_1_suspend_gang(struct amdgpu_mes *mes,
>  	mes_suspend_gang_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
>
>  	mes_suspend_gang_pkt.suspend_all_gangs = input->suspend_all_gangs;
> +	mes_suspend_gang_pkt.suspend_all_sdma_gangs = input->suspend_all_sdma_gangs;
>  	mes_suspend_gang_pkt.gang_context_addr = input->gang_context_addr;
>  	mes_suspend_gang_pkt.suspend_fence_addr = input->suspend_fence_addr;
>  	mes_suspend_gang_pkt.suspend_fence_value = input->suspend_fence_value;
> --
> 2.43.0
>

* [PATCH 2/2] drm/amdkfd: Enable SDMA queue reset on gfx v12.1
From: Amber Lin @ 2026-03-23 18:44 UTC
To: amd-gfx; +Cc: Michael.Chen, Shaoyun.Liu, Amber Lin

After suspend/resume sdma_gang is supported on MES 12.1, SDMA queue reset
is supported too.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 4c52819aef9e..42d52c1f5109 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -514,7 +514,8 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
 			dev->node_props.capability |=
 				HSA_CAP_AQL_QUEUE_DOUBLE_MAP;
 
-		if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(10, 0, 0) &&
+		if ((KFD_GC_VERSION(dev->gpu) < IP_VERSION(10, 0, 0) ||
+		    KFD_GC_VERSION(dev->gpu) == IP_VERSION(12, 1, 0)) &&
 		    (dev->gpu->adev->sdma.supported_reset & AMDGPU_RESET_TYPE_PER_QUEUE))
 			dev->node_props.capability2 |= HSA_CAP2_PER_SDMA_QUEUE_RESET_SUPPORTED;
 
-- 
2.43.0

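The condition changed by patch 2/2 combines an IP-version check (older than 10.0.0, or exactly 12.1.0) with a check that the SDMA block reports per-queue reset. A minimal user-space sketch of that decision follows, under stated assumptions: all SKETCH_* names and the function name are hypothetical, and the bit positions chosen for the reset type and the capability flag are placeholders, not the values of AMDGPU_RESET_TYPE_PER_QUEUE or HSA_CAP2_PER_SDMA_QUEUE_RESET_SUPPORTED.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IP_VERSION(); the bit layout is an assumption. */
#define SKETCH_IP_VERSION(mj, mn, rv) \
	(((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))

/* Placeholder bit values, chosen only for this sketch. */
#define SKETCH_RESET_TYPE_PER_QUEUE      (1u << 2)
#define SKETCH_CAP2_PER_SDMA_QUEUE_RESET (1u << 0)

/*
 * Returns the capability2 bit to OR into the node properties, or 0.
 * Mirrors the patched condition: (GC < 10.0.0 || GC == 12.1.0) and the
 * SDMA block advertises per-queue reset.
 */
static uint32_t sketch_sdma_reset_cap2(uint32_t gc_version,
				       uint32_t sdma_supported_reset)
{
	bool ip_ok = gc_version < SKETCH_IP_VERSION(10, 0, 0) ||
		     gc_version == SKETCH_IP_VERSION(12, 1, 0);
	bool hw_ok = (sdma_supported_reset & SKETCH_RESET_TYPE_PER_QUEUE) != 0;

	return (ip_ok && hw_ok) ? SKETCH_CAP2_PER_SDMA_QUEUE_RESET : 0;
}
```

Note that the 12.1.0 test is an equality, so in this sketch a 12.0.0 part does not receive the capability even if its SDMA block reports per-queue reset; this is consistent with the follow-up discussion about extending support to other MES chips later.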
* Re: [PATCH 2/2] drm/amdkfd: Enable SDMA queue reset on gfx v12.1
From: Alex Deucher @ 2026-03-23 19:25 UTC
To: Amber Lin; +Cc: amd-gfx, Michael.Chen, Shaoyun.Liu

On Mon, Mar 23, 2026 at 3:04 PM Amber Lin <Amber.Lin@amd.com> wrote:
>
> After suspend/resume sdma_gang is supported on MES 12.1, SDMA queue reset
> is supported too.
>
> Signed-off-by: Amber Lin <Amber.Lin@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 4c52819aef9e..42d52c1f5109 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -514,7 +514,8 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
>  			dev->node_props.capability |=
>  				HSA_CAP_AQL_QUEUE_DOUBLE_MAP;
>
> -		if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(10, 0, 0) &&
> +		if ((KFD_GC_VERSION(dev->gpu) < IP_VERSION(10, 0, 0) ||
> +		    KFD_GC_VERSION(dev->gpu) == IP_VERSION(12, 1, 0)) &&

I thought this already worked on other MES-enabled chips.

Alex

>  		    (dev->gpu->adev->sdma.supported_reset & AMDGPU_RESET_TYPE_PER_QUEUE))
>  			dev->node_props.capability2 |= HSA_CAP2_PER_SDMA_QUEUE_RESET_SUPPORTED;
>
> --
> 2.43.0
>

* Re: [PATCH 2/2] drm/amdkfd: Enable SDMA queue reset on gfx v12.1
From: Amber Lin @ 2026-03-23 19:48 UTC
To: Alex Deucher; +Cc: amd-gfx, Michael.Chen, Shaoyun.Liu

On 3/23/26 15:25, Alex Deucher wrote:
> On Mon, Mar 23, 2026 at 3:04 PM Amber Lin <Amber.Lin@amd.com> wrote:
>> After suspend/resume sdma_gang is supported on MES 12.1, SDMA queue reset
>> is supported too.
>>
>> Signed-off-by: Amber Lin <Amber.Lin@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> index 4c52819aef9e..42d52c1f5109 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> @@ -514,7 +514,8 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
>>  			dev->node_props.capability |=
>>  				HSA_CAP_AQL_QUEUE_DOUBLE_MAP;
>>
>> -		if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(10, 0, 0) &&
>> +		if ((KFD_GC_VERSION(dev->gpu) < IP_VERSION(10, 0, 0) ||
>> +		    KFD_GC_VERSION(dev->gpu) == IP_VERSION(12, 1, 0)) &&
> I thought this already worked on other MES-enabled chips.
>
> Alex

My understanding is that user-mode compute (and SDMA) queue/pipe reset
currently works only on gfx 9 with HWS. gfx 12.1 is the first MES-enabled
chip to support it on the user-mode compute side, although queue/pipe reset
is already supported on the gfx and kernel-mode compute side. My next task
is to extend this support to gfx 12 and other existing MES chips.

>
>>  		    (dev->gpu->adev->sdma.supported_reset & AMDGPU_RESET_TYPE_PER_QUEUE))
>>  			dev->node_props.capability2 |= HSA_CAP2_PER_SDMA_QUEUE_RESET_SUPPORTED;
>>
>> --
>> 2.43.0
>>

end of thread, other threads:[~2026-03-23 19:48 UTC | newest]

Thread overview: 6+ messages
2026-03-23 18:44 [PATCH 0/2] Support SDMA queue reset Amber Lin
2026-03-23 18:44 ` [PATCH 1/2] drm/amdgpu: Support MES suspend_all_sdma_gangs Amber Lin
2026-03-23 19:25   ` Alex Deucher
2026-03-23 18:44 ` [PATCH 2/2] drm/amdkfd: Enable SDMA queue reset on gfx v12.1 Amber Lin
2026-03-23 19:25   ` Alex Deucher
2026-03-23 19:48     ` Amber Lin