* [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
@ 2026-01-18 12:57 Timur Kristóf
2026-01-19 1:57 ` Liang, Prike
2026-01-19 10:12 ` Christian König
0 siblings, 2 replies; 10+ messages in thread
From: Timur Kristóf @ 2026-01-18 12:57 UTC (permalink / raw)
To: amd-gfx, Alexander.Deucher, Christian.Koenig, Prike Liang,
Mario Limonciello
Cc: Timur Kristóf
When a function holds a lock and we return without unlocking it,
it deadlocks the kernel. We should always unlock before returning.
This commit fixes suspend/resume on SI.
Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
Fixes: bc2dea30038a ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
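For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the bug class described above. It is not the amdgpu code: `fake_lock`, `fake_dev`, `flush_leaky` and `flush_fixed` are hypothetical names, and the lock is modeled as a plain flag so the leak can be observed without threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in amdgpu. */
struct fake_lock { bool held; };

struct fake_dev {
	struct fake_lock reset_lock;
	void (*flush)(struct fake_dev *dev);	/* optional callback, may be NULL */
	int flushes;
};

void fake_lock_acquire(struct fake_lock *l)
{
	/* Taking an exclusive lock that is already held would block forever. */
	assert(!l->held);
	l->held = true;
}

void fake_lock_release(struct fake_lock *l)
{
	l->held = false;
}

void count_flush(struct fake_dev *dev)
{
	dev->flushes++;
}

/* Buggy shape: the early return skips the unlock. */
int flush_leaky(struct fake_dev *dev)
{
	fake_lock_acquire(&dev->reset_lock);
	if (!dev->flush)
		return 0;	/* the lock is still held here */
	dev->flush(dev);
	fake_lock_release(&dev->reset_lock);
	return 0;
}

/* Fixed shape: every exit path goes through the unlock label. */
int flush_fixed(struct fake_dev *dev)
{
	fake_lock_acquire(&dev->reset_lock);
	if (!dev->flush)
		goto out_unlock;
	dev->flush(dev);
out_unlock:
	fake_lock_release(&dev->reset_lock);
	return 0;
}
```

With a NULL callback, `flush_leaky()` returns success but leaves `reset_lock.held` set, so the next acquire would never succeed; `flush_fixed()` leaves the lock released on both paths.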
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 0e67fa4338ff..4fa24be1bf45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
unsigned int ndw;
- int r, cnt = 0;
+ int r = 0, cnt = 0;
uint32_t seq;
/*
@@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
- return 0;
+ goto error_unlock_reset;
if (adev->gmc.flush_tlb_needs_extra_type_2)
adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
@@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
flush_type, all_hub,
inst);
- r = 0;
} else {
/* 2 dwords flush + 8 dwords fence */
ndw = kiq->pmf->invalidate_tlbs_size + 8;
--
2.52.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* RE: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
2026-01-18 12:57 [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid() Timur Kristóf
@ 2026-01-19 1:57 ` Liang, Prike
2026-01-19 5:27 ` Liang, Prike
2026-01-19 10:12 ` Christian König
1 sibling, 1 reply; 10+ messages in thread
From: Liang, Prike @ 2026-01-19 1:57 UTC (permalink / raw)
To: Timur Kristóf, amd-gfx@lists.freedesktop.org,
Deucher, Alexander, Koenig, Christian, Limonciello, Mario,
Dan Carpenter
[Public]
Thank you for the fix. Could you please add the following tags?
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
| Closes: https://lore.kernel.org/r/202601190121.z9C0uml5-lkp@intel.com/
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Regards,
Prike
> -----Original Message-----
> From: Timur Kristóf <timur.kristof@gmail.com>
> Sent: Sunday, January 18, 2026 8:58 PM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>;
> Liang, Prike <Prike.Liang@amd.com>; Limonciello, Mario
> <Mario.Limonciello@amd.com>
> Cc: Timur Kristóf <timur.kristof@gmail.com>
> Subject: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
>
> When a function holds a lock and we return without unlocking it, it deadlocks the
> kernel. We should always unlock before returning.
>
> This commit fixes suspend/resume on SI.
> Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
>
> Fixes: bc2dea30038a ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 0e67fa4338ff..4fa24be1bf45 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> amdgpu_device *adev, uint16_t pasid,
> struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
> struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
> unsigned int ndw;
> - int r, cnt = 0;
> + int r = 0, cnt = 0;
> uint32_t seq;
>
> /*
> @@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> amdgpu_device *adev, uint16_t pasid,
> if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
>
> if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
> - return 0;
> + goto error_unlock_reset;
>
> if (adev->gmc.flush_tlb_needs_extra_type_2)
> adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> @@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> amdgpu_device *adev, uint16_t pasid,
> adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> flush_type, all_hub,
> inst);
> - r = 0;
> } else {
> /* 2 dwords flush + 8 dwords fence */
> ndw = kiq->pmf->invalidate_tlbs_size + 8;
> --
> 2.52.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
2026-01-19 1:57 ` Liang, Prike
@ 2026-01-19 5:27 ` Liang, Prike
2026-01-19 9:00 ` Timur Kristóf
2026-01-19 10:33 ` Christian König
0 siblings, 2 replies; 10+ messages in thread
From: Liang, Prike @ 2026-01-19 5:27 UTC (permalink / raw)
To: Liang, Prike, Timur Kristóf, amd-gfx@lists.freedesktop.org,
Deucher, Alexander, Koenig, Christian, Limonciello, Mario,
Dan Carpenter
[Public]
In order to avoid being blocked by the lock issue on some older GFX, I will push the patch to amd-staging-drm-next.
If you have any concerns, please let me know.
Regards,
Prike
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Liang, Prike
> Sent: Monday, January 19, 2026 9:58 AM
> To: Timur Kristóf <timur.kristof@gmail.com>; amd-gfx@lists.freedesktop.org;
> Deucher, Alexander <Alexander.Deucher@amd.com>; Koenig, Christian
> <Christian.Koenig@amd.com>; Limonciello, Mario <Mario.Limonciello@amd.com>;
> Dan Carpenter <dan.carpenter@linaro.org>
> Subject: RE: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
>
> [Public]
>
> Thank you for the fix. Could you please add the following the tags?
>
> | Reported-by: kernel test robot <lkp@intel.com>
> | Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> | Closes: https://lore.kernel.org/r/202601190121.z9C0uml5-lkp@intel.com/
>
> Reviewed-by: Prike Liang <Prike.Liang@amd.com>
>
> Regards,
> Prike
>
> > -----Original Message-----
> > From: Timur Kristóf <timur.kristof@gmail.com>
> > Sent: Sunday, January 18, 2026 8:58 PM
> > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > <Alexander.Deucher@amd.com>; Koenig, Christian
> > <Christian.Koenig@amd.com>; Liang, Prike <Prike.Liang@amd.com>;
> > Limonciello, Mario <Mario.Limonciello@amd.com>
> > Cc: Timur Kristóf <timur.kristof@gmail.com>
> > Subject: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
> >
> > When a function holds a lock and we return without unlocking it, it
> > deadlocks the kernel. We should always unlock before returning.
> >
> > This commit fixes suspend/resume on SI.
> > Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
> >
> > Fixes: bc2dea30038a ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > index 0e67fa4338ff..4fa24be1bf45 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > amdgpu_device *adev, uint16_t pasid,
> > struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
> > struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
> > unsigned int ndw;
> > - int r, cnt = 0;
> > + int r = 0, cnt = 0;
> > uint32_t seq;
> >
> > /*
> > @@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > amdgpu_device *adev, uint16_t pasid,
> > if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
> >
> > if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
> > - return 0;
> > + goto error_unlock_reset;
> >
> > if (adev->gmc.flush_tlb_needs_extra_type_2)
> > adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> > @@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > amdgpu_device *adev, uint16_t pasid,
> > adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> > flush_type, all_hub,
> > inst);
> > - r = 0;
> > } else {
> > /* 2 dwords flush + 8 dwords fence */
> > ndw = kiq->pmf->invalidate_tlbs_size + 8;
> > --
> > 2.52.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
2026-01-19 5:27 ` Liang, Prike
@ 2026-01-19 9:00 ` Timur Kristóf
2026-01-19 10:33 ` Christian König
1 sibling, 0 replies; 10+ messages in thread
From: Timur Kristóf @ 2026-01-19 9:00 UTC (permalink / raw)
To: Liang, Prike, amd-gfx@lists.freedesktop.org, Deucher, Alexander,
Koenig, Christian, Limonciello, Mario, Dan Carpenter,
Liang, Prike
On Monday, January 19, 2026 6:27:10 AM Central European Standard Time Liang,
Prike wrote:
> [Public]
>
> In order to avoid being blocked by the lock issue on some older GFX, I will
> push the patch to amd-staging-drm-next.
> If you have any concerns, please
> let me know.
>
Hi Prike,
Thank you, feel free to add the necessary tags and push the patch.
Best regards,
Timur
>
>
> > -----Original Message-----
> > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Liang,
> > Prike
> > Sent: Monday, January 19, 2026 9:58 AM
> > To: Timur Kristóf <timur.kristof@gmail.com>;
> > amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > <Alexander.Deucher@amd.com>; Koenig, Christian
> > <Christian.Koenig@amd.com>; Limonciello, Mario
> > <Mario.Limonciello@amd.com>; Dan Carpenter <dan.carpenter@linaro.org>
> > Subject: RE: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
> >
> >
> >
> > [Public]
> >
> >
> >
> > Thank you for the fix. Could you please add the following the tags?
> >
> >
> >
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> > | Closes: https://lore.kernel.org/r/202601190121.z9C0uml5-lkp@intel.com/
> >
> >
> >
> > Reviewed-by: Prike Liang <Prike.Liang@amd.com>
> >
> >
> >
> > Regards,
> >
> > Prike
> >
> >
> >
> > > -----Original Message-----
> > > From: Timur Kristóf <timur.kristof@gmail.com>
> > > Sent: Sunday, January 18, 2026 8:58 PM
> > > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > > <Alexander.Deucher@amd.com>; Koenig, Christian
> > > <Christian.Koenig@amd.com>; Liang, Prike <Prike.Liang@amd.com>;
> > > Limonciello, Mario <Mario.Limonciello@amd.com>
> > > Cc: Timur Kristóf <timur.kristof@gmail.com>
> > > Subject: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
> > >
> > >
> > >
> > > When a function holds a lock and we return without unlocking it, it
> > > deadlocks the kernel. We should always unlock before returning.
> > >
> > >
> > >
> > > This commit fixes suspend/resume on SI.
> > > Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
> > >
> > >
> > >
> > > Fixes: bc2dea30038a ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
> > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > ---
> > >
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
> > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > >
> > >
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > index 0e67fa4338ff..4fa24be1bf45 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > @@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > > amdgpu_device *adev, uint16_t pasid,
> > >
> > > struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
> > > struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
> > > unsigned int ndw;
> > >
> > > - int r, cnt = 0;
> > > + int r = 0, cnt = 0;
> > >
> > > uint32_t seq;
> > >
> > >
> > >
> > > /*
> > >
> > > @@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > > amdgpu_device *adev, uint16_t pasid,
> > >
> > > if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
> > >
> > >
> > >
> > > if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
> > >
> > > - return 0;
> > > + goto error_unlock_reset;
> > >
> > >
> > >
> > > if (adev->gmc.flush_tlb_needs_extra_type_2)
> > >
> > > adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> > >
> > > @@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > > amdgpu_device *adev, uint16_t pasid,
> > >
> > > adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> > >
> > > flush_type,
> > > all_hub,
> > > inst);
> > >
> > > - r = 0;
> > >
> > > } else {
> > >
> > > /* 2 dwords flush + 8 dwords fence */
> > > ndw = kiq->pmf->invalidate_tlbs_size + 8;
> > >
> > > --
> > > 2.52.0
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
2026-01-19 5:27 ` Liang, Prike
2026-01-19 9:00 ` Timur Kristóf
@ 2026-01-19 10:33 ` Christian König
2026-01-19 11:44 ` Liang, Prike
1 sibling, 1 reply; 10+ messages in thread
From: Christian König @ 2026-01-19 10:33 UTC (permalink / raw)
To: Liang, Prike, Timur Kristóf, amd-gfx@lists.freedesktop.org,
Deucher, Alexander, Limonciello, Mario, Dan Carpenter
On 1/19/26 06:27, Liang, Prike wrote:
> [Public]
>
> In order to avoid being blocked by the lock issue on some older GFX, I will push the patch to amd-staging-drm-next.
> If you have any concerns, please let me know.
I only had a coding style comment on the patch and also gave my rb with that as well.
So if you haven't pushed it yet please fix what I've pointed out. Otherwise it is not much of an issue.
Regards,
Christian.
>
> Regards,
> Prike
>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Liang, Prike
>> Sent: Monday, January 19, 2026 9:58 AM
>> To: Timur Kristóf <timur.kristof@gmail.com>; amd-gfx@lists.freedesktop.org;
>> Deucher, Alexander <Alexander.Deucher@amd.com>; Koenig, Christian
>> <Christian.Koenig@amd.com>; Limonciello, Mario <Mario.Limonciello@amd.com>;
>> Dan Carpenter <dan.carpenter@linaro.org>
>> Subject: RE: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
>>
>> [Public]
>>
>> Thank you for the fix. Could you please add the following the tags?
>>
>> | Reported-by: kernel test robot <lkp@intel.com>
>> | Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
>> | Closes: https://lore.kernel.org/r/202601190121.z9C0uml5-lkp@intel.com/
>>
>> Reviewed-by: Prike Liang <Prike.Liang@amd.com>
>>
>> Regards,
>> Prike
>>
>>> -----Original Message-----
>>> From: Timur Kristóf <timur.kristof@gmail.com>
>>> Sent: Sunday, January 18, 2026 8:58 PM
>>> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
>>> <Alexander.Deucher@amd.com>; Koenig, Christian
>>> <Christian.Koenig@amd.com>; Liang, Prike <Prike.Liang@amd.com>;
>>> Limonciello, Mario <Mario.Limonciello@amd.com>
>>> Cc: Timur Kristóf <timur.kristof@gmail.com>
>>> Subject: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
>>>
>>> When a function holds a lock and we return without unlocking it, it
>>> deadlocks the kernel. We should always unlock before returning.
>>>
>>> This commit fixes suspend/resume on SI.
>>> Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
>>>
>>> Fixes: bc2dea30038a ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
>>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> index 0e67fa4338ff..4fa24be1bf45 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> @@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
>>> amdgpu_device *adev, uint16_t pasid,
>>> struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
>>> struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
>>> unsigned int ndw;
>>> - int r, cnt = 0;
>>> + int r = 0, cnt = 0;
>>> uint32_t seq;
>>>
>>> /*
>>> @@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
>>> amdgpu_device *adev, uint16_t pasid,
>>> if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
>>>
>>> if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
>>> - return 0;
>>> + goto error_unlock_reset;
>>>
>>> if (adev->gmc.flush_tlb_needs_extra_type_2)
>>> adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
>>> @@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
>>> amdgpu_device *adev, uint16_t pasid,
>>> adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
>>> flush_type, all_hub,
>>> inst);
>>> - r = 0;
>>> } else {
>>> /* 2 dwords flush + 8 dwords fence */
>>> ndw = kiq->pmf->invalidate_tlbs_size + 8;
>>> --
>>> 2.52.0
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
2026-01-19 10:33 ` Christian König
@ 2026-01-19 11:44 ` Liang, Prike
0 siblings, 0 replies; 10+ messages in thread
From: Liang, Prike @ 2026-01-19 11:44 UTC (permalink / raw)
To: Koenig, Christian, Timur Kristóf,
amd-gfx@lists.freedesktop.org, Deucher, Alexander,
Limonciello, Mario, Dan Carpenter
[Public]
Regards,
Prike
> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig@amd.com>
> Sent: Monday, January 19, 2026 6:33 PM
> To: Liang, Prike <Prike.Liang@amd.com>; Timur Kristóf <timur.kristof@gmail.com>;
> amd-gfx@lists.freedesktop.org; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Limonciello, Mario
> <Mario.Limonciello@amd.com>; Dan Carpenter <dan.carpenter@linaro.org>
> Subject: Re: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
>
> On 1/19/26 06:27, Liang, Prike wrote:
> > [Public]
> >
> > In order to avoid being blocked by the lock issue on some older GFX, I will push
> the patch to amd-staging-drm-next.
> > If you have any concerns, please let me know.
>
> I only had a coding style comment on the patch and also gave my rb with that as
> well.
>
> So if you haven't pushed it yet please fix what I've pointed out. Otherwise it is not
> much of an issue.
>
> Regards,
> Christian.
The patch is still running in the CI pipeline. I’ll revise it to improve the patch style.
> >
> > Regards,
> > Prike
> >
> >> -----Original Message-----
> >> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
> >> Liang, Prike
> >> Sent: Monday, January 19, 2026 9:58 AM
> >> To: Timur Kristóf <timur.kristof@gmail.com>;
> >> amd-gfx@lists.freedesktop.org; Deucher, Alexander
> >> <Alexander.Deucher@amd.com>; Koenig, Christian
> >> <Christian.Koenig@amd.com>; Limonciello, Mario
> >> <Mario.Limonciello@amd.com>; Dan Carpenter <dan.carpenter@linaro.org>
> >> Subject: RE: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
> >>
> >> [Public]
> >>
> >> Thank you for the fix. Could you please add the following the tags?
> >>
> >> | Reported-by: kernel test robot <lkp@intel.com>
> >> | Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> >> | Closes: https://lore.kernel.org/r/202601190121.z9C0uml5-lkp@intel.com/
> >>
> >> Reviewed-by: Prike Liang <Prike.Liang@amd.com>
> >>
> >> Regards,
> >> Prike
> >>
> >>> -----Original Message-----
> >>> From: Timur Kristóf <timur.kristof@gmail.com>
> >>> Sent: Sunday, January 18, 2026 8:58 PM
> >>> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> >>> <Alexander.Deucher@amd.com>; Koenig, Christian
> >>> <Christian.Koenig@amd.com>; Liang, Prike <Prike.Liang@amd.com>;
> >>> Limonciello, Mario <Mario.Limonciello@amd.com>
> >>> Cc: Timur Kristóf <timur.kristof@gmail.com>
> >>> Subject: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
> >>>
> >>> When a function holds a lock and we return without unlocking it, it
> >>> deadlocks the kernel. We should always unlock before returning.
> >>>
> >>> This commit fixes suspend/resume on SI.
> >>> Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
> >>>
> >>> Fixes: bc2dea30038a ("drm/amdgpu: validate the
> >>> flush_gpu_tlb_pasid()")
> >>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> >>> ---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
> >>> 1 file changed, 2 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >>> index 0e67fa4338ff..4fa24be1bf45 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >>> @@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> >>> amdgpu_device *adev, uint16_t pasid,
> >>> struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
> >>> struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
> >>> unsigned int ndw;
> >>> - int r, cnt = 0;
> >>> + int r = 0, cnt = 0;
> >>> uint32_t seq;
> >>>
> >>> /*
> >>> @@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> >>> amdgpu_device *adev, uint16_t pasid,
> >>> if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
> >>>
> >>> if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
> >>> - return 0;
> >>> + goto error_unlock_reset;
> >>>
> >>> if (adev->gmc.flush_tlb_needs_extra_type_2)
> >>> adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> >>> @@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> >>> amdgpu_device *adev, uint16_t pasid,
> >>> adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> >>> flush_type, all_hub,
> >>> inst);
> >>> - r = 0;
> >>> } else {
> >>> /* 2 dwords flush + 8 dwords fence */
> >>> ndw = kiq->pmf->invalidate_tlbs_size + 8;
> >>> --
> >>> 2.52.0
> >
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
2026-01-18 12:57 [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid() Timur Kristóf
2026-01-19 1:57 ` Liang, Prike
@ 2026-01-19 10:12 ` Christian König
2026-01-19 11:20 ` Timur Kristóf
1 sibling, 1 reply; 10+ messages in thread
From: Christian König @ 2026-01-19 10:12 UTC (permalink / raw)
To: Timur Kristóf, amd-gfx, Alexander.Deucher, Prike Liang,
Mario Limonciello
On 1/18/26 13:57, Timur Kristóf wrote:
> When a function holds a lock and we return without unlocking it,
> it deadlocks the kernel. We should always unlock before returning.
>
> This commit fixes suspend/resume on SI.
> Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
>
> Fixes: bc2dea30038a ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 0e67fa4338ff..4fa24be1bf45 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
> struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
> struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
> unsigned int ndw;
> - int r, cnt = 0;
> + int r = 0, cnt = 0;
Please don't initialize return values in the declaration, that is usually considered bad coding style.
> uint32_t seq;
>
> /*
> @@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
> if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
>
> if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
> - return 0;
> + goto error_unlock_reset;
Ah, yes good catch!
With the change to r initialization dropped: Reviewed-by: Christian König <christian.koenig@amd.com>
Regards,
Christian.
>
> if (adev->gmc.flush_tlb_needs_extra_type_2)
> adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> @@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
> adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> flush_type, all_hub,
> inst);
> - r = 0;
> } else {
> /* 2 dwords flush + 8 dwords fence */
> ndw = kiq->pmf->invalidate_tlbs_size + 8;
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
2026-01-19 10:12 ` Christian König
@ 2026-01-19 11:20 ` Timur Kristóf
2026-01-19 11:47 ` Liang, Prike
0 siblings, 1 reply; 10+ messages in thread
From: Timur Kristóf @ 2026-01-19 11:20 UTC (permalink / raw)
To: amd-gfx, Alexander.Deucher, Prike Liang, Mario Limonciello,
Christian König
On Monday, January 19, 2026 11:12:02 AM Central European Standard Time
Christian König wrote:
> On 1/18/26 13:57, Timur Kristóf wrote:
> > When a function holds a lock and we return without unlocking it,
> > it deadlocks the kernel. We should always unlock before returning.
> >
> > This commit fixes suspend/resume on SI.
> > Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
> >
> > Fixes: bc2dea30038a ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >
> > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index
> > 0e67fa4338ff..4fa24be1bf45 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > amdgpu_device *adev, uint16_t pasid,
> > struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
> > struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
> > unsigned int ndw;
> >
> > - int r, cnt = 0;
> > + int r = 0, cnt = 0;
>
> Please don't initialize return values in the declaration, that is usually
> considered bad coding style.
The initialization is necessary; otherwise the function returns an
uninitialized value when flush_gpu_tlb_pasid == NULL.
> > uint32_t seq;
> >
> > /*
> >
> > @@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > amdgpu_device *adev, uint16_t pasid,
> > if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
> >
> > if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
> >
> > - return 0;
> > + goto error_unlock_reset;
>
> Ah, yes good catch!
>
> With the change to r initialization dropped: Reviewed-by: Christian König
> <christian.koenig@amd.com>
If I drop it, then it will regress again because it returns an uninitialized
value.
>
> Regards,
> Christian.
>
> > if (adev->gmc.flush_tlb_needs_extra_type_2)
> > adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> > @@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > amdgpu_device *adev, uint16_t pasid,
> > adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> > flush_type, all_hub,
> > inst);
> > - r = 0;
> > } else {
> > /* 2 dwords flush + 8 dwords fence */
> > ndw = kiq->pmf->invalidate_tlbs_size + 8;
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
2026-01-19 11:20 ` Timur Kristóf
@ 2026-01-19 11:47 ` Liang, Prike
0 siblings, 0 replies; 10+ messages in thread
From: Liang, Prike @ 2026-01-19 11:47 UTC (permalink / raw)
To: Timur Kristóf, amd-gfx@lists.freedesktop.org,
Deucher, Alexander, Limonciello, Mario, Koenig, Christian
[Public]
Regards,
Prike
> -----Original Message-----
> From: Timur Kristóf <timur.kristof@gmail.com>
> Sent: Monday, January 19, 2026 7:21 PM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Liang, Prike <Prike.Liang@amd.com>;
> Limonciello, Mario <Mario.Limonciello@amd.com>; Koenig, Christian
> <Christian.Koenig@amd.com>
> Subject: Re: [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
>
> On Monday, January 19, 2026 11:12:02 AM Central European Standard Time
> Christian König wrote:
> > On 1/18/26 13:57, Timur Kristóf wrote:
> > > When a function holds a lock and we return without unlocking it, it
> > > deadlocks the kernel. We should always unlock before returning.
> > >
> > > This commit fixes suspend/resume on SI.
> > > Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
> > >
> > > Fixes: bc2dea30038a ("drm/amdgpu: validate the
> > > flush_gpu_tlb_pasid()")
> > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > ---
> > >
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++---
> > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index
> > > 0e67fa4338ff..4fa24be1bf45 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > @@ -769,7 +769,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > > amdgpu_device *adev, uint16_t pasid,
> > > struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
> > > struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
> > > unsigned int ndw;
> > >
> > > - int r, cnt = 0;
> > > + int r = 0, cnt = 0;
> >
> > Please don't initialize return values in the declaration, that is
> > usually considered bad coding style.
>
> The initialization is necessary, otherwise the function will return an uninitialized value
> when flush_gpu_tlb_pasid==NULL
We can set r = 0 right before the goto error_unlock_reset instead.
> > > uint32_t seq;
> > >
> > > /*
> > >
> > > @@ -782,7 +782,7 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > > amdgpu_device *adev, uint16_t pasid,
> > > if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
> > >
> > > if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
> > >
> > > - return 0;
> > > + goto error_unlock_reset;
> >
> > Ah, yes good catch!
> >
> > With the change to r initialization dropped: Reviewed-by: Christian
> > König <christian.koenig@amd.com>
>
> If I drop it, then it will regress again because it returns an uninitialized value.
>
> >
> > Regards,
> > Christian.
> >
> > > > if (adev->gmc.flush_tlb_needs_extra_type_2)
> > > > adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> > > > @@ -797,7 +797,6 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct
> > > > amdgpu_device *adev, uint16_t pasid,
> > > > adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
> > > > flush_type, all_hub,
> > > > inst);
> > > > - r = 0;
> > > > } else {
> > > > /* 2 dwords flush + 8 dwords fence */
> > > > ndw = kiq->pmf->invalidate_tlbs_size + 8;
>
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
@ 2026-01-19 12:07 Prike Liang
0 siblings, 0 replies; 10+ messages in thread
From: Prike Liang @ 2026-01-19 12:07 UTC (permalink / raw)
To: amd-gfx
Cc: Alexander.Deucher, Christian.Koenig, Timur Kristóf,
kernel test robot, Dan Carpenter, Prike Liang,
Christian König
From: Timur Kristóf <timur.kristof@gmail.com>
When a function holds a lock and we return without unlocking it,
it deadlocks the kernel. We should always unlock before returning.
This commit fixes suspend/resume on SI.
Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
Fixes: bc2dea30038a ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202601190121.z9C0uml5-lkp@intel.com/
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
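A minimal sketch of the style this version follows, assuming a simplified function with a direct path and a queued path: the status variable is left uninitialized at its declaration and assigned explicitly on every path that reaches the cleanup label. `demo_dev`, `demo_flush` and all field names are hypothetical and only illustrate the shape; the real function is not reproduced here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model; field names are illustrative only. */
struct demo_dev {
	bool has_flush_cb;	/* optional callback is present */
	bool use_queue;		/* take the queued path instead of the direct one */
	bool queue_full;	/* queued path cannot make progress */
	int unlock_count;	/* how many times the cleanup label ran */
};

int demo_flush(struct demo_dev *dev)
{
	int r;	/* deliberately not initialized at the declaration */

	if (!dev->use_queue) {
		if (!dev->has_flush_cb) {
			r = 0;	/* nothing to do still counts as success */
			goto out_unlock;
		}
		/* ... the direct flush would happen here ... */
		r = 0;
	} else {
		if (dev->queue_full) {
			r = -ENOSPC;
			goto out_unlock;
		}
		/* ... the queued flush would happen here ... */
		r = 0;
	}

out_unlock:
	dev->unlock_count++;	/* stands in for releasing the lock */
	return r;
}
```

One commonly cited reason for this style is that a path added later which forgets to set `r` can then be flagged by the compiler's uninitialized-variable warnings, whereas `int r = 0` at the declaration would silently report success.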
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 0e67fa4338ff..d9ff68a43178 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -781,8 +781,10 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
- if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid)
- return 0;
+ if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid) {
+ r = 0;
+ goto error_unlock_reset;
+ }
if (adev->gmc.flush_tlb_needs_extra_type_2)
adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread