public inbox for stable@vger.kernel.org
* [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init"
@ 2024-05-23 17:30 Armin Wolf
  2024-06-03 22:19 ` Armin Wolf
  0 siblings, 1 reply; 7+ messages in thread
From: Armin Wolf @ 2024-05-23 17:30 UTC (permalink / raw)
  To: alexander.deucher, christian.koenig, Xinhui.Pan, gregkh, sashal
  Cc: stable, bkauler, yifan1.zhang, Prike.Liang, dri-devel, amd-gfx

This reverts commit 56b522f4668167096a50c39446d6263c96219f5f.

A user reported that this commit breaks the integrated GPU of his
notebook, causing a black screen. He bisected the problem to this
commit and verified that the notebook works again once it is reverted.
He also confirmed that kernel 6.8.1 works on his device, so the
upstream commit itself seems to be OK.

An amdgpu developer (Alex Deucher) confirmed that this patch should
have never been ported to 5.15 in the first place, so revert this
commit from the 5.15 stable series.

Reported-by: Barry Kauler <bkauler@gmail.com>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
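
As an illustration only (not part of the commit message), the ordering
change in amdgpu_device_ip_init() can be sketched in user space. Everything
here is hypothetical scaffolding rather than amdgpu code: struct trace,
run_step() and the step names are invented for the example, and the many
other init steps in the real function are left out. The sketch only assumes
the relative order of the three calls touched by the diff and the
goto-based error path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer: records the order in which steps run. */
struct trace {
	char buf[128];
};

/* Stand-in for one init step: appends its name, returns 0 or -1. */
static int run_step(struct trace *t, const char *name, int fail)
{
	assert(strlen(t->buf) + strlen(name) + 2 < sizeof(t->buf));
	if (t->buf[0] != '\0')
		strcat(t->buf, ",");
	strcat(t->buf, name);
	return fail ? -1 : 0;
}

/* Ordering restored by the revert: IOMMU resume before hw init phase 1. */
static int ip_init_reverted(struct trace *t, int iommu_fails)
{
	int r;

	r = run_step(t, "iommu", iommu_fails);
	if (r)
		goto init_failed;

	r = run_step(t, "hw_phase1", 0);
	if (r)
		goto init_failed;

	/* Return value ignored, mirroring the call style in the diff. */
	run_step(t, "kfd_init", 0);

init_failed:
	return r;
}

/* Ordering of the backport being reverted: IOMMU resume after kfd init. */
static int ip_init_backported(struct trace *t, int iommu_fails)
{
	int r;

	r = run_step(t, "hw_phase1", 0);
	if (r)
		goto init_failed;

	run_step(t, "kfd_init", 0);

	r = run_step(t, "iommu", iommu_fails);
	if (r)
		goto init_failed;

init_failed:
	return r;
}
```

Called on a zero-initialised struct trace with no failure injected,
ip_init_reverted() records "iommu,hw_phase1,kfd_init" and
ip_init_backported() records "hw_phase1,kfd_init,iommu". The sketch says
nothing about why the later position breaks the reporter's hardware on
5.15; it only makes the reordering visible.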

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 222a1d9ecf16..5f6c32ec674d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2487,6 +2487,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 	if (r)
 		goto init_failed;

+	r = amdgpu_amdkfd_resume_iommu(adev);
+	if (r)
+		goto init_failed;
+
 	r = amdgpu_device_ip_hw_init_phase1(adev);
 	if (r)
 		goto init_failed;
@@ -2525,10 +2529,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 	if (!adev->gmc.xgmi.pending_reset)
 		amdgpu_amdkfd_device_init(adev);

-	r = amdgpu_amdkfd_resume_iommu(adev);
-	if (r)
-		goto init_failed;
-
 	amdgpu_fru_get_product_info(adev);

 init_failed:
--
2.39.2


* Re: [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init"
  2024-05-23 17:30 [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init" Armin Wolf
@ 2024-06-03 22:19 ` Armin Wolf
  2024-06-04 18:24   ` Felix Kuehling
  0 siblings, 1 reply; 7+ messages in thread
From: Armin Wolf @ 2024-06-03 22:19 UTC (permalink / raw)
  To: alexander.deucher, christian.koenig, Xinhui.Pan, gregkh, sashal
  Cc: stable, bkauler, yifan1.zhang, Prike.Liang, dri-devel, amd-gfx

Am 23.05.24 um 19:30 schrieb Armin Wolf:

> This reverts commit 56b522f4668167096a50c39446d6263c96219f5f.
>
> A user reported that this commit breaks the integrated GPU of his
> notebook, causing a black screen. He bisected the problem to this
> commit and verified that the notebook works again once it is reverted.
> He also confirmed that kernel 6.8.1 works on his device, so the
> upstream commit itself seems to be OK.
>
> An amdgpu developer (Alex Deucher) confirmed that this patch should
> have never been ported to 5.15 in the first place, so revert this
> commit from the 5.15 stable series.

Hi,

what is the status of this?

Armin Wolf


* Re: [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init"
  2024-06-03 22:19 ` Armin Wolf
@ 2024-06-04 18:24   ` Felix Kuehling
  2024-06-04 18:28     ` Deucher, Alexander
  0 siblings, 1 reply; 7+ messages in thread
From: Felix Kuehling @ 2024-06-04 18:24 UTC (permalink / raw)
  To: Armin Wolf, alexander.deucher, christian.koenig, Xinhui.Pan,
	gregkh, sashal
  Cc: stable, bkauler, yifan1.zhang, Prike.Liang, dri-devel, amd-gfx


On 2024-06-03 18:19, Armin Wolf wrote:
> Am 23.05.24 um 19:30 schrieb Armin Wolf:
>
>> This reverts commit 56b522f4668167096a50c39446d6263c96219f5f.
>>
>> A user reported that this commit breaks the integrated GPU of his
>> notebook, causing a black screen. He bisected the problem to this
>> commit and verified that the notebook works again once it is reverted.
>> He also confirmed that kernel 6.8.1 works on his device, so the
>> upstream commit itself seems to be OK.
>>
>> An amdgpu developer (Alex Deucher) confirmed that this patch should
>> have never been ported to 5.15 in the first place, so revert this
>> commit from the 5.15 stable series.
>
> Hi,
>
> what is the status of this?

Which branch is this for? This patch won't apply to anything after Linux 
6.5. Support for IOMMUv2 was removed from amdgpu in Linux 6.6 by:

commit c99a2e7ae291e5b19b60443eb6397320ef9e8571
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Fri Jul 28 12:20:12 2023 -0400

     drm/amdkfd: drop IOMMUv2 support

     Now that we use the dGPU path for all APUs, drop the
     IOMMUv2 support.

     v2: drop the now unused queue manager functions for gfx7/8 APUs

     Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
     Acked-by: Christian König <christian.koenig@amd.com>
     Tested-by: Mike Lothian <mike@fireburn.co.uk>
     Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Regards,
   Felix



* RE: [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init"
  2024-06-04 18:24   ` Felix Kuehling
@ 2024-06-04 18:28     ` Deucher, Alexander
  2024-06-10 14:28       ` Armin Wolf
  0 siblings, 1 reply; 7+ messages in thread
From: Deucher, Alexander @ 2024-06-04 18:28 UTC (permalink / raw)
  To: Kuehling, Felix, Armin Wolf, Koenig, Christian, Pan, Xinhui,
	gregkh@linuxfoundation.org, sashal@kernel.org
  Cc: stable@vger.kernel.org, bkauler@gmail.com, Zhang, Yifan,
	Liang, Prike, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org

> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@amd.com>
> Sent: Tuesday, June 4, 2024 2:25 PM
> To: Armin Wolf <W_Armin@gmx.de>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Koenig, Christian
> <Christian.Koenig@amd.com>; Pan, Xinhui <Xinhui.Pan@amd.com>;
> gregkh@linuxfoundation.org; sashal@kernel.org
> Cc: stable@vger.kernel.org; bkauler@gmail.com; Zhang, Yifan
> <Yifan1.Zhang@amd.com>; Liang, Prike <Prike.Liang@amd.com>; dri-
> devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device
> init"
>
>
> On 2024-06-03 18:19, Armin Wolf wrote:
> > Am 23.05.24 um 19:30 schrieb Armin Wolf:
> >
> >> This reverts commit 56b522f4668167096a50c39446d6263c96219f5f.
> >>
> >> A user reported that this commit breaks the integrated GPU of his
> >> notebook, causing a black screen. He bisected the problem to this
> >> commit and verified that the notebook works again once it is reverted.
> >> He also confirmed that kernel 6.8.1 works on his device, so the
> >> upstream commit itself seems to be OK.
> >>
> >> An amdgpu developer (Alex Deucher) confirmed that this patch should
> >> have never been ported to 5.15 in the first place, so revert this
> >> commit from the 5.15 stable series.
> >
> > Hi,
> >
> > what is the status of this?
>
> Which branch is this for? This patch won't apply to anything after Linux 6.5.

It's applicable to 5.15 stable only.  The original patch caused a regression on 5.15 so probably should not have been applied there.

Alex



* Re: [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init"
  2024-06-04 18:28     ` Deucher, Alexander
@ 2024-06-10 14:28       ` Armin Wolf
  2024-06-12  0:10         ` Matthew Ruffell
  0 siblings, 1 reply; 7+ messages in thread
From: Armin Wolf @ 2024-06-10 14:28 UTC (permalink / raw)
  To: Deucher, Alexander, Kuehling, Felix, Koenig, Christian,
	Pan, Xinhui, gregkh@linuxfoundation.org, sashal@kernel.org
  Cc: stable@vger.kernel.org, bkauler@gmail.com, Zhang, Yifan,
	Liang, Prike, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org

Am 04.06.24 um 20:28 schrieb Deucher, Alexander:

>> -----Original Message-----
>> From: Kuehling, Felix <Felix.Kuehling@amd.com>
>> Sent: Tuesday, June 4, 2024 2:25 PM
>> To: Armin Wolf <W_Armin@gmx.de>; Deucher, Alexander
>> <Alexander.Deucher@amd.com>; Koenig, Christian
>> <Christian.Koenig@amd.com>; Pan, Xinhui <Xinhui.Pan@amd.com>;
>> gregkh@linuxfoundation.org; sashal@kernel.org
>> Cc: stable@vger.kernel.org; bkauler@gmail.com; Zhang, Yifan
>> <Yifan1.Zhang@amd.com>; Liang, Prike <Prike.Liang@amd.com>; dri-
>> devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device
>> init"
>>
>>
>> On 2024-06-03 18:19, Armin Wolf wrote:
>>> Am 23.05.24 um 19:30 schrieb Armin Wolf:
>>>
>>>> This reverts commit 56b522f4668167096a50c39446d6263c96219f5f.
>>>>
>>>> A user reported that this commit breaks the integrated GPU of his
>>>> notebook, causing a black screen. He bisected the problem to this
>>>> commit and verified that the notebook works again once it is reverted.
>>>> He also confirmed that kernel 6.8.1 works on his device, so the
>>>> upstream commit itself seems to be OK.
>>>>
>>>> An amdgpu developer (Alex Deucher) confirmed that this patch should
>>>> have never been ported to 5.15 in the first place, so revert this
>>>> commit from the 5.15 stable series.
>>> Hi,
>>>
>>> what is the status of this?
>> Which branch is this for? This patch won't apply to anything after Linux 6.5.
> It's applicable to 5.15 stable only.  The original patch caused a regression on 5.15 so probably should not have been applied there.
>
> Alex
>
Correct, and I would be very grateful if this regression could be resolved in the near future.
The user already wrote a blog post about the whole issue, see here:

https://bkhome.org/news/202405/kernel-amd-gpu-disaster-fixed.html

Thanks,
Armin Wolf


* Re: [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init"
  2024-06-10 14:28       ` Armin Wolf
@ 2024-06-12  0:10         ` Matthew Ruffell
  2024-06-12 12:44           ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Ruffell @ 2024-06-12  0:10 UTC (permalink / raw)
  To: w_armin
  Cc: Alexander.Deucher, Christian.Koenig, Felix.Kuehling, Prike.Liang,
	Xinhui.Pan, Yifan1.Zhang, amd-gfx, bkauler, dri-devel, gregkh,
	sashal, stable

Hi Greg KH, Sasha,

Please pick up this patch for 5.15 stable tree. I have built a test kernel and
can confirm that it fixes affected users.

Downstream bug:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2068738

Thanks,
Matthew

* Re: [PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init"
  2024-06-12  0:10         ` Matthew Ruffell
@ 2024-06-12 12:44           ` Greg KH
  0 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2024-06-12 12:44 UTC (permalink / raw)
  To: Matthew Ruffell
  Cc: w_armin, Alexander.Deucher, Christian.Koenig, Felix.Kuehling,
	Prike.Liang, Xinhui.Pan, Yifan1.Zhang, amd-gfx, bkauler,
	dri-devel, sashal, stable

On Wed, Jun 12, 2024 at 12:10:37PM +1200, Matthew Ruffell wrote:
> Hi Greg KH, Sasha,
> 
> Please pick up this patch for 5.15 stable tree. I have built a test kernel and
> can confirm that it fixes affected users.
> 
> Downstream bug:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2068738

Sorry for the delay, now picked up.

greg k-h
