* Re: [PATCH v3] drm/radeon: Fix EEH during kexec
[not found] <1572036050-18945-1-git-send-email-kmahlkuc@linux.vnet.ibm.com>
From: Michael Ellerman @ 2019-10-30 10:35 UTC
To: KyleMahlkuch, alexander.deucher; +Cc: linuxppc-dev, Kyle Mahlkuch, amd-gfx
Hi Kyle,
KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com> writes:
> From: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
>
> During kexec some adapters hit an EEH since they are not properly
> shut down in the radeon_pci_shutdown() function. Adding
> radeon_suspend_kms() fixes this issue.
> Enabled only on PPC because this patch causes issues on some other
> boards.
Which adapters hit the issues?
And do we know why they're not shut down correctly in
radeon_pci_shutdown()? That seems like the root cause, no?
> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
> index 9e55076..4528f4d 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -379,11 +379,25 @@ static int radeon_pci_probe(struct pci_dev *pdev,
> static void
> radeon_pci_shutdown(struct pci_dev *pdev)
> {
> +#ifdef CONFIG_PPC64
> + struct drm_device *ddev = pci_get_drvdata(pdev);
> +#endif
This local serves no real purpose and could be avoided, which would also
avoid this ifdef.
> /* if we are running in a VM, make sure the device
> * torn down properly on reboot/shutdown
> */
> if (radeon_device_is_virtual())
> radeon_pci_remove(pdev);
> +
> +#ifdef CONFIG_PPC64
> + /* Some adapters need to be suspended before a
AFAIK drm uses normal kernel comment style, so this should be:
/*
* Some adapters need to be suspended before a
> + * shutdown occurs in order to prevent an error
> + * during kexec.
> + * Make this power specific becauase it breaks
> + * some non-power boards.
> + */
> + radeon_suspend_kms(ddev, true, true, false);
ie, instead do:
radeon_suspend_kms(pci_get_drvdata(pdev), true, true, false);
> +#endif
> }
>
> static int radeon_pmops_suspend(struct device *dev)
> --
> 1.8.3.1
cheers
* Re: [PATCH v3] drm/radeon: Fix EEH during kexec
From: Kyle Mahlkuch @ 2019-10-31 15:24 UTC
To: Michael Ellerman, alexander.deucher; +Cc: linuxppc-dev, amd-gfx
On 10/30/19 5:35 AM, Michael Ellerman wrote:
> Hi Kyle,
>
> KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com> writes:
>> From: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
>>
>> During kexec some adapters hit an EEH since they are not properly
>> shut down in the radeon_pci_shutdown() function. Adding
>> radeon_suspend_kms() fixes this issue.
>> Enabled only on PPC because this patch causes issues on some other
>> boards.
> Which adapters hit the issues?
>
> And do we know why they're not shut down correctly in
> radeon_pci_shutdown()? That seems like the root cause, no?
Hi Michael,
This is hit by the Caicos (edwards2) adapter that I have on ppc. It is not hit
on the Cedar (FirePro) adapter, though I haven't tested that one recently. I'm
not able to test any other adapters. As for why, I'm unsure. During
initialization after the kexec we hit an EEH. There could be another point in
the shutdown / startup process where something doesn't get reset correctly.
I'm open to other ideas if you have any.
>> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
>> index 9e55076..4528f4d 100644
>> --- a/drivers/gpu/drm/radeon/radeon_drv.c
>> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
>> @@ -379,11 +379,25 @@ static int radeon_pci_probe(struct pci_dev *pdev,
>> static void
>> radeon_pci_shutdown(struct pci_dev *pdev)
>> {
>> +#ifdef CONFIG_PPC64
>> + struct drm_device *ddev = pci_get_drvdata(pdev);
>> +#endif
> This local serves no real purpose and could be avoided, which would also
> avoid this ifdef.
>
>> /* if we are running in a VM, make sure the device
>> * torn down properly on reboot/shutdown
>> */
>> if (radeon_device_is_virtual())
>> radeon_pci_remove(pdev);
>> +
>> +#ifdef CONFIG_PPC64
>> + /* Some adapters need to be suspended before a
> AFAIK drm uses normal kernel comment style, so this should be:
>
> /*
> * Some adapters need to be suspended before a
>> + * shutdown occurs in order to prevent an error
>> + * during kexec.
>> + * Make this power specific becauase it breaks
>> + * some non-power boards.
>> + */
>> + radeon_suspend_kms(ddev, true, true, false);
> ie, instead do:
>
> radeon_suspend_kms(pci_get_drvdata(pdev), true, true, false);
I agree, this is a cleaner way to write this patch. I'll update the comment as
well. Thanks for the help.
>> +#endif
>> }
>>
>> static int radeon_pmops_suspend(struct device *dev)
>> --
>> 1.8.3.1
> cheers
>
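For reference, the two review suggestions above (drop the ddev local and use
the normal kernel comment style) could be applied roughly as in the sketch
below. This is only an illustration, not the patch that was eventually sent.
The struct definitions, the stub helpers, the parameter names of
radeon_suspend_kms() and the CONFIG_PPC64 define are userspace stand-ins
assumed here so the function compiles outside the kernel tree.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace stand-ins. In the kernel these come from the
 * PCI and DRM headers and from Kconfig; the bodies here are stubs only.
 */
#define CONFIG_PPC64 1

struct drm_device { int suspended; };
struct pci_dev { void *drvdata; };

static void *pci_get_drvdata(struct pci_dev *pdev)
{
	return pdev->drvdata;
}

static bool radeon_device_is_virtual(void)
{
	return false;
}

static void radeon_pci_remove(struct pci_dev *pdev)
{
	(void)pdev;
}

/* Parameter names are assumed; the stub only records that it was called. */
static int radeon_suspend_kms(struct drm_device *ddev, bool suspend,
			      bool fbcon, bool freeze)
{
	(void)suspend;
	(void)fbcon;
	(void)freeze;
	ddev->suspended = 1;
	return 0;
}

/* The shutdown hook with both review comments applied. */
static void
radeon_pci_shutdown(struct pci_dev *pdev)
{
	/* if we are running in a VM, make sure the device
	 * torn down properly on reboot/shutdown
	 */
	if (radeon_device_is_virtual())
		radeon_pci_remove(pdev);

#ifdef CONFIG_PPC64
	/*
	 * Some adapters need to be suspended before a
	 * shutdown occurs in order to prevent an error
	 * during kexec.
	 * Make this power specific because it breaks
	 * some non-power boards.
	 */
	radeon_suspend_kms(pci_get_drvdata(pdev), true, true, false);
#endif
}
```

With the local gone, only the second #ifdef CONFIG_PPC64 block remains, and
the pre-existing VM comment is left untouched.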