Linux PCI subsystem development
 help / color / mirror / Atom feed
* Re: [PATCH 1/3] drm/amd/pm/smu11: BACO is supported when it's in BACO state
       [not found] <20221123014307.263178-1-guchun.chen@amd.com>
@ 2022-12-27 18:23 ` Bjorn Helgaas
  2022-12-28  2:02   ` Chen, Guchun
  0 siblings, 1 reply; 2+ messages in thread
From: Bjorn Helgaas @ 2022-12-27 18:23 UTC (permalink / raw)
  To: Guchun Chen
  Cc: amd-gfx, alexander.deucher, hawking.zhang, lijo.lazar, evan.quan,
	Stefan Roese, linux-pci

[+cc Stefan, linux-pci]

On Wed, Nov 23, 2022 at 09:43:07AM +0800, Guchun Chen wrote:
> Return true early if ASIC is in BACO state already, no need
> to talk to SMU. It can fix the issue that driver was not
> calling BACO exit at all in runtime pm resume, and a timing
> issue leading to a PCI AER error happened eventually.

This sounds suspiciously racy.

> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

To clarify, this patch avoids a driver problem, not a problem with
8795e182b02d.

See question below.

> Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
> Signed-off-by: Guchun Chen <guchun.chen@amd.com>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> index 70b560737687..ad5f6a15a1d7 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> @@ -1588,6 +1588,10 @@ bool smu_v11_0_baco_is_support(struct smu_context *smu)
>  	if (amdgpu_sriov_vf(smu->adev) || !smu_baco->platform_support)
>  		return false;
>  
> +	/* return true if ASIC is in BACO state already */
> +	if (smu_v11_0_baco_get_state(smu) == SMU_BACO_STATE_ENTER)
> +		return true;

smu_v13_0_baco_is_support() is essentially identical to
smu_v11_0_baco_is_support().  Does it need a similar change?

>  	/* Arcturus does not support this bit mask */
>  	if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_BACO_BIT) &&
>  	   !smu_cmn_feature_is_enabled(smu, SMU_FEATURE_BACO_BIT))
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 2+ messages in thread

* RE: [PATCH 1/3] drm/amd/pm/smu11: BACO is supported when it's in BACO state
  2022-12-27 18:23 ` [PATCH 1/3] drm/amd/pm/smu11: BACO is supported when it's in BACO state Bjorn Helgaas
@ 2022-12-28  2:02   ` Chen, Guchun
  0 siblings, 0 replies; 2+ messages in thread
From: Chen, Guchun @ 2022-12-28  2:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: amd-gfx@lists.freedesktop.org, Deucher, Alexander, Zhang, Hawking,
	Lazar, Lijo, Quan, Evan, Stefan Roese, linux-pci@vger.kernel.org

Hi Bjorn,

Thanks for your comments.

Exactly, this patch actually fixes an amdgpu driver issue, however it's only exposed by 8795e182b02d. Without it, it was running well over the past.

Regarding the question, I will provide a similar change in smu13 later on.

Regards,
Guchun

-----Original Message-----
From: Bjorn Helgaas <helgaas@kernel.org> 
Sent: Wednesday, December 28, 2022 2:24 AM
To: Chen, Guchun <Guchun.Chen@amd.com>
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com>; Quan, Evan <Evan.Quan@amd.com>; Stefan Roese <sr@denx.de>; linux-pci@vger.kernel.org
Subject: Re: [PATCH 1/3] drm/amd/pm/smu11: BACO is supported when it's in BACO state

[+cc Stefan, linux-pci]

On Wed, Nov 23, 2022 at 09:43:07AM +0800, Guchun Chen wrote:
> Return true early if ASIC is in BACO state already, no need to talk to 
> SMU. It can fix the issue that driver was not calling BACO exit at all 
> in runtime pm resume, and a timing issue leading to a PCI AER error 
> happened eventually.

This sounds suspiciously racy.

> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in 
> get_port_device_capability()")

To clarify, this patch avoids a driver problem, not a problem with 8795e182b02d.

See question below.

> Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
> Signed-off-by: Guchun Chen <guchun.chen@amd.com>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> index 70b560737687..ad5f6a15a1d7 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> @@ -1588,6 +1588,10 @@ bool smu_v11_0_baco_is_support(struct smu_context *smu)
>  	if (amdgpu_sriov_vf(smu->adev) || !smu_baco->platform_support)
>  		return false;
>  
> +	/* return true if ASIC is in BACO state already */
> +	if (smu_v11_0_baco_get_state(smu) == SMU_BACO_STATE_ENTER)
> +		return true;

smu_v13_0_baco_is_support() is essentially identical to smu_v11_0_baco_is_support().  Does it need a similar change?

>  	/* Arcturus does not support this bit mask */
>  	if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_BACO_BIT) &&
>  	   !smu_cmn_feature_is_enabled(smu, SMU_FEATURE_BACO_BIT))
> --
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-12-28  2:02 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20221123014307.263178-1-guchun.chen@amd.com>
2022-12-27 18:23 ` [PATCH 1/3] drm/amd/pm/smu11: BACO is supported when it's in BACO state Bjorn Helgaas
2022-12-28  2:02   ` Chen, Guchun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox