amd-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] SI power management fixes
@ 2025-08-04 13:41 Timur Kristóf
  2025-08-04 13:41 ` [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation Timur Kristóf
                   ` (5 more replies)
  0 siblings, 6 replies; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 13:41 UTC (permalink / raw)
  To: amd-gfx; +Cc: Timur Kristóf

This series fixes some power management issues on SI.
Where applicable, a radeon backport is also included.
I recommend backporting these commits to the stable kernel.

1. Ensure that DPM powers on the UVD for FW validation.
This is necessary because SI needs to use a specific power
state for UVD to work correctly, which we forgot to use
for FW validation.

2. Disable ULV mode even if unsupported (+ radeon backport).
Send the PPSMC_MSG_DisableULV message to the SMC even when ULV
mode is not supported, to ensure that ULV mode is actually
turned off. This fixes some instability on Tahiti.

3. Fix si_upload_smc_data (+ radeon backport).
The si_write_smc_soft_register return value is actually zero on
success, fix checking it. This fixes an SMC hang on Tahiti.

4. Fix issues with high pixel clock displays
As we don't have something like dce_calcs for old GPUs, sadly
we have to apply ad-hoc fixes. This fixes different flickering
issues that I observed on Oland, Tahiti and Pitcairn.

Timur Kristóf (6):
  drm/amdgpu: Power up UVD 3 for FW validation
  drm/amd/pm: Disable ULV even if unsupported
  drm/radeon: Disable ULV even if unsupported
  drm/amd/pm: Fix si_upload_smc_data
  drm/radeon: Fix si_upload_smc_data
  drm/amd/pm: Fix SI DPM issues with high pixel clock

 drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c      |  9 ++-
 drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 75 ++++++++++++++++++----
 drivers/gpu/drm/radeon/si_dpm.c            | 38 +++++++----
 3 files changed, 91 insertions(+), 31 deletions(-)

-- 
2.50.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-04 13:41 [PATCH 0/6] SI power management fixes Timur Kristóf
@ 2025-08-04 13:41 ` Timur Kristóf
  2025-08-04 15:20   ` Alex Deucher
  2025-08-04 13:41 ` [PATCH 2/6] drm/amd/pm: Disable ULV even if unsupported Timur Kristóf
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 13:41 UTC (permalink / raw)
  To: amd-gfx; +Cc: Timur Kristóf

Unlike later versions, UVD 3 has firmware validation.
For this to work, the UVD should be powered up correctly.

When DPM is enabled and the display clock is off,
the SMU may choose a power state which doesn't power
the UVD, which can result in failure to initialize UVD.

Fixes: b38f3e80ecec ("drm amdgpu: SI UVD v3_1 (v2)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
index 5dbaebb592b3..9ad06c1e150d 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
@@ -633,6 +633,12 @@ static int uvd_v3_1_hw_init(struct amdgpu_ip_block *ip_block)
 	int r;
 
 	uvd_v3_1_mc_resume(adev);
+	uvd_v3_1_enable_mgcg(adev, true);
+
+	if (adev->pm.dpm_enabled)
+		amdgpu_dpm_enable_uvd(adev, true);
+	else
+		amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
 
 	r = uvd_v3_1_fw_validate(adev);
 	if (r) {
@@ -640,9 +646,6 @@ static int uvd_v3_1_hw_init(struct amdgpu_ip_block *ip_block)
 		return r;
 	}
 
-	uvd_v3_1_enable_mgcg(adev, true);
-	amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
-
 	uvd_v3_1_start(adev);
 
 	r = amdgpu_ring_test_helper(ring);
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/6] drm/amd/pm: Disable ULV even if unsupported
  2025-08-04 13:41 [PATCH 0/6] SI power management fixes Timur Kristóf
  2025-08-04 13:41 ` [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation Timur Kristóf
@ 2025-08-04 13:41 ` Timur Kristóf
  2025-08-04 15:24   ` Alex Deucher
  2025-08-04 13:41 ` [PATCH 3/6] drm/radeon: " Timur Kristóf
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 13:41 UTC (permalink / raw)
  To: amd-gfx; +Cc: Timur Kristóf

This commit fixes some instability on Tahiti.

Sometimes UVD initialization would fail when using DC.
I suspect this is because DC doesn't immediately turn on the
display clock, so it changes how DPM behaves.

Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 52e732be59e3..33b9d4beec84 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -5639,10 +5639,13 @@ static int si_disable_ulv(struct amdgpu_device *adev)
 {
 	struct si_power_info *si_pi = si_get_pi(adev);
 	struct si_ulv_param *ulv = &si_pi->ulv;
+	PPSMC_Result r;
 
+	r = amdgpu_si_send_msg_to_smc(adev, PPSMC_MSG_DisableULV);
+
+	/* Only care about SMC reply when ULV is supported. */
 	if (ulv->supported)
-		return (amdgpu_si_send_msg_to_smc(adev, PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
-			0 : -EINVAL;
+		return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
 
 	return 0;
 }
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/6] drm/radeon: Disable ULV even if unsupported
  2025-08-04 13:41 [PATCH 0/6] SI power management fixes Timur Kristóf
  2025-08-04 13:41 ` [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation Timur Kristóf
  2025-08-04 13:41 ` [PATCH 2/6] drm/amd/pm: Disable ULV even if unsupported Timur Kristóf
@ 2025-08-04 13:41 ` Timur Kristóf
  2025-08-04 15:24   ` Alex Deucher
  2025-08-04 13:41 ` [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data Timur Kristóf
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 13:41 UTC (permalink / raw)
  To: amd-gfx; +Cc: Timur Kristóf

Backport of the same commit to amdgpu.
This commit fixes some instability on Tahiti.

Fixes: a9e61410921b ("drm/radeon/kms: add dpm support for SI (v7)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/radeon/si_dpm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index 9deb91970d4d..47fba85436a7 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -5073,10 +5073,13 @@ static int si_disable_ulv(struct radeon_device *rdev)
 {
 	struct si_power_info *si_pi = si_get_pi(rdev);
 	struct si_ulv_param *ulv = &si_pi->ulv;
+	PPSMC_Result r;
 
+	r = si_send_msg_to_smc(rdev, PPSMC_MSG_DisableULV);
+
+	/* Only care about SMC reply when ULV is supported. */
 	if (ulv->supported)
-		return (si_send_msg_to_smc(rdev, PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
-			0 : -EINVAL;
+		return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
 
 	return 0;
 }
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data
  2025-08-04 13:41 [PATCH 0/6] SI power management fixes Timur Kristóf
                   ` (2 preceding siblings ...)
  2025-08-04 13:41 ` [PATCH 3/6] drm/radeon: " Timur Kristóf
@ 2025-08-04 13:41 ` Timur Kristóf
  2025-08-04 15:32   ` Alex Deucher
  2025-08-04 13:41 ` [PATCH 5/6] drm/radeon: " Timur Kristóf
  2025-08-04 13:41 ` [PATCH 6/6] drm/amd/pm: Fix SI DPM issues with high pixel clock Timur Kristóf
  5 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 13:41 UTC (permalink / raw)
  To: amd-gfx; +Cc: Timur Kristóf

The si_upload_smc_data function uses si_write_smc_soft_register
to set some register values in the SMC, and expects the result
to be PPSMC_Result_OK which is 1.

The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
checking the result of a command sent to the SMC.

However, the si_write_smc_soft_register actually doesn't send
any commands to the SMC and returns zero on success,
so this check was incorrect.

Fix that by correctly interpreting zero as success.
This seems to fix an SMC hang that happens in si_set_sw_state.

Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 31 +++++++++++++---------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 33b9d4beec84..e9f034ade214 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -5820,6 +5820,7 @@ static int si_upload_smc_data(struct amdgpu_device *adev)
 {
 	struct amdgpu_crtc *amdgpu_crtc = NULL;
 	int i;
+	int ret;
 
 	if (adev->pm.dpm.new_active_crtc_count == 0)
 		return 0;
@@ -5837,20 +5838,26 @@ static int si_upload_smc_data(struct amdgpu_device *adev)
 	if (amdgpu_crtc->line_time <= 0)
 		return 0;
 
-	if (si_write_smc_soft_register(adev,
-				       SI_SMC_SOFT_REGISTER_crtc_index,
-				       amdgpu_crtc->crtc_id) != PPSMC_Result_OK)
-		return 0;
+	ret = si_write_smc_soft_register(
+		adev,
+		SI_SMC_SOFT_REGISTER_crtc_index,
+		amdgpu_crtc->crtc_id);
+	if (ret)
+		return ret;
 
-	if (si_write_smc_soft_register(adev,
-				       SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
-				       amdgpu_crtc->wm_high / amdgpu_crtc->line_time) != PPSMC_Result_OK)
-		return 0;
+	ret = si_write_smc_soft_register(
+		adev,
+		SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
+		amdgpu_crtc->wm_high / amdgpu_crtc->line_time);
+	if (ret)
+		return ret;
 
-	if (si_write_smc_soft_register(adev,
-				       SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
-				       amdgpu_crtc->wm_low / amdgpu_crtc->line_time) != PPSMC_Result_OK)
-		return 0;
+	ret = si_write_smc_soft_register(
+		adev,
+		SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
+		amdgpu_crtc->wm_low / amdgpu_crtc->line_time);
+	if (ret)
+		return ret;
 
 	return 0;
 }
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/6] drm/radeon: Fix si_upload_smc_data
  2025-08-04 13:41 [PATCH 0/6] SI power management fixes Timur Kristóf
                   ` (3 preceding siblings ...)
  2025-08-04 13:41 ` [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data Timur Kristóf
@ 2025-08-04 13:41 ` Timur Kristóf
  2025-08-04 15:33   ` Alex Deucher
  2025-08-04 13:41 ` [PATCH 6/6] drm/amd/pm: Fix SI DPM issues with high pixel clock Timur Kristóf
  5 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 13:41 UTC (permalink / raw)
  To: amd-gfx; +Cc: Timur Kristóf

Backport of the fix to the same amdgpu issue.

The si_upload_smc_data function uses si_write_smc_soft_register
to set some register values in the SMC, and expects the result
to be PPSMC_Result_OK which is 1.

The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
checking the result of a command sent to the SMC.

However, the si_write_smc_soft_register actually doesn't send
any commands to the SMC and returns zero on success,
so this check was incorrect.

Fix that by correctly interpreting zero as success.
This seems to fix an SMC hang that happens in si_set_sw_state.

Fixes: a9e61410921b ("drm/radeon/kms: add dpm support for SI (v7)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/radeon/si_dpm.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index 47fba85436a7..8bda1e655468 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -5257,6 +5257,7 @@ static int si_upload_smc_data(struct radeon_device *rdev)
 {
 	struct radeon_crtc *radeon_crtc = NULL;
 	int i;
+	int ret;
 
 	if (rdev->pm.dpm.new_active_crtc_count == 0)
 		return 0;
@@ -5274,20 +5275,26 @@ static int si_upload_smc_data(struct radeon_device *rdev)
 	if (radeon_crtc->line_time <= 0)
 		return 0;
 
-	if (si_write_smc_soft_register(rdev,
-				       SI_SMC_SOFT_REGISTER_crtc_index,
-				       radeon_crtc->crtc_id) != PPSMC_Result_OK)
-		return 0;
+	ret = si_write_smc_soft_register(
+		rdev,
+		SI_SMC_SOFT_REGISTER_crtc_index,
+		radeon_crtc->crtc_id);
+	if (ret)
+		return ret;
 
-	if (si_write_smc_soft_register(rdev,
-				       SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
-				       radeon_crtc->wm_high / radeon_crtc->line_time) != PPSMC_Result_OK)
-		return 0;
+	ret = si_write_smc_soft_register(
+		rdev,
+		SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
+		radeon_crtc->wm_high / radeon_crtc->line_time);
+	if (ret)
+		return ret;
 
-	if (si_write_smc_soft_register(rdev,
-				       SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
-				       radeon_crtc->wm_low / radeon_crtc->line_time) != PPSMC_Result_OK)
-		return 0;
+	ret = si_write_smc_soft_register(
+		rdev,
+		SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
+		radeon_crtc->wm_low / radeon_crtc->line_time);
+	if (ret)
+		return ret;
 
 	return 0;
 }
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/6] drm/amd/pm: Fix SI DPM issues with high pixel clock
  2025-08-04 13:41 [PATCH 0/6] SI power management fixes Timur Kristóf
                   ` (4 preceding siblings ...)
  2025-08-04 13:41 ` [PATCH 5/6] drm/radeon: " Timur Kristóf
@ 2025-08-04 13:41 ` Timur Kristóf
  5 siblings, 0 replies; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 13:41 UTC (permalink / raw)
  To: amd-gfx; +Cc: Timur Kristóf

We define "high pixelclock" for SI as higher than necessary
for 4K 30Hz. For example, 4K 60Hz and 1080p 144Hz fall into
this category.

When a high pixel clock display is connected, always disable
memory clock switching to solve some flickering that can be
observed on Tahiti and Pitcairn on 4K 60Hz displays.

When two high pixel clock displays are connected to Oland,
additionally disable shader clock switching, which results in
a higher voltage, thereby addressing some visible flickering.

Ideally, these issues would be solved by introducing something
like dce_calcs, but we don't have that for old GPUs, so
ad-hoc fixes become necessary.

Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 37 ++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index e9f034ade214..7bfe39a7d653 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -3443,12 +3443,14 @@ static void si_apply_state_adjust_rules(struct amdgpu_device *adev,
 {
 	struct  si_ps *ps = si_get_ps(rps);
 	struct amdgpu_clock_and_voltage_limits *max_limits;
+	struct amdgpu_connector *conn;
 	bool disable_mclk_switching = false;
 	bool disable_sclk_switching = false;
 	u32 mclk, sclk;
 	u16 vddc, vddci, min_vce_voltage = 0;
 	u32 max_sclk_vddc, max_mclk_vddci, max_mclk_vddc;
 	u32 max_sclk = 0, max_mclk = 0;
+	u32 high_pixelclock_count = 0;
 	int i;
 
 	if (adev->asic_type == CHIP_HAINAN) {
@@ -3476,6 +3478,41 @@ static void si_apply_state_adjust_rules(struct amdgpu_device *adev,
 		}
 	}
 
+	/* We define "high pixelclock" for SI as higher than necessary for 4K 30Hz.
+	 * For example, 4K 60Hz and 1080p 144Hz fall into this category.
+	 * Find number of such displays connected.
+	 */
+	for (i = 0; i < adev->mode_info.num_crtc; i++) {
+		if (!(adev->pm.dpm.new_active_crtcs & (1 << i)) ||
+			!adev->mode_info.crtcs[i]->enabled)
+			continue;
+
+		conn = to_amdgpu_connector(adev->mode_info.crtcs[i]->connector);
+
+		if (conn->pixelclock_for_modeset > 297000)
+			high_pixelclock_count++;
+	}
+
+	/* These are some ad-hoc fixes to some issues observed with SI GPUs.
+	 * They are necessary because we don't have something like dce_calcs
+	 * for these GPUs.
+	 */
+	if (high_pixelclock_count) {
+		/* High pixelclock displays need higher memory bandwidth.
+		 * Disable memory clock switching in order to:
+		 * - Solve visible display flickering
+		 * - Avoid starving other clients of the memory controller
+		 */
+		disable_mclk_switching = true;
+
+		/* On Oland, we observe some flickering when two 4K 60Hz
+		 * displays are connected, possibly because voltage is too low.
+		 * Raise the voltage by requiring a higher SCLK.
+		 */
+		if (high_pixelclock_count > 1 && adev->asic_type == CHIP_OLAND)
+			disable_sclk_switching = true;
+	}
+
 	if (rps->vce_active) {
 		rps->evclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].evclk;
 		rps->ecclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].ecclk;
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-04 13:41 ` [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation Timur Kristóf
@ 2025-08-04 15:20   ` Alex Deucher
  2025-08-04 16:00     ` Timur Kristóf
  0 siblings, 1 reply; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 15:20 UTC (permalink / raw)
  To: Timur Kristóf, Christian Koenig, Leo Liu; +Cc: amd-gfx

On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> Unlike later versions, UVD 3 has firmware validation.
> For this to work, the UVD should be powered up correctly.
>
> When DPM is enabled and the display clock is off,
> the SMU may choose a power state which doesn't power
> the UVD, which can result in failure to initialize UVD.

+ Christian, Leo

That doesn't seem right to me.  IIRC, the driver always set the UVD
PLL directly on SI and I don't think SI supported any kind of UVD
power gating. I guess it's probably some sort of subtle sequencing
difference between radeon and amdgpu.  Unless Christian or Leo have
any ideas, I think the patch is probably fine.

Alex

>
> Fixes: b38f3e80ecec ("drm amdgpu: SI UVD v3_1 (v2)")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> index 5dbaebb592b3..9ad06c1e150d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> @@ -633,6 +633,12 @@ static int uvd_v3_1_hw_init(struct amdgpu_ip_block *ip_block)
>         int r;
>
>         uvd_v3_1_mc_resume(adev);
> +       uvd_v3_1_enable_mgcg(adev, true);
> +
> +       if (adev->pm.dpm_enabled)
> +               amdgpu_dpm_enable_uvd(adev, true);
> +       else
> +               amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
>
>         r = uvd_v3_1_fw_validate(adev);
>         if (r) {
> @@ -640,9 +646,6 @@ static int uvd_v3_1_hw_init(struct amdgpu_ip_block *ip_block)
>                 return r;
>         }
>
> -       uvd_v3_1_enable_mgcg(adev, true);
> -       amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
> -
>         uvd_v3_1_start(adev);
>
>         r = amdgpu_ring_test_helper(ring);
> --
> 2.50.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/6] drm/amd/pm: Disable ULV even if unsupported
  2025-08-04 13:41 ` [PATCH 2/6] drm/amd/pm: Disable ULV even if unsupported Timur Kristóf
@ 2025-08-04 15:24   ` Alex Deucher
  2025-08-04 16:04     ` Timur Kristóf
  0 siblings, 1 reply; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 15:24 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: amd-gfx

On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> This commit fixes some instability on Tahiti.
>
> Sometimes UVD initialization would fail when using DC.
> I suspect this is because DC doesn't immediately turn on the
> display clock, so it changes how DPM behaves.

Is this the right description for this patch?  I thought you had said
this fixed something else.

Alex

>
> Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> index 52e732be59e3..33b9d4beec84 100644
> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> @@ -5639,10 +5639,13 @@ static int si_disable_ulv(struct amdgpu_device *adev)
>  {
>         struct si_power_info *si_pi = si_get_pi(adev);
>         struct si_ulv_param *ulv = &si_pi->ulv;
> +       PPSMC_Result r;
>
> +       r = amdgpu_si_send_msg_to_smc(adev, PPSMC_MSG_DisableULV);
> +
> +       /* Only care about SMC reply when ULV is supported. */
>         if (ulv->supported)
> -               return (amdgpu_si_send_msg_to_smc(adev, PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
> -                       0 : -EINVAL;
> +               return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
>
>         return 0;
>  }
> --
> 2.50.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] drm/radeon: Disable ULV even if unsupported
  2025-08-04 13:41 ` [PATCH 3/6] drm/radeon: " Timur Kristóf
@ 2025-08-04 15:24   ` Alex Deucher
  2025-08-04 16:09     ` Timur Kristóf
  0 siblings, 1 reply; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 15:24 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: amd-gfx

On Mon, Aug 4, 2025 at 10:18 AM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> Backport of the same commit to amdgpu.
> This commit fixes some instability on Tahiti.

Have you tested this with radeon?

Alex

>
> Fixes: a9e61410921b ("drm/radeon/kms: add dpm support for SI (v7)")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/radeon/si_dpm.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
> index 9deb91970d4d..47fba85436a7 100644
> --- a/drivers/gpu/drm/radeon/si_dpm.c
> +++ b/drivers/gpu/drm/radeon/si_dpm.c
> @@ -5073,10 +5073,13 @@ static int si_disable_ulv(struct radeon_device *rdev)
>  {
>         struct si_power_info *si_pi = si_get_pi(rdev);
>         struct si_ulv_param *ulv = &si_pi->ulv;
> +       PPSMC_Result r;
>
> +       r = si_send_msg_to_smc(rdev, PPSMC_MSG_DisableULV);
> +
> +       /* Only care about SMC reply when ULV is supported. */
>         if (ulv->supported)
> -               return (si_send_msg_to_smc(rdev, PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
> -                       0 : -EINVAL;
> +               return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
>
>         return 0;
>  }
> --
> 2.50.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data
  2025-08-04 13:41 ` [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data Timur Kristóf
@ 2025-08-04 15:32   ` Alex Deucher
  2025-08-04 16:16     ` Timur Kristóf
  0 siblings, 1 reply; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 15:32 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: amd-gfx

On Mon, Aug 4, 2025 at 9:42 AM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> The si_upload_smc_data function uses si_write_smc_soft_register
> to set some register values in the SMC, and expects the result
> to be PPSMC_Result_OK which is 1.
>
> The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
> checking the result of a command sent to the SMC.
>
> However, the si_write_smc_soft_register actually doesn't send
> any commands to the SMC and returns zero on success,
> so this check was incorrect.
>
> Fix that by correctly interpreting zero as success.
> This seems to fix an SMC hang that happens in si_set_sw_state.
>
> Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 31 +++++++++++++---------
>  1 file changed, 19 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> index 33b9d4beec84..e9f034ade214 100644
> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> @@ -5820,6 +5820,7 @@ static int si_upload_smc_data(struct amdgpu_device *adev)
>  {
>         struct amdgpu_crtc *amdgpu_crtc = NULL;
>         int i;
> +       int ret;
>
>         if (adev->pm.dpm.new_active_crtc_count == 0)
>                 return 0;
> @@ -5837,20 +5838,26 @@ static int si_upload_smc_data(struct amdgpu_device *adev)
>         if (amdgpu_crtc->line_time <= 0)
>                 return 0;
>
> -       if (si_write_smc_soft_register(adev,
> -                                      SI_SMC_SOFT_REGISTER_crtc_index,
> -                                      amdgpu_crtc->crtc_id) != PPSMC_Result_OK)
> -               return 0;
> +       ret = si_write_smc_soft_register(
> +               adev,
> +               SI_SMC_SOFT_REGISTER_crtc_index,
> +               amdgpu_crtc->crtc_id);
> +       if (ret)
> +               return ret;
>
> -       if (si_write_smc_soft_register(adev,
> -                                      SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> -                                      amdgpu_crtc->wm_high / amdgpu_crtc->line_time) != PPSMC_Result_OK)
> -               return 0;
> +       ret = si_write_smc_soft_register(
> +               adev,
> +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> +               amdgpu_crtc->wm_high / amdgpu_crtc->line_time);
> +       if (ret)
> +               return ret;
>
> -       if (si_write_smc_soft_register(adev,
> -                                      SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> -                                      amdgpu_crtc->wm_low / amdgpu_crtc->line_time) != PPSMC_Result_OK)
> -               return 0;
> +       ret = si_write_smc_soft_register(
> +               adev,
> +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> +               amdgpu_crtc->wm_low / amdgpu_crtc->line_time);
> +       if (ret)
> +               return ret;

This patch changes the behavior of this function (i.e., it always
returns 0 before this patch).  Not sure if that matters or not.  I
think this could be simplified to something like the following to
retain the current behavior.

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 52e732be59e36..3dd0115aa15f8 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -5836,17 +5836,17 @@ static int si_upload_smc_data(struct
amdgpu_device *adev)

        if (si_write_smc_soft_register(adev,
                                       SI_SMC_SOFT_REGISTER_crtc_index,
-                                      amdgpu_crtc->crtc_id) != PPSMC_Result_OK)
+                                      amdgpu_crtc->crtc_id))
                return 0;

        if (si_write_smc_soft_register(adev,

SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
-                                      amdgpu_crtc->wm_high /
amdgpu_crtc->line_time) != PPSMC_Result_OK)
+                                      amdgpu_crtc->wm_high /
amdgpu_crtc->line_time))
                return 0;

        if (si_write_smc_soft_register(adev,

SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
-                                      amdgpu_crtc->wm_low /
amdgpu_crtc->line_time) != PPSMC_Result_OK)
+                                      amdgpu_crtc->wm_low /
amdgpu_crtc->line_time))
                return 0;

        return 0;


>
>         return 0;
>  }
> --
> 2.50.1
>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/6] drm/radeon: Fix si_upload_smc_data
  2025-08-04 13:41 ` [PATCH 5/6] drm/radeon: " Timur Kristóf
@ 2025-08-04 15:33   ` Alex Deucher
  0 siblings, 0 replies; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 15:33 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: amd-gfx

On Mon, Aug 4, 2025 at 9:48 AM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> Backport of the fix to the same amdgpu issue.
>
> The si_upload_smc_data function uses si_write_smc_soft_register
> to set some register values in the SMC, and expects the result
> to be PPSMC_Result_OK which is 1.
>
> The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
> checking the result of a command sent to the SMC.
>
> However, the si_write_smc_soft_register actually doesn't send
> any commands to the SMC and returns zero on success,
> so this check was incorrect.
>
> Fix that by correctly interpreting zero as success.
> This seems to fix an SMC hang that happens in si_set_sw_state.
>
> Fixes: a9e61410921b ("drm/radeon/kms: add dpm support for SI (v7)")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>

Same comment here as the previous patch.

Alex


> ---
>  drivers/gpu/drm/radeon/si_dpm.c | 31 +++++++++++++++++++------------
>  1 file changed, 19 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
> index 47fba85436a7..8bda1e655468 100644
> --- a/drivers/gpu/drm/radeon/si_dpm.c
> +++ b/drivers/gpu/drm/radeon/si_dpm.c
> @@ -5257,6 +5257,7 @@ static int si_upload_smc_data(struct radeon_device *rdev)
>  {
>         struct radeon_crtc *radeon_crtc = NULL;
>         int i;
> +       int ret;
>
>         if (rdev->pm.dpm.new_active_crtc_count == 0)
>                 return 0;
> @@ -5274,20 +5275,26 @@ static int si_upload_smc_data(struct radeon_device *rdev)
>         if (radeon_crtc->line_time <= 0)
>                 return 0;
>
> -       if (si_write_smc_soft_register(rdev,
> -                                      SI_SMC_SOFT_REGISTER_crtc_index,
> -                                      radeon_crtc->crtc_id) != PPSMC_Result_OK)
> -               return 0;
> +       ret = si_write_smc_soft_register(
> +               rdev,
> +               SI_SMC_SOFT_REGISTER_crtc_index,
> +               radeon_crtc->crtc_id);
> +       if (ret)
> +               return ret;
>
> -       if (si_write_smc_soft_register(rdev,
> -                                      SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> -                                      radeon_crtc->wm_high / radeon_crtc->line_time) != PPSMC_Result_OK)
> -               return 0;
> +       ret = si_write_smc_soft_register(
> +               rdev,
> +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> +               radeon_crtc->wm_high / radeon_crtc->line_time);
> +       if (ret)
> +               return ret;
>
> -       if (si_write_smc_soft_register(rdev,
> -                                      SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> -                                      radeon_crtc->wm_low / radeon_crtc->line_time) != PPSMC_Result_OK)
> -               return 0;
> +       ret = si_write_smc_soft_register(
> +               rdev,
> +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> +               radeon_crtc->wm_low / radeon_crtc->line_time);
> +       if (ret)
> +               return ret;
>
>         return 0;
>  }
> --
> 2.50.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-04 15:20   ` Alex Deucher
@ 2025-08-04 16:00     ` Timur Kristóf
  2025-08-04 17:45       ` Alex Deucher
  0 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 16:00 UTC (permalink / raw)
  To: Alex Deucher, Christian Koenig, Leo Liu; +Cc: amd-gfx

On Mon, 2025-08-04 at 11:20 -0400, Alex Deucher wrote:
> On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf
> <timur.kristof@gmail.com> wrote:
> > 
> > Unlike later versions, UVD 3 has firmware validation.
> > For this to work, the UVD should be powered up correctly.
> > 
> > When DPM is enabled and the display clock is off,
> > the SMU may choose a power state which doesn't power
> > the UVD, which can result in failure to initialize UVD.
> 
> + Christian, Leo
> 
> That doesn't seem right to me.  IIRC, the driver always set the UVD
> PLL directly on SI and I don't think SI supported any kind of UVD
> power gating. I guess it's probably some sort of subtle sequencing
> difference between radeon and amdgpu.  Unless Christian or Leo have
> any ideas, I think the patch is probably fine.
> 
> Alex

Hi,

These are my observations about how the UVD clock works on SI:

1. It seems that the SMC needs to know whether UVD is enabled or not,
and the UVD clocks are included as part of the power states. See:
si_convert_power_state_to_smc
si_convert_power_level_to_smc

On SI the default power state doesn't set the UVD clocks,
and SI has a specific power state to be used with UVD. Actually
amdgpu_dpm_enable_uvd has a special case code path for SI, where it
sets this power state. If I print the power states from
si_parse_power_table, it indeed confirms that there is only one power
state that has non-zero UVD clocks, and the rest of them just have the
UVD clocks at zero.

It's unclear to me what happens if we try to enable UVD clocks when we
selected a power state that doesn't include them (ie. when we don't
tell the SMC that UVD is active).

2. When setting a power state that enables UVD, the UVD clock is
enabled either before or after the engine clock by si_dpm. This is done
so in both radeon and amdgpu, see:
si_dpm_set_power_state
ni_set_uvd_clock_before_set_eng_clock
ni_set_uvd_clock_after_set_eng_clock

The specific sequence in which the UVD clock is enabled by
si_dpm_set_power_state leads me to the conclusion that
amdgpu_asic_set_uvd_clocks should not be directly called on SI outside
of the DPM code.

Please correct me if I misunderstood the code.

Thanks,
Timur


> 
> > 
> > Fixes: b38f3e80ecec ("drm amdgpu: SI UVD v3_1 (v2)")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > index 5dbaebb592b3..9ad06c1e150d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > @@ -633,6 +633,12 @@ static int uvd_v3_1_hw_init(struct
> > amdgpu_ip_block *ip_block)
> >         int r;
> > 
> >         uvd_v3_1_mc_resume(adev);
> > +       uvd_v3_1_enable_mgcg(adev, true);
> > +
> > +       if (adev->pm.dpm_enabled)
> > +               amdgpu_dpm_enable_uvd(adev, true);
> > +       else
> > +               amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
> > 
> >         r = uvd_v3_1_fw_validate(adev);
> >         if (r) {
> > @@ -640,9 +646,6 @@ static int uvd_v3_1_hw_init(struct
> > amdgpu_ip_block *ip_block)
> >                 return r;
> >         }
> > 
> > -       uvd_v3_1_enable_mgcg(adev, true);
> > -       amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
> > -
> >         uvd_v3_1_start(adev);
> > 
> >         r = amdgpu_ring_test_helper(ring);
> > --
> > 2.50.1
> > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/6] drm/amd/pm: Disable ULV even if unsupported
  2025-08-04 15:24   ` Alex Deucher
@ 2025-08-04 16:04     ` Timur Kristóf
  2025-08-04 17:33       ` Alex Deucher
  0 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 16:04 UTC (permalink / raw)
  To: Alex Deucher; +Cc: amd-gfx

On Mon, 2025-08-04 at 11:24 -0400, Alex Deucher wrote:
> On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf
> <timur.kristof@gmail.com> wrote:
> > 
> > This commit fixes some instability on Tahiti.
> > 
> > Sometimes UVD initialization would fail when using DC.
> > I suspect this is because DC doesn't immediately turn on the
> > display clock, so it changes how DPM behaves.
> 
> Is this the right description for this patch?  I thought you had said
> this fixed something else.
> 
> Alex

Yes, this patch together with the previous one fixes the "amdgpu: UVD
Firmware validate fail" when I enable DC on Tahiti.

Last week I thought this also fixed the "si_set_sw_state failed", but
that turned out to be wrong. For that one, I sent a separate patch
which involves a different fix.

Timur

> 
> > 
> > Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > index 52e732be59e3..33b9d4beec84 100644
> > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > @@ -5639,10 +5639,13 @@ static int si_disable_ulv(struct
> > amdgpu_device *adev)
> >  {
> >         struct si_power_info *si_pi = si_get_pi(adev);
> >         struct si_ulv_param *ulv = &si_pi->ulv;
> > +       PPSMC_Result r;
> > 
> > +       r = amdgpu_si_send_msg_to_smc(adev, PPSMC_MSG_DisableULV);
> > +
> > +       /* Only care about SMC reply when ULV is supported. */
> >         if (ulv->supported)
> > -               return (amdgpu_si_send_msg_to_smc(adev,
> > PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
> > -                       0 : -EINVAL;
> > +               return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
> > 
> >         return 0;
> >  }
> > --
> > 2.50.1
> > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] drm/radeon: Disable ULV even if unsupported
  2025-08-04 15:24   ` Alex Deucher
@ 2025-08-04 16:09     ` Timur Kristóf
  2025-08-04 17:34       ` Alex Deucher
  0 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 16:09 UTC (permalink / raw)
  To: Alex Deucher; +Cc: amd-gfx

On Mon, 2025-08-04 at 11:24 -0400, Alex Deucher wrote:
> On Mon, Aug 4, 2025 at 10:18 AM Timur Kristóf
> <timur.kristof@gmail.com> wrote:
> > 
> > Backport of the same commit to amdgpu.
> > This commit fixes some instability on Tahiti.
> 
> Have you tested this with radeon?
> 
> Alex

Considering that ULV is currently always marked as unsupported by both
radeon and amdgpu, and that the same patch fixes a real issue on
amdgpu, I assume it would improve stability on radeon too.

I leave it up to your judgement whether it's a good idea to apply this
to radeon or not.

That said, I haven't got a specific issue which this fixes on radeon.
I only tested that Tahiti can boot with radeon with this patch
included, but didn't do detailed testing other than that. Do you have
something specific in mind that I should try?

Timur

> 
> > 
> > Fixes: a9e61410921b ("drm/radeon/kms: add dpm support for SI (v7)")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >  drivers/gpu/drm/radeon/si_dpm.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/radeon/si_dpm.c
> > b/drivers/gpu/drm/radeon/si_dpm.c
> > index 9deb91970d4d..47fba85436a7 100644
> > --- a/drivers/gpu/drm/radeon/si_dpm.c
> > +++ b/drivers/gpu/drm/radeon/si_dpm.c
> > @@ -5073,10 +5073,13 @@ static int si_disable_ulv(struct
> > radeon_device *rdev)
> >  {
> >         struct si_power_info *si_pi = si_get_pi(rdev);
> >         struct si_ulv_param *ulv = &si_pi->ulv;
> > +       PPSMC_Result r;
> > 
> > +       r = si_send_msg_to_smc(rdev, PPSMC_MSG_DisableULV);
> > +
> > +       /* Only care about SMC reply when ULV is supported. */
> >         if (ulv->supported)
> > -               return (si_send_msg_to_smc(rdev,
> > PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
> > -                       0 : -EINVAL;
> > +               return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
> > 
> >         return 0;
> >  }
> > --
> > 2.50.1
> > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data
  2025-08-04 15:32   ` Alex Deucher
@ 2025-08-04 16:16     ` Timur Kristóf
  2025-08-04 17:31       ` Alex Deucher
  0 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 16:16 UTC (permalink / raw)
  To: Alex Deucher; +Cc: amd-gfx

On Mon, 2025-08-04 at 11:32 -0400, Alex Deucher wrote:
> On Mon, Aug 4, 2025 at 9:42 AM Timur Kristóf
> <timur.kristof@gmail.com> wrote:
> > 
> > The si_upload_smc_data function uses si_write_smc_soft_register
> > to set some register values in the SMC, and expects the result
> > to be PPSMC_Result_OK which is 1.
> > 
> > The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
> > checking the result of a command sent to the SMC.
> > 
> > However, the si_write_smc_soft_register actually doesn't send
> > any commands to the SMC and returns zero on success,
> > so this check was incorrect.
> > 
> > Fix that by correctly interpreting zero as success.
> > This seems to fix an SMC hang that happens in si_set_sw_state.
> > 
> > Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 31 +++++++++++++-----
> > ----
> >  1 file changed, 19 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > index 33b9d4beec84..e9f034ade214 100644
> > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > @@ -5820,6 +5820,7 @@ static int si_upload_smc_data(struct
> > amdgpu_device *adev)
> >  {
> >         struct amdgpu_crtc *amdgpu_crtc = NULL;
> >         int i;
> > +       int ret;
> > 
> >         if (adev->pm.dpm.new_active_crtc_count == 0)
> >                 return 0;
> > @@ -5837,20 +5838,26 @@ static int si_upload_smc_data(struct
> > amdgpu_device *adev)
> >         if (amdgpu_crtc->line_time <= 0)
> >                 return 0;
> > 
> > -       if (si_write_smc_soft_register(adev,
> > -                                     
> > SI_SMC_SOFT_REGISTER_crtc_index,
> > -                                      amdgpu_crtc->crtc_id) !=
> > PPSMC_Result_OK)
> > -               return 0;
> > +       ret = si_write_smc_soft_register(
> > +               adev,
> > +               SI_SMC_SOFT_REGISTER_crtc_index,
> > +               amdgpu_crtc->crtc_id);
> > +       if (ret)
> > +               return ret;
> > 
> > -       if (si_write_smc_soft_register(adev,
> > -                                     
> > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > -                                      amdgpu_crtc->wm_high /
> > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > -               return 0;
> > +       ret = si_write_smc_soft_register(
> > +               adev,
> > +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > +               amdgpu_crtc->wm_high / amdgpu_crtc->line_time);
> > +       if (ret)
> > +               return ret;
> > 
> > -       if (si_write_smc_soft_register(adev,
> > -                                     
> > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > -                                      amdgpu_crtc->wm_low /
> > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > -               return 0;
> > +       ret = si_write_smc_soft_register(
> > +               adev,
> > +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > +               amdgpu_crtc->wm_low / amdgpu_crtc->line_time);
> > +       if (ret)
> > +               return ret;
> 
> This patch changes the behavior of this function (i.e., it always
> returns 0 before this patch).  Not sure if that matters or not.  I
> think this could be simplified to something like the following to
> retain the current behavior.

Actually now that I think of it more, I think it may be entirely
unnecessary to check the return value.

si_upload_smc_data calls:
si_write_smc_soft_register
amdgpu_si_write_smc_sram_dword
si_set_smc_sram_address

This last one, si_set_smc_sram_address returns -EINVAL when its
smc_address parameter is not dword-aligned or out of bounds. Otherwise
all of the above functions return 0 (success). Considering that all of
the addresses passed by si_upload_smc_data are compile time constants,
we know they are correct so there is no reason why any of those
functions would return an error.

Looking at other callers of si_write_smc_soft_register, I see that they
don't check the return value at all.

So, I'd actually simplify this even more and just not check the return
values. What do you think about that?


> 
> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> index 52e732be59e36..3dd0115aa15f8 100644
> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> @@ -5836,17 +5836,17 @@ static int si_upload_smc_data(struct
> amdgpu_device *adev)
> 
>         if (si_write_smc_soft_register(adev,
>                                       
> SI_SMC_SOFT_REGISTER_crtc_index,
> -                                      amdgpu_crtc->crtc_id) !=
> PPSMC_Result_OK)
> +                                      amdgpu_crtc->crtc_id))
>                 return 0;
> 
>         if (si_write_smc_soft_register(adev,
> 
> SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> -                                      amdgpu_crtc->wm_high /
> amdgpu_crtc->line_time) != PPSMC_Result_OK)
> +                                      amdgpu_crtc->wm_high /
> amdgpu_crtc->line_time))
>                 return 0;
> 
>         if (si_write_smc_soft_register(adev,
> 
> SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> -                                      amdgpu_crtc->wm_low /
> amdgpu_crtc->line_time) != PPSMC_Result_OK)
> +                                      amdgpu_crtc->wm_low /
> amdgpu_crtc->line_time))
>                 return 0;
> 
>         return 0;
> 
> 
> > 
> >         return 0;
> >  }
> > --
> > 2.50.1
> > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data
  2025-08-04 16:16     ` Timur Kristóf
@ 2025-08-04 17:31       ` Alex Deucher
  2025-08-08  8:22         ` Timur Kristóf
  0 siblings, 1 reply; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 17:31 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: amd-gfx

On Mon, Aug 4, 2025 at 12:16 PM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> On Mon, 2025-08-04 at 11:32 -0400, Alex Deucher wrote:
> > On Mon, Aug 4, 2025 at 9:42 AM Timur Kristóf
> > <timur.kristof@gmail.com> wrote:
> > >
> > > The si_upload_smc_data function uses si_write_smc_soft_register
> > > to set some register values in the SMC, and expects the result
> > > to be PPSMC_Result_OK which is 1.
> > >
> > > The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
> > > checking the result of a command sent to the SMC.
> > >
> > > However, the si_write_smc_soft_register actually doesn't send
> > > any commands to the SMC and returns zero on success,
> > > so this check was incorrect.
> > >
> > > Fix that by correctly interpreting zero as success.
> > > This seems to fix an SMC hang that happens in si_set_sw_state.
> > >
> > > Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > ---
> > >  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 31 +++++++++++++-----
> > > ----
> > >  1 file changed, 19 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > index 33b9d4beec84..e9f034ade214 100644
> > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > @@ -5820,6 +5820,7 @@ static int si_upload_smc_data(struct
> > > amdgpu_device *adev)
> > >  {
> > >         struct amdgpu_crtc *amdgpu_crtc = NULL;
> > >         int i;
> > > +       int ret;
> > >
> > >         if (adev->pm.dpm.new_active_crtc_count == 0)
> > >                 return 0;
> > > @@ -5837,20 +5838,26 @@ static int si_upload_smc_data(struct
> > > amdgpu_device *adev)
> > >         if (amdgpu_crtc->line_time <= 0)
> > >                 return 0;
> > >
> > > -       if (si_write_smc_soft_register(adev,
> > > -
> > > SI_SMC_SOFT_REGISTER_crtc_index,
> > > -                                      amdgpu_crtc->crtc_id) !=
> > > PPSMC_Result_OK)
> > > -               return 0;
> > > +       ret = si_write_smc_soft_register(
> > > +               adev,
> > > +               SI_SMC_SOFT_REGISTER_crtc_index,
> > > +               amdgpu_crtc->crtc_id);
> > > +       if (ret)
> > > +               return ret;
> > >
> > > -       if (si_write_smc_soft_register(adev,
> > > -
> > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > -                                      amdgpu_crtc->wm_high /
> > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > -               return 0;
> > > +       ret = si_write_smc_soft_register(
> > > +               adev,
> > > +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > +               amdgpu_crtc->wm_high / amdgpu_crtc->line_time);
> > > +       if (ret)
> > > +               return ret;
> > >
> > > -       if (si_write_smc_soft_register(adev,
> > > -
> > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > -                                      amdgpu_crtc->wm_low /
> > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > -               return 0;
> > > +       ret = si_write_smc_soft_register(
> > > +               adev,
> > > +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > +               amdgpu_crtc->wm_low / amdgpu_crtc->line_time);
> > > +       if (ret)
> > > +               return ret;
> >
> > This patch changes the behavior of this function (i.e., it always
> > returns 0 before this patch).  Not sure if that matters or not.  I
> > think this could be simplified to something like the following to
> > retain the current behavior.
>
> Actually now that I think of it more, I think it may be entirely
> unnecessary to check the return value.
>
> si_upload_smc_data calls:
> si_write_smc_soft_register
> amdgpu_si_write_smc_sram_dword
> si_set_smc_sram_address
>
> This last one, si_set_smc_sram_address returns -EINVAL when its
> smc_address parameter is not dword-aligned or out of bounds. Otherwise
> all of the above functions return 0 (success). Considering that all of
> the addresses passed by si_upload_smc_data are compile time constants,
> we know they are correct so there is no reason why any of those
> functions would return an error.
>
> Looking at other callers of si_write_smc_soft_register, I see that they
> don't check the return value at all.
>
> So, I'd actually simplify this even more and just not check the return
> values. What do you think about that?

Sure.  Works for me.

Alex

>
>
> >
> > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > index 52e732be59e36..3dd0115aa15f8 100644
> > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > @@ -5836,17 +5836,17 @@ static int si_upload_smc_data(struct
> > amdgpu_device *adev)
> >
> >         if (si_write_smc_soft_register(adev,
> >
> > SI_SMC_SOFT_REGISTER_crtc_index,
> > -                                      amdgpu_crtc->crtc_id) !=
> > PPSMC_Result_OK)
> > +                                      amdgpu_crtc->crtc_id))
> >                 return 0;
> >
> >         if (si_write_smc_soft_register(adev,
> >
> > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > -                                      amdgpu_crtc->wm_high /
> > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > +                                      amdgpu_crtc->wm_high /
> > amdgpu_crtc->line_time))
> >                 return 0;
> >
> >         if (si_write_smc_soft_register(adev,
> >
> > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > -                                      amdgpu_crtc->wm_low /
> > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > +                                      amdgpu_crtc->wm_low /
> > amdgpu_crtc->line_time))
> >                 return 0;
> >
> >         return 0;
> >
> >
> > >
> > >         return 0;
> > >  }
> > > --
> > > 2.50.1
> > >

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/6] drm/amd/pm: Disable ULV even if unsupported
  2025-08-04 16:04     ` Timur Kristóf
@ 2025-08-04 17:33       ` Alex Deucher
  0 siblings, 0 replies; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 17:33 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: amd-gfx

On Mon, Aug 4, 2025 at 12:04 PM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> On Mon, 2025-08-04 at 11:24 -0400, Alex Deucher wrote:
> > On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf
> > <timur.kristof@gmail.com> wrote:
> > >
> > > This commit fixes some instability on Tahiti.
> > >
> > > Sometimes UVD initialization would fail when using DC.
> > > I suspect this is because DC doesn't immediately turn on the
> > > display clock, so it changes how DPM behaves.
> >
> > Is this the right description for this patch?  I thought you had said
> > this fixed something else.
> >
> > Alex
>
> Yes, this patch together with the previous one fixes the "amdgpu: UVD
> Firmware validate fail" when I enable DC on Tahiti.
>
> Last week I thought this also fixed the "si_set_sw_state failed", but
> that turned out to be wrong. For that one, I sent a separate patch
> which involves a different fix.

Thanks for clarifying.
Acked-by: Alex Deucher <alexander.deucher@amd.com>

>
> Timur
>
> >
> > >
> > > Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > ---
> > >  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > index 52e732be59e3..33b9d4beec84 100644
> > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > @@ -5639,10 +5639,13 @@ static int si_disable_ulv(struct
> > > amdgpu_device *adev)
> > >  {
> > >         struct si_power_info *si_pi = si_get_pi(adev);
> > >         struct si_ulv_param *ulv = &si_pi->ulv;
> > > +       PPSMC_Result r;
> > >
> > > +       r = amdgpu_si_send_msg_to_smc(adev, PPSMC_MSG_DisableULV);
> > > +
> > > +       /* Only care about SMC reply when ULV is supported. */
> > >         if (ulv->supported)
> > > -               return (amdgpu_si_send_msg_to_smc(adev,
> > > PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
> > > -                       0 : -EINVAL;
> > > +               return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
> > >
> > >         return 0;
> > >  }
> > > --
> > > 2.50.1
> > >

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] drm/radeon: Disable ULV even if unsupported
  2025-08-04 16:09     ` Timur Kristóf
@ 2025-08-04 17:34       ` Alex Deucher
  0 siblings, 0 replies; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 17:34 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: amd-gfx

On Mon, Aug 4, 2025 at 12:09 PM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> On Mon, 2025-08-04 at 11:24 -0400, Alex Deucher wrote:
> > On Mon, Aug 4, 2025 at 10:18 AM Timur Kristóf
> > <timur.kristof@gmail.com> wrote:
> > >
> > > Backport of the same commit to amdgpu.
> > > This commit fixes some instability on Tahiti.
> >
> > Have you tested this with radeon?
> >
> > Alex
>
> Considering that ULV is currently always marked as unsupported by both
> radeon and amdgpu, and that the same patch fixes a real issue on
> amdgpu, I assume it would improve stability on radeon too.
>
> I leave it up to your judgement whether it's a good idea to apply this
> to radeon or not.
>
> That said, I haven't got a specific issue which this fixes on radeon.
> I only tested that Tahiti can boot with radeon with this patch
> included, but didn't do detailed testing other than that. Do you have
> something specific in mind that I should try?

That's fine.  I was thinking this contributed to the other issue you
were debugging, but you clarified that in the other patch.
Acked-by: Alex Deucher <alexander.deucher@amd.com>

Alex

>
> Timur
>
> >
> > >
> > > Fixes: a9e61410921b ("drm/radeon/kms: add dpm support for SI (v7)")
> > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > ---
> > >  drivers/gpu/drm/radeon/si_dpm.c | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/radeon/si_dpm.c
> > > b/drivers/gpu/drm/radeon/si_dpm.c
> > > index 9deb91970d4d..47fba85436a7 100644
> > > --- a/drivers/gpu/drm/radeon/si_dpm.c
> > > +++ b/drivers/gpu/drm/radeon/si_dpm.c
> > > @@ -5073,10 +5073,13 @@ static int si_disable_ulv(struct
> > > radeon_device *rdev)
> > >  {
> > >         struct si_power_info *si_pi = si_get_pi(rdev);
> > >         struct si_ulv_param *ulv = &si_pi->ulv;
> > > +       PPSMC_Result r;
> > >
> > > +       r = si_send_msg_to_smc(rdev, PPSMC_MSG_DisableULV);
> > > +
> > > +       /* Only care about SMC reply when ULV is supported. */
> > >         if (ulv->supported)
> > > -               return (si_send_msg_to_smc(rdev,
> > > PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
> > > -                       0 : -EINVAL;
> > > +               return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
> > >
> > >         return 0;
> > >  }
> > > --
> > > 2.50.1
> > >

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-04 16:00     ` Timur Kristóf
@ 2025-08-04 17:45       ` Alex Deucher
  2025-08-04 18:51         ` Timur Kristóf
  2025-08-04 18:59         ` Christian König
  0 siblings, 2 replies; 28+ messages in thread
From: Alex Deucher @ 2025-08-04 17:45 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: Christian Koenig, Leo Liu, amd-gfx

On Mon, Aug 4, 2025 at 12:00 PM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> On Mon, 2025-08-04 at 11:20 -0400, Alex Deucher wrote:
> > On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf
> > <timur.kristof@gmail.com> wrote:
> > >
> > > Unlike later versions, UVD 3 has firmware validation.
> > > For this to work, the UVD should be powered up correctly.
> > >
> > > When DPM is enabled and the display clock is off,
> > > the SMU may choose a power state which doesn't power
> > > the UVD, which can result in failure to initialize UVD.
> >
> > + Christian, Leo
> >
> > That doesn't seem right to me.  IIRC, the driver always set the UVD
> > PLL directly on SI and I don't think SI supported any kind of UVD
> > power gating. I guess it's probably some sort of subtle sequencing
> > difference between radeon and amdgpu.  Unless Christian or Leo have
> > any ideas, I think the patch is probably fine.
> >
> > Alex
>
> Hi,
>
> These are my observations about how the UVD clock works on SI:
>
> 1. It seems that the SMC needs to know whether UVD is enabled or not,
> and the UVD clocks are included as part of the power states. See:
> si_convert_power_state_to_smc
> si_convert_power_level_to_smc
>
> On SI the default power state doesn't set the UVD clocks,
> and SI has a specific power state to be used with UVD. Actually
> amdgpu_dpm_enable_uvd has a special case code path for SI, where it
> sets this power state. If I print the power states from
> si_parse_power_table, it indeed confirms that there is only one power
> state that has non-zero UVD clocks, and the rest of them just have the
> UVD clocks at zero.
>
> It's unclear to me what happens if we try to enable UVD clocks when we
> selected a power state that doesn't include them (ie. when we don't
> tell the SMC that UVD is active).
>
> 2. When setting a power state that enables UVD, the UVD clock is
> enabled either before or after the engine clock by si_dpm. This is done
> so in both radeon and amdgpu, see:
> si_dpm_set_power_state
> ni_set_uvd_clock_before_set_eng_clock
> ni_set_uvd_clock_after_set_eng_clock
>
> The specific sequence in which the UVD clock is enabled by
> si_dpm_set_power_state leads me to the conclusion that
> amdgpu_asic_set_uvd_clocks should not be directly called on SI outside
> of the DPM code.
>
> Please correct me if I misunderstood the code.

Yeah, I don't remember the clock dependencies.  I thought that you
should be able to program the UVD PLLs any time you wanted and the
ordering only mattered when you were also changing the sclk.
Programming the PLLs directly works as is in radeon, but I guess maybe
we init DPM in a different order in radeon vs amdgpu.

It would also probably be a good idea to disable the UVD clocks again
after IP init to save power. E.g., something like:

       if (adev->pm.dpm_enabled)
               amdgpu_dpm_enable_uvd(adev, false);
       else
               amdgpu_asic_set_uvd_clocks(adev, 0, 0);

Alex


>
> Thanks,
> Timur
>
>
> >
> > >
> > > Fixes: b38f3e80ecec ("drm amdgpu: SI UVD v3_1 (v2)")
> > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 9 ++++++---
> > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > index 5dbaebb592b3..9ad06c1e150d 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > @@ -633,6 +633,12 @@ static int uvd_v3_1_hw_init(struct
> > > amdgpu_ip_block *ip_block)
> > >         int r;
> > >
> > >         uvd_v3_1_mc_resume(adev);
> > > +       uvd_v3_1_enable_mgcg(adev, true);
> > > +
> > > +       if (adev->pm.dpm_enabled)
> > > +               amdgpu_dpm_enable_uvd(adev, true);
> > > +       else
> > > +               amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
> > >
> > >         r = uvd_v3_1_fw_validate(adev);
> > >         if (r) {
> > > @@ -640,9 +646,6 @@ static int uvd_v3_1_hw_init(struct
> > > amdgpu_ip_block *ip_block)
> > >                 return r;
> > >         }
> > >
> > > -       uvd_v3_1_enable_mgcg(adev, true);
> > > -       amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
> > > -
> > >         uvd_v3_1_start(adev);
> > >
> > >         r = amdgpu_ring_test_helper(ring);
> > > --
> > > 2.50.1
> > >

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-04 17:45       ` Alex Deucher
@ 2025-08-04 18:51         ` Timur Kristóf
  2025-08-04 18:59         ` Christian König
  1 sibling, 0 replies; 28+ messages in thread
From: Timur Kristóf @ 2025-08-04 18:51 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Christian Koenig, Leo Liu, amd-gfx

On Mon, 2025-08-04 at 13:45 -0400, Alex Deucher wrote:
> On Mon, Aug 4, 2025 at 12:00 PM Timur Kristóf
> <timur.kristof@gmail.com> wrote:
> > 
> > On Mon, 2025-08-04 at 11:20 -0400, Alex Deucher wrote:
> > > On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf
> > > <timur.kristof@gmail.com> wrote:
> > > > 
> > > > Unlike later versions, UVD 3 has firmware validation.
> > > > For this to work, the UVD should be powered up correctly.
> > > > 
> > > > When DPM is enabled and the display clock is off,
> > > > the SMU may choose a power state which doesn't power
> > > > the UVD, which can result in failure to initialize UVD.
> > > 
> > > + Christian, Leo
> > > 
> > > That doesn't seem right to me.  IIRC, the driver always set the
> > > UVD
> > > PLL directly on SI and I don't think SI supported any kind of UVD
> > > power gating. I guess it's probably some sort of subtle
> > > sequencing
> > > difference between radeon and amdgpu.  Unless Christian or Leo
> > > have
> > > any ideas, I think the patch is probably fine.
> > > 
> > > Alex
> > 
> > Hi,
> > 
> > These are my observations about how the UVD clock works on SI:
> > 
> > 1. It seems that the SMC needs to know whether UVD is enabled or
> > not,
> > and the UVD clocks are included as part of the power states. See:
> > si_convert_power_state_to_smc
> > si_convert_power_level_to_smc
> > 
> > On SI the default power state doesn't set the UVD clocks,
> > and SI has a specific power state to be used with UVD. Actually
> > amdgpu_dpm_enable_uvd has a special case code path for SI, where it
> > sets this power state. If I print the power states from
> > si_parse_power_table, it indeed confirms that there is only one
> > power
> > state that has non-zero UVD clocks, and the rest of them just have
> > the
> > UVD clocks at zero.
> > 
> > It's unclear to me what happens if we try to enable UVD clocks when
> > we
> > selected a power state that doesn't include them (ie. when we don't
> > tell the SMC that UVD is active).
> > 
> > 2. When setting a power state that enables UVD, the UVD clock is
> > enabled either before or after the engine clock by si_dpm. This is
> > done
> > so in both radeon and amdgpu, see:
> > si_dpm_set_power_state
> > ni_set_uvd_clock_before_set_eng_clock
> > ni_set_uvd_clock_after_set_eng_clock
> > 
> > The specific sequence in which the UVD clock is enabled by
> > si_dpm_set_power_state leads me to the conclusion that
> > amdgpu_asic_set_uvd_clocks should not be directly called on SI
> > outside
> > of the DPM code.
> > 
> > Please correct me if I misunderstood the code.
> 
> Yeah, I don't remember the clock dependencies.  I thought that you
> should be able to program the UVD PLLs any time you wanted and the
> ordering only mattered when you were also changing the sclk.

In that case, wouldn't it cause issues that the SMC is not aware that
the UVD is running?

As far as I see from si_convert_power_*_to_smc there are some flags in
the power states which are set differently depending on whether UVD is
active or not.

A secondary issue is that the VBIOS seems to have different clock
values for the UVD compared to what is hardcoded in the driver. This
doesn't mean the driver is incorrect, but si_dpm does use the values
from the VBIOS.

> Programming the PLLs directly works as is in radeon, but I guess
> maybe
> we init DPM in a different order in radeon vs amdgpu.

The order is the same.

Looking at radeon's si.c, in si_init:
It calls radeon_pm_init (which eventually calls si_dpm_init) before
calling si_uvd_init.

Looking at amdgpu's si.c, in si_set_ip_blocks:
The si_smu_ip_block is added before uvd_v3_1_ip_block.

I think the main difference between radeon and amdgpu as far as I see:
- radeon doesn't do the FW validation
- radeon doesn't do a ring test

Hence, radeon doesn't actually need to enable the UVD clocks at
initialization.

> 
> It would also probably be a good idea to disable the UVD clocks again
> after IP init to save power. E.g., something like:
> 
>        if (adev->pm.dpm_enabled)
>                amdgpu_dpm_enable_uvd(adev, false);
>        else
>                amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> 
> Alex

Note that the UVD clocks were already enabled by uvd_v3_1_hw_init, when
it creates the ring test, that calls amdgpu_uvd_ring_begin_use which
enables the clock by calling amdgpu_dpm_enable_uvd. The only difference
that my patch makes is that it now enables the UVD clock just a little
bit earlier.

I think we can't disable the UVD clocks directly in uvd_v3_1_hw_init,
because the ring test would fail. However, as far as I understand, the
UVD clocks are already disabled automatically anyway after the ring
test is done: amdgpu_ring_commit calls amdgpu_uvd_ring_end_use. This
launches the UVD idle work handler, which automatically disables the UV
clock when all pending fences are signalled.

Please correct me if I misunderstood this.

Thanks,
Timur


> 
> 
> > 
> > Thanks,
> > Timur
> > 
> > 
> > > 
> > > > 
> > > > Fixes: b38f3e80ecec ("drm amdgpu: SI UVD v3_1 (v2)")
> > > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 9 ++++++---
> > > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > > b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > > index 5dbaebb592b3..9ad06c1e150d 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > > @@ -633,6 +633,12 @@ static int uvd_v3_1_hw_init(struct
> > > > amdgpu_ip_block *ip_block)
> > > >         int r;
> > > > 
> > > >         uvd_v3_1_mc_resume(adev);
> > > > +       uvd_v3_1_enable_mgcg(adev, true);
> > > > +
> > > > +       if (adev->pm.dpm_enabled)
> > > > +               amdgpu_dpm_enable_uvd(adev, true);
> > > > +       else
> > > > +               amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
> > > > 
> > > >         r = uvd_v3_1_fw_validate(adev);
> > > >         if (r) {
> > > > @@ -640,9 +646,6 @@ static int uvd_v3_1_hw_init(struct
> > > > amdgpu_ip_block *ip_block)
> > > >                 return r;
> > > >         }
> > > > 
> > > > -       uvd_v3_1_enable_mgcg(adev, true);
> > > > -       amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
> > > > -
> > > >         uvd_v3_1_start(adev);
> > > > 
> > > >         r = amdgpu_ring_test_helper(ring);
> > > > --
> > > > 2.50.1
> > > > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-04 17:45       ` Alex Deucher
  2025-08-04 18:51         ` Timur Kristóf
@ 2025-08-04 18:59         ` Christian König
  2025-08-06  0:35           ` Timur Kristóf
  1 sibling, 1 reply; 28+ messages in thread
From: Christian König @ 2025-08-04 18:59 UTC (permalink / raw)
  To: Alex Deucher, Timur Kristóf; +Cc: Leo Liu, amd-gfx

On 04.08.25 19:45, Alex Deucher wrote:
> On Mon, Aug 4, 2025 at 12:00 PM Timur Kristóf <timur.kristof@gmail.com> wrote:
>>
>> On Mon, 2025-08-04 at 11:20 -0400, Alex Deucher wrote:
>>> On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf
>>> <timur.kristof@gmail.com> wrote:
>>>>
>>>> Unlike later versions, UVD 3 has firmware validation.
>>>> For this to work, the UVD should be powered up correctly.
>>>>
>>>> When DPM is enabled and the display clock is off,
>>>> the SMU may choose a power state which doesn't power
>>>> the UVD, which can result in failure to initialize UVD.
>>>
>>> + Christian, Leo
>>>
>>> That doesn't seem right to me.  IIRC, the driver always set the UVD
>>> PLL directly on SI and I don't think SI supported any kind of UVD
>>> power gating. I guess it's probably some sort of subtle sequencing
>>> difference between radeon and amdgpu.  Unless Christian or Leo have
>>> any ideas, I think the patch is probably fine.

Oh my, that stuff was last at the front of my head a long long time ago.

>>>
>>> Alex
>>
>> Hi,
>>
>> These are my observations about how the UVD clock works on SI:
>>
>> 1. It seems that the SMC needs to know whether UVD is enabled or not,
>> and the UVD clocks are included as part of the power states. See:
>> si_convert_power_state_to_smc
>> si_convert_power_level_to_smc

Correct, yes. The design was that either the KMD or the SMC could program the PLLs.

>>
>> On SI the default power state doesn't set the UVD clocks,
>> and SI has a specific power state to be used with UVD. Actually
>> amdgpu_dpm_enable_uvd has a special case code path for SI, where it
>> sets this power state. If I print the power states from
>> si_parse_power_table, it indeed confirms that there is only one power
>> state that has non-zero UVD clocks, and the rest of them just have the
>> UVD clocks at zero.
>>
>> It's unclear to me what happens if we try to enable UVD clocks when we
>> selected a power state that doesn't include them (ie. when we don't
>> tell the SMC that UVD is active).

IIRC there were two possibilities.

Either you let the SMC handle the clocks in which case it would lower the GFX clock in favor of stable UVD clocks.

Or the KMD would lock the SMC to the highest level and then program the UVD clocks manually.

The later was not really validated but requested by a lot of people because otherwise you got a GFX performance reduction whenever you used UVD.

>>
>> 2. When setting a power state that enables UVD, the UVD clock is
>> enabled either before or after the engine clock by si_dpm. This is done
>> so in both radeon and amdgpu, see:
>> si_dpm_set_power_state
>> ni_set_uvd_clock_before_set_eng_clock
>> ni_set_uvd_clock_after_set_eng_clock
>>
>> The specific sequence in which the UVD clock is enabled by
>> si_dpm_set_power_state leads me to the conclusion that
>> amdgpu_asic_set_uvd_clocks should not be directly called on SI outside
>> of the DPM code.
>>
>> Please correct me if I misunderstood the code.

That sounds correct to me.

> 
> Yeah, I don't remember the clock dependencies.  I thought that you
> should be able to program the UVD PLLs any time you wanted and the
> ordering only mattered when you were also changing the sclk.
> Programming the PLLs directly works as is in radeon, but I guess maybe
> we init DPM in a different order in radeon vs amdgpu.
> 
> It would also probably be a good idea to disable the UVD clocks again
> after IP init to save power. E.g., something like:
> 
>        if (adev->pm.dpm_enabled)
>                amdgpu_dpm_enable_uvd(adev, false);
>        else
>                amdgpu_asic_set_uvd_clocks(adev, 0, 0);

IIRC we always disabled the PLL manually when UVD was unused because the SMC sometimes failed to do this.

Regards,
Christian.

> 
> Alex
> 
> 
>>
>> Thanks,
>> Timur
>>
>>
>>>
>>>>
>>>> Fixes: b38f3e80ecec ("drm amdgpu: SI UVD v3_1 (v2)")
>>>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 9 ++++++---
>>>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
>>>> b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
>>>> index 5dbaebb592b3..9ad06c1e150d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
>>>> @@ -633,6 +633,12 @@ static int uvd_v3_1_hw_init(struct
>>>> amdgpu_ip_block *ip_block)
>>>>         int r;
>>>>
>>>>         uvd_v3_1_mc_resume(adev);
>>>> +       uvd_v3_1_enable_mgcg(adev, true);
>>>> +
>>>> +       if (adev->pm.dpm_enabled)
>>>> +               amdgpu_dpm_enable_uvd(adev, true);
>>>> +       else
>>>> +               amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
>>>>
>>>>         r = uvd_v3_1_fw_validate(adev);
>>>>         if (r) {
>>>> @@ -640,9 +646,6 @@ static int uvd_v3_1_hw_init(struct
>>>> amdgpu_ip_block *ip_block)
>>>>                 return r;
>>>>         }
>>>>
>>>> -       uvd_v3_1_enable_mgcg(adev, true);
>>>> -       amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
>>>> -
>>>>         uvd_v3_1_start(adev);
>>>>
>>>>         r = amdgpu_ring_test_helper(ring);
>>>> --
>>>> 2.50.1
>>>>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-04 18:59         ` Christian König
@ 2025-08-06  0:35           ` Timur Kristóf
  2025-08-06 12:27             ` Christian König
  0 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-06  0:35 UTC (permalink / raw)
  To: Christian König, Alex Deucher; +Cc: Leo Liu, amd-gfx

On Mon, 2025-08-04 at 20:59 +0200, Christian König wrote:
> On 04.08.25 19:45, Alex Deucher wrote:
> > On Mon, Aug 4, 2025 at 12:00 PM Timur Kristóf
> > <timur.kristof@gmail.com> wrote:
> > > 
> > > On Mon, 2025-08-04 at 11:20 -0400, Alex Deucher wrote:
> > > > On Mon, Aug 4, 2025 at 9:58 AM Timur Kristóf
> > > > <timur.kristof@gmail.com> wrote:
> > > > > 
> > > > > Unlike later versions, UVD 3 has firmware validation.
> > > > > For this to work, the UVD should be powered up correctly.
> > > > > 
> > > > > When DPM is enabled and the display clock is off,
> > > > > the SMU may choose a power state which doesn't power
> > > > > the UVD, which can result in failure to initialize UVD.
> > > > 
> > > > + Christian, Leo
> > > > 
> > > > That doesn't seem right to me.  IIRC, the driver always set the
> > > > UVD
> > > > PLL directly on SI and I don't think SI supported any kind of
> > > > UVD
> > > > power gating. I guess it's probably some sort of subtle
> > > > sequencing
> > > > difference between radeon and amdgpu.  Unless Christian or Leo
> > > > have
> > > > any ideas, I think the patch is probably fine.
> 
> Oh my, that stuff was last at the front of my head a long long time
> ago.

Thanks for taking the time to reply anyway!

> 
> > > > 
> > > > Alex
> > > 
> > > Hi,
> > > 
> > > These are my observations about how the UVD clock works on SI:
> > > 
> > > 1. It seems that the SMC needs to know whether UVD is enabled or
> > > not,
> > > and the UVD clocks are included as part of the power states. See:
> > > si_convert_power_state_to_smc
> > > si_convert_power_level_to_smc
> 
> Correct, yes. The design was that either the KMD or the SMC could
> program the PLLs.
> 
> > > 
> > > On SI the default power state doesn't set the UVD clocks,
> > > and SI has a specific power state to be used with UVD. Actually
> > > amdgpu_dpm_enable_uvd has a special case code path for SI, where
> > > it
> > > sets this power state. If I print the power states from
> > > si_parse_power_table, it indeed confirms that there is only one
> > > power
> > > state that has non-zero UVD clocks, and the rest of them just
> > > have the
> > > UVD clocks at zero.
> > > 
> > > It's unclear to me what happens if we try to enable UVD clocks
> > > when we
> > > selected a power state that doesn't include them (ie. when we
> > > don't
> > > tell the SMC that UVD is active).
> 
> IIRC there were two possibilities.
> 
> Either you let the SMC handle the clocks in which case it would lower
> the GFX clock in favor of stable UVD clocks.
> 
> Or the KMD would lock the SMC to the highest level and then program
> the UVD clocks manually.

As far as I see the si_dpm code does a mixture of the above two.
When UVD is enabled, it selects the VBIOS-provided UVD power state and
then it manually enables the UVD clocks to the value provided by the
VBIOS.

When the UVD ring is not used anymore, it then shuts the UVD clock down
manually.

(I assume then it goes back to a normal power state but I haven't
actually verified that.)

> 
> The later was not really validated but requested by a lot of people
> because otherwise you got a GFX performance reduction whenever you
> used UVD.

Yes, the UVD power state from the VBIOS indeed has lower shader clocks
compared to the normal power state.

> 
> > > 
> > > 2. When setting a power state that enables UVD, the UVD clock is
> > > enabled either before or after the engine clock by si_dpm. This
> > > is done
> > > so in both radeon and amdgpu, see:
> > > si_dpm_set_power_state
> > > ni_set_uvd_clock_before_set_eng_clock
> > > ni_set_uvd_clock_after_set_eng_clock
> > > 
> > > The specific sequence in which the UVD clock is enabled by
> > > si_dpm_set_power_state leads me to the conclusion that
> > > amdgpu_asic_set_uvd_clocks should not be directly called on SI
> > > outside
> > > of the DPM code.
> > > 
> > > Please correct me if I misunderstood the code.
> 
> That sounds correct to me.

Thanks!

Sounds like the patch is correct, then.

> 
> > 
> > Yeah, I don't remember the clock dependencies.  I thought that you
> > should be able to program the UVD PLLs any time you wanted and the
> > ordering only mattered when you were also changing the sclk.
> > Programming the PLLs directly works as is in radeon, but I guess
> > maybe
> > we init DPM in a different order in radeon vs amdgpu.
> > 
> > It would also probably be a good idea to disable the UVD clocks
> > again
> > after IP init to save power. E.g., something like:
> > 
> >        if (adev->pm.dpm_enabled)
> >                amdgpu_dpm_enable_uvd(adev, false);
> >        else
> >                amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> 
> IIRC we always disabled the PLL manually when UVD was unused because
> the SMC sometimes failed to do this.


Yes, as I mentioned in my previous mail the PM code does that already
when the UVD ring is not in use anymore. So it's not necessary to add
any code to shut it down.

Maybe I should edit the commit to explain that in a comment?

Thanks,
Timur

> 
> Regards,
> Christian.
> 
> > 
> > Alex
> > 
> > 
> > > 
> > > Thanks,
> > > Timur
> > > 
> > > 
> > > > 
> > > > > 
> > > > > Fixes: b38f3e80ecec ("drm amdgpu: SI UVD v3_1 (v2)")
> > > > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > > > ---
> > > > >  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 9 ++++++---
> > > > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > > > b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > > > index 5dbaebb592b3..9ad06c1e150d 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
> > > > > @@ -633,6 +633,12 @@ static int uvd_v3_1_hw_init(struct
> > > > > amdgpu_ip_block *ip_block)
> > > > >         int r;
> > > > > 
> > > > >         uvd_v3_1_mc_resume(adev);
> > > > > +       uvd_v3_1_enable_mgcg(adev, true);
> > > > > +
> > > > > +       if (adev->pm.dpm_enabled)
> > > > > +               amdgpu_dpm_enable_uvd(adev, true);
> > > > > +       else
> > > > > +               amdgpu_asic_set_uvd_clocks(adev, 53300,
> > > > > 40000);
> > > > > 
> > > > >         r = uvd_v3_1_fw_validate(adev);
> > > > >         if (r) {
> > > > > @@ -640,9 +646,6 @@ static int uvd_v3_1_hw_init(struct
> > > > > amdgpu_ip_block *ip_block)
> > > > >                 return r;
> > > > >         }
> > > > > 
> > > > > -       uvd_v3_1_enable_mgcg(adev, true);
> > > > > -       amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
> > > > > -
> > > > >         uvd_v3_1_start(adev);
> > > > > 
> > > > >         r = amdgpu_ring_test_helper(ring);
> > > > > --
> > > > > 2.50.1
> > > > > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-06  0:35           ` Timur Kristóf
@ 2025-08-06 12:27             ` Christian König
  2025-08-08  8:23               ` Timur Kristóf
  0 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2025-08-06 12:27 UTC (permalink / raw)
  To: Timur Kristóf, Alex Deucher; +Cc: Leo Liu, amd-gfx

On 06.08.25 02:35, Timur Kristóf wrote:
>>
>>>>>
>>>>> Alex
>>>>
>>>> Hi,
>>>>
>>>> These are my observations about how the UVD clock works on SI:
>>>>
>>>> 1. It seems that the SMC needs to know whether UVD is enabled or
>>>> not,
>>>> and the UVD clocks are included as part of the power states. See:
>>>> si_convert_power_state_to_smc
>>>> si_convert_power_level_to_smc
>>
>> Correct, yes. The design was that either the KMD or the SMC could
>> program the PLLs.
>>
>>>>
>>>> On SI the default power state doesn't set the UVD clocks,
>>>> and SI has a specific power state to be used with UVD. Actually
>>>> amdgpu_dpm_enable_uvd has a special case code path for SI, where
>>>> it
>>>> sets this power state. If I print the power states from
>>>> si_parse_power_table, it indeed confirms that there is only one
>>>> power
>>>> state that has non-zero UVD clocks, and the rest of them just
>>>> have the
>>>> UVD clocks at zero.
>>>>
>>>> It's unclear to me what happens if we try to enable UVD clocks
>>>> when we
>>>> selected a power state that doesn't include them (ie. when we
>>>> don't
>>>> tell the SMC that UVD is active).
>>
>> IIRC there were two possibilities.
>>
>> Either you let the SMC handle the clocks in which case it would lower
>> the GFX clock in favor of stable UVD clocks.
>>
>> Or the KMD would lock the SMC to the highest level and then program
>> the UVD clocks manually.
> 
> As far as I see the si_dpm code does a mixture of the above two.
> When UVD is enabled, it selects the VBIOS-provided UVD power state and
> then it manually enables the UVD clocks to the value provided by the
> VBIOS.
> 
> When the UVD ring is not used anymore, it then shuts the UVD clock down
> manually.
> 
> (I assume then it goes back to a normal power state but I haven't
> actually verified that.)
> 
>>
>> The later was not really validated but requested by a lot of people
>> because otherwise you got a GFX performance reduction whenever you
>> used UVD.
> 
> Yes, the UVD power state from the VBIOS indeed has lower shader clocks
> compared to the normal power state.
> 
>>
>>>>
>>>> 2. When setting a power state that enables UVD, the UVD clock is
>>>> enabled either before or after the engine clock by si_dpm. This
>>>> is done
>>>> so in both radeon and amdgpu, see:
>>>> si_dpm_set_power_state
>>>> ni_set_uvd_clock_before_set_eng_clock
>>>> ni_set_uvd_clock_after_set_eng_clock
>>>>
>>>> The specific sequence in which the UVD clock is enabled by
>>>> si_dpm_set_power_state leads me to the conclusion that
>>>> amdgpu_asic_set_uvd_clocks should not be directly called on SI
>>>> outside
>>>> of the DPM code.
>>>>
>>>> Please correct me if I misunderstood the code.
>>
>> That sounds correct to me.
> 
> Thanks!
> 
> Sounds like the patch is correct, then.

Most likely yes.

> 
>>
>>>
>>> Yeah, I don't remember the clock dependencies.  I thought that you
>>> should be able to program the UVD PLLs any time you wanted and the
>>> ordering only mattered when you were also changing the sclk.
>>> Programming the PLLs directly works as is in radeon, but I guess
>>> maybe
>>> we init DPM in a different order in radeon vs amdgpu.
>>>
>>> It would also probably be a good idea to disable the UVD clocks
>>> again
>>> after IP init to save power. E.g., something like:
>>>
>>>        if (adev->pm.dpm_enabled)
>>>                amdgpu_dpm_enable_uvd(adev, false);
>>>        else
>>>                amdgpu_asic_set_uvd_clocks(adev, 0, 0);
>>
>> IIRC we always disabled the PLL manually when UVD was unused because
>> the SMC sometimes failed to do this.
> 
> 
> Yes, as I mentioned in my previous mail the PM code does that already
> when the UVD ring is not in use anymore. So it's not necessary to add
> any code to shut it down.
> 
> Maybe I should edit the commit to explain that in a comment?

Code comment please!

That's basically the only chance we have to keep the knowledge why we did something the way we do it for the old HW generations around.

Regards,
Christian.

> 
> Thanks,
> Timur
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data
  2025-08-04 17:31       ` Alex Deucher
@ 2025-08-08  8:22         ` Timur Kristóf
  2025-08-08 19:04           ` Alex Deucher
  0 siblings, 1 reply; 28+ messages in thread
From: Timur Kristóf @ 2025-08-08  8:22 UTC (permalink / raw)
  To: Alex Deucher; +Cc: amd-gfx

On Mon, 2025-08-04 at 13:31 -0400, Alex Deucher wrote:
> On Mon, Aug 4, 2025 at 12:16 PM Timur Kristóf
> <timur.kristof@gmail.com> wrote:
> > 
> > On Mon, 2025-08-04 at 11:32 -0400, Alex Deucher wrote:
> > > On Mon, Aug 4, 2025 at 9:42 AM Timur Kristóf
> > > <timur.kristof@gmail.com> wrote:
> > > > 
> > > > The si_upload_smc_data function uses si_write_smc_soft_register
> > > > to set some register values in the SMC, and expects the result
> > > > to be PPSMC_Result_OK which is 1.
> > > > 
> > > > The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
> > > > checking the result of a command sent to the SMC.
> > > > 
> > > > However, the si_write_smc_soft_register actually doesn't send
> > > > any commands to the SMC and returns zero on success,
> > > > so this check was incorrect.
> > > > 
> > > > Fix that by correctly interpreting zero as success.
> > > > This seems to fix an SMC hang that happens in si_set_sw_state.
> > > > 
> > > > Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> > > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 31 +++++++++++++-
> > > > ----
> > > > ----
> > > >  1 file changed, 19 insertions(+), 12 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > index 33b9d4beec84..e9f034ade214 100644
> > > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > @@ -5820,6 +5820,7 @@ static int si_upload_smc_data(struct
> > > > amdgpu_device *adev)
> > > >  {
> > > >         struct amdgpu_crtc *amdgpu_crtc = NULL;
> > > >         int i;
> > > > +       int ret;
> > > > 
> > > >         if (adev->pm.dpm.new_active_crtc_count == 0)
> > > >                 return 0;
> > > > @@ -5837,20 +5838,26 @@ static int si_upload_smc_data(struct
> > > > amdgpu_device *adev)
> > > >         if (amdgpu_crtc->line_time <= 0)
> > > >                 return 0;
> > > > 
> > > > -       if (si_write_smc_soft_register(adev,
> > > > -
> > > > SI_SMC_SOFT_REGISTER_crtc_index,
> > > > -                                      amdgpu_crtc->crtc_id) !=
> > > > PPSMC_Result_OK)
> > > > -               return 0;
> > > > +       ret = si_write_smc_soft_register(
> > > > +               adev,
> > > > +               SI_SMC_SOFT_REGISTER_crtc_index,
> > > > +               amdgpu_crtc->crtc_id);
> > > > +       if (ret)
> > > > +               return ret;
> > > > 
> > > > -       if (si_write_smc_soft_register(adev,
> > > > -
> > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > > -                                      amdgpu_crtc->wm_high /
> > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > -               return 0;
> > > > +       ret = si_write_smc_soft_register(
> > > > +               adev,
> > > > +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > > +               amdgpu_crtc->wm_high / amdgpu_crtc->line_time);
> > > > +       if (ret)
> > > > +               return ret;
> > > > 
> > > > -       if (si_write_smc_soft_register(adev,
> > > > -
> > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > > -                                      amdgpu_crtc->wm_low /
> > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > -               return 0;
> > > > +       ret = si_write_smc_soft_register(
> > > > +               adev,
> > > > +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > > +               amdgpu_crtc->wm_low / amdgpu_crtc->line_time);
> > > > +       if (ret)
> > > > +               return ret;
> > > 
> > > This patch changes the behavior of this function (i.e., it always
> > > returns 0 before this patch).  Not sure if that matters or not. 
> > > I
> > > think this could be simplified to something like the following to
> > > retain the current behavior.
> > 
> > Actually now that I think of it more, I think it may be entirely
> > unnecessary to check the return value.
> > 
> > si_upload_smc_data calls:
> > si_write_smc_soft_register
> > amdgpu_si_write_smc_sram_dword
> > si_set_smc_sram_address
> > 
> > This last one, si_set_smc_sram_address returns -EINVAL when its
> > smc_address parameter is not dword-aligned or out of bounds.
> > Otherwise
> > all of the above functions return 0 (success). Considering that all
> > of
> > the addresses passed by si_upload_smc_data are compile time
> > constants,
> > we know they are correct so there is no reason why any of those
> > functions would return an error.
> > 
> > Looking at other callers of si_write_smc_soft_register, I see that
> > they
> > don't check the return value at all.
> > 
> > So, I'd actually simplify this even more and just not check the
> > return
> > values. What do you think about that?
> 
> Sure.  Works for me.
> 
> Alex

Alex, before I send a new version of this series, can you please
clarify what these registers are and verify that the actual programming
of these SMC registers is correct?

The reason I ask is because due the the bug being fixed by these patch,
these registers were never actually written, which makes me wonder if
the value we program them to is actually correct.

I mean the values that we program these registers to:

SI_SMC_SOFT_REGISTER_crtc_index - we just program the index of the
first active CRTC, seems straightforward enough, but it's unclear what
the SMC uses this for. Why does the SMC care which crtc we use?

SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min - programmed to the high
display watermark divided by the line time. But I can't find any
information about what this information represents or what the SMC uses
it for. Judging by the name it has to do with mclk switching?

SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max - same concern as _min.

Thanks,
Timur


> 
> > 
> > 
> > > 
> > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > index 52e732be59e36..3dd0115aa15f8 100644
> > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > @@ -5836,17 +5836,17 @@ static int si_upload_smc_data(struct
> > > amdgpu_device *adev)
> > > 
> > >         if (si_write_smc_soft_register(adev,
> > > 
> > > SI_SMC_SOFT_REGISTER_crtc_index,
> > > -                                      amdgpu_crtc->crtc_id) !=
> > > PPSMC_Result_OK)
> > > +                                      amdgpu_crtc->crtc_id))
> > >                 return 0;
> > > 
> > >         if (si_write_smc_soft_register(adev,
> > > 
> > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > -                                      amdgpu_crtc->wm_high /
> > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > +                                      amdgpu_crtc->wm_high /
> > > amdgpu_crtc->line_time))
> > >                 return 0;
> > > 
> > >         if (si_write_smc_soft_register(adev,
> > > 
> > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > -                                      amdgpu_crtc->wm_low /
> > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > +                                      amdgpu_crtc->wm_low /
> > > amdgpu_crtc->line_time))
> > >                 return 0;
> > > 
> > >         return 0;
> > > 
> > > 
> > > > 
> > > >         return 0;
> > > >  }
> > > > --
> > > > 2.50.1
> > > > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation
  2025-08-06 12:27             ` Christian König
@ 2025-08-08  8:23               ` Timur Kristóf
  0 siblings, 0 replies; 28+ messages in thread
From: Timur Kristóf @ 2025-08-08  8:23 UTC (permalink / raw)
  To: Christian König, Alex Deucher; +Cc: Leo Liu, amd-gfx

On Wed, 2025-08-06 at 14:27 +0200, Christian König wrote:
> On 06.08.25 02:35, Timur Kristóf wrote:
> > > 
> > > > > > 
> > > > > > Alex
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > These are my observations about how the UVD clock works on
> > > > > SI:
> > > > > 
> > > > > 1. It seems that the SMC needs to know whether UVD is enabled
> > > > > or
> > > > > not,
> > > > > and the UVD clocks are included as part of the power states.
> > > > > See:
> > > > > si_convert_power_state_to_smc
> > > > > si_convert_power_level_to_smc
> > > 
> > > Correct, yes. The design was that either the KMD or the SMC could
> > > program the PLLs.
> > > 
> > > > > 
> > > > > On SI the default power state doesn't set the UVD clocks,
> > > > > and SI has a specific power state to be used with UVD.
> > > > > Actually
> > > > > amdgpu_dpm_enable_uvd has a special case code path for SI,
> > > > > where
> > > > > it
> > > > > sets this power state. If I print the power states from
> > > > > si_parse_power_table, it indeed confirms that there is only
> > > > > one
> > > > > power
> > > > > state that has non-zero UVD clocks, and the rest of them just
> > > > > have the
> > > > > UVD clocks at zero.
> > > > > 
> > > > > It's unclear to me what happens if we try to enable UVD
> > > > > clocks
> > > > > when we
> > > > > selected a power state that doesn't include them (ie. when we
> > > > > don't
> > > > > tell the SMC that UVD is active).
> > > 
> > > IIRC there were two possibilities.
> > > 
> > > Either you let the SMC handle the clocks in which case it would
> > > lower
> > > the GFX clock in favor of stable UVD clocks.
> > > 
> > > Or the KMD would lock the SMC to the highest level and then
> > > program
> > > the UVD clocks manually.
> > 
> > As far as I see the si_dpm code does a mixture of the above two.
> > When UVD is enabled, it selects the VBIOS-provided UVD power state
> > and
> > then it manually enables the UVD clocks to the value provided by
> > the
> > VBIOS.
> > 
> > When the UVD ring is not used anymore, it then shuts the UVD clock
> > down
> > manually.
> > 
> > (I assume then it goes back to a normal power state but I haven't
> > actually verified that.)
> > 
> > > 
> > > The later was not really validated but requested by a lot of
> > > people
> > > because otherwise you got a GFX performance reduction whenever
> > > you
> > > used UVD.
> > 
> > Yes, the UVD power state from the VBIOS indeed has lower shader
> > clocks
> > compared to the normal power state.
> > 
> > > 
> > > > > 
> > > > > 2. When setting a power state that enables UVD, the UVD clock
> > > > > is
> > > > > enabled either before or after the engine clock by si_dpm.
> > > > > This
> > > > > is done
> > > > > so in both radeon and amdgpu, see:
> > > > > si_dpm_set_power_state
> > > > > ni_set_uvd_clock_before_set_eng_clock
> > > > > ni_set_uvd_clock_after_set_eng_clock
> > > > > 
> > > > > The specific sequence in which the UVD clock is enabled by
> > > > > si_dpm_set_power_state leads me to the conclusion that
> > > > > amdgpu_asic_set_uvd_clocks should not be directly called on
> > > > > SI
> > > > > outside
> > > > > of the DPM code.
> > > > > 
> > > > > Please correct me if I misunderstood the code.
> > > 
> > > That sounds correct to me.
> > 
> > Thanks!
> > 
> > Sounds like the patch is correct, then.
> 
> Most likely yes.
> 
> > 
> > > 
> > > > 
> > > > Yeah, I don't remember the clock dependencies.  I thought that
> > > > you
> > > > should be able to program the UVD PLLs any time you wanted and
> > > > the
> > > > ordering only mattered when you were also changing the sclk.
> > > > Programming the PLLs directly works as is in radeon, but I
> > > > guess
> > > > maybe
> > > > we init DPM in a different order in radeon vs amdgpu.
> > > > 
> > > > It would also probably be a good idea to disable the UVD clocks
> > > > again
> > > > after IP init to save power. E.g., something like:
> > > > 
> > > >        if (adev->pm.dpm_enabled)
> > > >                amdgpu_dpm_enable_uvd(adev, false);
> > > >        else
> > > >                amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > > 
> > > IIRC we always disabled the PLL manually when UVD was unused
> > > because
> > > the SMC sometimes failed to do this.
> > 
> > 
> > Yes, as I mentioned in my previous mail the PM code does that
> > already
> > when the UVD ring is not in use anymore. So it's not necessary to
> > add
> > any code to shut it down.
> > 
> > Maybe I should edit the commit to explain that in a comment?
> 
> Code comment please!
> 
> That's basically the only chance we have to keep the knowledge why we
> did something the way we do it for the old HW generations around.
> 
> Regards,
> Christian.

Hi Christian,

Sounds good. I will add a comment when I submit a next version of this
series.

> 
> > 
> > Thanks,
> > Timur

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data
  2025-08-08  8:22         ` Timur Kristóf
@ 2025-08-08 19:04           ` Alex Deucher
  2025-08-09 19:56             ` Timur Kristóf
  0 siblings, 1 reply; 28+ messages in thread
From: Alex Deucher @ 2025-08-08 19:04 UTC (permalink / raw)
  To: Timur Kristóf; +Cc: amd-gfx

On Fri, Aug 8, 2025 at 4:22 AM Timur Kristóf <timur.kristof@gmail.com> wrote:
>
> On Mon, 2025-08-04 at 13:31 -0400, Alex Deucher wrote:
> > On Mon, Aug 4, 2025 at 12:16 PM Timur Kristóf
> > <timur.kristof@gmail.com> wrote:
> > >
> > > On Mon, 2025-08-04 at 11:32 -0400, Alex Deucher wrote:
> > > > On Mon, Aug 4, 2025 at 9:42 AM Timur Kristóf
> > > > <timur.kristof@gmail.com> wrote:
> > > > >
> > > > > The si_upload_smc_data function uses si_write_smc_soft_register
> > > > > to set some register values in the SMC, and expects the result
> > > > > to be PPSMC_Result_OK which is 1.
> > > > >
> > > > > The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
> > > > > checking the result of a command sent to the SMC.
> > > > >
> > > > > However, the si_write_smc_soft_register actually doesn't send
> > > > > any commands to the SMC and returns zero on success,
> > > > > so this check was incorrect.
> > > > >
> > > > > Fix that by correctly interpreting zero as success.
> > > > > This seems to fix an SMC hang that happens in si_set_sw_state.
> > > > >
> > > > > Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> > > > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > > > ---
> > > > >  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 31 +++++++++++++-
> > > > > ----
> > > > > ----
> > > > >  1 file changed, 19 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > index 33b9d4beec84..e9f034ade214 100644
> > > > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > @@ -5820,6 +5820,7 @@ static int si_upload_smc_data(struct
> > > > > amdgpu_device *adev)
> > > > >  {
> > > > >         struct amdgpu_crtc *amdgpu_crtc = NULL;
> > > > >         int i;
> > > > > +       int ret;
> > > > >
> > > > >         if (adev->pm.dpm.new_active_crtc_count == 0)
> > > > >                 return 0;
> > > > > @@ -5837,20 +5838,26 @@ static int si_upload_smc_data(struct
> > > > > amdgpu_device *adev)
> > > > >         if (amdgpu_crtc->line_time <= 0)
> > > > >                 return 0;
> > > > >
> > > > > -       if (si_write_smc_soft_register(adev,
> > > > > -
> > > > > SI_SMC_SOFT_REGISTER_crtc_index,
> > > > > -                                      amdgpu_crtc->crtc_id) !=
> > > > > PPSMC_Result_OK)
> > > > > -               return 0;
> > > > > +       ret = si_write_smc_soft_register(
> > > > > +               adev,
> > > > > +               SI_SMC_SOFT_REGISTER_crtc_index,
> > > > > +               amdgpu_crtc->crtc_id);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > >
> > > > > -       if (si_write_smc_soft_register(adev,
> > > > > -
> > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > > > -                                      amdgpu_crtc->wm_high /
> > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > > -               return 0;
> > > > > +       ret = si_write_smc_soft_register(
> > > > > +               adev,
> > > > > +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > > > +               amdgpu_crtc->wm_high / amdgpu_crtc->line_time);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > >
> > > > > -       if (si_write_smc_soft_register(adev,
> > > > > -
> > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > > > -                                      amdgpu_crtc->wm_low /
> > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > > -               return 0;
> > > > > +       ret = si_write_smc_soft_register(
> > > > > +               adev,
> > > > > +               SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > > > +               amdgpu_crtc->wm_low / amdgpu_crtc->line_time);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > >
> > > > This patch changes the behavior of this function (i.e., it always
> > > > returns 0 before this patch).  Not sure if that matters or not.
> > > > I
> > > > think this could be simplified to something like the following to
> > > > retain the current behavior.
> > >
> > > Actually now that I think of it more, I think it may be entirely
> > > unnecessary to check the return value.
> > >
> > > si_upload_smc_data calls:
> > > si_write_smc_soft_register
> > > amdgpu_si_write_smc_sram_dword
> > > si_set_smc_sram_address
> > >
> > > This last one, si_set_smc_sram_address returns -EINVAL when its
> > > smc_address parameter is not dword-aligned or out of bounds.
> > > Otherwise
> > > all of the above functions return 0 (success). Considering that all
> > > of
> > > the addresses passed by si_upload_smc_data are compile time
> > > constants,
> > > we know they are correct so there is no reason why any of those
> > > functions would return an error.
> > >
> > > Looking at other callers of si_write_smc_soft_register, I see that
> > > they
> > > don't check the return value at all.
> > >
> > > So, I'd actually simplify this even more and just not check the
> > > return
> > > values. What do you think about that?
> >
> > Sure.  Works for me.
> >
> > Alex
>
> Alex, before I send a new version of this series, can you please
> clarify what these registers are and verify that the actual programming
> of these SMC registers is correct?

This code was based on what the windows code did.

>
> The reason I ask is because due the the bug being fixed by these patch,
> these registers were never actually written, which makes me wonder if
> the value we program them to is actually correct.
>
> I mean the values that we program these registers to:
>
> SI_SMC_SOFT_REGISTER_crtc_index - we just program the index of the
> first active CRTC, seems straightforward enough, but it's unclear what
> the SMC uses this for. Why does the SMC care which crtc we use?
>
> SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min - programmed to the high
> display watermark divided by the line time. But I can't find any
> information about what this information represents or what the SMC uses
> it for. Judging by the name it has to do with mclk switching?
>
> SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max - same concern as _min.

For mclk switching, the mclk has to be changed during the display
blanking period to avoid display artifacts.  This is presumably part
of that, but I don't remember exactly what all of these do anymore.

Alex

>
> Thanks,
> Timur
>
>
> >
> > >
> > >
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > index 52e732be59e36..3dd0115aa15f8 100644
> > > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > @@ -5836,17 +5836,17 @@ static int si_upload_smc_data(struct
> > > > amdgpu_device *adev)
> > > >
> > > >         if (si_write_smc_soft_register(adev,
> > > >
> > > > SI_SMC_SOFT_REGISTER_crtc_index,
> > > > -                                      amdgpu_crtc->crtc_id) !=
> > > > PPSMC_Result_OK)
> > > > +                                      amdgpu_crtc->crtc_id))
> > > >                 return 0;
> > > >
> > > >         if (si_write_smc_soft_register(adev,
> > > >
> > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > > -                                      amdgpu_crtc->wm_high /
> > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > +                                      amdgpu_crtc->wm_high /
> > > > amdgpu_crtc->line_time))
> > > >                 return 0;
> > > >
> > > >         if (si_write_smc_soft_register(adev,
> > > >
> > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > > -                                      amdgpu_crtc->wm_low /
> > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > +                                      amdgpu_crtc->wm_low /
> > > > amdgpu_crtc->line_time))
> > > >                 return 0;
> > > >
> > > >         return 0;
> > > >
> > > >
> > > > >
> > > > >         return 0;
> > > > >  }
> > > > > --
> > > > > 2.50.1
> > > > >

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data
  2025-08-08 19:04           ` Alex Deucher
@ 2025-08-09 19:56             ` Timur Kristóf
  0 siblings, 0 replies; 28+ messages in thread
From: Timur Kristóf @ 2025-08-09 19:56 UTC (permalink / raw)
  To: Alex Deucher; +Cc: amd-gfx

On Fri, 2025-08-08 at 15:04 -0400, Alex Deucher wrote:
> On Fri, Aug 8, 2025 at 4:22 AM Timur Kristóf
> <timur.kristof@gmail.com> wrote:
> > 
> > On Mon, 2025-08-04 at 13:31 -0400, Alex Deucher wrote:
> > > On Mon, Aug 4, 2025 at 12:16 PM Timur Kristóf
> > > <timur.kristof@gmail.com> wrote:
> > > > 
> > > > On Mon, 2025-08-04 at 11:32 -0400, Alex Deucher wrote:
> > > > > On Mon, Aug 4, 2025 at 9:42 AM Timur Kristóf
> > > > > <timur.kristof@gmail.com> wrote:
> > > > > > 
> > > > > > The si_upload_smc_data function uses
> > > > > > si_write_smc_soft_register
> > > > > > to set some register values in the SMC, and expects the
> > > > > > result
> > > > > > to be PPSMC_Result_OK which is 1.
> > > > > > 
> > > > > > The PPSMC_Result_OK / PPSMC_Result_Failed values are used
> > > > > > for
> > > > > > checking the result of a command sent to the SMC.
> > > > > > 
> > > > > > However, the si_write_smc_soft_register actually doesn't
> > > > > > send
> > > > > > any commands to the SMC and returns zero on success,
> > > > > > so this check was incorrect.
> > > > > > 
> > > > > > Fix that by correctly interpreting zero as success.
> > > > > > This seems to fix an SMC hang that happens in
> > > > > > si_set_sw_state.
> > > > > > 
> > > > > > Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
> > > > > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 31
> > > > > > +++++++++++++-
> > > > > > ----
> > > > > > ----
> > > > > >  1 file changed, 19 insertions(+), 12 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > > index 33b9d4beec84..e9f034ade214 100644
> > > > > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > > @@ -5820,6 +5820,7 @@ static int si_upload_smc_data(struct
> > > > > > amdgpu_device *adev)
> > > > > >  {
> > > > > >         struct amdgpu_crtc *amdgpu_crtc = NULL;
> > > > > >         int i;
> > > > > > +       int ret;
> > > > > > 
> > > > > >         if (adev->pm.dpm.new_active_crtc_count == 0)
> > > > > >                 return 0;
> > > > > > @@ -5837,20 +5838,26 @@ static int
> > > > > > si_upload_smc_data(struct
> > > > > > amdgpu_device *adev)
> > > > > >         if (amdgpu_crtc->line_time <= 0)
> > > > > >                 return 0;
> > > > > > 
> > > > > > -       if (si_write_smc_soft_register(adev,
> > > > > > -
> > > > > > SI_SMC_SOFT_REGISTER_crtc_index,
> > > > > > -                                      amdgpu_crtc-
> > > > > > >crtc_id) !=
> > > > > > PPSMC_Result_OK)
> > > > > > -               return 0;
> > > > > > +       ret = si_write_smc_soft_register(
> > > > > > +               adev,
> > > > > > +               SI_SMC_SOFT_REGISTER_crtc_index,
> > > > > > +               amdgpu_crtc->crtc_id);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > 
> > > > > > -       if (si_write_smc_soft_register(adev,
> > > > > > -
> > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > > > > -                                      amdgpu_crtc->wm_high
> > > > > > /
> > > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > > > -               return 0;
> > > > > > +       ret = si_write_smc_soft_register(
> > > > > > +               adev,
> > > > > > +              
> > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > > > > +               amdgpu_crtc->wm_high / amdgpu_crtc-
> > > > > > >line_time);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > 
> > > > > > -       if (si_write_smc_soft_register(adev,
> > > > > > -
> > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > > > > -                                      amdgpu_crtc->wm_low
> > > > > > /
> > > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > > > -               return 0;
> > > > > > +       ret = si_write_smc_soft_register(
> > > > > > +               adev,
> > > > > > +              
> > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > > > > +               amdgpu_crtc->wm_low / amdgpu_crtc-
> > > > > > >line_time);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > 
> > > > > This patch changes the behavior of this function (i.e., it
> > > > > always
> > > > > returns 0 before this patch).  Not sure if that matters or
> > > > > not.
> > > > > I
> > > > > think this could be simplified to something like the
> > > > > following to
> > > > > retain the current behavior.
> > > > 
> > > > Actually now that I think of it more, I think it may be
> > > > entirely
> > > > unnecessary to check the return value.
> > > > 
> > > > si_upload_smc_data calls:
> > > > si_write_smc_soft_register
> > > > amdgpu_si_write_smc_sram_dword
> > > > si_set_smc_sram_address
> > > > 
> > > > This last one, si_set_smc_sram_address returns -EINVAL when its
> > > > smc_address parameter is not dword-aligned or out of bounds.
> > > > Otherwise
> > > > all of the above functions return 0 (success). Considering that
> > > > all
> > > > of
> > > > the addresses passed by si_upload_smc_data are compile time
> > > > constants,
> > > > we know they are correct so there is no reason why any of those
> > > > functions would return an error.
> > > > 
> > > > Looking at other callers of si_write_smc_soft_register, I see
> > > > that
> > > > they
> > > > don't check the return value at all.
> > > > 
> > > > So, I'd actually simplify this even more and just not check the
> > > > return
> > > > values. What do you think about that?
> > > 
> > > Sure.  Works for me.
> > > 
> > > Alex
> > 
> > Alex, before I send a new version of this series, can you please
> > clarify what these registers are and verify that the actual
> > programming
> > of these SMC registers is correct?
> 
> This code was based on what the windows code did.

Makes sense. Let's assume that it's correct then.

> 
> > 
> > The reason I ask is because due the the bug being fixed by these
> > patch,
> > these registers were never actually written, which makes me wonder
> > if
> > the value we program them to is actually correct.
> > 
> > I mean the values that we program these registers to:
> > 
> > SI_SMC_SOFT_REGISTER_crtc_index - we just program the index of the
> > first active CRTC, seems straightforward enough, but it's unclear
> > what
> > the SMC uses this for. Why does the SMC care which crtc we use?
> > 
> > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min - programmed to the
> > high
> > display watermark divided by the line time. But I can't find any
> > information about what this information represents or what the SMC
> > uses
> > it for. Judging by the name it has to do with mclk switching?
> > 
> > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max - same concern as
> > _min.
> 
> For mclk switching, the mclk has to be changed during the display
> blanking period to avoid display artifacts.  This is presumably part
> of that, but I don't remember exactly what all of these do anymore.
> 
> Alex

Thank you. I will make the changes that we discussed above and send a
second version of this series next week.

> 
> > 
> > Thanks,
> > Timur
> > 
> > 
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > index 52e732be59e36..3dd0115aa15f8 100644
> > > > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > > > > @@ -5836,17 +5836,17 @@ static int si_upload_smc_data(struct
> > > > > amdgpu_device *adev)
> > > > > 
> > > > >         if (si_write_smc_soft_register(adev,
> > > > > 
> > > > > SI_SMC_SOFT_REGISTER_crtc_index,
> > > > > -                                      amdgpu_crtc->crtc_id)
> > > > > !=
> > > > > PPSMC_Result_OK)
> > > > > +                                      amdgpu_crtc->crtc_id))
> > > > >                 return 0;
> > > > > 
> > > > >         if (si_write_smc_soft_register(adev,
> > > > > 
> > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
> > > > > -                                      amdgpu_crtc->wm_high /
> > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > > +                                      amdgpu_crtc->wm_high /
> > > > > amdgpu_crtc->line_time))
> > > > >                 return 0;
> > > > > 
> > > > >         if (si_write_smc_soft_register(adev,
> > > > > 
> > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
> > > > > -                                      amdgpu_crtc->wm_low /
> > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK)
> > > > > +                                      amdgpu_crtc->wm_low /
> > > > > amdgpu_crtc->line_time))
> > > > >                 return 0;
> > > > > 
> > > > >         return 0;
> > > > > 
> > > > > 
> > > > > > 
> > > > > >         return 0;
> > > > > >  }
> > > > > > --
> > > > > > 2.50.1
> > > > > > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2025-08-09 19:56 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-04 13:41 [PATCH 0/6] SI power management fixes Timur Kristóf
2025-08-04 13:41 ` [PATCH 1/6] drm/amdgpu: Power up UVD 3 for FW validation Timur Kristóf
2025-08-04 15:20   ` Alex Deucher
2025-08-04 16:00     ` Timur Kristóf
2025-08-04 17:45       ` Alex Deucher
2025-08-04 18:51         ` Timur Kristóf
2025-08-04 18:59         ` Christian König
2025-08-06  0:35           ` Timur Kristóf
2025-08-06 12:27             ` Christian König
2025-08-08  8:23               ` Timur Kristóf
2025-08-04 13:41 ` [PATCH 2/6] drm/amd/pm: Disable ULV even if unsupported Timur Kristóf
2025-08-04 15:24   ` Alex Deucher
2025-08-04 16:04     ` Timur Kristóf
2025-08-04 17:33       ` Alex Deucher
2025-08-04 13:41 ` [PATCH 3/6] drm/radeon: " Timur Kristóf
2025-08-04 15:24   ` Alex Deucher
2025-08-04 16:09     ` Timur Kristóf
2025-08-04 17:34       ` Alex Deucher
2025-08-04 13:41 ` [PATCH 4/6] drm/amd/pm: Fix si_upload_smc_data Timur Kristóf
2025-08-04 15:32   ` Alex Deucher
2025-08-04 16:16     ` Timur Kristóf
2025-08-04 17:31       ` Alex Deucher
2025-08-08  8:22         ` Timur Kristóf
2025-08-08 19:04           ` Alex Deucher
2025-08-09 19:56             ` Timur Kristóf
2025-08-04 13:41 ` [PATCH 5/6] drm/radeon: " Timur Kristóf
2025-08-04 15:33   ` Alex Deucher
2025-08-04 13:41 ` [PATCH 6/6] drm/amd/pm: Fix SI DPM issues with high pixel clock Timur Kristóf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).