* Re: [RFC PATCH] drm/amd/display: fix bandwidth validation failure on DCN 2.1
[not found] <20231229163821.144599-1-mwen@igalia.com>
@ 2024-01-03 14:44 ` Hamza Mahfooz
2024-01-03 15:41 ` Greg KH
0 siblings, 1 reply; 2+ messages in thread
From: Hamza Mahfooz @ 2024-01-03 14:44 UTC (permalink / raw)
To: Melissa Wen, Harry Wentland, Alex Deucher, Rodrigo Siqueira,
sunpeng.li, christian.koenig, Xinhui.Pan, airlied, daniel,
qingqing.zhuo
Cc: Alex Hung, dri-devel, Daniel Wheeler, amd-gfx, kernel-dev,
stable@vger.kernel.org
On 12/29/23 11:25, Melissa Wen wrote:
> IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of
> the test execution, during atomic check, because DC rejects the
> bandwidth state for a fb sizing 64x64. The test was previously working
> with the deprecated dc_commit_state(). Now using
> dc_validate_with_context() approach, the atomic check needs to perform a
> full state validation. Therefore, set fast_validation to false in the
> dc_validate_global_state call for atomic check.
>
> Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams")
> Signed-off-by: Melissa Wen <mwen@igalia.com>
> ---
>
> Hi,
>
> It's a long story. I was inspecting this bug report:
> - https://gitlab.freedesktop.org/drm/amd/-/issues/2016
> and noticed the IGT test `igt@amdgpu/amd_color@crtc-lut-accuracy`
> mentioned there wasn't even being executed on a laptop with DCN 2.1
> (HP HP ENVY x360 Convertible 13-ay1xxx/8929). The test fails right at
> the beginning due to an atomic check rejection, as below:
>
> Starting subtest: crtc-lut-accuracy
> (amd_color:14772) igt_kms-CRITICAL: Test assertion failure function igt_display_commit_atomic, file ../lib/igt_kms.c:4530:
> (amd_color:14772) igt_kms-CRITICAL: Failed assertion: ret == 0
> (amd_color:14772) igt_kms-CRITICAL: Last errno: 22, Invalid argument
> (amd_color:14772) igt_kms-CRITICAL: error: -22 != 0
> Stack trace:
> #0 ../lib/igt_core.c:1989 __igt_fail_assert()
> #1 [igt_display_commit_atomic+0x44]
> #2 ../tests/amdgpu/amd_color.c:159 __igt_unique____real_main395()
> #3 ../tests/amdgpu/amd_color.c:395 main()
> #4 ../sysdeps/nptl/libc_start_call_main.h:74 __libc_start_call_main()
> #5 ../csu/libc-start.c:128 __libc_start_main@@GLIBC_2.34()
> #6 [_start+0x21]
> Subtest crtc-lut-accuracy failed.
>
> Checking dmesg, we can see that a bandwidth validation failure causes
> the atomic check rejection:
>
> [ 711.147663] amdgpu 0000:04:00.0: [drm] Mode Validation Warning: Unknown Status failed validation.
> [ 711.147667] [drm:amdgpu_dm_atomic_check [amdgpu]] DC global validation failure: Bandwidth validation failure (BW and Watermark) (13)
> [ 711.147772] [drm:amdgpu_irq_dispatch [amdgpu]] Unregistered interrupt src_id: 243 of client_id:8
> [ 711.148033] [drm:amdgpu_dm_atomic_check [amdgpu]] Atomic check failed with err: -22
>
> I also noticed that the atomic check doesn't fail if I change the fb
> width and height used in the test from 64 to 66, and I can get the test
> execution back (and with success). However, I recall that all test cases
> of IGT `amd_color` were working in the past, so I bisected and found the
> first bad commit:
>
> b8272241ff9d drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams
>
> Bringing the `dc_commit_state` machinery back also prevents the
> bandwidth validation failure, but the commit above says
> dc_commit_streams validation is more complete than dc_commit_state, so I
> discarded this approach.
>
> After some debugging and code inspection, I found out that avoiding fast
> validation on dc_validate_global_state during atomic check solves the
> issue, but I'm not sure if this change may affect performance. I
> compared exec time of some IGT tests and didn't see any differences, but
> I recognize it doesn't provide enough evidence.
>
> What do you think about this change? Should I examine other things? Do
> you see any potential issue that I should investigate? Could you
> recommend a better approach to assess any side-effect of not enabling
> fast validation in the atomic check?
>
> Please, let me know your thoughts.
We shouldn't be doing fast updates when lock_and_validation_needed is
true, so this seems to be correct.
Which is to say, applied, thanks!
Cc: stable@vger.kernel.org
>
> Happy New Year!
>
> Melissa
>
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 2845c884398e..4f51a7ad7a3c 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -10745,7 +10745,7 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
> DRM_DEBUG_DRIVER("drm_dp_mst_atomic_check() failed\n");
> goto fail;
> }
> - status = dc_validate_global_state(dc, dm_state->context, true);
> + status = dc_validate_global_state(dc, dm_state->context, false);
> if (status != DC_OK) {
> DRM_DEBUG_DRIVER("DC global validation failure: %s (%d)",
> dc_status_to_str(status), status);
--
Hamza
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [RFC PATCH] drm/amd/display: fix bandwidth validation failure on DCN 2.1
2024-01-03 14:44 ` [RFC PATCH] drm/amd/display: fix bandwidth validation failure on DCN 2.1 Hamza Mahfooz
@ 2024-01-03 15:41 ` Greg KH
0 siblings, 0 replies; 2+ messages in thread
From: Greg KH @ 2024-01-03 15:41 UTC (permalink / raw)
To: Hamza Mahfooz
Cc: Melissa Wen, Harry Wentland, Alex Deucher, Rodrigo Siqueira,
sunpeng.li, christian.koenig, Xinhui.Pan, airlied, daniel,
qingqing.zhuo, Alex Hung, dri-devel, Daniel Wheeler, amd-gfx,
kernel-dev, stable@vger.kernel.org
On Wed, Jan 03, 2024 at 09:44:18AM -0500, Hamza Mahfooz wrote:
> On 12/29/23 11:25, Melissa Wen wrote:
> > IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of
> > the test execution, during atomic check, because DC rejects the
> > bandwidth state for a fb sizing 64x64. The test was previously working
> > with the deprecated dc_commit_state(). Now using
> > dc_validate_with_context() approach, the atomic check needs to perform a
> > full state validation. Therefore, set fast_validation to false in the
> > dc_validate_global_state call for atomic check.
> >
> > Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams")
> > Signed-off-by: Melissa Wen <mwen@igalia.com>
> > ---
> >
> > Hi,
> >
> > It's a long story. I was inspecting this bug report:
> > - https://gitlab.freedesktop.org/drm/amd/-/issues/2016
> > and noticed the IGT test `igt@amdgpu/amd_color@crtc-lut-accuracy`
> > mentioned there wasn't even being executed on a laptop with DCN 2.1
> > (HP HP ENVY x360 Convertible 13-ay1xxx/8929). The test fails right at
> > the beginning due to an atomic check rejection, as below:
> >
> > Starting subtest: crtc-lut-accuracy
> > (amd_color:14772) igt_kms-CRITICAL: Test assertion failure function igt_display_commit_atomic, file ../lib/igt_kms.c:4530:
> > (amd_color:14772) igt_kms-CRITICAL: Failed assertion: ret == 0
> > (amd_color:14772) igt_kms-CRITICAL: Last errno: 22, Invalid argument
> > (amd_color:14772) igt_kms-CRITICAL: error: -22 != 0
> > Stack trace:
> > #0 ../lib/igt_core.c:1989 __igt_fail_assert()
> > #1 [igt_display_commit_atomic+0x44]
> > #2 ../tests/amdgpu/amd_color.c:159 __igt_unique____real_main395()
> > #3 ../tests/amdgpu/amd_color.c:395 main()
> > #4 ../sysdeps/nptl/libc_start_call_main.h:74 __libc_start_call_main()
> > #5 ../csu/libc-start.c:128 __libc_start_main@@GLIBC_2.34()
> > #6 [_start+0x21]
> > Subtest crtc-lut-accuracy failed.
> >
> > Checking dmesg, we can see that a bandwidth validation failure causes
> > the atomic check rejection:
> >
> > [ 711.147663] amdgpu 0000:04:00.0: [drm] Mode Validation Warning: Unknown Status failed validation.
> > [ 711.147667] [drm:amdgpu_dm_atomic_check [amdgpu]] DC global validation failure: Bandwidth validation failure (BW and Watermark) (13)
> > [ 711.147772] [drm:amdgpu_irq_dispatch [amdgpu]] Unregistered interrupt src_id: 243 of client_id:8
> > [ 711.148033] [drm:amdgpu_dm_atomic_check [amdgpu]] Atomic check failed with err: -22
> >
> > I also noticed that the atomic check doesn't fail if I change the fb
> > width and height used in the test from 64 to 66, and I can get the test
> > execution back (and with success). However, I recall that all test cases
> > of IGT `amd_color` were working in the past, so I bisected and found the
> > first bad commit:
> >
> > b8272241ff9d drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams
> >
> > Bringing the `dc_commit_state` machinery back also prevents the
> > bandwidth validation failure, but the commit above says
> > dc_commit_streams validation is more complete than dc_commit_state, so I
> > discarded this approach.
> >
> > After some debugging and code inspection, I found out that avoiding fast
> > validation on dc_validate_global_state during atomic check solves the
> > issue, but I'm not sure if this change may affect performance. I
> > compared exec time of some IGT tests and didn't see any differences, but
> > I recognize it doesn't provide enough evidence.
> >
> > What do you think about this change? Should I examine other things? Do
> > you see any potential issue that I should investigate? Could you
> > recommend a better approach to assess any side-effect of not enabling
> > fast validation in the atomic check?
> >
> > Please, let me know your thoughts.
>
> We shouldn't be doing fast updates when lock_and_validation_needed is
> true, so this seems to be correct.
>
> Which is to say, applied, thanks!
>
> Cc: stable@vger.kernel.org
<formletter>
This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.
</formletter>
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-01-03 15:41 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20231229163821.144599-1-mwen@igalia.com>
2024-01-03 14:44 ` [RFC PATCH] drm/amd/display: fix bandwidth validation failure on DCN 2.1 Hamza Mahfooz
2024-01-03 15:41 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox