* 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
@ 2024-08-05 18:05 Mikhail Gavrilov
2024-08-24 21:12 ` Mikhail Gavrilov
0 siblings, 1 reply; 15+ messages in thread
From: Mikhail Gavrilov @ 2024-08-05 18:05 UTC (permalink / raw)
To: Leo Li, Harry Wentland, zaeem.mohamed, pekka.paalanen,
Wheeler, Daniel, Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
Hi,
After commit 1b04dcca4fb1, launching some RenPy games causes computer hang.
After the hang, even Alt + sysrq + REISUB can't reboot the computer!
And no trace in the kernel log!
For demonstration, I'm going to use the game "Find the Orange Narwhal"
because it is free and reproduces this issue 100% of the time.
You can find it in the Steam Store:
https://store.steampowered.com/app/2946010/Find_the_Orange_Narwhal/
I uploaded a demonstration video to YouTube: https://youtu.be/yVW6rImRpXw
Unfortunately, I can't check the revert commit 1541d63c5fe2 because of
conflicts.
mikhail@primary-ws ~/p/g/linux (master)> git reset v6.11-rc1 --hard
HEAD is now at 8400291e289e Linux 6.11-rc1
mikhail@primary-ws ~/p/g/linux (master)> git revert -n 1b04dcca4fb1
Auto-merging drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
CONFLICT (content): Merge conflict in
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
Auto-merging drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
Auto-merging drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
Auto-merging drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
CONFLICT (content): Merge conflict in
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
error: could not revert 1b04dcca4fb1... drm/amd/display: Introduce
overlay cursor mode
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: Disable this message with "git config advice.mergeConflict false"
commit 1b04dcca4fb10dd3834893a60de74edd99f2bfaf
Author: Leo Li <sunpeng.li@amd.com>
Date: Thu Jan 18 16:29:49 2024 -0500
drm/amd/display: Introduce overlay cursor mode
[Why]
DCN is the display hardware for amdgpu. DRM planes are backed by DCN
hardware pipes, which carry pixel data from one end (memory), to the
other (output encoder).
Each DCN pipe has the ability to blend in a cursor early on in the
pipeline. In other words, there are no dedicated cursor planes in DCN,
which makes cursor behavior somewhat unintuitive for compositors.
For example, if the cursor is in RGB format, but the top-most DRM plane
is in YUV format, DCN will not be able to blend them. Because of this,
amdgpu_dm rejects all configurations where a cursor needs to be enabled
on top of a YUV formatted plane.
From a compositor's perspective, when computing an allocation for
hardware plane offloading, this cursor-on-yuv configuration result in an
atomic test failure. Since the failure reason is not obvious at all,
compositors will likely fall back to full rendering, which is not ideal.
Instead, amdgpu_dm can try to accommodate the cursor-on-yuv
configuration by opportunistically reserving a separate DCN pipe just
for the cursor. We can refer to this as "overlay cursor mode". It is
contrasted with "native cursor mode", where the native DCN per-pipe
cursor is used.
[How]
On each crtc, compute whether the cursor plane should be enabled in
overlay mode. If it is, mark the CRTC as requesting overlay cursor mode.
Overlay cursor should be enabled whenever there exists a underlying
plane that has YUV format, or is scaled differently than the cursor. It
should also be enabled if there is no underlying plane, or if underlying
planes do not cover the entire CRTC.
During DC validation, attempt to enable a separate DCN pipe for the
cursor if it's in overlay mode. If that fails, or if no overlay mode is
requested, then fallback to native mode.
v2:
* Update commit message for when overlay cursor should be enabled
* Also consider scale and no-underlying-plane case (cursor on crtc bg)
* Consider all underlying planes when determinig overlay/native, not
just the plane immediately beneath the cursor, as it may not cover the
entire CRTC.
* Fix typo s/decending/descending/
* Force native cursor on pre-DCN hardware
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 490
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 7 +++
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c | 1 +
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c | 13 ++++-
4 files changed, 389 insertions(+), 122 deletions(-)
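For readers following along, the overlay/native decision described in the [How] section of the commit message can be sketched roughly as below. This is a self-contained illustration, not the amdgpu_dm code: `struct plane_info`, `same_scale()`, `covers_crtc()` and `wants_overlay_cursor()` are hypothetical names, and coverage is simplified to "at least one single plane covers the whole CRTC" rather than computing the union of all planes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified plane description; not the DRM/amdgpu structs. */
struct plane_info {
    bool is_yuv;        /* framebuffer is in a YUV format */
    int src_w, src_h;   /* source (buffer) size in pixels */
    int dst_x, dst_y;   /* position on the CRTC */
    int dst_w, dst_h;   /* size on the CRTC */
};

/* True if both planes are scaled by the same factor (integer cross-multiply). */
static bool same_scale(const struct plane_info *a, const struct plane_info *b)
{
    return (long long)a->dst_w * b->src_w == (long long)b->dst_w * a->src_w &&
           (long long)a->dst_h * b->src_h == (long long)b->dst_h * a->src_h;
}

/* True if this one plane covers the whole CRTC area. */
static bool covers_crtc(const struct plane_info *p, int crtc_w, int crtc_h)
{
    return p->dst_x <= 0 && p->dst_y <= 0 &&
           p->dst_x + p->dst_w >= crtc_w &&
           p->dst_y + p->dst_h >= crtc_h;
}

/*
 * Overlay mode is requested when any underlying plane is YUV or scaled
 * differently than the cursor, when there is no underlying plane, or when
 * the underlying planes do not cover the entire CRTC.
 */
static bool wants_overlay_cursor(const struct plane_info *cursor,
                                 const struct plane_info *under, size_t n_under,
                                 int crtc_w, int crtc_h)
{
    bool covered = false;

    assert(cursor != NULL);
    assert(n_under == 0 || under != NULL);

    for (size_t i = 0; i < n_under; i++) {
        if (under[i].is_yuv || !same_scale(cursor, &under[i]))
            return true;
        if (covers_crtc(&under[i], crtc_w, crtc_h))
            covered = true;
    }
    return !covered;
}
```

With a 64x64 unscaled cursor on a 3840x2160 CRTC, a 1920x1080 buffer scaled to full screen requests overlay mode in this model, while an unscaled full-screen RGB plane does not.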
My hardware specs are: https://linux-hardware.org/?probe=61bd7390a9
Leo, can you look into it, please?
--
Best Regards,
Mike Gavrilov.
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-08-05 18:05 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang Mikhail Gavrilov
@ 2024-08-24 21:12 ` Mikhail Gavrilov
2024-09-03 6:35 ` Mikhail Gavrilov
0 siblings, 1 reply; 15+ messages in thread
From: Mikhail Gavrilov @ 2024-08-24 21:12 UTC (permalink / raw)
To: Leo Li, Harry Wentland, zaeem.mohamed, pekka.paalanen,
Wheeler, Daniel, Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On Mon, Aug 5, 2024 at 11:05 PM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> Hi,
> After commit 1b04dcca4fb1, launching some RenPy games causes computer hang.
> After the hang, even Alt + sysrq + REISUB can't reboot the computer!
> And no trace in the kernel log!
> For demonstration, I'm going to use the game "Find the Orange Narwhal"
> because it is free and has 100% reproducivity for this issue.
> You can find it in the Steam Store:
> https://store.steampowered.com/app/2946010/Find_the_Orange_Narwhal/
> I uploaded demonstration video to youtube: https://youtu.be/yVW6rImRpXw
>
> Unfortunately, I can't check the revert commit 1541d63c5fe2 because of
> conflicts.
>
> mikhail@primary-ws ~/p/g/linux (master)> git reset v6.11-rc1 --hard
> HEAD is now at 8400291e289e Linux 6.11-rc1
>
> mikhail@primary-ws ~/p/g/linux (master)> git revert -n 1b04dcca4fb1
> Auto-merging drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> CONFLICT (content): Merge conflict in
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> Auto-merging drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> Auto-merging drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
> Auto-merging drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> CONFLICT (content): Merge conflict in
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> error: could not revert 1b04dcca4fb1... drm/amd/display: Introduce
> overlay cursor mode
> hint: after resolving the conflicts, mark the corrected paths
> hint: with 'git add <paths>' or 'git rm <paths>'
> hint: Disable this message with "git config advice.mergeConflict false"
>
> commit 1b04dcca4fb10dd3834893a60de74edd99f2bfaf
> Author: Leo Li <sunpeng.li@amd.com>
> Date: Thu Jan 18 16:29:49 2024 -0500
>
> drm/amd/display: Introduce overlay cursor mode
>
> [Why]
>
> DCN is the display hardware for amdgpu. DRM planes are backed by DCN
> hardware pipes, which carry pixel data from one end (memory), to the
> other (output encoder).
>
> Each DCN pipe has the ability to blend in a cursor early on in the
> pipeline. In other words, there are no dedicated cursor planes in DCN,
> which makes cursor behavior somewhat unintuitive for compositors.
>
> For example, if the cursor is in RGB format, but the top-most DRM plane
> is in YUV format, DCN will not be able to blend them. Because of this,
> amdgpu_dm rejects all configurations where a cursor needs to be enabled
> on top of a YUV formatted plane.
>
> From a compositor's perspective, when computing an allocation for
> hardware plane offloading, this cursor-on-yuv configuration result in an
> atomic test failure. Since the failure reason is not obvious at all,
> compositors will likely fall back to full rendering, which is not ideal.
>
> Instead, amdgpu_dm can try to accommodate the cursor-on-yuv
> configuration by opportunistically reserving a separate DCN pipe just
> for the cursor. We can refer to this as "overlay cursor mode". It is
> contrasted with "native cursor mode", where the native DCN per-pipe
> cursor is used.
>
> [How]
>
> On each crtc, compute whether the cursor plane should be enabled in
> overlay mode. If it is, mark the CRTC as requesting overlay cursor mode.
>
> Overlay cursor should be enabled whenever there exists a underlying
> plane that has YUV format, or is scaled differently than the cursor. It
> should also be enabled if there is no underlying plane, or if underlying
> planes do not cover the entire CRTC.
>
> During DC validation, attempt to enable a separate DCN pipe for the
> cursor if it's in overlay mode. If that fails, or if no overlay mode is
> requested, then fallback to native mode.
>
> v2:
> * Update commit message for when overlay cursor should be enabled
> * Also consider scale and no-underlying-plane case (cursor on crtc bg)
> * Consider all underlying planes when determinig overlay/native, not
> just the plane immediately beneath the cursor, as it may not cover the
> entire CRTC.
> * Fix typo s/decending/descending/
> * Force native cursor on pre-DCN hardware
>
> Reviewed-by: Harry Wentland <harry.wentland@amd.com>
> Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
> Signed-off-by: Leo Li <sunpeng.li@amd.com>
> Acked-by: Harry Wentland <harry.wentland@amd.com>
> Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 490
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 7 +++
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c | 1 +
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c | 13 ++++-
> 4 files changed, 389 insertions(+), 122 deletions(-)
>
>
> My hardware specs are: https://linux-hardware.org/?probe=61bd7390a9
>
> Leo, can you look into it, please?
>
Hi,
Is anyone trying to look into it?
I continue to reproduce this issue on fresh kernel builds 6.11-rc4+.
In addition to the RenPy engine, the problem also reproduces on games
from Ubisoft, such as Far Cry 4.
One very important note that I missed in the first message:
to reproduce the problem, you need to enable scaling in GNOME for
HiDPI monitors.
I am using 4K resolution with 200% fractional scaling.
--
Best Regards,
Mike Gavrilov.
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-08-24 21:12 ` Mikhail Gavrilov
@ 2024-09-03 6:35 ` Mikhail Gavrilov
2024-09-03 23:15 ` Leo Li
0 siblings, 1 reply; 15+ messages in thread
From: Mikhail Gavrilov @ 2024-09-03 6:35 UTC (permalink / raw)
To: Leo Li, Harry Wentland, zaeem.mohamed, pekka.paalanen,
Wheeler, Daniel, Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On Sun, Aug 25, 2024 at 2:12 AM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> Hi,
> Is anyone trying to look into it?
> I continue to reproduce this issue on fresh kernel builds 6.11-rc4+.
> In addition to the RenPy engine, the problem also reproduces on games
> from Ubisoft, such as Far Cry 4.
> A very important note that I missed in the first message.
> To reproduce the problem, you need to enable scaling in Gnome for
> HiDPI monitors.
> I am using 4K resolution with 200% of fractional scaling.
Sorry for the persistence, but I'm afraid there's no time left to fix this
regression.
There's a week left until the release.
A month has passed and no one has looked into the problem.
--
Best Regards,
Mike Gavrilov.
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-03 6:35 ` Mikhail Gavrilov
@ 2024-09-03 23:15 ` Leo Li
2024-09-04 22:21 ` Mikhail Gavrilov
0 siblings, 1 reply; 15+ messages in thread
From: Leo Li @ 2024-09-03 23:15 UTC (permalink / raw)
To: Mikhail Gavrilov, Harry Wentland, zaeem.mohamed, pekka.paalanen,
Wheeler, Daniel, Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On 2024-09-03 02:35, Mikhail Gavrilov wrote:
> On Sun, Aug 25, 2024 at 2:12 AM Mikhail Gavrilov
> <mikhail.v.gavrilov@gmail.com> wrote:
>>
>> Hi,
>> Is anyone trying to look into it?
>> I continue to reproduce this issue on fresh kernel builds 6.11-rc4+.
>> In addition to the RenPy engine, the problem also reproduces on games
>> from Ubisoft, such as Far Cry 4.
>> A very important note that I missed in the first message.
>> To reproduce the problem, you need to enable scaling in Gnome for
>> HiDPI monitors.
>> I am using 4K resolution with 200% of fractional scaling.
>
> Sorry for persistence, but I'm afraid there's no time left to fix this
> regression.
> There's a week left until the release.
> A month later, no one has looked at what the problem is.
>
Hi Mike,
Super sorry for the ridiculous wait. Your first two emails slipped by my inbox,
which is really silly, given I'm first in the to field...
Thanks for bisecting and finding a free game to reproduce it on. I did not have
luck reproducing this today, but I am on sway and not gnome. While I get gnome
set up, will you be able to test which one of these reverts fixes the hang for
you? Is just 1/2 enough, or are both 1/2 and 2/2 required?
I applied them on top of Linus's v6.11-rc6 tag, so hopefully they'll git am
cleanly for you:
1/2:
https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0001-revert-drm-amd-display-move-primary-plane-zpos-highe-patch
2/2:
https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0002-revert-drm-amd-display-introduce-overlay-cursor-mode-patch
Thanks,
Leo
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-03 23:15 ` Leo Li
@ 2024-09-04 22:21 ` Mikhail Gavrilov
2024-09-04 23:06 ` Leo Li
0 siblings, 1 reply; 15+ messages in thread
From: Mikhail Gavrilov @ 2024-09-04 22:21 UTC (permalink / raw)
To: Leo Li
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On Wed, Sep 4, 2024 at 4:15 AM Leo Li <sunpeng.li@amd.com> wrote:
> Hi Mike,
>
> Super sorry for the ridiculous wait. Your first two emails slipped by my inbox,
> which is really silly, given I'm first in the to field...
>
> Thanks for bisecting and finding a free game to reproduce it on. I did not have
> luck reproducing this today, but I am on sway and not gnome. While I get gnome
> set up, will you be able to test which one of these reverts fixes the hang for
> you? Whether just 1/2 is enough, or both 1/2 and 2/2 is required?
>
> I applied them on top of Linus's v6.11-rc6 tag, so hopefully they'll git am
> cleanly for you:
>
> 1/2:
> https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0001-revert-drm-amd-display-move-primary-plane-zpos-highe-patch
> 2/2:
> https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0002-revert-drm-amd-display-introduce-overlay-cursor-mode-patch
>
The first patch is not enough.
Yes, it fixes the system hang when I launch the game "Find the Orange Narwhal".
But it does not fix the issue completely.
Some RenPy games can still cause the system to hang.
For example "Innocence Or Money Season 1"
https://store.steampowered.com/app/1958390/Innocence_Or_Money_Season_1__Episodes_1_to_3/
on the language selection screen.
Unfortunately, the kernel does not build with both patches applied.
I got a compilation error after applying the second patch:
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.o
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In
function ‘amdgpu_dm_atomic_check’:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:11003:69:
error: unused variable ‘new_cursor_state’ [-Werror=unused-variable]
11003 | struct drm_plane_state *old_plane_state,
*new_plane_state, *new_cursor_state;
|
^~~~~~~~~~~~~~~~
CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/basics/conversion.o
***
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.o
cc1: all warnings being treated as errors
CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/dcn_calc_auto.o
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/ga102.o
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/ad102.o
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/r535.o
CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/clk_mgr.o
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxnv40.o
CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dce60/dce60_clk_mgr.o
make[6]: *** [scripts/Makefile.build:244:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o] Error 1
make[6]: *** Waiting for unfinished jobs....
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxnv50.o
***
make[5]: *** [scripts/Makefile.build:485: drivers/gpu/drm/amd/amdgpu] Error 2
make[4]: *** [scripts/Makefile.build:485: drivers/gpu/drm] Error 2
make[3]: *** [scripts/Makefile.build:485: drivers/gpu] Error 2
make[2]: *** [scripts/Makefile.build:485: drivers] Error 2
make[1]: *** [/home/mikhail/packaging-work/git/linux-3/Makefile:1925: .] Error 2
make: *** [Makefile:224: __sub-make] Error 2
--
Best Regards,
Mike Gavrilov.
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-04 22:21 ` Mikhail Gavrilov
@ 2024-09-04 23:06 ` Leo Li
2024-09-05 6:06 ` Mikhail Gavrilov
2025-01-26 16:46 ` [BUG,BISECTED] WARNING dcn20_find_secondary_pipe Chris Bainbridge
0 siblings, 2 replies; 15+ messages in thread
From: Leo Li @ 2024-09-04 23:06 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On 2024-09-04 18:21, Mikhail Gavrilov wrote:
> On Wed, Sep 4, 2024 at 4:15 AM Leo Li <sunpeng.li@amd.com> wrote:
>> Hi Mike,
>>
>> Super sorry for the ridiculous wait. Your first two emails slipped by my inbox,
>> which is really silly, given I'm first in the to field...
>>
>> Thanks for bisecting and finding a free game to reproduce it on. I did not have
>> luck reproducing this today, but I am on sway and not gnome. While I get gnome
>> set up, will you be able to test which one of these reverts fixes the hang for
>> you? Whether just 1/2 is enough, or both 1/2 and 2/2 is required?
>>
>> I applied them on top of Linus's v6.11-rc6 tag, so hopefully they'll git am
>> cleanly for you:
>>
>> 1/2:
>> https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0001-revert-drm-amd-display-move-primary-plane-zpos-highe-patch
>> 2/2:
>> https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0002-revert-drm-amd-display-introduce-overlay-cursor-mode-patch
>>
>
> The first patch is not enough.
> Yes, it fixes the system hang when I launch the game "Find the Orange Narwhal".
> But it does not fix the issue completely.
> Some RenPy games still can lead the system to hang.
> For example "Innocence Or Money Season 1"
> https://store.steampowered.com/app/1958390/Innocence_Or_Money_Season_1__Episodes_1_to_3/
> on the language selection screen.
>
> Unfortunately the kernel is not builded with both patches.
> I have got compilation error after applying second patch:
>
> CC [M] drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.o
> drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In
> function ‘amdgpu_dm_atomic_check’:
> drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:11003:69:
> error: unused variable ‘new_cursor_state’ [-Werror=unused-variable]
> 11003 | struct drm_plane_state *old_plane_state,
> *new_plane_state, *new_cursor_state;
Can you delete ", *new_cursor_state" on that line and try again? Seems to be an
unused variable warning being elevated to an error.
Thanks,
Leo
> |
> ^~~~~~~~~~~~~~~~
> CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/basics/conversion.o
> ***
> CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.o
> cc1: all warnings being treated as errors
> CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/dcn_calc_auto.o
> CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/ga102.o
> CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/ad102.o
> CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/r535.o
> CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/clk_mgr.o
> CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxnv40.o
> CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dce60/dce60_clk_mgr.o
> make[6]: *** [scripts/Makefile.build:244:
> drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o] Error 1
> make[6]: *** Waiting for unfinished jobs....
> CC [M] drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxnv50.o
> ***
> make[5]: *** [scripts/Makefile.build:485: drivers/gpu/drm/amd/amdgpu] Error 2
> make[4]: *** [scripts/Makefile.build:485: drivers/gpu/drm] Error 2
> make[3]: *** [scripts/Makefile.build:485: drivers/gpu] Error 2
> make[2]: *** [scripts/Makefile.build:485: drivers] Error 2
> make[1]: *** [/home/mikhail/packaging-work/git/linux-3/Makefile:1925: .] Error 2
> make: *** [Makefile:224: __sub-make] Error 2
>
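The build failure quoted above comes from -Werror=unused-variable: after the revert, `new_cursor_state` is still declared but nothing references it. A minimal stand-alone sketch of what the edited declaration looks like, assuming a stand-in `struct plane_state_stub` in place of the kernel's `struct drm_plane_state` and a hypothetical helper `zpos_changed()` that exists only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct drm_plane_state; illustration only. */
struct plane_state_stub {
    int zpos;
};

/* Hypothetical helper; it only mirrors the shape of the declaration. */
static int zpos_changed(struct plane_state_stub *old_state,
                        struct plane_state_stub *new_state)
{
    /*
     * Before: ... *old_plane_state, *new_plane_state, *new_cursor_state;
     * After:  the unused pointer is dropped, so the compiler has nothing
     *         left to flag as an unused variable.
     */
    struct plane_state_stub *old_plane_state, *new_plane_state;

    assert(old_state != NULL && new_state != NULL);
    old_plane_state = old_state;
    new_plane_state = new_state;

    return old_plane_state->zpos != new_plane_state->zpos;
}
```

Note that the leading `*` has to go together with the name; removing only the identifier would leave a dangling `*` in the declaration.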
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-04 23:06 ` Leo Li
@ 2024-09-05 6:06 ` Mikhail Gavrilov
2024-09-06 19:46 ` Leo Li
2025-01-26 16:46 ` [BUG,BISECTED] WARNING dcn20_find_secondary_pipe Chris Bainbridge
1 sibling, 1 reply; 15+ messages in thread
From: Mikhail Gavrilov @ 2024-09-05 6:06 UTC (permalink / raw)
To: Leo Li
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On Thu, Sep 5, 2024 at 4:06 AM Leo Li <sunpeng.li@amd.com> wrote:
>
> Can you delete ", new_cursor_state" on that line and try again? Seems to be a
> unused variable warning being elevated to an error.
>
Thanks, I applied both patches and can confirm that this solved the issue.
The first patch was definitely not enough.
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
--
Best Regards,
Mike Gavrilov.
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-05 6:06 ` Mikhail Gavrilov
@ 2024-09-06 19:46 ` Leo Li
2024-09-08 23:30 ` Mikhail Gavrilov
2024-09-09 8:49 ` Michel Dänzer
0 siblings, 2 replies; 15+ messages in thread
From: Leo Li @ 2024-09-06 19:46 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
Hi Mikhail,
I've tried to align my system with yours as best as I can, but so far, I've had
no luck reproducing the hang. A video of what I'm doing:
https://youtu.be/VeD-LPCnfWM?si=b2baF8MyDBuU4jRH
(Under the hood, the W7900 and 7900xt should be the same)
I have a few suggestions:
First, can you also open an issue on the amd gitlab tracker? It gives more
visibility to others, and makes working together a bit easier:
https://gitlab.freedesktop.org/drm/amd/-/issues
Second, can you try adding "amdgpu.dcdebugmask=0x40" to your kernel cmdline at
boot, and see if you can still repro the hang?
This setting disables hw planes. If it resolves the hang, then it's quite
interesting, because it suggests that gnome may be using direct-scanout via hw
planes. We may need to align our gnome configuration in that case, since I don't
see any additional hw planes being used on my setup.
Third, in case these two issues are related, can you give the attached patch on
this issue thread a try as well?
https://gitlab.freedesktop.org/drm/amd/-/issues/3569#note_2558359
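As a small aside on the `amdgpu.dcdebugmask=0x40` suggestion: the parameter is a bitmask, so the 0x40 bit can be combined with other debug bits rather than replacing them. The sketch below only shows the bit arithmetic. The value 0x40 is from this thread; the name `DC_DISABLE_MPO_ASSUMED` reflects my assumption that this bit corresponds to the kernel's MPO-disable flag and should be checked against the `DC_DEBUG_MASK` enum in your tree. Both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 0x40 is the value from the thread; the name is an assumption. */
#define DC_DISABLE_MPO_ASSUMED 0x40u

/* True if the given dcdebugmask value has the 0x40 bit set. */
static bool hw_planes_disabled(uint32_t dcdebugmask)
{
    return (dcdebugmask & DC_DISABLE_MPO_ASSUMED) != 0;
}

/* Returns the mask with the 0x40 bit added, keeping any other bits. */
static uint32_t with_hw_planes_disabled(uint32_t dcdebugmask)
{
    uint32_t result = dcdebugmask | DC_DISABLE_MPO_ASSUMED;

    assert(hw_planes_disabled(result));
    return result;
}
```

For example, a mask that already has 0x10 set would become 0x50 after adding this bit.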
Thanks,
Leo
On 2024-09-05 02:06, Mikhail Gavrilov wrote:
> On Thu, Sep 5, 2024 at 4:06 AM Leo Li <sunpeng.li@amd.com> wrote:
>>
>> Can you delete ", new_cursor_state" on that line and try again? Seems to be a
>> unused variable warning being elevated to an error.
>>
>
> Thanks, I applied both patches and can confirm that this solved the issue.
> The first patch was definitely not enough.
>
> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
>
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-06 19:46 ` Leo Li
@ 2024-09-08 23:30 ` Mikhail Gavrilov
2024-09-10 15:47 ` Leo Li
2024-09-09 8:49 ` Michel Dänzer
1 sibling, 1 reply; 15+ messages in thread
From: Mikhail Gavrilov @ 2024-09-08 23:30 UTC (permalink / raw)
To: Leo Li
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
[-- Attachment #1: Type: text/plain, Size: 2286 bytes --]
On Sat, Sep 7, 2024 at 12:47 AM Leo Li <sunpeng.li@amd.com> wrote:
>
>
> Hi Mikhail,
>
> I've tried to align my system with yours as best as I can, but so far, I've had
> no luck reproducing the hang. A video of what I'm doing:
> https://youtu.be/VeD-LPCnfWM?si=b2baF8MyDBuU4jRH
> (Under the hood, the W7900 and 7900xt should be the same)
I have done additional tests:
1. The computer does not hang with the 6900XT; instead, the screen flickers
when moving the cursor.
2. The computer does not hang with the 7900XTX if I turn off VRR, but the
screen flickers when moving the cursor, as on the 6900XT.
To enable VRR, please set 'variable-refresh-rate' in
experimental-features, and enable Variable Refresh Rate in the Display
settings.
$ gsettings set org.gnome.mutter experimental-features
"['variable-refresh-rate', 'scale-monitor-framebuffer']"
https://postimg.cc/PvXYdvGR
3. The chances of the problem reoccurring are much higher when running
the game "Play Innocence Or Money Season 1 - Episodes 1 to 3". There
is a free demo version.
https://store.steampowered.com/app/1958390/Innocence_Or_Money_Season_1__Episodes_1_to_3/
Demonstration: https://youtu.be/XIe0pQYPVUo
>
> I have a few suggestions:
>
> First, can you also open an issue on the amd gitlab tracker? It gives more
> visibility to others, and makes working together a bit easier:
> https://gitlab.freedesktop.org/drm/amd/-/issues
>
> Second, can you try adding "amdgpu.dcdebugmask=0x40" to your kernel cmdline at
> boot, and see if you can still repro the hang?
Yes. This didn't help.
> This setting disables hw planes. If it resolves the hang, then it's quite
> interesting, because it suggests that gnome may be using direct-scanout via hw
> planes. We may need to align our gnome configuration in that case, since I don't
> see any additional hw planes being used on my setup.
>
> Third, in case these two issues are related, can you give the attached patch on
> this issue thread a try as well?
> https://gitlab.freedesktop.org/drm/amd/-/issues/3569#note_2558359
This patch also didn't help.
Could you try compiling a kernel with the same config as mine and enabling
VRR to reproduce the problem?
I attached my build config to this message.
--
Best Regards,
Mike Gavrilov.
[-- Attachment #2: .config.zip --]
[-- Type: application/zip, Size: 67015 bytes --]
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-06 19:46 ` Leo Li
2024-09-08 23:30 ` Mikhail Gavrilov
@ 2024-09-09 8:49 ` Michel Dänzer
1 sibling, 0 replies; 15+ messages in thread
From: Michel Dänzer @ 2024-09-09 8:49 UTC (permalink / raw)
To: Leo Li, Mikhail Gavrilov
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On 2024-09-06 21:46, Leo Li wrote:
>
> Second, can you try adding "amdgpu.dcdebugmask=0x40" to your kernel cmdline at
> boot, and see if you can still repro the hang?
>
> This setting disables hw planes. If it resolves the hang, then it's quite
> interesting, because it suggests that gnome may be using direct-scanout via hw
> planes. We may need to align our gnome configuration in that case, since I don't
> see any additional hw planes being used on my setup.
GNOME's mutter doesn't make use of overlay planes yet.
(There's a WIP MR though: https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/2660)
--
Earthling Michel Dänzer \ GNOME / Xwayland / Mesa developer
https://redhat.com \ Libre software enthusiast
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-08 23:30 ` Mikhail Gavrilov
@ 2024-09-10 15:47 ` Leo Li
2024-09-10 21:11 ` Leo Li
0 siblings, 1 reply; 15+ messages in thread
From: Leo Li @ 2024-09-10 15:47 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On 2024-09-08 19:30, Mikhail Gavrilov wrote:
> I have done additional tests:
> 1. The computer does not hang with 6900XT instead the screen flickers
> when moving the cursor.
> 2. The computer does not hang with 7900XTX if I turn off VRR. But the
> screen flickers when moving the cursor, as on 6900XT.
> To enable VRR, please set 'variable-refresh-rate' in
> experimental-features, and in the Display setting, enable Variable
> Refresh Rate.
> $ gsettings set org.gnome.mutter experimental-features
> "['variable-refresh-rate', 'scale-monitor-framebuffer']"
> https://postimg.cc/PvXYdvGR
Thanks Mikhail, I think I know what's going on now.
The `scale-monitor-framebuffer` experimental setting is what puts us down the
bad code path. It seems VRR has nothing to do with this issue; just setting
`scale-monitor-framebuffer` is enough to reproduce.
It seems that mutter with this setting is opting for HW scaling rather than GPU
scaling. I see that "Find the Orange Narwhal" sends out a 1080p buffer,
which with this setting, gets directly scanned out and scaled by DCN HW to 4k in
full screen.
An oddity with current gen DCN hardware is that the cursor inherits the scaling
of the HW plane underneath. So if mutter requests a hw cursor with a different
scaling than the game's plane, amdgpu will reject that, and likely force mutter
into SW cursor.
My offending patch changed this behavior by rerouting DCN HW pipes to
accommodate such a configuration. It essentially takes a full-fledged DCN
overlay plane and uses it just for the cursor, thereby freeing the cursor from
inheriting things from the underlying hw plane.
My guess is this causes flickering due to how DC (display core driver) handles
updates; it needs all enabled planes in its update state. However, a KMS cursor
update will only include the cursor plane. It's likely that amdgpu_dm only adds
the dedicated cursor plane to DC's update state, leaving the game's plane out.
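
As an illustration only, the guess above can be written as a toy model. This is not amdgpu or DC code: the `Plane` class, the plane names, and the `add_affected` switch are all hypothetical, and the model only captures the idea that a cursor-only commit omits the other enabled planes on the same CRTC unless they are pulled in explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    # Hypothetical stand-in for a hardware plane; not a real DRM structure.
    name: str
    crtc: str
    enabled: bool = True


def planes_in_commit(updated, all_planes, add_affected):
    """Return the names of planes that end up in the update state."""
    state = {p.name for p in updated}
    if add_affected:
        # Pull in every enabled plane that shares a CRTC with an updated plane.
        crtcs = {p.crtc for p in updated}
        state |= {p.name for p in all_planes if p.enabled and p.crtc in crtcs}
    return state


def missing_planes(updated, all_planes, add_affected):
    """Return enabled planes on the touched CRTCs that the commit left out."""
    crtcs = {p.crtc for p in updated}
    needed = {p.name for p in all_planes if p.enabled and p.crtc in crtcs}
    return needed - planes_in_commit(updated, all_planes, add_affected)


game = Plane("game-primary", "crtc-0")
cursor = Plane("cursor-overlay", "crtc-0")
other = Plane("other-primary", "crtc-1")
planes = [game, cursor, other]

# A cursor move that carries only the cursor plane leaves the game plane out.
print(missing_planes([cursor], planes, add_affected=False))  # {'game-primary'}
# Adding all planes on the affected CRTC closes the gap.
print(missing_planes([cursor], planes, add_affected=True))   # set()
```

In this model the plane on the second CRTC is never required, since only planes sharing a CRTC with the updated cursor matter.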
The fix isn't exactly trivial. If I don't get anywhere before the fixes window,
I'll send out a revert.
Cheers,
Leo
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-10 15:47 ` Leo Li
@ 2024-09-10 21:11 ` Leo Li
2024-09-10 22:16 ` Mikhail Gavrilov
0 siblings, 1 reply; 15+ messages in thread
From: Leo Li @ 2024-09-10 21:11 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
Hi Mikhail,
Can you give this patch a try to see if it helps?
https://gist.github.com/leeonadoh/3271e90ec95d768424c572c970ada743
Thanks,
Leo
* Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang
2024-09-10 21:11 ` Leo Li
@ 2024-09-10 22:16 ` Mikhail Gavrilov
0 siblings, 0 replies; 15+ messages in thread
From: Mikhail Gavrilov @ 2024-09-10 22:16 UTC (permalink / raw)
To: Leo Li
Cc: Harry Wentland, zaeem.mohamed, pekka.paalanen, Wheeler, Daniel,
Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list
On Tue, Sep 10, 2024 at 8:47 PM Leo Li <sunpeng.li@amd.com> wrote:
>
> Thanks Mikhail, I think I know what's going on now.
>
> The `scale-monitor-framebuffer` experimental setting is what puts us down the
> bad code path. It seems VRR has nothing to do with this issue, just setting
> `scale-monitor-framebuffer` is enough to reproduce.
I ran some additional tests:
1)
$ gsettings set org.gnome.mutter experimental-features
"['variable-refresh-rate']"
Symptoms: None
2)
$ gsettings set org.gnome.mutter experimental-features
"['scale-monitor-framebuffer']"
Symptoms: The screen flickers when moving the cursor.
3)
$ gsettings set org.gnome.mutter experimental-features
"['variable-refresh-rate', 'scale-monitor-framebuffer']"
But Variable Refresh Rate is disabled in the display settings.
Symptoms: Same as the previous case; the screen flickers when moving the cursor.
4)
$ gsettings set org.gnome.mutter experimental-features
"['variable-refresh-rate', 'scale-monitor-framebuffer']"
And Variable Refresh Rate is enabled in the display settings.
Symptoms: On Radeon 7900XTX hardware, the computer hangs completely without
any messages in the kernel log.
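
The four cases above can be tabulated to see which feature is common to every failing run. The sketch below is only a bookkeeping aid: the `Case` class and the symptom labels are made up for illustration, and the display-settings VRR state is recorded as `None` for cases 1 and 2 because it was not stated for them. Nothing here runs `gsettings`; the helper only builds the command string.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SCHEMA = "org.gnome.mutter"
KEY = "experimental-features"


@dataclass(frozen=True)
class Case:
    features: Tuple[str, ...]
    vrr_in_display_settings: Optional[bool]  # None means "not stated"
    symptom: str  # "none", "flicker", or "hang" (labels are illustrative)


def gsettings_command(features):
    """Build (but do not run) the gsettings command for a feature list."""
    value = "[" + ", ".join(f"'{f}'" for f in features) + "]"
    return f'gsettings set {SCHEMA} {KEY} "{value}"'


VRR = "variable-refresh-rate"
SCALE = "scale-monitor-framebuffer"

cases = [
    Case((VRR,), None, "none"),
    Case((SCALE,), None, "flicker"),
    Case((VRR, SCALE), False, "flicker"),
    Case((VRR, SCALE), True, "hang"),  # reported on Radeon 7900XTX
]

# Features present in every case that showed any symptom.
failing = [set(c.features) for c in cases if c.symptom != "none"]
common_features = set.intersection(*failing)

for c in cases:
    print(c.symptom, "<-", gsettings_command(c.features))
print("common to all failing cases:", common_features)
```

The intersection contains only `scale-monitor-framebuffer`, which matches the observation that VRR alone shows no symptoms.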
On Wed, Sep 11, 2024 at 2:11 AM Leo Li <sunpeng.li@amd.com> wrote:
>
> Hi Mikhail,
>
> Can you give this patch a try to see if it helps?
> https://gist.github.com/leeonadoh/3271e90ec95d768424c572c970ada743
>
Thanks, with this patch the issue no longer reproduces.
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
The only thing that worries me is that the hang itself may now just be
hidden.
It's one thing when the GPU hangs but the system continues to work,
another thing when the system hangs completely and even
Alt+SysRq+REISUB does not help to reboot the system. It shouldn't be
like this...
--
Best Regards,
Mike Gavrilov.
* [BUG,BISECTED] WARNING dcn20_find_secondary_pipe
2024-09-04 23:06 ` Leo Li
2024-09-05 6:06 ` Mikhail Gavrilov
@ 2025-01-26 16:46 ` Chris Bainbridge
2025-01-27 9:44 ` Imre Deak
1 sibling, 1 reply; 15+ messages in thread
From: Chris Bainbridge @ 2025-01-26 16:46 UTC (permalink / raw)
To: Leo Li
Cc: Mikhail Gavrilov, Harry Wentland, zaeem.mohamed, pekka.paalanen,
Wheeler, Daniel, Deucher, Alexander, amd-gfx list, dri-devel,
Linux List Kernel Mailing, Linux regressions mailing list,
imre.deak, lyude
Hardware is HP Pavilion Aero 13 laptop with Dell WD19 dock and three
external monitors. I get a warning with recent kernel builds when
enabling the external monitors with xrandr after initial boot:
16:57:49 kernel: WARNING: CPU: 4 PID: 1347 at drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn20/dcn20_resource.c:1734 dcn20_find_secondary_pipe+0x1a6/0x400 [amdgpu]
16:57:49 kernel: Modules linked in: rfcomm xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc nvme_fabrics ccm snd_seq_dummy snd_hrtimer snd_seq cmac algif_hash algif_skcipher af_alg bnep qrtr overlay binfmt_misc snd_acp3x_pdm_dma snd_soc_dmic snd_acp3x_rn snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_ctl_led snd_soc_core snd_hda_codec_realtek snd_compress snd_hda_codec_generic snd_pci_ps snd_hda_scodec_component snd_soc_acpi_amd_match snd_hda_codec_hdmi snd_rpl_pci_acp6x uvcvideo snd_hda_intel snd_usb_audio snd_acp_pci iwlmvm videobuf2_vmalloc btusb snd_intel_dspcfg intel_rapl_msr snd_usbmidi_lib snd_acp_legacy_common videobuf2_memops btrtl intel_rapl_common snd_pci_acp6x snd_hda_codec snd_ump uvc mac80211 btintel snd_pci_acp5x snd_hwdep kvm_amd videobuf2_v4l2 snd_rawmidi btbcm libarc4 snd_hda_core
16:57:49 kernel: snd_seq_device snd_rn_pci_acp3x btmtk videodev kvm hp_wmi snd_pcm snd_acp_config ucsi_acpi iwlwifi ee1004 videobuf2_common platform_profile rapl snd_timer snd_soc_acpi pcspkr sparse_keymap bluetooth typec_ucsi wmi_bmof k10temp sp5100_tco snd_pci_acp3x snd ccp mc cfg80211 soundcore typec input_leds joydev amd_pmc acpi_tad serio_raw mac_hid msr parport_pc ppdev lp parport efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq dm_crypt hid_microsoft ff_memless usbmouse usbkbd hid_cmedia r8153_ecm cdc_ether usbnet usbhid r8152 mii amdgpu i2c_algo_bit drm_ttm_helper ttm drm_panel_backlight_quirks drm_exec drm_suballoc_helper uas cec hid_multitouch usb_storage rc_core hid_generic amdxcp polyval_clmulni drm_buddy nvme i2c_hid_acpi polyval_generic gpu_sched i2c_piix4 nvme_core amd_sfh i2c_hid video ghash_clmulni_intel drm_display_helper i2c_smbus nvme_auth hid wmi aesni_intel crypto_simd cryptd
16:57:49 kernel: CPU: 4 UID: 0 PID: 1347 Comm: Xorg Not tainted 6.13.0-07078-gb46c89c08f41 #139
16:57:49 kernel: Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.16 08/01/2024
16:57:49 kernel: RIP: 0010:dcn20_find_secondary_pipe+0x1a6/0x400 [amdgpu]
16:57:49 kernel: Code: 48 69 db b8 0f 00 00 49 8d 44 1d 00 44 88 a0 24 08 00 00 48 85 c0 75 c7 49 8b 86 98 05 00 00 44 8b a0 a8 02 00 00 41 83 ec 01 <0f> 0b 45 85 e4 78 ac 44 89 e0 48 69 c0 b8 0f 00 00 4d 8d 74 05 08
16:57:49 kernel: RSP: 0018:ffff9efc4383f478 EFLAGS: 00010206
16:57:49 kernel: RAX: ffff910e59a6f800 RBX: 0000000000000000 RCX: ffff910ea4a02218
16:57:49 kernel: RDX: ffff910e59a6f800 RSI: ffff910ea4a002a8 RDI: ffff910e59400000
16:57:49 kernel: RBP: ffff9efc4383f4b0 R08: ffff910ea4a02218 R09: 0000000000000000
16:57:49 kernel: R10: ffff9efc4383f5d0 R11: 0000000000000000 R12: 0000000000000003
16:57:49 kernel: R13: ffff910ea4a002a8 R14: ffff910e59400000 R15: fffffffffffff048
16:57:49 kernel: FS: 0000711c38030ac0(0000) GS:ffff91114e400000(0000) knlGS:0000000000000000
16:57:49 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
16:57:49 kernel: CR2: 00007cae3d793018 CR3: 000000010d5b1000 CR4: 0000000000f50ef0
16:57:49 kernel: PKRU: 55555554
16:57:49 kernel: Call Trace:
16:57:49 kernel: <TASK>
16:57:49 kernel: ? show_regs+0x68/0x80
16:57:49 kernel: ? __warn+0x93/0x1b0
16:57:49 kernel: ? dcn20_find_secondary_pipe+0x1a6/0x400 [amdgpu]
16:57:49 kernel: ? report_bug+0x17e/0x1b0
16:57:49 kernel: ? handle_bug+0x6a/0xb0
16:57:49 kernel: ? exc_invalid_op+0x18/0x80
16:57:49 kernel: ? asm_exc_invalid_op+0x1b/0x20
16:57:49 kernel: ? dcn20_find_secondary_pipe+0x1a6/0x400 [amdgpu]
16:57:49 kernel: dcn21_fast_validate_bw+0x409/0x740 [amdgpu]
16:57:49 kernel: dcn21_validate_bandwidth_fp+0xd6/0xf20 [amdgpu]
16:57:49 kernel: ? __might_sleep+0x58/0x90
16:57:49 kernel: dcn21_validate_bandwidth+0x62/0xa0 [amdgpu]
16:57:49 kernel: ? dcn21_validate_bandwidth+0x62/0xa0 [amdgpu]
16:57:49 kernel: dc_validate_global_state+0x444/0x600 [amdgpu]
16:57:49 kernel: ? drm_dp_mst_atomic_check+0xbd/0x100 [drm_display_helper]
16:57:49 kernel: amdgpu_dm_atomic_check+0x17ae/0x1940 [amdgpu]
16:57:49 kernel: drm_atomic_check_only+0x6a4/0xb30
16:57:49 kernel: drm_atomic_commit+0x6f/0xe0
16:57:49 kernel: ? __drm_printfn_seq_file+0x30/0x30
16:57:49 kernel: drm_atomic_helper_set_config+0x7e/0xc0
16:57:49 kernel: drm_mode_setcrtc+0x416/0x9e0
16:57:49 kernel: ? __lock_acquire+0x415/0x27d0
16:57:49 kernel: ? __lock_acquire+0x415/0x27d0
16:57:49 kernel: ? drm_mode_getcrtc+0x1e0/0x1e0
16:57:49 kernel: drm_ioctl_kernel+0xb5/0x120
16:57:49 kernel: drm_ioctl+0x300/0x5a0
16:57:49 kernel: ? drm_mode_getcrtc+0x1e0/0x1e0
16:57:49 kernel: amdgpu_drm_ioctl+0x4e/0x90 [amdgpu]
16:57:49 kernel: __x64_sys_ioctl+0xa0/0xd0
16:57:49 kernel: x64_sys_call+0xee7/0xfb0
16:57:49 kernel: do_syscall_64+0x87/0x140
16:57:49 kernel: ? find_held_lock+0x31/0x90
16:57:49 kernel: ? find_held_lock+0x31/0x90
16:57:49 kernel: ? lock_release+0xdb/0x2c0
16:57:49 kernel: ? dput.part.0+0x91/0x460
16:57:49 kernel: ? dput.part.0+0x9b/0x460
16:57:49 kernel: ? dput+0x13/0x20
16:57:49 kernel: ? __fsnotify_parent+0x200/0x3b0
16:57:49 kernel: ? find_held_lock+0x31/0x90
16:57:49 kernel: ? find_held_lock+0x31/0x90
16:57:49 kernel: ? lock_release+0xdb/0x2c0
16:57:49 kernel: ? __f_unlock_pos+0x15/0x20
16:57:49 kernel: ? __mutex_unlock_slowpath+0x41/0x2e0
16:57:49 kernel: ? mutex_unlock+0x12/0x20
16:57:49 kernel: ? trace_irq_disable+0x7b/0xb0
16:57:49 kernel: ? trace_irq_enable+0x7b/0xb0
16:57:49 kernel: ? syscall_exit_to_user_mode+0xcc/0x210
16:57:49 kernel: ? do_syscall_64+0x93/0x140
16:57:49 kernel: ? do_syscall_64+0x93/0x140
16:57:49 kernel: ? sysvec_apic_timer_interrupt+0x57/0xc0
16:57:49 kernel: entry_SYSCALL_64_after_hwframe+0x4b/0x53
16:57:49 kernel: RIP: 0033:0x711c3831ccdb
16:57:49 kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
16:57:49 kernel: RSP: 002b:00007fffe26e36f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
16:57:49 kernel: RAX: ffffffffffffffda RBX: 0000592748268ac0 RCX: 0000711c3831ccdb
16:57:49 kernel: RDX: 00007fffe26e3780 RSI: 00000000c06864a2 RDI: 000000000000000f
16:57:49 kernel: RBP: 00007fffe26e3780 R08: 0000000000000000 R09: 0000000000000000
16:57:49 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c06864a2
16:57:49 kernel: R13: 000000000000000f R14: 0000000000000000 R15: 0000000000000000
16:57:49 kernel: </TASK>
16:57:49 kernel: irq event stamp: 2276151
16:57:49 kernel: hardirqs last enabled at (2276157): [<ffffffffa985a9c5>] __up_console_sem+0x75/0x90
16:57:49 kernel: hardirqs last disabled at (2276162): [<ffffffffa985a9aa>] __up_console_sem+0x5a/0x90
16:57:49 kernel: softirqs last enabled at (2274574): [<ffffffffa979ecff>] __irq_exit_rcu+0xbf/0xf0
16:57:49 kernel: softirqs last disabled at (2274567): [<ffffffffa979ecff>] __irq_exit_rcu+0xbf/0xf0
16:57:49 kernel: ---[ end trace 0000000000000000 ]---
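
For anyone triaging a log like the one above, the WARNING line already carries the source location and the `symbol+offset/size` triple. The following is a minimal sketch for pulling those fields out; the regular expression is an assumption fitted to this one line rather than a general kernel log grammar. The `symbol+offset/size` form should be what the kernel tree's `scripts/faddr2line` expects as an argument, though that is worth checking against your tree.

```python
import posixpath
import re

# Pattern fitted to the WARNING line in this report; other kernels may differ.
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<path>\S+):(?P<line>\d+) "
    r"(?P<symbol>\w+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)

log_line = (
    "16:57:49 kernel: WARNING: CPU: 4 PID: 1347 at "
    "drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn20/dcn20_resource.c:1734 "
    "dcn20_find_secondary_pipe+0x1a6/0x400 [amdgpu]"
)


def parse_warning(line):
    """Return a dict of fields from a WARNING line, or None if it does not match."""
    match = WARN_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Collapse the "amdgpu/../display" detour in the reported path.
    info["path"] = posixpath.normpath(info["path"])
    info["line"] = int(info["line"])
    return info


info = parse_warning(log_line)
print(info["path"], info["line"])
print(f'{info["symbol"]}+{info["offset"]}/{info["size"]}', info["module"])
```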
The bisect leads to a merge commit 43102a2012c2 ("Merge tag
'drm-misc-fixes-2024-09-26'"). Neither parent commit produces the
warning, but the merged commit does.
There are two commits that interact to cause this warning:
2a2a865aee43 ("drm/amd/display: Add all planes on CRTC to state for
overlay cursor").
and
70a6587dca37 ("drm/dp_mst: Fix DSC decompression detection in Synaptics
branch devices")
2a2a865aee43 was added to the mainline Linux repo first, but the warning
only appears following the merge of 70a6587dca37.
#regzbot introduced: 43102a2012c2
* Re: [BUG,BISECTED] WARNING dcn20_find_secondary_pipe
2025-01-26 16:46 ` [BUG,BISECTED] WARNING dcn20_find_secondary_pipe Chris Bainbridge
@ 2025-01-27 9:44 ` Imre Deak
0 siblings, 0 replies; 15+ messages in thread
From: Imre Deak @ 2025-01-27 9:44 UTC (permalink / raw)
To: Chris Bainbridge
Cc: Leo Li, Mikhail Gavrilov, Harry Wentland, zaeem.mohamed,
pekka.paalanen, Wheeler, Daniel, Deucher, Alexander, amd-gfx list,
dri-devel, Linux List Kernel Mailing,
Linux regressions mailing list, lyude
On Sun, Jan 26, 2025 at 04:46:49PM +0000, Chris Bainbridge wrote:
> Hardware is HP Pavilion Aero 13 laptop with Dell WD19 dock and three
> external monitors. I get a warning with recent kernel builds when
> enabling the external monitors with xrandr after initial boot:
>
> 16:57:49 kernel: WARNING: CPU: 4 PID: 1347 at drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn20/dcn20_resource.c:1734 dcn20_find_secondary_pipe+0x1a6/0x400 [amdgpu]
>
> [...]
>
> The bisect leads to a merge commit 43102a2012c2 ("Merge tag
> 'drm-misc-fixes-2024-09-26'"). Neither parent commit produces the
> warning, but the merged commit does.
>
> There are two commits that interact to cause this warning:
>
> 2a2a865aee43 ("drm/amd/display: Add all planes on CRTC to state for
> overlay cursor").
>
> and
>
> 70a6587dca37 ("drm/dp_mst: Fix DSC decompression detection in Synaptics
> branch devices")
>
> 2a2a865aee43 was added to the mainline Linux repo first, but the warning
> only appears following the merge of 70a6587dca37.
The effect of 70a6587dca37 is to enable DSC only if the dock supports
it. IIRC the WD19 dock does support DSC in both of the branch devices
within it, so I'm not sure how the commit makes a difference there.
Checking whether the DP_DSC_SUPPORT DPCD register AUX read fails, or
whether the DP_DSC_DECOMPRESSION_IS_SUPPORTED flag is not set in the
register, would tell more (maybe by using drm.debug=0x100).
In any case, I'm not sure how the reported DSC capability would relate to
the above warning in dcn20_find_secondary_pipe(); the driver should handle
a dock both with and without DSC support.
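
The two outcomes to distinguish in the check suggested above (a failed AUX read versus a cleared flag) can be sketched as follows. This is only an illustration of the decision: the register address 0x060 and the bit position are written from memory of the kernel's DisplayPort header and should be verified against the tree, and `classify_dsc_support` is a hypothetical helper, not a kernel or libdrm function. The `drm.debug=0x100` value should correspond to the DP debug category, which is what would make the relevant messages appear.

```python
from typing import Optional

# Assumed values; verify against the kernel's DisplayPort header before relying on them.
DP_DSC_SUPPORT = 0x060                      # DPCD register address
DP_DSC_DECOMPRESSION_IS_SUPPORTED = 1 << 0  # flag within that register


def classify_dsc_support(raw: Optional[int]) -> str:
    """Classify one DP_DSC_SUPPORT read.

    raw is the byte read from the register, or None if the AUX read failed.
    """
    if raw is None:
        return "aux-read-failed"
    if raw & DP_DSC_DECOMPRESSION_IS_SUPPORTED:
        return "decompression-supported"
    return "decompression-not-supported"


# Example inputs are made up; they are not values read from a WD19 dock.
for sample in (None, 0x00, 0x01, 0x03):
    shown = "None" if sample is None else f"0x{sample:02x}"
    print(shown, "->", classify_dsc_support(sample))
```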
> #regzbot introduced: 43102a2012c2
Thread overview: 15+ messages
2024-08-05 18:05 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang Mikhail Gavrilov
2024-08-24 21:12 ` Mikhail Gavrilov
2024-09-03 6:35 ` Mikhail Gavrilov
2024-09-03 23:15 ` Leo Li
2024-09-04 22:21 ` Mikhail Gavrilov
2024-09-04 23:06 ` Leo Li
2024-09-05 6:06 ` Mikhail Gavrilov
2024-09-06 19:46 ` Leo Li
2024-09-08 23:30 ` Mikhail Gavrilov
2024-09-10 15:47 ` Leo Li
2024-09-10 21:11 ` Leo Li
2024-09-10 22:16 ` Mikhail Gavrilov
2024-09-09 8:49 ` Michel Dänzer
2025-01-26 16:46 ` [BUG,BISECTED] WARNING dcn20_find_secondary_pipe Chris Bainbridge
2025-01-27 9:44 ` Imre Deak