* [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
@ 2022-12-07 14:44 Mikhail Gavrilov
2022-12-07 14:58 ` Alex Deucher
0 siblings, 1 reply; 9+ messages in thread
From: Mikhail Gavrilov @ 2022-12-07 14:44 UTC
To: Deucher, Alexander, Chen, Guchun, James.Zhu, amd-gfx list
Hi,
I found a commit that causes problems with VAAPI hardware decoding on
certain video files.
To reproduce the issue, mesa must be built with the h264 hardware
encoder enabled, and the attached file must be played in the vlc
player.
Before kernel 5.16 this only led to an artifact in the form of a green
bar at the top of the screen, then starting from 5.17 the GPU began to
freeze.
In 6.0, the problem with GPU freezing is solved, but the kernel itself
freezes when certain actions are performed. And the vlc application
cannot be terminated in any way.
The kernel trace looks like this:
[ 976.184187] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:40 vmid:1 pasid:32785, for process vlc pid 9905 thread vlc:cs0 pid 9956)
[ 976.184205] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800106b53000 from client 0x12 (VMC)
[ 976.184210] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00141651
[ 976.184213] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: VCN0 (0xb)
[ 976.184216] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 976.184219] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 976.184222] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[ 976.184225] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 976.184228] amdgpu 0000:03:00.0: amdgpu: RW: 0x1
[ 976.184234] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:40 vmid:1 pasid:32785, for process vlc pid 9905 thread vlc:cs0 pid 9956)
[ 976.184238] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800106b52000 from client 0x12 (VMC)
[ 976.184242] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 976.184245] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: unknown (0x0)
[ 976.184248] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 976.184251] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 976.184253] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 976.184256] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 976.184259] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 976.184264] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:40 vmid:1 pasid:32785, for process vlc pid 9905 thread vlc:cs0 pid 9956)
[ 976.184268] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800106b53000 from client 0x12 (VMC)
[ 976.184271] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 976.184273] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: unknown (0x0)
[ 976.184276] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 976.184279] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 976.184281] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 976.184284] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 976.184286] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
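The decoded fields printed under each fault can be cross-checked against the raw MMVM_L2_PROTECTION_FAULT_STATUS value. The Python sketch below is illustrative only. The bit positions are an assumption, chosen because they reproduce the values the driver printed for 0x00141651 in the trace above; they are not stated anywhere in this thread. The names FIELDS and decode_fault_status are hypothetical.

```python
# Assumed bit layout: field name -> (shift, width).
# Inferred from the logged values, not taken from the thread.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # the "Faulty UTCL2 client ID" in the log
    "RW": (18, 1),
}


def decode_fault_status(status: int) -> dict:
    """Split a raw status word into the fields the driver logs."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # Raw value from the first fault in the trace.
    for name, value in decode_fault_status(0x00141651).items():
        print(f"{name}: {value:#x}")
```

Under this assumed layout, 0x00141651 decodes to client ID 0xb, PERMISSION_FAULTS 0x5 and RW 0x1, which matches the first fault in the trace. The two follow-up faults carry a status of 0x00000000, so every field decodes to zero.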
The problematic commit is:
commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 (HEAD)
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Mon Aug 9 11:22:20 2021 -0400
drm/amdgpu: handle VCN instances when harvesting (v2)
There may be multiple instances and only one is harvested.
v2: fix typo in commit message
Fixes: 83a0b8639185 ("drm/amdgpu: add judgement when add ip blocks (v2)")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Thanks!
--
Best Regards,
Mike Gavrilov.
[-- Attachment #2: test_sample_480_2.mp4 --]
[-- Type: video/mp4, Size: 127816 bytes --]
* Re: [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
From: Alex Deucher @ 2022-12-07 14:58 UTC
To: Mikhail Gavrilov
Cc: Deucher, Alexander, James.Zhu, amd-gfx list, Chen, Guchun
On Wed, Dec 7, 2022 at 9:44 AM Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote:
> Before kernel 5.16 this only led to an artifact in the form of a green
> bar at the top of the screen, then starting from 5.17 the GPU began to
> freeze.
> In 6.0, the problem with GPU freezing is solved, but the kernel itself
> freezes when certain actions are performed. And the vlc application
> cannot be terminated in any way.
What GPU do you have and what entries do you have in
sys/class/drm/card0/device/ip_discovery/die/0/UVD for the device?
Specifically, the harvest settings for each instance if there are
multiple instances. If you had an rx6700 you might have been using
software rendering prior to commit
7cbe08a930a132d84b4cf79953b00b074ec7a2a.
Alex
* Re: [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
From: Mikhail Gavrilov @ 2022-12-07 20:43 UTC
To: Alex Deucher
Cc: Deucher, Alexander, James.Zhu, amd-gfx list, Chen, Guchun
On Wed, Dec 7, 2022 at 7:58 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> What GPU do you have and what entries do you have in
> sys/class/drm/card0/device/ip_discovery/die/0/UVD for the device?
I bisected the issue on the Radeon 6800M.
The parent commit of 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 is
46dd2965bdd1c5a4f6499c73ff32e636fa8f9769.
For both commits ip_discovery is absent:
# ls /sys/class/drm/card0/device/ | grep ip
# ls /sys/class/drm/card1/device/ | grep ip
But from the verbose output I see that with
7cbe08a930a132d84b4cf79953b00b074ec7a2a7 the player uses acceleration:
$ vlc -v Downloads/test_sample_480_2.mp4
VLC media player 3.0.18 Vetinari (revision )
[0000561f72097520] main libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
[00007fa224001190] mp4 demux warning: elst box found
[00007fa224001190] mp4 demux warning: STTS table of 1 entries
[00007fa224001190] mp4 demux warning: CTTS table of 78 entries
[00007fa224001190] mp4 demux warning: elst box found
[00007fa224001190] mp4 demux warning: STTS table of 1 entries
[00007fa224001190] mp4 demux warning: elst old=0 new=1
[00007fa224d19010] faad decoder warning: decoded zero sample
[00007fa224001190] mp4 demux warning: elst old=0 new=1
[00007fa214007030] gl gl: Initialized libplacebo v4.208.0 (API v208)
libva info: VA-API version 1.16.0
libva error: vaGetDriverNameByIndex() failed with unknown libva error, driver_name = (null)
[00007fa214007030] glconv_vaapi_x11 gl error: vaInitialize: unknown libva error
libva info: VA-API version 1.16.0
libva info: Trying to open /usr/lib64/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_16
libva info: va_openDriver() returns 0
[00007fa224c0b3a0] avcodec decoder: Using Mesa Gallium driver 23.0.0-devel for AMD Radeon RX 6800M (navi22, LLVM 15.0.4, DRM 3.42, 5.14.0-rc4-14-7cbe08a930a132d84b4cf79953b00b074ec7a2a7+) for hardware decoding
[h264 @ 0x7fa224c3fa40] Using deprecated struct vaapi_context in decode.
[0000561f72174de0] pulse audio output warning: starting late (-9724 us)
And with commit 46dd2965bdd1c5a4f6499c73ff32e636fa8f9769 the player
does not use acceleration:
$ vlc -v Downloads/test_sample_480_2.mp4
VLC media player 3.0.18 Vetinari (revision )
[000055f61ad35520] main libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
[00007fc7e8001190] mp4 demux warning: elst box found
[00007fc7e8001190] mp4 demux warning: STTS table of 1 entries
[00007fc7e8001190] mp4 demux warning: CTTS table of 78 entries
[00007fc7e8001190] mp4 demux warning: elst box found
[00007fc7e8001190] mp4 demux warning: STTS table of 1 entries
[00007fc7e8001190] mp4 demux warning: elst old=0 new=1
[00007fc7e8d19010] faad decoder warning: decoded zero sample
[00007fc7e8001190] mp4 demux warning: elst old=0 new=1
[00007fc7d8007030] gl gl: Initialized libplacebo v4.208.0 (API v208)
libva info: VA-API version 1.16.0
libva error: vaGetDriverNameByIndex() failed with unknown libva error, driver_name = (null)
[00007fc7d8007030] glconv_vaapi_x11 gl error: vaInitialize: unknown libva error
libva info: VA-API version 1.16.0
libva info: Trying to open /usr/lib64/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_16
libva info: va_openDriver() returns 0
[00007fc7d40b3260] vaapi generic error: profile(7) is not supported
[00007fc7d8a089c0] gl gl: Initialized libplacebo v4.208.0 (API v208)
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
[00007fc7d89e4f80] gl gl: Initialized libplacebo v4.208.0 (API v208)
[000055f61ae12de0] pulse audio output warning: starting late (-13537 us)
So my bisect didn't make sense :(
Anyway, can you reproduce the issue with the attached sample file and
vlc on a fresh kernel (6.1-rc8)?
Thanks!
--
Best Regards,
Mike Gavrilov.
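The ip_discovery check from the message above can be repeated for every card at once. This is a minimal Python sketch under stated assumptions: the sysfs location is the one named in the thread, find_ip_discovery is a hypothetical helper name, and the only thing it reports is whether the directory exists (it does not read any harvest settings).

```python
from pathlib import Path


def find_ip_discovery(sysfs_root: str = "/sys/class/drm") -> dict:
    """Map each cardN entry to True if device/ip_discovery exists."""
    result = {}
    for card in sorted(Path(sysfs_root).glob("card[0-9]*")):
        # Skip connector nodes such as card0-DP-1.
        if "-" in card.name:
            continue
        result[card.name] = (card / "device" / "ip_discovery").is_dir()
    return result


if __name__ == "__main__":
    for card, present in find_ip_discovery().items():
        state = "present" if present else "absent"
        print(f"{card}: ip_discovery {state}")
```

On the two kernels tested in the message above, a check like this would be expected to report the directory as absent for both cards, which is why the harvest settings could not be read there.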
* Re: [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
From: Alex Deucher @ 2022-12-07 20:54 UTC
To: Mikhail Gavrilov, Leo Liu, Thong Thai
Cc: Deucher, Alexander, James.Zhu, amd-gfx list, Chen, Guchun
+ Leo, Thong
On Wed, Dec 7, 2022 at 3:43 PM Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote:
> So my bisect didn't make sense :(
> Anyway, can you reproduce the issue with the attached sample file and
> vlc on a fresh kernel (6.1-rc8)?
* Re: [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
From: Leo Liu @ 2022-12-09 14:37 UTC
To: Alex Deucher, Mikhail Gavrilov, Thong Thai
Cc: Deucher, Alexander, James.Zhu, amd-gfx list, Chen, Guchun
Please try the latest AMDGPU driver:
https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next/
On 2022-12-07 15:54, Alex Deucher wrote:
> + Leo, Thong
* Re: [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
From: Mikhail Gavrilov @ 2023-02-17 6:09 UTC
To: Leo Liu
Cc: Chen, Guchun, Thong Thai, amd-gfx list, Deucher, Alexander, Alex Deucher, James.Zhu
On Fri, Dec 9, 2022 at 7:37 PM Leo Liu <leo.liu@amd.com> wrote:
> Please try the latest AMDGPU driver:
>
> https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next/
Sorry Leo, I missed your message.
This issue is still present in 6.2-rc8.
I was mistaken in my first message:
> Before kernel 5.16 this only led to an artifact in the form of
> a green bar at the top of the screen, then starting from 5.17
> the GPU began to freeze.
The real behaviour before 5.18:
- vlc could play the video with small artifacts in the form of a green
bar at the top of the video
- after playback, the vlc process exited correctly
On 5.18 this behaviour changed:
- vlc shows a black screen instead of playing the video
- after playback, the process does not exit
- if I try to kill the vlc process with 'kill -9', vlc becomes a zombie
process and many other processes start to hang (the following lines
appear in the kernel log after 2 minutes)
INFO: task vlc:sh8:5248 blocked for more than 122 seconds.
Tainted: G W L -------- --- 5.18.0-60.fc37.x86_64+debug #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:vlc:sh8 state:D stack:13616 pid: 5248 ppid: 1934 flags:0x00004006
Call Trace:
<TASK>
__schedule+0x492/0x1650
? _raw_spin_unlock_irqrestore+0x40/0x60
? debug_check_no_obj_freed+0x12d/0x250
schedule+0x4e/0xb0
schedule_timeout+0xe1/0x120
? lock_release+0x215/0x460
? trace_hardirqs_on+0x1a/0xf0
? _raw_spin_unlock_irqrestore+0x40/0x60
dma_fence_default_wait+0x197/0x240
? __bpf_trace_dma_fence+0x10/0x10
dma_fence_wait_timeout+0x229/0x260
drm_sched_entity_fini+0x101/0x270 [gpu_sched]
amdgpu_vm_fini+0x2b5/0x460 [amdgpu]
? idr_destroy+0x70/0xb0
? mutex_destroy+0x1e/0x50
amdgpu_driver_postclose_kms+0x1ec/0x2c0 [amdgpu]
drm_file_free.part.0+0x20d/0x260
drm_release+0x6a/0x120
__fput+0xab/0x270
task_work_run+0x5c/0xa0
do_exit+0x394/0xc40
? rcu_read_lock_sched_held+0x10/0x70
do_group_exit+0x33/0xb0
get_signal+0xbbc/0xbc0
arch_do_signal_or_restart+0x30/0x770
? do_futex+0xfd/0x190
? __x64_sys_futex+0x63/0x190
exit_to_user_mode_prepare+0x172/0x270
syscall_exit_to_user_mode+0x16/0x50
do_syscall_64+0x67/0x80
? do_syscall_64+0x67/0x80
? rcu_read_lock_sched_held+0x10/0x70
? trace_hardirqs_on_prepare+0x5e/0x110
? do_syscall_64+0x67/0x80
? rcu_read_lock_sched_held+0x10/0x70
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f82c2364529
RSP: 002b:00007f8210ff8c00 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f82c2364529
RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007f823022542c
RBP: 00007f8210ff8c30 R08: 0000000000000000 R09: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000001 R15: 00007f823022542c
</TASK>
INFO: lockdep is turned off.
I bisected this issue, and the problematic commit is:
❯ git bisect bad
5f3854f1f4e211f494018160b348a1c16e58013f is the first bad commit
commit 5f3854f1f4e211f494018160b348a1c16e58013f
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Thu Mar 24 18:04:00 2022 -0400
drm/amdgpu: add more cases to noretry=1
Port current list from amd-staging-drm-next.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 +++
1 file changed, 3 insertions(+)
Unfortunately I couldn't simply revert this commit on 6.2-rc8 to
check, because the revert leads to conflicts.
Alex, as the author of this commit, could you help me with it?
--
Best Regards,
Mike Gavrilov.
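In a kernel call trace, entries prefixed with '?' are addresses the unwinder found on the stack but could not confirm as part of the call chain. Filtering them out makes the blocking path in the hung-task report easier to read. The Python sketch below is only an illustration: reliable_frames is a hypothetical helper, the regular expression is an assumption about the "symbol+0xoff/0xsize" frame format seen in the trace above, and SAMPLE is an excerpt of that trace.

```python
import re

# Matches "symbol+0xOFF/0xSIZE", optionally preceded by the "? " marker.
FRAME = re.compile(r"^(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_frames(trace_lines):
    """Return function names of frames not marked with a leading '?'."""
    frames = []
    for line in trace_lines:
        m = FRAME.match(line.strip())
        if m and not m.group(1):
            frames.append(m.group(2))
    return frames


# Excerpt from the hung-task trace in the message above.
SAMPLE = """\
schedule_timeout+0xe1/0x120
? lock_release+0x215/0x460
dma_fence_default_wait+0x197/0x240
? __bpf_trace_dma_fence+0x10/0x10
dma_fence_wait_timeout+0x229/0x260
drm_sched_entity_fini+0x101/0x270 [gpu_sched]
amdgpu_vm_fini+0x2b5/0x460 [amdgpu]
""".splitlines()

if __name__ == "__main__":
    # Callee first, so "<-" reads as "called by".
    print(" <- ".join(reliable_frames(SAMPLE)))
```

Applied to the excerpt, the remaining frames show the exiting task sleeping in a dma_fence wait that was entered from drm_sched_entity_fini, which in turn was called from amdgpu_vm_fini.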
* Re: [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
From: Alex Deucher @ 2023-02-17 15:29 UTC
To: Mikhail Gavrilov
Cc: Chen, Guchun, Thong Thai, amd-gfx list, Deucher, Alexander, James.Zhu, Leo Liu
On Fri, Feb 17, 2023 at 1:10 AM Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote:
> Unfortunately I couldn't simply revert this commit on 6.2-rc8 to
> check, because the revert leads to conflicts.
>
> Alex, as the author of this commit, could you help me with it?
Append amdgpu.noretry=0 to the kernel command line in grub.
Alex
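After rebooting with the suggested option, it can be useful to confirm that the running kernel actually received it. The following Python sketch assumes a Linux system where /proc/cmdline is readable; cmdline_param is a hypothetical helper, and returning the last occurrence when the option is given more than once is a simplification made for this sketch.

```python
def cmdline_param(cmdline: str, name: str):
    """Return the last value given for `name` on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            value = val
    return value


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print("amdgpu.noretry =", cmdline_param(f.read(), "amdgpu.noretry"))
    except OSError:
        print("could not read /proc/cmdline")
```

A result of None means the option was not passed at boot, in which case the driver's default applies.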
* Re: [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
2023-02-17 15:29 ` Alex Deucher
@ 2023-02-17 21:50 ` Mikhail Gavrilov
2023-02-27 21:10 ` Alex Deucher
0 siblings, 1 reply; 9+ messages in thread
From: Mikhail Gavrilov @ 2023-02-17 21:50 UTC (permalink / raw)
To: Alex Deucher
Cc: Chen, Guchun, Thong Thai, amd-gfx list, Deucher, Alexander, James.Zhu, Leo Liu
[-- Attachment #1: Type: text/plain, Size: 5273 bytes --]
On Fri, Feb 17, 2023 at 8:30 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Fri, Feb 17, 2023 at 1:10 AM Mikhail Gavrilov
> <mikhail.v.gavrilov@gmail.com> wrote:
> >
> > On Fri, Dec 9, 2022 at 7:37 PM Leo Liu <leo.liu@amd.com> wrote:
> > >
> > > Please try the latest AMDGPU driver:
> > >
> > > https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next/
> > >
> >
> > Sorry Leo, I miss your message.
> > This issue is still actual for 6.2-rc8.
> >
> > In my first message I was mistaken.
> >
> > > Before kernel 5.16 this only led to an artifact in the form of
> > > a green bar at the top of the screen, then starting from 5.17
> > > the GPU began to freeze.
> >
> > The real behaviour before 5.18:
> > - vlc could plays video with small artifacts in the form of a green
> > bar on top of the video
> > - after playing video process vlc correctly exiting
> >
> > On 5.18 this behaviour changed:
> > - vlc show black screen instead of playing video
> > - after playing the process not exiting
> > - if I tries kill vlc process with 'kill -9' vlc became zombi process
> > and many other processes start hangs (in kernel log appears follow
> > lines after 2 minutes)
> >
> > INFO: task vlc:sh8:5248 blocked for more than 122 seconds.
> > Tainted: G W L -------- --- 5.18.0-60.fc37.x86_64+debug #1
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > task:vlc:sh8 state:D stack:13616 pid: 5248 ppid: 1934 flags:0x00004006
> > Call Trace:
> > <TASK>
> > __schedule+0x492/0x1650
> > ? _raw_spin_unlock_irqrestore+0x40/0x60
> > ? debug_check_no_obj_freed+0x12d/0x250
> > schedule+0x4e/0xb0
> > schedule_timeout+0xe1/0x120
> > ? lock_release+0x215/0x460
> > ? trace_hardirqs_on+0x1a/0xf0
> > ? _raw_spin_unlock_irqrestore+0x40/0x60
> > dma_fence_default_wait+0x197/0x240
> > ? __bpf_trace_dma_fence+0x10/0x10
> > dma_fence_wait_timeout+0x229/0x260
> > drm_sched_entity_fini+0x101/0x270 [gpu_sched]
> > amdgpu_vm_fini+0x2b5/0x460 [amdgpu]
> > ? idr_destroy+0x70/0xb0
> > ? mutex_destroy+0x1e/0x50
> > amdgpu_driver_postclose_kms+0x1ec/0x2c0 [amdgpu]
> > drm_file_free.part.0+0x20d/0x260
> > drm_release+0x6a/0x120
> > __fput+0xab/0x270
> > task_work_run+0x5c/0xa0
> > do_exit+0x394/0xc40
> > ? rcu_read_lock_sched_held+0x10/0x70
> > do_group_exit+0x33/0xb0
> > get_signal+0xbbc/0xbc0
> > arch_do_signal_or_restart+0x30/0x770
> > ? do_futex+0xfd/0x190
> > ? __x64_sys_futex+0x63/0x190
> > exit_to_user_mode_prepare+0x172/0x270
> > syscall_exit_to_user_mode+0x16/0x50
> > do_syscall_64+0x67/0x80
> > ? do_syscall_64+0x67/0x80
> > ? rcu_read_lock_sched_held+0x10/0x70
> > ? trace_hardirqs_on_prepare+0x5e/0x110
> > ? do_syscall_64+0x67/0x80
> > ? rcu_read_lock_sched_held+0x10/0x70
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7f82c2364529
> > RSP: 002b:00007f8210ff8c00 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> > RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f82c2364529
> > RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007f823022542c
> > RBP: 00007f8210ff8c30 R08: 0000000000000000 R09: 00000000ffffffff
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > R13: 0000000000000000 R14: 0000000000000001 R15: 00007f823022542c
> > </TASK>
> > INFO: lockdep is turned off.
> >
> > I bisected this issue and problematic commit is
> >
> > ❯ git bisect bad
> > 5f3854f1f4e211f494018160b348a1c16e58013f is the first bad commit
> > commit 5f3854f1f4e211f494018160b348a1c16e58013f
> > Author: Alex Deucher <alexander.deucher@amd.com>
> > Date: Thu Mar 24 18:04:00 2022 -0400
> >
> > drm/amdgpu: add more cases to noretry=1
> >
> > Port current list from amd-staging-drm-next.
> >
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> >
> > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > Unfortunately I couldn't simply revert this commit on 6.2-rc8 for
> > checking, because it leads to conflicts.
> >
> > Alex, you as author of this commit could help me with it?
>
> append amdgpu.noretry=0 to the kernel command line in grub.

Thanks, I checked "amdgpu.noretry=0", and after the page fault
occurs vlc can play the video with small artifacts.

So I have some questions:

1. Why were retries disabled by default if they are still needed for
recoverable page faults? As Christian answered me before here:
https://lore.kernel.org/all/f253ff1f-3c5c-c785-1272-e4fe69a366ec@amd.com/T/#m73a0a6eb7b2531eacf24fd498e8d2eec675f05a6
Page faults (not to be confused with a kernel panic) are an absolutely
normal phenomenon for a buggy userspace. And if they are "normal", I
would prefer that they not affect system reliability. But as we can
see, they lead to zombie processes followed by hangs.

2. If recoverable page faults are not an option, is it possible to
fix this issue some other way or not?

P.S. I also see page faults in other scenarios (for example when
playing "Division 2" or "The Callisto Protocol"; I attached my
kernel log to show it), but they do not lead to zombie processes.

--
Best Regards,
Mike Gavrilov.

[-- Attachment #2: dmesg.tar.xz --]
[-- Type: application/x-xz, Size: 35188 bytes --]

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.
2023-02-17 21:50 ` Mikhail Gavrilov
@ 2023-02-27 21:10 ` Alex Deucher
0 siblings, 0 replies; 9+ messages in thread
From: Alex Deucher @ 2023-02-27 21:10 UTC (permalink / raw)
To: Mikhail Gavrilov, Kuehling, Felix
Cc: Chen, Guchun, Thong Thai, amd-gfx list, Deucher, Alexander, James.Zhu, Leo Liu

+ Felix

On Fri, Feb 17, 2023 at 4:50 PM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Fri, Feb 17, 2023 at 8:30 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Fri, Feb 17, 2023 at 1:10 AM Mikhail Gavrilov
> > <mikhail.v.gavrilov@gmail.com> wrote:
> > >
> > > On Fri, Dec 9, 2022 at 7:37 PM Leo Liu <leo.liu@amd.com> wrote:
> > > >
> > > > Please try the latest AMDGPU driver:
> > > >
> > > > https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next/
> > > >
> > >
> > > Sorry Leo, I miss your message.
> > > This issue is still actual for 6.2-rc8.
> > >
> > > In my first message I was mistaken.
> > >
> > > > Before kernel 5.16 this only led to an artifact in the form of
> > > > a green bar at the top of the screen, then starting from 5.17
> > > > the GPU began to freeze.
> > >
> > > The real behaviour before 5.18:
> > > - vlc could plays video with small artifacts in the form of a green
> > > bar on top of the video
> > > - after playing video process vlc correctly exiting
> > >
> > > On 5.18 this behaviour changed:
> > > - vlc show black screen instead of playing video
> > > - after playing the process not exiting
> > > - if I tries kill vlc process with 'kill -9' vlc became zombi process
> > > and many other processes start hangs (in kernel log appears follow
> > > lines after 2 minutes)
> > >
> > > INFO: task vlc:sh8:5248 blocked for more than 122 seconds.
> > > Tainted: G W L -------- --- 5.18.0-60.fc37.x86_64+debug #1
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > task:vlc:sh8 state:D stack:13616 pid: 5248 ppid: 1934 flags:0x00004006
> > > Call Trace:
> > > <TASK>
> > > __schedule+0x492/0x1650
> > > ? _raw_spin_unlock_irqrestore+0x40/0x60
> > > ? debug_check_no_obj_freed+0x12d/0x250
> > > schedule+0x4e/0xb0
> > > schedule_timeout+0xe1/0x120
> > > ? lock_release+0x215/0x460
> > > ? trace_hardirqs_on+0x1a/0xf0
> > > ? _raw_spin_unlock_irqrestore+0x40/0x60
> > > dma_fence_default_wait+0x197/0x240
> > > ? __bpf_trace_dma_fence+0x10/0x10
> > > dma_fence_wait_timeout+0x229/0x260
> > > drm_sched_entity_fini+0x101/0x270 [gpu_sched]
> > > amdgpu_vm_fini+0x2b5/0x460 [amdgpu]
> > > ? idr_destroy+0x70/0xb0
> > > ? mutex_destroy+0x1e/0x50
> > > amdgpu_driver_postclose_kms+0x1ec/0x2c0 [amdgpu]
> > > drm_file_free.part.0+0x20d/0x260
> > > drm_release+0x6a/0x120
> > > __fput+0xab/0x270
> > > task_work_run+0x5c/0xa0
> > > do_exit+0x394/0xc40
> > > ? rcu_read_lock_sched_held+0x10/0x70
> > > do_group_exit+0x33/0xb0
> > > get_signal+0xbbc/0xbc0
> > > arch_do_signal_or_restart+0x30/0x770
> > > ? do_futex+0xfd/0x190
> > > ? __x64_sys_futex+0x63/0x190
> > > exit_to_user_mode_prepare+0x172/0x270
> > > syscall_exit_to_user_mode+0x16/0x50
> > > do_syscall_64+0x67/0x80
> > > ? do_syscall_64+0x67/0x80
> > > ? rcu_read_lock_sched_held+0x10/0x70
> > > ? trace_hardirqs_on_prepare+0x5e/0x110
> > > ? do_syscall_64+0x67/0x80
> > > ? rcu_read_lock_sched_held+0x10/0x70
> > > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > RIP: 0033:0x7f82c2364529
> > > RSP: 002b:00007f8210ff8c00 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> > > RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f82c2364529
> > > RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007f823022542c
> > > RBP: 00007f8210ff8c30 R08: 0000000000000000 R09: 00000000ffffffff
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > > R13: 0000000000000000 R14: 0000000000000001 R15: 00007f823022542c
> > > </TASK>
> > > INFO: lockdep is turned off.
> > >
> > > I bisected this issue and problematic commit is
> > >
> > > ❯ git bisect bad
> > > 5f3854f1f4e211f494018160b348a1c16e58013f is the first bad commit
> > > commit 5f3854f1f4e211f494018160b348a1c16e58013f
> > > Author: Alex Deucher <alexander.deucher@amd.com>
> > > Date: Thu Mar 24 18:04:00 2022 -0400
> > >
> > > drm/amdgpu: add more cases to noretry=1
> > >
> > > Port current list from amd-staging-drm-next.
> > >
> > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > >
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > Unfortunately I couldn't simply revert this commit on 6.2-rc8 for
> > > checking, because it leads to conflicts.
> > >
> > > Alex, you as author of this commit could help me with it?
> >
> > append amdgpu.noretry=0 to the kernel command line in grub.
>
> Thanks, I checked "amdgpu.noretry=0", and after the page fault
> occurs vlc can play the video with small artifacts.
>
> So I have some questions:
>
> 1. Why were retries disabled by default if they are still needed for
> recoverable page faults? As Christian answered me before here:
> https://lore.kernel.org/all/f253ff1f-3c5c-c785-1272-e4fe69a366ec@amd.com/T/#m73a0a6eb7b2531eacf24fd498e8d2eec675f05a6
>

You don't actually want retry page faults, because for gfx apps,
nothing is going to page in the missing pages. The retry stuff is for
demand paging type scenarios and only certain GPUs (GFX9-based)
actually support the necessary semantics to make this work. Even then
it would only be useful in APIs which support demand paging. Right
now GFX APIs don't really do this.

> Page faults (not to be confused with a kernel panic) are an absolutely
> normal phenomenon for a buggy userspace. And if they are "normal", I
> would prefer that they not affect system reliability. But as we can
> see, they lead to zombie processes followed by hangs.
>

If you don't retry the fault, the kernel reports the fault, but the
engine should continue. Reads will return 0 and writes will be
dropped. So it shouldn't hang unless the page fault causes some
deadlock in the engine itself (e.g., due to the bogus data returned).

> 2. If recoverable page faults are not an option, is it possible to
> fix this issue some other way or not?

I think this is probably a bug somewhere in mesa, where the UMD has
the alignment wrong or some dependency between GFX and VCN is not
completing because of the page fault.

>
> P.S. I also see page faults in other scenarios (for example when
> playing "Division 2" or "The Callisto Protocol"; I attached my
> kernel log to show it), but they do not lead to zombie processes.

Right, that is the expected behavior when the fault is non-fatal.

Alex

^ permalink raw reply [flat|nested] 9+ messages in thread
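To tell the non-fatal case (faults reported, engine continues) from the fatal one (a task stuck waiting, as in the trace above), a saved kernel log can be scanned for both kinds of message. This is a minimal sketch under stated assumptions: the regular expressions are modelled only on the log lines quoted in this thread, the function name `summarize_log` is hypothetical, and other kernel versions may format these lines differently.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this thread (assumption:
# the format is the same in the log being scanned).
PAGE_FAULT = re.compile(r"amdgpu: \[(?P<hub>\w+)\] page fault")
FAULT_CLIENT = re.compile(r"Faulty UTCL2 client ID: (?P<client>\w+)")
HUNG_TASK = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def summarize_log(log_text):
    """Count amdgpu page faults and collect hung-task reports."""
    faults_by_hub = Counter()
    faults_by_client = Counter()
    hung_tasks = []
    for line in log_text.splitlines():
        match = PAGE_FAULT.search(line)
        if match:
            faults_by_hub[match.group("hub")] += 1
        match = FAULT_CLIENT.search(line)
        if match:
            faults_by_client[match.group("client")] += 1
        match = HUNG_TASK.search(line)
        if match:
            hung_tasks.append(
                (match.group("task"), int(match.group("pid")),
                 int(match.group("secs")))
            )
    return {
        "faults_by_hub": faults_by_hub,
        "faults_by_client": faults_by_client,
        "hung_tasks": hung_tasks,
    }
```

Page faults with an empty `hung_tasks` list would match the non-fatal behavior described above; faults from a VCN client together with a hung `vlc` task would match the failing case reported in this thread.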
end of thread, other threads:[~2023-02-27 21:10 UTC | newest]

Thread overview: 9+ messages
2022-12-07 14:44 [bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding Mikhail Gavrilov
2022-12-07 14:58 ` Alex Deucher
2022-12-07 20:43 ` Mikhail Gavrilov
2022-12-07 20:54 ` Alex Deucher
2022-12-09 14:37 ` Leo Liu
2023-02-17 6:09 ` Mikhail Gavrilov
2023-02-17 15:29 ` Alex Deucher
2023-02-17 21:50 ` Mikhail Gavrilov
2023-02-27 21:10 ` Alex Deucher