From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 108118] AMDGPU sometimes hangs forever when running graphical applications
Date: Tue, 02 Oct 2018 01:27:29 +0000
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=108118

Bug ID: 108118
Summary: AMDGPU sometimes hangs forever when running graphical applications
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: duooratar@gmail.com

Sometimes when running a graphical application the display will freeze up but
system sound will continue. The machine is still functioning and is accessible
over ssh. Using ssh the following related messages are found in the output of
dmesg:

[drm:amdgpu_job_timeout [amdgpu]] *ERROR* ring gfx timeout, signalled seq=1256221, emitted seq=1256223
amdgpu 0000:10:00.0 GPU reset begin!
[drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:43:crtc-0] hw_done or flip_done timed out
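
For triage, the two sequence numbers in the ring timeout message can be compared mechanically: the difference is the number of fences that were emitted but never signalled. The following is a minimal sketch, assuming the message wording quoted in this report (other kernel versions may phrase it differently); the function name parse_ring_timeout is hypothetical and not part of any driver or tool.

```python
import re

# Matches the ring-timeout message as quoted in this report.
# Assumption: the wording "signalled seq=..., emitted seq=..." is stable;
# other kernel versions may print a different message.
TIMEOUT_RE = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,\s+signalled\s+seq=(?P<signalled>\d+),"
    r"\s+emitted\s+seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return ring name, both sequence numbers and their difference, or None."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    signalled = int(match.group("signalled"))
    emitted = int(match.group("emitted"))
    return {
        "ring": match.group("ring"),
        "signalled": signalled,
        "emitted": emitted,
        # Fences submitted to the ring that had not completed at timeout.
        "pending": emitted - signalled,
    }
```

Applied to the message above, this reports ring gfx with two fences still pending (1256223 emitted, 1256221 signalled).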

This is the output with amdgpu.gpu_recovery=1 set on the kernel command line.
Without that the output is the same except the last two messages are replaced
with a message about GPU recovery being disabled. Any attempt to access
/sys/kernel/debug/dri/0/amdgpu_gpu_recover in either state hangs forever.
Magic SysRq keys still work and processes can be killed over SSH, but killing
the game/Xorg/etc. will not cause the display to start working again; a reset
is required.
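
To confirm which of the two states a given boot was in, the amdgpu.gpu_recovery setting can be read back from the kernel command line. The following is a minimal sketch; the helper name gpu_recovery_setting is hypothetical, and treating the last occurrence of the parameter as the effective one is an assumption.

```python
def gpu_recovery_setting(cmdline):
    """Return the amdgpu.gpu_recovery value from a kernel command line.

    Returns None when the parameter is absent, meaning the driver default
    applies.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "amdgpu.gpu_recovery" and sep:
            # Assumption: if the parameter is repeated, the last one wins.
            value = val
    return value
```

On a running system the string to pass in can be read from /proc/cmdline, which still works over ssh after the display has frozen.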

This has been observed with both Xorg and KDE's Wayland compositor.

This has only been observed with Vulkan applications (Dota 2's native Vulkan
mode and DXVK-backed Wine games), but it has not been ruled out for non-Vulkan
applications.

This was observed with the Mesa Vulkan driver (libvulkan_radeon.so). I couldn't
confirm the behavior with AMDVLK because graphical applications failed to
launch with it installed. I don't believe the GPU in question is faulty: I've
used it for long periods of time on Windows, where it's rock stable.

Observed on kernels 4.18.11 and 4.19.0-rc6. Searching around the Internet
suggests this may have started with 4.18.0 but I haven't confirmed that yet.

Hardware:
CPU: AMD R7 1800X
GPU: AMD RX Vega 64


You are receiving this mail because:
  • You are the assignee for the bug.