From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 107572] Unrecoverable GPU hang with IP block:gfx_v8_0 is hung
Date: Tue, 14 Aug 2018 23:45:33 +0000
Message-ID:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1253971646=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[131.252.210.165])
by gabe.freedesktop.org (Postfix) with ESMTP id 23E076E02A
for ; Tue, 14 Aug 2018 23:45:33 +0000 (UTC)
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============1253971646==
Content-Type: multipart/alternative; boundary="15342903330.FC2aC9D.32560"
Content-Transfer-Encoding: 7bit
--15342903330.FC2aC9D.32560
Date: Tue, 14 Aug 2018 23:45:33 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=107572
Bug ID: 107572
Summary: Unrecoverable GPU hang with IP block:gfx_v8_0 is hung
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: madcatx@atlas.cz
Hello,
I have been experiencing a worrying number of these hangs ever since I got my RX 570
a few months ago. I can reproduce the hang quite reliably with some 3D
workloads; for instance, both Unigine Superposition run on High quality and
Witcher 3 (through WINE) crash the GPU within minutes.
Once that happens I can always SSH into the machine and try to get at least
some debugging information. Unfortunately, there does not seem to be much to go
on.
dmesg does not tell me more than this:
[ 254.704581] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
last signaled seq=103742, last emitted seq=103745
[ 254.704586] [drm] IP block:gfx_v8_0 is hung!
[ 254.704629] [drm] GPU recovery disabled.
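The gap between the two sequence numbers shows how many submitted jobs had not completed when the timeout fired. A minimal Python sketch of pulling those numbers out of a log, assuming the message wording shown above; the function name and the regular expression are illustrative and not part of amdgpu or umr:

```python
import re

# Assumed message format, taken from the dmesg excerpt in this report;
# other kernel versions may word the message differently.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout,\s*"
    r"last signaled seq=(?P<signaled>\d+),\s*"
    r"last emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(log_text):
    """Return (ring, signaled, emitted, pending), or None if nothing matches."""
    match = TIMEOUT_RE.search(log_text)
    if match is None:
        return None
    signaled = int(match.group("signaled"))
    emitted = int(match.group("emitted"))
    # pending = jobs emitted to the ring that were never signaled as done
    return match.group("ring"), signaled, emitted, emitted - signaled


sample = (
    "[ 254.704581] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,\n"
    "last signaled seq=103742, last emitted seq=103745"
)
print(parse_ring_timeout(sample))
```

For the excerpt above this reports the gfx ring with three jobs emitted but not signaled.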
Here are a few things I have tried so far:
- Boot with amdgpu.dc=0
- Boot with amdgpu.vm_update_mode=3
- Force the GPU to max power state
- Disable IOMMU (both by iommu=off and by disabling VT-d in BIOS)
- Boot with amdgpu.gpu_recovery=1 (does not produce any additional info)
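To confirm that boot options like these actually took effect, the running values can be read back from sysfs. A minimal Python sketch, assuming the amdgpu module exposes its parameters under /sys/module/amdgpu/parameters (the usual location for loadable module parameters); the helper name is hypothetical:

```python
from pathlib import Path


def read_module_params(names, base="/sys/module/amdgpu/parameters"):
    """Map each parameter name to its current value, or None if unreadable."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Parameter missing on this kernel, or not readable by this user.
            values[name] = None
    return values


if __name__ == "__main__":
    wanted = ["dc", "vm_update_mode", "gpu_recovery"]
    for name, value in read_module_params(wanted).items():
        print(f"{name} = {value}")
```

A value of None means the file could not be read, which is worth distinguishing from a parameter that is merely set to its default.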
I grabbed the umr tool to try to get the state of the GPU when it crashes, but
it does not seem to be able to read anything. Running:
umr -R gfx[.]
Leaves me with:
[ERROR]: Could not open ring debugfs file#
I checked that the entries in /sys/kernel/debug/amdgpu that look relevant are there,
but cat'ing them gives me "Operation not permitted". Yes, I am doing it as root.
Once this happens the only way out is a hard reboot.
I am running up-to-date Fedora 28, kernel 4.17.2, Mesa 18.0 series, LLVM 6.0.1.
Is there anything else I can do?
Thanks.
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15342903330.FC2aC9D.32560--
--===============1253971646==--