From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 105733] Amdgpu randomly hangs and only ssh works. Mouse cursor
moves sometimes but does nothing. Keyboard stops working.
Date: Sun, 25 Mar 2018 16:52:20 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1438590327=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[131.252.210.165])
by gabe.freedesktop.org (Postfix) with ESMTP id 681DE6E42A
for ; Sun, 25 Mar 2018 16:52:20 +0000 (UTC)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============1438590327==
Content-Type: multipart/alternative; boundary="15219967400.CFaAe.18910"
Content-Transfer-Encoding: 7bit
--15219967400.CFaAe.18910
Date: Sun, 25 Mar 2018 16:52:20 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=105733
--- Comment #2 from Allan ---
I tried getting all the binaries available here:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu
Even though I included the polaris binaries in the kernel, some binaries were
missing (exactly the ones that were required...).
I had seen that before, but since it sometimes worked, I just assumed that
some other binary was being used instead.
Well... I launched Unigine Valley as a test and now the problem is even worse:
[From dmesg]
```
[ 517.630633] amdgpu 0000:0e:00.0: GPU fault detected: 147 0x00004802
[ 517.630636] amdgpu 0000:0e:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[ 517.630638] amdgpu 0000:0e:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08048002
[ 517.630640] amdgpu 0000:0e:00.0: VM fault (0x02, vmid 4) at page 0, read from 'TC4' (0x54433400) (72)
[ 517.630644] amdgpu 0000:0e:00.0: GPU fault detected: 147 0x00004802
[ 517.630645] amdgpu 0000:0e:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[ 517.630646] amdgpu 0000:0e:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08084002
[ 517.630648] amdgpu 0000:0e:00.0: VM fault (0x02, vmid 4) at page 0, read from 'TC7' (0x54433700) (132)
```
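A note on reading the "VM fault" lines: the hex value printed after the quoted client name looks like that same name packed as ASCII bytes. This is an assumption drawn from the two values in the log, not from driver documentation. A minimal Python sketch to check it, with `decode_client_tag` being a hypothetical helper name:

```python
def decode_client_tag(value: int) -> str:
    """Interpret a 32-bit value as up to four ASCII characters.

    Assumption: the value is the client name stored big-endian and
    padded with trailing NUL bytes, as the two log lines suggest.
    """
    raw = value.to_bytes(4, "big")           # e.g. b"TC4\x00"
    return raw.rstrip(b"\x00").decode("ascii")


# The two values taken from the dmesg output above.
print(decode_client_tag(0x54433400))  # TC4
print(decode_client_tag(0x54433700))  # TC7
```

Both values decode to the names already shown in quotes, so the hex field adds no extra information here.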
The symptoms and reactions are the same as above. I captured the output over ssh
because only the cursor was moving and nothing else was working.
So ... did my card die or is it a bug?
By the way ... I also have an RX580, and the problem I first described was
happening on it too. (I had not tried forcing the binaries before.)
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15219967400.CFaAe.18910--
--===============1438590327==--