From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 105733] Amdgpu randomly hangs and only ssh works. Mouse cursor
moves sometimes but does nothing. Keyboard stops working.
Date: Sun, 04 Nov 2018 01:19:18 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1626257497=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[131.252.210.165])
by gabe.freedesktop.org (Postfix) with ESMTP id F2EC96E01B
for ; Sun, 4 Nov 2018 01:19:34 +0000 (UTC)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============1626257497==
Content-Type: multipart/alternative; boundary="15412943741.e7ecaAa2.30172"
Content-Transfer-Encoding: 7bit
--15412943741.e7ecaAa2.30172
Date: Sun, 4 Nov 2018 01:19:34 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=105733
--- Comment #40 from John W. ---
Is there any resolution, or is any work being done on this issue?
I've tried the frequency hack, and it slightly delayed the issue.
I also tried the latest amd staging kernel with the latest firmware and XF86 driver
and found the same issue still happened, but somewhat less often. Reading my
journalctl logs, I found that sometimes when it occurs it will attempt to recover, but
in the process it loses VRAM and freezes the screen covered in odd colors.
At least when this occurs the machine is otherwise functional and I can change
TTYs and kill X11.
I'm using a 580 and I've added the relevant logs of the attempted recovery.
Nov 02 15:31:26 Towering-DG kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=59193, emitted seq=59194
Nov 02 15:31:27 Towering-DG kernel: amdgpu 0000:01:00.0: GPU reset begin!
Nov 02 15:31:27 Towering-DG kernel: amdgpu 0000:01:00.0: GPU pci config reset
Nov 02 15:31:27 Towering-DG kernel: amdgpu 0000:01:00.0: GPU reset succeeded, trying to resume
Nov 02 15:31:27 Towering-DG kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400300000).
Nov 02 15:31:27 Towering-DG kernel: [drm:amdgpu_device_gpu_recover [amdgpu]] *ERROR* VRAM is lost!
Nov 02 15:31:27 Towering-DG kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.2.1 test failed (-110)
(Note: Usually it's ring SDMA0 instead of SDMA1 and occasionally GFX)
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15412943741.e7ecaAa2.30172--
--===============1626257497==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
--===============1626257497==--