From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 102322] System crashes after "[drm] IP block:gmc_v8_0 is hung!"
/ [drm] IP block:sdma_v3_0 is hung!
Date: Tue, 21 Aug 2018 21:29:48 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0161517789=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[131.252.210.165])
by gabe.freedesktop.org (Postfix) with ESMTP id A2EDD6E16F
for ; Tue, 21 Aug 2018 21:29:48 +0000 (UTC)
In-Reply-To:
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============0161517789==
Content-Type: multipart/alternative; boundary="15348869884.4CBC79.16641"
Content-Transfer-Encoding: 7bit
--15348869884.4CBC79.16641
Date: Tue, 21 Aug 2018 21:29:48 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=102322
--- Comment #57 from Andrey Grodzovsky ---
(In reply to dwagner from comment #56)
> (In reply to Andrey Grodzovsky from comment #55)
> > > In above attached file "xz-compressed output of gpu_debug3.sh" there is umr
> > > output at the time of the crash (238 seconds after the reboot):
> > >
> > > ----------------------------------------------
> > > ...
> > > mpv/vo-897 [005] .... 235.191542: dma_fence_wait_start:
> > > driver=drm_sched timeline=gfx context=162 seqno=87
> > > mpv/vo-897 [005] d... 235.191548: dma_fence_enable_signal:
> > > driver=drm_sched timeline=gfx context=162 seqno=87
> > > kworker/0:2-92 [000] .... 238.275988: dma_fence_signaled:
> > > driver=amdgpu timeline=sdma1 context=11 seqno=210
> > > kworker/0:2-92 [000] .... 238.276004: dma_fence_signaled:
> > > driver=amdgpu timeline=sdma1 context=11 seqno=211
> > > [ 238.180634] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0
> > > timeout, signaled seq=32624, emitted seq=32626
> > > [ 238.180641] amdgpu 0000:0a:00.0: GPU reset begin!
> > > [ 238.180641] amdgpu 0000:0a:00.0: GPU reset begin!
> > >
> > > crash detected!
> > >
> > > executing umr -O halt_waves -wa
> > > No active waves!
> >
> > Did you use the amdgpu.vm_fault_stop=2 parameter? If a fault happened, that
> > should have frozen the GPU's compute units, and hence the above command
> > would produce a lot of wave info.
>
> Yes I did, as can be seen from the kernel command line at the very beginning
> of the file I attached:
> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-linux_amd
> root=UUID=b5d56e15-18f3-4783-af84-bbff3bbff3ef rw
> cryptdevice=/dev/nvme0n1p2:root:allow-discards libata.force=1.5 video=DP-1:d
> video=DVI-D-1:d video=HDMI-A-1:1024x768 amdgpu.dc=1 amdgpu.vm_update_mode=0
> amdgpu.dpm=-1 amdgpu.ppfeaturemask=0xffffffff amdgpu.vm_fault_stop=2
> amdgpu.vm_debug=1
>
> Could the "amdgpu 0000:0a:00.0: GPU reset begin!" message indicate a
> procedure that discards whatever has been in those "waves" before? If yes,
> could amdgpu.gpu_recovery=0 prevent that from happening?
Yes, missed that one. No resets.
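
For anyone repeating this capture, here is a minimal sketch of a pre-flight check
for the two settings discussed above (amdgpu.vm_fault_stop=2 and
amdgpu.gpu_recovery=0). The helper names amdgpu_params and
missing_for_wave_dump are hypothetical and not part of any existing tool, and
the sample command line is the one quoted in comment #56. It only inspects
the boot parameters; it does not query the driver.

```python
# Sketch only: check a kernel command line for the amdgpu options that this
# thread suggests are needed before "umr -O halt_waves -wa" can show waves.
# Function names are hypothetical; the option names come from the thread.

CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-linux_amd "
    "root=UUID=b5d56e15-18f3-4783-af84-bbff3bbff3ef rw "
    "cryptdevice=/dev/nvme0n1p2:root:allow-discards libata.force=1.5 "
    "video=DP-1:d video=DVI-D-1:d video=HDMI-A-1:1024x768 amdgpu.dc=1 "
    "amdgpu.vm_update_mode=0 amdgpu.dpm=-1 amdgpu.ppfeaturemask=0xffffffff "
    "amdgpu.vm_fault_stop=2 amdgpu.vm_debug=1"
)


def amdgpu_params(cmdline):
    """Return a dict of the amdgpu.* options found on a kernel command line."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key.startswith("amdgpu."):
            params[key[len("amdgpu."):]] = value
    return params


def missing_for_wave_dump(cmdline):
    """List the options from this thread that are absent or set differently."""
    wanted = {"vm_fault_stop": "2", "gpu_recovery": "0"}
    found = amdgpu_params(cmdline)
    return [
        "amdgpu.%s=%s" % (name, value)
        for name, value in wanted.items()
        if found.get(name) != value
    ]


if __name__ == "__main__":
    # On a live system, the contents of /proc/cmdline could be passed instead.
    print(missing_for_wave_dump(CMDLINE))
```

Run against the command line from comment #56, this reports that only
amdgpu.gpu_recovery=0 is missing.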
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15348869884.4CBC79.16641--
--===============0161517789==--