From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 102322] System crashes after "[drm] IP block:gmc_v8_0 is hung!" / [drm] IP block:sdma_v3_0 is hung!
Date: Thu, 28 Jun 2018 21:09:09 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1963752947=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[IPv6:2610:10:20:722:a800:ff:fe98:4b55])
by gabe.freedesktop.org (Postfix) with ESMTP id 09CAA6E012
for ; Thu, 28 Jun 2018 21:09:09 +0000 (UTC)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============1963752947==
Content-Type: multipart/alternative; boundary="15302201480.2d5EDF30E.17675"
Content-Transfer-Encoding: 7bit
--15302201480.2d5EDF30E.17675
Date: Thu, 28 Jun 2018 21:09:08 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=102322
--- Comment #19 from Andrey Grodzovsky ---
Can you use addr2line or gdb with the 'list' command to give the line number
matching

(In reply to dwagner from comment #18)
> The good news: So far no crashes during normal uptime with
> amdgpu.vm_update_mode=3
>
> The bad news: System crashes immediately upon S3 resume (with messages quite
> different from the ones I saw with earlier S3-resume crashes) - I filed bug
> report https://bugs.freedesktop.org/show_bug.cgi?id=107065 on this.
>
> (In reply to Andrey Grodzovsky from comment #17)
> > dwagner, this is obviously just a work around and not a fix. It points to
> > some problem with SDMA packets, if you want to continue exploring we can try
> > to dump some fence traces and SDMA HW ring content to examine the latest
> > packets before the hang happened.
>
> If you can include some debug output into "amd-staging-drm-next" that helps
> finding the root cause, I might be able to provide some output - if the
> kernel survives long enough after the crash to write the system journal -
> this has not always been the case.
No need to recompile, we just need to see the content of the SDMA ring buffer
when the hang occurs.
Clone and build our register analyzer from here -
https://cgit.freedesktop.org/amd/umr/ and once the hang happens just run

sudo umr -lb
sudo umr -R gfx[.]
sudo umr -R sdma0[.]
sudo umr -R sdma1[.]

I will probably need more info later but let's try this first.
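
Since the machine may not stay up for long after the hang, the four commands
above can be wrapped so that their output lands in files straight away. This
is only a minimal sketch and not part of umr: the function name
capture_umr_dumps, the UMR and OUT_DIR variables and the output file names are
made up for illustration, while the umr flags are copied unchanged from the
commands above. The ring argument is quoted so the shell cannot expand gfx[.]
as a glob pattern.

```shell
#!/bin/sh
# Hypothetical helper: save the requested umr output to files after a hang.
# UMR and OUT_DIR are illustrative names; override them as needed.
UMR="${UMR:-sudo umr}"
OUT_DIR="${OUT_DIR:-./umr-dump-$(date +%Y%m%d-%H%M%S)}"

capture_umr_dumps() {
    mkdir -p "$OUT_DIR" || return 1

    # Same invocation as "sudo umr -lb" in the message.
    $UMR -lb > "$OUT_DIR/umr-lb.txt" 2>&1

    # Same invocations as "sudo umr -R <ring>[.]" in the message.
    for ring in gfx sdma0 sdma1; do
        # Quoted so that [.] reaches umr literally instead of being globbed.
        $UMR -R "${ring}[.]" > "$OUT_DIR/ring-$ring.txt" 2>&1
    done

    # Flush the files to disk in case the system goes down afterwards.
    sync
}

# Once the hang has happened, run:
#   capture_umr_dumps
```

The resulting directory can then be attached to the bug report as a whole.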
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15302201480.2d5EDF30E.17675--
--===============1963752947==--