From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 102322] System crashes after "[drm] IP block:gmc_v8_0 is hung!"
 / [drm] IP block:sdma_v3_0 is hung!
Date: Thu, 28 Jun 2018 04:17:19 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0306489572=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
 [IPv6:2610:10:20:722:a800:ff:fe98:4b55])
 by gabe.freedesktop.org (Postfix) with ESMTP id BA6566E704
 for ; Thu, 28 Jun 2018 04:17:19 +0000 (UTC)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============0306489572==
Content-Type: multipart/alternative; boundary="15301594392.DABD003B.877"
Content-Transfer-Encoding: 7bit
--15301594392.DABD003B.877
Date: Thu, 28 Jun 2018 04:17:19 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=102322
--- Comment #15 from Andrey Grodzovsky ---
(In reply to dwagner from comment #13)
> (In reply to Andrey Grodzovsky from comment #12)
> > Can you load the kernel with grub command line amdgpu.vm_update_mode=3 to
> > force CPU VM update mode and see if this helps ?
>
> Sure. Too early yet to say "hurray", but at an uptime of one hour,
> currently, 4.17.2 survived with amdgpu.vm_update_mode=3 already about 20
> times longer than without that option before the first crash.
>
> One (probably just informal) message is emitted by the kernel:
> [ 19.319565] CPU update of VM recommended only for large BAR system
>
> Can you explain a little: What is a "large BAR system", and what does the
> vm_update_mode=3 option actually cause? Should I expect any weird side
> effects to look for?
I think it just means a system with a large amount of VRAM, which therefore
needs a large BAR to map it. But I am not sure on that point.
vm_update_mode=3 means GPUVM page table updates are done by the CPU. By
default we do them with the DMA engine (SDMA) on the ASIC. The log showed a
hang in that engine, so I assumed something is wrong with the SDMA commands
we submit.
As side effects I would expect higher CPU utilization and maybe slower
rendering.
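
To double-check which mode the driver actually loaded with, a small sketch
like the one below may help. It assumes the parameter is exposed read-only at
/sys/module/amdgpu/parameters/vm_update_mode and that bit 0 selects graphics
and bit 1 selects compute, as the module parameter help text describes; please
verify both with `modinfo amdgpu` on your kernel. The function names are made
up for illustration.

```python
from pathlib import Path

# Assumed bit meanings, taken from the amdgpu parameter help text.
# Verify with `modinfo amdgpu`; they may differ between kernel versions.
USE_CPU_FOR_GFX = 1 << 0
USE_CPU_FOR_COMPUTE = 1 << 1

# Assumed sysfs location of the loaded parameter value.
PARAM_PATH = "/sys/module/amdgpu/parameters/vm_update_mode"


def decode_vm_update_mode(value):
    """Return which VM clients get their page tables updated by the CPU."""
    if value < 0:
        return ["driver default"]
    users = []
    if value & USE_CPU_FOR_GFX:
        users.append("graphics")
    if value & USE_CPU_FOR_COMPUTE:
        users.append("compute")
    return users or ["none (SDMA only)"]


def read_vm_update_mode(path=PARAM_PATH):
    """Read the parameter from sysfs, or return None if it is not exposed."""
    p = Path(path)
    return int(p.read_text().strip()) if p.exists() else None


if __name__ == "__main__":
    value = read_vm_update_mode()
    if value is None:
        print("vm_update_mode not found (amdgpu not loaded?)")
    else:
        print(value, "->", ", ".join(decode_vm_update_mode(value)))
```

With amdgpu.vm_update_mode=3 on the kernel command line this should report
both graphics and compute.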
>
>
> BTW: Not a result of that option, but of the kernel version, seems to be the
> fact that the shader clock keeps at a pretty high frequency all the time -
> even without any 3d or compute load, just displaying a quiet 4k/60Hz desktop
> image:
>
> cat pp_dpm_sclk
> 0: 214Mhz
> 1: 481Mhz
> 2: 760Mhz
> 3: 1020Mhz
> 4: 1102Mhz
> 5: 1138Mhz
> 6: 1180Mhz *
> 7: 1220Mhz
>
> Much lower shader clocks are used only if I lower the refresh rate of the
> screen. Is there a reason why the shader clocks should stay high even in the
> absence of 3d/compute load?
>
> (I would have better understood if the minimum memory clock was depending on
> the refresh rate, but memory clock stays as low as with the older kernels.)
-- 
You are receiving this mail because:
You are the assignee for the bug.
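
To track the shader clock over time rather than by eye, the pp_dpm_sclk
listing quoted above could be parsed with a sketch along these lines. It
assumes the format shown in the quote (index, clock in Mhz, and a trailing
"*" on the active level); the function name is hypothetical. On a live system
the text would typically come from /sys/class/drm/card0/device/pp_dpm_sclk,
where the card index is an assumption and may differ.

```python
import re

# Matches lines such as "6: 1180Mhz *"; the trailing "*" marks the active level.
LEVEL_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*Mhz\s*(\*)?\s*$", re.IGNORECASE)


def parse_dpm_levels(text):
    """Return (levels, active): levels maps index -> MHz, active is an index."""
    levels, active = {}, None
    for line in text.splitlines():
        m = LEVEL_RE.match(line)
        if not m:
            continue  # skip anything that is not a level line
        idx, mhz = int(m.group(1)), int(m.group(2))
        levels[idx] = mhz
        if m.group(3):
            active = idx
    return levels, active


# Sample copied from the report quoted above.
SAMPLE = """\
0: 214Mhz
1: 481Mhz
2: 760Mhz
3: 1020Mhz
4: 1102Mhz
5: 1138Mhz
6: 1180Mhz *
7: 1220Mhz
"""

if __name__ == "__main__":
    levels, active = parse_dpm_levels(SAMPLE)
    print(f"active level {active}: {levels[active]} MHz")
```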
--15301594392.DABD003B.877--
--===============0306489572==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: inline
X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs
IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz
dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVsCg==
--===============0306489572==--