From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 102322] System crashes after "[drm] IP block:gmc_v8_0 is hung!" / [drm] IP block:sdma_v3_0 is hung!
Date: Thu, 28 Jun 2018 04:17:19 +0000
List-Id: dri-devel@lists.freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=102322

Comment #15 on bug 102322 from Andrey Grodzovsky
(In reply to dwagner from comment #13)
> (In reply to Andrey Grodzovsky from comment #12)
> > Can you load the kernel with grub command line amdgpu.vm_update_mode=3 to
> > force CPU VM update mode and see if this helps ?
>
> Sure. Too early yet to say "hurray", but at an uptime of one hour,
> currently, 4.17.2 survived with amdgpu.vm_update_mode=3 already about 20
> times longer than without that option before the first crash.
>
> One (probably just informal) message is emitted by the kernel:
> [   19.319565] CPU update of VM recommended only for large BAR system
>
> Can you explain a little: What is a "large BAR system", and what does the
> vm_update_mode=3 option actually cause? Should I expect any weird side
> effects to look for?

I think it just means systems with large VRAM, which therefore require a
large BAR for mapping. But I am not sure on that point.
vm_update_mode=3 means GPUVM page table updates are done using the CPU. By
default we do them using the DMA engine (SDMA) on the ASIC. The log showed a
hang in this engine, so I assumed there is something wrong with the SDMA
commands we submit.
I would expect more CPU utilization as a side effect, and maybe slower
rendering.
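
For readers unfamiliar with the option, the following is a small sketch of how the value 3 can be read. It assumes the value meanings given in the amdgpu module parameter description (0 = never, 1 = graphics only, 2 = compute only, 3 = both, -1 = driver default), which amounts to a two-bit mask. The constant and function names are made up for illustration and are not the kernel's identifiers.

```python
# Illustrative names only; they are NOT the identifiers used in the kernel.
CPU_FOR_GFX = 1 << 0      # bit 0: graphics VM page tables updated by the CPU
CPU_FOR_COMPUTE = 1 << 1  # bit 1: compute VM page tables updated by the CPU


def describe_vm_update_mode(mode: int) -> str:
    """Describe an amdgpu.vm_update_mode value in plain words."""
    if mode == -1:
        return "driver default"
    if mode < 0 or mode > 3:
        raise ValueError(f"unexpected vm_update_mode: {mode}")
    users = []
    if mode & CPU_FOR_GFX:
        users.append("graphics")
    if mode & CPU_FOR_COMPUTE:
        users.append("compute")
    if not users:
        # Mode 0: the CPU path is never used, so updates go through SDMA.
        return "CPU not used; page tables updated via SDMA"
    return "CPU updates page tables for: " + ", ".join(users)
```

Calling `describe_vm_update_mode(3)` reports that both graphics and compute page tables are updated by the CPU, which is the setting tested in this thread.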

>
>
> BTW: Not a result of that option, but of the kernel version, seems to be the
> fact that the shader clock keeps at a pretty high frequency all the time -
> even without any 3d or compute load, just displaying a quiet 4k/60Hz desktop
> image:
>
> cat pp_dpm_sclk
> 0: 214Mhz
> 1: 481Mhz
> 2: 760Mhz
> 3: 1020Mhz
> 4: 1102Mhz
> 5: 1138Mhz
> 6: 1180Mhz *
> 7: 1220Mhz
>
> Much lower shader clocks are used only if I lower the refresh rate of the
> screen. Is there a reason why the shader clocks should stay high even in the
> absence of 3d/compute load?
>
> (I would have better understood if the minimum memory clock was depending on
> the refresh rate, but memory clock stays as low as with the older kernels.)
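
In the quoted pp_dpm_sclk listing, the line marked with `*` is the currently selected shader clock level. As a rough sketch for anyone who wants to log this over time, the snippet below parses text in that format and returns the active level. The sample text is copied from the listing in this message; the helper names are hypothetical, and on a real system the text would be read from the card's sysfs file (for example `/sys/class/drm/card0/device/pp_dpm_sclk`, where the card index is an assumption).

```python
import re

# Sample copied from the listing quoted in this message.
SAMPLE = """\
0: 214Mhz
1: 481Mhz
2: 760Mhz
3: 1020Mhz
4: 1102Mhz
5: 1138Mhz
6: 1180Mhz *
7: 1220Mhz
"""

# "<index>: <freq>Mhz" with an optional trailing "*" marking the active level.
LEVEL_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*MHz\s*(\*)?\s*$", re.IGNORECASE)


def parse_dpm_levels(text):
    """Return a list of (index, mhz, is_active) tuples, one per level line."""
    levels = []
    for line in text.splitlines():
        match = LEVEL_RE.match(line)
        if match:
            levels.append(
                (int(match.group(1)), int(match.group(2)), match.group(3) is not None)
            )
    return levels


def active_level(text):
    """Return (index, mhz) of the level marked with '*', or None if absent."""
    for index, mhz, is_active in parse_dpm_levels(text):
        if is_active:
            return index, mhz
    return None
```

With the sample above, `active_level(SAMPLE)` picks out level 6 at 1180 MHz, matching the observation that the shader clock stays near the top of the table on an idle desktop.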

