From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 101387] amdgpu display corruption and hang on AMD A10-9620P
Date: Wed, 14 Jun 2017 11:59:41 +0000
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=101387

Comment #8 on bug 101387 from Carlo Caione
Just a better description of what's going on and a couple of questions.

When amdgpu_atombios_crtc_powergate_init() is called this triggers the
parsing of the command table with index == 13 (>> execute C5C0 (len 589,
WS 0, PS 0)).
As already reported, the parameter space used (struct
ENABLE_DISP_POWER_GATING_PARAMETERS_V2_1) is 32 bits wide.

During the execution of this table several CALL_TABLE instructions
(op == 82) are executed.

In particular we first jump to the table with index == 78 (>> execute F166
(len 588, WS 0, PS 8)), then to the table with index == 51 (>> execute F446
(len 465, WS 4, PS 4)) and finally to the table with index == 75
(>> execute F6CC (len 1330, WS 4, PS 0)) before reaching the EOT for
table 13.
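
For anyone comparing traces, here is a small user-space sketch (not kernel
code) that pulls the offset, length, WS and PS values out of one of the
">> execute ..." debug lines quoted above. The struct and function names
are made up for illustration; only the line format comes from the trace.

```c
#include <stdio.h>

/* Fields of one ">> execute" debug line (names are hypothetical). */
struct exec_trace {
    unsigned base;  /* table offset printed in hex, e.g. 0xF6CC */
    int len;        /* table length as printed */
    int ws;         /* WS value as printed */
    int ps;         /* PS value as printed */
};

/* Parse one ">> execute ..." line; returns 1 on success, 0 otherwise. */
static int parse_exec_trace(const char *line, struct exec_trace *out)
{
    return sscanf(line, ">> execute %x (len %d, WS %d, PS %d)",
                  &out->base, &out->len, &out->ws, &out->ps) == 4;
}
```

Feeding it the line for table 75 fills in base 0xF6CC, len 1330, WS 4 and
PS 0, which makes it easy to tabulate the whole call chain from a log.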

During the execution of table 75 a MOVE_PS is executed with a destination
index == 1, accessing ctx->ps[idx] and causing the stack corruption.
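
To make the failure mode concrete, here is a minimal user-space sketch, not
the kernel interpreter. It assumes the parameter space is an array of 32-bit
slots provided by the caller and that a MOVE_PS with destination index idx
ends up writing ps[idx]. The names ps_space and ps_write_checked are
hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical model of a parameter space: 32-bit slots owned by the caller. */
struct ps_space {
    uint32_t *slots;   /* backing storage provided by the caller */
    size_t    count;   /* number of 32-bit slots actually allocated */
};

/* Write one slot, refusing indices outside the caller's allocation.
 * Returns 0 on success, -1 if the write would land past the end. */
static int ps_write_checked(struct ps_space *ps, size_t idx, uint32_t val)
{
    if (idx >= ps->count)
        return -1;
    ps->slots[idx] = val;
    return 0;
}
```

With a one-slot parameter space, a write to index 0 succeeds while a write
to index 1 is rejected. Without such a check the second write would land in
whatever follows the caller's struct on the stack.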

So either the atombios code is wrong or the atombios interpreter in the
kernel is doing something wrong.

I also have a couple of questions / observations:

1) Table 75 has WS == 4 and PS == 0, and looking at the opcodes in the table
I see almost only *_WS opcodes (MOVE_WS, TEST_WS, ADD_WS, etc.) and just
two *_PS instructions (MOVE_PS and OR_PS) that (guess what) are the
instructions causing the stack corruption. My guess here is that the *_PS
opcodes in the atombios are wrong and should actually be *_WS opcodes.

2) Don't we need to allocate the ps allocation struct for the command table
we are going to execute after a CALL_TABLE, sized to match the ps size in
its table header? IIUC the code in the kernel, ctx->ps is not reallocated
when we jump to a different table.

3) Could the point at (2) also be a problem in our case? Assuming that the
ps read from the table header has something to do with the size of the
parameter space (guessing here), table 13 has PS == 0 while table 75 has
PS == 4, yet both are using the same ctx->ps.
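
As a rough illustration of the check that (2) and (3) are asking about, the
sketch below walks the call chain from the trace and reports the first table
whose header declares more parameter space than the caller's buffer holds.
This is a guess at the semantics, not the kernel's logic: it assumes the PS
value in the header is a byte count and that each ps slot is 32 bits. The
names table_hdr, ps_slots_declared and first_uncovered are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a command table header. */
struct table_hdr {
    int      index;     /* command table index, e.g. 13 or 75 */
    unsigned ps_bytes;  /* PS value from the header; assumed to be bytes */
};

/* Call chain and PS values as printed in the trace above. */
static const struct table_hdr chain[] = {
    { 13, 0 }, { 78, 8 }, { 51, 4 }, { 75, 0 },
};

/* 32-bit slots a table declares, rounding up (assumption: PS is in bytes). */
static size_t ps_slots_declared(const struct table_hdr *t)
{
    return (t->ps_bytes + 3u) / 4u;
}

/* Return the index of the first table in the chain whose declared PS does
 * not fit in `avail` slots, or -1 if every table fits. */
static int first_uncovered(const struct table_hdr *c, size_t n, size_t avail)
{
    for (size_t i = 0; i < n; i++)
        if (ps_slots_declared(&c[i]) > avail)
            return c[i].index;
    return -1;
}
```

Under these assumptions a one-slot buffer is already too small for table 78,
which declares 8 bytes. Note that a header-based check like this would not
flag table 75, whose header declares PS 0: catching its write to ps[1] would
need a bounds check on each access.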

