From mboxrd@z Thu Jan  1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 101387] amdgpu display corruption and hang on AMD A10-9620P
Date: Wed, 14 Jun 2017 11:59:41 +0000
Message-ID: <bug-101387-502-4jBOhJy9IV@http.bugs.freedesktop.org/>
References: <bug-101387-502@http.bugs.freedesktop.org/>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0960258696=="
Return-path: <dri-devel-bounces@lists.freedesktop.org>
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
 [IPv6:2610:10:20:722:a800:ff:fe98:4b55])
 by gabe.freedesktop.org (Postfix) with ESMTP id 509796E52D
 for <dri-devel@lists.freedesktop.org>; Wed, 14 Jun 2017 11:59:41 +0000 (UTC)
In-Reply-To: <bug-101387-502@http.bugs.freedesktop.org/>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/dri-devel>,
 <mailto:dri-devel-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/dri-devel>
List-Post: <mailto:dri-devel@lists.freedesktop.org>
List-Help: <mailto:dri-devel-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/dri-devel>,
 <mailto:dri-devel-request@lists.freedesktop.org?subject=subscribe>
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org


--===============0960258696==
Content-Type: multipart/alternative; boundary="14974415810.4ED55458a.19282";
 charset="UTF-8"


--14974415810.4ED55458a.19282
Date: Wed, 14 Jun 2017 11:59:41 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated

https://bugs.freedesktop.org/show_bug.cgi?id=3D101387

--- Comment #8 from Carlo Caione <carlo@caione.org> ---
Just a better description of what's going on and a couple of questions.

When amdgpu_atombios_crtc_powergate_init() is called this triggers the pars=
ing
of the command table with index =3D=3D 13 (>> execute C5C0 (len 589, WS 0, =
PS 0)).
As already reported the parameter space used (struct
ENABLE_DISP_POWER_GATING_PARAMETERS_V2_1) is 32 bytes wide.

During the execution of this table several CALL_TABLE (op =3D=3D 82) are ex=
ecuted.=20

In particular we first jump to table with index =3D=3D 78 (>> execute F166 =
(len
588, WS 0, PS 8)), then to table with index =3D=3D 51 (>> execute F446 (len=
 465, WS
4, PS 4)) and finally to table with index =3D=3D 75 (>> execute F6CC (len 1=
330, WS
4, PS 0)) before finally reaching the EOT for table 13.

During the execution of table 75 a MOVE_PS is executed with a destination i=
ndex
=3D=3D 1, accessing ctx->ps[idx] and causing the stack corruption.

So either the atombios code is wrong or the atombios interpreter in the ker=
nel
is doing something wrong.

I also have a couple of questions / observations:

1) Table 75 has WS =3D=3D 4 and PS =3D=3D 0 and looking at the opcodes in t=
he table I
basically have only *_WS opcodes (MOVE_WS, TEST_WS, ADD_WS, etc...) and just
two *_PS instructions (MOVE_PS and OR_PS) that (guess what) are the
instructions causing the stack corruption. My guess here is that the opcodes
*_PS in the atombios are wrong and they should actually be *_WS opcodes.

2) Don't we need to allocate the size of the ps allocation struct for the
command table we are going to execute after a CALL_TABLE matching the ps si=
ze
in the table header? IIUC the code in the kernel, when we are jumping to a
different table ctx->ps is not being reallocated.

3) Could the point at (2) also be a problem in our case? Assuming that ps r=
ead
from the table header has something to do with the size of the parameter sp=
ace
(guessing here) Table 13 has PS =3D=3D 0, while table 75 has PS =3D=3D 4 wh=
ereas both
are using the same ctx->ps.

--=20
You are receiving this mail because:
You are the assignee for the bug.=

--14974415810.4ED55458a.19282
Date: Wed, 14 Jun 2017 11:59:41 +0000
MIME-Version: 1.0
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated

<html>
    <head>
      <base href=3D"https://bugs.freedesktop.org/">
    </head>
    <body>
      <p>
        <div>
            <b><a class=3D"bz_bug_link=20
          bz_status_NEW "
   title=3D"NEW - amdgpu display corruption and hang on AMD A10-9620P"
   href=3D"https://bugs.freedesktop.org/show_bug.cgi?id=3D101387#c8">Commen=
t # 8</a>
              on <a class=3D"bz_bug_link=20
          bz_status_NEW "
   title=3D"NEW - amdgpu display corruption and hang on AMD A10-9620P"
   href=3D"https://bugs.freedesktop.org/show_bug.cgi?id=3D101387">bug 10138=
7</a>
              from <span class=3D"vcard"><a class=3D"email" href=3D"mailto:=
carlo&#64;caione.org" title=3D"Carlo Caione &lt;carlo&#64;caione.org&gt;"> =
<span class=3D"fn">Carlo Caione</span></a>
</span></b>
        <pre>Just a better description of what's going on and a couple of q=
uestions.

When amdgpu_atombios_crtc_powergate_init() is called this triggers the pars=
ing
of the command table with index =3D=3D 13 (&gt;&gt; execute C5C0 (len 589, =
WS 0, PS 0)).
As already reported the parameter space used (struct
ENABLE_DISP_POWER_GATING_PARAMETERS_V2_1) is 32 bytes wide.

During the execution of this table several CALL_TABLE (op =3D=3D 82) are ex=
ecuted.=20

In particular we first jump to table with index =3D=3D 78 (&gt;&gt; execute=
 F166 (len
588, WS 0, PS 8)), then to table with index =3D=3D 51 (&gt;&gt; execute F44=
6 (len 465, WS
4, PS 4)) and finally to table with index =3D=3D 75 (&gt;&gt; execute F6CC =
(len 1330, WS
4, PS 0)) before finally reaching the EOT for table 13.

During the execution of table 75 a MOVE_PS is executed with a destination i=
ndex
=3D=3D 1, accessing ctx-&gt;ps[idx] and causing the stack corruption.

So either the atombios code is wrong or the atombios interpreter in the ker=
nel
is doing something wrong.

I also have a couple of questions / observations:

1) Table 75 has WS =3D=3D 4 and PS =3D=3D 0 and looking at the opcodes in t=
he table I
basically have only *_WS opcodes (MOVE_WS, TEST_WS, ADD_WS, etc...) and just
two *_PS instructions (MOVE_PS and OR_PS) that (guess what) are the
instructions causing the stack corruption. My guess here is that the opcodes
*_PS in the atombios are wrong and they should actually be *_WS opcodes.

2) Don't we need to allocate the size of the ps allocation struct for the
command table we are going to execute after a CALL_TABLE matching the ps si=
ze
in the table header? IIUC the code in the kernel, when we are jumping to a
different table ctx-&gt;ps is not being reallocated.

3) Could the point at (2) also be a problem in our case? Assuming that ps r=
ead
from the table header has something to do with the size of the parameter sp=
ace
(guessing here) Table 13 has PS =3D=3D 0, while table 75 has PS =3D=3D 4 wh=
ereas both
are using the same ctx-&gt;ps.</pre>
        </div>
      </p>


      <hr>
      <span>You are receiving this mail because:</span>

      <ul>
          <li>You are the assignee for the bug.</li>
      </ul>
    </body>
</html>=

--14974415810.4ED55458a.19282--

--===============0960258696==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: inline

X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs
IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz
dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVsCg==

--===============0960258696==--