From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 106671] Frequent lock ups for AMD RX 550 graphics card
Date: Wed, 17 Oct 2018 21:54:37 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=106671

--- Comment #30 from Alan W. Irwin ---
This time the system lasted almost 14 days before the lockup. See the latest
attachment for the log details, which contain NMI messages followed by a burst
of ASCII null characters (which in my experience can be due to different
threads or processes trying to write to the same file, i.e., the NMI error
messages themselves might have exposed another kernel bug). Unlike the last
case of NMI messages, where an Intel network card was mentioned, the only
hardware I can see mentioned in these messages is a particular CPU and my
motherboard, e.g.,

Oct 17 13:25:02 merlin kernel: [1177237.021995] NMI watchdog: Watchdog detected hard LOCKUP on cpu 13
[...]
Oct 17 13:25:02 merlin kernel: [1177237.022042] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 3803 01/22/2018

So this does not appear to be hard evidence of a graphics stack bug, since
a bug in likely any Linux system component could lock up a CPU, but I am still
pretty sure this is a graphics stack issue with the RX 550 because of my prior
evidence showing much better kernel stability if I do not use that RX 550 card
at all.

I started a new up-time experiment using today's snapshot of Debian Buster,
which left most of the graphics stack the same other than libdrm-amdgpu1, which
has been updated from 2.4.94-1 to 2.4.95-1, and the Linux kernel, which has
been updated from 4.18.6-1 to 4.18.10-2.
