From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 111808] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
 timeout cause process into Disk sleep state
Date: Wed, 25 Sep 2019 02:42:41 +0000
Message-ID:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1168862317=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
 [131.252.210.165])
 by gabe.freedesktop.org (Postfix) with ESMTP id 6E77F6EB19
 for ; Wed, 25 Sep 2019 02:42:43 +0000 (UTC)
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============1168862317==
Content-Type: multipart/alternative; boundary="15693793630.Fa35.20801"
Content-Transfer-Encoding: 7bit
--15693793630.Fa35.20801
Date: Wed, 25 Sep 2019 02:42:43 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=111808
Bug ID: 111808
Summary: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
         timeout cause process into Disk sleep state
Product: DRI
Version: DRI git
Hardware: ARM
OS: Linux (All)
Status: NEW
Severity: major
Priority: not set
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: liansz@fzcyjh.com
Created attachment 145507
--> https://bugs.freedesktop.org/attachment.cgi?id=145507&action=edit
timeoutlog
We ran into some gfx timeout problems.
We are currently running kernel 4.19.36 with some GPU patches merged from the
community. Each server has multiple GPUs, and each GPU runs rendering
programs. We see two different failure cases.
In the first case, one graphics card on a server fails, but the rendering
program does not enter D state. A test via
/sys/kernel/debug/dri/1/amdgpu_test_ib reports error code 110, and a second
test then passes.
See tmp-618-2.zip for details.
In the second case, one graphics card on a server fails, and the whole
rendering program running on the server fails and enters D state. It fails at
drm_release.
See tmp-619.zip for details.
Could you please help us out?
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15693793630.Fa35.20801--
--===============1168862317==--