From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 107762] [Intel GFX CI] *ERROR* ring sdma0 timeout, signaled seq=137, emitted seq=137
Date: Thu, 06 Sep 2018 15:16:07 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1068092740=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[131.252.210.165])
by gabe.freedesktop.org (Postfix) with ESMTP id 339846E6D2
for ; Thu, 6 Sep 2018 15:16:08 +0000 (UTC)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============1068092740==
Content-Type: multipart/alternative; boundary="15362469681.cC3FAa.10803"
Content-Transfer-Encoding: 7bit
--15362469681.cC3FAa.10803
Date: Thu, 6 Sep 2018 15:16:08 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=107762
Michel Dänzer changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |ckoenig.leichtzumerken@gmail.com, dev@lynxeye.de
--- Comment #2 from Michel Dänzer ---
(In reply to Martin Peres from comment #0)
> [ 358.292609] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=137, emitted seq=137
> [ 358.292635] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=145, emitted seq=145
(In reply to Martin Peres from comment #1)
> [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=137, emitted seq=137
> [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=147, emitted seq=147
Hmm, signalled and emitted sequence numbers are always the same, meaning the
hardware hasn't actually timed out?
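To make that comparison mechanical across a longer log, a rough sketch along these lines might help. It only assumes the message format quoted above; the regular expression and the helper name pending_fences are illustrative and not part of amdgpu or any existing tool.

```python
import re

# Matches the amdgpu_job_timedout message format quoted above (assumed stable).
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)

def pending_fences(line):
    """Return (ring, emitted - signaled) for a timeout line, or None.

    Hypothetical helper: a result of 0 means every emitted fence had
    already signaled when the timeout handler ran.
    """
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    return m.group("ring"), int(m.group("emitted")) - int(m.group("signaled"))

# The two lines quoted from comment #0.
sample = [
    "[ 358.292609] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 "
    "timeout, signaled seq=137, emitted seq=137",
    "[ 358.292635] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 "
    "timeout, signaled seq=145, emitted seq=145",
]
results = [pending_fences(line) for line in sample]
for ring, pending in results:
    print(ring, "pending fences:", pending)
```

A difference of zero on every reported line would be consistent with the timeout firing even though the ring itself had caught up.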
I can think of two possibilities:
* A GPU scheduler bug causing the job timeout handling to be triggered
spuriously. (Could something be stalling the system work queue, so the items
scheduled by drm_sched_job_finish_cb can't call drm_sched_job_finish in time?)
* A problem with the handling of the GPU's interrupts. Do the numbers on the
amdgpu line in /proc/interrupts still increase after these messages appeared,
or at least in the ten seconds before they appear?
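For the interrupt check, something like the following sketch could be used to compare two snapshots of /proc/interrupts. The helper names irq_total and irq_delta are hypothetical, the sample text is made up purely for illustration, and the parsing assumes the usual layout (a header row of CPU columns, then one count per CPU on each IRQ row).

```python
import time

def irq_total(text, name="amdgpu"):
    """Sum the per-CPU counts of every /proc/interrupts row mentioning `name`."""
    lines = text.splitlines()
    if not lines:
        return 0
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        if name not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label (e.g. "45:"), then one count per CPU.
        total += sum(int(f) for f in fields[1:1 + ncpu] if f.isdigit())
    return total

def irq_delta(interval=10.0, path="/proc/interrupts", name="amdgpu"):
    """Return how many `name` interrupts arrived during `interval` seconds."""
    with open(path) as f:
        before = irq_total(f.read(), name)
    time.sleep(interval)
    with open(path) as f:
        after = irq_total(f.read(), name)
    return after - before

# Made-up snapshots for illustration only; not taken from the affected machine.
BEFORE = """\
           CPU0       CPU1
  0:         22          0   IO-APIC    2-edge      timer
 45:       1000        500   PCI-MSI 524288-edge      amdgpu
"""
AFTER = BEFORE.replace("1000", "1200")
delta = irq_total(AFTER) - irq_total(BEFORE)
print("amdgpu interrupts in sample window:", delta)
```

On the affected machine, calling irq_delta() shortly after the timeout messages and getting 0 would point towards the interrupt handling possibility; a positive value would make the scheduler possibility look more likely.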
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15362469681.cC3FAa.10803--
--===============1068092740==--