* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
@ 2018-07-17 20:39 bugzilla-daemon
2018-07-17 20:49 ` bugzilla-daemon
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: bugzilla-daemon @ 2018-07-17 20:39 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107266
Bug ID: 107266
Summary: Radeon Pro Duo (Polaris) - ring sdma0 timeout
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu-pro
Assignee: dri-devel@lists.freedesktop.org
Reporter: rhlug@hotmail.com
I am running a Polaris Radeon Pro Duo on Ubuntu 18.04 with the
4.17.0-rc2-180424-fkxamd kernel from the ROCm branch, and everything works great.
When testing the latest drm-next-4.19-wip kernel, I get the following errors on
boot and have no working OpenCL (i.e. clinfo hangs indefinitely):
[ 58.913281] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=573, emitted seq=574
[ 58.913284] [drm] GPU recovery disabled.
[ 58.914276] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=331, emitted seq=333
[ 58.914280] [drm] GPU recovery disabled.
[ 58.914312] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=331, emitted seq=333
[ 58.914313] [drm] GPU recovery disabled.
Please let me know if you need any specifics.
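For anyone triaging logs like the ones quoted in this report, the following is a minimal Python sketch that pulls the ring name and the signaled/emitted sequence numbers out of amdgpu_job_timedout lines. The function name `parse_ring_timeouts` and the returned field names are illustrative assumptions, not part of amdgpu or any existing tool, and the pattern is inferred only from the log lines in this report, so other kernel versions may format the message differently.

```python
import re

# Pattern inferred from the dmesg lines quoted in this report (an assumption).
# \s+ also matches newlines, so messages wrapped by a mail client still match.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:amdgpu_job_timedout \[amdgpu\]\]\s+\*ERROR\*\s+"
    r"ring\s+(?P<ring>\S+)\s+timeout,\s+"
    r"signaled seq=(?P<signaled>\d+),\s+emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return one dict per ring timeout message found in log_text."""
    events = []
    for match in TIMEOUT_RE.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        events.append({
            "timestamp": float(match.group("ts")),
            "ring": match.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Fences emitted by the driver but not yet signaled by the GPU.
            "pending": emitted - signaled,
        })
    return events


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for event in parse_ring_timeouts(sys.stdin.read()):
        print(event["timestamp"], event["ring"], "pending:", event["pending"])
```

The `pending` value is simply emitted minus signaled; it is a convenience for comparing logs and says nothing by itself about the cause of the timeout.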
--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
From: bugzilla-daemon @ 2018-07-17 20:49 UTC (permalink / raw)
To: dri-devel
--- Comment #1 from Alex Deucher <alexdeucher@gmail.com> ---
Please attach your full dmesg output. Are you attempting to use ROCm over the
drm-next-4.19-wip kernel?
* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
From: bugzilla-daemon @ 2018-07-18 14:09 UTC (permalink / raw)
To: dri-devel
--- Comment #2 from robert <rhlug@hotmail.com> ---
Created attachment 140696
--> https://bugs.freedesktop.org/attachment.cgi?id=140696&action=edit
dmesg.base.txt
Polaris Radeon Pro Duo, dmesg with no userland.
* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
From: bugzilla-daemon @ 2018-07-18 14:09 UTC (permalink / raw)
To: dri-devel
--- Comment #3 from robert <rhlug@hotmail.com> ---
Created attachment 140697
--> https://bugs.freedesktop.org/attachment.cgi?id=140697&action=edit
dmesg.amdgpupro.txt
Polaris Radeon Pro Duo, dmesg with amdgpu-pro 18.20-606296
* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
From: bugzilla-daemon @ 2018-07-18 14:10 UTC (permalink / raw)
To: dri-devel
--- Comment #4 from robert <rhlug@hotmail.com> ---
No, I wasn't running the ROCm userland.
I've been using amdgpu-pro-18.20-606296 for several weeks with the fkxamd
kernel, as recommended by Felix.
When I remove all userland, I don't see the ring sdma0 timeout. Without any
userland, the driver initializes, but with a couple of warnings.
* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
From: bugzilla-daemon @ 2018-07-18 14:28 UTC (permalink / raw)
To: dri-devel
--- Comment #5 from robert <rhlug@hotmail.com> ---
So I stripped this down, and the ring error pops up after I've applied a new
pp_table and started utilizing the GPU.
My guess is that the following error has something to do with why this doesn't
work in drm-next-4.19-wip:
[ 2.635258] amdgpu: [powerplay] Failed to retrieve minimum clocks.
[ 2.635259] amdgpu: [powerplay] Error in phm_get_clock_info
That error is not present when I load the 4.17.0-rc2-180424-fkxamd kernel.
When I apply the same pp_table file while running 4.17.0-rc2-180424-fkxamd, it
works as expected.
So there is something funky in the powerplay code of drm-next-4.19-wip.
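The comparison described in this comment (which messages appear on one kernel but not on the other) can be sketched in Python as below. This is only an illustration: the function names `amdgpu_messages` and `messages_only_in` are hypothetical, and the script assumes the bracketed dmesg timestamp format seen in this report.

```python
import re

# Leading "[ 2.635258] " style timestamp, as seen in the logs in this report.
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def amdgpu_messages(dmesg_text):
    """Return the unique amdgpu-related messages, timestamps removed, in order."""
    messages = []
    for line in dmesg_text.splitlines():
        body = TIMESTAMP_RE.sub("", line.strip())
        if "amdgpu" in body and body not in messages:
            messages.append(body)
    return messages


def messages_only_in(candidate_log, reference_log):
    """Return amdgpu messages present in candidate_log but not in reference_log."""
    reference = set(amdgpu_messages(reference_log))
    return [msg for msg in amdgpu_messages(candidate_log) if msg not in reference]


if __name__ == "__main__":
    import sys

    # Usage (file names are placeholders): script.py new_kernel.txt old_kernel.txt
    with open(sys.argv[1]) as new_file, open(sys.argv[2]) as old_file:
        for message in messages_only_in(new_file.read(), old_file.read()):
            print(message)
```

Messages that embed changing values (sequence numbers, addresses) will show up as differences even when they are otherwise the same, so the output is a starting point for reading the logs rather than a verdict.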
* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
From: bugzilla-daemon @ 2018-10-08 15:34 UTC (permalink / raw)
To: dri-devel
--- Comment #6 from dallase <dallas@engelken.net> ---
amd-staging-drm-next (built Oct 7 2018)
[ 61.701281] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=888, emitted seq=890
[ 61.701285] [drm] GPU recovery disabled.
[ 61.701397] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=902, emitted seq=904
[ 61.701399] [drm] GPU recovery disabled.
drm-next-4.20-wip (built Oct 8 2018)
[ 60.840847] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=914, emitted seq=916
[ 60.840851] [drm] GPU recovery disabled.
[ 60.840962] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=907, emitted seq=909
[ 60.840964] [drm] GPU recovery disabled.
Both of these kernels work fine on my Vega 56 and Vega 64 cards; only the Pro
Duo has the ring timeouts. This was tested with amdgpu-pro 18.20 and 18.30,
with nothing utilizing the GPUs besides boot-time initialization.
* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
From: bugzilla-daemon @ 2018-10-14 19:29 UTC (permalink / raw)
To: dri-devel
--- Comment #7 from robert <rhlug@hotmail.com> ---
All Polaris cards are experiencing ring errors on mainline kernels; it's not
just the Polaris Pro Duo.
# lspci | grep VGA
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480] (rev ef)
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480] (rev ef)
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480] (rev ef)
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480] (rev cf)
09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480] (rev ef)
# uname -a
Linux localhost 4.19.0-999-lowlatency #201810092201 SMP PREEMPT Wed Oct 10 02:12:06 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
# dmesg | grep amdgpu
[ 8.125848] amdgpu: [powerplay] Failed to retrieve minimum clocks.
[ 8.125849] amdgpu: [powerplay] Error in phm_get_clock_info
[ 8.260967] [drm] Initialized amdgpu 3.27.0 20150101 for 0000:09:00.0 on minor 4
[ 70.238071] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=597, emitted seq=599
[ 70.238198] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=597, emitted seq=599
etc etc
* [Bug 107266] Radeon Pro Duo (Polaris) - ring sdma0 timeout
From: bugzilla-daemon @ 2019-11-19 7:58 UTC (permalink / raw)
To: dri-devel
Martin Peres <martin.peres@free.fr> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED
--- Comment #8 from Martin Peres <martin.peres@free.fr> ---
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.
You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/16.