* [Bug 111669] Navi GPU hang in Minecraft
@ 2019-09-12  8:11 bugzilla-daemon
  2019-09-12 13:50 ` bugzilla-daemon
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: bugzilla-daemon @ 2019-09-12  8:11 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2248 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111669

            Bug ID: 111669
           Summary: Navi GPU hang in Minecraft
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: not set
         Component: Drivers/Gallium/radeonsi
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: git@dougty.com
        QA Contact: dri-devel@lists.freedesktop.org

When playing Minecraft, being in a certain area of my world at night causes my
GPU to hang. I'm using Optifine and Sildur's shaders.

Sep 12 01:38:42 xxx kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR*
Waiting for fences timed out or interrupted!
Sep 12 01:38:47 xxx kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR*
Waiting for fences timed out or interrupted!
Sep 12 01:38:47 xxx kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR*
Waiting for fences timed out or interrupted!
Sep 12 01:38:47 xxx kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx_0.0.0 timeout, signaled seq=19965, emitted seq=19967
Sep 12 01:38:47 xxx kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process
information: process java pid 1375 thread java:cs0 pid 1433
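
The amdgpu_job_timedout message above reports the last fence sequence number the ring signaled and the last one it emitted; the difference is how many submissions were still outstanding when the timeout fired. A minimal Python sketch for pulling those numbers out of such a line follows. The function name parse_ring_timeout and the regular expression are illustrative assumptions, not part of any amdgpu tooling.

```python
import re

# Matches the "ring ... timeout" message as it appears in the log above.
# \s+ tolerates the line wrap that mail clients introduce after "ring".
TIMEOUT_RE = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,\s+"
    r"signaled\s+seq=(?P<signaled>\d+),\s+"
    r"emitted\s+seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(message):
    """Return ring name and fence counters, or None if the line does not match."""
    match = TIMEOUT_RE.search(message)
    if match is None:
        return None
    signaled = int(match.group("signaled"))
    emitted = int(match.group("emitted"))
    return {
        "ring": match.group("ring"),
        "signaled": signaled,
        "emitted": emitted,
        # Fences emitted to the ring but not yet signaled.
        "outstanding": emitted - signaled,
    }


if __name__ == "__main__":
    line = (
        "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring "
        "gfx_0.0.0 timeout, signaled seq=19965, emitted seq=19967"
    )
    print(parse_ring_timeout(line))
```

For the line in this report the sketch yields ring gfx_0.0.0 with two outstanding fences.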


CPU: 3700X
GPU: Sapphire 5700XT (reference)
Motherboard: Gigabyte X570-I (BIOS F4)
Kernel: 5.3.0-rc8-mainline
Mesa: 19.3.0_devel.115190.f83f9d7daa0
LLVM: 10.0.0_r326348.d7d8bb937ad
OpenGL string (as seen ingame): 4.5 (Compatibility Profile) Mesa 19.3.0-devel
(git-f83f9d7daa), X.Org, AMD NAVI10 (DRM 3.33.0, 5.3.0-rc8-mainline, LLVM
10.0.0)

The hang occurs very reliably in game at this specific spot at night, but only
this one apitrace reproduces the hang when I replay it. Apologies for the
file size.

https://drive.google.com/open?id=16wAmCa27o2xxv3bFXnR6rGXAum0Wci_5

When the hang occurs, my screen freezes but everything keeps running in the
background, and I need to use the REISUB hotkeys to reboot. It occurs with
both PCIe 4.0 and 3.0 set in the BIOS.

Please let me know if any more info is needed.
Thank you.

-- 
You are receiving this mail because:
You are the assignee for the bug.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 111669] Navi GPU hang in Minecraft
  2019-09-12  8:11 [Bug 111669] Navi GPU hang in Minecraft bugzilla-daemon
@ 2019-09-12 13:50 ` bugzilla-daemon
  2019-09-13 12:17 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: bugzilla-daemon @ 2019-09-12 13:50 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 885 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111669

--- Comment #1 from Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> ---
Thanks for the bug report and the trace.

I can reproduce the hang. There's always a page fault before it, e.g.:

amdgpu 0000:0b:00.0: [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32772,
for process glretrace pid 8616 thread glretrace:cs0 pid 8617)
amdgpu 0000:0b:00.0:   in page starting at address 0x0000000000f03000 from
client 27
amdgpu 0000:0b:00.0: GCVM_L2_PROTECTION_FAULT_STATUS:0x00301031
amdgpu 0000:0b:00.0:     MORE_FAULTS: 0x1
amdgpu 0000:0b:00.0:     WALKER_ERROR: 0x0
amdgpu 0000:0b:00.0:     PERMISSION_FAULTS: 0x3
amdgpu 0000:0b:00.0:     MAPPING_ERROR: 0x0
amdgpu 0000:0b:00.0:     RW: 0x0

I haven't found the root cause yet.
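
The individual fields the kernel prints under GCVM_L2_PROTECTION_FAULT_STATUS are bit fields of the raw status word. A minimal Python sketch of that decoding follows. The bit positions in FIELDS are an assumption recalled from the amdgpu register headers and should be checked against the kernel source; they do reproduce the values logged above for 0x00301031. The names FIELDS and decode_fault_status are hypothetical.

```python
# (shift, width) per field. ASSUMED layout: verify against the amdgpu
# register headers before relying on it for other ASICs or kernel versions.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw fault status word into the named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    for name, value in decode_fault_status(0x00301031).items():
        print(f"{name}: {value:#x}")
```

Under the assumed layout, RW of 0 would indicate a read access, and a non-zero PERMISSION_FAULTS with no WALKER_ERROR or MAPPING_ERROR points at a mapped page accessed without the required permissions, but that reading is only as reliable as the layout itself.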


* [Bug 111669] Navi GPU hang in Minecraft
  2019-09-12  8:11 [Bug 111669] Navi GPU hang in Minecraft bugzilla-daemon
  2019-09-12 13:50 ` bugzilla-daemon
@ 2019-09-13 12:17 ` bugzilla-daemon
  2019-09-13 13:54 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: bugzilla-daemon @ 2019-09-13 12:17 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 386 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111669

--- Comment #2 from Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> ---
The kernel patch from https://bugs.freedesktop.org/show_bug.cgi?id=111481#c33
seems to prevent the hang here.

Could you try it as well and report the results?


* [Bug 111669] Navi GPU hang in Minecraft
  2019-09-12  8:11 [Bug 111669] Navi GPU hang in Minecraft bugzilla-daemon
  2019-09-12 13:50 ` bugzilla-daemon
  2019-09-13 12:17 ` bugzilla-daemon
@ 2019-09-13 13:54 ` bugzilla-daemon
  2019-09-13 15:49 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: bugzilla-daemon @ 2019-09-13 13:54 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1636 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111669

--- Comment #3 from Doug Ty <git@dougty.com> ---
Thanks for the response.
Still hanging, unfortunately.

While the patch now lets me replay the first apitrace without a hang, I'm still
hanging in the same spot in game, with the same messages in journalctl.

I've captured a new apitrace that recreates the hang with the patch for me.

https://drive.google.com/open?id=1WMeuCoZnOOqD0Tbjix6nNpFyVkzzbd94

As suggested in the other thread, AMD_DEBUG=nodma seems to prevent the hang.
I'm unsure whether you can see it in the apitrace, but there are usually some
artifacts shortly before the hang (stretchy verts, sheep textures turning
blue); these are also absent with nodma.


It's worth noting that I am also getting some general desktop instability and
SDMA hangs like those in the other thread you linked. While compiling the
kernel patch I got a hang trying to watch a video in Firefox (this has happened
a couple of times before), and previously I've also gotten hangs while loading
Half-Life 2 maps and closing GIMP. I'm not sure whether any of these are
related. They happen so irregularly that I've been unable to reproduce them or
capture apitraces for them. Occasionally images on web pages will also load
corrupted and not display, though I can't tell if this is a GPU problem or a
browser/network problem.

The card works great on my Windows dual boot, so I'm pretty sure it's not a
hardware problem (though I have to use 19.7.5 there, as anything newer causes
Firefox to blue screen me).


* [Bug 111669] Navi GPU hang in Minecraft
  2019-09-12  8:11 [Bug 111669] Navi GPU hang in Minecraft bugzilla-daemon
                   ` (2 preceding siblings ...)
  2019-09-13 13:54 ` bugzilla-daemon
@ 2019-09-13 15:49 ` bugzilla-daemon
  2019-09-16 12:40 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: bugzilla-daemon @ 2019-09-13 15:49 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 540 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111669

--- Comment #4 from Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> ---
Thanks for the test and new trace.

I can reproduce the hang and it seems to go away with AMD_DEBUG=nodma.

Another workaround is the kernel parameter amdgpu.vm_update_mode=3, although
it sometimes introduces another problem (see
https://bugs.freedesktop.org/show_bug.cgi?id=111682).
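
To compare replays with and without the AMD_DEBUG workaround, the variable can be set for a single glretrace run rather than for the whole session. A minimal Python sketch follows, assuming glretrace is on PATH. The helper names env_with_amd_debug and replay_trace and the trace file name are hypothetical. AMD_DEBUG takes a comma-separated list of flags, so the helper appends to any value already set.

```python
import os
import subprocess


def env_with_amd_debug(*flags, base=None):
    """Return a copy of the environment with the given AMD_DEBUG flags added."""
    env = dict(os.environ if base is None else base)
    current = [f for f in env.get("AMD_DEBUG", "").split(",") if f]
    for flag in flags:
        if flag not in current:
            current.append(flag)
    env["AMD_DEBUG"] = ",".join(current)
    return env


def replay_trace(trace_path, *flags):
    """Replay an apitrace capture with glretrace under the given flags."""
    return subprocess.run(
        ["glretrace", trace_path],
        env=env_with_amd_debug(*flags),
        check=False,  # a hang or crash is the thing under test, so do not raise
    )


if __name__ == "__main__":
    # "minecraft.trace" is a placeholder file name, not the trace from this bug.
    # replay_trace("minecraft.trace", "nodma")
    print(env_with_amd_debug("nodma", base={})["AMD_DEBUG"])
```

The amdgpu.vm_update_mode=3 setting is a kernel module parameter, so it cannot be toggled this way; it has to be given on the kernel command line or when the module is loaded.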


* [Bug 111669] Navi GPU hang in Minecraft
  2019-09-12  8:11 [Bug 111669] Navi GPU hang in Minecraft bugzilla-daemon
                   ` (3 preceding siblings ...)
  2019-09-13 15:49 ` bugzilla-daemon
@ 2019-09-16 12:40 ` bugzilla-daemon
  2019-09-17  0:46 ` bugzilla-daemon
  2019-09-25 18:50 ` bugzilla-daemon
  6 siblings, 0 replies; 8+ messages in thread
From: bugzilla-daemon @ 2019-09-16 12:40 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 453 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111669

--- Comment #5 from Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> ---
Another env variable to test is: AMD_DEBUG=nongg

Using AMD_DEBUG=nongg and a kernel with the patch from
https://bugs.freedesktop.org/show_bug.cgi?id=111481#c33 I could replay both
traces multiple times without a single hang.


* [Bug 111669] Navi GPU hang in Minecraft
  2019-09-12  8:11 [Bug 111669] Navi GPU hang in Minecraft bugzilla-daemon
                   ` (4 preceding siblings ...)
  2019-09-16 12:40 ` bugzilla-daemon
@ 2019-09-17  0:46 ` bugzilla-daemon
  2019-09-25 18:50 ` bugzilla-daemon
  6 siblings, 0 replies; 8+ messages in thread
From: bugzilla-daemon @ 2019-09-17  0:46 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 995 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111669

--- Comment #6 from Doug Ty <git@dougty.com> ---
Unfortunately I'm still getting the hang with the kernel patch +
AMD_DEBUG=nongg, both in game and when replaying the apitraces, with the same
messages in journalctl.

Not sure how useful it'll be, but I've made another apitrace with patch + nongg:
https://drive.google.com/open?id=1NSMBW-GKHMAMOjrHS_cD-CvvUkvviqx5

Is there anything more I can do to help debug this? A specific firmware I
should be using?

Currently using:
Linux 5.3 (both rc8 and now stable release, compiled with the patch)
llvm-git 10.0.0_r326744.bfb5b0cb86c-1
mesa-git 1:19.3.0_devel.115313.f812cbfd884-1
Latest firmware (9/13) from
https://people.freedesktop.org/~agd5f/radeon_ucode/navi10/ (was previously
using 7/14 from Fedora's linux-firmware)

Only AMD_DEBUG=nodma stops the hang for me.
No luck with amdgpu.vm_update_mode=3.


* [Bug 111669] Navi GPU hang in Minecraft
  2019-09-12  8:11 [Bug 111669] Navi GPU hang in Minecraft bugzilla-daemon
                   ` (5 preceding siblings ...)
  2019-09-17  0:46 ` bugzilla-daemon
@ 2019-09-25 18:50 ` bugzilla-daemon
  6 siblings, 0 replies; 8+ messages in thread
From: bugzilla-daemon @ 2019-09-25 18:50 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 842 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111669

GitLab Migration User <gitlab-migration@fdo.invalid> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #7 from GitLab Migration User <gitlab-migration@fdo.invalid> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further in the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1429.

