Bug ID 106196
Summary GPU randomly hangs while playing Rise of the Tomb Raider
Product DRI
Version XOrg git
Hardware Other
OS All
Status NEW
Severity normal
Priority medium
Component DRM/AMDgpu
Assignee dri-devel@lists.freedesktop.org
Reporter mikhail.v.gavrilov@gmail.com

Created attachment 139028 [details]
dmesg

* Fedora 28 -
https://download.fedoraproject.org/pub/fedora/linux/releases/test/28_Beta/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-28_Beta-1.3.iso
* Latest system updates:
 - kernel 4.16.3
 - mesa 18.0.1
 - llvm 6.0.0
* Steam client version 1523923735

Steps to reproduce:
1) Play the game for several hours until a GPU hang occurs

Symptoms:
1. The system stops responding.
2. All the load-indicator LEDs on the video card light up.
3. The fan on the video card becomes very loud.

Kernel output after the GPU hang:
[10918.342576] amdgpu 0000:07:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:7 pas_id:0)
[10918.342582] amdgpu 0000:07:00.0:   at page 0x00001891a90f0000 from 27
[10918.342585] amdgpu 0000:07:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0070113D
[10918.342591] amdgpu 0000:07:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:7 pas_id:0)
[10918.342594] amdgpu 0000:07:00.0:   at page 0x00001891a90f0000 from 27
[10918.342597] amdgpu 0000:07:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[10928.687661] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=1874360, last emitted seq=1874362
[10928.687666] [drm] No hardware hang detected. Did some blocks stall?
[11016.291301] sysrq: SysRq : Show Blocked State
[11016.291315]   task                        PC stack   pid father
[11016.291509] Xwayland        D10616  1956   1882 0x00000004
[11016.291522] Call Trace:
[11016.291541]  ? __schedule+0x2bd/0xb00
[11016.291555]  ? dma_fence_wait_any_timeout+0x264/0x2f0
[11016.291564]  schedule+0x2f/0x90
[11016.291573]  schedule_timeout+0x35c/0x520
[11016.291592]  ? dma_fence_wait_any_timeout+0x264/0x2f0
[11016.291602]  dma_fence_wait_any_timeout+0x230/0x2f0
[11016.291734]  amdgpu_sa_bo_new+0x444/0x510 [amdgpu]
[11016.291900]  amdgpu_ib_get+0x31/0x90 [amdgpu]
[11016.292048]  amdgpu_job_alloc_with_ib+0x46/0x80 [amdgpu]
[11016.292128]  amdgpu_map_buffer.isra.10+0xa3/0x1f0 [amdgpu]
[11016.292215]  amdgpu_ttm_copy_mem_to_mem+0x3c6/0x5d0 [amdgpu]
[11016.292305]  ? amdgpu_vm_bo_invalidate+0x3b/0x210 [amdgpu]
[11016.292385]  amdgpu_move_blit.constprop.13+0x82/0x110 [amdgpu]
[11016.292467]  amdgpu_bo_move+0x94/0x1c0 [amdgpu]
[11016.292486]  ttm_bo_handle_move_mem+0x10d/0x540 [ttm]
[11016.292509]  ? ttm_bo_evict+0x155/0x1e0 [ttm]
[11016.292530]  ? mutex_trylock+0xcd/0xe0
[11016.292552]  ? ttm_mem_evict_first+0x1cf/0x260 [ttm]
[11016.292574]  ? ttm_bo_mem_space+0x2da/0x4a0 [ttm]
[11016.292599]  ? ttm_bo_validate+0xe3/0x1a0 [ttm]
[11016.292612]  ? ttm_bo_init_reserved+0x40e/0x470 [ttm]
[11016.292628]  ? mutex_trylock+0xcd/0xe0
[11016.292645]  ? ttm_bo_init_reserved+0x42a/0x470 [ttm]
[11016.292723]  ? amdgpu_bo_do_create+0x1da/0x570 [amdgpu]
[11016.292799]  ? amdgpu_fill_buffer+0x320/0x320 [amdgpu]
[11016.292885]  ? amdgpu_bo_create+0x4f/0x2c0 [amdgpu]
[11016.292993]  ? amdgpu_gem_object_create+0x80/0x110 [amdgpu]
[11016.293075]  ? amdgpu_gem_object_close+0x1e0/0x1e0 [amdgpu]
[11016.293153]  ? amdgpu_gem_create_ioctl+0x1eb/0x2a0 [amdgpu]
[11016.293165]  ? __might_fault+0x3e/0x90
[11016.293244]  ? amdgpu_gem_object_close+0x1e0/0x1e0 [amdgpu]
[11016.293277]  ? drm_ioctl_kernel+0x5b/0xb0 [drm]
[11016.293308]  ? drm_ioctl+0x1c0/0x380 [drm]
[11016.293417]  ? amdgpu_gem_object_close+0x1e0/0x1e0 [amdgpu]
[11016.293529]  ? amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[11016.293544]  ? do_vfs_ioctl+0xa5/0x6d0
[11016.293559]  ? SyS_ioctl+0x74/0x80
[11016.293574]  ? do_syscall_64+0x79/0x210
[11016.293584]  ? entry_SYSCALL_64_after_hwframe+0x42/0xb7
[11016.294195] kworker/u16:2   D12472 10521      2 0x80000000
[11016.294228] Workqueue: events_unbound commit_work [drm_kms_helper]
[11016.294237] Call Trace:
[11016.294252]  ? __schedule+0x2bd/0xb00
[11016.294266]  ? dma_fence_default_wait+0x231/0x370
[11016.294278]  schedule+0x2f/0x90
[11016.294288]  schedule_timeout+0x35c/0x520
[11016.294301]  ? dma_fence_default_wait+0x72/0x370
[11016.294316]  ? dma_fence_default_wait+0x231/0x370
[11016.294325]  dma_fence_default_wait+0x25d/0x370
[11016.294334]  ? dma_fence_release+0x160/0x160
[11016.294347]  dma_fence_wait_timeout+0x4f/0x270
[11016.294358]  reservation_object_wait_timeout_rcu+0x236/0x4e0
[11016.294485]  amdgpu_dm_do_flip+0x112/0x360 [amdgpu]
[11016.294624]  amdgpu_dm_atomic_commit_tail+0xac7/0xda0 [amdgpu]
[11016.294640]  ? wait_for_completion_timeout+0x73/0x1a0
[11016.294673]  commit_tail+0x3d/0x70 [drm_kms_helper]
[11016.294685]  process_one_work+0x261/0x630
[11016.294703]  worker_thread+0x3a/0x390
[11016.294715]  ? process_one_work+0x630/0x630
[11016.294725]  kthread+0x120/0x140
[11016.294739]  ? kthread_create_worker_on_cpu+0x70/0x70
[11016.294750]  ret_from_fork+0x3a/0x50
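
For anyone comparing several dmesg captures of this hang, here is a minimal Python sketch that pulls the "ring ... timeout" lines out of a saved log. The function name find_ring_timeouts and the "pending" field are my own (hypothetical) choices, and the regular expression only assumes the message format shown in the output above. Treating the difference between the emitted and signaled sequence numbers as the number of outstanding jobs is an assumption, not something the driver states.

```python
import re

# Matches the amdgpu job-timeout message in the form seen in this report.
# \s+ after "timeout," tolerates the line wrap added by mail clients.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:amdgpu_job_timedout \[amdgpu\]\] \*ERROR\* "
    r"ring (?P<ring>\S+) timeout,\s+"
    r"last signaled seq=(?P<signaled>\d+),\s+"
    r"last emitted seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(log_text):
    """Return one dict per ring timeout message found in log_text."""
    results = []
    for m in TIMEOUT_RE.finditer(log_text):
        signaled = int(m.group("signaled"))
        emitted = int(m.group("emitted"))
        results.append({
            "timestamp": float(m.group("ts")),
            "ring": m.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Assumption: emitted - signaled approximates the jobs still pending.
            "pending": emitted - signaled,
        })
    return results


if __name__ == "__main__":
    sample = (
        "[10928.687661] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,\n"
        "last signaled seq=1874360, last emitted seq=1874362\n"
    )
    for hit in find_ring_timeouts(sample):
        print(hit)
```

Run against the log in this report, it finds a single timeout on the gfx ring with a sequence difference of 2.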

