From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 109181] Mesa git causes AMDGPU hang, Tonga Firepro chip W7170M MXM
Date: Sun, 30 Dec 2018 08:48:27 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=109181

    Bug ID: 109181
   Summary: Mesa git causes AMDGPU hang, Tonga Firepro chip W7170M MXM
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
        OS: Linux (All)
    Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/radeonsi
  Assignee: dri-devel@lists.freedesktop.org
  Reporter: Babblebones@gmail.com
QA Contact: dri-devel@lists.freedesktop.org

Hello,

I've run into a bug where AMDGPU hangs when Mesa 19 does something it
doesn't like in particular applications. Mesa 18.3 is totally fine, as per
the Padoka stable PPA. Padoka unstable and Oibaf both crash.
OpenGL applications like Team Fortress 2 seem to be fine, but Vulkan (DXVK)
wrapped applications will just bury the whole GPU.
Even Steam itself seems to hard lock when I start it, which I can sidestep
by allowing GPU recovery with the kernel parameter and proceeding. It does
not recover when I start something Vulkan-based and graphically intensive
from Steam itself.



Below is my dmesg from the card. This may be an issue with Mesa or it may be
AMDGPU; I am very curious as to which, as this has affected me for about a
month now across both Arch Linux and my new Ubuntu install, making Mesa git
unusable on my new card.

To make matters worse, there is an issue where the EDID is messed up on boot
with amdgpu.dc=1; worth mentioning in case it's part of a deeper issue in
AMDGPU.
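
For anyone trying to reproduce this, it helps to confirm which amdgpu options the running kernel was actually booted with. Below is a minimal sketch that pulls amdgpu.* options out of a kernel command line string. The report only says "the kernel parameter" for GPU recovery; amdgpu.gpu_recovery is assumed here, and the example command line is hypothetical, not taken from the affected machine.

```python
def amdgpu_params(cmdline):
    """Return the amdgpu.* options found in a kernel command line as a dict."""
    params = {}
    for token in cmdline.split():
        # Only tokens of the form amdgpu.<name>=<value> are of interest.
        if token.startswith("amdgpu.") and "=" in token:
            key, value = token[len("amdgpu."):].split("=", 1)
            params[key] = value
    return params


# Hypothetical command line; on a live system read /proc/cmdline instead.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro amdgpu.dc=1 amdgpu.gpu_recovery=1"
print(amdgpu_params(example))
```

On a running system the same function can be fed the contents of /proc/cmdline, which makes it easy to state in a report whether recovery and DC were enabled for a given hang.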




[   55.671991] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1614, emitted seq=1616
[   55.671996] amdgpu 0000:01:00.0: GPU reset begin!
[   55.678313] amdgpu 0000:01:00.0: GPU pci config reset
[   55.682867] amdgpu 0000:01:00.0: GPU reset succeeded, trying to resume
[   55.683678] [drm] PCIE GART of 1024M enabled (table at 0x000000F4007E9000).
[   55.687097] amdgpu: [powerplay] dpm has been enabled
[   55.741072] [drm] UVD initialized successfully.
[   55.950146] [drm] VCE initialized successfully.
[   55.952966] [drm] recover vram bo from shadow start
[   55.955698] [drm] recover vram bo from shadow done
[   55.955746] WARNING: CPU: 5 PID: 120 at /build/linux-liquorix-eJ9K8E/linux-liquorix-4.19/include/linux/dma-fence.h:503 drm_sched_job_recovery+0x1db/0x1e0 [gpu_sched]
[   55.955747] Modules linked in: rfcomm fuse ccm ext4 jbd2 fscrypto af_packet cmac bnep uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common btusb btrtl btbcm btintel videodev bluetooth media ecdh_generic crc16 nls_utf8 nls_cp437 vfat fat ext2 mbcache squashfs loop snd_hda_codec_idt snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core arc4 snd_hwdep intel_rapl snd_pcm x86_pkg_temp_thermal intel_powerclamp snd_seq_dummy coretemp snd_seq_oss snd_seq_midi kvm_intel snd_seq_midi_event ath9k ath9k_common snd_rawmidi ath9k_hw kvm snd_seq ath irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_seq_device mac80211 snd_timer pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper joydev input_leds snd cfg80211 tpm_infineon hp_wmi sparse_keymap
[   55.955775]  serio_raw wmi_bmof sg lpc_ich rfkill soundcore tpm_tis tpm_tis_core tpm rng_core hp_accel lis3lv02d input_polldev evdev pcc_cpufreq acpi_cpufreq battery ac hp_wireless sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables ipv6 crc_ccitt autofs4 btrfs xor raid6_pq libcrc32c crc32c_generic bcache crc64 sr_mod cdrom sd_mod hid_generic usbhid amdkfd amd_iommu_v2 amdgpu chash gpu_sched ahci i2c_algo_bit libahci ttm sdhci_pci libata cqhci drm_kms_helper sdhci ehci_pci crc32c_intel firewire_ohci xhci_pci drm psmouse i2c_i801 firewire_core scsi_mod mmc_core crc_itu_t e1000e ehci_hcd i2c_core xhci_hcd thermal wmi rtc_cmos video button
[   55.955806] CPU: 5 PID: 120 Comm: kworker/5:1 Not tainted 4.19.0-13.1-liquorix-amd64 #1 liquorix 4.19-8ubuntu1~bionic
[   55.955807] Hardware name: Hewlett-Packard /176C, BIOS 68IAV Ver. F.70 07/30/2018
[   55.955809] Workqueue: events drm_sched_job_timedout [gpu_sched]
[   55.955811] RIP: 0010:drm_sched_job_recovery+0x1db/0x1e0 [gpu_sched]
[   55.955812] Code: ff ff ff 48 8b 3c 24 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e 41 5f e9 d5 ab 71 e1 4c 89 f6 4c 89 ff e8 5a fd ff ff e9 33 ff ff ff <0f> 0b eb 93 90 55 53 48 89 fb 48 8b 46 10 48 89 f7 48 8b 68 08 48
[   55.955813] RSP: 0018:ffffc900038e7de0 EFLAGS: 00210202
[   55.955814] RAX: 0000000000000523 RBX: ffff888811704df0 RCX: 0000000000000001
[   55.955814] RDX: ffff88878c5d8050 RSI: ffff888107c18c00 RDI: 0000000000200286
[   55.955815] RBP: ffff888811704d10 R08: 0000000000000000 R09: 0000000000000001
[   55.955816] R10: ffffc90003213dd0 R11: 0000000000000026 R12: ffff88881491cb00
[   55.955816] R13: ffff888811704e28 R14: ffff88878c5d8000 R15: ffff888811704c98
[   55.955817] FS:  0000000000000000(0000) GS:ffff88881db40000(0000) knlGS:0000000000000000
[   55.955818] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   55.955819] CR2: 00000000c12ec008 CR3: 000000000320a003 CR4: 00000000001606e0
[   55.955819] Call Trace:
[   55.955855]  amdgpu_device_gpu_recover+0x3bd/0xa30 [amdgpu]
[   55.955860]  process_one_work+0x1f5/0x420
[   55.955862]  worker_thread+0x43/0x490
[   55.955864]  ? rescuer_thread+0x490/0x490
[   55.955865]  kthread+0x153/0x170
[   55.955866]  ? kthread_park+0x80/0x80
[   55.955869]  ret_from_fork+0x35/0x40
[   55.955870] ---[ end trace 2f9f5d70a335c56f ]---
[   55.955875] [drm] Skip scheduling IBs!
[   56.541150] amdgpu 0000:01:00.0: GPU reset(1) succeeded!
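
The first line of the log above is the one that usually matters for triage: it names the ring that timed out and reports the last signaled and last emitted fence sequence numbers. As a rough sketch (the regular expression and the function name parse_ring_timeout are illustrative, not part of any amdgpu tooling), the line can be parsed to see how many submissions were still outstanding when the timeout fired:

```python
import re

# Matches the amdgpu_job_timedout message format seen in the log above.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:amdgpu_job_timedout \[amdgpu\]\] "
    r"\*ERROR\* ring (?P<ring>\S+) timeout,\s+"
    r"signaled seq=(?P<signaled>\d+),\s+emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return ring name and fence counters from a timeout line, or None."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    signaled = int(match.group("signaled"))
    emitted = int(match.group("emitted"))
    return {
        "timestamp": float(match.group("ts")),
        "ring": match.group("ring"),
        "signaled": signaled,
        "emitted": emitted,
        # Fences emitted to the ring but not yet signaled at timeout.
        "pending": emitted - signaled,
    }


line = ("[   55.671991] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx "
        "timeout, signaled seq=1614, emitted seq=1616")
info = parse_ring_timeout(line)
print(info["ring"], info["pending"])
```

For the line in this report the ring is gfx and two fences were pending, which fits a hang in graphics work rather than in a compute or video ring.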


You are receiving this mail because:
  • You are the assignee for the bug.