From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 111021] [amdgpu][CIK] cp queue preemption time out, BUG: kernel NULL pointer dereference, address: 0000000000000038
Date: Fri, 28 Jun 2019 22:52:28 +0000
Message-ID: <bug-111021-502@http.bugs.freedesktop.org/>


https://bugs.freedesktop.org/show_bug.cgi?id=111021

            Bug ID: 111021
           Summary: [amdgpu][CIK] cp queue preemption time out, BUG:
                    kernel NULL pointer dereference, address:
                    0000000000000038
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: erhard_f@mailbox.org

Created attachment 144678
  --> https://bugs.freedesktop.org/attachment.cgi?id=144678&action=edit
kernel .dmesg (5.2-rc6)

[...]
[  440.685185] cp queue preemption time out
[  440.685338] Resetting wave fronts (nocpsch) on dev 00000000feee3825
[  440.685426] BUG: kernel NULL pointer dereference, address: 0000000000000038
[  440.685432] #PF: supervisor read access in kernel mode
[  440.685436] #PF: error_code(0x0000) - not-present page
[  440.685440] PGD 0 P4D 0 
[  440.685448] Oops: 0000 [#1] SMP NOPTI
[  440.685455] CPU: 3 PID: 1026 Comm: xmr-stak Not tainted 5.2.0-rc6 #1
[  440.685459] Hardware name: System manufacturer System Product Name/M5A78L-M LX3, BIOS 1401    05/05/2016
[  440.685610] RIP: 0010:amdgpu_ib_schedule+0x4b/0x520 [amdgpu]
[  440.685616] Code: 89 f5 49 89 ff 48 89 54 24 08 0f b6 87 38 04 00 00 48 85 c9 0f 84 5d 03 00 00 48 8b 91 b0 00 00 00 48 89 54 24 10 48 8b 51 10 <48> 8b 52 38 48 89 14 24 84 c0 0f 84 09 e2 17 00 48 83 7c 24 10 00
[  440.685621] RSP: 0018:ffffac368c2a7ad0 EFLAGS: 00010286
[  440.685626] RAX: 0000000000000001 RBX: ffff97d66533dc00 RCX: ffff97d66533dc00
[  440.685630] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff97d685fe7d48
[  440.685634] RBP: 0000000000000001 R08: ffffac368c2a7b48 R09: 0000000000000001
[  440.685638] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000007
[  440.685642] R13: 0000000000ffd000 R14: ffff97d685fe0000 R15: ffff97d685fe7d48
[  440.685647] FS:  00007f2115109700(0000) GS:ffff97d6a6ac0000(0000) knlGS:0000000000000000
[  440.685651] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.685655] CR2: 0000000000000038 CR3: 00000003e4236000 CR4: 00000000000406e0
[  440.685659] Call Trace:
[  440.685669]  ? rcu_read_lock_sched_held+0x50/0x60
[  440.685807]  amdgpu_amdkfd_submit_ib+0xb6/0x170 [amdgpu]
[  440.685949]  deallocate_vmid.isra.12+0xe4/0xf0 [amdgpu]
[  440.686091]  destroy_queue_nocpsch_locked+0x176/0x190 [amdgpu]
[  440.686233]  process_termination_nocpsch+0x5e/0x130 [amdgpu]
[  440.686373]  kfd_process_dequeue_from_all_devices+0x36/0x50 [amdgpu]
[  440.686512]  kfd_process_notifier_release+0xf4/0x180 [amdgpu]
[  440.686519]  __mmu_notifier_release+0x65/0x110
[  440.686527]  exit_mmap+0x3b/0x170
[  440.686534]  mmput+0x45/0x110
[  440.686539]  do_exit+0x27d/0xb90
[  440.686546]  ? find_held_lock+0x2d/0x90
[  440.686551]  ? get_signal+0xcc/0xaa0
[  440.686556]  do_group_exit+0x42/0xb0
[  440.686561]  get_signal+0x119/0xaa0
[  440.686568]  do_signal+0x3e/0x620
[  440.686574]  ? find_held_lock+0x2d/0x90
[  440.686580]  exit_to_usermode_loop+0x4b/0xa0
[  440.686585]  do_syscall_64+0x149/0x1a0
[  440.686591]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  440.686596] RIP: 0033:0x7f212b976f6c
[  440.686604] Code: Bad RIP value.
[  440.686608] RSP: 002b:00007f2115108d30 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[  440.686614] RAX: fffffffffffffe00 RBX: 00007f211d838c48 RCX: 00007f212b976f6c
[  440.686618] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f211d838c70
[  440.686622] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f2115109700
[  440.686626] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
[  440.686630] R13: 00007f211d838c20 R14: 0000000000000000 R15: 00007f211d838c70
[  440.686634] Modules linked in: fuse sha256_ssse3 sha256_generic cfg80211 rfkill dm_crypt nhpoly1305_sse2 nhpoly1305 chacha_x86_64 chacha_generic adiantum poly1305_generic algif_skcipher af_alg ext4 crc16 mbcache jbd2 input_leds led_class joydev hid_generic usbhid hid crct10dif_pclmul crc32_generic crc32_pclmul ghash_generic gf128mul gcm xts ctr dm_mod cbc amdgpu ecb evdev gpu_sched ohci_pci i2c_algo_bit ttm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi drm_kms_helper ehci_pci ohci_hcd cfbfillrect syscopyarea snd_hda_intel cfbimgblt k10temp sysfillrect ehci_hcd aesni_intel sysimgblt fb_sys_fops snd_hda_codec cfbcopyarea fb snd_hwdep usbcore aes_x86_64 snd_hda_core fam15h_power hwmon i2c_piix4 usb_common font glue_helper crypto_simd sr_mod snd_pcm cryptd fbdev cdrom button snd_timer drm acpi_cpufreq snd alx drm_panel_orientation_quirks soundcore processor backlight mdio lzo nfsd auth_rpcgss lockd grace zstd sunrpc sg zram zsmalloc
[  440.686714] CR2: 0000000000000038
[  440.686720] ---[ end trace 39cfe5e575b273f7 ]---
[  440.686847] RIP: 0010:amdgpu_ib_schedule+0x4b/0x520 [amdgpu]
[  440.686852] Code: 89 f5 49 89 ff 48 89 54 24 08 0f b6 87 38 04 00 00 48 85 c9 0f 84 5d 03 00 00 48 8b 91 b0 00 00 00 48 89 54 24 10 48 8b 51 10 <48> 8b 52 38 48 89 14 24 84 c0 0f 84 09 e2 17 00 48 83 7c 24 10 00
[  440.686857] RSP: 0018:ffffac368c2a7ad0 EFLAGS: 00010286
[  440.686862] RAX: 0000000000000001 RBX: ffff97d66533dc00 RCX: ffff97d66533dc00
[  440.686866] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff97d685fe7d48
[  440.686869] RBP: 0000000000000001 R08: ffffac368c2a7b48 R09: 0000000000000001
[  440.686873] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000007
[  440.686877] R13: 0000000000ffd000 R14: ffff97d685fe0000 R15: ffff97d685fe7d48
[  440.686882] FS:  00007f2115109700(0000) GS:ffff97d6a6ac0000(0000) knlGS:0000000000000000
[  440.686887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.686890] CR2: 00007f212b976f42 CR3: 00000003e4236000 CR4: 00000000000406e0
[  440.686894] Fixing recursive fault but reboot is needed!
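
As a reading aid for the trace above, here is a minimal Python sketch that cross-checks the faulting access against CR2. It is not part of the original report. The byte string and register values are copied from the oops; the helper names (faulting_bytes, decode_mov_load_disp8) are hypothetical, and the decoder deliberately handles only the one x86-64 encoding that appears at the marked byte (REX.W 8B /r with an 8-bit displacement).

```python
# Values copied from the oops above.
CODE = ("89 f5 49 89 ff 48 89 54 24 08 0f b6 87 38 04 00 00 48 85 "
        "c9 0f 84 5d 03 00 00 48 8b 91 b0 00 00 00 48 89 54 24 10 48 8b 51 10 "
        "<48> 8b 52 38 48 89 14 24 84 c0 0f 84 09 e2 17 00 48 83 7c 24 10 00")
REGS = {"rax": 0x1, "rcx": 0xffff97d66533dc00, "rdx": 0x0,
        "rbx": 0xffff97d66533dc00}
CR2 = 0x38

# Register numbering for ModRM fields when no REX.R/REX.B bit is set.
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code):
    """Return the bytes starting at the <..> marker (the faulting instruction)."""
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:]]


def decode_mov_load_disp8(b):
    """Decode only `REX.W 8B /r` with mod=01: mov r64, [r64 + disp8]."""
    rex, opcode, modrm, disp = b[0], b[1], b[2], b[3]
    if rex != 0x48 or opcode != 0x8B or (modrm >> 6) != 0b01 or (modrm & 7) == 4:
        raise ValueError("not a simple mov r64, [r64+disp8]")
    dest = REG_NAMES[(modrm >> 3) & 7]
    base = REG_NAMES[modrm & 7]
    disp = disp - 256 if disp >= 128 else disp  # sign-extend the displacement
    return dest, base, disp


dest, base, disp = decode_mov_load_disp8(faulting_bytes(CODE))
addr = REGS[base] + disp
print(f"mov {dest}, [{base}{disp:+#x}] -> address {addr:#x}, "
      f"matches CR2: {addr == CR2}")
```

The marked instruction loads from RDX + 0x38, and RDX is 0 in the register dump, which is consistent with the reported fault address 0x38. The instruction just before it (48 8b 51 10) loads RDX from offset 0x10 of the object RCX points to, so that stored pointer appears to have been NULL. Which structure member this corresponds to is not determined here.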

This happens every time xmr-stak 2.10.5 (with ROCm 2.5) tries to compile
shaders for this R9 290X. An ~/.AMD archive is generated, but the compilation
process never finishes. When I close the shell while xmr-stak is still running
(CTRL-C does not stop xmr-stak), I get this kernel BUG. I used a 5.2-rc6 debug
kernel, but it also happens on 5.1.15.

The card is a Sapphire Radeon R9 290X Tri-X OC (11226-18-20G). Additional info
about the system:

Machine:   Type: Desktop Mobo: ASUSTeK model: M5A78L-M LX3 v: Rev X.0x serial: <root required>
           BIOS: American Megatrends v: 1401 date: 05/05/2016
CPU:       6-Core: AMD FX-6300 type: MCP speed: 3817 MHz min/max: 1400/3800 MHz
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Hawaii XT / Grenada XT [Radeon R9 290X/390X] driver: amdgpu v: kernel
           Display: x11 server: X.Org 1.20.4 driver: amdgpu,ati unloaded: modesetting,radeon resolution: 1920x1080~60Hz
           OpenGL: renderer: AMD Radeon R9 200 Series (HAWAII DRM 3.30.0 5.1.15-gentoo LLVM 8.0.0) v: 4.5 Mesa 19.0.8

