All of lore.kernel.org
 help / color / mirror / Atom feed
From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 111881] [kernel 5.4-rc1][amdgpu][CIK]: FW bug: No PASID in KFD interrupt
Date: Wed, 02 Oct 2019 11:03:26 +0000	[thread overview]
Message-ID: <bug-111881-502@http.bugs.freedesktop.org/> (raw)


[-- Attachment #1.1: Type: text/plain, Size: 7573 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111881

            Bug ID: 111881
           Summary: [kernel 5.4-rc1][amdgpu][CIK]: FW bug: No PASID in KFD
                    interrupt
           Product: DRI
           Version: XOrg git
          Hardware: x86-64 (AMD64)
                OS: All
            Status: NEW
          Severity: not set
          Priority: not set
         Component: DRM/amdkfd
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: erhard_f@mailbox.org

Created attachment 145612
  --> https://bugs.freedesktop.org/attachment.cgi?id=145612&action=edit
dmesg (kernel 5.4-rc1)

Card is a Sapphire Radeon R9 290 Tri-X running on a Supermicro H8SGL (Opteron
6380) with Gentoo Linux. OpenCL driver is ROCm 2.8.0.

clinfo segfaults, also the kernel gets a hit:

[...]
Okt 02 12:47:51 yea kernel: clinfo[1138]: segfault at 1000 ip 00007f78d4f52971
sp 00007ffd81ab7170 error 6 in libhsa-runtime64.so.1.1.9[7f78d4f34000+c7000]
Okt 02 12:47:51 yea kernel: Code: ff ff ff 48 8b 85 58 ff ff ff 48 8b 80 b8 03
00 00 48 8b 95 78 ff ff ff 48 c1 e2 03 48 01 c2 48 8b 85 68 ff ff ff 48 8b 40
18 <48> 89 02 c6 45 b0 01 bb 00 00 00 00 0f b6 45 b0 83 f0 01 84 c0 74
Okt 02 12:47:59 yea kernel: Evicting PASID 32770 queues
Okt 02 12:47:59 yea kernel: ------------[ cut here ]------------
Okt 02 12:47:59 yea kernel: FW bug: No PASID in KFD interrupt
Okt 02 12:47:59 yea kernel: WARNING: CPU: 5 PID: 0 at
drivers/gpu/drm/amd/amdgpu/../amdkfd/cik_event_interrupt.c:70
cik_event_interrupt_isr+0x223/0x230 [amdgpu]
Okt 02 12:47:59 yea kernel: Modules linked in: fuse dm_crypt nhpoly1305_sse2
nhpoly1305 chacha_x86_64 chacha_generic adiantum poly1305_generic
algif_skcipher amd64_edac_mod crct10dif_pclmul crc32_generic crc32_pclmul
dm_mod joydev input_leds ghash_generic gf128mul gcm hid_generic usbhid hid xts
ext4 crc16 mbcache ctr jbd2 ath5k led_class amdgpu cbc mac80211 ath ohci_pci
ecb evdev cfg80211 gpu_sched ehci_pci ohci_hcd snd_oxygen i2c_algo_bit ehci_hcd
fam15h_power snd_oxygen_lib aesni_intel ttm snd_mpu401_uart sr_mod glue_helper
rfkill snd_rawmidi usbcore crypto_simd k10temp libarc4 cdrom cryptd
drm_kms_helper snd_hda_codec_hdmi hwmon snd_seq_device i2c_piix4 usb_common
cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt snd_hda_intel
fb_sys_fops cfbcopyarea snd_intel_nhlt fb snd_hda_codec font snd_hwdep fbdev
snd_hda_core drm e1000e snd_pcm snd_timer snd drm_panel_orientation_quirks
backlight soundcore button acpi_cpufreq processor lzo zstd sg zram zsmalloc
Okt 02 12:47:59 yea kernel: CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.4.0-rc1
#1
Okt 02 12:47:59 yea kernel: Hardware name: Supermicro H8SGL/H8SGL, BIOS 3.5b   
   03/18/2016
Okt 02 12:47:59 yea kernel: RIP: 0010:cik_event_interrupt_isr+0x223/0x230
[amdgpu]
Okt 02 12:47:59 yea kernel: Code: ff 0f b6 05 53 15 49 00 84 c0 74 07 31 c0 e9
b0 fe ff ff 48 c7 c7 c0 b2 88 c1 88 44 24 08 c6 05 36 15 49 00 01 e8 81 0f a5
f8 <0f> 0b 0f b6 44 24 08 e9 8d fe ff ff 90 48 b8 00 00 00 00 00 fc ff
Okt 02 12:47:59 yea kernel: RSP: 0018:ffff8883e7888c08 EFLAGS: 00010086
Okt 02 12:47:59 yea kernel: RAX: 0000000000000000 RBX: ffff8883cc044b48 RCX:
ffffffffba10693f
Okt 02 12:47:59 yea kernel: RDX: 0000000000000003 RSI: dffffc0000000000 RDI:
ffff8883e5704f80
Okt 02 12:47:59 yea kernel: RBP: ffff8883e7888c40 R08: fffffbfff76d3d31 R09:
fffffbfff76d3d31
Okt 02 12:47:59 yea kernel: R10: fffffbfff76d3d30 R11: ffffffffbb69e983 R12:
0000000000000008
Okt 02 12:47:59 yea kernel: R13: 00000000000000b5 R14: 0000000000000023 R15:
0000000000000000
Okt 02 12:47:59 yea kernel: FS:  0000000000000000(0000)
GS:ffff8883e7880000(0000) knlGS:0000000000000000
Okt 02 12:47:59 yea kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Okt 02 12:47:59 yea kernel: CR2: 00007fea9066f000 CR3: 00000007f52c2000 CR4:
00000000000406e0
Okt 02 12:47:59 yea kernel: Call Trace:
Okt 02 12:47:59 yea kernel:  <IRQ>
Okt 02 12:47:59 yea kernel:  kgd2kfd_interrupt+0x151/0x1a0 [amdgpu]
Okt 02 12:47:59 yea kernel:  ? kgd2kfd_resume+0xa0/0xa0 [amdgpu]
Okt 02 12:47:59 yea kernel:  ? check_flags.part.41+0x82/0x210
Okt 02 12:47:59 yea kernel:  ? amdgpu_fence_process+0x95/0x1b0 [amdgpu]
Okt 02 12:47:59 yea kernel:  ? amdgpu_irq_dispatch+0x184/0x390 [amdgpu]
Okt 02 12:47:59 yea kernel:  ? gfx_v7_0_eop_irq+0xba/0x100 [amdgpu]
Okt 02 12:47:59 yea kernel:  amdgpu_irq_dispatch+0x1c6/0x390 [amdgpu]
Okt 02 12:47:59 yea kernel:  ? amdgpu_irq_add_id+0x160/0x160 [amdgpu]
Okt 02 12:47:59 yea kernel:  ? lock_downgrade+0x390/0x390
Okt 02 12:47:59 yea kernel:  amdgpu_ih_process+0xf4/0x1d0 [amdgpu]
Okt 02 12:47:59 yea kernel:  ? amdgpu_irq_disable_all+0x1b0/0x1b0 [amdgpu]
Okt 02 12:47:59 yea kernel:  amdgpu_irq_handler+0x20/0x60 [amdgpu]
Okt 02 12:47:59 yea kernel:  ? amdgpu_irq_disable_all+0x1b0/0x1b0 [amdgpu]
Okt 02 12:47:59 yea kernel:  __handle_irq_event_percpu+0x72/0x390
Okt 02 12:47:59 yea kernel:  handle_irq_event_percpu+0x6a/0xe0
Okt 02 12:47:59 yea kernel:  ? __handle_irq_event_percpu+0x390/0x390
Okt 02 12:47:59 yea kernel:  ? rwlock_bug.part.2+0x50/0x50
Okt 02 12:47:59 yea kernel:  ? do_raw_spin_unlock+0x9d/0x130
Okt 02 12:47:59 yea kernel:  handle_irq_event+0x4f/0x7e
Okt 02 12:47:59 yea kernel:  handle_edge_irq+0x100/0x2d0
Okt 02 12:47:59 yea kernel:  do_IRQ+0x72/0x160
Okt 02 12:47:59 yea kernel:  common_interrupt+0xf/0xf
Okt 02 12:47:59 yea kernel:  </IRQ>
Okt 02 12:47:59 yea kernel: RIP: 0010:cpuidle_enter_state+0xcd/0x640
Okt 02 12:47:59 yea kernel: Code: 00 31 ff e8 a5 86 80 ff 80 7c 24 10 00 74 12
9c 58 f6 c4 02 0f 85 42 05 00 00 31 ff e8 cc 5e 89 ff e8 f7 be 8f ff fb 45 85
e4 <0f> 88 fb 03 00 00 4d 63 ec 4f 8d 74 6d 00 49 c1 e6 05 4a 8d 7c 33
Okt 02 12:47:59 yea kernel: RSP: 0018:ffff8883e571fd98 EFLAGS: 00000202
ORIG_RAX: ffffffffffffffdd
Okt 02 12:47:59 yea kernel: RAX: 0000000000000000 RBX: ffffffffc0316680 RCX:
ffffffffba1067e0
Okt 02 12:47:59 yea kernel: RDX: 0000000000000007 RSI: dffffc0000000000 RDI:
ffff8883e5704fb4
Okt 02 12:47:59 yea kernel: RBP: ffff888812779028 R08: 0000000000000002 R09:
0000000000000000
Okt 02 12:47:59 yea kernel: R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000002
Okt 02 12:47:59 yea kernel: R13: 0000000000000002 R14: ffffffffc0316740 R15:
ffffffffc0316780
Okt 02 12:47:59 yea kernel:  ? lockdep_hardirqs_on+0x190/0x280
Okt 02 12:47:59 yea kernel:  ? cpuidle_enter_state+0xc9/0x640
Okt 02 12:47:59 yea kernel:  cpuidle_enter+0x37/0x60
Okt 02 12:47:59 yea kernel:  do_idle+0x2e7/0x380
Okt 02 12:47:59 yea kernel:  ? arch_cpu_idle_exit+0x40/0x40
Okt 02 12:47:59 yea kernel:  ? schedule_idle+0x41/0x50
Okt 02 12:47:59 yea kernel:  cpu_startup_entry+0x14/0x20
Okt 02 12:47:59 yea kernel:  start_secondary+0x1fd/0x240
Okt 02 12:47:59 yea kernel:  ? set_cpu_sibling_map+0xbc0/0xbc0
Okt 02 12:47:59 yea kernel:  secondary_startup_64+0xa4/0xb0
Okt 02 12:47:59 yea kernel: irq event stamp: 450550
Okt 02 12:47:59 yea kernel: hardirqs last  enabled at (450547):
[<ffffffffba8c30b9>] cpuidle_enter_state+0xc9/0x640
Okt 02 12:47:59 yea kernel: hardirqs last disabled at (450548):
[<ffffffffba00276a>] trace_hardirqs_off_thunk+0x1a/0x20
Okt 02 12:47:59 yea kernel: softirqs last  enabled at (450550):
[<ffffffffba07b210>] irq_enter+0x70/0x80
Okt 02 12:47:59 yea kernel: softirqs last disabled at (450549):
[<ffffffffba07b1f5>] irq_enter+0x55/0x80
Okt 02 12:47:59 yea kernel: ---[ end trace 5951fa91933dcafd ]---

-- 
You are receiving this mail because:
You are the assignee for the bug.

[-- Attachment #1.2: Type: text/html, Size: 9054 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

             reply	other threads:[~2019-10-02 11:03 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-02 11:03 bugzilla-daemon [this message]
2019-10-02 11:05 ` [Bug 111881] [kernel 5.4-rc1][amdgpu][CIK]: FW bug: No PASID in KFD interrupt bugzilla-daemon
2019-10-02 13:28 ` bugzilla-daemon
2019-10-03 22:02 ` bugzilla-daemon
2019-10-26 10:58 ` [Bug 111881] [kernel 5.4-rc4][amdgpu][CIK]: " bugzilla-daemon
2019-10-26 10:59 ` bugzilla-daemon
2019-10-26 11:01 ` bugzilla-daemon
2019-10-26 11:04 ` bugzilla-daemon
2019-11-19  7:54 ` bugzilla-daemon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-111881-502@http.bugs.freedesktop.org/ \
    --to=bugzilla-daemon@freedesktop.org \
    --cc=dri-devel@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.