From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 111633] amdgpu driver crash with kernel NULL pointer dereference
Date: Tue, 10 Sep 2019 19:03:17 +0000
Message-ID: <bug-111633-502@http.bugs.freedesktop.org/>


https://bugs.freedesktop.org/show_bug.cgi?id=111633

            Bug ID: 111633
           Summary: amdgpu driver crash with kernel NULL pointer
                    dereference
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: not set
          Priority: not set
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: vakevk+freedesktopbugzilla@gmail.com

I am running Arch Linux: Linux arch 5.2.13-arch1-1-ARCH #1 SMP PREEMPT Fri
Sep 6 17:52:33 UTC 2019 x86_64 GNU/Linux

I am running Wayland via sway.

My GPU is a Radeon RX Vega 64.

While in my sway session the image on my screen froze but audio from a video
continued to play. I was able to ssh in from a different machine and found this
message with dmesg:

BUG: kernel NULL pointer dereference, address: 0000000000000360
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 1 PID: 12766 Comm: kworker/u16:0 Not tainted 5.2.11-arch1-1-ARCH #1
Hardware name: ASUS All Series/Z87-PLUS, BIOS 2103 08/15/2014
Workqueue: events_unbound commit_work [drm_kms_helper]
RIP: 0010:dc_stream_retain+0x5/0x20 [amdgpu]
<Code and registers omitted. I can post them if they are important and someone
reassures me that they don't contain sensitive information, since they look
like a memory dump.>
Call Trace:
 dc_resource_state_copy_construct+0xa0/0xf0 [amdgpu]
 dc_commit_updates_for_stream+0xa63/0xc20 [amdgpu]
 amdgpu_dm_atomic_commit_tail+0xabe/0x19a0 [amdgpu]
 ? commit_tail+0x3c/0x70 [drm_kms_helper]
 commit_tail+0x3c/0x70 [drm_kms_helper]
 process_one_work+0x1d1/0x3e0
 worker_thread+0x4a/0x3d0
 kthread+0xfb/0x130
 ? process_one_work+0x3e0/0x3e0
 ? kthread_park+0x80/0x80
 ret_from_fork+0x35/0x40
Modules linked in: snd_seq_dummy snd_seq tun nft_ct nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 libcrc32c nf_tables_set cfg80211 nf_tables nfnetlink 8021q garp
mrp stp llc intel_rapl nls_iso8859_1 nls_cp437 vfat fat snd_hda_codec_realtek
snd_hda_codec_generic fuse ledtrig_audio ofpart snd_hda_codec_hdmi cmdlinepart
btusb intel_spi_platform snd_hda_intel btrtl x86_pkg_temp_thermal intel_spi
btbcm intel_powerclamp spi_nor btintel eeepc_wmi snd_usb_audio coretemp
snd_hda_codec uvcvideo asus_wmi bluetooth snd_usbmidi_lib iTCO_wdt kvm_intel
snd_hda_core videobuf2_vmalloc mei_hdcp mtd iTCO_vendor_support mxm_wmi
wmi_bmof sparse_keymap snd_hwdep snd_rawmidi snd_seq_device videobuf2_memops
snd_pcm videobuf2_v4l2 snd_timer videobuf2_common snd videodev kvm irqbypass
input_leds ecdh_generic intel_cstate mousedev rfkill intel_uncore mei_me joydev
cdc_acm media ecc e1000e intel_rapl_perf mei soundcore pcc_cpufreq i2c_i801
lpc_ich pcspkr wmi evdev mac_hid ip_tables x_tables ext4
 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid dm_crypt dm_mod
sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ahci
libahci aesni_intel libata aes_x86_64 xhci_pci crypto_simd cryptd glue_helper
xhci_hcd scsi_mod ehci_pci ehci_hcd amdgpu gpu_sched i2c_algo_bit ttm
drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart
CR2: 0000000000000360
---[ end trace 08eaa2e1d713ba4d ]---

At this point I tried killing the sway process but did not succeed even with
`kill -9`. Not even `sudo reboot` completed despite killing the ssh session. I
had to hard reset the machine.

Potentially related: for roughly a week I have been experiencing intermittent
screen freezes that resolve themselves after about 10 seconds, with a message
like this in dmesg:

[drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR*
[CRTC:47:crtc-0] flip_done timed out
[drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:47:crtc-0] hw_done or
flip_done timed out
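
As a reading aid for the oops above, here is a minimal Python sketch that
decodes the page-fault error code and checks whether the faulting address is
small enough to be a field offset from a NULL base pointer. The bit layout is
the architectural x86 page-fault error code; the function names
`decode_pf_error_code` and `looks_like_null_offset`, and the 4096-byte page
size, are assumptions made for illustration and are not kernel APIs.

```python
# Bit layout of the x86 page-fault error code (architectural definition).
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access, 1: write access
PF_USER = 1 << 2   # 0: supervisor (kernel) mode, 1: user mode
PF_RSVD = 1 << 3   # 1: reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4  # 1: fault caused by an instruction fetch


def decode_pf_error_code(code: int) -> dict:
    """Hypothetical helper: turn the error code into readable fields."""
    return {
        "cause": "protection violation" if code & PF_PROT else "not-present page",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "supervisor",
        "reserved_bit": bool(code & PF_RSVD),
        "instruction_fetch": bool(code & PF_INSTR),
    }


def looks_like_null_offset(address: int, page_size: int = 4096) -> bool:
    """Hypothetical heuristic: an address inside the first page is typical
    of accessing a struct member through a NULL base pointer.
    The page size is an assumption (4 KiB pages)."""
    return 0 <= address < page_size


# Values taken from the oops in this report: "Oops: 0002" and "CR2: ...0360".
print(decode_pf_error_code(0x0002))
print(looks_like_null_offset(0x360))
```

For the values in this report, the decode agrees with the `#PF:` lines
(a supervisor write to a not-present page), and 0x360 falls inside the first
page. That is consistent with, but does not prove, a write to a struct member
through a NULL pointer in dc_stream_retain.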

-- 
You are receiving this mail because:
You are the assignee for the bug.

