public inbox for amd-gfx@lists.freedesktop.org
From: "Dr. David Alan Gilbert" <dave@treblig.org>
To: "Christian König" <christian.koenig@amd.com>
Cc: alexander.deucher@amd.com, amd-gfx@lists.freedesktop.org
Subject: Re: oops/null pointer in 0010:dma_fence_is_signaled+0x12/0x60 [amdgpu]
Date: Mon, 16 Mar 2026 12:34:57 +0000
Message-ID: <abf48cuwzRBczbeA@gallifrey>
In-Reply-To: <41627fc7-a0d8-44bf-982d-334f548d9f88@amd.com>

* Christian König (christian.koenig@amd.com) wrote:
> Hi,
> 
> On 3/16/26 01:57, Dr. David Alan Gilbert wrote:
> > Hi,
> >   I'm not sure if this is repeatable, but I landed with a null
> > pointer during a GPU reset, so thought I should probably
> > report it:
> >    6.19.7-300.fc44.x86_64
> > Mar 16 00:24:39 dalek kernel: BUG: kernel NULL pointer dereference, address: 0000000000000018
> > Mar 16 00:24:39 dalek kernel: #PF: supervisor read access in kernel mode
> > ....
> > Mar 16 00:24:39 dalek kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
> > Mar 16 00:24:39 dalek kernel: RIP: 0010:dma_fence_is_signaled+0x12/0x60 [amdgpu]
> > 
> > see full oops below;
> > 
> > 09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X] (rev c7)
> > AMD Ryzen 9 3950X
> > 
> > I suspect the timeout was real, and caused by a runaway llama, I forgot the
> > flag to stop it trying to use the GPU for image encoding; but the null page
> > seems unfortunate.  Impressively the audio still kept playing via it:
> > 
> > I have the devcoredump copied if it's of interest
> > (Note for self: ~/amd.core-2026-03-16)
> 
> 
> Can you open up a bug report for that

Sure; on bugzilla.kernel.org?

> and please provide what line of code amdgpu_device_gpu_recover.cold+0x244 decodes to.

Yeh, let me see what I can do with the Fedora symbol packages.
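
For reference, a minimal sketch of how the requested lookup might be prepared: it parses a call-trace entry of the form symbol+offset/size [module] and builds an invocation of the kernel tree's scripts/faddr2line. The object path used below is a placeholder, not the real location; it assumes a debug-info copy of the amdgpu module is available (for example from the distribution's kernel debuginfo package), and the helper names are hypothetical.

```python
import re
import shlex

# Matches a call-trace entry such as
#   amdgpu_device_gpu_recover.cold+0x244/0x2ec [amdgpu]
# The trailing "[module]" part is optional (built-in symbols have none).
TRACE_RE = re.compile(
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_trace_entry(entry):
    """Split a trace entry into symbol, offset, size and module name."""
    match = TRACE_RE.search(entry)
    if match is None:
        raise ValueError(f"not a symbol+offset/size entry: {entry!r}")
    return {
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
        "module": match.group("module"),  # None when no [module] suffix
    }


def faddr2line_command(entry, debug_object):
    """Build the argv for scripts/faddr2line.

    debug_object is the path to an object file with debug info; the
    caller has to supply it, nothing here tries to locate it.
    """
    info = parse_trace_entry(entry)
    target = f"{info['symbol']}+{info['offset']:#x}/{info['size']:#x}"
    return ["scripts/faddr2line", debug_object, target]


if __name__ == "__main__":
    line = "amdgpu_device_gpu_recover.cold+0x244/0x2ec [amdgpu]"
    # "amdgpu.ko.debug" is a placeholder path (assumption).
    print(shlex.join(faddr2line_command(line, "amdgpu.ko.debug")))
```

The printed command is meant to be run from a kernel source tree matching the running kernel; the result is only meaningful if the debug object comes from exactly the same build (6.19.7-300.fc44.x86_64 here).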

Dave

> 
> Thanks,
> Christian.
> 
> > 
> > Dave
> > 
> > 
> > Mar 16 00:24:28 dalek kernel: amdgpu 0000:09:00.0: amdgpu: Dumping IP State
> > Mar 16 00:24:28 dalek kernel: amdgpu 0000:09:00.0: amdgpu: Dumping IP State Completed
> > Mar 16 00:24:28 dalek kernel: amdgpu 0000:09:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
> > Mar 16 00:24:28 dalek kernel: amdgpu 0000:09:00.0: amdgpu: [drm] Check your /sys/class/drm/card1/device/devcoredump/data
> > Mar 16 00:24:28 dalek kernel: amdgpu 0000:09:00.0: amdgpu: ring gfx timeout, signaled seq=1705630, emitted seq=1705633
> > Mar 16 00:24:28 dalek kernel: amdgpu 0000:09:00.0: amdgpu:  Process llama-mtmd-cli pid 299886 thread llama-mtmd-cli pid 299886
> > Mar 16 00:24:28 dalek kernel: amdgpu 0000:09:00.0: amdgpu: GPU reset begin!. Source:  1
> > Mar 16 00:24:32 dalek kernel: amdgpu 0000:09:00.0: amdgpu: failed to suspend display audio
> > Mar 16 00:24:32 dalek kernel: amdgpu 0000:09:00.0: amdgpu: Guilty job already signaled, skipping HW reset
> > Mar 16 00:24:32 dalek kernel: amdgpu 0000:09:00.0: amdgpu: GPU reset(1) succeeded!
> > Mar 16 00:24:32 dalek kernel: amdgpu 0000:09:00.0: [drm] device wedged, but recovered through reset
> > Mar 16 00:24:34 dalek lightdm[40555]: ATTENTION: default value of option mesa_glthread overridden by environment.
> > Mar 16 00:24:34 dalek kernel: amdgpu 0000:09:00.0: amdgpu: Dumping IP State
> > Mar 16 00:24:34 dalek kernel: amdgpu 0000:09:00.0: amdgpu: Dumping IP State Completed
> > Mar 16 00:24:34 dalek kernel: amdgpu 0000:09:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
> > Mar 16 00:24:34 dalek kernel: amdgpu 0000:09:00.0: amdgpu: [drm] Check your /sys/class/drm/card1/device/devcoredump/data
> > Mar 16 00:24:34 dalek kernel: amdgpu 0000:09:00.0: amdgpu: ring gfx timeout, signaled seq=1705632, emitted seq=1705637
> > Mar 16 00:24:34 dalek kernel: amdgpu 0000:09:00.0: amdgpu: GPU reset begin!. Source:  1
> > Mar 16 00:24:38 dalek kernel: amdgpu 0000:09:00.0: amdgpu: failed to suspend display audio
> > Mar 16 00:24:39 dalek kernel: BUG: kernel NULL pointer dereference, address: 0000000000000018
> > Mar 16 00:24:39 dalek kernel: #PF: supervisor read access in kernel mode
> > Mar 16 00:24:39 dalek kernel: #PF: error_code(0x0000) - not-present page
> > Mar 16 00:24:39 dalek kernel: PGD 849708067 P4D 849708067 PUD 15d277067 PMD 0
> > Mar 16 00:24:39 dalek kernel: Oops: Oops: 0000 [#1] SMP NOPTI
> > Mar 16 00:24:39 dalek kernel: CPU: 7 UID: 0 PID: 298062 Comm: kworker/u128:2 Not tainted 6.19.7-300.fc44.x86_64 #1 PREEMPT(lazy)
> > Mar 16 00:24:39 dalek kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Pro4, BIOS P3.10 07/13/2020
> > Mar 16 00:24:39 dalek kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
> > Mar 16 00:24:39 dalek kernel: RIP: 0010:dma_fence_is_signaled+0x12/0x60 [amdgpu]
> > Mar 16 00:24:39 dalek kernel: Code: 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 8b 47 30 48 d1 e8 89 c2 83 e2 01 75 2c 48 8b>
> > Mar 16 00:24:39 dalek kernel: RSP: 0018:ffffcf349c553d20 EFLAGS: 00010246
> > Mar 16 00:24:39 dalek kernel: RAX: 0000000000000000 RBX: ffffcf349c553da0 RCX: 0000000000000000
> > Mar 16 00:24:39 dalek kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8bce4ecd4380
> > Mar 16 00:24:39 dalek kernel: RBP: 0000000000000000 R08: 0000000010000020 R09: ffff8bc9c0400b68
> > Mar 16 00:24:39 dalek kernel: R10: 0000000000000080 R11: ffffffffa16760a0 R12: ffff8bc9e9100000
> > Mar 16 00:24:39 dalek kernel: R13: ffff8bca7ccab200 R14: 0000000000000000 R15: 0000000000000000
> > Mar 16 00:24:39 dalek kernel: FS:  0000000000000000(0000) GS:ffff8bd90c066000(0000) knlGS:0000000000000000
> > Mar 16 00:24:39 dalek kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > Mar 16 00:24:39 dalek kernel: CR2: 0000000000000018 CR3: 00000001c591f000 CR4: 0000000000350ef0
> > Mar 16 00:24:39 dalek kernel: Call Trace:
> > Mar 16 00:24:39 dalek kernel:  <TASK>
> > Mar 16 00:24:39 dalek kernel:  amdgpu_device_gpu_recover.cold+0x244/0x2ec [amdgpu]
> > Mar 16 00:24:39 dalek kernel:  amdgpu_job_timedout.cold+0x218/0x258 [amdgpu]
> > Mar 16 00:24:39 dalek kernel:  ? srso_return_thunk+0x5/0x5f
> > Mar 16 00:24:39 dalek kernel:  drm_sched_job_timedout+0x8b/0x190 [gpu_sched]
> > Mar 16 00:24:39 dalek kernel:  ? srso_return_thunk+0x5/0x5f
> > Mar 16 00:24:39 dalek kernel:  process_one_work+0x190/0x350
> > Mar 16 00:24:39 dalek kernel:  worker_thread+0x18d/0x2f0
> > Mar 16 00:24:39 dalek kernel:  ? __pfx_worker_thread+0x10/0x10
> > Mar 16 00:24:39 dalek kernel:  kthread+0xfa/0x240
> > Mar 16 00:24:39 dalek kernel:  ? finish_task_switch.isra.0+0x82/0x2a0
> > Mar 16 00:24:39 dalek kernel:  ? __pfx_kthread+0x10/0x10
> > Mar 16 00:24:39 dalek kernel:  ? __pfx_kthread+0x10/0x10
> > Mar 16 00:24:39 dalek kernel:  ret_from_fork+0x130/0x1a0
> > Mar 16 00:24:39 dalek kernel:  ? __pfx_kthread+0x10/0x10
> > Mar 16 00:24:39 dalek kernel:  ret_from_fork_asm+0x1a/0x30
> > Mar 16 00:24:39 dalek kernel:  </TASK>
> > Mar 16 00:24:39 dalek kernel: Modules linked in: dm_crypt snd_seq_dummy snd_hrtimer nft_masq nft_reject_ipv4 act_csum cls_u32 sch_htb nf_nat_tftp nf_conntr>
> > Mar 16 00:24:39 dalek kernel:  drm_panel_backlight_quirks gpu_sched drm_suballoc_helper video drm_buddy drm_display_helper nvme nvme_core cec ghash_clmulni>
> > Mar 16 00:24:39 dalek kernel: CR2: 0000000000000018
> > Mar 16 00:24:39 dalek kernel: ---[ end trace 0000000000000000 ]---
> > Mar 16 00:24:39 dalek kernel: RIP: 0010:dma_fence_is_signaled+0x12/0x60 [amdgpu]
> > Mar 16 00:24:39 dalek kernel: Code: 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 8b 47 30 48 d1 e8 89 c2 83 e2 01 75 2c 48 8b>
> > Mar 16 00:24:39 dalek kernel: RSP: 0018:ffffcf349c553d20 EFLAGS: 00010246
> > Mar 16 00:24:39 dalek kernel: RAX: 0000000000000000 RBX: ffffcf349c553da0 RCX: 0000000000000000
> > Mar 16 00:24:39 dalek kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8bce4ecd4380
> > Mar 16 00:24:39 dalek kernel: RBP: 0000000000000000 R08: 0000000010000020 R09: ffff8bc9c0400b68
> > Mar 16 00:24:39 dalek kernel: R10: 0000000000000080 R11: ffffffffa16760a0 R12: ffff8bc9e9100000
> > Mar 16 00:24:39 dalek kernel: R13: ffff8bca7ccab200 R14: 0000000000000000 R15: 0000000000000000
> > Mar 16 00:24:39 dalek kernel: FS:  0000000000000000(0000) GS:ffff8bd90c066000(0000) knlGS:0000000000000000
> > Mar 16 00:24:39 dalek kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > Mar 16 00:24:39 dalek kernel: CR2: 0000000000000018 CR3: 00000001c591f000 CR4: 0000000000350ef0
> > Mar 16 00:24:39 dalek kernel: note: kworker/u128:2[298062] exited with irqs disabled
> > 
> > --
> >  -----Open up your eyes, open up your mind, open up your code -------
> > / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
> > \        dave @ treblig.org |                               | In Hex /
> >  \ _________________________|_____ http://www.treblig.org   |_______/
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/
