* drm: xe: Kernel-submitted job timed out
@ 2026-05-22 18:52 Linus Torvalds
  2026-05-22 18:55 ` Maarten Lankhorst
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2026-05-22 18:52 UTC
  To: Matthew Brost, Thomas Hellström, Rodrigo Vivi
  Cc: David Airlie, Simona Vetter, intel-xe, dri-devel

[-- Attachment #1: Type: text/plain, Size: 690 bytes --]

Actually, this job doesn't seem to have timed out at all; it seems
never to have been started, and subsequent operations were then confused.

I had to reboot my desktop because it had become non-responsive (the
cursor was moving, but the screen was not updating) after two lines of

  xe 0000:4b:00.0: [drm] Tile0: GT0: Check job timeout: seqno=4485322, lrc_seqno=4485322, guc_id=0, not started

followed a few seconds later by some Xe fault and then an endless
stream of "Kernel-submitted job timed out" reports.

Presumably that job was the thing that was never started in the first place.

Cut-down dmesg with the endless repeats deleted (after rebooting to
get a working system) attached.

             Linus

[-- Attachment #2: out --]
[-- Type: application/octet-stream, Size: 5224 bytes --]


May 22 11:09:11 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Check job timeout: seqno=4485322, lrc_seqno=4485322, guc_id=0, not started
May 22 11:09:16 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Check job timeout: seqno=4485322, lrc_seqno=4485322, guc_id=0, not started

May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: 
                                      ASID: 0
                                      Faulted Address: 0x00000002fa9fa000
                                      FaultType: 0
                                      AccessType: 0
                                      FaultLevel: 2
                                      EngineClass: 3 bcs
                                      EngineInstance: 8
May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Fault response: Unsuccessful -EINVAL
May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Engine memory CAT error [18]: class=bcs, logical_mask: 0x2, guc_id=0
May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Engine reset: engine_class=bcs, logical_mask: 0x2, guc_id=0, state=0x249
May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Timedout job: seqno=4485322, lrc_seqno=4485322, guc_id=0, flags=0x73 in no process [-1]
May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Xe device coredump has been created
May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
May 22 11:09:19 3970x kernel: ------------[ cut here ]------------
May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Kernel-submitted job timed out
May 22 11:09:19 3970x kernel: WARNING: drivers/gpu/drm/xe/xe_guc_submit.c:1627 at guc_exec_queue_timedout_job+0xe29/0x1000 [xe], CPU#17: kworker/u256:0/2306935
May 22 11:09:19 3970x kernel: Modules linked in: uas usb_storage uinput rfcomm nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables sunrpc bnep vfat fat iwlmvm mac80211 libarc4 snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_usb_audio snd_hda_core btusb snd_hwdep btrtl amd_atl snd_usbmidi_lib iwlwifi snd_seq btintel amd64_edac snd_rawmidi btbcm bluetooth snd_pcm edac_mce_amd snd_seq_device wmi_bmof atlantic cfg80211 pcspkr igb mc macsec dca snd_timer rfkill mxm_wmi snd i2c_piix4 soundcore i2c_smbus k10temp joydev nfnetlink zram dm_crypt xe drm_ttm_helper ttm i2c_algo_bit gpu_sched drm_buddy video drm_client_lib drm_suballoc_helper drm_gpuvm drm_exec drm_gpusvm_helper drm_display_helper drm_kms_helper ccp drm cec nvme sp5100_tco nvme_core wmi i2c_dev fuse
May 22 11:09:19 3970x kernel: CPU: 17 UID: 0 PID: 2306935 Comm: kworker/u256:0 Not tainted 7.1.0-rc3-00073-ga6920214ba75 #46 PREEMPTLAZY 
May 22 11:09:19 3970x kernel: Hardware name: Gigabyte Technology Co., Ltd. TRX40 AORUS MASTER/TRX40 AORUS MASTER, BIOS F7 09/07/2022
May 22 11:09:19 3970x kernel: Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
May 22 11:09:19 3970x kernel: RIP: 0010:guc_exec_queue_timedout_job+0xf3b/0x1000 [xe]
May 22 11:09:19 3970x kernel: Code: 8b 11 48 85 d2 74 06 48 8b 7a 08 eb 02 31 ff 48 8b 57 50 48 85 d2 75 03 48 8b 17 44 0f b6 46 26 0f b6 49 08 4c 89 f7 48 89 c6 <67> 48 0f b9 3a 48 8b 43 60 4c 8b 7c 24 20 44 8b 74 24 28 49 89 dc
May 22 11:09:19 3970x kernel: RSP: 0018:ffffd0e0aa6c7d88 EFLAGS: 00010246
May 22 11:09:19 3970x kernel: RAX: ffffffffc0c3b7cd RBX: ffff8a4020c46400 RCX: 0000000000000000
May 22 11:09:19 3970x kernel: RDX: ffff8a4004d369c0 RSI: ffffffffc0c3b7cd RDI: ffffffffc091cd30
May 22 11:09:19 3970x kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8a4f3f0fb240
May 22 11:09:19 3970x kernel: R10: 000000000000bffd R11: 3fffffffffffbfff R12: ffff8a4020c46400
May 22 11:09:19 3970x kernel: R13: ffff8a401c558e00 R14: ffffffffc091cd30 R15: ffff8a40166c0000
May 22 11:09:19 3970x kernel: FS:  0000000000000000(0000) GS:ffff8a4f4e2ed000(0000) knlGS:0000000000000000
May 22 11:09:19 3970x kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 22 11:09:19 3970x kernel: CR2: 0000243c08ff0000 CR3: 00000001994d5000 CR4: 0000000000350ef0
May 22 11:09:19 3970x kernel: Call Trace:
May 22 11:09:19 3970x kernel:  <TASK>
May 22 11:09:19 3970x kernel:  ? wake_bit_function+0x60/0x60
May 22 11:09:19 3970x kernel:  drm_sched_job_timedout+0xb8/0x130 [gpu_sched]
May 22 11:09:19 3970x kernel:  process_scheduled_works+0x1ac/0x380
May 22 11:09:19 3970x kernel:  worker_thread+0x1f4/0x2d0
May 22 11:09:19 3970x kernel:  ? pr_cont_work+0x1b0/0x1b0
May 22 11:09:19 3970x kernel:  kthread+0xee/0x120
May 22 11:09:19 3970x kernel:  ? kthread_blkcg+0x30/0x30
May 22 11:09:19 3970x kernel:  ret_from_fork+0x9d/0x200
May 22 11:09:19 3970x kernel:  ? kthread_blkcg+0x30/0x30
May 22 11:09:19 3970x kernel:  ret_from_fork_asm+0x11/0x20
May 22 11:09:19 3970x kernel:  </TASK>
May 22 11:09:19 3970x kernel: ---[ end trace 0000000000000000 ]---
May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Timedout job: seqno=4485325, lrc_seqno=4485325, guc_id=0, flags=0x73 in no process [-1]
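
The correlation suggested in the report (the job that eventually "timed out" is the same one that was earlier reported as "not started") could be checked mechanically with something like the sketch below. It is only an illustration: the function name and the two regular expressions are hypothetical, modelled solely on the message text visible in this log, and are not taken from the xe driver or any existing tool.

```python
import re

# Assumption: these patterns mirror only the message formats seen in this log.
CHECK_NOT_STARTED_RE = re.compile(
    r"Check job timeout: seqno=(\d+), lrc_seqno=\d+, guc_id=\d+, not started"
)
TIMEDOUT_RE = re.compile(r"Timedout job: seqno=(\d+),")


def never_started_timeouts(lines):
    """Return seqnos reported as 'not started' that later show up as timed out."""
    not_started = set()
    confirmed = []
    for line in lines:
        match = CHECK_NOT_STARTED_RE.search(line)
        if match:
            not_started.add(int(match.group(1)))
            continue
        match = TIMEDOUT_RE.search(line)
        if match:
            seqno = int(match.group(1))
            # Report each seqno once, in order of first appearance.
            if seqno in not_started and seqno not in confirmed:
                confirmed.append(seqno)
    return confirmed


# Lines copied from the attached log, with the journal prefix trimmed.
SAMPLE = [
    "xe 0000:4b:00.0: [drm] Tile0: GT0: Check job timeout: seqno=4485322, lrc_seqno=4485322, guc_id=0, not started",
    "xe 0000:4b:00.0: [drm] Tile0: GT0: Check job timeout: seqno=4485322, lrc_seqno=4485322, guc_id=0, not started",
    "xe 0000:4b:00.0: [drm] Tile0: GT0: Timedout job: seqno=4485322, lrc_seqno=4485322, guc_id=0, flags=0x73 in no process [-1]",
    "xe 0000:4b:00.0: [drm] Tile0: GT0: Timedout job: seqno=4485325, lrc_seqno=4485325, guc_id=0, flags=0x73 in no process [-1]",
]

if __name__ == "__main__":
    print(never_started_timeouts(SAMPLE))  # [4485322]
```

On the excerpt above this reports only seqno 4485322; the later timeout for seqno 4485325 has no matching "not started" line in the cut-down log, so it is not listed.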


end of thread, other threads:[~2026-06-11 13:46 UTC | newest]

Thread overview: 9+ messages
2026-05-22 18:52 drm: xe: Kernel-submitted job timed out Linus Torvalds
2026-05-22 18:55 ` Maarten Lankhorst
2026-05-22 19:05   ` Linus Torvalds
2026-05-22 20:44     ` Rodrigo Vivi
2026-05-22 20:54       ` Linus Torvalds
2026-05-23  8:29         ` Maarten Lankhorst
2026-05-23 14:48           ` Linus Torvalds
2026-06-09 16:30             ` Matthew Brost
2026-06-11 13:46               ` Rodrigo Vivi
