From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Romain Guyard <kernel@romainguyard.com>
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Jani Nikula <jani.nikula@linux.intel.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tvrtko Ursulin <tursulin@ursulin.net>,
	Scott Oehrlein <scott.oehrlein@intel.com>,
	Ben Hutchings <ben@decadent.org.uk>
Subject: Re: [PATCH v4 0/9] drm/i915: PREEMPT_RT related fixups.
Date: Thu, 21 Aug 2025 13:13:48 +0200
Message-ID: <20250821111348.6iskn4K9@linutronix.de>
In-Reply-To: <f6b3ed54-dc0e-45a0-8f8d-0826d0133705@romainguyard.com>

On 2025-07-21 14:06:48 [+0900], Romain Guyard wrote:
> Hello,
Hi,

> [ 2349.629427] Hardware name: ADLINK TECHNOLOGY Inc. -612X/-612X, BIOS
> [ 2349.629454]  </TASK>
> [ 2412.634282] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 2412.634284] rcu:     Tasks blocked on level-0 rcu_node (CPUs 0-15): P12083/1:b..l P12724/1:b..l P12725/1:b..l P4057/3:b..l
> [ 2412.634289] rcu:     (detected by 14, t=147008 jiffies, g=355917, q=9582 ncpus=16)
> [ 2412.634290] task:Xorg            state:D stack:0     pid:4057 tgid:4057 ppid:4055   task_flags:0x400100 flags:0x00004000
> [ 2412.634292] Call Trace:
> [ 2412.634293]  <TASK>
> [ 2412.634295]  __schedule+0x44c/0xad0
> [ 2412.634302]  schedule_rtlock+0x25/0x40
> [ 2412.634303]  rtlock_slowlock_locked+0x20d/0xe00
> [ 2412.634307]  rt_spin_lock+0x7a/0xd0
> [ 2412.634309]  execlists_submission_tasklet+0x143/0x14d0
> [ 2412.634354]  tasklet_action_common+0xc1/0x230
> [ 2412.634356]  handle_softirqs.constprop.0+0xce/0x280
> [ 2412.634358]  __local_bh_enable_ip+0xa0/0xd0
> [ 2412.634359]  i915_gem_do_execbuffer+0x1a73/0x2920

This task blocks on a lock and waits to make progress. I have not found
out who is holding that lock, though.

…

> [ 2412.634511]  </TASK>
> [ 2412.634511] task:kworker/14:1    state:R  running task stack:0    pid:12083 tgid:12083 ppid:2      task_flags:0x4208060 flags:0x00004000
> [ 2412.634513] Workqueue: i915-unordered engine_retire
> [ 2412.634515] Call Trace:
> [ 2412.634516]  <TASK>
> [ 2412.634516]  __schedule+0x44c/0xad0
> [ 2412.634520]  preempt_schedule_common+0x31/0x80
> [ 2412.634521]  preempt_schedule_thunk+0x16/0x30
> [ 2412.634523]  migrate_enable+0xe6/0x100
> [ 2412.634525]  rt_spin_unlock+0x12/0x40
> [ 2412.634526]  remove_from_engine+0x76/0xc0
> [ 2412.634528]  i915_request_retire.part.0+0x7c/0x220
> [ 2412.634530]  engine_retire+0xc3/0x100
> [ 2412.634531]  process_one_work+0x166/0x390
> [ 2412.634533]  worker_thread+0x29d/0x3c0

This might be the lock holder. The task is in the running state, so I
don't understand what keeps the scheduler from putting it back on the CPU.
There is at least one idle CPU available, but although this workqueue is
called i915-unordered, the work must complete on the same CPU (it can't
migrate). So what is CPU14 doing? It should schedule something and not be
idle.
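
To cross-check which of the tasks named in the RCU stall line also have a
backtrace in the log, something like the following might help. It is only a
sketch: the helper names (blocked_pids, task_headers) are made up for
illustration, the two regular expressions are guesses based on the lines
quoted above, and the number after "P" is assumed to be the PID because it
matches pid:12083 and pid:4057 in the task headers.

```python
import re

# Lines copied from the quoted log above.
STALL_LINE = (
    "[ 2412.634284] rcu:     Tasks blocked on level-0 rcu_node (CPUs 0-15): "
    "P12083/1:b..l P12724/1:b..l P12725/1:b..l P4057/3:b..l"
)
TASK_LINES = [
    "[ 2412.634290] task:Xorg            state:D stack:0     pid:4057 "
    "tgid:4057 ppid:4055   task_flags:0x400100 flags:0x00004000",
    "[ 2412.634511] task:kworker/14:1    state:R  running task stack:0    "
    "pid:12083 tgid:12083 ppid:2      task_flags:0x4208060 flags:0x00004000",
]


def blocked_pids(stall_line):
    """Return the PIDs listed as P<pid>/... in an RCU stall line (assumed format)."""
    return [int(pid) for pid in re.findall(r"\bP(\d+)/", stall_line)]


def task_headers(lines):
    """Map pid -> (task name, state letter) for every 'task:' header line."""
    tasks = {}
    for line in lines:
        m = re.search(r"task:(\S+)\s+state:(\S)\b.*?\bpid:(\d+)", line)
        if m:
            tasks[int(m.group(3))] = (m.group(1), m.group(2))
    return tasks


if __name__ == "__main__":
    known = task_headers(TASK_LINES)
    for pid in blocked_pids(STALL_LINE):
        name, state = known.get(pid, ("<no backtrace in excerpt>", "?"))
        print(f"pid {pid}: {name} state {state}")
```

With the excerpt above, 12083 (kworker/14:1, state R) and 4057 (Xorg, state
D) are matched to a task header; 12724 and 12725 are not part of the quoted
portion of the log.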

> Looks like there are some i915 locking stuff in those BTs.
> 
> I am not very knowledgeable about i915 and RT, so my help is quite limited,
> but since this is easily reproduced (always crash or hangs after <1H), I can
> try things.

I don't know what you can retrieve from the kdump, but I guess CPU14 should
be spinning on something. RCU complains about not making progress.
If RCU-boost is enabled, then the kworker should have one more reason to
be on the CPU.
Could you try v6.17-rc? I didn't add anything i915 related.
Could you please enable CONFIG_PROVE_LOCKING and
CONFIG_DEBUG_ATOMIC_SLEEP and check if the kernel complains? Maybe there
is something new I haven't noticed.
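
As a rough sketch, the two options could go into a config fragment like
the one below. The file name is only an example, and CONFIG_DEBUG_KERNEL is
listed on the assumption that it is not already set, since both options
depend on it. After merging the fragment into the existing .config, running
"make olddefconfig" should resolve the remaining dependencies.

```
# rt-debug.config (hypothetical file name)
CONFIG_DEBUG_KERNEL=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
```

It is worth checking the resulting .config afterwards, because an option
whose dependencies are not met is dropped silently.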

> Thank you!
> 
> Romain Guyard

Sebastian

Thread overview: 17+ messages
2025-07-14 15:39 [PATCH v4 0/9] drm/i915: PREEMPT_RT related fixups Sebastian Andrzej Siewior
2025-07-14 15:39 ` [PATCH v4 1/9] drm/i915: Use preempt_disable/enable_rt() where recommended Sebastian Andrzej Siewior
2025-07-14 15:39 ` [PATCH v4 2/9] drm/i915: Don't disable interrupts on PREEMPT_RT during atomic updates Sebastian Andrzej Siewior
2025-07-14 15:39 ` [PATCH v4 3/9] drm/i915: Don't check for atomic context on PREEMPT_RT Sebastian Andrzej Siewior
2025-07-14 15:39 ` [PATCH v4 4/9] drm/i915: Disable tracing points " Sebastian Andrzej Siewior
2025-07-15  8:01   ` Maarten Lankhorst
2025-07-15 17:54     ` Ville Syrjälä
2025-07-14 15:39 ` [PATCH v4 5/9] drm/i915/gt: Use spin_lock_irq() instead of local_irq_disable() + spin_lock() Sebastian Andrzej Siewior
2025-07-14 15:39 ` [PATCH v4 6/9] drm/i915: Drop the irqs_disabled() check Sebastian Andrzej Siewior
2025-07-14 15:39 ` [PATCH v4 7/9] drm/i915/guc: Consider also RCU depth in busy loop Sebastian Andrzej Siewior
2025-07-14 15:39 ` [PATCH v4 8/9] drm/i915: Consider RCU read section as atomic Sebastian Andrzej Siewior
2025-07-14 15:39 ` [PATCH v4 9/9] Revert "drm/i915: Depend on !PREEMPT_RT." Sebastian Andrzej Siewior
2025-07-14 18:08 ` ✓ i915.CI.BAT: success for drm/i915: PREEMPT_RT related fixups. (rev13) Patchwork
2025-07-14 21:16 ` ✗ i915.CI.Full: failure " Patchwork
2025-07-21  5:06 ` [PATCH v4 0/9] drm/i915: PREEMPT_RT related fixups Romain Guyard
2025-08-21 11:13   ` Sebastian Andrzej Siewior [this message]
2025-08-21 11:17 ` Sebastian Andrzej Siewior
