public inbox for linux-kernel@vger.kernel.org
From: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>, Phil Auld <pauld@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Shizhao Chen <shichen@redhat.com>,
	linux-kernel@vger.kernel.org, Omar Sandoval <osandov@fb.com>,
	Xuewen Yan <xuewen.yan@unisoc.com>
Subject: Re: sched: update_entity_lag does not handle corner case with task in PI chain
Date: Tue, 21 Oct 2025 21:35:52 -0300
Message-ID: <aPgm6KvDx5Os2oJS@uudg.org>
In-Reply-To: <c10f6fda-aa8c-4d8e-a315-3c084af08862@amd.com>

On Tue, Oct 21, 2025 at 12:38:17PM +0530, K Prateek Nayak wrote:
> Hello Peter, Luis,
> 
> On 10/19/2025 1:27 AM, Peter Zijlstra wrote:
> >> [ 1805.450470] ------------[ cut here ]------------
> >> [ 1805.450474] WARNING: CPU: 2 PID: 19 at kernel/sched/fair.c:697 update_entity_lag+0x5b/0x70
> >> [ 1805.463366] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm platform_profile dell_wmi sparse_keymap rfkill irqbypass iTCO_wdt video mgag200 rapl iTCO_vendor_support dell_smbios ipmi_ssif intel_cstate vfat dcdbas wmi_bmof intel_uncore dell_wmi_descriptor pcspkr fat i2c_algo_bit lpc_ich mei_me i2c_i801 i2c_smbus mei intel_pch_thermal ipmi_si acpi_power_meter acpi_ipmi ipmi_devintf ipmi_msghandler sg fuse loop xfs sd_mod i40e ghash_clmulni_intel libie libie_adminq ahci libahci tg3 libata wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod nfnetlink
> >> [ 1805.525160] CPU: 2 UID: 0 PID: 19 Comm: rcub/0 Kdump: loaded Not tainted 6.17.1-rt5 #1 PREEMPT_RT 
> >> [ 1805.534113] Hardware name: Dell Inc. PowerEdge R440/0WKGTH, BIOS 2.21.1 03/07/2024
> >> [ 1805.541678] RIP: 0010:update_entity_lag+0x5b/0x70
> >> [ 1805.546385] Code: 42 f8 48 81 3b 00 00 10 00 75 23 48 89 fa 48 f7 da 48 39 ea 48 0f 4c d5 48 39 fd 48 0f 4d d7 48 89 53 78 5b 5d c3 cc cc cc cc <0f> 0b eb b1 48 89 de e8 b9 8c ff ff 48 89 c7 eb d0 0f 1f 40 00 90
> >> [ 1805.565130] RSP: 0000:ffffcc9e802f7b90 EFLAGS: 00010046
> >> [ 1805.570358] RAX: 0000000000000000 RBX: ffff8959080c0080 RCX: 0000000000000000
> >> [ 1805.577488] RDX: 0000000000000000 RSI: ffff8959080c0080 RDI: ffff895592cc1c00
> >> [ 1805.584622] RBP: ffff895592cc1c00 R08: 0000000000008800 R09: 0000000000000000
> >> [ 1805.591756] R10: 0000000000000001 R11: 0000000000200b20 R12: 000000000000000e
> >> [ 1805.598886] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> >> [ 1805.606020] FS:  0000000000000000(0000) GS:ffff895947da2000(0000) knlGS:0000000000000000
> >> [ 1805.614107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [ 1805.619853] CR2: 00007f655816ed40 CR3: 00000004ab854006 CR4: 00000000007726f0
> >> [ 1805.626985] PKRU: 55555554
> >> [ 1805.629696] Call Trace:
> >> [ 1805.632150]  <TASK>
> >> [ 1805.634258]  dequeue_entity+0x90/0x4f0
> >> [ 1805.638012]  dequeue_entities+0xc9/0x6b0
> >> [ 1805.641935]  dequeue_task_fair+0x8a/0x190
> >> [ 1805.645949]  ? sched_clock+0x10/0x30
> >> [ 1805.649527]  rt_mutex_setprio+0x318/0x4b0
> > 
> > So we have:
> > 
> > rt_mutex_setprio()
> > 
> >   rq = __task_rq_lock(p, ..); // this asserts p->pi_lock is held
> > 
> >   ...
> > 
> >   queued = task_on_rq_queued(p); // basically reads p->on_rq
> >   running = task_current_donor(rq, p);
> >   if (queued)
> >     dequeue_task(rq, p, queue_flags);
> >       dequeue_task_fair()
> >         dequeue_entities()
> >           dequeue_entity()
> >             update_entity_lag()
> >               WARN_ON_ONCE(!se->on_rq);
> > 
> > So the only way to get here is if: p->on_rq is in fact !0 *and*
> > se->on_rq is zero.
> > 
> > And I'm not at all sure how one would get into such a state.
> 
> This looks like something that can happen when a delayed task is
> dequeued from a throttled hierarchy. Matt had reported a similar
> problem with wait_task_inactive() in
> https://lore.kernel.org/all/20250925133310.1843863-1-matt@readmodwrite.com/
> 
> rt_mutex_setprio()
>   ...
>   if (prev_class != next_class && p->se.sched_delayed)
>     dequeue_task(rq, p, DEQUEUE_DELAYED)
>       dequeue_entities(se = &p->se)
>         dequeue_entity(se)
>           se->on_rq = 0; /* se->on_rq turns 0 here */
>         ...
>         if (cfs_rq_throttled(cfs_rq))
>           return 0; /* Early return before __block_task() */
>   ...
> 
>   /* __block_task() not called; task_on_rq_queued() returns true. */
>   queued = task_on_rq_queued(p);
>   ...
> 
>   if (queued)
>     dequeue_task(rq, p, queue_flags)
>       dequeue_entities(se = &p->se)
>         dequeue_entity(se)
>           update_entity_lag(se)
>             WARN_ON_ONCE(!se->on_rq)
> 
> 
> v6.18 kernels will get rid of this issue as part of the per-task throttle
> feature, and stable should pick up the fix from that thread soon.
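
The sequence Prateek describes above can be modelled in user space. What
follows is only a minimal sketch under stated assumptions: the structs, the
'throttled' and 'block' parameters, and the helpers setprio_model() and
make_delayed_task() are hypothetical stand-ins, not the kernel's code. It
assumes a class change, and it models __block_task() as simply clearing
p->on_rq.

```c
#include <stdbool.h>

/*
 * User-space model only: these structs and helpers are simplified
 * stand-ins for illustration, not the kernel's definitions.
 */
struct sched_entity {
	bool on_rq;		/* models se->on_rq */
	bool sched_delayed;	/* models se->sched_delayed */
};

struct task {
	bool on_rq;		/* models p->on_rq, read by task_on_rq_queued() */
	struct sched_entity se;
};

static int warn_count;		/* hits of the modelled WARN_ON_ONCE() */

static void update_entity_lag(struct sched_entity *se)
{
	if (!se->on_rq)		/* models WARN_ON_ONCE(!se->on_rq) */
		warn_count++;
}

static void dequeue_entity(struct sched_entity *se)
{
	update_entity_lag(se);
	se->on_rq = false;	/* se->on_rq turns 0 here */
	se->sched_delayed = false;
}

/* 'block' models the path that would reach __block_task(). */
static void dequeue_entities(struct task *p, bool throttled, bool block)
{
	dequeue_entity(&p->se);
	if (throttled)
		return;		/* early return: p->on_rq stays set */
	if (block)
		p->on_rq = false;	/* models __block_task() */
}

/* Models the two dequeues in rt_mutex_setprio(), assuming a class change. */
static int setprio_model(struct task *p, bool throttled)
{
	warn_count = 0;
	if (p->se.sched_delayed)
		dequeue_entities(p, throttled, true);	/* DEQUEUE_DELAYED */
	if (p->on_rq)					/* queued? */
		dequeue_entities(p, throttled, false);
	return warn_count;
}

/* A task that is still queued but whose dequeue has been delayed. */
static struct task make_delayed_task(void)
{
	struct task p = {
		.on_rq = true,
		.se = { .on_rq = true, .sched_delayed = true },
	};
	return p;
}
```

In this model, a delayed task on a throttled hierarchy makes
setprio_model(&p, true) count one warning, because the second dequeue sees
se->on_rq already cleared. With throttled set to false the first dequeue
clears p->on_rq as well, so the second dequeue is skipped and no warning is
counted.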

Thank you! You were right, your patch in that thread seems to have fixed
the issue I reported.

I read the thread you mentioned, built a test kernel with the patch and have
been running tests for more than 6h now without a single backtrace. As reported
earlier, I was able to hit the bug within 15 minutes without the patch.

Best regards,
Luis

> 
> -- 
> Thanks and Regards,
> Prateek
> 
---end quoted text---


