From: Peter Zijlstra <peterz@infradead.org>
To: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com>,
LKML <linux-kernel@vger.kernel.org>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
jstultz@google.com, stultz@google.com
Subject: Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
Date: Thu, 9 Oct 2025 13:49:58 +0200
Message-ID: <20251009114958.GC4067720@noisy.programming.kicks-ass.net>
In-Reply-To: <ccac0b98-fd05-403f-8cd2-6143f6e8cbdd@linux.ibm.com>
On Thu, Oct 09, 2025 at 03:17:40PM +0530, Shrikanth Hegde wrote:
>
>
> On 10/9/25 1:30 PM, Peter Zijlstra wrote:
> > On Wed, Oct 08, 2025 at 11:39:11PM +0530, Shrikanth Hegde wrote:
> >
> > > *It pointed to this*
> > >
> > > NIP [c0000000001fd798] dl_server_start+0x50/0xd8
> > > LR [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
> > > Call Trace:
> > > [c000006684a579c0] [0000000000000001] 0x1 (unreliable)
> > > [c000006684a579f0] [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
> > > [c000006684a57a60] [c0000000001bb344] enqueue_task+0x5c/0x1c8
> > > [c000006684a57aa0] [c0000000001c5fc0] ttwu_do_activate+0x98/0x2fc
> > > [c000006684a57af0] [c0000000001c671c] try_to_wake_up+0x2e0/0xa60
> > > [c000006684a57b80] [c00000000019fb48] kthread_park+0x7c/0xf0
> > > [c000006684a57bb0] [c00000000015fefc] takedown_cpu+0x60/0x194
> > > [c000006684a57c00] [c000000000161924] cpuhp_invoke_callback+0x1f4/0x9a4
> > > [c000006684a57c90] [c0000000001621a4] __cpuhp_invoke_callback_range+0xd0/0x188
> > > [c000006684a57d30] [c000000000165aec] _cpu_down+0x19c/0x560
> > > [c000006684a57df0] [c0000000001637c0] __cpu_down_maps_locked+0x2c/0x3c
> > > [c000006684a57e10] [c00000000018a100] work_for_cpu_fn+0x38/0x54
> > > [c000006684a57e40] [c00000000019075c] process_one_work+0x1d8/0x554
> > > [c000006684a57ef0] [c00000000019165c] worker_thread+0x308/0x46c
> > > [c000006684a57f90] [c00000000019e474] kthread+0x16c/0x19c
> > > [c000006684a57fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18
> > >
> > > This is takedown_cpu() called from CPU0 (the boot CPU), and it wakes
> > > up a kthread which is CPU bound, I guess. Since this happens after the
> > > rq was marked offline, it ends up starting the deadline server again.
> > >
> > > So I think it is a sensible idea to stop the deadline server if the
> > > CPU is going down. Once we stop the server we will return
> > > HRTIMER_NORESTART.
> >
> > D'0h.. that stop was far too early.
> >
> > How about moving that dl_server_stop() into sched_cpu_dying() like so.
> >
> > This seems to survive a few hotplugs for me.
> >
> > ---
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 198d2dd45f59..f1ebf67b48e2 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -8571,10 +8571,12 @@ int sched_cpu_dying(unsigned int cpu)
> >  	sched_tick_stop(cpu);
> >  
> >  	rq_lock_irqsave(rq, &rf);
> > +	update_rq_clock(rq);
> >  	if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
> >  		WARN(true, "Dying CPU not properly vacated!");
> >  		dump_rq_tasks(rq, KERN_WARNING);
> >  	}
> > +	dl_server_stop(&rq->fair_server);
> >  	rq_unlock_irqrestore(rq, &rf);
> >  
> >  	calc_load_migrate(rq);
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 615411a0a881..7b7671060bf9 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1582,6 +1582,9 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> >  	if (!dl_server(dl_se) || dl_se->dl_server_active)
> >  		return;
> >  
> > +	if (WARN_ON_ONCE(!cpu_online(cpu_of(rq))))
> > +		return;
> > +
> >  	dl_se->dl_server_active = 1;
> >  	enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
> >  	if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
>
> Yes. This works. No warnings with drmgr or chcpu.
>
> Shall I write a changelog and send it as a patch?
If you would. Thanks!
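
For anyone skimming the thread, the intent of the two hunks above can be
sketched as a toy userspace model. This is only an illustration, not kernel
code: struct toy_rq and every toy_* function are hypothetical stand-ins, and
the hotplug ordering is heavily simplified. The model assumes the server is
stopped once (too early), restarted by a late wakeup such as the
kthread_park() in the trace, stopped again in the dying step, and refused
with a warning on any later start attempt.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these types or helpers exist in the kernel. */
struct toy_rq {
	bool cpu_online;	/* stands in for cpu_online(cpu_of(rq)) */
	bool server_active;	/* stands in for dl_se->dl_server_active */
	int warnings;		/* counts would-be WARN_ON_ONCE() hits */
};

/* Models the guarded dl_server_start(): refuse to start on an offline CPU. */
static bool toy_server_start(struct toy_rq *rq)
{
	if (rq->server_active)
		return false;
	if (!rq->cpu_online) {
		rq->warnings++;
		return false;
	}
	rq->server_active = true;
	return true;
}

/* Models dl_server_stop(). */
static void toy_server_stop(struct toy_rq *rq)
{
	rq->server_active = false;
}

/* Models the stop being issued from the dying step (sched_cpu_dying()). */
static void toy_cpu_dying(struct toy_rq *rq)
{
	toy_server_stop(rq);
	assert(!rq->server_active);
}

/* Walks the simplified sequence and returns the number of warnings. */
static int toy_hotplug_demo(void)
{
	struct toy_rq rq = { .cpu_online = true };

	toy_server_start(&rq);	/* normal operation */
	toy_server_stop(&rq);	/* the stop that was too early */
	toy_server_start(&rq);	/* late wakeup restarts the server */
	rq.cpu_online = false;	/* CPU leaves the online mask */
	toy_cpu_dying(&rq);	/* stop moved to the dying step */
	toy_server_start(&rq);	/* refused; counts one warning */
	return rq.warnings;
}
```

Calling toy_hotplug_demo() from a small main() walks the whole sequence. In
this model a start attempt after the dying step leaves the server inactive
and only bumps the warning counter, which is the behaviour the WARN_ON_ONCE()
guard is meant to give.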