public inbox for linux-kernel@vger.kernel.org
From: Frederic Weisbecker <frederic@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>, Ingo Molnar <mingo@redhat.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Phil Auld <pauld@redhat.com>,
	Chris von Recklinghausen <crecklin@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: sched/isolation: tick_take_do_timer_from_boot() calls smp_call_function_single() with irqs disabled
Date: Fri, 24 May 2024 17:20:10 +0200
Message-ID: <ZlCwKk65-eL0FrKX@pavilion.home>
In-Reply-To: <87h6eneeu7.ffs@tglx>

On Fri, May 24, 2024 at 11:31:12AM +0200, Thomas Gleixner wrote:
> Oleg!
> 
> On Thu, May 23 2024 at 15:23, Oleg Nesterov wrote:
> > On 05/22, Oleg Nesterov wrote:
> >>
> >> After the recent commit 5097cbcb38e6 ("sched/isolation: Prevent boot crash
> >> when the boot CPU is nohz_full") the kernel no longer crashes, but there is
> >> another problem.
> >>
> >> In this case tick_setup_device() does tick_take_do_timer_from_boot() to
> >> update tick_do_timer_cpu and this triggers WARN_ON_ONCE(irqs_disabled())
> >> in smp_call_function_single().
> >>
> >> I don't understand this code even remotely, I failed to find the fix.
> >>
> >> Perhaps we can use smp_call_function_single_async() as a workaround ?
> >>
> >> But I don't even understand why exactly we need smp_call_function()...
> 
> It's not required at all.
> 
> >> Race with tick_nohz_stop_tick() on boot CPU which can set
> >> tick_do_timer_cpu = TICK_DO_TIMER_NONE? Is it really bad?
> 
> This can't happen.

Actually... The boot CPU is nohz_full and nothing prevents it
from stopping its tick once IRQs are enabled and before calling
tick_nohz_idle_enter(). When that happens, tick_nohz_full_update_tick()
doesn't go through can_stop_idle_tick() and therefore doesn't check if it
is the timekeeper. And then it goes through tick_nohz_stop_tick() which
can set tick_do_timer_cpu = TICK_DO_TIMER_NONE.
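
To make the two paths concrete, here is a minimal user-space model of the
sequence described above. It is only a sketch: the variables reuse the kernel
names for readability, but model_idle_path() and model_nohz_full_path() are
hypothetical helpers, and the real functions do far more than this.

```c
#include <stdbool.h>

#define TICK_DO_TIMER_NONE (-1)

/* Stand-ins for kernel state; this is a model, not kernel code. */
static int tick_do_timer_cpu = 0;          /* boot CPU 0 holds the duty */
static bool tick_nohz_full_running = true; /* nohz_full= is in effect */

/* Idle path check: the timekeeper must not stop its tick. */
static bool can_stop_idle_tick(int cpu)
{
	if (tick_nohz_full_running && tick_do_timer_cpu == cpu)
		return false;
	return true;
}

/* Models only the duty hand-off done when the tick is stopped. */
static void tick_nohz_stop_tick(int cpu)
{
	if (tick_do_timer_cpu == cpu)
		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
}

/* Hypothetical: idle entry, which goes through the timekeeper check. */
static void model_idle_path(int cpu)
{
	if (can_stop_idle_tick(cpu))
		tick_nohz_stop_tick(cpu);
}

/* Hypothetical: nohz_full update outside idle, which skips the check. */
static void model_nohz_full_path(int cpu)
{
	tick_nohz_stop_tick(cpu);
}
```

In this model, model_idle_path(0) leaves tick_do_timer_cpu at 0, while
model_nohz_full_path(0) resets it to TICK_DO_TIMER_NONE.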

> 
> > And is it supposed to happen if tick_nohz_full_running ?
> >
> > tick_sched_do_timer() and can_stop_idle_tick() claim that
> > TICK_DO_TIMER_NONE is not possible in this case...
> 
> What happens during boot is:
> 
>   1) The boot CPU takes the do_timer duty when it installs its
>      clockevent device
> 
>   2) The boot CPU does not give up the duty because of this
>      condition in can_stop_idle_tick():
> 
>      if (tick_nohz_full_enabled()) {
>              if (tick_cpu == cpu)
>                      return false;
>              ...
> 
> So there is no race because the boot CPU _cannot_ reach
> tick_nohz_stop_tick() as long as no secondary has taken over.
> 
> It's far from obvious. What a horrible maze..

I know, I wish I had the time to Nack that nohz_full boot CPU
patch back then. But now we have to maintain it, even though it's
broken and uglifies the situation.

Anyway, we probably need to prevent a CPU from stopping its tick
as long as it is the timekeeper and some CPU (possibly the same one)
is nohz_full.

That needs to be a separate change (I'll try to fix that after
the weekend with a new brain) and then Oleg's patch can go on
top of it.
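
Roughly, the condition could look like the user-space sketch below. This is
an illustration of the idea only, not the eventual patch: the variables reuse
kernel names, while model_must_keep_tick() and model_try_stop_tick() are
hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>

#define TICK_DO_TIMER_NONE (-1)

/* Stand-ins for kernel state; this is a model, not kernel code. */
static int tick_do_timer_cpu = 0;          /* CPU 0 is the timekeeper */
static bool tick_nohz_full_running = true; /* some CPU is nohz_full */

/* Hypothetical: true when @cpu has to keep ticking for timekeeping. */
static bool model_must_keep_tick(int cpu)
{
	return tick_nohz_full_running && tick_do_timer_cpu == cpu;
}

/*
 * Hypothetical: a single guard in front of every attempt to stop the
 * tick, whether it comes from idle or from the nohz_full update.
 * Returns true if the tick was stopped.
 */
static bool model_try_stop_tick(int cpu)
{
	if (model_must_keep_tick(cpu))
		return false;

	/* Without nohz_full CPUs the duty may be dropped as before. */
	if (tick_do_timer_cpu == cpu)
		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
	return true;
}
```

With the guard placed in the common path, the timekeeper keeps its duty no
matter which caller asks to stop the tick, and other CPUs are unaffected.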

Thanks.

> 
> > So, once again, could you explain why the patch below is wrong?
> 
> > -			tick_take_do_timer_from_boot();
> >  			tick_do_timer_boot_cpu = -1;
> > -			WARN_ON(READ_ONCE(tick_do_timer_cpu) != cpu);
> > +			WRITE_ONCE(tick_do_timer_cpu, cpu);
> 
> This part is perfectly fine.
> 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 71a792cd8936..3b1d011d45e1 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -1014,6 +1014,9 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
> >  	 */
> >  	tick_cpu = READ_ONCE(tick_do_timer_cpu);
> >  	if (tick_cpu == cpu) {
> > +#ifdef CONFIG_NO_HZ_FULL
> > +		WARN_ON_ONCE(tick_nohz_full_running);
> > +#endif
> 
>                 WARN_ON_ONCE(tick_nohz_full_enabled());
> 
> which spares the ugly #ifdef?
> 
> >  		WRITE_ONCE(tick_do_timer_cpu, TICK_DO_TIMER_NONE);
> >  		tick_sched_flag_set(ts, TS_FLAG_DO_TIMER_LAST);
> >  	} else if (tick_cpu != TICK_DO_TIMER_NONE) {
> 
> Thanks,
> 
>         tglx
