From: Frederic Weisbecker <frederic@kernel.org>
To: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>,
linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
John Stultz <jstultz@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
Eric Dumazet <edumazet@google.com>,
"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
Arjan van de Ven <arjan@infradead.org>,
"Paul E . McKenney" <paulmck@kernel.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Rik van Riel <riel@surriel.com>,
Steven Rostedt <rostedt@goodmis.org>,
Giovanni Gherdovich <ggherdovich@suse.cz>,
Lukasz Luba <lukasz.luba@arm.com>,
"Gautham R . Shenoy" <gautham.shenoy@amd.com>
Subject: Re: [PATCH v6 19/21] timer: Implement the hierarchical pull model
Date: Tue, 16 May 2023 11:15:23 +0200
Message-ID: <ZGNJq_ZITmZ5YciL@localhost.localdomain>
In-Reply-To: <20230515101936.3amAvw0T@linutronix.de>
On Mon, May 15, 2023 at 12:19:36PM +0200, Sebastian Siewior wrote:
> On 2023-05-10 12:32:53 [+0200], Frederic Weisbecker wrote:
> > In the case of !PREEMPT_RT, I suppose it's impossible for the target
> > CPU to be offline. You checked above tmc->online and in-between the
> > call to timer_lock_remote_bases(), the path is BH-disabled, this prevents
> > stop_machine from running and from setting the CPU as offline.
>
> I think you refer to the last one invoked from takedown_cpu(). This does
> not matter, see below.
>
> What bothers me is that the _current_ CPU is checked for cpu_is_offline() and
> not the variable 'cpu'. Before the check timer_expire_remote() is
> invoked on 'cpu' and not on current.
Oh right!
>
> > However in PREEMPT_RT, ksoftirqd (or timersd) is preemptible, so it seems
> > that it could happen in theory. And that could create a locking imbalance.
>
> The ksoftirqd thread is part of smpboot_park_threads(). They have to
> stop running and clean up before the machinery continues bringing down
> the CPU (that is before takedown_cpu()). On the way down we have:
> - tmigr_cpu_offline() followed by
> - smpboot_park_threads().
>
> So ksoftirqd (preempted or not) finishes before. This is for the
> _target_ CPU.
OK, I forgot about the smpboot cleanup part.
>
> After the "tmc->online" check the lock is dropped and this is invoked
> from run_timer_softirq(). That means that _this_ CPU could get preempted
> (by an IRQ for instance) at this point, and once the CPU gets back here,
> the remote CPU (as specified in `cpu') can already be offline by the
> time timer_lock_remote_bases() is invoked.
>
> So RT or not, this is racy.
Well, all CPUs must schedule into the stop machine before take_cpu_down() can run.
So:
//CPU 1                                 //CPU 2
softirq()
    tmigr_handle_remote_cpu()
        LOCK(tmc->lock)
        if (!tmc->online)
            UNLOCK(tmc->lock)
            return;
                                        cpu_down()
                                            tmigr_cpu_offline()
                                                LOCK(tmc->lock)
                                                tmc->online = 0
                                                UNLOCK(tmc->lock)
                                            stop_machine()
                                                //wait for CPU 1
                                                poll on MULTI_STOP_PREPARE
        if (cpu_is_offline(2))
            //not possible
//end of softirq
stop_machine()
    set MULTI_STOP_PREPARE
                                                ...
                                                set_cpu_online(0)
Things should be fine on !RT, but I may easily be missing something.
As for RT, it should be fine as well, as you pointed out: CPU 1 can be
preempted, but the CPU still needs to park the kthreads before joining
the stop machine party.
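
To make the ordering argument a bit more concrete, here is a minimal
single-threaded userspace sketch of it. It is only a model under stated
assumptions, not kernel code: struct model_stop, model_cpu_ack() and
model_try_offline() are hypothetical names, "acking" stands in for a CPU
reaching its stopper, and the rule that a CPU cannot ack while inside the
softirq is the !RT assumption discussed above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2

/* Hypothetical model state; none of this is a kernel structure. */
struct model_stop {
	bool in_softirq[MODEL_NR_CPUS]; /* CPU is inside a BH-disabled section */
	bool acked[MODEL_NR_CPUS];      /* CPU has reached its stopper */
	bool target_offline;            /* models the CPU being marked offline */
};

/*
 * A CPU can only join the stop machine once it has left the softirq
 * (assumption: BH-disabled code is not preempted by the stopper).
 * Returns true if the CPU acked.
 */
static bool model_cpu_ack(struct model_stop *s, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (s->in_softirq[cpu])
		return false;
	s->acked[cpu] = true;
	return true;
}

/*
 * The offline transition happens only after every CPU has acked.
 * Returns true if the target was marked offline.
 */
static bool model_try_offline(struct model_stop *s)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!s->acked[cpu])
			return false;
	}
	s->target_offline = true;
	return true;
}
```

In this model, as long as one CPU is still marked in_softirq,
model_try_offline() keeps failing and target_offline stays false, which is
the "not possible" branch in the diagram.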
>
> > My suggestion would be to unconditionally lock the bases, you already checked if
> > !tmc->online before. The remote CPU may have gone down since then because the
> > tmc lock has been relaxed but it should be rare enough that you don't care
> > about optimizing with a lockless check. So you can just lock the bases,
> > lock the tmc and check again if tmc->online. If not then you can just ignore
> > the tmigr_new_timer_up call and propagation.
>
> Regardless of the previous point, this still looks odd as you pointed out.
> The return code is ignored and the two functions perform lock + unlock
> depending on it.
Agreed!
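
For illustration, the "lock unconditionally, then recheck" shape suggested
above could look roughly like the following userspace sketch. Everything
here is a hypothetical stand-in: struct model_tmc, model_acquire(),
model_release() and model_handle_remote() are not the patch's functions,
the locks are modelled as depth counters so that an imbalance would trip an
assert, and the "propagated" counter merely marks where the
tmigr_new_timer_up() call and propagation would go.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock model: a depth counter that must stay 0 or 1. */
struct model_lock {
	int depth;
};

static void model_acquire(struct model_lock *l)
{
	assert(l->depth == 0);
	l->depth++;
}

static void model_release(struct model_lock *l)
{
	assert(l->depth == 1);
	l->depth--;
}

/* Hypothetical stand-in for the per-CPU migrator state. */
struct model_tmc {
	struct model_lock bases_lock; /* models the remote timer bases' locks */
	struct model_lock lock;       /* models tmc->lock */
	bool online;                  /* models tmc->online */
	int propagated;               /* counts modelled propagation steps */
};

/*
 * Take both locks unconditionally, recheck online under tmc->lock, and
 * skip the propagation if the CPU went offline in the meantime. Lock and
 * unlock are always paired, whatever the outcome of the check.
 * Returns true if the propagation step ran.
 */
static bool model_handle_remote(struct model_tmc *tmc)
{
	bool ran = false;

	model_acquire(&tmc->bases_lock);
	model_acquire(&tmc->lock);
	if (tmc->online) {
		tmc->propagated++;
		ran = true;
	}
	model_release(&tmc->lock);
	model_release(&tmc->bases_lock);
	return ran;
}
```

The point of the shape is that no caller has to look at a return code to
know whether a lock is held, so the lock + unlock pairing cannot depend on
an ignored result.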