From: Anna-Maria Behnsen <anna-maria@linutronix.de>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
John Stultz <jstultz@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
Eric Dumazet <edumazet@google.com>,
"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
Arjan van de Ven <arjan@infradead.org>,
"Paul E . McKenney" <paulmck@kernel.org>,
Rik van Riel <riel@surriel.com>,
Steven Rostedt <rostedt@goodmis.org>,
Sebastian Siewior <bigeasy@linutronix.de>,
Giovanni Gherdovich <ggherdovich@suse.cz>,
Lukasz Luba <lukasz.luba@arm.com>,
"Gautham R . Shenoy" <gautham.shenoy@amd.com>,
Srinivas Pandruvada <srinivas.pandruvada@intel.com>,
K Prateek Nayak <kprateek.nayak@amd.com>
Subject: Re: [PATCH v8 10/25] timers: Move marking timer bases idle into tick_nohz_stop_tick()
Date: Thu, 19 Oct 2023 15:37:10 +0200
Message-ID: <87ttqm91ix.fsf@somnus>
In-Reply-To: <ZSgWUTsV37rEeh3t@localhost.localdomain>
Frederic Weisbecker <frederic@kernel.org> writes:
> On Wed, Oct 04, 2023 at 02:34:39PM +0200, Anna-Maria Behnsen wrote:
>>  static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
>>  {
>>  	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
>> +	unsigned long basejiff = ts->last_jiffies;
>>  	u64 basemono = ts->timer_expires_base;
>> -	u64 expires = ts->timer_expires;
>> +	bool timer_idle = ts->tick_stopped;
>> +	u64 expires;
>>
>>  	/* Make sure we won't be trying to stop it twice in a row. */
>>  	ts->timer_expires_base = 0;
>>
>> +	/*
>> +	 * Now the tick should be stopped definitely - so timer base needs to be
>> +	 * marked idle as well to not miss a newly queued timer.
>> +	 */
>> +	expires = timer_set_idle(basejiff, basemono, &timer_idle);
>> +	if (!timer_idle) {
>> +		/*
>> +		 * Do not clear tick_stopped here when it was already set - it will
>> +		 * be retained on next idle iteration when tick expired earlier
>> +		 * than expected.
>> +		 */
>> +		expires = basemono + TICK_NSEC;
>> +
>> +		/* Undo the effect of timer_set_idle() */
>> +		timer_clear_idle();
>
> Looks like you don't even need to clear ->is_idle on failure. timer_set_idle()
> does it for you.

You are right. I tried several approaches and then forgot to remove it
here.

>> +	} else if (expires < ts->timer_expires) {
>> +		ts->timer_expires = expires;
>> +	} else {
>> +		expires = ts->timer_expires;
>
> Is it because timer_set_idle() doesn't recalculate the next hrtimer (as opposed
> to get_next_timer_interrupt())? And since tick_nohz_next_event() did, the fact
> that ts->timer_expires has a lower value may mean there is an hrtimer to take
> into account and so you rather use the old calculation?

Yes, and because the power management code relies on it.

> If so please add a comment explaining that because it's not that obvious. It's
> worth noting also the side effect that the nearest timer may have been cancelled
> in-between and we might reprogram too-early but the event should be rare enough
> that we don't care.
>
> Another reason also is that cpuidle may have programmed a shallow C-state
> because it saw an early next expiration estimation. And if the related timer is
> cancelled in-between and we didn't keep the old expiration estimation, we would
> otherwise stop the tick for a long time with a shallow C-state.

I'll add a comment covering all your input! Thanks!

It is not very likely that many timers are enqueued and dequeued between
get_next_timer_interrupt() and marking the timer base idle. But we have
to make sure that we do not miss a new first timer there.

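The three-way choice of the expiry discussed above can be modelled in
user space. The following is a minimal sketch, not the kernel code:
tick_sched_model, pick_expiry() and TICK_NSEC_MODEL are hypothetical
names, the 250 Hz tick length is an assumption, and the result of
timer_set_idle() is passed in as plain arguments instead of being
computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick length (250 Hz), for illustration only. */
#define TICK_NSEC_MODEL 4000000ULL

/* Hypothetical stand-in for the one struct tick_sched field used here. */
struct tick_sched_model {
	uint64_t timer_expires;	/* estimate from the earlier next-event pass */
};

/*
 * Model of the quoted branch. 'idle_expires' and 'timer_idle' stand in
 * for what timer_set_idle() would report.
 */
static uint64_t pick_expiry(struct tick_sched_model *ts, uint64_t basemono,
			    uint64_t idle_expires, bool timer_idle)
{
	uint64_t old = ts->timer_expires;
	uint64_t expires;

	/* Base could not be marked idle: keep ticking, one tick from base. */
	if (!timer_idle)
		return basemono + TICK_NSEC_MODEL;

	if (idle_expires < old) {
		/* A new, earlier first timer showed up: do not miss it. */
		ts->timer_expires = idle_expires;
		expires = idle_expires;
	} else {
		/*
		 * Keep the earlier estimate: it may account for an hrtimer,
		 * and a C-state may already have been chosen based on it.
		 */
		expires = old;
	}

	/* In the idle case the expiry is never pushed later than before. */
	assert(expires <= old);
	return expires;
}
```

Keeping the older, smaller estimate in the last branch can reprogram the
event earlier than strictly necessary if that timer was cancelled in
between, which is the rare case the review considers acceptable.
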
>> @@ -926,7 +944,7 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
>>  	 * first call we save the current tick time, so we can restart
>>  	 * the scheduler tick in nohz_restart_sched_tick.
>>  	 */
>> -	if (!ts->tick_stopped) {
>> +	if (!ts->tick_stopped && timer_idle) {
>
> In fact, if (!ts->tick_stopped && !timer_idle) then you
> should return now and avoid the reprogramming.

You are right. I'll add it and test it.

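The control flow with the suggested early return could look roughly like
the sketch below. It is a user-space model under assumptions:
stop_tick_action() and the stop_action values are hypothetical names,
and the real function does more work (saving the tick time, programming
the clock event device) than the single flag update shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of one pass through the stop-tick path. */
enum stop_action {
	SKIP_REPROGRAM,		/* tick keeps running, nothing to program */
	STOP_AND_PROGRAM,	/* first stop: bookkeeping, then program */
	REPROGRAM_ONLY,		/* tick was already stopped: just program */
};

static enum stop_action stop_tick_action(bool *tick_stopped, bool timer_idle)
{
	/*
	 * Tick not stopped yet and the base could not be marked idle:
	 * return early and avoid the reprogramming.
	 */
	if (!*tick_stopped && !timer_idle)
		return SKIP_REPROGRAM;

	if (!*tick_stopped) {
		/* Only reachable with an idle base after the check above. */
		assert(timer_idle);
		*tick_stopped = true;
		return STOP_AND_PROGRAM;
	}

	/*
	 * Tick already stopped. Even if the base is not idle this time,
	 * tick_stopped is left set, as the comment in the patch describes.
	 */
	return REPROGRAM_ONLY;
}
```

With the early return in place, the "!ts->tick_stopped && timer_idle"
condition from the hunk reduces to the plain "!ts->tick_stopped" check
in this model, because the non-idle case has already left the function.
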
>> @@ -1950,6 +1950,40 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
>>  	if (cpu_is_offline(smp_processor_id()))
>>  		return expires;
>>
>> +	raw_spin_lock(&base->lock);
>> +	nextevt = __get_next_timer_interrupt(basej, base);
>> +	raw_spin_unlock(&base->lock);
>
> It's unfortunate we have to lock here, which means we lock twice
> on the idle path. But I can't think of a better way and I guess
> the follow-up patches rely on that.

We have to do it like this because the power management people need the
sleep length information to be able to decide whether to stop the tick
or not. If we do not want to lock the timer base twice in the idle path,
we cannot move the timer base idle marking into tick_nohz_stop_tick().
The good thing about this approach is that timer bases are not marked
idle when the tick is not stopped.

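Why the second locked pass matters can be illustrated with a small
user-space model. This is a sketch under assumptions, not the kernel
implementation: timer_base_model and its helpers are hypothetical, a C11
atomic_flag stands in for the raw spinlock, and a base is reduced to a
single next-expiry value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced model of a timer base. */
struct timer_base_model {
	atomic_flag lock;	/* stands in for the raw spinlock */
	bool held;		/* debug aid: lock currently held */
	unsigned int lock_count;/* how often the lock was taken */
	uint64_t next_expiry;
	bool is_idle;
};

static void base_init(struct timer_base_model *b, uint64_t next_expiry)
{
	atomic_flag_clear(&b->lock);
	b->held = false;
	b->lock_count = 0;
	b->next_expiry = next_expiry;
	b->is_idle = false;
}

static void base_lock(struct timer_base_model *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;	/* spin until the flag was clear */
	b->held = true;
	b->lock_count++;
}

static void base_unlock(struct timer_base_model *b)
{
	assert(b->held);
	b->held = false;
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Pass 1: report the next event only, e.g. for the sleep length estimate. */
static uint64_t peek_next_event(struct timer_base_model *b)
{
	uint64_t nextevt;

	base_lock(b);
	nextevt = b->next_expiry;
	base_unlock(b);
	return nextevt;
}

/* Pass 2: re-read the next event and mark the base idle under one lock. */
static uint64_t set_idle(struct timer_base_model *b)
{
	uint64_t nextevt;

	base_lock(b);
	nextevt = b->next_expiry;
	b->is_idle = true;
	base_unlock(b);
	return nextevt;
}

/* Returns true when a new first timer lands on a base already marked idle. */
static bool enqueue_timer(struct timer_base_model *b, uint64_t expires)
{
	bool need_kick = false;

	base_lock(b);
	if (expires < b->next_expiry) {
		b->next_expiry = expires;
		need_kick = b->is_idle;
	}
	base_unlock(b);
	return need_kick;
}
```

A timer enqueued between the two passes is picked up by set_idle()
because the expiry is read again under the lock. If the tick is never
stopped, set_idle() is never called and the base stays non-idle, which
is the benefit described above.
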
By the way, I am trying to rewrite this patch completely, as tglx was
not happy about some of the code duplication. I'll make sure that your
remarks are covered as well.

Thanks,
Anna-Maria