From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Steve Wahl <steve.wahl@hpe.com>
Cc: Russ Anderson <rja@hpe.com>, Dimitri Sivanich <sivanich@hpe.com>,
Kyle Meyer <kyle.meyer@hpe.com>,
Anna-Maria Behnsen <anna-maria@linutronix.de>,
Frederic Weisbecker <frederic@kernel.org>,
Ingo Molnar <mingo@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] tick/sched: Limit non-timekeeper CPUs calling jiffies update
Date: Tue, 28 Oct 2025 11:39:30 +0530
Message-ID: <bfa0a61e-7cb8-4bdd-b913-1bf241f316c7@linux.ibm.com>
In-Reply-To: <20251027183456.343407-1-steve.wahl@hpe.com>
On 10/28/25 12:04 AM, Steve Wahl wrote:
> On large NUMA systems, while running a test program that saturates the
> inter-processor and inter-NUMA links, acquiring the jiffies_lock can
> be very expensive. If the cpu designated to do jiffies updates
> (tick_do_timer_cpu) gets delayed and other cpus decide to do the
> jiffies update themselves, a large number of them decide to do so at
> the same time. The inexpensive check against tick_next_period is far
> quicker than actually acquiring the lock, so most of these get in line
> to obtain the lock. If obtaining the lock is slow enough, this
> spirals into the vast majority of CPUs continuously being stuck
> waiting for this lock, just to obtain it and find out that time has
> already been updated by another cpu. For example, on one random entry
> to kdb by manually-injected NMI, I saw 2912 of 3840 cpus stuck here.
>
> To avoid this, allow only one non-timekeeper CPU to call
> tick_do_update_jiffies64() at any given time, resetting
> ts->stalled_jiffies only if the jiffies update function is actually
> called.
>
> With this change, manually interrupting the test I find at most two
> CPUs in the tick_do_update_jiffies64 function (the timekeeper and one
> other).
>
> Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
> ---
>
> v2: Rewritten to use an atomic to gate non-timekeeping cpus calling the
> jiffies update, as suggested by tglx. Title of patch has changed
> since trylock is no longer used.
>
> v1 discussion: https://lore.kernel.org/all/20251013150959.298288-1-steve.wahl@hpe.com/
>
> kernel/time/tick-sched.c | 30 ++++++++++++++++++++++++++----
> 1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c527b421c865..3ff3eb1f90d0 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -201,6 +201,27 @@ static inline void tick_sched_flag_clear(struct tick_sched *ts,
>  	ts->flags &= ~flag;
>  }
>
> +/*
> + * Allow only one non-timekeeper CPU at a time to update jiffies from
> + * the timer tick.
> + *
> + * Returns true if update was run.
> + */
> +static bool tick_limited_update_jiffies64(struct tick_sched *ts, ktime_t now)
> +{
> +	static atomic_t in_progress;
> +	int inp;
> +
> +	inp = atomic_read(&in_progress);
> +	if (inp || !atomic_try_cmpxchg(&in_progress, &inp, 1))
> +		return false;
> +
You only get here when ts->last_tick_jiffies == jiffies, so it may not be
necessary to check again.
> +	if (ts->last_tick_jiffies == jiffies)
> +		tick_do_update_jiffies64(now);
> +	atomic_set(&in_progress, 0);
> +	return true;
> +}
> +
> #define MAX_STALLED_JIFFIES 5
>
> static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
> @@ -239,10 +260,11 @@ static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
>  		ts->stalled_jiffies = 0;
>  		ts->last_tick_jiffies = READ_ONCE(jiffies);
>  	} else {
> -		if (++ts->stalled_jiffies == MAX_STALLED_JIFFIES) {
> -			tick_do_update_jiffies64(now);
> -			ts->stalled_jiffies = 0;
> -			ts->last_tick_jiffies = READ_ONCE(jiffies);
> +		if (++ts->stalled_jiffies >= MAX_STALLED_JIFFIES) {
> +			if (tick_limited_update_jiffies64(ts, now)) {
> +				ts->stalled_jiffies = 0;
> +				ts->last_tick_jiffies = READ_ONCE(jiffies);
> +			}
>  		}
>  	}
>
Yes. This could help large systems.
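
For anyone following along outside the kernel tree, the gating idea can be
sketched in userspace with C11 atomics. This is only an illustration under
stated assumptions, not the patch itself: do_update() and update_calls are
hypothetical stand-ins for tick_do_update_jiffies64(), and
atomic_compare_exchange_strong() plays the role of atomic_try_cmpxchg().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* 0 = gate free, 1 = some caller is currently inside the update path. */
static atomic_int in_progress;

/* Hypothetical stand-in for tick_do_update_jiffies64(); only counts calls. */
static int update_calls;

static void do_update(void)
{
	update_calls++;
}

/*
 * Let at most one caller at a time into do_update().
 * Returns true if this caller won the gate and the update path ran.
 */
static bool limited_update(void)
{
	int expected = atomic_load(&in_progress);

	/* Cheap read first; try the cmpxchg only if the gate looks free. */
	if (expected ||
	    !atomic_compare_exchange_strong(&in_progress, &expected, 1))
		return false;

	do_update();
	atomic_store(&in_progress, 0);
	return true;
}
```

A caller that loses the race returns false right away instead of queueing,
which is what keeps the waiters from piling up; in the patch that caller
simply leaves its stalled counter alone and retries on a later tick.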
Acked-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Thread overview: 5+ messages
2025-10-27 18:34 [PATCH v2] tick/sched: Limit non-timekeeper CPUs calling jiffies update Steve Wahl
2025-10-28 6:09 ` Shrikanth Hegde [this message]
2025-10-28 14:24 ` Steve Wahl
2025-10-28 15:22 ` Shrikanth Hegde
2025-11-01 19:29 ` [tip: timers/core] " tip-bot2 for Steve Wahl