From: Hanabishi <i.r.e.c.c.a.k.u.n+kernel.org@gmail.com>
To: Thomas Gleixner <tglx@kernel.org>, LKML <linux-kernel@vger.kernel.org>
Cc: Calvin Owens <calvin@wbinvd.org>,
Peter Zijlstra <peterz@infradead.org>,
Anna-Maria Behnsen <anna-maria@linutronix.de>,
Frederic Weisbecker <frederic@kernel.org>,
Ingo Molnar <mingo@kernel.org>, John Stultz <jstultz@google.com>,
Stephen Boyd <sboyd@kernel.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, Sebastian Reichel <sre@kernel.org>,
linux-pm@vger.kernel.org, Pablo Neira Ayuso <pablo@netfilter.org>,
Florian Westphal <fw@strlen.de>, Phil Sutter <phil@nwl.cc>,
netfilter-devel@vger.kernel.org, coreteam@netfilter.org
Subject: The "clockevents: Prevent timer interrupt starvation" patch causes lockups
Date: Mon, 13 Apr 2026 21:20:43 +0000
Message-ID: <68d1e9ac-2780-4be3-8ee3-0788062dd3a4@gmail.com>
In-Reply-To: <20260407083247.562657657@kernel.org>

On 07/04/2026 08:54, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@kernel.org>
>
> Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
> up in user space. He provided a reproducer, which sets up a timerfd based
> timer and then rearms it in a loop with an absolute expiry time of 1ns.
>
> As the expiry time is in the past, the timer ends up as the first expiring
> timer in the per CPU hrtimer base and the clockevent device is programmed
> with the minimum delta value. If the machine is fast enough, this ends up
> in an endless loop of programming the delta value to the minimum value
> defined by the clock event device, before the timer interrupt can fire,
> which starves the interrupt and consequently triggers the lockup detector
> because the hrtimer callback of the lockup mechanism is never invoked.
>
> As a first step to prevent this, avoid reprogramming the clock event device
> when:
> - a forced minimum delta event is pending
> - the new expiry delta is less than or equal to the minimum delta
>
> Thanks to Calvin for providing the reproducer and to Borislav for testing
> and providing data from his Zen5 machine.
>
> The problem is not limited to Zen5, but depending on the underlying
> clock event device (e.g. TSC deadline timer on Intel) and the CPU speed
> not necessarily observable.
>
> This change serves only as the last resort and further changes will be made
> to prevent this scenario earlier in the call chain as far as possible.
>
> Fixes: d316c57ff6bf ("[PATCH] clockevents: add core functionality")
> Reported-by: Calvin Owens <calvin@wbinvd.org>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Link: https://lore.kernel.org/lkml/acMe-QZUel-bBYUh@mozart.vkv.me/
> ---
> V2: Simplified the clockevents code - Peter
> ---
>  include/linux/clockchips.h |  2 ++
>  kernel/time/clockevents.c  | 23 +++++++++++++++--------
>  kernel/time/hrtimer.c      |  1 +
>  kernel/time/tick-common.c  |  1 +
>  kernel/time/tick-sched.c   |  1 +
>  5 files changed, 20 insertions(+), 8 deletions(-)
> --- a/include/linux/clockchips.h
> +++ b/include/linux/clockchips.h
> @@ -80,6 +80,7 @@ enum clock_event_state {
>   * @shift: nanoseconds to cycles divisor (power of two)
>   * @state_use_accessors:current state of the device, assigned by the core code
>   * @features: features
> + * @next_event_forced: True if the last programming was a forced event
>   * @retries: number of forced programming retries
>   * @set_state_periodic: switch state to periodic
>   * @set_state_oneshot: switch state to oneshot
> @@ -108,6 +109,7 @@ struct clock_event_device {
>  	u32 shift;
>  	enum clock_event_state state_use_accessors;
>  	unsigned int features;
> +	unsigned int next_event_forced;
>  	unsigned long retries;
>
>  	int (*set_state_periodic)(struct clock_event_device *);
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -172,6 +172,7 @@ void clockevents_shutdown(struct clock_e
>  {
>  	clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
>  	dev->next_event = KTIME_MAX;
> +	dev->next_event_forced = 0;
>  }
>
>  /**
> @@ -305,7 +306,6 @@ int clockevents_program_event(struct clo
>  {
>  	unsigned long long clc;
>  	int64_t delta;
> -	int rc;
>
>  	if (WARN_ON_ONCE(expires < 0))
>  		return -ETIME;
> @@ -324,16 +324,23 @@ int clockevents_program_event(struct clo
>  		return dev->set_next_ktime(expires, dev);
>
>  	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
> -	if (delta <= 0)
> -		return force ? clockevents_program_min_delta(dev) : -ETIME;
>
> -	delta = min(delta, (int64_t) dev->max_delta_ns);
> -	delta = max(delta, (int64_t) dev->min_delta_ns);
> +	if (delta > (int64_t)dev->min_delta_ns) {
> +		delta = min(delta, (int64_t) dev->max_delta_ns);
> +		clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> +		if (!dev->set_next_event((unsigned long) clc, dev))
> +			return 0;
> +	}
>
> -	clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> -	rc = dev->set_next_event((unsigned long) clc, dev);
> +	if (dev->next_event_forced)
> +		return 0;
>
> -	return (rc && force) ? clockevents_program_min_delta(dev) : rc;
> +	if (dev->set_next_event(dev->min_delta_ticks, dev)) {
> +		if (!force || clockevents_program_min_delta(dev))
> +			return -ETIME;
> +	}
> +	dev->next_event_forced = 1;
> +	return 0;
>  }
>
> /*
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -1888,6 +1888,7 @@ void hrtimer_interrupt(struct clock_even
>  	BUG_ON(!cpu_base->hres_active);
>  	cpu_base->nr_events++;
>  	dev->next_event = KTIME_MAX;
> +	dev->next_event_forced = 0;
>
>  	raw_spin_lock_irqsave(&cpu_base->lock, flags);
>  	entry_time = now = hrtimer_update_base(cpu_base);
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -110,6 +110,7 @@ void tick_handle_periodic(struct clock_e
>  	int cpu = smp_processor_id();
>  	ktime_t next = dev->next_event;
>
> +	dev->next_event_forced = 0;
>  	tick_periodic(cpu);
>
>  	/*
> /*
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1513,6 +1513,7 @@ static void tick_nohz_lowres_handler(str
>  	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
>
>  	dev->next_event = KTIME_MAX;
> +	dev->next_event_forced = 0;
>
>  	if (likely(tick_nohz_handler(&ts->sched_timer) == HRTIMER_RESTART))
>  		tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
>
>
Hello.

Sorry, but this patch, as of 7.0, introduced *severe* periodic lockups on my Ryzen 7700X machine.
I see messages like this in the log:

clocksource: Long readout interval, skipping watchdog check: cs_nsec: 2897344852 wd_nsec: 2897356996

Reverting d6e152d905bdb1f32f9d99775e2f453350399a6a on mainline fixes the issue for me.
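
For anyone following the quoted diff, the decision flow of the patched clockevents_program_event() can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not kernel code: struct model_dev, the model_* functions and the hw_rejects switch are hypothetical stand-ins, the caller passes the delta directly instead of computing it from ktime_get(), and clockevents_program_min_delta() is approximated by a single retry of the minimum-delta write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct clock_event_device. */
struct model_dev {
	uint64_t min_delta_ns;
	uint64_t max_delta_ns;
	uint64_t min_delta_ticks;
	uint32_t mult;
	uint32_t shift;
	bool next_event_forced;    /* flag added by the quoted patch */
	bool hw_rejects;           /* model only: make the "hardware" write fail */
	uint64_t programmed_ticks; /* last value accepted by the "hardware" */
	unsigned int hw_writes;    /* number of accepted writes */
};

enum model_result {
	MODEL_PROGRAMMED, /* regular delta was programmed */
	MODEL_FORCED,     /* minimum delta was programmed, flag set */
	MODEL_SKIPPED,    /* forced event already pending, nothing written */
	MODEL_ETIME,      /* programming failed */
};

/* Stand-in for dev->set_next_event(); returns 0 on success. */
static int model_set_next_event(struct model_dev *dev, uint64_t ticks)
{
	if (dev->hw_rejects)
		return -1;
	dev->programmed_ticks = ticks;
	dev->hw_writes++;
	return 0;
}

/* Same conversion as in the diff: (delta * mult) >> shift. */
static uint64_t model_ns_to_ticks(const struct model_dev *dev, uint64_t ns)
{
	assert(dev->shift < 64);
	return (ns * dev->mult) >> dev->shift;
}

/* Mirrors the branch order of the patched function. */
static enum model_result model_program_event(struct model_dev *dev,
					     int64_t delta_ns, bool force)
{
	if (delta_ns > (int64_t)dev->min_delta_ns) {
		uint64_t d = (uint64_t)delta_ns;

		if (d > dev->max_delta_ns)
			d = dev->max_delta_ns;
		if (!model_set_next_event(dev, model_ns_to_ticks(dev, d)))
			return MODEL_PROGRAMMED;
	}

	/* A forced minimum-delta event is pending: do not reprogram. */
	if (dev->next_event_forced)
		return MODEL_SKIPPED;

	if (model_set_next_event(dev, dev->min_delta_ticks)) {
		/* Assumption: the min-delta retry is modelled as one more write. */
		if (!force || model_set_next_event(dev, dev->min_delta_ticks))
			return MODEL_ETIME;
	}
	dev->next_event_forced = true;
	return MODEL_FORCED;
}

/* The interrupt handlers in the diff clear the flag on entry. */
static void model_interrupt(struct model_dev *dev)
{
	dev->next_event_forced = false;
}
```

In this model, two back-to-back requests with an already expired time lead to a single hardware write: the first call returns MODEL_FORCED and the second returns MODEL_SKIPPED until model_interrupt() clears the flag. The model says nothing about what causes the lockups reported above.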