public inbox for linux-pm@vger.kernel.org
From: Hanabishi <i.r.e.c.c.a.k.u.n+kernel.org@gmail.com>
To: Thomas Gleixner <tglx@kernel.org>, LKML <linux-kernel@vger.kernel.org>
Cc: Calvin Owens <calvin@wbinvd.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Anna-Maria Behnsen <anna-maria@linutronix.de>,
	Frederic Weisbecker <frederic@kernel.org>,
	Ingo Molnar <mingo@kernel.org>, John Stultz <jstultz@google.com>,
	Stephen Boyd <sboyd@kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, Sebastian Reichel <sre@kernel.org>,
	linux-pm@vger.kernel.org, Pablo Neira Ayuso <pablo@netfilter.org>,
	Florian Westphal <fw@strlen.de>, Phil Sutter <phil@nwl.cc>,
	netfilter-devel@vger.kernel.org, coreteam@netfilter.org
Subject: The "clockevents: Prevent timer interrupt starvation" patch causes lockups
Date: Mon, 13 Apr 2026 21:20:43 +0000
Message-ID: <68d1e9ac-2780-4be3-8ee3-0788062dd3a4@gmail.com>
In-Reply-To: <20260407083247.562657657@kernel.org>

On 07/04/2026 08:54, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@kernel.org>
> 
> Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
> up in user space. He provided a reproducer, which sets up a timerfd based
> timer and then rearms it in a loop with an absolute expiry time of 1ns.
> 
> As the expiry time is in the past, the timer ends up as the first expiring
> timer in the per CPU hrtimer base and the clockevent device is programmed
> with the minimum delta value. If the machine is fast enough, this ends up
> in an endless loop of programming the delta value to the minimum value
> defined by the clock event device, before the timer interrupt can fire,
> which starves the interrupt and consequently triggers the lockup detector
> because the hrtimer callback of the lockup mechanism is never invoked.
> 
> As a first step to prevent this, avoid reprogramming the clock event device
> when:
>       - a forced minimum delta event is pending
>       - the new expiry delta is less than or equal to the minimum delta
> 
> Thanks to Calvin for providing the reproducer and to Borislav for testing
> and providing data from his Zen5 machine.
> 
> The problem is not limited to Zen5, but depending on the underlying
> clock event device (e.g. TSC deadline timer on Intel) and the CPU speed
> it is not necessarily observable.
> 
> This change serves only as the last resort and further changes will be made
> to prevent this scenario earlier in the call chain as far as possible.
> 
> Fixes: d316c57ff6bf ("[PATCH] clockevents: add core functionality")
> Reported-by: Calvin Owens <calvin@wbinvd.org>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Link: https://lore.kernel.org/lkml/acMe-QZUel-bBYUh@mozart.vkv.me/
> ---
> V2: Simplified the clockevents code - Peter
> ---
>   include/linux/clockchips.h |    2 ++
>   kernel/time/clockevents.c  |   23 +++++++++++++++--------
>   kernel/time/hrtimer.c      |    1 +
>   kernel/time/tick-common.c  |    1 +
>   kernel/time/tick-sched.c   |    1 +
>   5 files changed, 20 insertions(+), 8 deletions(-)
> --- a/include/linux/clockchips.h
> +++ b/include/linux/clockchips.h
> @@ -80,6 +80,7 @@ enum clock_event_state {
>    * @shift:		nanoseconds to cycles divisor (power of two)
>    * @state_use_accessors:current state of the device, assigned by the core code
>    * @features:		features
> + * @next_event_forced:	True if the last programming was a forced event
>    * @retries:		number of forced programming retries
>    * @set_state_periodic:	switch state to periodic
>    * @set_state_oneshot:	switch state to oneshot
> @@ -108,6 +109,7 @@ struct clock_event_device {
>   	u32			shift;
>   	enum clock_event_state	state_use_accessors;
>   	unsigned int		features;
> +	unsigned int		next_event_forced;
>   	unsigned long		retries;
>   
>   	int			(*set_state_periodic)(struct clock_event_device *);
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -172,6 +172,7 @@ void clockevents_shutdown(struct clock_e
>   {
>   	clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
>   	dev->next_event = KTIME_MAX;
> +	dev->next_event_forced = 0;
>   }
>   
>   /**
> @@ -305,7 +306,6 @@ int clockevents_program_event(struct clo
>   {
>   	unsigned long long clc;
>   	int64_t delta;
> -	int rc;
>   
>   	if (WARN_ON_ONCE(expires < 0))
>   		return -ETIME;
> @@ -324,16 +324,23 @@ int clockevents_program_event(struct clo
>   		return dev->set_next_ktime(expires, dev);
>   
>   	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
> -	if (delta <= 0)
> -		return force ? clockevents_program_min_delta(dev) : -ETIME;
>   
> -	delta = min(delta, (int64_t) dev->max_delta_ns);
> -	delta = max(delta, (int64_t) dev->min_delta_ns);
> +	if (delta > (int64_t)dev->min_delta_ns) {
> +		delta = min(delta, (int64_t) dev->max_delta_ns);
> +		clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> +		if (!dev->set_next_event((unsigned long) clc, dev))
> +			return 0;
> +	}
>   
> -	clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> -	rc = dev->set_next_event((unsigned long) clc, dev);
> +	if (dev->next_event_forced)
> +		return 0;
>   
> -	return (rc && force) ? clockevents_program_min_delta(dev) : rc;
> +	if (dev->set_next_event(dev->min_delta_ticks, dev)) {
> +		if (!force || clockevents_program_min_delta(dev))
> +			return -ETIME;
> +	}
> +	dev->next_event_forced = 1;
> +	return 0;
>   }
>   
>   /*
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -1888,6 +1888,7 @@ void hrtimer_interrupt(struct clock_even
>   	BUG_ON(!cpu_base->hres_active);
>   	cpu_base->nr_events++;
>   	dev->next_event = KTIME_MAX;
> +	dev->next_event_forced = 0;
>   
>   	raw_spin_lock_irqsave(&cpu_base->lock, flags);
>   	entry_time = now = hrtimer_update_base(cpu_base);
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -110,6 +110,7 @@ void tick_handle_periodic(struct clock_e
>   	int cpu = smp_processor_id();
>   	ktime_t next = dev->next_event;
>   
> +	dev->next_event_forced = 0;
>   	tick_periodic(cpu);
>   
>   	/*
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1513,6 +1513,7 @@ static void tick_nohz_lowres_handler(str
>   	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
>   
>   	dev->next_event = KTIME_MAX;
> +	dev->next_event_forced = 0;
>   
>   	if (likely(tick_nohz_handler(&ts->sched_timer) == HRTIMER_RESTART))
>   		tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
> 
> 

Hello.

Sorry, but this patch as of 7.0 introduced *severe* periodic lockups on my Ryzen 7700X machine.
I see such messages in the log:

clocksource: Long readout interval, skipping watchdog check: cs_nsec: 2897344852 wd_nsec: 2897356996
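
To put numbers on that message, here is a small arithmetic sketch over the two values it prints. The helper names are made up for illustration. As far as I understand the watchdog code, cs_nsec and wd_nsec are the same interval measured by the clocksource under test and by the watchdog clocksource, and the nominal check interval is about half a second; treat both statements as my assumptions.

```c
#include <stdint.h>
#include <stdio.h>

/* Values copied verbatim from the log line above. */
static const uint64_t cs_nsec = 2897344852ULL;
static const uint64_t wd_nsec = 2897356996ULL;

/* Hypothetical helper: convert nanoseconds to seconds for display. */
static double ns_to_sec(uint64_t ns)
{
	return (double)ns / 1e9;
}

/* Hypothetical helper: signed difference between the two measurements. */
static int64_t skew_ns(uint64_t cs, uint64_t wd)
{
	return (int64_t)cs - (int64_t)wd;
}

/* Print the interval in seconds and the disagreement in nanoseconds. */
static void print_readout_summary(void)
{
	printf("interval: %.3f s, skew: %lld ns\n",
	       ns_to_sec(cs_nsec), (long long)skew_ns(cs_nsec, wd_nsec));
}
```

The two measurements differ by only about 12 microseconds, so the clocks agree with each other. What stands out is the interval itself: roughly 2.9 seconds passed between two watchdog runs.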

Reverting d6e152d905bdb1f32f9d99775e2f453350399a6a on mainline fixes the issue for me.
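
For anyone following the quoted diff, below is a minimal userspace sketch of the control flow the patch gives clockevents_program_event(), as I read it. It is not kernel code. struct model_dev, model_set_next_event(), model_interrupt() and model_rearm_loop() are hypothetical stand-ins, the simulated hardware write always succeeds, and the delta is passed in directly instead of being derived from ktime_get(). It only illustrates the intended behaviour (one hardware write per forced event until the interrupt clears the flag) and says nothing about the cause of the lockups I am seeing.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for struct clock_event_device (only what the diff touches). */
struct model_dev {
	int64_t min_delta_ns;
	int64_t max_delta_ns;
	unsigned int next_event_forced;
	unsigned long hw_programs;	/* number of simulated hardware writes */
};

/* Simulated set_next_event(): counts the write and always succeeds. */
static int model_set_next_event(struct model_dev *dev, int64_t delta_ns)
{
	(void)delta_ns;
	dev->hw_programs++;
	return 0;
}

/* Follows the branch structure of the patched function in the diff. */
static int model_program_event(struct model_dev *dev, int64_t delta_ns, int force)
{
	assert(dev->min_delta_ns > 0);

	if (delta_ns > dev->min_delta_ns) {
		if (delta_ns > dev->max_delta_ns)
			delta_ns = dev->max_delta_ns;
		if (!model_set_next_event(dev, delta_ns))
			return 0;
	}

	/* A forced minimum-delta event is already pending: skip reprogramming. */
	if (dev->next_event_forced)
		return 0;

	/* The kernel retries via clockevents_program_min_delta() when forced;
	 * this model has no retry path because the write cannot fail. */
	if (model_set_next_event(dev, dev->min_delta_ns)) {
		(void)force;
		return -ETIME;
	}
	dev->next_event_forced = 1;
	return 0;
}

/* Models the interrupt handlers in the diff, which clear the flag. */
static void model_interrupt(struct model_dev *dev)
{
	dev->next_event_forced = 0;
}

/* Rearm with an already expired timer n times; returns total hardware writes. */
static unsigned long model_rearm_loop(struct model_dev *dev, unsigned long n)
{
	for (unsigned long i = 0; i < n; i++)
		model_program_event(dev, -1, 1);
	return dev->hw_programs;
}
```

In this model, calling model_rearm_loop() on a fresh device performs a single hardware write no matter how many iterations run, and a later model_interrupt() re-enables programming.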

