From: Thomas Gleixner <tglx@kernel.org>
To: Borislav Petkov <bp@alien8.de>, Calvin Owens <calvin@wbinvd.org>
Cc: Petr Mladek <pmladek@suse.com>,
	linux-kernel@vger.kernel.org, arighi@nvidia.com,
	yaozhenguo1@gmail.com, tj@kernel.org,
	feng.tang@linux.alibaba.com, lirongqing@baidu.com,
	realwujing@gmail.com, hu.shengming@zte.com.cn,
	dianders@chromium.org, joel.granados@kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Anna-Maria Behnsen <anna-maria@linutronix.de>,
	x86@kernel.org
Subject: [PATCH] clockevents: Prevent timer interrupt starvation
Date: Thu, 02 Apr 2026 19:07:49 +0200
Message-ID: <87jyup70ka.ffs@tglx>
In-Reply-To: <20260401163435.GGac1JG42tWmsCKL37@fat_crate.local>

Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
up in user space. He provided a reproducer, which sets up a timerfd-based
timer and then rearms it in a loop with an absolute expiry time of 1ns.

As the expiry time is in the past, the timer ends up as the first expiring
timer in the per CPU hrtimer base and the clockevent device is programmed
with the minimum delta value. If the machine is fast enough, this ends up
in an endless loop of programming the delta value to the minimum value
defined by the clock event device, before the timer interrupt can fire,
which starves the interrupt and consequently triggers the lockup detector
because the hrtimer callback of the lockup mechanism is never invoked.
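
The starvation can be pictured with a toy user space model. This is a
sketch under simplifying assumptions, not kernel code: the timer is a plain
countdown, every reprogramming restarts it from the minimum delta, and all
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy one-shot timer: fires when 'remaining_ns' has counted down to zero. */
struct toy_timer {
	int64_t min_delta_ns;	/* smallest programmable delta */
	int64_t remaining_ns;	/* time left until the interrupt fires */
};

/* Reprogramming restarts the countdown from the minimum delta. */
static void toy_program_min_delta(struct toy_timer *t)
{
	t->remaining_ns = t->min_delta_ns;
}

/* Advance time by 'ns'; returns true if the interrupt would fire. */
static bool toy_advance(struct toy_timer *t, int64_t ns)
{
	t->remaining_ns -= ns;
	return t->remaining_ns <= 0;
}

/*
 * Reprogram once per round, then let 'rearm_interval_ns' pass before the
 * next reprogramming. Returns how many rounds saw the interrupt fire.
 */
static unsigned int toy_count_interrupts(int64_t min_delta_ns,
					 int64_t rearm_interval_ns,
					 unsigned int rounds)
{
	struct toy_timer t = { .min_delta_ns = min_delta_ns };
	unsigned int fired = 0;

	for (unsigned int i = 0; i < rounds; i++) {
		toy_program_min_delta(&t);
		if (toy_advance(&t, rearm_interval_ns))
			fired++;
	}
	return fired;
}
```

In this model the interrupt never fires when the rearm interval is shorter
than the minimum delta, no matter how many rounds are executed.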

As a first step to prevent this, avoid reprogramming the clock event device
when:
     - a forced minimum delta event is pending
     - the new expiry delta is less than or equal to the minimum delta
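
As a rough user space model of the check that the diff adds to
clockevents_program_event(), the two conditions combine as sketched below.
The struct and function names are hypothetical stand-ins. Only the field
names next_event_forced and min_delta_ns are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two clock_event_device fields used here. */
struct model_dev {
	unsigned int next_event_forced;	/* forced min delta event pending */
	uint64_t min_delta_ns;		/* minimum programmable delta */
};

/*
 * Returns true when reprogramming would be skipped: a forced event is
 * pending and the requested delta does not exceed the minimum delta.
 */
static bool model_skip_reprogram(const struct model_dev *dev, int64_t delta_ns)
{
	return dev->next_event_forced &&
	       delta_ns <= (int64_t)dev->min_delta_ns;
}
```

A delta in the past (negative) is skipped while a forced event is pending. A
delta larger than the minimum delta is still programmed.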

Thanks to Calvin for providing the reproducer and to Borislav for testing
and providing data from his Zen5 machine.

The problem is not limited to Zen5, but depending on the underlying
clock event device (e.g. TSC deadline timer on Intel) and the CPU speed
it is not necessarily observable.

This change serves only as the last resort and further changes will be made
to prevent this scenario earlier in the call chain.

Reported-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
---
P.S: I'm working on the other changes, but wanted to get this out ASAP
     for testing.
---
 include/linux/clockchips.h |    2 ++
 kernel/time/clockevents.c  |   37 +++++++++++++++++++++++--------------
 2 files changed, 25 insertions(+), 14 deletions(-)

--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -80,6 +80,7 @@ enum clock_event_state {
  * @shift:		nanoseconds to cycles divisor (power of two)
  * @state_use_accessors:current state of the device, assigned by the core code
  * @features:		features
+ * @next_event_forced:	True if the last programming was a forced event
  * @retries:		number of forced programming retries
  * @set_state_periodic:	switch state to periodic
  * @set_state_oneshot:	switch state to oneshot
@@ -108,6 +109,7 @@ struct clock_event_device {
 	u32			shift;
 	enum clock_event_state	state_use_accessors;
 	unsigned int		features;
+	unsigned int		next_event_forced;
 	unsigned long		retries;
 
 	int			(*set_state_periodic)(struct clock_event_device *);
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -172,6 +172,7 @@ void clockevents_shutdown(struct clock_e
 {
 	clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
 	dev->next_event = KTIME_MAX;
+	dev->next_event_forced = 0;
 }
 
 /**
@@ -224,13 +225,7 @@ static int clockevents_increase_min_delt
 	return 0;
 }
 
-/**
- * clockevents_program_min_delta - Set clock event device to the minimum delay.
- * @dev:	device to program
- *
- * Returns 0 on success, -ETIME when the retry loop failed.
- */
-static int clockevents_program_min_delta(struct clock_event_device *dev)
+static int __clockevents_program_min_delta(struct clock_event_device *dev)
 {
 	unsigned long long clc;
 	int64_t delta;
@@ -263,13 +258,7 @@ static int clockevents_program_min_delta
 
 #else  /* CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST */
 
-/**
- * clockevents_program_min_delta - Set clock event device to the minimum delay.
- * @dev:	device to program
- *
- * Returns 0 on success, -ETIME when the retry loop failed.
- */
-static int clockevents_program_min_delta(struct clock_event_device *dev)
+static int __clockevents_program_min_delta(struct clock_event_device *dev)
 {
 	unsigned long long clc;
 	int64_t delta = 0;
@@ -293,6 +282,21 @@ static int clockevents_program_min_delta
 #endif /* CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST */
 
 /**
+ * clockevents_program_min_delta - Set clock event device to the minimum delay.
+ * @dev:	device to program
+ *
+ * Returns 0 on success, -ETIME when the retry loop failed.
+ */
+static int clockevents_program_min_delta(struct clock_event_device *dev)
+{
+	if (dev->next_event_forced)
+		return 0;
+
+	dev->next_event_forced = 1;
+	return __clockevents_program_min_delta(dev);
+}
+
+/**
  * clockevents_program_event - Reprogram the clock event device.
  * @dev:	device to program
  * @expires:	absolute expiry time (monotonic clock)
@@ -324,6 +328,11 @@ int clockevents_program_event(struct clo
 		return dev->set_next_ktime(expires, dev);
 
 	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
+
+	/* Don't reprogram when a forced event is pending */
+	if (dev->next_event_forced && delta <= (int64_t)dev->min_delta_ns)
+		return 0;
+
 	if (delta <= 0)
 		return force ? clockevents_program_min_delta(dev) : -ETIME;
 

Thread overview: 22+ messages
2026-03-24 23:32 [BUG] Random hard lockup with userspace %ip on 7.0-rc5 Calvin Owens
2026-03-25  9:03 ` Petr Mladek
2026-03-25 16:56   ` Thomas Gleixner
2026-04-01  1:58     ` Calvin Owens
2026-04-01 15:01       ` Thomas Gleixner
2026-04-01 15:12         ` Borislav Petkov
2026-04-01 16:34         ` Borislav Petkov
2026-04-02 17:07           ` Thomas Gleixner [this message]
2026-04-03  5:11             ` [PATCH] clockevents: Prevent timer interrupt starvation Calvin Owens
2026-04-03 14:41               ` Thomas Gleixner
2026-04-03 15:58                 ` Calvin Owens
2026-04-03 19:00                   ` Thomas Gleixner
2026-04-04  0:15                     ` Calvin Owens
2026-04-03 12:16             ` Peter Zijlstra
2026-04-03 14:43               ` Thomas Gleixner
2026-04-03 16:17               ` Thomas Gleixner
2026-04-03 21:01                 ` Peter Zijlstra
2026-04-03 21:24                   ` Thomas Gleixner
2026-04-03 22:14                     ` Thomas Gleixner
2026-04-03 22:21                       ` Peter Zijlstra
2026-03-27  1:36   ` [BUG] Random hard lockup with userspace %ip on 7.0-rc5 Feng Tang
2026-03-27 15:36     ` Calvin Owens
