public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch V2 00/11] hrtimers: Prevent hrtimer interrupt starvation
@ 2026-04-08 11:53 Thomas Gleixner
  2026-04-08 11:53 ` [patch V2 01/11] hrtimer: Provide hrtimer_start_range_ns_user() Thomas Gleixner
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: Thomas Gleixner @ 2026-04-08 11:53 UTC (permalink / raw)
  To: LKML
  Cc: Calvin Owens, Anna-Maria Behnsen, Frederic Weisbecker,
	Peter Zijlstra (Intel), John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam

This is a follow up to V1 which can be found here:

 https://lore.kernel.org/lkml/20260407083219.478203185@kernel.org

Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
up in user space:

  https://lore.kernel.org/lkml/acMe-QZUel-bBYUh@mozart.vkv.me/

He provided a reproducer, which sets up a timerfd based timer and then
rearms it in a loop with an absolute expiry time of 1ns.

As the expiry time is in the past, the timer ends up as the first expiring
timer in the per CPU hrtimer base and the clockevent device is programmed
with the minimum delta value. If the machine is fast enough, this ends up
in a endless loop of programming the delta value to the minimum value
defined by the clock event device, before the timer interrupt can fire,
which starves the interrupt and consequently triggers the lockup detector
because the hrtimer callback of the lockup mechanism is never invoked.

The first patch in the V1 series changes the clockevent set next event
mechanism to prevent reprogramming of the clockevent device when the
minimum delta value was programmed unless the new delta is larger than
that. It's a less convoluted variant of the patch which was posted in the
above linked thread and was confirmed to prevent the starvation problem.

But that's only to be considered the last resort because it results in an
insane amount of avoidable hrtimer interrupts. That patch has been merged
into the tip tree already.

The problem of user controlled timers is that the input value is only
sanity checked vs. validity of the provided timespec and clamped to be in
the maximum allowable range. But for performance reasons for in kernel
usage there is no check whether a to be armed timer might have been expired
already at enqueue time.

This series addresses this by providing a separate interface to arm user
controlled timers. This works the same way as the existing
hrtimer_start_range_ns(), but in case that the timer ends up as the first
timer in the clock base after enqueue it provides additional checks:

      - Whether the timer becomes the first expiring timer in the CPU base.

      	If not the timer is considered to expire in the future as there is
	already an earlier event programmed.

      - Whether the timer has expired already by comparing the expiry value
        against current time.

	If it is expired, the timer is removed from the clock base and the
	function returns false, so that the caller can handle it. That's
	required because the function cannot invoke the callback as that
	might need to acquire a lock which is held by the caller.

This function is then used for the user controlled timer arming interfaces
mainly by converting hrtimer sleeper over to it. That affects a few in
kernel users too, but the overhead is minimal in that case and it spares a
tedious whack the mole game all over the tree.

The other usage sites in posixtimers, alarmtimers and timerfd are converted
as well, which should cover the vast majority of user space controllable
timers as far as my investigation goes.

Changes vs. V1:

   - Dropped the clockevents patch as it is already merged

   - Rebased on tip timers/core

   - Moved the user check into hrtimer_start_range_ns_user() - Peter

   - Renamed alarmtimer_start() to alarm_start_timer() - Peter

   - Picked up tags as appropriate
   
The series applies against tip timers/core and is also available from git:

    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git hrtimer-exp-v2

Thanks,

	tglx
---
 drivers/power/supply/charger-manager.c |   12 +-
 fs/timerfd.c                           |  117 ++++++++++++++++-----------
 include/linux/alarmtimer.h             |    9 +-
 include/linux/hrtimer.h                |   20 ++++
 include/trace/events/timer.h           |   13 +++
 kernel/time/alarmtimer.c               |   70 +++++++---------
 kernel/time/hrtimer.c                  |  140 ++++++++++++++++++++++++++++-----
 kernel/time/posix-cpu-timers.c         |   18 ++--
 kernel/time/posix-timers.c             |   35 +++++---
 kernel/time/posix-timers.h             |    4 
 net/netfilter/xt_IDLETIMER.c           |   24 ++++-
 11 files changed, 320 insertions(+), 142 deletions(-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2026-04-08 20:41 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-08 11:53 [patch V2 00/11] hrtimers: Prevent hrtimer interrupt starvation Thomas Gleixner
2026-04-08 11:53 ` [patch V2 01/11] hrtimer: Provide hrtimer_start_range_ns_user() Thomas Gleixner
2026-04-08 16:53   ` Frederic Weisbecker
2026-04-08 11:53 ` [patch V2 02/11] hrtimer: Use hrtimer_start_expires_user() for hrtimer sleepers Thomas Gleixner
2026-04-08 20:41   ` Frederic Weisbecker
2026-04-08 11:53 ` [patch V2 03/11] posix-timers: Expand timer_[re]arm() callbacks with a boolean return value Thomas Gleixner
2026-04-08 11:54 ` [patch V2 04/11] posix-timers: Handle the timer_[re]arm() " Thomas Gleixner
2026-04-08 11:54 ` [patch V2 05/11] posix-timers: Switch to hrtimer_start_expires_user() Thomas Gleixner
2026-04-08 11:54 ` [patch V2 06/11] alarmtimer: Provide alarm_start_timer() Thomas Gleixner
2026-04-08 11:54 ` [patch V2 07/11] alarmtimer: Convert posix timer functions to alarm_start_timer() Thomas Gleixner
2026-04-08 11:54 ` [patch V2 08/11] fs/timerfd: Use the new alarm/hrtimer functions Thomas Gleixner
2026-04-08 11:54 ` [patch V2 09/11] power: supply: charger-manager: Switch to alarm_start_timer() Thomas Gleixner
2026-04-08 11:54 ` [patch V2 10/11] netfilter: xt_IDLETIMER: " Thomas Gleixner
2026-04-08 11:54 ` [patch V2 11/11] alarmtimer: Remove unused interfaces Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox