Re: [tip:sched/hrtick] [hrtimer] 2889243848: stress-ng.timermix.ops_per_sec 30.1% regression

From: Peter Zijlstra <peterz@infradead.org>
To: Thomas Gleixner <tglx@kernel.org>
Cc: Joe Talbott <joetalbott@gmail.com>,
	kernel test robot <oliver.sang@intel.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: [tip:sched/hrtick] [hrtimer] 2889243848: stress-ng.timermix.ops_per_sec 30.1% regression
Date: Wed, 11 Mar 2026 13:15:00 +0100
Message-ID: <20260311121500.GF652779@noisy.programming.kicks-ass.net>
In-Reply-To: <20260311105819.GL606826@noisy.programming.kicks-ass.net>

On Wed, Mar 11, 2026 at 11:58:19AM +0100, Peter Zijlstra wrote:

> > Hmm. The original code preserved hang_detected until the next timer
> > interrupt to prevent rearming when a new timer is queued.
> 
> Oh indeed. And that avoids __hrtimer_reprogram() from coming in and
> 'destroying' the delay I suppose.
> 
> Let me poke at this a little more then.

How's this then?

---
Subject: hrtimer: Less aggressive interrupt 'hang' handling
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 10 Mar 2026 20:02:21 +0100

When the hrtimer_interrupt needs to restart more than 3 times and
still has expired timers, the interrupt is considered hung. To give
the system a little time to recover, the hardware timer is programmed
a little into the future.

Prior to commit 288924384856 ("hrtimer: Re-arrange
hrtimer_interrupt()"), this was relative to the amount of time spent
serving the interrupt, with a max of 100 msec.

However, in order to simplify, and because this condition 'should' not
happen, the timeout was unconditionally set to 100 msec.

'Obviously' there is a benchmark that hits this hard, by programming a
ton of very short timers :-/

Since reprogramming is decoupled from the interrupt handling, the
actual execution time is lost. However, the code does track
max_hang_time. Using that, capped at 100 msec, rather than an
unconditional 100 msec restores performance.

  stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --timermix 64

                  bogo ops/s
 288924384856^1: 23715979.93
 288924384856:   11550049.77
 patched:        23361116.78

Additionally, Thomas noted that we should not clear ->hang_detected
until the next interrupt, such that __hrtimer_reprogram() won't undo
the extra delay.

Fixes: 288924384856 ("hrtimer: Re-arrange hrtimer_interrupt()")
Closes: https://lore.kernel.org/oe-lkp/202603102229.74b9dee4-lkp@intel.com
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/time/hrtimer.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2031,8 +2031,8 @@ static void hrtimer_rearm(struct hrtimer
 		 * Give the system a chance to do something else than looping
 		 * on hrtimer interrupts.
 		 */
-		expires_next = ktime_add_ns(ktime_get(), 100 * NSEC_PER_MSEC);
-		cpu_base->hang_detected = false;
+		expires_next = ktime_add_ns(ktime_get(),
+					    min(100 * NSEC_PER_MSEC, cpu_base->max_hang_time));
 	}
 	hrtimer_rearm_event(expires_next, deferred);
 }
@@ -2121,6 +2121,7 @@ void hrtimer_interrupt(struct clock_even
 	 */
 	now = hrtimer_update_base(cpu_base);
 	expires_next = hrtimer_update_next_event(cpu_base);
+	cpu_base->hang_detected = false;
 	if (expires_next < now) {
 		if (++retries < 3)
 			goto retry;
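
For readers following along, here is a minimal user-space sketch of the
two behaviours the patch changes: the length of the back-off after a
detected hang, and how long ->hang_detected stays set. It is a model
under stated assumptions, not kernel code. struct model_cpu_base, the
hw_expires field and every function name below are hypothetical; only
hang_detected, max_hang_time and the 100 msec cap come from the patch.
The model also assumes that reprogramming is skipped while the flag is
set, which is the behaviour described in the quoted discussion.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical stand-in for the hrtimer_cpu_base fields the patch touches. */
struct model_cpu_base {
	bool hang_detected;
	uint64_t max_hang_time;	/* ns; assumed: longest interrupt run recorded so far */
	uint64_t hw_expires;	/* ns; hypothetical: what the modelled hardware timer is set to */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Back-off as of 288924384856: unconditionally 100 msec. */
static uint64_t hang_delay_before(const struct model_cpu_base *base)
{
	(void)base;
	return 100 * NSEC_PER_MSEC;
}

/* Back-off with the patch: max_hang_time, capped at 100 msec. */
static uint64_t hang_delay_patched(const struct model_cpu_base *base)
{
	return min_u64(100 * NSEC_PER_MSEC, base->max_hang_time);
}

/*
 * Models the rearm after a hang. Simplification: the flag is set here,
 * and, as in the patch, it is NOT cleared here.
 */
static void model_rearm_after_hang(struct model_cpu_base *base, uint64_t now)
{
	base->hang_detected = true;
	base->hw_expires = now + hang_delay_patched(base);
}

/*
 * Models a newly queued timer trying to pull the hardware expiry in.
 * Assumption: nothing is reprogrammed while hang_detected is set, so
 * the back-off survives. Returns true if the expiry was changed.
 */
static bool model_reprogram(struct model_cpu_base *base, uint64_t expires)
{
	if (base->hang_detected)
		return false;
	if (expires >= base->hw_expires)
		return false;
	base->hw_expires = expires;
	return true;
}

/* Models the start of the next interrupt, where the patch clears the flag. */
static void model_interrupt_entry(struct model_cpu_base *base)
{
	base->hang_detected = false;
}
```

With max_hang_time at 50 usec, the patched back-off in this model is
50 usec instead of 100 msec, and a timer queued in between cannot
shorten it until model_interrupt_entry() has run.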

Thread overview: 10+ messages
2026-03-10 14:46 [tip:sched/hrtick] [hrtimer] 2889243848: stress-ng.timermix.ops_per_sec 30.1% regression kernel test robot
2026-03-10 15:23 ` Peter Zijlstra
2026-03-10 17:11   ` Joe Talbott
2026-03-10 18:16     ` Peter Zijlstra
2026-03-10 18:50       ` Peter Zijlstra
2026-03-10 19:02         ` Peter Zijlstra
2026-03-11  9:40           ` Thomas Gleixner
2026-03-11 10:58             ` Peter Zijlstra
2026-03-11 12:15               ` Peter Zijlstra [this message]
2026-03-11 20:16                 ` [tip: sched/hrtick] hrtimer: Less agressive interrupt 'hang' handling tip-bot2 for Peter Zijlstra
