From: Thomas Gleixner <tglx@linutronix.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: arnd@arndb.de, anna-maria@linutronix.de, frederic@kernel.org,
	peterz@infradead.org, luto@kernel.org, mingo@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
	linux-kernel@vger.kernel.org, oliver.sang@intel.com
Subject: Re: [RFC][PATCH 7/8] entry,hrtimer: Push reprogramming timers into the interrupt return path
Date: Sat, 20 Sep 2025 11:29:43 +0200
Message-ID: <875xdd8oag.ffs@tglx>
In-Reply-To: <20250918080206.180399724@infradead.org>

On Thu, Sep 18 2025 at 09:52, Peter Zijlstra wrote:
> Currently hrtimer_interrupt() runs expired timers, which can re-arm
> themselves, after which it computes the next expiration time and
> re-programs the hardware.
>
> However, things like HRTICK, a highres timer driving preemption,
> cannot re-arm itself at the point of running, since the next task has
> not been determined yet. The schedule() in the interrupt return path
> will switch to the next task, which then causes a new hrtimer to be
> programmed.
>
> This then results in reprogramming the hardware at least twice, once
> after running the timers, and once upon selecting the new task.
>
> Notably, *both* events happen in the interrupt.
>
> By pushing the hrtimer reprogram all the way into the interrupt return
> path, it runs after schedule() and this double reprogram can be
> avoided.
>
> XXX: 0-day is unhappy with this patch -- it is reporting lockups that
> very much look like a timer goes missing. Am unable to reproduce.
> Notable: the lockup goes away when the workloads are run without perf
> monitors.

After staring at it for a while, I have two observations.

1) In the 0-day report the lockup detector triggers on spinlock
   contention in futex_wait_setup().

   I'm not really seeing how that's related to a missing timer.

   Without knowing what the other CPUs are doing and what holds the
   lock, it's pretty much impossible to tell what the hell is going on.

   So that might need a back trace triggered on all CPUs and perhaps
   some debug output in the backtrace about the hrtimer state.

   On the CPU where the lockup is detected, the timer is working.


2) I came up with the following scenario, which is broken with this
   delayed rearm.

   Assume this happens on the timekeeping CPU.

      hrtimer_interrupt()
        expire_timers();
        set(TIF_REARM);

      exit_to_user_mode_prepare()
        handle_tif_muck()
          ...
          to = jiffies + 2;
          while (!cond() && time_before(jiffies, to))
          	relax();

   If cond() does not become true for whatever reason, then this loop
   never makes progress, because the tick hrtimer which increments
   jiffies cannot fire while the rearm is still pending.

   It can also be a wait on a remote CPU preventing progress
   indirectly, or a subtle dependency on a timer (timer list or
   hrtimer) to expire.

   I have no idea whether that's related to the reported 0-day fallout,
   but it definitely is a real problem lurking in the dark.
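
To make scenario 2 concrete, here is a minimal user-space model in C. It
is not kernel code. All names (cpu_model, hw_step, bounded_wait,
irq_and_exit_work) are made up for illustration, and max_spins exists only
so the model terminates. It assumes cond() never becomes true and that
jiffies advance only while the timer hardware is armed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the timekeeping CPU; not a kernel structure. */
struct cpu_model {
	unsigned long jiffies;	/* advanced only by the tick timer */
	bool timer_armed;	/* is the timer hardware programmed? */
};

/* One step of "hardware time": the tick fires only if the timer is armed. */
static void hw_step(struct cpu_model *cpu)
{
	if (cpu->timer_armed)
		cpu->jiffies++;
}

/*
 * Models the bounded wait from the scenario:
 *
 *	to = jiffies + 2;
 *	while (!cond() && time_before(jiffies, to))
 *		relax();
 *
 * cond() is assumed to stay false. Returns true if the timeout was
 * reached, false if the loop was still spinning after max_spins steps.
 */
bool bounded_wait(struct cpu_model *cpu, unsigned long max_spins)
{
	unsigned long to = cpu->jiffies + 2;

	assert(max_spins > 0);
	for (unsigned long i = 0; i < max_spins; i++) {
		if ((long)(cpu->jiffies - to) >= 0)	/* !time_before() */
			return true;
		hw_step(cpu);				/* relax() */
	}
	return (long)(cpu->jiffies - to) >= 0;
}

/*
 * Models one timer interrupt followed by exit-to-user work.
 * deferred_rearm == false: hardware is reprogrammed in the interrupt.
 * deferred_rearm == true:  reprogramming waits until the exit work is done.
 */
bool irq_and_exit_work(bool deferred_rearm)
{
	struct cpu_model cpu = { .jiffies = 0, .timer_armed = true };
	bool timed_out;

	/* hrtimer_interrupt(): timers expired, hardware not yet reprogrammed */
	cpu.timer_armed = false;
	if (!deferred_rearm)
		cpu.timer_armed = true;

	/* exit work that relies on jiffies moving forward */
	timed_out = bounded_wait(&cpu, 1000);

	/* with the deferred scheme the rearm would only happen here */
	cpu.timer_armed = true;
	return timed_out;
}
```

Under these assumptions irq_and_exit_work(false) returns true (the wait
hits its timeout), while irq_and_exit_work(true) returns false (the loop
is still spinning when the model gives up).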

Thanks,

        tglx

