From: Frederic Weisbecker <frederic@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Anna-Maria Behnsen <anna-maria@linutronix.de>
Subject: [PATCH 1/2] timers/migration: Fix endless timer requeue after idle interrupts
Date: Tue, 19 Mar 2024 00:07:28 +0100
Message-ID: <20240318230729.15497-2-frederic@kernel.org>
In-Reply-To: <20240318230729.15497-1-frederic@kernel.org>

When a CPU is an idle migrator, but another CPU wakes up before it,
becomes an active migrator and handles the queue, the initial idle
migrator may end up endlessly reprogramming its clockevent, chasing ghost
timers forever, as in the following scenario:

               [GRP0:0]
             migrator = 0
             active   = 0
             nextevt  = T1
              /         \
             0           1
          active        idle (T1)

0) CPU 1 is idle and has a timer queued (T1), CPU 0 is active and is
the active migrator.

               [GRP0:0]
             migrator = NONE
             active   = NONE
             nextevt  = T1
              /         \
             0           1
          idle        idle (T1)
          wakeup = T1

1) CPU 0 is now idle and is therefore the idle migrator. It has
programmed its next timer interrupt to handle T1.

               [GRP0:0]
             migrator = 1
             active   = 1
             nextevt  = KTIME_MAX
              /         \
             0           1
          idle        active
          wakeup = T1

2) CPU 1 has woken up. It is now active, has become the migrator and has
just handled its own timer T1.

3) CPU 0 gets a timer interrupt to handle T1 but tmigr_handle_remote()
realizes it is not the migrator anymore. So it returns early without
observing that T1 has already expired and therefore without updating
its ->wakeup value.

4) CPU 0 goes into tmigr_cpu_new_timer() which also returns early
because it doesn't queue a timer of its own. So ->wakeup is left
unchanged and the next timer is programmed to fire now.

5) goto 3) forever

This results in timer interrupt storms in idle and also in nohz_full (as
observed in rcutorture's TREE07 scenario).

Fix this by forcing a re-evaluation of tmc->wakeup while trying
remote timer handling when the CPU isn't the migrator anymore. The
check is inherently racy, but in the worst case the CPU just races setting
the KTIME_MAX value that a remote expiry also tries to set.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Fixes: 7ee988770326 ("timers: Implement the hierarchical pull model")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
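The loop in steps 3) to 5) above can be sketched as a userspace toy model.
This is only an illustration under stated assumptions: struct toy_cpu,
toy_handle_remote() and toy_count_irqs() are hypothetical names that do not
exist in the kernel, the group walk is reduced to copying the group's next
event into ->wakeup, and locking and READ_ONCE()/WRITE_ONCE() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KTIME_MAX INT64_MAX	/* "no event pending", as in the kernel */

/* Hypothetical, heavily simplified stand-in for struct tmigr_cpu. */
struct toy_cpu {
	bool	is_migrator;	/* models the tmigr_check_migrator() result */
	int64_t	wakeup;		/* models tmc->wakeup */
};

/*
 * Models the early-return check in tmigr_handle_remote().
 * fixed == false: old behaviour, return as soon as we are not the migrator.
 * fixed == true:  only return early if ->wakeup is already KTIME_MAX.
 */
static void toy_handle_remote(struct toy_cpu *cpu, int64_t group_nextevt,
			      bool fixed)
{
	if (!cpu->is_migrator) {
		if (!fixed)
			return;		/* ->wakeup stays stale */
		if (cpu->wakeup == KTIME_MAX)
			return;		/* nothing to re-evaluate */
	}

	/* Assumption: the group walk boils down to this re-evaluation. */
	cpu->wakeup = group_nextevt;
}

/*
 * Counts the timer interrupts CPU 0 takes in the scenario above before it
 * stops reprogramming its clockevent, capped at max_irqs.
 */
static int toy_count_irqs(bool fixed, int max_irqs)
{
	/* Step 2: CPU 0 is idle, no longer migrator, ->wakeup still T1. */
	struct toy_cpu cpu0 = { .is_migrator = false, .wakeup = 100 /* T1 */ };
	/* CPU 1 already handled T1, so the group has nothing pending. */
	const int64_t group_nextevt = KTIME_MAX;
	int irqs = 0;

	assert(max_irqs > 0);

	/* A stale ->wakeup in the past means the clockevent fires again. */
	while (cpu0.wakeup != KTIME_MAX && irqs < max_irqs) {
		irqs++;						/* step 3 */
		toy_handle_remote(&cpu0, group_nextevt, fixed);
		/* Step 4: no own timer queued, ->wakeup is not touched. */
	}

	return irqs;
}
```

In this model toy_count_irqs(false, 1000) runs into the cap, which
corresponds to the interrupt storm, while toy_count_irqs(true, 1000)
returns 1 because the first interrupt resets ->wakeup to KTIME_MAX.
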
 kernel/time/timer_migration.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
index 611cd904f035..c63a0afdcebe 100644
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -1038,8 +1038,15 @@ void tmigr_handle_remote(void)
 	 * in tmigr_handle_remote_up() anyway. Keep this check to speed up the
 	 * return when nothing has to be done.
 	 */
-	if (!tmigr_check_migrator(tmc->tmgroup, tmc->childmask))
-		return;
+	if (!tmigr_check_migrator(tmc->tmgroup, tmc->childmask)) {
+		/*
+		 * If this CPU was an idle migrator, make sure to clear its wakeup
+		 * value so it won't chase timers that have already expired elsewhere.
+		 * This avoids endless requeue from tmigr_new_timer().
+		 */
+		if (READ_ONCE(tmc->wakeup) == KTIME_MAX)
+			return;
+	}
 
 	data.now = get_jiffies_update(&data.basej);
 
-- 
2.44.0

