All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	anna-maria@linutronix.de, Matt Fleming <mfleming@cloudflare.com>
Subject: [PATCH] timer/migration: Fix off-by-one root mis-connection
Date: Wed,  5 Feb 2025 17:02:20 +0100	[thread overview]
Message-ID: <20250205160220.39467-1-frederic@kernel.org> (raw)

Before attaching a new root to the old root, the children counter of the
new root is checked to verify that only the upcoming CPU's top group
have been connected to it. However since the recently added commit:

	b729cc1ec21a ("timers/migration: Fix another race between hotplug and idle entry/exit")

this check is not valid anymore because the old root is pre-accounted
as a child to the new root. Therefore after connecting the upcoming
CPU's top group to the new root, the children count to be expected must
be 2 and not 1 anymore.

This omission results in the old root to not be connected to the new
root. Then eventually the system may run with more than one top level,
which defeats the purpose of a single idle migrator.

Also the old root is pre-accounted but not connected upon the new root
creation. But it can be connected to the new root later on. Therefore
the old root may be accounted twice to the new root. The propagation of
such overcommit can end up creating a double final top-level root with a
groupmask incorrectly initialized. Although harmless given that the final
top level roots will never have a parent to walk up to, this oddity
opportunistically reported the core issue:

	 WARNING: CPU: 8 PID: 0 at kernel/time/timer_migration.c:543 tmigr_requires_handle_remoet
	 CPU: 8 UID: 0 PID: 0 Comm: swapper/8
	 RIP: 0010:tmigr_requires_handle_remote
	 Call Trace:
	  <IRQ>
	  ? tmigr_requires_handle_remote
	  ? hrtimer_run_queues
	  update_process_times
	  tick_periodic
	  tick_handle_periodic
	  __sysvec_apic_timer_interrupt
	  sysvec_apic_timer_interrupt
	  </IRQ>

Fix the problem with taking into account the old root in the children
count of the new root so the connection is not omitted.

Also warn when more than one top level group coexist to better detect
similar issues in the future.

Reported-by: Matt Fleming <mfleming@cloudflare.com>
Fixes: b729cc1ec21a ("timers/migration: Fix another race between hotplug and idle entry/exit")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/time/timer_migration.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
index 9cb9b6584ea1..2f6330831f08 100644
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -1675,6 +1675,9 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node)
 
 	} while (i < tmigr_hierarchy_levels);
 
+	/* Assert single root */
+	WARN_ON_ONCE(!err && !group->parent && !list_is_singular(&tmigr_level_list[top]));
+
 	while (i > 0) {
 		group = stack[--i];
 
@@ -1716,7 +1719,12 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node)
 		WARN_ON_ONCE(top == 0);
 
 		lvllist = &tmigr_level_list[top];
-		if (group->num_children == 1 && list_is_singular(lvllist)) {
+
+		/*
+		 * Newly created root level should have accounted the upcoming
+		 * CPU's child group and pre-accounted the old root.
+		 */
+		if (group->num_children == 2 && list_is_singular(lvllist)) {
 			/*
 			 * The target CPU must never do the prepare work, except
 			 * on early boot when the boot CPU is the target. Otherwise
-- 
2.46.0


             reply	other threads:[~2025-02-05 16:02 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-05 16:02 Frederic Weisbecker [this message]
2025-02-07  8:13 ` [tip: timers/urgent] timers/migration: Fix off-by-one root mis-connection tip-bot2 for Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250205160220.39467-1-frederic@kernel.org \
    --to=frederic@kernel.org \
    --cc=anna-maria@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mfleming@cloudflare.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.