From: Frederic Weisbecker <frederic@kernel.org>
To: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org, Borislav Petkov <bp@alien8.de>,
Narasimhan V <Narasimhan.V@amd.com>
Subject: Re: [PATCH 0/3] timer_migration: Fix a possible race and improvements
Date: Fri, 21 Jun 2024 16:31:15 +0200
Message-ID: <ZnWOswTMML6ShzYO@localhost.localdomain>
In-Reply-To: <20240621-tmigr-fixes-v1-0-8c8a2d8e8d77@linutronix.de>
On Fri, Jun 21, 2024 at 11:37:05AM +0200, Anna-Maria Behnsen wrote:
> Borislav reported a warning in the timer migration deactivate path
>
> https://lore.kernel.org/r/20240612090347.GBZmlkc5PwlVpOG6vT@fat_crate.local
>
> Sadly it doesn't reproduce directly. But with a change of timing (by
> adding a trace printk before the warning), it is possible to trigger the
> warning reliably, at least in my test setup. The problem here is a racy
> check against the group->parent pointer. This is also used in other places
> in the code, and fixing this racy usage is addressed by the first patch.
>
> While working with the code, I saw two things which could be improved
> (tracing and update of the per-CPU group wakeup value). These improvements
> are addressed by the other two patches.
>
> Patches are available here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/anna-maria/linux-devel.git timers/misc
>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-kernel@vger.kernel.org
>
> Thanks,
>
> Anna-Maria
>
> ---
This made me stare at the group creation again and I might have found
something. Does the following race look plausible to you?

            [GRP0:0]
         migrator = 0
         active   = 0
         nextevt  = KTIME_MAX
          /         \
         0         1 .. 7
      active         idle

0) The hierarchy has only 8 CPUs (a single node for now), with only CPU 0
   active.

                         [GRP1:0]
                    migrator = TMIGR_NONE
                    active   = NONE
                    nextevt  = KTIME_MAX
                                     \
             [GRP0:0]                  [GRP0:1]
          migrator = 0              migrator = TMIGR_NONE
          active   = 0              active   = NONE
          nextevt  = KTIME_MAX      nextevt  = KTIME_MAX
            /         \                    |
          0          1 .. 7                8
      active         idle                !online

1) CPU 8 is booting and creates a new node and a new top. For now it's
only connected to GRP0:1, not yet to GRP0:0. Also CPU 8 hasn't called
__tmigr_cpu_activate() on itself yet.

                         [GRP1:0]
                    migrator = TMIGR_NONE
                    active   = NONE
                    nextevt  = KTIME_MAX
                    /                \
             [GRP0:0]                  [GRP0:1]
          migrator = 0              migrator = TMIGR_NONE
          active   = 0              active   = NONE
          nextevt  = KTIME_MAX      nextevt  = KTIME_MAX
            /         \                    |
          0          1 .. 7                8
      active         idle                active

2) CPU 8 connects GRP0:0 to GRP1:0 and observes while in
tmigr_connect_child_parent() that GRP0:0 is not TMIGR_NONE. So it
prepares to call tmigr_active_up() on it. It hasn't done it yet.

                         [GRP1:0]
                    migrator = TMIGR_NONE
                    active   = NONE
                    nextevt  = KTIME_MAX
                    /                \
             [GRP0:0]                  [GRP0:1]
          migrator = TMIGR_NONE     migrator = TMIGR_NONE
          active   = NONE           active   = NONE
          nextevt  = KTIME_MAX      nextevt  = KTIME_MAX
            /         \                    |
          0          1 .. 7                8
        idle         idle                active

3) CPU 0 goes idle. Since GRP0:0->parent has been updated by CPU 8 with
GRP0:0->lock held, CPU 0 observes GRP1:0 after calling tmigr_update_events()
and it propagates the change to the top (no change there and no wakeup
programmed since there is no timer).

                         [GRP1:0]
                    migrator = GRP0:0
                    active   = GRP0:0
                    nextevt  = KTIME_MAX
                    /                \
             [GRP0:0]                  [GRP0:1]
          migrator = TMIGR_NONE     migrator = TMIGR_NONE
          active   = NONE           active   = NONE
          nextevt  = KTIME_MAX      nextevt  = KTIME_MAX
            /         \                    |
          0          1 .. 7                8
        idle         idle                active

4) Now CPU 8 finally calls tmigr_active_up() on GRP0:0, based on the state
   it observed in step 2.

                         [GRP1:0]
                    migrator = GRP0:0
                    active   = GRP0:0, GRP0:1
                    nextevt  = KTIME_MAX
                    /                \
             [GRP0:0]                  [GRP0:1]
          migrator = TMIGR_NONE     migrator = 8
          active   = NONE           active   = 8
          nextevt  = KTIME_MAX      nextevt  = KTIME_MAX
            /         \                    |
          0          1 .. 7                8
        idle         idle                active

5) Then, out of tmigr_cpu_online(), CPU 8 calls tmigr_active_up() on itself.

                         [GRP1:0]
                    migrator = GRP0:0
                    active   = GRP0:0
                    nextevt  = T8
                    /                \
             [GRP0:0]                  [GRP0:1]
          migrator = TMIGR_NONE     migrator = TMIGR_NONE
          active   = NONE           active   = NONE
          nextevt  = KTIME_MAX      nextevt  = T8
            /         \                    |
          0          1 .. 7                8
        idle         idle                idle

6) CPU 8 goes idle with a timer T8 and relies on GRP0:0 as the migrator.
   But GRP0:0 is not really active, so T8 gets ignored.
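
To make the interleaving easier to check, here is a minimal user-space sketch that replays the steps above in the racy order. It is not kernel code: struct grp, model_active_up(), model_inactive_up() and replay_racy_order() are hypothetical names for a heavily simplified model (one byte-sized bitmask per group, no locking, no timer queues), and the childmask values are chosen arbitrarily.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TMIGR_NONE 0xffu	/* stand-in for "no migrator" in this model */

/* Hypothetical, heavily simplified stand-in for struct tmigr_group. */
struct grp {
	uint8_t active;		/* bitmask of active children */
	uint8_t migrator;	/* childmask of the migrator, or TMIGR_NONE */
	uint8_t childmask;	/* this group's bit in its parent */
	struct grp *parent;
};

/* Rough analogue of tmigr_active_up(): walk up while we become migrator. */
static void model_active_up(struct grp *g, uint8_t childmask)
{
	while (g) {
		bool took_over = (g->migrator == TMIGR_NONE);

		g->active |= childmask;
		if (!took_over)
			break;
		g->migrator = childmask;
		childmask = g->childmask;
		g = g->parent;
	}
}

/* Rough analogue of the idle path: walk up while the group gets fully idle. */
static void model_inactive_up(struct grp *g, uint8_t childmask)
{
	while (g) {
		g->active = (uint8_t)(g->active & ~childmask);
		if (g->migrator != childmask)
			break;
		if (g->active) {
			/* Hand over to the lowest remaining active child. */
			g->migrator = (uint8_t)(g->active & -g->active);
			break;
		}
		g->migrator = TMIGR_NONE;
		childmask = g->childmask;
		g = g->parent;
	}
}

/* Replays steps 2 to 6; true if GRP1:0 keeps an idle GRP0:0 as migrator. */
static bool replay_racy_order(void)
{
	struct grp top = { 0, TMIGR_NONE, 0, NULL };		/* GRP1:0 */
	struct grp grp00 = { 0x01, 0x01, 0, NULL };		/* CPU 0 active */
	struct grp grp01 = { 0, TMIGR_NONE, 0x01, &top };	/* CPU 8's group */

	/* Step 2: connect GRP0:0, sample its state, then "drop the lock". */
	grp00.parent = &top;
	grp00.childmask = 0x02;
	bool child_looked_active = (grp00.migrator != TMIGR_NONE);

	/* Step 3: CPU 0 goes idle and propagates through the new parent. */
	model_inactive_up(&grp00, 0x01);

	/* Step 4: CPU 8 acts on the now stale observation. */
	if (child_looked_active)
		model_active_up(&top, grp00.childmask);

	/* Steps 5 and 6: CPU 8 goes active, then idle with T8 queued. */
	model_active_up(&grp01, 0x01);
	model_inactive_up(&grp01, 0x01);

	return top.migrator == grp00.childmask && grp00.migrator == TMIGR_NONE;
}
```

In this model replay_racy_order() returns true: the top level still names GRP0:0 as migrator although GRP0:0 itself is idle, which is the state in which nobody would handle T8.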
And if that race looks plausible, does the following fix look good?

diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
index 84413114db5c..0609cb8c770e 100644
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -1525,7 +1525,6 @@ static void tmigr_connect_child_parent(struct tmigr_group *child,
 	child->childmask = BIT(parent->num_children++);
 
 	raw_spin_unlock(&parent->lock);
-	raw_spin_unlock_irq(&child->lock);
 
 	trace_tmigr_connect_child_parent(child);
 
@@ -1559,6 +1558,14 @@ static void tmigr_connect_child_parent(struct tmigr_group *child,
 		 */
 		WARN_ON(!tmigr_active_up(parent, child, &data) && parent->parent);
 	}
+	/*
+	 * Keep the lock up to that point so that if the child goes idle
+	 * concurrently, either it sees the new parent with its active state
+	 * after locking on tmigr_update_events() and propagates afterwards
+	 * its idle state up, or the current booting CPU will observe TMIGR_NONE
+	 * on the remote child and it won't propagate a spurious active state.
+	 */
+	raw_spin_unlock_irq(&child->lock);
 }
 
 static int tmigr_setup_groups(unsigned int cpu, unsigned int node)
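
As a sanity check of the comment in the patch, the two serializations it describes can be replayed in a small user-space model. This is only a sketch under strong assumptions: struct grp and the model_*() and replay_locked() helpers are hypothetical, there is no real locking, and holding child->lock is modelled simply by never letting CPU 0's idle transition run between the check of the child state and the propagation to the parent.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TMIGR_NONE 0xffu	/* stand-in for "no migrator" in this model */

/* Hypothetical, heavily simplified stand-in for struct tmigr_group. */
struct grp {
	uint8_t active;		/* bitmask of active children */
	uint8_t migrator;	/* childmask of the migrator, or TMIGR_NONE */
	uint8_t childmask;	/* this group's bit in its parent */
	struct grp *parent;
};

/* Rough analogue of tmigr_active_up(): walk up while we become migrator. */
static void model_active_up(struct grp *g, uint8_t childmask)
{
	while (g) {
		bool took_over = (g->migrator == TMIGR_NONE);

		g->active |= childmask;
		if (!took_over)
			break;
		g->migrator = childmask;
		childmask = g->childmask;
		g = g->parent;
	}
}

/* Rough analogue of the idle path: walk up while the group gets fully idle. */
static void model_inactive_up(struct grp *g, uint8_t childmask)
{
	while (g) {
		g->active = (uint8_t)(g->active & ~childmask);
		if (g->migrator != childmask)
			break;
		if (g->active) {
			/* Hand over to the lowest remaining active child. */
			g->migrator = (uint8_t)(g->active & -g->active);
			break;
		}
		g->migrator = TMIGR_NONE;
		childmask = g->childmask;
		g = g->parent;
	}
}

/* True if GRP1:0 ends up with an idle GRP0:0 as migrator. */
static bool replay_locked(bool cpu0_idles_first)
{
	struct grp top = { 0, TMIGR_NONE, 0, NULL };		/* GRP1:0 */
	struct grp grp00 = { 0x01, 0x01, 0, NULL };		/* CPU 0 active */
	struct grp grp01 = { 0, TMIGR_NONE, 0x01, &top };	/* CPU 8's group */

	/* Case A: CPU 0 goes idle before CPU 8 takes the child lock. */
	if (cpu0_idles_first)
		model_inactive_up(&grp00, 0x01);

	/* Modelled critical section: connect, check and propagate together. */
	grp00.parent = &top;
	grp00.childmask = 0x02;
	if (grp00.migrator != TMIGR_NONE)
		model_active_up(&top, grp00.childmask);

	/* Case B: CPU 0 goes idle afterwards and sees the new parent. */
	if (!cpu0_idles_first)
		model_inactive_up(&grp00, 0x01);

	/* CPU 8 goes active, then idle with T8 queued. */
	model_active_up(&grp01, 0x01);
	model_inactive_up(&grp01, 0x01);

	return top.migrator == grp00.childmask && grp00.migrator == TMIGR_NONE;
}
```

In this model both replay_locked(true) and replay_locked(false) return false, so neither ordering leaves GRP1:0 with a stale migrator. That only covers the simplified model, not the real hierarchy walk.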