From: Anna-Maria Behnsen <anna-maria@linutronix.de>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>,
Florian Fainelli <f.fainelli@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
"Russell King (Oracle)" <linux@armlinux.org.uk>,
Joel Fernandes <joel@joelfernandes.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org, kernel-team@meta.com,
paulmck@kernel.org, mingo@kernel.org, rcu@vger.kernel.org,
neeraj.upadhyay@amd.com, urezki@gmail.com,
qiang.zhang1211@gmail.com, bigeasy@linutronix.de,
chenzhongjin@huawei.com, yangjihong1@huawei.com,
rostedt@goodmis.org, Justin Chen <justin.chen@broadcom.com>
Subject: [PATCH v2] timers/migration: Return early on deactivation
Date: Fri, 05 Apr 2024 10:53:21 +0200 [thread overview]
Message-ID: <87cyr49on2.fsf@somnus> (raw)
In-Reply-To: <Zg8nVbKHPhUJfmgc@pavilion.home>
Commit 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on
deactivation") removed the logic to return early in tmigr_update_events()
on deactivation. With this the problem with a not properly updated first
global event in a hierarchy containing only a single group was fixed.
But when having a look at this code path with a hierarchy with more than a
single level, now unnecessary work is done (example is partially copied
from the message of the commit mentioned above):
[GRP1:0]
migrator = GRP0:0
active = GRP0:0
nextevt = T0:0i, T0:1
/ \
[GRP0:0] [GRP0:1]
migrator = 0 migrator = NONE
active = 0 active = NONE
nextevt = T0i, T1 nextevt = T2
/ \ / \
0 (T0i) 1 (T1) 2 (T2) 3
active idle idle idle
0) CPU 0 is active thus its event is ignored (the letter 'i') and so are
upper levels' events. CPU 1 is idle and has the timer T1 enqueued.
CPU 2 also has a timer. The expiry order is T0 (ignored) < T1 < T2
[GRP1:0]
migrator = GRP0:0
active = GRP0:0
nextevt = T0:0i, T0:1
/ \
[GRP0:0] [GRP0:1]
migrator = NONE migrator = NONE
active = NONE active = NONE
nextevt = T1 nextevt = T2
/ \ / \
0 (T0i) 1 (T1) 2 (T2) 3
idle idle idle idle
1) CPU 0 goes idle without global event queued. Therefore KTIME_MAX is
pushed as its next expiry and its own event kept as "ignore". Without this
early return the following steps happen in tmigr_update_events() when
child = null and group = GRP0:0 :
lock(GRP0:0->lock);
timerqueue_del(GRP0:0, T0i);
unlock(GRP0:0->lock);
[GRP1:0]
migrator = NONE
active = NONE
nextevt = T0:0, T0:1
/ \
[GRP0:0] [GRP0:1]
migrator = NONE migrator = NONE
active = NONE active = NONE
nextevt = T1 nextevt = T2
/ \ / \
0 (T0i) 1 (T1) 2 (T2) 3
idle idle idle idle
2) The change now propagates up to the top. Then tmigr_update_events()
updates the group event of GRP0:0 and executes the following steps
(child = GRP0:0 and group = GRP0:0):
lock(GRP0:0->lock);
lock(GRP1:0->lock);
evt = tmigr_next_groupevt(GRP0:0); -> this removes the ignored events
in GRP0:0
... update GRP1:0 group event and timerqueue ...
unlock(GRP1:0->lock);
unlock(GRP0:0->lock);
So the dance in 1) with locking the GRP0:0->lock and removing the T0i from
the timerqueue is redundand as this is done nevertheless in 2) when
tmigr_next_groupevt(GRP0:0) is executed.
Revert commit 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on
deactivation") and add a condition into return path to skip the return
only, when hierarchy contains a single group. Adapt comments accordingly.
Fixes: 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on deactivation")
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
---
kernel/time/timer_migration.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -751,6 +751,33 @@ bool tmigr_update_events(struct tmigr_gr
first_childevt = evt = data->evt;
+ /*
+ * Walking the hierarchy is required in any case when a
+ * remote expiry was done before. This ensures to not lose
+ * already queued events in non active groups (see section
+ * "Required event and timerqueue update after a remote
+ * expiry" in the documentation at the top).
+ *
+ * The two call sites which are executed without a remote expiry
+ * before, are not prevented from propagating changes through
+ * the hierarchy by the return:
+ * - When entering this path by tmigr_new_timer(), @evt->ignore
+ * is never set.
+ * - tmigr_inactive_up() takes care of the propagation by
+ * itself and ignores the return value. But an immediate
+ * return is possible if there is a parent, sparing group
+ * locking at this level, because the upper walking call to
+ * the parent will take care about removing this event from
+ * within the group and update next_expiry accordingly.
+ *
+ * However if there is no parent, ie: the hierarchy has only a
+ * single level so @group is the top level group, make sure the
+ * first event information of the group is updated properly and
+ * also handled properly, so skip this fast return path.
+ */
+ if (evt->ignore && !remote && group->parent)
+ return true;
+
raw_spin_lock(&group->lock);
childstate.state = 0;
next prev parent reply other threads:[~2024-04-05 8:53 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-08 17:15 [GIT PULL] RCU changes for v6.9 Boqun Feng
2024-03-11 19:43 ` pr-tracker-bot
2024-03-12 20:32 ` Unexplained long boot delays [Was Re: [GIT PULL] RCU changes for v6.9] Florian Fainelli
2024-03-12 21:01 ` Frederic Weisbecker
2024-03-12 21:15 ` Paul E. McKenney
2024-03-12 21:35 ` Florian Fainelli
2024-03-12 22:05 ` Florian Fainelli
2024-03-12 21:07 ` Boqun Feng
2024-03-12 21:34 ` Florian Fainelli
2024-03-12 21:44 ` Linus Torvalds
2024-03-12 23:48 ` Boqun Feng
2024-03-13 16:01 ` Joel Fernandes
2024-03-13 21:30 ` Florian Fainelli
2024-03-13 21:59 ` Russell King (Oracle)
2024-03-13 22:04 ` Florian Fainelli
2024-03-13 22:49 ` Russell King (Oracle)
2024-03-13 23:29 ` Florian Fainelli
2024-03-14 1:15 ` Linus Torvalds
2024-03-14 1:22 ` Florian Fainelli
2024-03-13 22:52 ` Frederic Weisbecker
2024-03-14 3:44 ` Florian Fainelli
2024-03-14 5:12 ` Boqun Feng
2024-03-14 6:33 ` Boqun Feng
2024-03-14 9:32 ` Thomas Gleixner
2024-03-14 9:11 ` Thomas Gleixner
2024-03-14 10:41 ` Frederic Weisbecker
2024-03-14 18:35 ` Florian Fainelli
2024-03-14 18:51 ` Boqun Feng
2024-03-14 19:09 ` Florian Fainelli
2024-03-14 20:45 ` Thomas Gleixner
2024-03-14 21:21 ` Thomas Gleixner
2024-03-14 21:53 ` Florian Fainelli
2024-03-14 22:51 ` Thomas Gleixner
2024-03-14 21:58 ` Thomas Gleixner
2024-03-14 22:05 ` Boqun Feng
2024-03-14 22:10 ` Boqun Feng
2024-03-15 1:14 ` [PATCH] timer/migration: Remove buggy early return on deactivation [was Re: Unexplained long boot delays [Was Re: [GIT PULL] RCU changes for v6.9]] Frederic Weisbecker
2024-03-15 1:20 ` Frederic Weisbecker
2024-03-15 13:44 ` Florian Fainelli
2024-03-16 19:06 ` [tip: timers/urgent] timer/migration: Remove buggy early return on deactivation tip-bot2 for Frederic Weisbecker
2024-03-26 16:41 ` [PATCH] timer/migration: Remove buggy early return on deactivation [was Re: Unexplained long boot delays [Was Re: [GIT PULL] RCU changes for v6.9]] Anna-Maria Behnsen
2024-03-26 17:18 ` Frederic Weisbecker
2024-04-04 16:50 ` [PATCH] timers/migration: Return early on deactivation Anna-Maria Behnsen
2024-04-04 22:19 ` Frederic Weisbecker
2024-04-05 8:53 ` Anna-Maria Behnsen [this message]
2024-04-05 9:11 ` [tip: timers/urgent] " tip-bot2 for Anna-Maria Behnsen
2024-03-14 9:03 ` Unexplained long boot delays [Was Re: [GIT PULL] RCU changes for v6.9] Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87cyr49on2.fsf@somnus \
--to=anna-maria@linutronix.de \
--cc=bigeasy@linutronix.de \
--cc=boqun.feng@gmail.com \
--cc=chenzhongjin@huawei.com \
--cc=f.fainelli@gmail.com \
--cc=frederic@kernel.org \
--cc=joel@joelfernandes.org \
--cc=justin.chen@broadcom.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=mingo@kernel.org \
--cc=neeraj.upadhyay@amd.com \
--cc=paulmck@kernel.org \
--cc=qiang.zhang1211@gmail.com \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=urezki@gmail.com \
--cc=yangjihong1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.