From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
Joel Fernandes <joel@joelfernandes.org>,
Neeraj Upadhyay <neeraj.upadhyay@amd.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
Uladzislau Rezki <urezki@gmail.com>,
Zqiang <qiang.zhang1211@gmail.com>, rcu <rcu@vger.kernel.org>,
kernel test robot <oliver.sang@intel.com>
Subject: [PATCH 2/3] rcu/nocb: Fix rcuog wake-up from offline softirq
Date: Wed, 2 Oct 2024 16:57:37 +0200 [thread overview]
Message-ID: <20241002145738.38226-3-frederic@kernel.org> (raw)
In-Reply-To: <20241002145738.38226-1-frederic@kernel.org>
After a CPU has set itself offline and before it eventually calls
rcutree_report_cpu_dead(), there are still opportunities for callbacks
to be enqueued, for example from an IRQ. When that happens on NOCB, the
rcuog wake-up is deferred through an IPI to an online CPU in order not
to call into the scheduler and risk arming the RT-bandwidth after
hrtimers have been migrated out and disabled.
But performing a synchronized IPI from an IRQ is buggy as reported in
the following scenario:
WARNING: CPU: 1 PID: 26 at kernel/smp.c:633 smp_call_function_single
Modules linked in: rcutorture torture
CPU: 1 UID: 0 PID: 26 Comm: migration/1 Not tainted 6.11.0-rc1-00012-g9139f93209d1 #1
Stopper: multi_cpu_stop+0x0/0x320 <- __stop_cpus+0xd0/0x120
RIP: 0010:smp_call_function_single
<IRQ>
swake_up_one_online
__call_rcu_nocb_wake
__call_rcu_common
? rcu_torture_one_read
call_timer_fn
__run_timers
run_timer_softirq
handle_softirqs
irq_exit_rcu
? tick_handle_periodic
sysvec_apic_timer_interrupt
</IRQ>
The periodic tick must be shutdown when the CPU is offline, just like is
done for oneshot tick. This must be fixed but this is not enough:
softirqs can happen on any hardirq tail and reproduce the above scenario.
Fix this with introducing a special deferred rcuog wake up mode when the
CPU is offline. This deferred wake up doesn't arm any timer and simply
wait for rcu_report_cpu_dead() to be called in order to flush any
pending rcuog wake up.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202409231644.4c55582d-lkp@intel.com
Fixes: 9139f93209d1 ("rcu/nocb: Fix RT throttling hrtimer armed from offline CPU")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
kernel/rcu/tree.h | 1 +
kernel/rcu/tree_nocb.h | 14 ++++++++++++--
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index a9a811d9d7a3..7ed060edd12b 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -290,6 +290,7 @@ struct rcu_data {
#define RCU_NOCB_WAKE_LAZY 2
#define RCU_NOCB_WAKE 3
#define RCU_NOCB_WAKE_FORCE 4
+#define RCU_NOCB_WAKE_OFFLINE 5
#define RCU_JIFFIES_TILL_FORCE_QS (1 + (HZ > 250) + (HZ > 500))
/* For jiffies_till_first_fqs and */
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 2fb803f863da..8648233e1717 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -295,6 +295,8 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
case RCU_NOCB_WAKE_FORCE:
if (rdp_gp->nocb_defer_wakeup < RCU_NOCB_WAKE)
mod_timer(&rdp_gp->nocb_timer, jiffies + 1);
+ fallthrough;
+ case RCU_NOCB_WAKE_OFFLINE:
if (rdp_gp->nocb_defer_wakeup < waketype)
WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
break;
@@ -562,8 +564,16 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
lazy_len = READ_ONCE(rdp->lazy_len);
if (was_alldone) {
rdp->qlen_last_fqs_check = len;
- // Only lazy CBs in bypass list
- if (lazy_len && bypass_len == lazy_len) {
+ if (cpu_is_offline(rdp->cpu)) {
+ /*
+ * Offline CPUs can't call swake_up_one_online() from IRQs. Rely
+ * on the final deferred wake-up rcutree_report_cpu_dead()
+ */
+ rcu_nocb_unlock(rdp);
+ wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_OFFLINE,
+ TPS("WakeEmptyIsDeferredOffline"));
+ } else if (lazy_len && bypass_len == lazy_len) {
+ // Only lazy CBs in bypass list
rcu_nocb_unlock(rdp);
wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_LAZY,
TPS("WakeLazy"));
--
2.46.0
next prev parent reply other threads:[~2024-10-02 14:57 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-02 14:57 [PATCH 0/3] rcu: Fix yet another wake up from offline related issue Frederic Weisbecker
2024-10-02 14:57 ` [PATCH 1/3] rcu/nocb: Use switch/case on NOCB timer state machine Frederic Weisbecker
2024-10-10 8:16 ` Boqun Feng
2024-10-10 12:55 ` Frederic Weisbecker
2024-10-02 14:57 ` Frederic Weisbecker [this message]
2024-10-09 18:23 ` [PATCH 2/3] rcu/nocb: Fix rcuog wake-up from offline softirq Joel Fernandes
2024-10-09 20:56 ` Frederic Weisbecker
2024-10-10 0:27 ` Joel Fernandes
2024-10-02 14:57 ` [PATCH 3/3] rcu: Report callbacks enqueued on offline CPU blind spot Frederic Weisbecker
2024-10-02 15:00 ` Frederic Weisbecker
2024-10-09 2:03 ` Paul E. McKenney
2024-10-09 15:13 ` Paul E. McKenney
2024-10-10 14:30 ` Joel Fernandes
2024-10-09 2:24 ` Neeraj Upadhyay
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241002145738.38226-3-frederic@kernel.org \
--to=frederic@kernel.org \
--cc=boqun.feng@gmail.com \
--cc=joel@joelfernandes.org \
--cc=linux-kernel@vger.kernel.org \
--cc=neeraj.upadhyay@amd.com \
--cc=oliver.sang@intel.com \
--cc=paulmck@kernel.org \
--cc=qiang.zhang1211@gmail.com \
--cc=rcu@vger.kernel.org \
--cc=urezki@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.