From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Frederic Weisbecker <frederic@kernel.org>,
"Paul E . McKenney" <paulmck@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Sasha Levin <sashal@kernel.org>,
fweisbec@gmail.com, mingo@kernel.org, peterz@infradead.org,
dave@stgolabs.net, will@kernel.org, dirk.behme@de.bosch.com,
tannerlove@google.com
Subject: [PATCH MANUALSEL 5.15 2/9] timers/nohz: Last resort update jiffies on nohz_full IRQ entry
Date: Mon, 13 Dec 2021 09:19:35 -0500
Message-ID: <20211213141944.352249-2-sashal@kernel.org>
In-Reply-To: <20211213141944.352249-1-sashal@kernel.org>
From: Frederic Weisbecker <frederic@kernel.org>
[ Upstream commit 53e87e3cdc155f20c3417b689df8d2ac88d79576 ]
When at least one CPU runs in nohz_full mode, a dedicated timekeeper CPU
is guaranteed to stay online and to never stop its tick.
However, in some rare cases the dedicated timekeeper may be running
with interrupts disabled for a while, such as in stop_machine.
If jiffies stop being updated, a nohz_full CPU may end up endlessly
programming the next tick in the past, taking the last jiffies update
monotonic timestamp as a stale base, resulting in a tick storm.
Here is a scenario where it matters:
0) CPU 0 is the timekeeper and CPU 1 a nohz_full CPU.
1) A stop machine callback is queued to execute somewhere.
2) CPU 0 reaches MULTI_STOP_DISABLE_IRQ while CPU 1 is still in
MULTI_STOP_PREPARE. Hence CPU 0 can't do its timekeeping duty. CPU 1
can still take IRQs.
3) CPU 1 receives an IRQ which queues a timer callback one jiffy forward.
4) On IRQ exit, CPU 1 schedules the tick one jiffy forward, taking
last_jiffies_update as a base. But last_jiffies_update hasn't been
updated for 2 jiffies since the timekeeper has interrupts disabled.
5) clockevents_program_event(), which relies on ktime_get(), observes
that the expiration is in the past and therefore programs the min
delta event on the clock.
6) The tick fires immediately, goto 3)
7) Tick storm: the nohz_full CPU is drowned in ticks and takes ages to reach
MULTI_STOP_DISABLE_IRQ, which is the only way out of this situation.
Solve this by unconditionally updating jiffies if the value is stale
on nohz_full IRQ entry. IRQs and other disturbances are expected to be
rare enough on nohz_full for the unconditional call to ktime_get() not
to actually matter.
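For illustration only, the scenario above can be reduced to a small
userspace model. This is a rough sketch, not kernel code: program_event()
is a toy stand-in for clockevents_program_event(), count_ticks() and its
update_on_irq_entry flag are hypothetical names, and both TICK_NSEC
(HZ=250 assumed) and MIN_DELTA_NSEC are made-up constants.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NSEC      4000000ULL /* one jiffy in ns, assuming HZ=250 */
#define MIN_DELTA_NSEC 1000ULL    /* hypothetical minimum clockevent delta */

/*
 * Toy stand-in for clockevents_program_event(): an expiry that is already
 * in the past is replaced by the minimum delta. Returns the fire time.
 */
static uint64_t program_event(uint64_t expires, uint64_t now)
{
	return expires <= now ? now + MIN_DELTA_NSEC : expires;
}

/*
 * Count the ticks taken by the nohz_full CPU while the timekeeper keeps
 * interrupts disabled for stalled_jiffies jiffies. The timekeeper never
 * updates last_jiffies_update in this model; only the nohz_full CPU may,
 * and only when update_on_irq_entry is set.
 */
static unsigned long count_ticks(unsigned int stalled_jiffies,
				 int update_on_irq_entry)
{
	uint64_t last_jiffies_update = 0;  /* last update by the timekeeper */
	uint64_t now = 2 * TICK_NSEC;      /* base is 2 jiffies stale (step 4) */
	uint64_t end = now + stalled_jiffies * TICK_NSEC;
	unsigned long ticks = 0;

	assert(stalled_jiffies > 0);

	while (now < end) {
		/* IRQ entry: optionally refresh the stale base locally. */
		if (update_on_irq_entry &&
		    now - last_jiffies_update >= TICK_NSEC)
			last_jiffies_update +=
				(now - last_jiffies_update) / TICK_NSEC * TICK_NSEC;

		/* IRQ exit: schedule the tick one jiffy past the base. */
		now = program_event(last_jiffies_update + TICK_NSEC, now);
		ticks++;
	}
	return ticks;
}
```

With these assumed constants, count_ticks(10, 0) counts 40000 ticks over
a 10 jiffy stall, one per minimum delta, while count_ticks(10, 1) counts
10, one per jiffy.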
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20211026141055.57358-2-frederic@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/softirq.c | 3 ++-
kernel/time/tick-sched.c | 7 +++++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 322b65d456767..41f470929e991 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -595,7 +595,8 @@ void irq_enter_rcu(void)
{
__irq_enter_raw();
- if (is_idle_task(current) && (irq_count() == HARDIRQ_OFFSET))
+ if (tick_nohz_full_cpu(smp_processor_id()) ||
+ (is_idle_task(current) && (irq_count() == HARDIRQ_OFFSET)))
tick_irq_enter();
account_hardirq_enter(current);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6bffe5af8cb11..17a283ce2b20f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1375,6 +1375,13 @@ static inline void tick_nohz_irq_enter(void)
now = ktime_get();
if (ts->idle_active)
tick_nohz_stop_idle(ts, now);
+ /*
+ * If all CPUs are idle. We may need to update a stale jiffies value.
+ * Note nohz_full is a special case: a timekeeper is guaranteed to stay
+ * alive but it might be busy looping with interrupts disabled in some
+ * rare case (typically stop machine). So we must make sure we have a
+ * last resort.
+ */
if (ts->tick_stopped)
tick_nohz_update_jiffies(now);
}
--
2.33.0