From: Paul Gortmaker <paul.gortmaker@windriver.com>
To: <linux-kernel@vger.kernel.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>,
Thomas Gleixner <tglx@linutronix.de>,
"Paul E . McKenney" <paulmck@kernel.org>,
Ingo Molnar <mingo@kernel.org>
Subject: [PATCH 1/2] sched/isolation: really align nohz_full with rcu_nocbs
Date: Mon, 21 Feb 2022 13:20:08 -0500
Message-ID: <20220221182009.1283-2-paul.gortmaker@windriver.com>
In-Reply-To: <20220221182009.1283-1-paul.gortmaker@windriver.com>
It is currently possible to sneak a core into nohz_full
that lies between nr_possible and NR_CPUS - but you won't "see" it
because cpumask_pr_args() implicitly hides anything above nr_cpu_ids.
This becomes a problem when the nohz_full CPU set doesn't contain at
least one other valid nohz CPU - in which case we end up with
tick_nohz_full_running set and no tick core specified, which trips an
endless sequence of WARN() calls and renders the machine unusable.
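To make the failure mode concrete, here is a minimal userspace sketch of the mismatch, not kernel code. The toy_* names and the bool-array mask are hypothetical stand-ins for struct cpumask. It assumes NR_CPUS=128 and nr_cpu_ids=64, as in the boot log quoted further down. A view limited to the first nr_cpu_ids bits looks empty, while the full-width mask is not:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CPUS 128	/* compile-time maximum, as in the example boot */

/* Toy stand-in for struct cpumask: one flag per CPU slot (hypothetical). */
struct toy_mask {
	bool bit[NR_CPUS];
};

/* Set CPUs first..last, inclusive. */
static void toy_set_range(struct toy_mask *m, int first, int last)
{
	assert(first >= 0 && first <= last && last < NR_CPUS);
	for (int cpu = first; cpu <= last; cpu++)
		m->bit[cpu] = true;
}

/* Count the set bits among the first nbits slots only. */
static int toy_weight(const struct toy_mask *m, int nbits)
{
	int n = 0;

	assert(nbits >= 0 && nbits <= NR_CPUS);
	for (int cpu = 0; cpu < nbits; cpu++)
		n += m->bit[cpu] ? 1 : 0;
	return n;
}

/* Model "nohz_full=96-127": every requested CPU is above nr_cpu_ids. */
static struct toy_mask toy_parse_example(void)
{
	struct toy_mask m;

	memset(&m, 0, sizeof(m));
	toy_set_range(&m, 96, 127);
	return m;
}
```

In this model toy_weight(&m, 64) sees nothing, which matches what a printout limited to nr_cpu_ids would show, while toy_weight(&m, NR_CPUS) still counts 32 requested CPUs.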
I inadvertently opened the door for this when fixing an overly
restrictive nohz_full conditional in the below Fixes: commit - and then
courtesy of my optimistic ACPI reporting nr_possible of 64 (the default
Kconfig for NR_CPUS) and the not-so-helpful implicit filtering done by
cpumask_pr_args, I unfortunately did not spot it during my testing.
So here, I don't rely on what was printed anymore, but code exactly what
our restrictions should be in order to be aligned with rcu_nocbs - which
was the original goal. Since the checks lie in "__init" code it is largely
free for us to do this anyway.
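As a rough illustration of the intended restrictions, the following userspace sketch models the two checks: trim the requested set to the possible CPUs, then reject it if nothing is left. The model_* names, the flag values and the bool-array mask are assumptions made for this example only; the real change uses cpumask_subset(), cpumask_and() and cpumask_empty() as shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 128

/* Toy stand-in for struct cpumask (hypothetical, not the kernel type). */
struct model_mask {
	bool bit[NR_CPUS];
};

/* Flags describing which of the two conditions were hit. */
enum {
	SAW_NONEXISTENT = 1,	/* request named CPUs outside the possible set */
	SAW_NO_VALID    = 2,	/* nothing was left after trimming */
};

/* Build a mask with CPUs first..last (inclusive) set. */
static struct model_mask model_range(int first, int last)
{
	struct model_mask m = { { false } };

	assert(first >= 0 && first <= last && last < NR_CPUS);
	for (int cpu = first; cpu <= last; cpu++)
		m.bit[cpu] = true;
	return m;
}

/*
 * Trim req down to the CPUs also present in possible, and report
 * which conditions were seen. The caller would bail out on
 * SAW_NO_VALID, as the patch does with its goto.
 */
static int model_filter(struct model_mask *req,
			const struct model_mask *possible)
{
	bool any = false;
	int flags = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (req->bit[cpu] && !possible->bit[cpu]) {
			flags |= SAW_NONEXISTENT;
			req->bit[cpu] = false;
		}
		if (req->bit[cpu])
			any = true;
	}
	if (!any)
		flags |= SAW_NO_VALID;
	return flags;
}
```

With possible = 0-63, a request of 96-127 reports both conditions, a request of 60-70 is trimmed to 60-63 and reports only the first, and a request of 8-15 passes untouched.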
Building with NOHZ_FULL and NR_CPUS=128 on an otherwise defconfig, and
booting with "rcu_nocbs=8-127 nohz_full=96-127" on the same 16 core T5500
Dell machine now results in the following (only relevant lines shown):
  smpboot: Allowing 64 CPUs, 48 hotplug CPUs
  setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:64 nr_node_ids:2
  housekeeping: kernel parameter 'nohz_full=' or 'isolcpus=' contains nonexistent CPUs.
  housekeeping: kernel parameter 'nohz_full=' or 'isolcpus=' has no valid CPUs.
  rcu: RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=64.
  rcu: Note: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.
  rcu: Offload RCU callbacks from CPUs: 8-63.
One can see both new housekeeping checks are triggered in the above.
The same invalid boot arg combination would have previously resulted in
an infinitely scrolling mix of WARN from all cores per tick on this box.
We may need to revisit these sanity checks when nohz_full is accepted as
a standalone "enable" keyword without a cpuset (see rcu/nohz d2cf0854d728).
Fixes: 915a2bc3c6b7 ("sched/isolation: Reconcile rcu_nocbs= and nohz_full=")
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
kernel/sched/isolation.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index b4d10815c45a..f7d2406c1f1d 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -126,6 +126,17 @@ static int __init housekeeping_setup(char *str, unsigned long flags)
 		goto free_non_housekeeping_mask;
 	}
 
+	if (!cpumask_subset(non_housekeeping_mask, cpu_possible_mask)) {
+		pr_info("housekeeping: kernel parameter 'nohz_full=' or 'isolcpus=' contains nonexistent CPUs.\n");
+		cpumask_and(non_housekeeping_mask, cpu_possible_mask,
+			    non_housekeeping_mask);
+	}
+
+	if (cpumask_empty(non_housekeeping_mask)) {
+		pr_info("housekeeping: kernel parameter 'nohz_full=' or 'isolcpus=' has no valid CPUs.\n");
+		goto free_non_housekeeping_mask;
+	}
+
 	alloc_bootmem_cpumask_var(&housekeeping_staging);
 	cpumask_andnot(housekeeping_staging,
 		       cpu_possible_mask, non_housekeeping_mask);
--
2.17.1