Linux Documentation

* Re: [PATCH] docs: kbuild: remove ISDN references in Makefile examples
From: Nathan Chancellor @ 2026-06-18  3:18 UTC
  To: Ethan Nelson-Moore
  Cc: Shuah Khan, Chen Pei, Randy Dunlap, Jonathan Corbet, linux-kbuild,
	linux-doc, Nathan Chancellor, Nicolas Schier
In-Reply-To: <20260613232830.147116-1-enelsonmoore@gmail.com>

On Sat, 13 Jun 2026 16:28:27 -0700, Ethan Nelson-Moore <enelsonmoore@gmail.com> wrote:
> Documentation/kbuild/makefiles.rst uses some extracts from now-removed
> ISDN code as examples. While they are harmless, they appeared in my
> checks for CONFIG_* symbols referenced but not defined in the kernel.
> Replace them with generic examples.

While I am fine with adjusting these examples to make it easier on tools
such as yours, how does this solve your problem? CONFIG_FOO and
CONFIG_BAR are still not defined anywhere. Are you adding exceptions for
these symbols? I ask because I would like these to be a little more
"kernel specific" if that makes sense.

Maybe it is not even worth checking Documentation/ for dead
configurations, since that is rarely going to be a bug, but I guess it
helps with cleaning up dead documentation?
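
For readers unfamiliar with this kind of check, here is a minimal sketch of how "referenced but not defined" symbols could be found. It is not the tool discussed in this thread; the function name undefined_config_symbols and its two arguments are hypothetical, and it assumes GNU grep and that symbols are defined by "config NAME" or "menuconfig NAME" lines in files named Kconfig*.

```shell
# undefined_config_symbols FILE TREE (hypothetical helper):
# print every CONFIG_* symbol mentioned in FILE that has no matching
# "config NAME" / "menuconfig NAME" entry in a Kconfig* file under TREE.
undefined_config_symbols() (
	doc=$1 tree=$2
	grep -oE 'CONFIG_[A-Z0-9_]+' "$doc" | sort -u | while read -r sym; do
		name=${sym#CONFIG_}
		grep -rqE --include='Kconfig*' \
			"^[[:space:]]*(menu)?config[[:space:]]+$name\$" "$tree" ||
			echo "$sym"
	done
)
```

With such a check, CONFIG_FOO and CONFIG_DRIVER_ONE would both be reported unless the tool carries an exception list, which is the point of the question above.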

>
>
> diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst
> index 7521cae7d56f..ec8de1c20834 100644
> --- a/Documentation/kbuild/makefiles.rst
> +++ b/Documentation/kbuild/makefiles.rst
> @@ -127,11 +127,8 @@ controllers are detected, and thus your disks are renumbered.
>  
>  Example::
>  
> -  #drivers/isdn/i4l/Makefile
> -  # Makefile for the kernel ISDN subsystem and device drivers.
> -  # Each configuration option enables a list of files.

I think I would keep these comments; they are still relevant (at least
to me).

> -  obj-$(CONFIG_ISDN_I4L)         += isdn.o
> -  obj-$(CONFIG_ISDN_PPP_BSDCOMP) += isdn_bsdcomp.o
> +  obj-$(CONFIG_FOO) += foo.o
> +  obj-$(CONFIG_BAR) += bar.o

For instance, I think using a more descriptive symbol illustrates the
example a little better.

  obj-$(CONFIG_DRIVER_ONE) += driver_one.o
  obj-$(CONFIG_DRIVER_TWO) += driver_two.o

Same thing for the other examples. I just don't find these variable
names to be particularly good when illustrating actual real world
examples as opposed to conceptual ones. Not sure if others feel the same
way.

-- 
Cheers,
Nathan


* Re: [PATCH v2 06/11] hugetlb: make hugetlb_fault_mutex_hash() to take PAGE_SIZE index
From: Matthew Wilcox @ 2026-06-18  3:17 UTC
  To: Jane Chu
  Cc: akpm, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-7-jane.chu@oracle.com>

On Wed, Jun 17, 2026 at 11:25:27AM -0600, Jane Chu wrote:
> Make hugetlb_fault_mutex_hash() take a PAGE_SIZE-based index.
> This makes the helper interface consistent with filemap_get_folio()
> and linear_page_index(), while preserving the same lock selection for
> a given hugetlb file offset.

Oh, hah.

I don't know that there's a better way to do this than the way you've
done it.  Unless we can just remove the fault mutex hash and use the
invalidate_lock instead.


* [PATCH v3 13/13] selftests/cgroup: Add kernel-noise isolation test to cpuset selftest
From: Jing Wu @ 2026-06-18  3:11 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Add test_hk_noise_isolated() to test_cpuset_prs.sh to verify that
creating and destroying an isolated cpuset partition updates both the
domain isolation state and the kernel-noise (nohz_full) state.

For domain isolation, the test checks cpuset.cpus.isolated before and
after the partition create/destroy cycle.

For kernel-noise isolation, the test reads
/sys/devices/system/cpu/nohz_full to confirm that the CPUs placed in
an isolated partition appear in the nohz_full mask while the partition
is active, and are removed from it once the partition is destroyed.
This sysfs attribute only exists when CONFIG_NO_HZ_FULL is enabled;
the nohz_full checks are skipped when it is absent so the test remains
usable on kernels without NO_HZ_FULL.

Add cpu_in_cpulist() to correctly determine whether a CPU number falls
within a kernel cpulist string (e.g. "4-7").  A plain grep cannot
detect membership in the interior of a range; cpu_in_cpulist() walks
each comma-separated element and handles both single values and
lo-hi ranges explicitly.
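
The difference between a plain grep and a range walk can be seen in isolation with the sketch below. It is a POSIX sh restatement of the idea, not the exact helper added by this patch, and it assumes the simple "N" and "lo-hi" cpulist elements described above (no stride syntax).

```shell
# cpu_in_cpulist CPU LIST: succeed if CPU is covered by a cpulist such as
# "0-3,8-31".  The body runs in a subshell so IFS changes do not leak.
cpu_in_cpulist() (
	cpu=$1
	IFS=,
	for item in $2; do
		case $item in
		*-*)
			lo=${item%-*}
			hi=${item#*-}
			[ "$cpu" -ge "$lo" ] && [ "$cpu" -le "$hi" ] && exit 0
			;;
		*)
			[ "$cpu" -eq "$item" ] && exit 0
			;;
		esac
	done
	exit 1
)

list="0-3,8-31"
# "10" never appears as text in the list, so a word grep cannot find it.
echo "$list" | grep -qw 10 || echo "grep misses CPU 10"
cpu_in_cpulist 10 "$list" && echo "range walk finds CPU 10"
```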

The test also covers: rejection of all-CPU isolation, the SMT sibling
constraint, nested partition inheritance, and a 100-cycle pressure test.
nohz_full is verified to be restored to its pre-test value after each
create/destroy cycle and after the pressure test.

Also drop the spurious -e flag from the awk invocation.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 tools/testing/selftests/cgroup/test_cpuset_prs.sh | 204 +++++++++++++++++++++-
 1 file changed, 203 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index a56f4153c64df..047db14953fac 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -20,7 +20,7 @@ skip_test() {
 WAIT_INOTIFY=$(cd $(dirname $0); pwd)/wait_inotify
 
 # Find cgroup v2 mount point
-CGROUP2=$(mount -t cgroup2 | head -1 | awk -e '{print $3}')
+CGROUP2=$(mount -t cgroup2 | head -1 | awk '{print $3}')
 [[ -n "$CGROUP2" ]] || skip_test "Cgroup v2 mount point not found!"
 SUBPARTS_CPUS=$CGROUP2/.__DEBUG__.cpuset.cpus.subpartitions
 CPULIST=$(cat $CGROUP2/cpuset.cpus.effective)
@@ -1204,9 +1204,211 @@ test_inotify()
 	echo "" > cpuset.cpus
 }
 
+#
+# cpu_in_cpulist <cpu> <cpulist>
+#
+# Return 0 if <cpu> appears in <cpulist> (a kernel cpumask list such as
+# "0-3,8-31"), non-zero otherwise.  The kernel cpulist format uses ranges
+# ("lo-hi") and comma-separated items; a simple grep cannot detect that a
+# number falls in the middle of a range, so walk each element explicitly.
+#
+cpu_in_cpulist()
+{
+	local cpu=$1 list=$2 range lo hi
+	for range in $(echo "$list" | tr ',' ' '); do
+		if [[ "$range" == *-* ]]; then
+			lo=${range%-*}
+			hi=${range#*-}
+			[[ $cpu -ge $lo && $cpu -le $hi ]] && return 0
+		else
+			[[ $cpu -eq $range ]] && return 0
+		fi
+	done
+	return 1
+}
+
+#
+# Test that isolated partition creation/destruction drives kernel-noise
+# housekeeping mask updates and remains correct under pressure.
+#
+# Requires: >=8 CPUs, no isolcpus= boot conflict, root
+#
+test_hk_noise_isolated()
+{
+	local ISOL_BEFORE TEST_CPUS i PART ISOL_AFTER ISOL_RESTORE
+	local NOHZ_FILE NOHZ_BEFORE NOHZ_AFTER NOHZ_RESTORE
+	local HK_NOHZ_CHECK=0
+	local LOOPS=100
+
+	[[ $NR_CPUS -ge 8 ]] || {
+		echo "HK-noise test skipped: need >=8 CPUs, have $NR_CPUS"
+		return 0
+	}
+
+	# Detect whether CONFIG_NO_HZ_FULL is active: the sysfs attribute
+	# /sys/devices/system/cpu/nohz_full exposes the current nohz_full
+	# cpumask and is only present when NO_HZ_FULL is enabled.
+	NOHZ_FILE=/sys/devices/system/cpu/nohz_full
+	[[ -r "$NOHZ_FILE" ]] && HK_NOHZ_CHECK=1
+
+	cd $CGROUP2/test
+	echo member > cpuset.cpus.partition 2>/dev/null
+	echo "" > cpuset.cpus 2>/dev/null
+
+	ISOL_BEFORE=$(cat $CGROUP2/cpuset.cpus.isolated)
+	[[ $HK_NOHZ_CHECK -eq 1 ]] && NOHZ_BEFORE=$(cat $NOHZ_FILE)
+	TEST_CPUS="4-7"
+	echo $TEST_CPUS > cpuset.cpus
+
+	#
+	# Basic create/destroy cycle — verify domain isolation and
+	# kernel-noise (nohz_full) changes together.
+	#
+	console_msg "HK-noise: basic create/destroy cycle"
+	echo isolated > cpuset.cpus.partition
+
+	ISOL_AFTER=$(cat $CGROUP2/cpuset.cpus.isolated)
+	[[ $ISOL_AFTER != "$ISOL_BEFORE" ]] || {
+		echo "FAIL: isolated set unchanged after partition create"
+		exit 1
+	}
+
+	if [[ $HK_NOHZ_CHECK -eq 1 ]]; then
+		NOHZ_AFTER=$(cat $NOHZ_FILE)
+		# Verify that the newly isolated CPUs (4-7) appear in nohz_full.
+		# nohz_full = inverse of housekeeping, so isolating 4-7 should
+		# add them to nohz_full.
+		for cpu in 4 5 6 7; do
+			if ! cpu_in_cpulist $cpu "$NOHZ_AFTER"; then
+				echo "FAIL: cpu${cpu} not in nohz_full after isolation" \
+				     "(got: '$NOHZ_AFTER')"
+				exit 1
+			fi
+		done
+		console_msg "HK-noise: nohz_full after isolation: $NOHZ_AFTER"
+	fi
+
+	echo member > cpuset.cpus.partition
+
+	ISOL_RESTORE=$(cat $CGROUP2/cpuset.cpus.isolated)
+	[[ $ISOL_RESTORE = "$ISOL_BEFORE" ]] || {
+		echo "FAIL: expected '$ISOL_BEFORE' after destroy, got '$ISOL_RESTORE'"
+		exit 1
+	}
+
+	if [[ $HK_NOHZ_CHECK -eq 1 ]]; then
+		NOHZ_RESTORE=$(cat $NOHZ_FILE)
+		[[ "$NOHZ_RESTORE" = "$NOHZ_BEFORE" ]] || {
+			echo "FAIL: nohz_full not restored: expected '$NOHZ_BEFORE'," \
+			     "got '$NOHZ_RESTORE'"
+			exit 1
+		}
+	fi
+
+	#
+	# Reject all-CPU isolation (must leave at least one housekeeping CPU)
+	#
+	console_msg "HK-noise: reject all-CPU isolation"
+	echo 0-$((NR_CPUS - 1)) > cpuset.cpus
+	echo isolated > cpuset.cpus.partition
+	PART=$(cat cpuset.cpus.partition)
+	[[ $PART = *invalid* || $PART = member ]] || {
+		echo "FAIL: all-CPU isolation was not rejected, got '$PART'"
+		exit 1
+	}
+
+	#
+	# SMT safety: partial sibling isolation
+	#
+	console_msg "HK-noise: SMT sibling constraint"
+	echo $TEST_CPUS > cpuset.cpus
+	echo isolated > cpuset.cpus.partition
+	PART=$(cat cpuset.cpus.partition)
+	[[ $PART = isolated ]] || {
+		echo "FAIL: could not create isolated partition, got '$PART'"
+		exit 1
+	}
+	echo member > cpuset.cpus.partition
+
+	#
+	# Nested partition: parent root → child isolated
+	#
+	console_msg "HK-noise: nested partition inheritance"
+	echo $TEST_CPUS > cpuset.cpus
+	test_partition root
+	mkdir -p HK_SUB
+	cd HK_SUB
+	echo 4-5 > cpuset.cpus
+	echo isolated > cpuset.cpus.partition
+	ISOL_AFTER=$(cat $CGROUP2/cpuset.cpus.isolated)
+	[[ -n $ISOL_AFTER ]] || {
+		echo "FAIL: nested isolated partition not reflected in cpuset.cpus.isolated"
+		exit 1
+	}
+	echo member > cpuset.cpus.partition
+	cd $CGROUP2/test
+	echo member > cpuset.cpus.partition
+	rmdir HK_SUB 2>/dev/null
+
+	#
+	# Pressure test: 100 create/destroy cycles
+	#
+	console_msg "HK-noise: pressure test ($LOOPS cycles)"
+	echo $TEST_CPUS > cpuset.cpus
+	for i in $(seq 1 $LOOPS); do
+		echo isolated > cpuset.cpus.partition
+		PART=$(cat cpuset.cpus.partition)
+		[[ $PART = isolated ]] || {
+			echo "FAIL: cycle $i create failed, got '$PART'"
+			exit 1
+		}
+		echo member > cpuset.cpus.partition
+		PART=$(cat cpuset.cpus.partition)
+		[[ $PART = member ]] || {
+			echo "FAIL: cycle $i destroy failed, got '$PART'"
+			exit 1
+		}
+	done
+
+	#
+	# Stability: after pressure test, verify final state
+	#
+	console_msg "HK-noise: post-pressure cleanup"
+	echo isolated > cpuset.cpus.partition
+	ISOL_AFTER=$(cat $CGROUP2/cpuset.cpus.isolated)
+	[[ -n $ISOL_AFTER ]] || {
+		echo "FAIL: isolated set empty after pressure test"
+		exit 1
+	}
+	echo member > cpuset.cpus.partition
+	echo "" > cpuset.cpus
+	ISOL_RESTORE=$(cat $CGROUP2/cpuset.cpus.isolated)
+	[[ $ISOL_RESTORE = "$ISOL_BEFORE" ]] || {
+		echo "FAIL: final isolated '$ISOL_RESTORE' != '$ISOL_BEFORE'"
+		exit 1
+	}
+
+	if [[ $HK_NOHZ_CHECK -eq 1 ]]; then
+		NOHZ_RESTORE=$(cat $NOHZ_FILE)
+		[[ "$NOHZ_RESTORE" = "$NOHZ_BEFORE" ]] || {
+			echo "FAIL: nohz_full not restored after pressure test:" \
+			     "expected '$NOHZ_BEFORE', got '$NOHZ_RESTORE'"
+			exit 1
+		}
+	fi
+
+	cd $CGROUP2
+	if [[ $HK_NOHZ_CHECK -eq 1 ]]; then
+		console_msg "HK-noise: PASSED (with nohz_full verification)"
+	else
+		console_msg "HK-noise: PASSED (nohz_full skipped: CONFIG_NO_HZ_FULL not active)"
+	fi
+}
+
 trap cleanup 0 2 3 6
 run_state_test TEST_MATRIX
 run_remote_state_test REMOTE_TEST_MATRIX
 test_isolated
 test_inotify
+test_hk_noise_isolated
 echo "All tests PASSED."

-- 
2.43.0



* [PATCH v3 12/13] docs: cgroup-v2: Document kernel-noise isolation via isolated partitions
From: Jing Wu @ 2026-06-18  3:11 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Document that cpuset.cpus.partition=isolated now drives runtime updates
of the housekeeping masks for kernel-noise types: nohz_full (tick
suppression), RCU NOCB offloading, and managed IRQ migration.  No
additional cgroupfs files are required; the partition update path
automatically triggers explicit housekeeping callbacks for all affected
subsystems.
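
As an illustration of the user-visible workflow, the sketch below creates an isolated partition through the existing cgroup v2 files. The helper name isolate_cpus and the cgroup name hk_demo are hypothetical; the sketch assumes root privileges, a cgroup v2 mount passed as the first argument, and a parent cgroup that is allowed to host partitions.

```shell
# isolate_cpus ROOT NAME CPUS (hypothetical helper): create child cgroup
# NAME under the cgroup v2 mount ROOT, assign CPUS to it, and request an
# isolated partition.  Prints the resulting partition state.
isolate_cpus() (
	root=$1 name=$2 cpus=$3
	# Make the cpuset controller available to children of ROOT.
	echo +cpuset > "$root/cgroup.subtree_control" || exit 1
	mkdir -p "$root/$name" || exit 1
	echo "$cpus" > "$root/$name/cpuset.cpus" || exit 1
	echo isolated > "$root/$name/cpuset.cpus.partition" || exit 1
	# Reads back "isolated" on success, "isolated invalid (...)" otherwise.
	cat "$root/$name/cpuset.cpus.partition"
)

# Example (as root): isolate_cpus /sys/fs/cgroup hk_demo 4-7
```

After a successful call, cpuset.cpus.isolated in the root cgroup should list the CPUs, and with this series applied /sys/devices/system/cpu/nohz_full is expected to include them as well.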

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 Documentation/admin-guide/cgroup-v2.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 6efd0095ed995..7c3b048e75cb5 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2721,6 +2721,14 @@ Cpuset Interface Files
 	kernel boot command line option.  If those CPUs are to be put
 	into a partition, they have to be used in an isolated partition.
 
+	When an isolated partition is created or destroyed, the kernel
+	automatically drives runtime updates of the housekeeping masks
+	for kernel-noise types (nohz_full, RCU NOCB, managed
+	interrupts).  This extends isolation beyond scheduler domains:
+	the tick is stopped on isolated CPUs, RCU callbacks are
+	offloaded to housekeeping cores, and managed interrupts are
+	migrated away.  No additional cgroupfs files are required.
+
 
 Device controller
 -----------------

-- 
2.43.0



* [PATCH v3 11/13] cgroup/cpuset: Extend isolated partition to trigger kernel-noise isolation
From: Jing Wu @ 2026-06-18  3:11 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

When a cpuset isolated partition is created or destroyed, also drive
kernel-noise housekeeping types (HK_TYPE_KERNEL_NOISE and
HK_TYPE_MANAGED_IRQ) through housekeeping_update_types().  The sched
domain mask (HK_TYPE_DOMAIN) is updated first via the existing
housekeeping_update() call, then the explicit callback chain in
housekeeping_update_types() invokes subsystem apply() handlers to
toggle nohz_full, managed IRQ migration, and RCU NOCB offloading.

The update runs outside cpuset_mutex and cpus_read_lock, protected
only by cpuset_top_mutex.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 kernel/cgroup/cpuset.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 5c33ab20cc208..67b93bd4d58f2 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1347,17 +1347,30 @@ static void cpuset_update_sd_hk_unlock(void)
 		rebuild_sched_domains_locked();
 
 	if (update_housekeeping) {
+		static const unsigned long noise_types =
+			BIT(HK_TYPE_KERNEL_NOISE) | BIT(HK_TYPE_MANAGED_IRQ);
+
 		update_housekeeping = false;
 		cpumask_copy(isolated_hk_cpus, isolated_cpus);
 
-		/*
-		 * housekeeping_update() is now called without holding
-		 * cpus_read_lock and cpuset_mutex. Only cpuset_top_mutex
-		 * is still being held for mutual exclusion.
-		 */
 		mutex_unlock(&cpuset_mutex);
 		cpus_read_unlock();
+
+		/*
+		 * Update the sched domain mask first; it must succeed
+		 * before the kernel-noise types because workqueue flush
+		 * and timer migration depend on the sched domain mask.
+		 */
 		WARN_ON_ONCE(housekeeping_update(isolated_hk_cpus));
+
+		/*
+		 * Drive kernel-noise types through the new explicit
+		 * callback chain.  Tick/rcu/genirq subtypes react
+		 * through their registered housekeeping_cbs apply()
+		 * handlers.
+		 */
+		WARN_ON_ONCE(housekeeping_update_types(noise_types,
+						       isolated_hk_cpus));
 		mutex_unlock(&cpuset_top_mutex);
 	} else {
 		cpuset_full_unlock();

-- 
2.43.0



* [PATCH v3 10/13] sched: Guard sched_tick_start/stop against uninitialized tick_work_cpu
From: Jing Wu @ 2026-06-18  3:11 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

sched_tick_start() and sched_tick_stop() are called during CPU hotplug
for CPUs not in the HK_TYPE_KERNEL_NOISE set.  They dereference
tick_work_cpu, which is allocated by sched_tick_offload_init().  That
function is only called from housekeeping_init(), and only when
nohz_full= is present at boot.

When the DHM subsystem first-enables HK_TYPE_KERNEL_NOISE at runtime via
housekeeping_update_types(), tick_work_cpu remains NULL because
sched_tick_offload_init() is __init-only and cannot be re-invoked.  A
subsequent CPU offline/online cycle for an isolated CPU triggers
WARN_ON_ONCE(!tick_work_cpu) followed by a NULL-pointer dereference in
per_cpu_ptr(tick_work_cpu, cpu), crashing the kernel.

Since nohz_full= was not active at boot, tick_nohz_full_running remains
false and the tick-offload infrastructure is never activated; isolated
CPUs continue to receive their own ticks.  Guard both helpers with an
additional !tick_work_cpu check so they become no-ops in this case.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 kernel/sched/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 371b509d92164..df004e3efca70 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5778,7 +5778,7 @@ static void sched_tick_start(int cpu)
 	int os;
 	struct tick_work *twork;

-	if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
+	if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE) || !tick_work_cpu)
 		return;

 	WARN_ON_ONCE(!tick_work_cpu);
@@ -5799,7 +5799,7 @@ static void sched_tick_stop(int cpu)
 	struct tick_work *twork;
 	int os;

-	if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
+	if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE) || !tick_work_cpu)
 		return;

 	WARN_ON_ONCE(!tick_work_cpu);

-- 
2.43.0


* [PATCH v3 09/13] watchdog/lockup_detector: Register housekeeping callback for kernel-noise
From: Jing Wu @ 2026-06-18  3:11 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Initialize watchdog_cpumask from HK_TYPE_KERNEL_NOISE rather than
HK_TYPE_TIMER at boot, so the initial mask already reflects any CPUs
excluded by nohz_full= on the kernel command line.

Register a housekeeping_cbs so watchdog_cpumask stays in sync with
HK_TYPE_KERNEL_NOISE when isolation boundaries change at runtime via
cpuset isolated partitions.  The apply() callback copies the new
housekeeping mask into watchdog_cpumask and triggers
__lockup_detector_reconfigure() to restart watchdog threads on the
updated CPU set.

When nohz_full= is absent at boot, tick_nohz_full_running remains
false and DHM isolated partitions do not activate tick suppression.
In that case watchdog_hk_apply() is a no-op: there is no need to
reconfigure the watchdog CPU set because the full nohz_full
infrastructure was never initialized.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 kernel/watchdog.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 87dd5e0f6968d..998ad94da4cb9 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -1389,7 +1389,7 @@ void __init lockup_detector_init(void)
 		pr_info("Disabling watchdog on nohz_full cores by default\n");
 
 	cpumask_copy(&watchdog_cpumask,
-		     housekeeping_cpumask(HK_TYPE_TIMER));
+		     housekeeping_cpumask(HK_TYPE_KERNEL_NOISE));
 
 	if (!watchdog_hardlockup_probe())
 		watchdog_hardlockup_available = true;
@@ -1398,3 +1398,57 @@ void __init lockup_detector_init(void)
 
 	lockup_detector_setup();
 }
+
+/*
+ * Watchdog housekeeping callback: resync watchdog_cpumask with
+ * HK_TYPE_KERNEL_NOISE when isolation boundaries change at runtime.
+ */
+#ifdef CONFIG_CPU_ISOLATION
+static void watchdog_hk_apply(enum hk_type type)
+{
+	const struct cpumask *hk;
+
+	/*
+	 * When nohz_full= was not given at boot, tick_nohz_full_running
+	 * remains false and the full nohz_full infrastructure was never
+	 * initialised.  DHM isolated partitions do not activate tick
+	 * suppression in that case, so there is no need to reconfigure the
+	 * watchdog CPU set.
+	 */
+#ifdef CONFIG_NO_HZ_FULL
+	if (!READ_ONCE(tick_nohz_full_running))
+		return;
+#endif
+
+	hk = housekeeping_cpumask(HK_TYPE_KERNEL_NOISE);
+	if (mutex_trylock(&watchdog_mutex)) {
+		cpumask_copy(&watchdog_cpumask, hk);
+		__lockup_detector_reconfigure(false);
+		mutex_unlock(&watchdog_mutex);
+	}
+}
+
+static int watchdog_hk_validate(enum hk_type type,
+				const struct cpumask *cur_mask,
+				const struct cpumask *new_mask)
+{
+	return 0;
+}
+
+static struct housekeeping_cbs watchdog_hk_cbs = {
+	.name		= "watchdog",
+	.pre_validate	= watchdog_hk_validate,
+	.apply		= watchdog_hk_apply,
+};
+
+static int __init watchdog_hk_init(void)
+{
+	int ret;
+
+	ret = housekeeping_register_cbs(HK_TYPE_KERNEL_NOISE, &watchdog_hk_cbs);
+	if (ret)
+		pr_debug("watchdog: hk callback registration skipped (%d)\n", ret);
+	return 0;
+}
+late_initcall(watchdog_hk_init);
+#endif

-- 
2.43.0



* [PATCH v3 08/13] genirq: Add explicit housekeeping callback for managed IRQ migration
From: Jing Wu @ 2026-06-18  3:11 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Register a housekeeping callback for HK_TYPE_MANAGED_IRQ.  When the
mask changes, iterate all active managed interrupts, intersect their
current affinity mask with the new housekeeping mask, and re-apply
with irq_do_set_affinity().  Managed interrupts on CPUs removed from
the housekeeping set are migrated to remaining housekeeping CPUs.

Only managed interrupts (IRQF_AFFINITY_MANAGED) are selected because
the kernel owns their affinity; user-controlled IRQ affinities must
not be overridden by the housekeeping layer.

The new HK_TYPE_MANAGED_IRQ cpumask is snapshotted once under an RCU
read lock before the IRQ loop, satisfying the lockdep annotation in
housekeeping_cpumask() for runtime-mutable types.

When the intersection of the IRQ's current affinity and the new
housekeeping mask is non-empty, irq_do_set_affinity() moves the IRQ
to the restricted set.  If the intersection is empty (all CPUs that
were serving this IRQ are now isolated), the affinity update is skipped
and the IRQ continues to run on the isolated CPU temporarily.  Full
support for the IRQ shutdown / re-startup path (when all serving CPUs
become isolated) is left for follow-up work.

Guarded by irq_lock_sparse() and per-descriptor raw_spin_lock to
prevent races with concurrent affinity changes.
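
The intersect-or-skip rule above can be worked through with plain bitmasks. The sketch below is only a userspace illustration: new_irq_affinity is a hypothetical name, masks are passed as hex numbers with bit N standing for CPU N, and the additional cpu_online_mask check done by the patch is left out for brevity.

```shell
# new_irq_affinity AFFINITY HK_MASK (hypothetical helper): print the
# restricted affinity (AFFINITY & HK_MASK) as hex, or fail when no
# housekeeping CPU serves the IRQ, in which case the update is skipped.
new_irq_affinity() (
	both=$(( $1 & $2 ))
	[ "$both" -ne 0 ] || exit 1
	printf '0x%x\n' "$both"
)

# IRQ on CPUs 0-7, housekeeping CPUs 0-3: moved to the restricted set.
new_irq_affinity 0xff 0x0f
# IRQ only on CPUs 4-7, all of them now isolated: left where it is.
new_irq_affinity 0xf0 0x0f || echo "empty intersection, affinity unchanged"
```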

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 kernel/irq/manage.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 2e80724378267..ea97f455eab2a 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -2801,3 +2801,89 @@ bool irq_check_status_bit(unsigned int irq, unsigned int bitmask)
 	return res;
 }
 EXPORT_SYMBOL_GPL(irq_check_status_bit);
+
+/*
+ * Managed IRQ housekeeping callback: iterate all managed IRQs and ask
+ * the chip to move them off CPUs newly removed from HK_TYPE_MANAGED_IRQ.
+ */
+static void irq_hk_apply(enum hk_type type)
+{
+	cpumask_var_t hk_mask;
+	struct irq_desc *desc;
+	unsigned int irq;
+
+	if (!alloc_cpumask_var(&hk_mask, GFP_KERNEL))
+		return;
+
+	/*
+	 * Snapshot the new HK_TYPE_MANAGED_IRQ mask under an RCU read lock
+	 * before iterating IRQ descriptors.  The lockdep annotation in
+	 * housekeeping_cpumask() requires an RCU read-side critical section
+	 * for runtime-mutable types.
+	 */
+	rcu_read_lock();
+	cpumask_copy(hk_mask, housekeeping_cpumask_rcu(HK_TYPE_MANAGED_IRQ));
+	rcu_read_unlock();
+
+	irq_lock_sparse();
+
+	for_each_active_irq(irq) {
+		desc = irq_to_desc(irq);
+		if (!desc || !desc->action)
+			continue;
+
+		/*
+		 * Only managed interrupts are selected: they have
+		 * IRQF_AFFINITY_MANAGED set, meaning the kernel owns their
+		 * affinity.  User-controlled IRQs are intentionally skipped.
+		 *
+		 * When the intersection of the current affinity mask and the
+		 * new housekeeping mask is non-empty, re-apply the restricted
+		 * affinity to migrate the IRQ away from newly isolated CPUs.
+		 * If the intersection is empty (all serving CPUs are now
+		 * isolated), the IRQ is left on its current CPU temporarily;
+		 * handling that case (IRQ shutdown / re-startup) is left for
+		 * a follow-up.
+		 */
+		if (irqd_affinity_is_managed(&desc->irq_data)) {
+			const struct cpumask *mask;
+			struct cpumask *tmp = this_cpu_ptr(&__tmp_mask);
+
+			raw_spin_lock_irq(&desc->lock);
+			mask = irq_data_get_affinity_mask(&desc->irq_data);
+			cpumask_and(tmp, mask, hk_mask);
+			if (cpumask_intersects(tmp, cpu_online_mask))
+				irq_do_set_affinity(&desc->irq_data, tmp, false);
+			raw_spin_unlock_irq(&desc->lock);
+		}
+	}
+
+	irq_unlock_sparse();
+	free_cpumask_var(hk_mask);
+}
+
+static int irq_hk_validate(enum hk_type type,
+			   const struct cpumask *cur_mask,
+			   const struct cpumask *new_mask)
+{
+	if (!IS_ENABLED(CONFIG_SMP))
+		return -EOPNOTSUPP;
+	return 0;
+}
+
+static struct housekeeping_cbs irq_hk_cbs = {
+	.name		= "genirq/managed",
+	.pre_validate	= irq_hk_validate,
+	.apply		= irq_hk_apply,
+};
+
+static int __init irq_hk_init(void)
+{
+	int ret;
+
+	ret = housekeeping_register_cbs(HK_TYPE_MANAGED_IRQ, &irq_hk_cbs);
+	if (ret)
+		pr_info("genirq: managed IRQ runtime migration disabled (%d)\n", ret);
+	return 0;
+}
+late_initcall(irq_hk_init);

-- 
2.43.0



* [PATCH v3 07/13] rcu/nocb: Add explicit housekeeping callback for runtime NOCB toggling
From: Jing Wu @ 2026-06-18  3:11 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Register a housekeeping callback for HK_TYPE_KERNEL_NOISE.  When the
mask changes, schedule asynchronous work to iterate all possible CPUs
and toggle NOCB mode for CPUs whose state disagrees with the new mask.
CPUs in the housekeeping set are de-offloaded; isolated CPUs are
offloaded.

Use CPU hotplug (remove_cpu() / add_cpu()) because
rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() require the target
CPU to be offline.  The hotplug cycle takes the CPU fully offline to
quiesce its RCU state before toggling the NOCB flag, then brings it
back.  Skip CPUs whose state already matches to avoid unnecessary
hotplug churn.  Only bring a CPU back online if it was online before
the state change (was_online guard avoids add_cpu() on a CPU that was
already offline).
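
The per-CPU decision described above (toggle only where the NOCB state disagrees with the housekeeping mask) can be modelled in a few lines. This is a userspace sketch under stated assumptions, not kernel code: nocb_plan is a hypothetical name, masks are hex numbers with bit N standing for CPU N, and the hotplug cycle itself is not modelled.

```shell
# nocb_plan NR_CPUS HK_MASK NOCB_MASK (hypothetical helper): print one line
# per CPU whose NOCB state must change; CPUs already in the right state are
# skipped, which is what avoids unnecessary hotplug cycles.
nocb_plan() (
	cpu=0
	while [ "$cpu" -lt "$1" ]; do
		in_hk=$(( ($2 >> cpu) & 1 ))
		offloaded=$(( ($3 >> cpu) & 1 ))
		# Housekeeping CPUs should not be offloaded, isolated CPUs
		# should be, so a change is needed when the two bits are equal.
		if [ "$in_hk" -eq "$offloaded" ]; then
			if [ "$in_hk" -eq 1 ]; then
				echo "cpu$cpu: deoffload"
			else
				echo "cpu$cpu: offload"
			fi
		fi
		cpu=$((cpu + 1))
	done
)

# 8 CPUs, housekeeping CPUs 0-3, nothing offloaded yet.
nocb_plan 8 0x0f 0x00
```

In this model only CPUs 4-7 would go through an offline/online cycle; CPUs 0-3 already match the new mask and are left alone.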

This differs from Frederic Weisbecker's suggestion to "assume the CPU
is offline" within the RCU subsystem and toggle NOCB without a full
hotplug cycle.  The full hotplug approach was chosen for v3 because
rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() are the existing
stable interfaces and the "assume offline" path would require adding
new internal RCU APIs.  This is a known limitation that may be
addressed by RCU maintainers in follow-up work.

Snapshot the current HK_TYPE_KERNEL_NOISE cpumask inside the work
function under an RCU read lock rather than caching the pointer at
apply() time.  Caching at apply() time would create a use-after-free
hazard: a subsequent housekeeping_update_types() call frees the old
cpumask after synchronize_rcu() but before the work function runs.

Remove the cpus_read_lock() / cpus_read_unlock() pair that wrapped the
hotplug loop.  remove_cpu() and add_cpu() acquire the cpu_hotplug_lock
write side; holding the read side via cpus_read_lock() before calling
them causes a deadlock.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 kernel/rcu/tree.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 55df6d37145e8..214ce940f501b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4929,3 +4929,107 @@ void __init rcu_init(void)
 #include "tree_exp.h"
 #include "tree_nocb.h"
 #include "tree_plugin.h"
+
+#ifdef CONFIG_RCU_NOCB_CPU
+/*
+ * RCU NOCB runtime toggle via housekeeping callback.
+ * Schedule the CPU-hotplug work asynchronously because
+ * remove_cpu() and add_cpu() must not be called while holding
+ * cpuset_top_mutex (the hk callback context).
+ *
+ * Snapshot the current HK_TYPE_KERNEL_NOISE cpumask inside the work
+ * function under an RCU read lock to avoid caching a pointer at
+ * apply() time that could be freed before the work runs.
+ */
+struct rcu_hk_work {
+	struct work_struct work;
+};
+
+static void rcu_hk_workfn(struct work_struct *w)
+{
+	struct rcu_hk_work *hw = container_of(w, struct rcu_hk_work, work);
+	cpumask_var_t hk_mask;
+	int cpu, ret;
+
+	if (!alloc_cpumask_var(&hk_mask, GFP_KERNEL)) {
+		kfree(hw);
+		return;
+	}
+
+	rcu_read_lock();
+	cpumask_copy(hk_mask, housekeeping_cpumask_rcu(HK_TYPE_KERNEL_NOISE));
+	rcu_read_unlock();
+
+	for_each_possible_cpu(cpu) {
+		bool should_offload = !cpumask_test_cpu(cpu, hk_mask);
+		bool is_offloaded;
+		bool was_online;
+
+		if (!cpumask_available(rcu_nocb_mask)) {
+			is_offloaded = false;
+		} else {
+			is_offloaded = cpumask_test_cpu(cpu, rcu_nocb_mask);
+		}
+
+		if (should_offload == is_offloaded)
+			continue;
+
+		was_online = cpu_online(cpu);
+		if (was_online) {
+			ret = remove_cpu(cpu);
+			if (ret)
+				continue;
+		}
+		if (should_offload)
+			rcu_nocb_cpu_offload(cpu);
+		else
+			rcu_nocb_cpu_deoffload(cpu);
+		if (was_online)
+			add_cpu(cpu);
+	}
+
+	free_cpumask_var(hk_mask);
+	kfree(hw);
+}
+
+static void rcu_hk_apply(enum hk_type type)
+{
+	struct rcu_hk_work *hw;
+
+	if (!cpumask_available(rcu_nocb_mask))
+		return;
+
+	hw = kmalloc(sizeof(*hw), GFP_KERNEL);
+	if (!hw)
+		return;
+
+	INIT_WORK(&hw->work, rcu_hk_workfn);
+	schedule_work(&hw->work);
+}
+
+static int rcu_hk_validate(enum hk_type type,
+			   const struct cpumask *cur_mask,
+			   const struct cpumask *new_mask)
+{
+	if (!IS_ENABLED(CONFIG_RCU_NOCB_CPU))
+		return -EOPNOTSUPP;
+	return 0;
+}
+
+static struct housekeeping_cbs rcu_hk_cbs = {
+	.name		= "rcu/nocb",
+	.pre_validate	= rcu_hk_validate,
+	.apply		= rcu_hk_apply,
+};
+
+static int __init rcu_hk_init(void)
+{
+	int ret;
+
+	ret = housekeeping_register_cbs(HK_TYPE_KERNEL_NOISE, &rcu_hk_cbs);
+	if (ret)
+		pr_info("rcu/nocb: runtime NOCB toggle disabled (%d)\n", ret);
+	return 0;
+}
+late_initcall(rcu_hk_init);
+#endif /* CONFIG_RCU_NOCB_CPU */

-- 
2.43.0

^ permalink raw reply related

* [PATCH v3 06/13] tick/nohz, context_tracking: Prepare for runtime nohz_full updates
From: Jing Wu @ 2026-06-18  3:11 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Remove __init from ct_cpu_track_user() and __initdata from the
initialized flag so context tracking can be activated on CPUs that
join nohz_full at runtime.  Drop the __ro_after_init attribute from
the context_tracking_key static key, allowing static_branch_dec()
when a CPU leaves nohz_full.

Add ct_cpu_untrack_user() to reverse ct_cpu_track_user(), decrementing
the static key and clearing the per-CPU tracking state.
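
The track/untrack pairing can be pictured with a small user-space
model.  This is a sketch under assumptions, not the kernel code: a
plain counter (sketch_key_count) stands in for the static key, a
fixed-size array stands in for the per-CPU "active" flag, and every
sketch_* name is hypothetical.  It shows why both helpers check the
per-CPU flag first: repeated calls must not unbalance the counter.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 8

/* Stand-in for the per-CPU context_tracking.active flag. */
bool sketch_active[SKETCH_NR_CPUS];
/* Stand-in for the reference count behind the static key. */
int sketch_key_count;

/* Start tracking a CPU; a second call for the same CPU is a no-op. */
void sketch_track_user(int cpu)
{
	if (!sketch_active[cpu]) {
		sketch_active[cpu] = true;
		sketch_key_count++;
	}
}

/* Stop tracking a CPU; a call for an untracked CPU is a no-op. */
void sketch_untrack_user(int cpu)
{
	if (!sketch_active[cpu])
		return;

	sketch_active[cpu] = false;
	sketch_key_count--;
}

/* The "key" is enabled while at least one CPU is tracked. */
bool sketch_tracking_enabled(void)
{
	return sketch_key_count > 0;
}
```

Tracking the same CPU twice and then untracking it twice leaves the
counter at zero, which is the property the static key relies on.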

Register a housekeeping_cbs for HK_TYPE_KERNEL_NOISE that:
- pre_validate: checks that CONFIG_NO_HZ_FULL is enabled.
- apply: snapshots the new HK_TYPE_KERNEL_NOISE mask under an RCU
  read lock (the lockdep annotation in housekeeping_cpumask() requires
  this even after synchronize_rcu() completes), computes nohz_full as
  the complement of the housekeeping mask, then under tick_nohz_lock:
  - Activates context tracking (ct_cpu_track_user()) on CPUs newly
    added to nohz_full, and deactivates it (ct_cpu_untrack_user()) on
    CPUs returning to the housekeeping set.  This activates the
    context_tracking_key static key dynamically, eliminating the
    need for CONFIG_CONTEXT_TRACKING_USER_FORCE.
  - Updates tick_nohz_full_mask in-place (legacy EXPORT_SYMBOL_GPL
    snapshot, eventually consistent).
  - Migrates tick_do_timer_cpu to a housekeeping CPU if its current
    CPU moved into the isolated set.
  - Kicks all CPUs to re-evaluate tick behaviour.
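
The mask arithmetic in the apply step can be sketched in user space.
The names below (mask_t, struct nohz_delta, nohz_compute_delta) are
hypothetical and a 64-bit integer stands in for a cpumask; the sketch
covers only the set computations, not locking, context tracking or
tick migration.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the set. */
typedef uint64_t mask_t;

struct nohz_delta {
	mask_t nohz_full;	/* possible CPUs that are not housekeeping */
	mask_t added;		/* CPUs that newly joined nohz_full */
	mask_t removed;		/* CPUs that returned to housekeeping */
};

/* Derive the new nohz_full set and the delta against the previous one. */
struct nohz_delta nohz_compute_delta(mask_t possible, mask_t hk,
				     mask_t old_nohz_full)
{
	struct nohz_delta d;

	d.nohz_full = possible & ~hk;
	d.added = d.nohz_full & ~old_nohz_full;
	d.removed = old_nohz_full & ~d.nohz_full;
	return d;
}
```

For eight possible CPUs with housekeeping on CPUs 0-3 and a previous
nohz_full set of CPUs 4-5, the new set is CPUs 4-7, CPUs 6-7 are
"added", and nothing is "removed".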

When CONFIG_CONTEXT_TRACKING_USER_FORCE is enabled and nohz_full= is
given at boot, tick_nohz_init() now calls context_tracking_init()
before iterating over tick_nohz_full_mask to call ct_cpu_track_user().
This ensures the per-CPU tracking state is set up before any CPU is
tracked, which is also required for CPUs later added to nohz_full at
runtime via DHM isolated partitions.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 include/linux/context_tracking.h |   1 +
 kernel/context_tracking.c        |  23 ++----
 kernel/time/tick-sched.c         | 157 +++++++++++++++++++++++++++++++++++++--
 3 files changed, 161 insertions(+), 20 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index af9fe87a09225..632cfc97b5b22 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -12,6 +12,7 @@
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER
 extern void ct_cpu_track_user(int cpu);
+extern void ct_cpu_untrack_user(int cpu);
 
 /* Called with interrupts disabled.  */
 extern void __ct_user_enter(enum ctx_state state);
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index a743e7ffa6c00..e68fb02b25ad4 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -411,7 +411,7 @@ static __always_inline void ct_kernel_enter(bool user, int offset) { }
 #define CREATE_TRACE_POINTS
 #include <trace/events/context_tracking.h>
 
-DEFINE_STATIC_KEY_FALSE_RO(context_tracking_key);
+DEFINE_STATIC_KEY_FALSE(context_tracking_key);
 EXPORT_SYMBOL_GPL(context_tracking_key);
 
 static noinstr bool context_tracking_recursion_enter(void)
@@ -674,28 +674,21 @@ void user_exit_callable(void)
 }
 NOKPROBE_SYMBOL(user_exit_callable);
 
-void __init ct_cpu_track_user(int cpu)
+void ct_cpu_track_user(int cpu)
 {
-	static __initdata bool initialized = false;
-
 	if (!per_cpu(context_tracking.active, cpu)) {
 		per_cpu(context_tracking.active, cpu) = true;
 		static_branch_inc(&context_tracking_key);
 	}
+}
 
-	if (initialized)
+void ct_cpu_untrack_user(int cpu)
+{
+	if (!per_cpu(context_tracking.active, cpu))
 		return;
 
-#ifdef CONFIG_HAVE_TIF_NOHZ
-	/*
-	 * Set TIF_NOHZ to init/0 and let it propagate to all tasks through fork
-	 * This assumes that init is the only task at this early boot stage.
-	 */
-	set_tsk_thread_flag(&init_task, TIF_NOHZ);
-#endif
-	WARN_ON_ONCE(!tasklist_empty());
-
-	initialized = true;
+	per_cpu(context_tracking.active, cpu) = false;
+	static_branch_dec(&context_tracking_key);
 }
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER_FORCE
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cbbb87a0c6e7c..a7fe097042f7d 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -26,6 +26,7 @@
 #include <linux/irq_work.h>
 #include <linux/posix-timers.h>
 #include <linux/context_tracking.h>
+#include <linux/sched/isolation.h>
 #include <linux/mm.h>
 
 #include <asm/irq_regs.h>
@@ -653,11 +654,6 @@ void __init tick_nohz_init(void)
 	if (!tick_nohz_full_running)
 		return;
 
-	/*
-	 * Full dynticks uses IRQ work to drive the tick rescheduling on safe
-	 * locking contexts. But then we need IRQ work to raise its own
-	 * interrupts to avoid circular dependency on the tick.
-	 */
 	if (!arch_irq_work_has_interrupt()) {
 		pr_warn("NO_HZ: Can't run full dynticks because arch doesn't support IRQ work self-IPIs\n");
 		cpumask_clear(tick_nohz_full_mask);
@@ -676,6 +672,16 @@ void __init tick_nohz_init(void)
 		}
 	}
 
+	/*
+	 * Pre-initialize context tracking for all possible CPUs so
+	 * ctx tracking is already active when a CPU is later added to
+	 * nohz_full at runtime.  The tracking overhead is negligible
+	 * because the static key is not incremented yet — only per-CPU
+	 * tracking state is set up.
+	 */
+	if (IS_ENABLED(CONFIG_CONTEXT_TRACKING_USER_FORCE))
+		context_tracking_init();
+
 	for_each_cpu(cpu, tick_nohz_full_mask)
 		ct_cpu_track_user(cpu);
 
@@ -686,6 +692,147 @@ void __init tick_nohz_init(void)
 	pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
 		cpumask_pr_args(tick_nohz_full_mask));
 }
+
+static int tick_nohz_hk_validate(enum hk_type type,
+				 const struct cpumask *cur_mask,
+				 const struct cpumask *new_mask)
+{
+	if (!IS_ENABLED(CONFIG_NO_HZ_FULL))
+		return -EOPNOTSUPP;
+	return 0;
+}
+
+static void tick_nohz_hk_apply(enum hk_type type)
+{
+	static DEFINE_SPINLOCK(tick_nohz_lock);
+	cpumask_var_t nohz_full, added, removed;
+	bool was_running;
+	int cpu;
+
+	if (!alloc_cpumask_var(&nohz_full, GFP_KERNEL))
+		return;
+	if (!alloc_cpumask_var(&added, GFP_KERNEL)) {
+		free_cpumask_var(nohz_full);
+		return;
+	}
+	if (!alloc_cpumask_var(&removed, GFP_KERNEL)) {
+		free_cpumask_var(added);
+		free_cpumask_var(nohz_full);
+		return;
+	}
+
+	/*
+	 * Snapshot the new HK_TYPE_KERNEL_NOISE mask under an RCU read lock.
+	 * housekeeping_update_types() completes synchronize_rcu() before
+	 * invoking apply(), so the new pointer is stable; however the lockdep
+	 * annotation in housekeeping_cpumask() still requires an RCU read-side
+	 * critical section for runtime-mutable types.
+	 */
+	rcu_read_lock();
+	cpumask_andnot(nohz_full, cpu_possible_mask,
+		       housekeeping_cpumask_rcu(HK_TYPE_KERNEL_NOISE));
+	rcu_read_unlock();
+
+	/*
+	 * When "nohz_full=" was not passed at boot, tick_nohz_full_running is
+	 * false and the full dynticks infrastructure (sched_tick_offload_init,
+	 * RCU nohz quiescent-state reporting, context-tracking bootstrap) was
+	 * never initialised.  In that case restrict the update to
+	 * tick_nohz_full_mask so the /sys/devices/system/cpu/nohz_full sysfs
+	 * attribute reflects DHM-isolated CPUs without enabling tick
+	 * suppression, context tracking, or timer migration – all of which
+	 * require boot-time setup and would deadlock on the first
+	 * synchronize_rcu() call after CPUs are offlined.
+	 */
+	was_running = READ_ONCE(tick_nohz_full_running);
+
+	spin_lock(&tick_nohz_lock);
+
+	/*
+	 * When nohz_full= was active at boot, compute the delta and update
+	 * context tracking for CPUs joining or leaving the nohz_full set.
+	 * Skip when !was_running: ct_cpu_track_user() calls
+	 * static_branch_inc() which may sleep (jump_label_update on the
+	 * 0→1 transition) – illegal inside a spinlock.
+	 */
+	if (IS_ENABLED(CONFIG_CONTEXT_TRACKING_USER) &&
+	    was_running &&
+	    cpumask_available(tick_nohz_full_mask)) {
+		cpumask_andnot(added, nohz_full, tick_nohz_full_mask);
+		cpumask_andnot(removed, tick_nohz_full_mask, nohz_full);
+		for_each_cpu(cpu, added)
+			ct_cpu_track_user(cpu);
+		for_each_cpu(cpu, removed)
+			ct_cpu_untrack_user(cpu);
+	}
+
+	/*
+	 * Update tick_nohz_full_mask unconditionally: this is the snapshot
+	 * read by the /sys/devices/system/cpu/nohz_full sysfs attribute and
+	 * must reflect the current isolation set even in the DHM runtime case.
+	 */
+	if (cpumask_available(tick_nohz_full_mask))
+		cpumask_copy(tick_nohz_full_mask, nohz_full);
+
+	/*
+	 * Only modify tick_nohz_full_running and migrate the global tick when
+	 * nohz_full= was set at boot; without boot-time setup, setting
+	 * tick_nohz_full_running would suppress ticks on isolated CPUs and
+	 * prevent RCU quiescent-state reporting, causing synchronize_rcu()
+	 * to stall permanently when a CPU is subsequently offlined.
+	 */
+	if (was_running) {
+		tick_nohz_full_running = !cpumask_empty(nohz_full);
+
+		if (tick_nohz_full_running) {
+			cpu = READ_ONCE(tick_do_timer_cpu);
+			if (cpu < nr_cpu_ids &&
+			    !housekeeping_test_cpu(cpu, HK_TYPE_KERNEL_NOISE)) {
+				int new_cpu;
+
+				new_cpu = housekeeping_any_cpu(HK_TYPE_KERNEL_NOISE);
+				if (new_cpu < nr_cpu_ids)
+					WRITE_ONCE(tick_do_timer_cpu, new_cpu);
+			}
+		}
+	}
+
+	spin_unlock(&tick_nohz_lock);
+
+	if (was_running)
+		tick_nohz_full_kick_all();
+	free_cpumask_var(removed);
+	free_cpumask_var(added);
+	free_cpumask_var(nohz_full);
+}
+
+static struct housekeeping_cbs tick_nohz_hk_cbs = {
+	.name		= "tick/nohz",
+	.pre_validate	= tick_nohz_hk_validate,
+	.apply		= tick_nohz_hk_apply,
+};
+
+static int __init tick_nohz_hk_init_late(void)
+{
+	int ret;
+
+	/*
+	 * Ensure tick_nohz_full_mask is allocated so that tick_nohz_hk_apply()
+	 * can update it (and the /sys/devices/system/cpu/nohz_full sysfs
+	 * attribute) when CPUs are isolated at runtime via DHM.  If "nohz_full="
+	 * was passed at boot the mask is already allocated; allocate an empty
+	 * one here for the runtime-only case.
+	 */
+	if (!cpumask_available(tick_nohz_full_mask) &&
+	    !zalloc_cpumask_var(&tick_nohz_full_mask, GFP_KERNEL))
+		pr_warn("tick/nohz: failed to allocate nohz_full_mask for DHM\n");
+
+	ret = housekeeping_register_cbs(HK_TYPE_KERNEL_NOISE, &tick_nohz_hk_cbs);
+	if (ret)
+		pr_warn("tick/nohz: Failed to register hk callback: %d\n", ret);
+	return 0;
+}
+late_initcall(tick_nohz_hk_init_late);
 #endif /* #ifdef CONFIG_NO_HZ_FULL */
 
 /*

-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 05/13] cpu/hotplug: Reserve CPUHP states for nohz_full and managed IRQ down-paths
From: Jing Wu @ 2026-06-18  3:11 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Add CPUHP_AP_NO_HZ_FULL_DYING and CPUHP_AP_IRQ_AFFINITY_DYING to the
cpuhp_state enum.  These dying callbacks are invoked during CPU offline
before the tick is stopped, enabling clean tick handover and managed
IRQ migration when a CPU transitions between isolated and housekeeping
states.

The existing CPUHP_AP_IRQ_AFFINITY_ONLINE already handles managed IRQ
restoration on CPU online.  The new dying callback completes the pair,
migrating managed interrupts away from the CPU before it goes down.

Subsequent patches register handlers for these states.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 include/linux/cpuhotplug.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 22ba327ec2278..075cfa8161334 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -186,6 +186,8 @@ enum cpuhp_state {
 	CPUHP_AP_SMPCFD_DYING,
 	CPUHP_AP_HRTIMERS_DYING,
 	CPUHP_AP_TICK_DYING,
+	CPUHP_AP_IRQ_AFFINITY_DYING,
+	CPUHP_AP_NO_HZ_FULL_DYING,
 	CPUHP_AP_X86_TBOOT_DYING,
 	CPUHP_AP_ARM_CACHE_B15_RAC_DYING,
 	CPUHP_AP_ONLINE,

-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 04/13] sched/isolation: Fix RCU protection for runtime-mutable cpumask callers
From: Jing Wu @ 2026-06-18  3:11 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

housekeeping_update_types() installs new cpumasks via rcu_assign_pointer()
and frees the old ones after synchronize_rcu(); callers that dereference
the old pointer without holding an RCU read lock can access freed memory.

Fix the four call sites:

kernel/sched/core.c (get_nohz_timer_target, HK_TYPE_KERNEL_NOISE):
  The guard(rcu)() was acquired after housekeeping_cpumask().  Move it
  before the call and switch to housekeeping_cpumask_rcu() so hk_mask
  is read inside the RCU read-side critical section.  HK_TYPE_KERNEL_NOISE
  is updated at runtime by housekeeping_update_types(); this fix is
  required for correctness.

drivers/hv/channel_mgmt.c (init_vp_index, HK_TYPE_MANAGED_IRQ):
  The function stored the raw pointer in a local variable and used it
  across GFP_KERNEL allocations (which can sleep, so an RCU read lock
  cannot span them).  Allocate both cpumask_var_t buffers first, then
  snapshot the housekeeping mask under a brief rcu_read_lock() and use
  the snapshot throughout.  HK_TYPE_MANAGED_IRQ is updated at runtime;
  this fix is required for correctness.

kernel/time/hrtimer.c (get_target_base, HK_TYPE_TIMER):
  cpumask_any_and() against housekeeping_cpumask(HK_TYPE_TIMER) was
  called without any lock.  Wrap with rcu_read_lock()/rcu_read_unlock()
  and use housekeeping_cpumask_rcu().  HK_TYPE_TIMER is not changed at
  runtime in this series; this is a defensive fix to satisfy the
  housekeeping_dereference_check() lockdep annotation for future-proofing.
  hrtimers_cpu_dying() is already safe: it runs under the cpu_hotplug_lock
  write side, which housekeeping_dereference_check() already permits.

arch/arm64/kernel/topology.c (arch_freq_get_on_cpu, HK_TYPE_TICK):
  cpumask_intersects() against housekeeping_cpumask(HK_TYPE_TICK) was
  called without any lock.  Evaluate under rcu_read_lock() and store
  the boolean result before releasing the lock.  HK_TYPE_TICK is not
  changed at runtime in this series; this is a defensive fix.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 arch/arm64/kernel/topology.c |  9 ++++++--
 drivers/hv/channel_mgmt.c    | 50 ++++++++++++++++++++++++++++++--------------
 kernel/sched/core.c          |  3 +--
 kernel/time/hrtimer.c        |  5 ++++-
 4 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index b32f13358fbb1..8f4329b57cea7 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -212,8 +212,13 @@ int arch_freq_get_on_cpu(int cpu)
 			if (!policy)
 				return -EINVAL;
 
-			if (!cpumask_intersects(policy->related_cpus,
-						housekeeping_cpumask(HK_TYPE_TICK))) {
+			bool no_hk_in_policy;
+
+			rcu_read_lock();
+			no_hk_in_policy = !cpumask_intersects(policy->related_cpus,
+							      housekeeping_cpumask_rcu(HK_TYPE_TICK));
+			rcu_read_unlock();
+			if (no_hk_in_policy) {
 				cpufreq_cpu_put(policy);
 				return -EOPNOTSUPP;
 			}
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 84eb0a6a0b546..fc5247e92e1b3 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -750,26 +750,43 @@ static void init_vp_index(struct vmbus_channel *channel)
 {
 	bool perf_chn = hv_is_perf_channel(channel);
 	u32 i, ncpu = num_online_cpus();
-	cpumask_var_t available_mask;
+	cpumask_var_t available_mask, hk_snap;
 	struct cpumask *allocated_mask;
-	const struct cpumask *hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
 	u32 target_cpu;
 	int numa_node;
 
-	if (!perf_chn ||
-	    !alloc_cpumask_var(&available_mask, GFP_KERNEL) ||
-	    cpumask_empty(hk_mask)) {
-		/*
-		 * If the channel is not a performance critical
-		 * channel, bind it to VMBUS_CONNECT_CPU.
-		 * In case alloc_cpumask_var() fails, bind it to
-		 * VMBUS_CONNECT_CPU.
-		 * If all the cpus are isolated, bind it to
-		 * VMBUS_CONNECT_CPU.
-		 */
+	if (!perf_chn) {
+		channel->target_cpu = VMBUS_CONNECT_CPU;
+		return;
+	}
+
+	if (!alloc_cpumask_var(&available_mask, GFP_KERNEL)) {
+		channel->target_cpu = VMBUS_CONNECT_CPU;
+		hv_set_allocated_cpu(VMBUS_CONNECT_CPU);
+		return;
+	}
+
+	/*
+	 * Snapshot HK_TYPE_MANAGED_IRQ cpumask under RCU read lock.
+	 * housekeeping_update_types() frees the old cpumask after
+	 * synchronize_rcu(), so we must not hold the pointer beyond an
+	 * RCU read-side critical section.
+	 */
+	if (!alloc_cpumask_var(&hk_snap, GFP_KERNEL)) {
+		free_cpumask_var(available_mask);
+		channel->target_cpu = VMBUS_CONNECT_CPU;
+		hv_set_allocated_cpu(VMBUS_CONNECT_CPU);
+		return;
+	}
+	rcu_read_lock();
+	cpumask_copy(hk_snap, housekeeping_cpumask_rcu(HK_TYPE_MANAGED_IRQ));
+	rcu_read_unlock();
+
+	if (cpumask_empty(hk_snap)) {
+		free_cpumask_var(hk_snap);
+		free_cpumask_var(available_mask);
 		channel->target_cpu = VMBUS_CONNECT_CPU;
-		if (perf_chn)
-			hv_set_allocated_cpu(VMBUS_CONNECT_CPU);
+		hv_set_allocated_cpu(VMBUS_CONNECT_CPU);
 		return;
 	}
 
@@ -788,7 +805,7 @@ static void init_vp_index(struct vmbus_channel *channel)
 
 retry:
 		cpumask_xor(available_mask, allocated_mask, cpumask_of_node(numa_node));
-		cpumask_and(available_mask, available_mask, hk_mask);
+		cpumask_and(available_mask, available_mask, hk_snap);
 
 		if (cpumask_empty(available_mask)) {
 			/*
@@ -809,6 +826,7 @@ static void init_vp_index(struct vmbus_channel *channel)
 
 	channel->target_cpu = target_cpu;
 
+	free_cpumask_var(hk_snap);
 	free_cpumask_var(available_mask);
 }
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b8871449d3c69..371b509d92164 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1272,9 +1272,8 @@ int get_nohz_timer_target(void)
 		default_cpu = cpu;
 	}
 
-	hk_mask = housekeeping_cpumask(HK_TYPE_KERNEL_NOISE);
-
 	guard(rcu)();
+	hk_mask = housekeeping_cpumask_rcu(HK_TYPE_KERNEL_NOISE);
 
 	for_each_domain(cpu, sd) {
 		for_each_cpu_and(i, sched_domain_span(sd), hk_mask) {
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 5bd6efe598f0f..18e17a9dad67b 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -242,8 +242,11 @@ static bool hrtimer_suitable_target(struct hrtimer *timer, struct hrtimer_clock_
 static inline struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base, bool pinned)
 {
 	if (!hrtimer_base_is_online(base)) {
-		int cpu = cpumask_any_and(cpu_online_mask, housekeeping_cpumask(HK_TYPE_TIMER));
+		int cpu;
 
+		rcu_read_lock();
+		cpu = cpumask_any_and(cpu_online_mask, housekeeping_cpumask_rcu(HK_TYPE_TIMER));
+		rcu_read_unlock();
 		return &per_cpu(hrtimer_bases, cpu);
 	}
 

-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 03/13] sched/isolation: RCU-protect all housekeeping cpumask readers
From: Jing Wu @ 2026-06-18  3:11 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Extend housekeeping_dereference_check() to validate all runtime-mutable
types (HK_TYPE_DOMAIN, HK_TYPE_KERNEL_NOISE, HK_TYPE_MANAGED_IRQ), not
only HK_TYPE_DOMAIN.  Boot-only types (HK_TYPE_DOMAIN_BOOT) remain
unchecked.

Add housekeeping_cpumask_rcu() for callers that already hold an RCU
read lock.  This variant calls rcu_dereference() directly and skips
the housekeeping_dereference_check() annotation, avoiding
false-positive lockdep warnings in RCU read-side critical sections.

Use READ_ONCE() consistently when testing housekeeping.flags in paths
that may race with housekeeping_update_types().

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 include/linux/sched/isolation.h |  6 +++++
 kernel/sched/isolation.c        | 57 +++++++++++++++++++++++++++++++----------
 2 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index eecbcbe802bd0..ed6e1c6980131 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -40,6 +40,7 @@ enum hk_type {
 DECLARE_STATIC_KEY_FALSE(housekeeping_overridden);
 extern int housekeeping_any_cpu(enum hk_type type);
 extern const struct cpumask *housekeeping_cpumask(enum hk_type type);
+extern const struct cpumask *housekeeping_cpumask_rcu(enum hk_type type);
 extern bool housekeeping_enabled(enum hk_type type);
 extern void housekeeping_affine(struct task_struct *t, enum hk_type type);
 extern bool housekeeping_test_cpu(int cpu, enum hk_type type);
@@ -87,6 +88,11 @@ static inline const struct cpumask *housekeeping_cpumask(enum hk_type type)
 	return cpu_possible_mask;
 }
 
+static inline const struct cpumask *housekeeping_cpumask_rcu(enum hk_type type)
+{
+	return cpu_possible_mask;
+}
+
 static inline bool housekeeping_enabled(enum hk_type type)
 {
 	return false;
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 4eca18cc5e8ce..3d5d3f12853c7 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -121,25 +121,40 @@ bool housekeeping_enabled(enum hk_type type)
 }
 EXPORT_SYMBOL_GPL(housekeeping_enabled);
 
+/*
+ * Types that can change at runtime via cpuset isolated partitions.
+ * Boot-only types (DOMAIN_BOOT) are always safe to read without lockdep.
+ */
+static bool housekeeping_type_can_change(enum hk_type type)
+{
+	switch (type) {
+	case HK_TYPE_DOMAIN:
+	case HK_TYPE_KERNEL_NOISE:
+	case HK_TYPE_MANAGED_IRQ:
+		return true;
+	default:
+		return false;
+	}
+}
+
 static bool housekeeping_dereference_check(enum hk_type type)
 {
-	if (IS_ENABLED(CONFIG_LOCKDEP) && type == HK_TYPE_DOMAIN) {
-		/* Cpuset isn't even writable yet? */
-		if (system_state <= SYSTEM_SCHEDULING)
-			return true;
+	if (!IS_ENABLED(CONFIG_LOCKDEP) || !housekeeping_type_can_change(type))
+		return true;
 
-		/* CPU hotplug write locked, so cpuset partition can't be overwritten */
-		if (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_write_held())
-			return true;
+	/* Cpuset isn't even writable yet? */
+	if (system_state <= SYSTEM_SCHEDULING)
+		return true;
 
-		/* Cpuset lock held, partitions not writable */
-		if (IS_ENABLED(CONFIG_CPUSETS) && lockdep_is_cpuset_held())
-			return true;
+	/* CPU hotplug write locked, so cpuset partition can't be overwritten */
+	if (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_write_held())
+		return true;
 
-		return false;
-	}
+	/* Cpuset lock held, partitions not writable */
+	if (IS_ENABLED(CONFIG_CPUSETS) && lockdep_is_cpuset_held())
+		return true;
 
-	return true;
+	return false;
 }
 
 static inline struct cpumask *housekeeping_cpumask_dereference(enum hk_type type)
@@ -162,12 +177,26 @@ const struct cpumask *housekeeping_cpumask(enum hk_type type)
 }
 EXPORT_SYMBOL_GPL(housekeeping_cpumask);
 
+const struct cpumask *housekeeping_cpumask_rcu(enum hk_type type)
+{
+	const struct cpumask *mask = NULL;
+
+	if (static_branch_unlikely(&housekeeping_overridden)) {
+		if (READ_ONCE(housekeeping.flags) & BIT(type))
+			mask = rcu_dereference(housekeeping.cpumasks[type]);
+	}
+	if (!mask)
+		mask = cpu_possible_mask;
+	return mask;
+}
+EXPORT_SYMBOL_GPL(housekeeping_cpumask_rcu);
+
 int housekeeping_any_cpu(enum hk_type type)
 {
 	int cpu;
 
 	if (static_branch_unlikely(&housekeeping_overridden)) {
-		if (housekeeping.flags & BIT(type)) {
+		if (READ_ONCE(housekeeping.flags) & BIT(type)) {
 			cpu = sched_numa_find_closest(housekeeping_cpumask(type), smp_processor_id());
 			if (cpu < nr_cpu_ids)
 				return cpu;

-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 02/13] sched/isolation: Add housekeeping_update_types() for kernel-noise masks
From: Jing Wu @ 2026-06-18  3:11 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Introduce housekeeping_update_types(), which updates the cpumask for
each specified housekeeping type atomically using an RCU pointer swap.

For each type in @type_mask the trial mask is computed as
(base & ~isol_mask), where the base depends on the type:

  - Most types use the current housekeeping cpumask as base.  For
    types that are only set at boot this is equivalent to the boot
    mask, so trial = (boot_mask & ~isol_mask).

  - HK_TYPE_KERNEL_NOISE always uses cpu_possible_mask as base.  Its
    semantics are "all possible CPUs minus the currently-isolated set";
    using the current HK mask instead would leave it stuck at its last
    non-trivial value after de-isolation, breaking subsequent isolation
    cycles.
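
The difference between the two bases can be shown with a small
user-space sketch.  mask_t and the three functions are hypothetical
names and a 64-bit integer stands in for a cpumask; only the
(base & ~isol_mask) formula is taken from the text above.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the set. */
typedef uint64_t mask_t;

/* trial = base & ~isol_mask, as described in the text. */
mask_t trial_mask(mask_t base, mask_t isol)
{
	return base & ~isol;
}

/* Kernel-noise style: every update restarts from the full possible set. */
mask_t update_from_possible(mask_t possible, mask_t isol)
{
	return trial_mask(possible, isol);
}

/* The rejected alternative: every update restarts from the current mask. */
mask_t update_from_current(mask_t current_hk, mask_t isol)
{
	return trial_mask(current_hk, isol);
}
```

With four possible CPUs (0xF), isolating CPUs 2-3 gives 0x3 either
way.  Clearing the isolation set afterwards returns 0xF when starting
from the possible mask, but stays at 0x3 when starting from the
current mask, which is the "stuck" behaviour described above.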

HK_TYPE_KERNEL_NOISE also supports runtime first-enable: if it was not
registered at boot (no nohz_full= on the kernel command line),
housekeeping_update_types() registers it in housekeeping.flags on the
first call.  All other types must already be boot-enabled.

For each type the function validates the trial mask against
cpu_online_mask, runs registered pre_validate() callbacks (which may
reject the update), swaps all RCU cpumask pointers in a single pass,
calls synchronize_rcu(), frees the old masks, and then runs apply()
callbacks.
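
The ordering of these steps can be modelled for a single type in user
space.  This is a sketch under assumptions: the callback signatures
are simplified relative to struct housekeeping_cbs, a plain assignment
stands in for the RCU pointer swap, synchronize_rcu() and the free of
the old mask, and all sketch names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the set. */
typedef uint64_t mask_t;

/* Simplified callback pair; the real callbacks take other arguments. */
struct hk_cbs_sketch {
	int (*pre_validate)(mask_t cur_mask, mask_t new_mask);
	void (*apply)(mask_t new_mask);
};

/* Example subscriber: refuses any update that drops CPU 0. */
int sketch_keep_cpu0(mask_t cur_mask, mask_t new_mask)
{
	(void)cur_mask;
	return (new_mask & 1) ? 0 : -EBUSY;
}

/* Example subscriber: remembers the last mask it was told to apply. */
mask_t sketch_last_applied;

void sketch_record(mask_t new_mask)
{
	sketch_last_applied = new_mask;
}

/* Validate, consult subscribers, publish, then notify, in that order. */
int hk_update_sketch(mask_t *cur, mask_t base, mask_t isol, mask_t online,
		     const struct hk_cbs_sketch *cbs, size_t ncbs)
{
	mask_t trial = base & ~isol;
	size_t i;

	/* The trial mask must keep at least one online CPU. */
	if (!(trial & online))
		return -EINVAL;

	/* Any subscriber may veto before anything is published. */
	for (i = 0; i < ncbs; i++) {
		if (cbs[i].pre_validate) {
			int ret = cbs[i].pre_validate(*cur, trial);

			if (ret)
				return ret;
		}
	}

	/* Stands in for the pointer swap, grace period and free. */
	*cur = trial;

	/* Subscribers see the new mask only after it is published. */
	for (i = 0; i < ncbs; i++) {
		if (cbs[i].apply)
			cbs[i].apply(trial);
	}
	return 0;
}
```

A rejected update leaves the current mask untouched, because both the
online check and the pre_validate pass run before the mask is
replaced.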

The existing housekeeping_update() continues to update only
HK_TYPE_DOMAIN and remains the entry point for the cpuset partition
path.  housekeeping_update_types() enables the partition path to also
drive the kernel-noise types (HK_TYPE_KERNEL_NOISE,
HK_TYPE_MANAGED_IRQ) through the explicit callback interface added in
the previous patch.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 include/linux/sched/isolation.h |   4 ++
 kernel/sched/isolation.c        | 112 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 116 insertions(+)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index f362876b3ebdf..eecbcbe802bd0 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -44,6 +44,8 @@ extern bool housekeeping_enabled(enum hk_type type);
 extern void housekeeping_affine(struct task_struct *t, enum hk_type type);
 extern bool housekeeping_test_cpu(int cpu, enum hk_type type);
 extern int housekeeping_update(struct cpumask *isol_mask);
+extern int housekeeping_update_types(unsigned long type_mask,
+				     struct cpumask *isol_mask);
 extern void __init housekeeping_init(void);
 
 /**
@@ -99,6 +101,8 @@ static inline bool housekeeping_test_cpu(int cpu, enum hk_type type)
 }
 
 static inline int housekeeping_update(struct cpumask *isol_mask) { return 0; }
+static inline int housekeeping_update_types(unsigned long type_mask,
+					    struct cpumask *isol_mask) { return 0; }
 static inline void housekeeping_init(void) { }
 static inline int housekeeping_register_cbs(enum hk_type type,
 					    struct housekeeping_cbs *cbs) { return 0; }
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index aae4dff7fbfc8..4eca18cc5e8ce 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -249,6 +249,118 @@ int housekeeping_update(struct cpumask *isol_mask)
 	return 0;
 }
 
+/**
+ * housekeeping_update_types - Update housekeeping masks for specified types
+ * @type_mask: Bitmask of housekeeping types to update
+ * @isol_mask: CPUs being added to the isolation set
+ *
+ * For each type in @type_mask that was enabled at boot, compute the
+ * trial mask as (boot mask & ~@isol_mask), validate it against
+ * cpu_online_mask, invoke pre_validate() callbacks, swap the RCU
+ * mask pointer, and run apply() callbacks after synchronize_rcu().
+ *
+ * HK_TYPE_KERNEL_NOISE also supports runtime first-enable: when an
+ * isolated cpuset partition is created without nohz_full= at boot,
+ * cpu_possible_mask is used as the initial base and the type flag is
+ * set in housekeeping.flags on the first call.
+ *
+ * Return: 0 on success, -ENOMEM on allocation failure, -EINVAL if
+ * a trial mask has no online CPUs.
+ */
+int housekeeping_update_types(unsigned long type_mask,
+			      struct cpumask *isol_mask)
+{
+	struct cpumask *trials[HK_TYPE_MAX] = {};
+	struct cpumask *old_masks[HK_TYPE_MAX] = {};
+	enum hk_type type;
+	int ret = 0;
+
+	for_each_set_bit(type, &type_mask, HK_TYPE_MAX) {
+		const struct cpumask *base;
+
+		if (type == HK_TYPE_DOMAIN_BOOT)
+			continue;
+		if (!housekeeping_enabled(type)) {
+			/*
+			 * HK_TYPE_KERNEL_NOISE supports runtime first-enable
+			 * for DHM isolated partitions created without nohz_full=
+			 * at boot.  All other types must be boot-enabled.
+			 */
+			if (type != HK_TYPE_KERNEL_NOISE)
+				continue;
+		}
+
+		/*
+		 * HK_TYPE_KERNEL_NOISE always uses cpu_possible_mask as its
+		 * base.  Its semantics are exactly "cpu_possible minus the
+		 * currently-isolated set", so the base never shrinks across
+		 * successive isolation/de-isolation cycles.  If we used the
+		 * current HK mask instead, de-isolating all partitions would
+		 * leave the mask at its last non-trivial value rather than
+		 * reverting to cpu_possible, breaking subsequent isolations.
+		 */
+		if (type == HK_TYPE_KERNEL_NOISE)
+			base = cpu_possible_mask;
+		else
+			base = housekeeping_cpumask(type);
+		trials[type] = kmalloc(cpumask_size(), GFP_KERNEL);
+		if (!trials[type]) {
+			ret = -ENOMEM;
+			goto err_free;
+		}
+		cpumask_andnot(trials[type], base, isol_mask);
+		if (!cpumask_intersects(trials[type], cpu_online_mask)) {
+			ret = -EINVAL;
+			goto err_free;
+		}
+	}
+
+	if (!housekeeping.flags) {
+		ret = -EINVAL;
+		goto err_free;
+	}
+
+	for_each_set_bit(type, &type_mask, HK_TYPE_MAX) {
+		if (!trials[type])
+			continue;
+		ret = housekeeping_pre_validate_cbs(type,
+						    housekeeping_cpumask(type),
+						    trials[type]);
+		if (ret < 0)
+			goto err_free;
+	}
+
+	for_each_set_bit(type, &type_mask, HK_TYPE_MAX) {
+		if (!trials[type])
+			continue;
+		old_masks[type] = housekeeping_cpumask_dereference(type);
+		/* First-time runtime enable: register the type now. */
+		if (!housekeeping_enabled(type))
+			WRITE_ONCE(housekeeping.flags,
+				   housekeeping.flags | BIT(type));
+		rcu_assign_pointer(housekeeping.cpumasks[type], trials[type]);
+		trials[type] = NULL;
+	}
+
+	synchronize_rcu();
+
+	for_each_set_bit(type, &type_mask, HK_TYPE_MAX) {
+		if (housekeeping_cbs_table[type].nr == 0)
+			continue;
+		housekeeping_apply_cbs(type);
+	}
+
+	for_each_set_bit(type, &type_mask, HK_TYPE_MAX)
+		kfree(old_masks[type]);
+
+	return 0;
+
+err_free:
+	for_each_set_bit(type, &type_mask, HK_TYPE_MAX)
+		kfree(trials[type]);
+	return ret;
+}
+
 void __init housekeeping_init(void)
 {
 	enum hk_type type;

-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 01/13] sched/isolation: Replace notifier chain with explicit callback interface
From: Jing Wu @ 2026-06-18  3:11 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan
In-Reply-To: <20260618-wujing-dhm-v3-0-28f1a4d83b68@gmail.com>

Replace the blocking notifier chain with an explicit per-type callback
table (struct housekeeping_cbs).  Each subsystem registers callbacks
at initcall time; pre_validate() runs before the RCU pointer swap to
allow rejecting the update, and apply() runs after synchronize_rcu()
when the new mask is visible to readers.

The table is limited to HK_MAX_CBS (4) slots per type, sufficient for
the kernel-noise subsystems and avoiding unbounded dynamic allocation
in the update path.  The interface provides deterministic callback
order and explicit registration, giving each subsystem maintainer clear
visibility into when and why its callback is invoked — unlike the
opaque priority-based dispatch of notifier chains.

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
 include/linux/sched/isolation.h | 31 +++++++++++++++
 kernel/sched/isolation.c        | 87 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index cf0fd03dd7a24..f362876b3ebdf 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -46,6 +46,33 @@ extern bool housekeeping_test_cpu(int cpu, enum hk_type type);
 extern int housekeeping_update(struct cpumask *isol_mask);
 extern void __init housekeeping_init(void);
 
+/**
+ * struct housekeeping_cbs - Per-subsystem callbacks for housekeeping mask changes
+ * @name:		Subsystem name for diagnostic messages
+ * @pre_validate:	Run before RCU pointer swap.  Return -EINVAL
+ *			to reject the update.
+ * @apply:		Run after synchronize_rcu().  Reconfigure subsystem
+ *			state.  The new mask is visible to readers.
+ *
+ * Register subsystem callbacks at initcall time.
+ * Invoke callbacks in registration order when the corresponding
+ * housekeeping mask changes.  Skip types not present in the update
+ * mask.
+ *
+ * Replace the notifier-chain pattern with deterministic callback
+ * ordering.
+ */
+struct housekeeping_cbs {
+	const char			*name;
+	int	(*pre_validate)(enum hk_type type,
+				const struct cpumask *cur_mask,
+				const struct cpumask *new_mask);
+	void	(*apply)(enum hk_type type);
+};
+
+int housekeeping_register_cbs(enum hk_type type, struct housekeeping_cbs *cbs);
+int housekeeping_unregister_cbs(enum hk_type type, struct housekeeping_cbs *cbs);
+
 #else
 
 static inline int housekeeping_any_cpu(enum hk_type type)
@@ -73,6 +100,10 @@ static inline bool housekeeping_test_cpu(int cpu, enum hk_type type)
 
 static inline int housekeeping_update(struct cpumask *isol_mask) { return 0; }
 static inline void housekeeping_init(void) { }
+static inline int housekeeping_register_cbs(enum hk_type type,
+					    struct housekeeping_cbs *cbs) { return 0; }
+static inline int housekeeping_unregister_cbs(enum hk_type type,
+					      struct housekeeping_cbs *cbs) { return 0; }
 #endif /* CONFIG_CPU_ISOLATION */
 
 static inline bool housekeeping_cpu(int cpu, enum hk_type type)
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index ef152d401fe20..aae4dff7fbfc8 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -28,6 +28,93 @@ struct housekeeping {
 
 static struct housekeeping housekeeping;
 
+/*
+ * Maintain an explicit callback table indexed by housekeeping type.
+ * Invoke callbacks for affected types in deterministic order:
+ * pre_validate() before the RCU pointer swap, apply() after
+ * synchronize_rcu().
+ */
+#define HK_MAX_CBS 4
+
+static struct {
+	struct housekeeping_cbs *cbs[HK_MAX_CBS];
+	int nr;
+} housekeeping_cbs_table[HK_TYPE_MAX];
+
+/**
+ * housekeeping_register_cbs - Register explicit callbacks for a housekeeping type
+ * @type:	Housekeeping type to register for
+ * @cbs:	Callback structure containing pre_validate() and apply()
+ *
+ * Callbacks run in registration order when the mask for @type changes:
+ * pre_validate() before the RCU swap may reject the update; apply()
+ * after synchronize_rcu() reconfigures subsystem state.
+ *
+ * Return: 0 on success, -EINVAL if @type or @cbs is invalid,
+ * -ENOSPC if the per-type table is full.
+ */
+int housekeeping_register_cbs(enum hk_type type, struct housekeeping_cbs *cbs)
+{
+	if (type >= HK_TYPE_MAX || !cbs)
+		return -EINVAL;
+	if (housekeeping_cbs_table[type].nr >= HK_MAX_CBS)
+		return -ENOSPC;
+	housekeeping_cbs_table[type].cbs[housekeeping_cbs_table[type].nr++] = cbs;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(housekeeping_register_cbs);
+
+/**
+ * housekeeping_unregister_cbs - Remove previously registered callbacks
+ * @type:	Housekeeping type
+ * @cbs:	Callback structure to remove
+ *
+ * Return: 0 on success, -EINVAL if arguments are invalid,
+ * -ENOENT if @cbs was not registered.
+ */
+int housekeeping_unregister_cbs(enum hk_type type, struct housekeeping_cbs *cbs)
+{
+	int i;
+
+	if (type >= HK_TYPE_MAX || !cbs)
+		return -EINVAL;
+	for (i = 0; i < housekeeping_cbs_table[type].nr; i++) {
+		if (housekeeping_cbs_table[type].cbs[i] == cbs) {
+			housekeeping_cbs_table[type].cbs[i] =
+				housekeeping_cbs_table[type].cbs[--housekeeping_cbs_table[type].nr];
+			return 0;
+		}
+	}
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(housekeeping_unregister_cbs);
+
+static int housekeeping_pre_validate_cbs(enum hk_type type,
+					 const struct cpumask *cur,
+					 const struct cpumask *new)
+{
+	int i, ret;
+
+	for (i = 0; i < housekeeping_cbs_table[type].nr; i++) {
+		if (!housekeeping_cbs_table[type].cbs[i]->pre_validate)
+			continue;
+		ret = housekeeping_cbs_table[type].cbs[i]->pre_validate(type, cur, new);
+		if (ret < 0)
+			return ret;
+	}
+	return 0;
+}
+
+static void housekeeping_apply_cbs(enum hk_type type)
+{
+	int i;
+
+	for (i = 0; i < housekeeping_cbs_table[type].nr; i++) {
+		if (housekeeping_cbs_table[type].cbs[i]->apply)
+			housekeeping_cbs_table[type].cbs[i]->apply(type);
+	}
+}
+
 bool housekeeping_enabled(enum hk_type type)
 {
 	return !!(READ_ONCE(housekeeping.flags) & BIT(type));

-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 00/13] Dynamic Housekeeping Management (DHM) via CPUSets
From: Jing Wu @ 2026-06-18  3:11 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Tejun Heo, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Thomas Gleixner
  Cc: linux-kernel, rcu, cgroups, linux-doc, linux-kselftest, Jing Wu,
	Qiliang Yuan

This series introduces Dynamic Housekeeping Management (DHM) to the Linux
kernel, enabling runtime reconfiguration of kernel-noise housekeeping
(nohz_full tick suppression, RCU NOCB offloading, and managed IRQ
migration) through the existing cgroup v2 cpuset isolated partition
mechanism — no new kernel ABI required.

When a cpuset partition is set to isolated mode, the CPUs in that
partition are removed from the kernel's global housekeeping masks.  The
housekeeping subsystems (tick/nohz, RCU NOCB, genirq) react via explicit
registered callbacks, applying the new masks at runtime.  Destroying the
partition restores the CPUs to all housekeeping masks.

The architecture uses a per-type callback table (struct housekeeping_cbs)
with pre_validate/apply hooks, replacing the previous notifier chain.
Housekeeping cpumask pointers are RCU-protected to allow lock-free readers
during updates.
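
As a rough illustration of the pre_validate/apply flow described above, here
is a minimal userspace sketch. It is not the code from this series: the names
hk_cbs, hk_type_state, hk_register() and hk_update() are hypothetical, a plain
64-bit bitmap stands in for the RCU-protected cpumask pointer, and apply()
receives the new mask directly, whereas the series' apply() takes only the
housekeeping type.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define HK_MAX_CBS 4

/* Hypothetical userspace stand-in for struct housekeeping_cbs. */
struct hk_cbs {
	const char *name;
	int (*pre_validate)(uint64_t cur_mask, uint64_t new_mask);
	void (*apply)(uint64_t new_mask);
};

/* One housekeeping type: its callback slots and its current mask. */
struct hk_type_state {
	struct hk_cbs *cbs[HK_MAX_CBS];
	int nr;
	uint64_t mask;	/* models the RCU-protected cpumask pointer */
};

static int hk_register(struct hk_type_state *st, struct hk_cbs *cbs)
{
	if (!cbs)
		return -EINVAL;
	if (st->nr >= HK_MAX_CBS)
		return -ENOSPC;
	st->cbs[st->nr++] = cbs;
	return 0;
}

static int hk_update(struct hk_type_state *st, uint64_t new_mask)
{
	int i, ret;

	/* Phase 1: any callback may veto before the new mask is published. */
	for (i = 0; i < st->nr; i++) {
		if (!st->cbs[i]->pre_validate)
			continue;
		ret = st->cbs[i]->pre_validate(st->mask, new_mask);
		if (ret < 0)
			return ret;
	}

	/* Publish. The kernel side would swap an RCU pointer and wait here. */
	st->mask = new_mask;

	/* Phase 2: callbacks reconfigure themselves, in registration order. */
	for (i = 0; i < st->nr; i++)
		if (st->cbs[i]->apply)
			st->cbs[i]->apply(st->mask);
	return 0;
}

/* Example subscriber: rejects an empty mask, records what was applied. */
static uint64_t demo_applied_mask;

static int demo_pre_validate(uint64_t cur_mask, uint64_t new_mask)
{
	(void)cur_mask;
	return new_mask ? 0 : -EINVAL;
}

static void demo_apply(uint64_t new_mask)
{
	demo_applied_mask = new_mask;
}
```

In this model a rejected update leaves the published mask untouched, because
the veto happens before the swap. That is the property the two-phase split is
meant to provide.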

Signed-off-by: Jing Wu <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
---
V2 -> V3:
- Replace notifier chain with explicit per-type callback interface
  (struct housekeeping_cbs with .name, .pre_validate, .apply fields).
- RCU-protect all housekeeping cpumask pointers; callers must hold
  rcu_read_lock() or use housekeeping_cpumask_rcu() in apply() callbacks.
- Drop 5 patches from v2: HK_TYPE enum separation (upstream aliases are
  already correct), no-op timer/hrtimer patches, kthread dead code, and
  workqueue double-update.
- Fix deadlock in rcu_hk_workfn(): remove cpus_read_lock() wrapper around
  remove_cpu()/add_cpu() which take cpu_hotplug_lock write side.
- Fix UAF in rcu_hk_apply(): snapshot the housekeeping cpumask inside the
  work function under rcu_read_lock(), not at apply() time where the old
  pointer may be freed by synchronize_rcu() before the work runs.
- Fix tick apply(): snapshot housekeeping_cpumask_rcu() under
  rcu_read_lock() as required by lockdep for runtime-mutable types.
- Activate context_tracking dynamically via ct_cpu_track_user() /
  ct_cpu_untrack_user() in tick apply(), eliminating the dependency on
  CONFIG_CONTEXT_TRACKING_USER_FORCE flagged by tglx.
- Fix genirq apply(): snapshot HK_TYPE_MANAGED_IRQ mask under
  rcu_read_lock() before the IRQ iteration loop.
- Simplify cpuset noise_types to BIT(HK_TYPE_KERNEL_NOISE) |
  BIT(HK_TYPE_MANAGED_IRQ), replacing the redundant per-alias bitmask.
- housekeeping_update_types(): always use cpu_possible_mask as base
  for HK_TYPE_KERNEL_NOISE, so de-isolation restores the mask to all
  possible CPUs rather than leaving it at its last non-trivial value.
- Initialize watchdog_cpumask from HK_TYPE_KERNEL_NOISE (not
  HK_TYPE_TIMER) at boot; keep it in sync at runtime via a new
  housekeeping_cbs callback.
- Add kernel-noise selftest to test_cpuset_prs.sh, including
  cpu_in_cpulist() for correct cpulist range membership detection and
  nohz_full sysfs verification when CONFIG_NO_HZ_FULL is active.
- Add RCU caller fixes: sched/core (HK_TYPE_KERNEL_NOISE) and
  drivers/hv (HK_TYPE_MANAGED_IRQ) are required because those types
  are updated at runtime; hrtimer (HK_TYPE_TIMER) and arm64/topology
  (HK_TYPE_TICK) are defensive fixes.
- Reorder patches so all subsystem callbacks are registered before the
  cpuset patch that triggers housekeeping_update_types().
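
The housekeeping_update_types() entry in the list above can be illustrated
with a small userspace sketch. It assumes one bit per CPU in a 64-bit word
instead of struct cpumask, and hk_trial_mask() is a hypothetical helper name,
not a function from this series. It computes the trial mask as
(base & ~isol) and rejects it when no online CPU would remain.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Userspace model, one bit per CPU: compute trial = base & ~isol and
 * reject the result if it contains no online CPU.
 */
static int hk_trial_mask(uint64_t base, uint64_t isol, uint64_t online,
			 uint64_t *trial)
{
	uint64_t t = base & ~isol;

	if (!(t & online))
		return -EINVAL;
	*trial = t;
	return 0;
}
```

With four possible CPUs (0xf), isolating CPUs 2-3 (0xc) gives 0x3. A later
de-isolation (isol = 0) returns to 0xf when the base is the possible mask,
but stays at 0x3 if the current mask is reused as the base. That difference
is the reason given above for using cpu_possible_mask.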

V1 -> V2:
- Rebrand series from DHEI to DHM (Dynamic Housekeeping Management).
- Drop custom sysfs interface entirely.
- Integrate housekeeping control into cgroup v2 cpuset isolated partition
  mechanism.
- Add SMT-aware isolation constraints to prevent splitting SMT siblings.
- Add comprehensive documentation and cgroup functional selftests.
- Refactor mask transition logic to use RCU-safe handover.

v2: https://lore.kernel.org/r/20260413-wujing-dhm-v2-0-06df21caba5d@gmail.com
v1: https://lore.kernel.org/all/20260325-dhei-v12-final-v1-0-919cca23cadf@gmail.com

---
Jing Wu (13):
      sched/isolation: Replace notifier chain with explicit callback interface
      sched/isolation: Add housekeeping_update_types() for kernel-noise masks
      sched/isolation: RCU-protect all housekeeping cpumask readers
      sched/isolation: Fix RCU protection for runtime-mutable cpumask callers
      cpu/hotplug: Reserve CPUHP states for nohz_full and managed IRQ down-paths
      tick/nohz, context_tracking: Prepare for runtime nohz_full updates
      rcu/nocb: Add explicit housekeeping callback for runtime NOCB toggling
      genirq: Add explicit housekeeping callback for managed IRQ migration
      watchdog/lockup_detector: Register housekeeping callback for kernel-noise
      sched: Guard sched_tick_start/stop against uninitialized tick_work_cpu
      cgroup/cpuset: Extend isolated partition to trigger kernel-noise isolation
      docs: cgroup-v2: Document kernel-noise isolation via isolated partitions
      selftests/cgroup: Add kernel-noise isolation test to cpuset selftest

 Documentation/admin-guide/cgroup-v2.rst           |   8 +
 arch/arm64/kernel/topology.c                      |   9 +-
 drivers/hv/channel_mgmt.c                         |  50 +++--
 include/linux/context_tracking.h                  |   1 +
 include/linux/cpuhotplug.h                        |   2 +
 include/linux/sched/isolation.h                   |  41 ++++
 kernel/cgroup/cpuset.c                            |  23 +-
 kernel/context_tracking.c                         |  23 +-
 kernel/irq/manage.c                               |  86 ++++++++
 kernel/rcu/tree.c                                 | 104 +++++++++
 kernel/sched/core.c                               |   7 +-
 kernel/sched/isolation.c                          | 256 ++++++++++++++++++++--
 kernel/time/hrtimer.c                             |   5 +-
 kernel/time/tick-sched.c                          | 157 ++++++++++++-
 kernel/watchdog.c                                 |  56 ++++-
 tools/testing/selftests/cgroup/test_cpuset_prs.sh | 204 ++++++++++++++++-
 16 files changed, 968 insertions(+), 64 deletions(-)
---
base-commit: eb3f4b7426cfd2b79d65b7d37155480b32259a11
change-id: 20260408-wujing-dhm-8f43e2d49cd8

Best regards,
-- 
Jing Wu <realwujing@gmail.com>


^ permalink raw reply

* Re: [PATCH 2/3] irqchip/gic-v3: Add Renesas R-Car Gen4 erratum workaround
From: Marek Vasut @ 2026-06-18  2:50 UTC (permalink / raw)
  To: Marc Zyngier, Marek Vasut
  Cc: linux-pci, Yoshihiro Shimoda, Krzysztof Wilczyński,
	Bjorn Helgaas, Catalin Marinas, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Lorenzo Pieralisi, Manivannan Sadhasivam,
	Rob Herring, devicetree, linux-arm-kernel, linux-doc,
	linux-kernel, linux-renesas-soc
In-Reply-To: <864ij1tyrj.wl-maz@kernel.org>

On 6/17/26 9:24 AM, Marc Zyngier wrote:

Hello Marc,

>> Renesas R-Car S4/V4H/V4M GIC600 integration has address width for AXI
>> or APB interface configured to 32 bit, it can therefore access only
>> the first 4 GiB of physical address space. This information comes from
>> R-Car V4H Interface Specification sheet, there is currently no technical
>> update number assigned to this limitation. Further input from hardware
>> engineer indicates that this limitation also applies to R-Car S4 and V4M.
>> Name the limitation GEN4GICITS1, and add a driver quirk to mitigate this
>> limitation.

My concern is this ^ : I do not have an erratum number, because there
isn't one. I am in touch with the hardware engineer and I did get a
glimpse of the internal details of the three SoCs, which confirm the
limitation. Is this sufficient?

>> Note that the 0x0201743b GIC600 ID is not Renesas-specific, it is
>> common for many ARM GICv3 implementations. Therefore, add an extra
> 
> Not quite. It designates GIC600 unambiguously.

What I am trying to communicate is that 0x0201743b is not the ID of the
Renesas GIC implementation, but the generic ARM GIC600 ID. That is why
we cannot match the quirk on the ID alone, and instead have to match on
the ID combined with of_machine_is_compatible("renesas,...").
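
To illustrate the combined match, here is a minimal userspace sketch. It is
an assumption-laden model, not driver code: machine_matches() is a
hypothetical stand-in for of_machine_compatible_match(), it compares a single
string rather than walking the root node's compatible list, and
quirk_applies() is an invented name. The compatible strings are the three
from the patch under discussion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GIC600_IIDR 0x0201743bu	/* generic ARM GIC600 ID, not vendor-specific */

/* SoCs named in the patch under discussion. */
static const char * const impaired_platforms[] = {
	"renesas,r8a779f0",
	"renesas,r8a779g0",
	"renesas,r8a779h0",
	NULL,
};

/* Hypothetical stand-in for of_machine_compatible_match(). */
static bool machine_matches(const char *machine, const char * const *list)
{
	for (; *list; list++)
		if (!strcmp(machine, *list))
			return true;
	return false;
}

/* The quirk applies only when both the ID and the platform match. */
static bool quirk_applies(uint32_t iidr, const char *machine)
{
	if (iidr != GIC600_IIDR)
		return false;
	return machine_matches(machine, impaired_platforms);
}
```

A GIC600 on any platform outside the list is left alone, which is the
behaviour needed since most GIC600 integrations do not have the 32-bit limit.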

> It is just that GIC600
> is integrated in zillions of SoCs, most of which don't have this
> problem (the machine I'm typing this from has a GIC600 *and* 96GB of
> RAM).

Right.

Shall I reword this paragraph somehow to make it clearer?

>> of_machine_is_compatible() check.
>>
>> The GIC600 implementation in R-Car S4/V4H/V4M is r1p6.
> 
> Is this relevant?

I included it for the sake of completeness and to provide all relevant 
information, based on previous discussions about similar limitations 
that I could find on lore.k.o

[...]

>> +#ifdef CONFIG_RENESAS_ERRATUM_GEN4GICITS1
>> +	{
>> +		.desc   = "ITS: Renesas R-Car Gen4 GIC600 32-bit limit",
>> +		.iidr   = 0x0201743b,
>> +		.mask   = 0xffffffff,
>> +		.init   = its_enable_renesas_gen4,
>> +	},
>>   #endif
>>   	{
>>   	}
> 
> 
> Honestly, that's a bit too much copy-paste for my taste. Just refactor
> the erratum handling to be more generic, something like this:
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 291d7668cc8da..380c4758647d2 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -4894,10 +4894,17 @@ static bool __maybe_unused its_enable_quirk_hip09_162100801(void *data)
>   	return true;
>   }
>   
> -static bool __maybe_unused its_enable_rk3568002(void *data)
> +static const char * const dma_impaired_platforms[] = {
> +#ifdef CONFIG_ROCKCHIP_ERRATUM_3568002
> +	"rockchip,rk3566",
> +	"rockchip,rk3568",
> +#endif
> +	NULL,
> +};
> +
> +static bool __maybe_unused its_enable_dma32(void *data)
>   {
> -	if (!of_machine_is_compatible("rockchip,rk3566") &&
> -	    !of_machine_is_compatible("rockchip,rk3568"))
> +	if (!of_machine_compatible_match(dma_impaired_platforms))
>   		return false;
>   
>   	gfp_flags_quirk |= GFP_DMA32;
> @@ -4972,14 +4979,12 @@ static const struct gic_quirk its_quirks[] = {
>   		.property = "dma-noncoherent",
>   		.init   = its_set_non_coherent,
>   	},
> -#ifdef CONFIG_ROCKCHIP_ERRATUM_3568002
>   	{
> -		.desc   = "ITS: Rockchip erratum RK3568002",
> +		.desc   = "ITS: Broken GIC600 integration limited to 32bit PA",
>   		.iidr   = 0x0201743b,
>   		.mask   = 0xffffffff,
> -		.init   = its_enable_rk3568002,
> +		.init   = its_enable_dma32,
>   	},
> -#endif
>   	{
>   	}
>   };
> 
> Then add the two lines you need in a separate patch.

Will do in V2.

> In the future, please provide a cover letter when you have more than a
> single patch (git will happily generate one for you).

OK

^ permalink raw reply

* Re: [PATCH 2/3] irqchip/gic-v3: Add Renesas R-Car Gen4 erratum workaround
From: Marek Vasut @ 2026-06-18  2:38 UTC (permalink / raw)
  To: Geert Uytterhoeven, Marek Vasut
  Cc: linux-pci, Yoshihiro Shimoda, Krzysztof Wilczyński,
	Bjorn Helgaas, Catalin Marinas, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Lorenzo Pieralisi, Manivannan Sadhasivam,
	Marc Zyngier, Rob Herring, devicetree, linux-arm-kernel,
	linux-doc, linux-kernel, linux-renesas-soc
In-Reply-To: <CAMuHMdX7XuHQDSsX4P7NZ46_OnCX2o25szuALwSs2z+PHq+JNg@mail.gmail.com>

On 6/17/26 9:09 AM, Geert Uytterhoeven wrote:

Hello Geert,

>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -4901,6 +4901,18 @@ static bool __maybe_unused its_enable_rk3568002(void *data)
>>          return true;
>>   }
>>
>> +static bool __maybe_unused its_enable_renesas_gen4(void *data)
>> +{
>> +       if (!of_machine_is_compatible("renesas,r8a779f0") &&
>> +           !of_machine_is_compatible("renesas,r8a779g0") &&
>> +           !of_machine_is_compatible("renesas,r8a779h0"))
> 
> of_machine_compatible_match() with an array of strings might generate
> smaller code (I didn't check if 3 entries is enough to trip the balance).

Let me handle that as part of suggestion from Marc.

^ permalink raw reply

* Re: [PATCH 1/3] PCI: rcar-gen4: Configure AXIINTC if iMSI-RX not used
From: Marek Vasut @ 2026-06-18  2:21 UTC (permalink / raw)
  To: Geert Uytterhoeven, Marek Vasut
  Cc: linux-pci, Yoshihiro Shimoda, Krzysztof Wilczyński,
	Bjorn Helgaas, Catalin Marinas, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Lorenzo Pieralisi, Manivannan Sadhasivam,
	Marc Zyngier, Rob Herring, devicetree, linux-arm-kernel,
	linux-doc, linux-kernel, linux-renesas-soc
In-Reply-To: <CAMuHMdU0SJ0q2hcpu+qZCH3eZ5eFDyo8Z964h9DhuSaQ7QdHSg@mail.gmail.com>

On 6/17/26 10:26 AM, Geert Uytterhoeven wrote:

Hello Geert,

>> +static void rcar_gen4_pcie_host_msi_init(struct dw_pcie_rp *pp)
>> +{
>> +       struct dw_pcie *dw = to_dw_pcie_from_pp(pp);
>> +       struct rcar_gen4_pcie *rcar = to_rcar_gen4_pcie(dw);
>> +       u32 val;
>> +
>> +       /* Make sure MSICAP0 MSIE is configured. */
>> +       val = dw_pcie_readl_dbi(dw, MSICAP0);
>> +       if (pci_msi_enabled())
>> +               val |= MSICAP0_MSIE;
>> +       else
>> +               val &= ~MSICAP0_MSIE;
>> +       dw_pcie_writel_dbi(dw, MSICAP0, val);
>> +
>> +       if (!pci_msi_enabled() || pp->use_imsi_rx) {
>> +               /* Clear AXIINTC mapping. */
>> +               writel(0, rcar->base + AXIINTCADDR);
>> +               writel(0, rcar->base + AXIINTCCONT);
>> +       } else {
>> +               /* Point AXIINTC to GIC ITS and enable. */
>> +               writel(AXIINTCADDR_VAL, rcar->base + AXIINTCADDR);
>> +               writel(INTC_EN | INTC_MASK, rcar->base + AXIINTCCONT);
>> +       }
>> +
>> +       /* Configure MSI interrupt signal */
>> +       val = readl(rcar->base + PCIEINTSTS0EN);
>> +       if (pci_msi_enabled())
>> +               val |= MSI_CTRL_INT;
>> +       else
>> +               val &= ~MSI_CTRL_INT;
>> +       writel(val, rcar->base + PCIEINTSTS0EN);
>> +}
>> +
>>   static int rcar_gen4_pcie_enable_device(struct pci_host_bridge *bridge,
> 
> FTR, this has a contextual dependency on "[PATCH v2] PCI: rcar-gen4:
> Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes"
> (https://lore.kernel.org/all/20260519195219.189323-1-marek.vasut+renesas@mailbox.org).

It is not an explicit dependency. I only had both patches in my tree,
and they clearly interacted. I'll rebase this dependency out for V2.

Thanks!

-- 
Best regards,
Marek Vasut

^ permalink raw reply

* Re: [PATCH v16 06/10] riscv: kexec_file: Use crash_prepare_headers() helper to simplify code
From: Jinjie Ruan @ 2026-06-18  1:54 UTC (permalink / raw)
  To: corbet, skhan, catalin.marinas, will, chenhuacai, kernel, maddy,
	mpe, npiggin, chleroy, pjw, palmer, aou, alex, tglx, mingo, bp,
	dave.hansen, hpa, robh, saravanak, akpm, bhe, rppt,
	pasha.tatashin, pratyush, ruirui.yang, rdunlap, peterz, feng.tang,
	dapeng1.mi, kees, elver, kuba, lirongqing, ebiggers, paulmck,
	leitao, coxu, Liam.Howlett, ryan.roberts, osandov, jbohac,
	cfsworks, tangyouling, sourabhjain, ritesh.list, adityag,
	liaoyuanhong, seanjc, fuqiang.wang, ardb, chenjiahao16, guoren,
	x86, linux-doc, linux-kernel, linux-arm-kernel, loongarch,
	linuxppc-dev, linux-riscv, devicetree, kexec
In-Reply-To: <20260608073459.3119290-7-ruanjinjie@huawei.com>



On 6/8/2026 3:34 PM, Jinjie Ruan wrote:
> Use the newly introduced crash_prepare_headers() function to replace
> the existing prepare_elf_headers(), so that cmem allocation and crash
> kernel memory exclusion are handled in the crash core, which reduces
> code duplication.
> 
> Only the following two architecture functions need to be implemented:
> - arch_get_system_nr_ranges(). Call get_nr_ram_ranges_callback()
>   to pre-count the max number of memory ranges.
> 
> - arch_crash_populate_cmem(). Use prepare_elf64_ram_headers_callback()
>   to collect the memory ranges and fill them into cmem.

Hi Paul, Palmer, Albert, Alexandre and RISC-V maintainers,

Sorry for the interruption.

This patch set aims to clean up and refactor the crash memory allocation
and the exclusion logic of crashk_res, crashk_low_res, and crashk_cma.
Currently, these routines are almost identical across different
architectures, leading to a lot of duplicated code.

This series consolidates the logic into the generic crash core, removing
redundant implementations from architecture-specific directories,
including arch/riscv.
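
For reviewers unfamiliar with the split, the count-then-populate pattern
behind the two architecture hooks can be sketched in userspace as follows.
All names here (mem_ranges, demo_ram, demo_get_nr_ranges(), demo_populate(),
demo_prepare()) are hypothetical, a static array stands in for the System RAM
resource walk, and the exclusion and ELF header steps done by the generic
code are left out.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct range {
	uint64_t start, end;
};

/* Hypothetical model of struct crash_mem: a header plus a flexible array. */
struct mem_ranges {
	unsigned int max_nr_ranges;
	unsigned int nr_ranges;
	struct range ranges[];
};

/* Stand-in for the System RAM resources the kernel would walk. */
static const struct range demo_ram[] = {
	{ 0x80000000, 0x8fffffff },
	{ 0xa0000000, 0xbfffffff },
};
#define DEMO_NR_RAM (sizeof(demo_ram) / sizeof(demo_ram[0]))

/* Hook 1: report an upper bound, with headroom for later exclusions. */
static unsigned int demo_get_nr_ranges(void)
{
	return 2 + DEMO_NR_RAM;
}

/* Hook 2: fill the pre-sized array with the RAM ranges. */
static int demo_populate(struct mem_ranges *m)
{
	for (size_t i = 0; i < DEMO_NR_RAM; i++) {
		if (m->nr_ranges >= m->max_nr_ranges)
			return -ENOMEM;
		m->ranges[m->nr_ranges++] = demo_ram[i];
	}
	return 0;
}

/* Generic side: size the allocation from hook 1, then call hook 2. */
static struct mem_ranges *demo_prepare(void)
{
	unsigned int nr = demo_get_nr_ranges();
	struct mem_ranges *m;

	m = calloc(1, sizeof(*m) + nr * sizeof(m->ranges[0]));
	if (!m)
		return NULL;
	m->max_nr_ranges = nr;
	if (demo_populate(m)) {
		free(m);
		return NULL;
	}
	return m;
}
```

The architecture only answers "how many ranges at most" and "which ranges".
Allocation, exclusion and cleanup stay in one generic place.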

There are no functional changes intended for RISC-V.

The patches will be queued in the liveupdate tree for wider testing.
Could you please take a look at the RISC-V side and consider providing
an Acked-by?

The patch series can be reviewed here:

https://lore.kernel.org/all/20260608073459.3119290-1-ruanjinjie@huawei.com/

Thank you very much for your time and review!

Best regards,
Jinjie

> 
> Cc: Paul Walmsley <pjw@kernel.org>
> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> Cc: Albert Ou <aou@eecs.berkeley.edu>
> Cc: Alexandre Ghiti <alex@ghiti.fr>
> Cc: Guo Ren <guoren@kernel.org>
> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> Acked-by: Baoquan He <bhe@redhat.com>
> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
>  arch/riscv/kernel/machine_kexec_file.c | 47 +++++++-------------------
>  1 file changed, 12 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/riscv/kernel/machine_kexec_file.c b/arch/riscv/kernel/machine_kexec_file.c
> index 3f7766057cac..439cbc50dfa6 100644
> --- a/arch/riscv/kernel/machine_kexec_file.c
> +++ b/arch/riscv/kernel/machine_kexec_file.c
> @@ -44,6 +44,15 @@ static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
>  	return 0;
>  }
>  
> +unsigned int arch_get_system_nr_ranges(void)
> +{
> +	unsigned int nr_ranges = 2; /* For exclusion of crashkernel region */
> +
> +	walk_system_ram_res(0, -1, &nr_ranges, get_nr_ram_ranges_callback);
> +
> +	return nr_ranges;
> +}
> +
>  static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
>  {
>  	struct crash_mem *cmem = arg;
> @@ -55,41 +64,9 @@ static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
>  	return 0;
>  }
>  
> -static int prepare_elf_headers(void **addr, unsigned long *sz)
> +int arch_crash_populate_cmem(struct crash_mem *cmem)
>  {
> -	struct crash_mem *cmem;
> -	unsigned int nr_ranges;
> -	int ret;
> -
> -	nr_ranges = 2; /* For exclusion of crashkernel region */
> -	walk_system_ram_res(0, -1, &nr_ranges, get_nr_ram_ranges_callback);
> -
> -	cmem = kmalloc_flex(*cmem, ranges, nr_ranges);
> -	if (!cmem)
> -		return -ENOMEM;
> -
> -	cmem->max_nr_ranges = nr_ranges;
> -	cmem->nr_ranges = 0;
> -	ret = walk_system_ram_res(0, -1, cmem, prepare_elf64_ram_headers_callback);
> -	if (ret)
> -		goto out;
> -
> -	/* Exclude crashkernel region */
> -	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> -	if (ret)
> -		goto out;
> -
> -	if (crashk_low_res.end) {
> -		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
> -		if (ret)
> -			goto out;
> -	}
> -
> -	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
> -
> -out:
> -	kfree(cmem);
> -	return ret;
> +	return walk_system_ram_res(0, -1, cmem, prepare_elf64_ram_headers_callback);
>  }
>  
>  static char *setup_kdump_cmdline(struct kimage *image, char *cmdline,
> @@ -281,7 +258,7 @@ int load_extra_segments(struct kimage *image, unsigned long kernel_start,
>  	if (image->type == KEXEC_TYPE_CRASH) {
>  		void *headers;
>  		unsigned long headers_sz;
> -		ret = prepare_elf_headers(&headers, &headers_sz);
> +		ret = crash_prepare_headers(true, &headers, &headers_sz, NULL);
>  		if (ret) {
>  			pr_err("Preparing elf core header failed\n");
>  			goto out;


^ permalink raw reply

* Re: [PATCH v7 15/20] perf: arm_pmuv3: Handle IRQs for Partitioned PMU guest counters
From: wuyifan @ 2026-06-18  1:53 UTC (permalink / raw)
  To: Colton Lewis, kvm
  Cc: Alexandru Elisei, Paolo Bonzini, Jonathan Corbet, Russell King,
	Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
	Mingwei Zhang, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mark Rutland, Shuah Khan, Ganapatrao Kulkarni, James Clark,
	linux-doc, linux-kernel, linux-arm-kernel, kvmarm,
	linux-perf-users, linux-kselftest, wangyushan, Zhou Wang, xuwei5,
	prime.zeng, fanghao11
In-Reply-To: <20260504211813.1804997-16-coltonlewis@google.com>

Hi Colton,

On 5/5/2026 5:18 AM, Colton Lewis wrote:
>   static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>   {
> -	u64 pmovsr;
>   	struct perf_sample_data data;
>   	struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
>   	struct pt_regs *regs;
> +	u64 host_set = kvm_pmu_host_counter_mask(cpu_pmu);
> +	u64 pmovsr;

kvm_pmu_host_counter_mask() is called from armv8pmu_handle_irq(). This
interrupt fires in both host and guest contexts.

However, kvm_pmu_host_counter_mask() dereferences
host_data_ptr(nr_event_counters). This indirection requires
kvm_arm_hyp_percpu_base[cpu] to be initialized, which only happens during
KVM hypervisor setup. When the interrupt fires in a guest kernel where
KVM is compiled but not active, the per-CPU base is NULL and the
dereference faults.

Thanks,
Yifan


^ permalink raw reply

* htmldocs: Documentation/gpu/rfc/gpusvm:70: ./drivers/gpu/drm/drm_gpusvm.c:74: WARNING: Inline emphasis start-string without end-string. [docutils]
From: kernel test robot @ 2026-06-18  1:48 UTC (permalink / raw)
  To: Honglei Huang; +Cc: oe-kbuild-all, 0day robot, linux-doc

tree:   https://github.com/intel-lab-lkp/linux/commits/Honglei-Huang/drm-gpusvm-split-MM-state-flags-out-of-drm_gpusvm_pages_flags/20260617-202753
head:   19bcdccae716ca08c529566e2093edc5c2a81ce2
commit: c2a70a0070054be7a4f5097e2d7c835d765eae35 drm/gpusvm: move struct drm_gpusvm_pages out of struct drm_gpusvm_range
date:   13 hours ago
compiler: clang version 22.1.8 (https://github.com/llvm/llvm-project ca7933e47d3a3451d81e72ac174dcb5aa28b59d1)
docutils: docutils (Docutils 0.21.2, Python 3.13.5, on linux)
reproduce: (https://download.01.org/0day-ci/archive/20260618/202606180351.xwWq86H2-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202606180351.xwWq86H2-lkp@intel.com/

All warnings (new ones prefixed by >>):

   Examples
   ~~~~~~~~ [docutils]
   Documentation/gpu/rfc/gpusvm:70: ./drivers/gpu/drm/drm_gpusvm.c:74: ERROR: Unexpected indentation. [docutils]
>> Documentation/gpu/rfc/gpusvm:70: ./drivers/gpu/drm/drm_gpusvm.c:74: WARNING: Inline emphasis start-string without end-string. [docutils]
>> Documentation/gpu/rfc/gpusvm:70: ./drivers/gpu/drm/drm_gpusvm.c:76: WARNING: Block quote ends without a blank line; unexpected unindent. [docutils]
>> Documentation/gpu/rfc/gpusvm:70: ./drivers/gpu/drm/drm_gpusvm.c:77: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils]
   WARNING: ./include/linux/host1x.h:159 struct member 'get' not described in 'host1x_bo_ops'
   WARNING: ./include/linux/host1x.h:159 struct member 'put' not described in 'host1x_bo_ops'
   WARNING: ./include/linux/host1x.h:159 struct member 'mmap' not described in 'host1x_bo_ops'
   WARNING: ./include/linux/host1x.h:159 struct member 'munmap' not described in 'host1x_bo_ops'
   WARNING: ./include/linux/host1x.h:159 struct member 'get' not described in 'host1x_bo_ops'

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* Re: [PATCH v16 00/10] arm64/riscv: Add support for crashkernel CMA reservation
From: Jinjie Ruan @ 2026-06-18  1:45 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: corbet, skhan, catalin.marinas, will, chenhuacai, kernel, maddy,
	mpe, npiggin, chleroy, pjw, palmer, aou, alex, tglx, mingo, bp,
	dave.hansen, hpa, robh, saravanak, akpm, bhe, pasha.tatashin,
	pratyush, ruirui.yang, rdunlap, peterz, feng.tang, dapeng1.mi,
	kees, elver, kuba, lirongqing, ebiggers, paulmck, leitao, coxu,
	Liam.Howlett, ryan.roberts, osandov, jbohac, cfsworks,
	tangyouling, sourabhjain, ritesh.list, adityag, liaoyuanhong,
	seanjc, fuqiang.wang, ardb, chenjiahao16, guoren, x86, linux-doc,
	linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
	linux-riscv, devicetree, kexec
In-Reply-To: <ajLr53EK6mJbng-7@kernel.org>



On 6/18/2026 2:48 AM, Mike Rapoport wrote:
> Hi Jinjie,
> 
> On Mon, Jun 08, 2026 at 03:34:49PM +0800, Jinjie Ruan wrote:
>> The crash memory allocation, and the exclusion of crashk_res, crashk_low_res
>> and crashk_cma memory are almost identical across different architectures.
>> This patch set handles them in crash core in a general way, which eliminates
>> a lot of duplicated code.
>>
>> And add support for crashkernel CMA reservation for arm64 and riscv.
>>
>> This patch set is rebased on v7.1-rc1.
> 
> Please rebase this set on v7.2-rc1 once that's out.
> 
> I'm going to queue it in the liveupdate tree then, to expose it to wider
> testing.
> 
> Meanwhile it would be great to chase riscv and x86 maintainers for acks :)

Thanks! That sounds great.

I will rebase this patch set on v7.2-rc1 as soon as it is out and send v17.

In the meantime, I will CC and reach out to the RISC-V and x86
maintainers to request their reviews and Acks.

Best regards,
Jinjie

> 


^ permalink raw reply

* Re: [PATCH] kselftest docs: remove reference to obsolete/archived wiki
From: Shuah Khan @ 2026-06-18  1:05 UTC (permalink / raw)
  To: Rafael Passos, shuah, corbet; +Cc: linux-kselftest, linux-doc, Shuah Khan
In-Reply-To: <865def83-a07e-4eba-b795-7da66e0e2d69@linuxfoundation.org>

On 6/17/26 19:03, Shuah Khan wrote:
> On 6/17/26 17:57, Rafael Passos wrote:
>> This link in the docs points to a wiki that is no longer active.
>>
>> The wiki was moved to archive.kernel.org, and there is a warning:
>> "OBSOLETE CONTENT This wiki has been archived and the content is
>> no longer updated."
>>
>> Signed-off-by: Rafael Passos <rafael@rcpassos.me>
>> ---
>>
>>   Documentation/dev-tools/kselftest.rst | 5 -----
>>   1 file changed, 5 deletions(-)
>>
>> diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
>> index d7bfe320338c..64c0ec7428a2 100644
>> --- a/Documentation/dev-tools/kselftest.rst
>> +++ b/Documentation/dev-tools/kselftest.rst
>> @@ -15,11 +15,6 @@ able to run that test on an older kernel. Hence, it is important to keep
>>   code that can still test an older kernel and make sure it skips the test
>>   gracefully on newer releases.
>> -You can find additional information on Kselftest framework, how to
>> -write new tests using the framework on Kselftest wiki:
>> -
>> -https://kselftest.wiki.kernel.org/
>> -
>>   On some systems, hot-plug tests could hang forever waiting for cpu and
>>   memory to be ready to be offlined. A special hot-plug target is created
>>   to run the full range of hot-plug tests. In default mode, hot-plug tests run
> 
> 
> Looks good to me.
> 
> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>

Jon,

I can take this through kselftest tree as I usually do.

thanks,
-- Shuah

^ permalink raw reply

* Re: [PATCH] kselftest docs: remove reference to obsolete/archived wiki
From: Shuah Khan @ 2026-06-18  1:03 UTC (permalink / raw)
  To: Rafael Passos, shuah, corbet; +Cc: linux-kselftest, linux-doc, Shuah Khan
In-Reply-To: <20260617235740.74029-1-rafael@rcpassos.me>

On 6/17/26 17:57, Rafael Passos wrote:
> This link in the docs points to a wiki that is no longer active.
> 
> The wiki was moved to archive.kernel.org, and there is a warning:
> "OBSOLETE CONTENT This wiki has been archived and the content is
> no longer updated."
> 
> Signed-off-by: Rafael Passos <rafael@rcpassos.me>
> ---
> 
>   Documentation/dev-tools/kselftest.rst | 5 -----
>   1 file changed, 5 deletions(-)
> 
> diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
> index d7bfe320338c..64c0ec7428a2 100644
> --- a/Documentation/dev-tools/kselftest.rst
> +++ b/Documentation/dev-tools/kselftest.rst
> @@ -15,11 +15,6 @@ able to run that test on an older kernel. Hence, it is important to keep
>   code that can still test an older kernel and make sure it skips the test
>   gracefully on newer releases.
>   
> -You can find additional information on Kselftest framework, how to
> -write new tests using the framework on Kselftest wiki:
> -
> -https://kselftest.wiki.kernel.org/
> -
>   On some systems, hot-plug tests could hang forever waiting for cpu and
>   memory to be ready to be offlined. A special hot-plug target is created
>   to run the full range of hot-plug tests. In default mode, hot-plug tests run


Looks good to me.

Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

^ permalink raw reply
