public inbox for linux-hwmon@vger.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Waiman Long" <longman@redhat.com>
Cc: tj@kernel.org, hannes@cmpxchg.org, linux-hwmon@vger.kernel.org,
	mkoutny@suse.com
Subject: Re: [PATCH 23/23] cgroup/cpuset: Documentation and kselftest updates
Date: Wed, 22 Apr 2026 03:09:08 +0000	[thread overview]
Message-ID: <20260422030908.CA3F4C2BCB0@smtp.kernel.org> (raw)
In-Reply-To: <20260421030351.281436-24-longman@redhat.com>

Sashiko AI review found 5 potential issue(s):
- [Medium] The newly added `nohz_full` check in `check_isolcpus()` is placed before the state stabilization retry loop, leading to false test failures and bypassed validation.
- [Medium] The selftest script enables kernel dynamic debugging globally but fails to restore its state when the test finishes or aborts.
- [Medium] The new `CPU0_ISOLCPUS_MATRIX` test cases omit the 12th field (`ISOLCPUS`), silently bypassing validation of the `cpuset.cpus.isolated` state.
- [Low] Dead code: the variable `CHK_NOHZ_FULL` is defined and assigned but never used.
- [Low] Grammatical errors and typos in documentation and commit message.
--

commit 84d90aacf6dde2c021ed021a8f60057d923655be
Author: Waiman Long <longman@redhat.com>

cgroup/cpuset: Documentation and kselftest updates

This commit updates cpuset tests and documentation to reflect new CPU
hotplug capabilities for nohz_full. It adds tests to prevent using CPU 0
in isolated partitions when it cannot be brought offline, and updates
documentation in cgroup-v2.rst.

> A new test will also be run in offline isn't allowed in CPU 0 to verify
> that using CPU 0 as part of an isolated partition will fail.

This isn't a bug, but should "run in offline" be "run if offline"?

> Since there is a slight performance overhead to enable runtime changes
> to nohz_full CPU list, users have to explicitly opt in by adding a
> "nohz_ful" kernel command line parameter with or without a CPU list.

This isn't a bug, but there's a typo ("nohz_ful") in the commit message.

> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 91beaa6798ce0..81181dc100ead 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
[ ... ]
> @@ -2647,11 +2648,29 @@ Cpuset Interface Files
>  	partition or scheduling domain.  The set of exclusive CPUs is
>  	determined by the value of its "cpuset.cpus.exclusive.effective".
>  
> -	When set to "isolated", the CPUs in that partition will be in
> -	an isolated state without any load balancing from the scheduler
> -	and excluded from the unbound workqueues.  Tasks placed in such
> -	a partition with multiple CPUs should be carefully distributed
> -	and bound to each of the individual CPUs for optimal performance.
> +	When set to "isolated", the CPUs in that partition will be in an
> +	isolated state without any load balancing from the scheduler and
> +	excluded from the unbound workqueues as well as other OS noises.
> +	Tasks placed in such a partition with multiple CPUs should be
> +	carefully distributed and bound to each of the individual CPUs
> +	for optimal performance.
> +
> +	As CPU hotplug, if supported, is used to improve the degree of
> +	CPU isolation close to the "nohz_full" kernel boot parameter.

This isn't a bug, but this sentence seems to be a subordinate clause lacking
a main clause.

> diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
> index a56f4153c64df..eebb4122b581c 100755
> --- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
> +++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
> @@ -67,6 +67,12 @@ then
>  	echo Y > /sys/kernel/debug/sched/verbose
>  fi
>  
> +# Enable dynamic debug message if available
> +DYN_DEBUG=/proc/dynamic_debug/control
> +[[ -f $DYN_DEBUG ]] && {
> +	echo "file kernel/cpu.c +p" > $DYN_DEBUG
> +}

Does this leak the debug state after the test finishes?

The script enables dynamic debug messages for the CPU hotplug subsystem here.

Looking at the cleanup() function triggered via trap, it does not restore or
disable this configuration, which permanently alters the system's global debug
state.

Should the cleanup() function revert this change?

[ ... ]
> @@ -84,6 +90,15 @@ echo member > test/cpuset.cpus.partition
>  echo "" > test/cpuset.cpus
>  [[ $RESULT -eq 0 ]] && skip_test "Child cgroups are using cpuset!"
>  
> +#
> +# If nohz_full parameter is specified and nohz_full file exists, CPU hotplug
> +# will be used to modify nohz_full cpumask to include all the isolated CPUs
> +# in cpuset isolated partitions.
> +#
> +NOHZ_FULL=/sys/devices/system/cpu/nohz_full
> +BOOT_NOHZ_FULL=$(fmt -1 /proc/cmdline | grep "^nohz_full")
> +[[ "$BOOT_NOHZ_FULL" = nohz_full ]] && CHK_NOHZ_FULL=1

Is CHK_NOHZ_FULL used anywhere else in the script?

It looks like it is defined and assigned here, but the same string
comparison is redundantly evaluated later inside check_isolcpus() without
using this variable.

[ ... ]
> +#
> +# Test matrix to verify that using CPU 0 in isolated (local or remote) partition
> +# will fail when offline isn't allowed for CPU 0.
> +#
> +CPU0_ISOLCPUS_MATRIX=(
> +	#  old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 fail ECPUs Pstate ISOLCPUS
> +	#  ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ --------
> +	"   C0-3     .      .    C4-5     P2     .      .      .     0 A1:0-3|B1:4-5 A1:P-2"
> +	"   C1-3     .      .      .      P2     .      .      .     0 A1:1-3 A1:P2"
> +	"   C1-3     .      .      .    P2:C0-3  .      .      .     0 A1:0-3 A1:P-2"
> +	"  CX0-3   C0-3     .      .       .     P2     .      .     0 A1:0-3|A2:0-3 A2:P-2"
> +	"  CX0-3 C0-3:X1-3  .      .       .     P2     .      .     0 A1:0|A2:1-3 A2:P2"
> +	"  CX0-3 C0-3:X1-3  .      .       .   P2:X0-3  .      .     0 A1:0-3|A2:0-3 A2:P-2"
> +)

Are these test cases bypassing the validation of the cpuset.cpus.isolated
state?

The test matrix headers indicate 12 fields, but the entries only provide
11 fields, omitting the final ISOLCPUS field.

Because the 12th field is missing, check_test_results() will evaluate if
ISOLCPUS is non-empty as false and skip calling check_isolcpus(), leaving
the isolated state unchecked.

[ ... ]
> @@ -789,6 +821,18 @@ check_isolcpus()
>  		EXPECTED_SDOMAIN=$EXPECTED_ISOLCPUS
>  	fi
>  
> +	#
> +	# Check if nohz_full match cpuset.cpus.isolated if nohz_boot parameter
> +	# specified with no parameter.
> +	#

This isn't a bug, but there's a typo ("nohz_boot" instead of "nohz_full")
in this comment.

> +	[[ -f $NOHZ_FULL && "$BOOT_NOHZ_FULL" = nohz_full ]] && {
> +		NOHZ_FULL_CPUS=$(cat $NOHZ_FULL)
> +		[[ "$ISOLCPUS" != "$NOHZ_FULL_CPUS" ]] && {
> +			echo "nohz_full ($NOHZ_FULL_CPUS) does not match cpuset.cpus.isolated ($ISOLCPUS)"
> +			return 1
> +		}
> +	}
> +

Will this check cause false test failures or bypass validation due to
asynchronous state updates?

The script reads the state and immediately returns an error if they mismatch.

However, later in the function there is a 50ms retry loop to account for
transient states.

If the files update at slightly different times, the check instantly fails.
If both files are in their old state, they match, passing the check, but
then the script pauses and reads the new state without re-evaluating
nohz_full against the final state.

Could this check be moved after the state stabilization retry loop?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260421030351.281436-1-longman@redhat.com?part=23

      reply	other threads:[~2026-04-22  3:09 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-21  3:03 [PATCH-next 00/23] cgroup/cpuset: Enable runtime update of nohz_full and managed_irq CPUs Waiman Long
2026-04-21  3:03 ` [PATCH 01/23] sched/isolation: Add HK_TYPE_KERNEL_NOISE_BOOT & HK_TYPE_MANAGED_IRQ_BOOT Waiman Long
2026-04-21  3:03 ` [PATCH 02/23] sched/isolation: Enhance housekeeping_update() to support updating more than one HK cpumask Waiman Long
2026-04-22  3:08   ` sashiko-bot
2026-04-22  6:39   ` Chen Ridong
2026-04-21  3:03 ` [PATCH 03/23] tick/nohz: Make nohz_full parameter optional Waiman Long
2026-04-21  8:32   ` Thomas Gleixner
2026-04-21 14:14     ` Waiman Long
2026-04-22  3:08   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 04/23] tick/nohz: Allow runtime changes in full dynticks CPUs Waiman Long
2026-04-21  8:50   ` Thomas Gleixner
2026-04-21 14:24     ` Waiman Long
2026-04-22  3:08   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 05/23] tick: Pass timer tick job to an online HK CPU in tick_cpu_dying() Waiman Long
2026-04-21  8:55   ` Thomas Gleixner
2026-04-21 14:22     ` Waiman Long
2026-04-21  3:03 ` [PATCH 06/23] rcu/nocbs: Allow runtime changes in RCU NOCBS cpumask Waiman Long
2026-04-22  3:08   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 07/23] watchdog: Sync up with runtime change of isolated CPUs Waiman Long
2026-04-22  3:08   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 08/23] arm64: topology: Use RCU to protect access to HK_TYPE_TICK cpumask Waiman Long
2026-04-22  3:08   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 09/23] workqueue: Use RCU to protect access of HK_TYPE_TIMER cpumask Waiman Long
2026-04-21  3:03 ` [PATCH 10/23] cpu: " Waiman Long
2026-04-21  8:57   ` Thomas Gleixner
2026-04-21 14:25     ` Waiman Long
2026-04-21  3:03 ` [PATCH 11/23] hrtimer: " Waiman Long
2026-04-21  8:59   ` Thomas Gleixner
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 12/23] net: Use boot time housekeeping cpumask settings for now Waiman Long
2026-04-21  3:03 ` [PATCH 13/23] sched/core: Use RCU to protect access of HK_TYPE_KERNEL_NOISE cpumask Waiman Long
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 14/23] hwmon/coretemp: Use RCU to protect access of HK_TYPE_MISC cpumask Waiman Long
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 15/23] Drivers: hv: Use RCU to protect access of HK_TYPE_MANAGED_IRQ cpumask Waiman Long
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 16/23] genirq/cpuhotplug: " Waiman Long
2026-04-21  9:02   ` Thomas Gleixner
2026-04-21 14:29     ` Waiman Long
2026-04-21  3:03 ` [PATCH 17/23] sched/isolation: Extend housekeeping_dereference_check() to cover changes in nohz_full or manged_irqs cpumasks Waiman Long
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 18/23] cpu/hotplug: Add a new cpuhp_offline_cb() API Waiman Long
2026-04-21 16:17   ` Thomas Gleixner
2026-04-21 17:29     ` Waiman Long
2026-04-21 18:43       ` Thomas Gleixner
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 19/23] cgroup/cpuset: Improve check for calling housekeeping_update() Waiman Long
2026-04-21  3:03 ` [PATCH 20/23] cgroup/cpuset: Enable runtime update of HK_TYPE_{KERNEL_NOISE,MANAGED_IRQ} cpumasks Waiman Long
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 21/23] cgroup/cpuset: Limit the side effect of using CPU hotplug on isolated partition Waiman Long
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 22/23] cgroup/cpuset: Prevent offline_disabled CPUs from being used in " Waiman Long
2026-04-22  3:09   ` sashiko-bot
2026-04-21  3:03 ` [PATCH 23/23] cgroup/cpuset: Documentation and kselftest updates Waiman Long
2026-04-22  3:09   ` sashiko-bot [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260422030908.CA3F4C2BCB0@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-hwmon@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mkoutny@suse.com \
    --cc=sashiko@lists.linux.dev \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox