From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Chen Ridong <chenridong@huawei.com>,
Waiman Long <longman@redhat.com>, Tejun Heo <tj@kernel.org>,
Sasha Levin <sashal@kernel.org>,
hannes@cmpxchg.org, mkoutny@suse.com, cgroups@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.12] cpuset: Use new excpus for nocpu error check when enabling root partition
Date: Mon, 6 Oct 2025 14:17:52 -0400 [thread overview]
Message-ID: <20251006181835.1919496-20-sashal@kernel.org> (raw)
In-Reply-To: <20251006181835.1919496-1-sashal@kernel.org>
From: Chen Ridong <chenridong@huawei.com>
[ Upstream commit 59d5de3655698679ad8fd2cc82228de4679c4263 ]
A previous patch fixed a bug where new_prs should be assigned before
checking housekeeping conflicts. This patch addresses another potential
issue: the nocpu error check currently uses the xcpus which is not updated.
Although no issue has been observed so far, the check should be performed
using the new effective exclusive cpus.
The comment has been removed because the function returns an error if
nocpu checking fails, which is unrelated to the parent.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Now I'll provide my comprehensive analysis and recommendation.
## BACKPORT RECOMMENDATION
**YES**
## EXTENSIVE ANALYSIS
### 1. Bug Origin and Timeline
The bug was introduced in commit **86888c7bd117c** ("cgroup/cpuset: Add
warnings to catch inconsistency in exclusive CPUs") which first appeared
in **v6.16-rc1** (April 2025).
**Critical code change that introduced the bug:**
```c
- xcpus = tmp->new_cpus;
+ xcpus = tmp->delmask;
if (compute_effective_exclusive_cpumask(cs, xcpus,
NULL))
```
This commit changed how `xcpus` is calculated during partition root
enabling. The variable `xcpus` was reassigned to `tmp->delmask` and then
recomputed by `compute_excpus()`, but the `nocpu` error checking was
still using the stale `nocpu` boolean calculated earlier with the old
xcpus value.
### 2. Detailed Code Flow Analysis
**Before the fix (bugged code):**
At line ~1742 (before partcmd_enable block):
```c
xcpus = user_xcpus(cs); // Initial xcpus
nocpu = tasks_nocpu_error(parent, cs, xcpus); // Calculate nocpu with
OLD xcpus
```
Inside partcmd_enable block (lines ~1747-1826):
```c
xcpus = tmp->delmask; // **REASSIGN xcpus**
if (compute_excpus(cs, xcpus)) // **RECOMPUTE into NEW xcpus**
WARN_ON_ONCE(!cpumask_empty(cs->exclusive_cpus));
new_prs = (cmd == partcmd_enable) ? PRS_ROOT : PRS_ISOLATED;
if (cpumask_empty(xcpus))
return PERR_INVCPUS;
if (prstate_housekeeping_conflict(new_prs, xcpus))
return PERR_HKEEPING;
if (nocpu) // **BUG: Using OLD nocpu calculated with OLD xcpus**
return PERR_NOCPUS;
```
**After the fix:**
```c
if (tasks_nocpu_error(parent, cs, xcpus)) // Recalculate with NEW xcpus
return PERR_NOCPUS;
```
### 3. Bug Impact Assessment
**Severity:** Medium
**Potential manifestations:**
1. **False negatives:** A partition change could be allowed when it
should be rejected if:
- Old xcpus had no nocpu error
- New xcpus (after compute_excpus) would have a nocpu error
- Result: Parent or child tasks left without CPUs → system
instability
2. **False positives:** A valid partition change could be rejected if:
- Old xcpus had a nocpu error
- New xcpus (after compute_excpus) would have no nocpu error
- Result: Legitimate partition changes fail
**Observed impact:** According to the commit message, "Although no issue
has been observed so far," suggesting this is a latent bug that hasn't
manifested in testing or production yet.
### 4. Why This Should Be Backported
**Strong YES indicators:**
1. **Logical correctness bug**: The code is provably incorrect - using
stale data for validation after that data has been updated
2. **Part of a bug fix series**: This commit is the second in a series
fixing related issues:
- Commit 216217ebee16: Fixed new_prs not being assigned before
housekeeping check
- Commit 59d5de36556986: Fixed nocpu not using updated xcpus
3. **Low risk change**:
- Only 1 line of actual code changed (plus comment removal)
- Changes `if (nocpu)` → `if (tasks_nocpu_error(parent, cs, xcpus))`
- Function `tasks_nocpu_error()` already exists and is well-tested
4. **Confined scope**: Only affects cpuset partition root enabling path
5. **Prevents potential system instability**: If manifested, could leave
tasks without CPUs
6. **Stable tree criteria compliance:**
- Important bugfix: ✓ (correctness issue in resource management)
- Minimal risk: ✓ (one line change)
- No new features: ✓
- Confined to subsystem: ✓ (cpuset only)
### 5. Related Context
This fix is directly related to and depends on the understanding of:
- Commit 216217ebee16 which fixed a similar issue with new_prs
- The refactoring series by Chen Ridong that reorganized cpuset
exclusive CPU handling
- The tasks_nocpu_error() function introduced in v6.7 (commit
e2ffe502ba450)
### 6. Versions Affected
The bug exists in:
- v6.16-rc1 through v6.16-rc4
- v6.16 (final)
- Any stable trees that have backported commit 86888c7bd117c
### 7. Technical Rationale
The `tasks_nocpu_error()` function checks if enabling a partition would
leave the parent or child without CPUs when they have tasks. The check
must use the **final computed** exclusive CPUs (`xcpus` after
`compute_excpus()`), not the initial user-specified exclusive CPUs. The
bug causes the check to use the wrong cpumask, potentially making
incorrect decisions about partition validity.
### 8. Recommendation Summary
**This commit should be backported because:**
- Fixes a logical correctness bug in critical resource management code
- Minimal risk (one line change, well-tested function)
- Prevents potential system instability
- Part of a coherent bug fix series
- Affects a relatively new feature (v6.16-rc1)
- Reviewed by cpuset maintainer (Waiman Long)
- Signed-off-by cgroup maintainer (Tejun Heo)
kernel/cgroup/cpuset.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 27adb04df675d..50f1d33d31bc9 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1727,11 +1727,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
if (prstate_housekeeping_conflict(new_prs, xcpus))
return PERR_HKEEPING;
- /*
- * A parent can be left with no CPU as long as there is no
- * task directly associated with the parent partition.
- */
- if (nocpu)
+ if (tasks_nocpu_error(parent, cs, xcpus))
return PERR_NOCPUS;
/*
--
2.51.0
next prev parent reply other threads:[~2025-10-06 18:19 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-06 18:17 [PATCH AUTOSEL 6.17-5.4] x86/build: Remove cc-option from stack alignment flags Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] btrfs: zoned: refine extent allocator hint selection Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction on specific error places when walking log tree Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.4] btrfs: use smp_mb__after_atomic() when forcing COW in create_pending_snapshot() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] x86/bugs: Add attack vector controls for VMSCAPE Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.16] sched_ext: Keep bypass on between enable failure and scx_disable_workfn() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction if we fail to update inode in log replay dir fixup Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] EDAC/mc_sysfs: Increase legacy channel support to 16 Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction in the process_one_buffer() log tree walk callback Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] btrfs: zoned: return error from btrfs_zone_finish_endio() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] perf: Skip user unwind if the task is a kernel thread Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] perf: Have get_perf_callchain() return NULL if crosstask and user are set Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.16] EDAC: Fix wrong executable file modes for C source files Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] perf: Use current->flags & PF_KTHREAD|PF_USER_WORKER instead of current->mm == NULL Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] btrfs: use level argument in log tree walk callback replay_one_buffer() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] sched_ext: Make qmap dump operation non-destructive Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] seccomp: passthrough uprobe systemcall without filtering Sasha Levin
2025-10-06 18:17 ` Sasha Levin [this message]
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: tree-checker: add inode extref checks Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] sched/fair: update_cfs_group() for throttled cfs_rqs Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.15] btrfs: scrub: replace max_t()/min_t() with clamp() in scrub_throttle_dev_io() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.4] x86/bugs: Fix reporting of LFENCE retpoline Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.10] btrfs: always drop log root tree reference in btrfs_replay_log() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] audit: record fanotify event regardless of presence of rules Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] EDAC/ie31200: Add two more Intel Alder Lake-S SoCs for EDAC support Sasha Levin
2025-10-06 18:18 ` [PATCH AUTOSEL 6.17-6.6] x86/bugs: Report correct retbleed mitigation status Sasha Levin
2025-10-06 21:55 ` [PATCH AUTOSEL 6.17-5.4] x86/build: Remove cc-option from stack alignment flags Nathan Chancellor
2025-10-06 23:13 ` Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251006181835.1919496-20-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=cgroups@vger.kernel.org \
--cc=chenridong@huawei.com \
--cc=hannes@cmpxchg.org \
--cc=longman@redhat.com \
--cc=mkoutny@suse.com \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).