* [PATCH 0/3] security, sched: Expand task_setscheduler LSM hook and related fixes
@ 2026-05-09 16:48 Aaron Tomlin
2026-05-09 16:48 ` [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach() Aaron Tomlin
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Aaron Tomlin @ 2026-05-09 16:48 UTC (permalink / raw)
To: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, longman, tj, hannes,
mkoutny
Cc: chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, atomlin, neelx, sean, chjohnst,
steve, mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
Hi,
This series expands the task_setscheduler LSM hook to include the requested
CPU affinity mask, enabling BPF-based security modules to enforce strict
spatial isolation boundaries. During the development of this expansion, two
pre-existing subsystem bugs were identified and fixed.
In modern multi-tenant and real-time environments, CPU isolation is a
critical boundary. Currently, the task_setscheduler hook lacks visibility
into the actual CPU affinity mask being requested via sched_setaffinity()
or cgroup migrations. This limits the effectiveness of eBPF-driven security
policies when attempting to monitor and shield specific cores.
By expanding the LSM hook signature, BPF LSMs are provided with the
necessary context to audit and even restrict specific CPU pinning requests.
Patch 1 (cgroup/cpuset): Fixes a pre-existing deadline (DL) bandwidth
metric leak in cpuset_can_attach(). It was discovered that if a task
fails its security checks mid-batch during a thread group migration,
the loop aborts without unwinding previously accumulated DL metrics
(nr_migrate_dl_tasks and sum_migrate_dl_bw). This patch introduces an
out_unlock_reset path to guarantee clean unwinding.
Patch 2 (security): Implements the core LSM hook expansion. It safely
propagates either the requested cpumask (via sched_setaffinity and
cpuset_can_attach) or passes NULL for unchanged affinities. It also
adds proper __nullable annotations to ensure the BPF verifier mandates
explicit NULL checks for attached eBPF programs, and mechanically
updates SELinux, Smack, and Commoncap.
Patch 3 (mips): Resolves a critical memory corruption vulnerability in
the MIPS MT architecture's sched_setaffinity implementation. When
CONFIG_CPUMASK_OFFSTACK=y is enabled, copy_from_user() was clobbering
the stack pointer due to an invalid sizeof() evaluation, followed by an
uninitialised heap allocation. This patch safely reorders the
allocations and properly utilises cpumask_size().
These patches have been logically separated to assist subsystem maintainers
with review and backporting.
Comments and feedback are welcome.
Kind regards,
Aaron Tomlin (3):
cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach()
security: Expand task_setscheduler LSM hook to include CPU affinity
mask
mips: sched: Fix CPUMASK_OFFSTACK memory corruption
arch/mips/kernel/mips-mt-fpaff.c | 41 ++++++++++++++++----------------
fs/proc/base.c | 2 +-
include/linux/lsm_hook_defs.h | 3 ++-
include/linux/security.h | 11 +++++----
kernel/cgroup/cpuset.c | 13 ++++++----
kernel/sched/syscalls.c | 4 ++--
security/commoncap.c | 7 ++++--
security/security.c | 11 +++++----
security/selinux/hooks.c | 3 ++-
security/smack/smack_lsm.c | 11 +++++++--
10 files changed, 64 insertions(+), 42 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach()
2026-05-09 16:48 [PATCH 0/3] security, sched: Expand task_setscheduler LSM hook and related fixes Aaron Tomlin
@ 2026-05-09 16:48 ` Aaron Tomlin
2026-05-11 5:10 ` Waiman Long
2026-05-09 16:48 ` [PATCH 2/3] security: Expand task_setscheduler LSM hook to include CPU affinity mask Aaron Tomlin
2026-05-09 16:48 ` [PATCH 3/3] mips: sched: Fix CPUMASK_OFFSTACK memory corruption Aaron Tomlin
2 siblings, 1 reply; 10+ messages in thread
From: Aaron Tomlin @ 2026-05-09 16:48 UTC (permalink / raw)
To: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, longman, tj, hannes,
mkoutny
Cc: chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, atomlin, neelx, sean, chjohnst,
steve, mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
During a cgroup migration, cpuset_can_attach() iterates over the
provided taskset. If a task within the batch is a deadline (DL) task,
the destination cpuset's DL metrics (i.e., nr_migrate_dl_tasks and
sum_migrate_dl_bw) are appropriately incremented.
However, if a subsequent task in the same migration batch fails the
task_can_attach() check, the loop aborts and jumps directly to
out_unlock. Consequently, any DL metrics accumulated from previously
processed tasks in the batch remain permanently inflated in the
destination cpuset. Because the migration is subsequently aborted by the
cgroup core, cpuset_cancel_attach() is never invoked to unwind these
specific increments.
This behaviour results in a permanent leak of deadline bandwidth, which
incorrectly restricts the admission control capacity of the destination
cpuset.
To resolve this, introduce an out_unlock_reset failure path that
conditionally invokes reset_migrate_dl_data(). This guarantees that if a
batch migration is aborted for any reason, the pending DL metrics are
safely reset before returning the error.
Fixes: 0a67b847e1f06 ("cpuset: Allow setscheduler regardless of manipulated task")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
kernel/cgroup/cpuset.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e3a081a07c6d..b8022f6e2a35 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3029,12 +3029,12 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
cgroup_taskset_for_each(task, css, tset) {
ret = task_can_attach(task);
if (ret)
- goto out_unlock;
+ goto out_unlock_reset;
if (setsched_check) {
ret = security_task_setscheduler(task);
if (ret)
- goto out_unlock;
+ goto out_unlock_reset;
}
if (dl_task(task)) {
@@ -3070,6 +3070,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
* changes which zero cpus/mems_allowed.
*/
cs->attach_in_progress++;
+ goto out_unlock;
+
+out_unlock_reset:
+ if (cs->nr_migrate_dl_tasks)
+ reset_migrate_dl_data(cs);
out_unlock:
mutex_unlock(&cpuset_mutex);
return ret;
--
2.51.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/3] security: Expand task_setscheduler LSM hook to include CPU affinity mask
2026-05-09 16:48 [PATCH 0/3] security, sched: Expand task_setscheduler LSM hook and related fixes Aaron Tomlin
2026-05-09 16:48 ` [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach() Aaron Tomlin
@ 2026-05-09 16:48 ` Aaron Tomlin
2026-05-09 18:29 ` Aaron Tomlin
2026-05-09 16:48 ` [PATCH 3/3] mips: sched: Fix CPUMASK_OFFSTACK memory corruption Aaron Tomlin
2 siblings, 1 reply; 10+ messages in thread
From: Aaron Tomlin @ 2026-05-09 16:48 UTC (permalink / raw)
To: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, longman, tj, hannes,
mkoutny
Cc: chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, atomlin, neelx, sean, chjohnst,
steve, mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
At present, the task_setscheduler LSM hook provides security modules
with the opportunity to mediate changes to a task's scheduling policy.
However, when invoked via sched_setaffinity(), the hook lacks
visibility into the actual CPU affinity mask being requested.
Consequently, BPF-based security modules are entirely blind to the
target CPUs and cannot make granular access control decisions based on
spatial isolation.
In modern multi-tenant and real-time environments, CPU isolation is a
critical boundary. The inability to audit or restrict specific CPU
pinning requests limits the effectiveness of eBPF-driven security
policies, particularly when attempting to shield isolated or
cryptographic cores from unprivileged or compromised tasks.
This patch expands the security_task_setscheduler() hook signature to
include a pointer to the requested cpumask. Because this is a shared
hook used for multiple scheduling attribute changes, call sites that do
not modify CPU affinity are updated to safely pass NULL.
To protect against unverified dereferences, the parameter is annotated
with __nullable in the LSM hook definition, ensuring the BPF verifier
mandates explicit NULL checks for attached eBPF programs.
This change updates all in-tree security modules (SELinux and Smack) to
accommodate the new parameter mechanically, whilst providing BPF LSMs
with the necessary context to enforce strict affinity policies.
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
arch/mips/kernel/mips-mt-fpaff.c | 2 +-
fs/proc/base.c | 2 +-
include/linux/lsm_hook_defs.h | 3 ++-
include/linux/security.h | 11 +++++++----
kernel/cgroup/cpuset.c | 4 ++--
kernel/sched/syscalls.c | 4 ++--
security/commoncap.c | 7 +++++--
security/security.c | 11 ++++++-----
security/selinux/hooks.c | 3 ++-
security/smack/smack_lsm.c | 11 +++++++++--
10 files changed, 37 insertions(+), 21 deletions(-)
diff --git a/arch/mips/kernel/mips-mt-fpaff.c b/arch/mips/kernel/mips-mt-fpaff.c
index 10172fc4f627..a6a61393fc1a 100644
--- a/arch/mips/kernel/mips-mt-fpaff.c
+++ b/arch/mips/kernel/mips-mt-fpaff.c
@@ -108,7 +108,7 @@ asmlinkage long mipsmt_sys_sched_setaffinity(pid_t pid, unsigned int len,
goto out_unlock;
}
- retval = security_task_setscheduler(p);
+ retval = security_task_setscheduler(p, new_mask);
if (retval)
goto out_unlock;
diff --git a/fs/proc/base.c b/fs/proc/base.c
index d9acfa89c894..ac4096958a00 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2619,7 +2619,7 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
}
rcu_read_unlock();
- err = security_task_setscheduler(p);
+ err = security_task_setscheduler(p, NULL);
if (err) {
count = err;
goto out;
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 2b8dfb35caed..6ec7bc04a1b7 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -255,7 +255,8 @@ LSM_HOOK(int, 0, task_prlimit, const struct cred *cred,
const struct cred *tcred, unsigned int flags)
LSM_HOOK(int, 0, task_setrlimit, struct task_struct *p, unsigned int resource,
struct rlimit *new_rlim)
-LSM_HOOK(int, 0, task_setscheduler, struct task_struct *p)
+LSM_HOOK(int, 0, task_setscheduler, struct task_struct *p,
+ const struct cpumask *in_mask__nullable)
LSM_HOOK(int, 0, task_getscheduler, struct task_struct *p)
LSM_HOOK(int, 0, task_movememory, struct task_struct *p)
LSM_HOOK(int, 0, task_kill, struct task_struct *p, struct kernel_siginfo *info,
diff --git a/include/linux/security.h b/include/linux/security.h
index 41d7367cf403..8b74153daa43 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -196,7 +196,8 @@ extern int cap_mmap_addr(unsigned long addr);
extern int cap_task_fix_setuid(struct cred *new, const struct cred *old, int flags);
extern int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
unsigned long arg4, unsigned long arg5);
-extern int cap_task_setscheduler(struct task_struct *p);
+extern int cap_task_setscheduler(struct task_struct *p,
+ const struct cpumask *in_mask);
extern int cap_task_setioprio(struct task_struct *p, int ioprio);
extern int cap_task_setnice(struct task_struct *p, int nice);
extern int cap_vm_enough_memory(struct mm_struct *mm, long pages);
@@ -531,7 +532,8 @@ int security_task_prlimit(const struct cred *cred, const struct cred *tcred,
unsigned int flags);
int security_task_setrlimit(struct task_struct *p, unsigned int resource,
struct rlimit *new_rlim);
-int security_task_setscheduler(struct task_struct *p);
+int security_task_setscheduler(struct task_struct *p,
+ const struct cpumask *in_mask);
int security_task_getscheduler(struct task_struct *p);
int security_task_movememory(struct task_struct *p);
int security_task_kill(struct task_struct *p, struct kernel_siginfo *info,
@@ -1392,9 +1394,10 @@ static inline int security_task_setrlimit(struct task_struct *p,
return 0;
}
-static inline int security_task_setscheduler(struct task_struct *p)
+static inline int security_task_setscheduler(struct task_struct *p,
+ const struct cpumask *in_mask)
{
- return cap_task_setscheduler(p);
+ return cap_task_setscheduler(p, in_mask);
}
static inline int security_task_getscheduler(struct task_struct *p)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index b8022f6e2a35..68cf89b17af2 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3032,7 +3032,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
goto out_unlock_reset;
if (setsched_check) {
- ret = security_task_setscheduler(task);
+ ret = security_task_setscheduler(task, cs->effective_cpus);
if (ret)
goto out_unlock_reset;
}
@@ -3592,7 +3592,7 @@ static int cpuset_can_fork(struct task_struct *task, struct css_set *cset)
if (ret)
goto out_unlock;
- ret = security_task_setscheduler(task);
+ ret = security_task_setscheduler(task, NULL);
if (ret)
goto out_unlock;
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index b215b0ead9a6..68bc7e466fb1 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -540,7 +540,7 @@ int __sched_setscheduler(struct task_struct *p,
if (attr->sched_flags & SCHED_FLAG_SUGOV)
return -EINVAL;
- retval = security_task_setscheduler(p);
+ retval = security_task_setscheduler(p, NULL);
if (retval)
return retval;
}
@@ -1213,7 +1213,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
return -EPERM;
}
- retval = security_task_setscheduler(p);
+ retval = security_task_setscheduler(p, in_mask);
if (retval)
return retval;
diff --git a/security/commoncap.c b/security/commoncap.c
index 3399535808fe..d86f1c2b9210 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1222,13 +1222,16 @@ static int cap_safe_nice(struct task_struct *p)
/**
* cap_task_setscheduler - Determine if scheduler policy change is permitted
* @p: The task to affect
+ * @in_mask: Requested CPU affinity mask (ignored)
*
* Determine if the requested scheduler policy change is permitted for the
- * specified task.
+ * specified task. The capabilities security module does not evaluate the
+ * @in_mask parameter, relying solely on cap_safe_nice().
*
* Return: 0 if permission is granted, -ve if denied.
*/
-int cap_task_setscheduler(struct task_struct *p)
+int cap_task_setscheduler(struct task_struct *p,
+ const struct cpumask *in_mask __always_unused)
{
return cap_safe_nice(p);
}
diff --git a/security/security.c b/security/security.c
index 4e999f023651..53804ee40df5 100644
--- a/security/security.c
+++ b/security/security.c
@@ -3240,17 +3240,18 @@ int security_task_setrlimit(struct task_struct *p, unsigned int resource,
}
/**
- * security_task_setscheduler() - Check if setting sched policy/param is allowed
+ * security_task_setscheduler() - Check if setting sched policy/param/affinity is allowed
* @p: target task
+ * @in_mask: requested CPU affinity mask, or NULL if not changing affinity
*
- * Check permission before setting scheduling policy and/or parameters of
- * process @p.
+ * Check permission before setting the scheduling policy, parameters, and/or
+ * CPU affinity of process @p.
*
* Return: Returns 0 if permission is granted.
*/
-int security_task_setscheduler(struct task_struct *p)
+int security_task_setscheduler(struct task_struct *p, const struct cpumask *in_mask)
{
- return call_int_hook(task_setscheduler, p);
+ return call_int_hook(task_setscheduler, p, in_mask);
}
/**
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 0f704380a8c8..5f0914db23f6 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4557,7 +4557,8 @@ static int selinux_task_setrlimit(struct task_struct *p, unsigned int resource,
return 0;
}
-static int selinux_task_setscheduler(struct task_struct *p)
+static int selinux_task_setscheduler(struct task_struct *p,
+ const struct cpumask *in_mask __always_unused)
{
return avc_has_perm(current_sid(), task_sid_obj(p), SECCLASS_PROCESS,
PROCESS__SETSCHED, NULL);
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 3f9ae05039a2..a77143beff44 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -2343,10 +2343,17 @@ static int smack_task_getioprio(struct task_struct *p)
/**
* smack_task_setscheduler - Smack check on setting scheduler
* @p: the task object
+ * @in_mask: Requested CPU affinity mask (ignored)
*
- * Return 0 if read access is permitted
+ * Evaluate whether the current task has write access to the target task @p
+ * to change its scheduling policy. The Smack security module relies
+ * strictly on label-based access control and does not evaluate CPU
+ * affinity masks.
+ *
+ * Return: 0 if write access is permitted
*/
-static int smack_task_setscheduler(struct task_struct *p)
+static int smack_task_setscheduler(struct task_struct *p,
+ const struct cpumask *in_mask __always_unused)
{
return smk_curacc_on_task(p, MAY_WRITE, __func__);
}
--
2.51.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/3] mips: sched: Fix CPUMASK_OFFSTACK memory corruption
2026-05-09 16:48 [PATCH 0/3] security, sched: Expand task_setscheduler LSM hook and related fixes Aaron Tomlin
2026-05-09 16:48 ` [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach() Aaron Tomlin
2026-05-09 16:48 ` [PATCH 2/3] security: Expand task_setscheduler LSM hook to include CPU affinity mask Aaron Tomlin
@ 2026-05-09 16:48 ` Aaron Tomlin
2 siblings, 0 replies; 10+ messages in thread
From: Aaron Tomlin @ 2026-05-09 16:48 UTC (permalink / raw)
To: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, longman, tj, hannes,
mkoutny
Cc: chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, atomlin, neelx, sean, chjohnst,
steve, mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
This patch addresses a critical memory management flaw.
When CONFIG_CPUMASK_OFFSTACK is enabled, cpumask_var_t is a pointer.
Consequently, sizeof(new_mask) evaluates to the pointer size, causing
copy_from_user() to clobber the stack pointer. The subsequent
alloc_cpumask_var() overwrites this with an uninitialised heap address,
discarding the user's mask and risking data leaks. Fix this by
allocating masks first, and using cpumask_size() to copy data directly
into the allocated buffer.
Additionally, reorder the failure goto labels to ensure task locks and
memory allocations are cleanly unwound.
Fixes: 295cbf6d63165 ("[MIPS] Move FPU affinity code into separate file.")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
arch/mips/kernel/mips-mt-fpaff.c | 39 ++++++++++++++++----------------
1 file changed, 20 insertions(+), 19 deletions(-)
diff --git a/arch/mips/kernel/mips-mt-fpaff.c b/arch/mips/kernel/mips-mt-fpaff.c
index a6a61393fc1a..4b3424088302 100644
--- a/arch/mips/kernel/mips-mt-fpaff.c
+++ b/arch/mips/kernel/mips-mt-fpaff.c
@@ -71,11 +71,23 @@ asmlinkage long mipsmt_sys_sched_setaffinity(pid_t pid, unsigned int len,
struct task_struct *p;
int retval;
- if (len < sizeof(new_mask))
+ if (len < cpumask_size())
return -EINVAL;
- if (copy_from_user(&new_mask, user_mask_ptr, sizeof(new_mask)))
- return -EFAULT;
+ if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
+ return -ENOMEM;
+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
+ retval = -ENOMEM;
+ goto out_free_cpus_allowed;
+ }
+ if (!alloc_cpumask_var(&effective_mask, GFP_KERNEL)) {
+ retval = -ENOMEM;
+ goto out_free_new_mask;
+ }
+
+ retval = -EFAULT;
+ if (copy_from_user(new_mask, user_mask_ptr, cpumask_size()))
+ goto out_free_effective_mask;
cpus_read_lock();
rcu_read_lock();
@@ -84,25 +96,14 @@ asmlinkage long mipsmt_sys_sched_setaffinity(pid_t pid, unsigned int len,
if (!p) {
rcu_read_unlock();
cpus_read_unlock();
- return -ESRCH;
+ retval = -ESRCH;
+ goto out_free_effective_mask;
}
/* Prevent p going away */
get_task_struct(p);
rcu_read_unlock();
- if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL)) {
- retval = -ENOMEM;
- goto out_put_task;
- }
- if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
- retval = -ENOMEM;
- goto out_free_cpus_allowed;
- }
- if (!alloc_cpumask_var(&effective_mask, GFP_KERNEL)) {
- retval = -ENOMEM;
- goto out_free_new_mask;
- }
if (!check_same_owner(p) && !capable(CAP_SYS_NICE)) {
retval = -EPERM;
goto out_unlock;
@@ -141,14 +142,14 @@ asmlinkage long mipsmt_sys_sched_setaffinity(pid_t pid, unsigned int len,
}
}
out_unlock:
+ put_task_struct(p);
+ cpus_read_unlock();
+out_free_effective_mask:
free_cpumask_var(effective_mask);
out_free_new_mask:
free_cpumask_var(new_mask);
out_free_cpus_allowed:
free_cpumask_var(cpus_allowed);
-out_put_task:
- put_task_struct(p);
- cpus_read_unlock();
return retval;
}
--
2.51.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] security: Expand task_setscheduler LSM hook to include CPU affinity mask
2026-05-09 16:48 ` [PATCH 2/3] security: Expand task_setscheduler LSM hook to include CPU affinity mask Aaron Tomlin
@ 2026-05-09 18:29 ` Aaron Tomlin
0 siblings, 0 replies; 10+ messages in thread
From: Aaron Tomlin @ 2026-05-09 18:29 UTC (permalink / raw)
To: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, longman, tj, hannes,
mkoutny
Cc: chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, neelx, sean, chjohnst, steve,
mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 519 bytes --]
On Sat, May 09, 2026 at 12:48:46PM -0400, Aaron Tomlin wrote:
> @@ -3592,7 +3592,7 @@ static int cpuset_can_fork(struct task_struct *task, struct css_set *cset)
> if (ret)
> goto out_unlock;
>
> - ret = security_task_setscheduler(task);
> + ret = security_task_setscheduler(task, NULL);
> if (ret)
> goto out_unlock;
>
Apologies, we want the CPU affinity mask here too. The NULL should be
replaced with cs->effective_cpus. This will be addressed in the next
iteration.
--
Aaron Tomlin
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach()
2026-05-09 16:48 ` [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach() Aaron Tomlin
@ 2026-05-11 5:10 ` Waiman Long
2026-05-11 11:08 ` Aaron Tomlin
0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2026-05-11 5:10 UTC (permalink / raw)
To: Aaron Tomlin, tsbogend, paul, jmorris, serge, mingo, peterz,
juri.lelli, vincent.guittot, stephen.smalley.work, casey, tj,
hannes, mkoutny
Cc: chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, neelx, sean, chjohnst, steve,
mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
On 5/9/26 12:48 PM, Aaron Tomlin wrote:
> During a cgroup migration, cpuset_can_attach() iterates over the
> provided taskset. If a task within the batch is a deadline (DL) task,
> the destination cpuset's DL metrics (i.e., nr_migrate_dl_tasks and
> sum_migrate_dl_bw) are appropriately incremented.
>
> However, if a subsequent task in the same migration batch fails the
> task_can_attach() check, the loop aborts and jumps directly to
> out_unlock. Consequently, any DL metrics accumulated from previously
> processed tasks in the batch remain permanently inflated in the
> destination cpuset. Because the migration is subsequently aborted by the
> cgroup core, cpuset_cancel_attach() is never invoked to unwind these
> specific increments.
>
> This behaviour results in a permanent leak of deadline bandwidth, which
> incorrectly restricts the admission control capacity of the destination
> cpuset.
>
> To resolve this, introduce an out_unlock_reset failure path that
> conditionally invokes reset_migrate_dl_data(). This guarantees that if a
> batch migration is aborted for any reason, the pending DL metrics are
> safely reset before returning the error.
>
> Fixes: 0a67b847e1f06 ("cpuset: Allow setscheduler regardless of manipulated task")
That is not the commit that introduced the bug. Anyway, there is already
another patch sent recently to fix this bug. See
https://lore.kernel.org/lkml/20260509102031.97608-2-zhangguopeng@kylinos.cn/
Cheers,
Longman
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach()
2026-05-11 5:10 ` Waiman Long
@ 2026-05-11 11:08 ` Aaron Tomlin
2026-05-11 17:54 ` Waiman Long
0 siblings, 1 reply; 10+ messages in thread
From: Aaron Tomlin @ 2026-05-11 11:08 UTC (permalink / raw)
To: Waiman Long
Cc: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, tj, hannes, mkoutny,
chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, neelx, sean, chjohnst, steve,
mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2108 bytes --]
On Mon, May 11, 2026 at 01:10:02AM -0400, Waiman Long wrote:
>
> On 5/9/26 12:48 PM, Aaron Tomlin wrote:
> > During a cgroup migration, cpuset_can_attach() iterates over the
> > provided taskset. If a task within the batch is a deadline (DL) task,
> > the destination cpuset's DL metrics (i.e., nr_migrate_dl_tasks and
> > sum_migrate_dl_bw) are appropriately incremented.
> >
> > However, if a subsequent task in the same migration batch fails the
> > task_can_attach() check, the loop aborts and jumps directly to
> > out_unlock. Consequently, any DL metrics accumulated from previously
> > processed tasks in the batch remain permanently inflated in the
> > destination cpuset. Because the migration is subsequently aborted by the
> > cgroup core, cpuset_cancel_attach() is never invoked to unwind these
> > specific increments.
> >
> > This behaviour results in a permanent leak of deadline bandwidth, which
> > incorrectly restricts the admission control capacity of the destination
> > cpuset.
> >
> > To resolve this, introduce an out_unlock_reset failure path that
> > conditionally invokes reset_migrate_dl_data(). This guarantees that if a
> > batch migration is aborted for any reason, the pending DL metrics are
> > safely reset before returning the error.
> >
> > Fixes: 0a67b847e1f06 ("cpuset: Allow setscheduler regardless of manipulated task")
>
> That is not the commit that introduced the bug. Anyway, there is already
> another patch sent recently to fix this bug. See
>
> https://lore.kernel.org/lkml/20260509102031.97608-2-zhangguopeng@kylinos.cn/
>
Hi Waiman,
Thank you for the follow up.
Acknowledged. I will drop this patch in the next iteration due to [1].
Please note, the sashiko AI Review bot reported: cpuset_can_attach()
incorrectly assumes all migrating tasks originate from the same source
cpuset. At first glance, this feedback is valid. I plan to submit a patch,
if no solution was already proposed.
[1]: https://lore.kernel.org/lkml/20260509102031.97608-2-zhangguopeng@kylinos.cn/
Kind regards,
--
Aaron Tomlin
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach()
2026-05-11 11:08 ` Aaron Tomlin
@ 2026-05-11 17:54 ` Waiman Long
2026-05-11 20:25 ` Aaron Tomlin
2026-05-12 19:37 ` Aaron Tomlin
0 siblings, 2 replies; 10+ messages in thread
From: Waiman Long @ 2026-05-11 17:54 UTC (permalink / raw)
To: Aaron Tomlin
Cc: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, tj, hannes, mkoutny,
chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, neelx, sean, chjohnst, steve,
mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
On 5/11/26 7:08 AM, Aaron Tomlin wrote:
> On Mon, May 11, 2026 at 01:10:02AM -0400, Waiman Long wrote:
>> On 5/9/26 12:48 PM, Aaron Tomlin wrote:
>>> During a cgroup migration, cpuset_can_attach() iterates over the
>>> provided taskset. If a task within the batch is a deadline (DL) task,
>>> the destination cpuset's DL metrics (i.e., nr_migrate_dl_tasks and
>>> sum_migrate_dl_bw) are appropriately incremented.
>>>
>>> However, if a subsequent task in the same migration batch fails the
>>> task_can_attach() check, the loop aborts and jumps directly to
>>> out_unlock. Consequently, any DL metrics accumulated from previously
>>> processed tasks in the batch remain permanently inflated in the
>>> destination cpuset. Because the migration is subsequently aborted by the
>>> cgroup core, cpuset_cancel_attach() is never invoked to unwind these
>>> specific increments.
>>>
>>> This behaviour results in a permanent leak of deadline bandwidth, which
>>> incorrectly restricts the admission control capacity of the destination
>>> cpuset.
>>>
>>> To resolve this, introduce an out_unlock_reset failure path that
>>> conditionally invokes reset_migrate_dl_data(). This guarantees that if a
>>> batch migration is aborted for any reason, the pending DL metrics are
>>> safely reset before returning the error.
>>>
>>> Fixes: 0a67b847e1f06 ("cpuset: Allow setscheduler regardless of manipulated task")
>> That is not the commit that introduced the bug. Anyway, there is already
>> another patch sent recently to fix this bug. See
>>
>> https://lore.kernel.org/lkml/20260509102031.97608-2-zhangguopeng@kylinos.cn/
>>
> Hi Waiman,
>
> Thank you for the follow up.
>
> Acknowledged. I will drop this patch in the next iteration due to [1].
>
> Please note, the sashiko AI Review bot reported: cpuset_can_attach()
> incorrectly assumes all migrating tasks originate from the same source
> cpuset. At first glance, this feedback is valid. I plan to submit a patch,
> if no solution was already proposed.
>
> [1]: https://lore.kernel.org/lkml/20260509102031.97608-2-zhangguopeng@kylinos.cn/
Yes, it does look like the AI feedback is valid. I will take a further
look into this.
Thanks,
Longman
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach()
2026-05-11 17:54 ` Waiman Long
@ 2026-05-11 20:25 ` Aaron Tomlin
2026-05-12 19:37 ` Aaron Tomlin
1 sibling, 0 replies; 10+ messages in thread
From: Aaron Tomlin @ 2026-05-11 20:25 UTC (permalink / raw)
To: Waiman Long
Cc: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, tj, hannes, mkoutny,
chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, neelx, sean, chjohnst, steve,
mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 796 bytes --]
On Mon, May 11, 2026 at 01:54:37PM -0400, Waiman Long wrote:
> > Hi Waiman,
> >
> > Thank you for the follow up.
> >
> > Acknowledged. I will drop this patch in the next iteration due to [1].
> >
> > Please note, the sashiko AI Review bot reported: cpuset_can_attach()
> > incorrectly assumes all migrating tasks originate from the same source
> > cpuset. At first glance, this feedback is valid. I plan to submit a patch,
> > if no solution was already proposed.
> >
> > [1]: https://lore.kernel.org/lkml/20260509102031.97608-2-zhangguopeng@kylinos.cn/
>
> Yes, it does look like the AI feedback is valid. I will take a further look
> into this.
>
> Thanks,
> Longman
No worries, I have it. I'll submit a patch for review shortly.
Kind regards,
--
Aaron Tomlin
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach()
2026-05-11 17:54 ` Waiman Long
2026-05-11 20:25 ` Aaron Tomlin
@ 2026-05-12 19:37 ` Aaron Tomlin
1 sibling, 0 replies; 10+ messages in thread
From: Aaron Tomlin @ 2026-05-12 19:37 UTC (permalink / raw)
To: Waiman Long
Cc: tsbogend, paul, jmorris, serge, mingo, peterz, juri.lelli,
vincent.guittot, stephen.smalley.work, casey, tj, hannes, mkoutny,
chenridong, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak, omosnace, kees, neelx, sean, chjohnst, steve,
mproche, nick.lange, cgroups, linux-mips, linux-fsdevel,
linux-security-module, selinux, linux-kernel
On Mon, May 11, 2026 at 01:54:37PM -0400, Waiman Long wrote:
> Yes, it does look like the AI feedback is valid. I will take a further look
> into this.
Hi,
As promised [1], please find my suggested patch here [2].
[1]: https://lore.kernel.org/lkml/dif4xi73znyz3diguyxihzztgosvyj3bjeh3y3oidg4gnt2qpv@5nygeq3rk333/
[2]: https://lore.kernel.org/lkml/20260512010341.101419-1-atomlin@atomlin.com/
Kind regards,
--
Aaron Tomlin
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-05-12 19:37 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-09 16:48 [PATCH 0/3] security, sched: Expand task_setscheduler LSM hook and related fixes Aaron Tomlin
2026-05-09 16:48 ` [PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach() Aaron Tomlin
2026-05-11 5:10 ` Waiman Long
2026-05-11 11:08 ` Aaron Tomlin
2026-05-11 17:54 ` Waiman Long
2026-05-11 20:25 ` Aaron Tomlin
2026-05-12 19:37 ` Aaron Tomlin
2026-05-09 16:48 ` [PATCH 2/3] security: Expand task_setscheduler LSM hook to include CPU affinity mask Aaron Tomlin
2026-05-09 18:29 ` Aaron Tomlin
2026-05-09 16:48 ` [PATCH 3/3] mips: sched: Fix CPUMASK_OFFSTACK memory corruption Aaron Tomlin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox