* [PATCH v2 0/2] kernfs: Fix UAF in PSI polling when open file is released
@ 2025-08-22 7:07 Chen Ridong
2025-08-22 7:07 ` [PATCH v2 1/2] kernfs: Fix UAF in " Chen Ridong
2025-08-22 7:07 ` [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release Chen Ridong
0 siblings, 2 replies; 11+ messages in thread
From: Chen Ridong @ 2025-08-22 7:07 UTC (permalink / raw)
To: gregkh, tj, hannes, mkoutny, peterz, zhouchengming
Cc: linux-kernel, cgroups, lujialin4, chenridong, libaokun1
From: Chen Ridong <chenridong@huawei.com>
This series fixes a use-after-free issue triggered when disabling and
enabling cgroup PSI.
---
v2:
- Add kernfs_get_active_of() to safely acquire active references for
  kernfs open files, as suggested by Tejun.
- Split into two patches, as suggested by Greg.
Chen Ridong (2):
kernfs: Fix UAF in polling when open file is released
cgroup/psi: Set of->priv to NULL upon file release
fs/kernfs/file.c | 58 +++++++++++++++++++++++++++---------------
kernel/cgroup/cgroup.c | 1 +
2 files changed, 39 insertions(+), 20 deletions(-)
--
2.34.1
* [PATCH v2 1/2] kernfs: Fix UAF in polling when open file is released
2025-08-22 7:07 [PATCH v2 0/2] kernfs: Fix UAF in PSI polling when open file is released Chen Ridong
@ 2025-08-22 7:07 ` Chen Ridong
2025-08-22 7:47 ` Greg KH
2025-08-22 17:46 ` Tejun Heo
2025-08-22 7:07 ` [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release Chen Ridong
1 sibling, 2 replies; 11+ messages in thread
From: Chen Ridong @ 2025-08-22 7:07 UTC (permalink / raw)
To: gregkh, tj, hannes, mkoutny, peterz, zhouchengming
Cc: linux-kernel, cgroups, lujialin4, chenridong, libaokun1
From: Chen Ridong <chenridong@huawei.com>
A use-after-free (UAF) vulnerability was identified in the PSI (Pressure
Stall Information) monitoring mechanism:
BUG: KASAN: slab-use-after-free in psi_trigger_poll+0x3c/0x140
Read of size 8 at addr ffff3de3d50bd308 by task systemd/1
psi_trigger_poll+0x3c/0x140
cgroup_pressure_poll+0x70/0xa0
cgroup_file_poll+0x8c/0x100
kernfs_fop_poll+0x11c/0x1c0
ep_item_poll.isra.0+0x188/0x2c0
Allocated by task 1:
cgroup_file_open+0x88/0x388
kernfs_fop_open+0x73c/0xaf0
do_dentry_open+0x5fc/0x1200
vfs_open+0xa0/0x3f0
do_open+0x7e8/0xd08
path_openat+0x2fc/0x6b0
do_filp_open+0x174/0x368
Freed by task 8462:
cgroup_file_release+0x130/0x1f8
kernfs_drain_open_files+0x17c/0x440
kernfs_drain+0x2dc/0x360
kernfs_show+0x1b8/0x288
cgroup_file_show+0x150/0x268
cgroup_pressure_write+0x1dc/0x340
cgroup_file_write+0x274/0x548
Reproduction Steps:
1. Open test/cpu.pressure and establish epoll monitoring
2. Disable monitoring: echo 0 > test/cgroup.pressure
3. Re-enable monitoring: echo 1 > test/cgroup.pressure
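The polling side of the steps above could be sketched in user space roughly as follows. This is only an illustration, not the reporter's actual reproducer: psi_watch() is a hypothetical helper, and the trigger string follows the "<some|full> <stall us> <window us>" format from the PSI documentation. The important property is that the same open file stays registered with epoll while cgroup.pressure is toggled from another shell.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): register a PSI trigger on
 * @path, then call epoll_wait() @rounds times on the same open file.
 * Returns the number of rounds that reported an event, or -1 on error.
 */
int psi_watch(const char *path, const char *trigger, int rounds, int timeout_ms)
{
	struct epoll_event ev = { .events = EPOLLPRI };
	struct epoll_event out;
	int fd, epfd, i, hits = 0;

	fd = open(path, O_RDWR | O_NONBLOCK);
	if (fd < 0)
		return -1;

	/* Writing the trigger string registers a PSI trigger for this open file. */
	if (write(fd, trigger, strlen(trigger) + 1) < 0) {
		close(fd);
		return -1;
	}

	epfd = epoll_create1(0);
	if (epfd < 0) {
		close(fd);
		return -1;
	}

	ev.data.fd = fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		hits = -1;
		goto out;
	}

	/* The fd stays open across rounds, as in steps 2 and 3 above. */
	for (i = 0; i < rounds; i++) {
		int n = epoll_wait(epfd, &out, 1, timeout_ms);

		if (n < 0) {
			hits = -1;
			break;
		}
		if (n > 0)
			hits++;
	}
out:
	close(epfd);
	close(fd);
	return hits;
}
```

Assuming a cgroup v2 mount at /sys/fs/cgroup (an assumption, the report only names test/cpu.pressure), one might call psi_watch("/sys/fs/cgroup/test/cpu.pressure", "some 150000 1000000", 1000, 100) in one terminal and run the two echo commands in another.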
The race condition occurs because:
1. Disabling cgroup.pressure (echo 0 > cgroup.pressure) drains the open
   files: kernfs_drain_open_files() calls cgroup_file_release(), which
   releases the PSI trigger and frees of->priv.
2. epoll still holds a reference to the file and continues polling.
3. After re-enabling (echo 1 > cgroup.pressure), kernfs_get_active()
   succeeds again, so the next poll dereferences the freed of->priv.
epolling                        disable/enable cgroup.pressure
fd=open(cpu.pressure)
while(1)
...
epoll_wait
kernfs_fop_poll
kernfs_get_active = true        echo 0 > cgroup.pressure
...                             cgroup_file_show
                                kernfs_show
                                // inactive kn
                                kernfs_drain_open_files
                                cft->release(of);
                                kfree(ctx);
...
kernfs_get_active = false
                                echo 1 > cgroup.pressure
                                kernfs_show
                                kernfs_activate_one(kn);
kernfs_fop_poll
kernfs_get_active = true
cgroup_file_poll
psi_trigger_poll
// UAF
...
end: close(fd)
To address this issue, introduce kernfs_get_active_of() for kernfs open
files to obtain active references. This function will fail if the open file
has been released. Replace kernfs_get_active() with kernfs_get_active_of()
to prevent further operations on released file descriptors.
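To make the difference between the two checks concrete, here is a deliberately simplified user-space model. It is not kernel code: every model_* name is invented for illustration, the node's active state is reduced to a boolean, and locking and reference counting are left out. It only shows why a per-node check passes again after hide/show while a per-open-file check keeps failing.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for kernfs_node and kernfs_open_file (assumptions). */
struct model_node {
	bool active;
};

struct model_open_file {
	struct model_node *kn;
	bool released;
	void *priv;
};

/* Old check: looks only at the node, so it passes again after re-enabling. */
bool model_get_active(struct model_node *kn)
{
	return kn->active;
}

/* New check: additionally refuses open files that were already released. */
bool model_get_active_of(struct model_open_file *of)
{
	if (of->released)
		return false;
	return model_get_active(of->kn);
}

/* Models "echo 0": deactivate the node and release the open file. */
void model_hide(struct model_open_file *of)
{
	of->kn->active = false;
	free(of->priv);		/* priv is left dangling on purpose */
	of->released = true;
}

/* Models "echo 1": the node becomes active, the open file stays released. */
void model_show(struct model_open_file *of)
{
	of->kn->active = true;
}
```

After model_hide() followed by model_show(), model_get_active() returns true while model_get_active_of() returns false, which corresponds to the poll path bailing out before it can touch the freed private data.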
Fixes: 34f26a15611a ("sched/psi: Per-cgroup PSI accounting disable/re-enable interface")
Reported-by: Zhang Zhaotian <zhangzhaotian@huawei.com>
Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
fs/kernfs/file.c | 58 +++++++++++++++++++++++++++++++-----------------
1 file changed, 38 insertions(+), 20 deletions(-)
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index a6c692cac616..9adf36e6364b 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -70,6 +70,24 @@ static struct kernfs_open_node *of_on(struct kernfs_open_file *of)
!list_empty(&of->list));
}
+/* Get active reference to kernfs node for an open file */
+static struct kernfs_open_file *kernfs_get_active_of(struct kernfs_open_file *of)
+{
+	/* Skip if file was already released */
+	if (unlikely(of->released))
+		return NULL;
+
+	if (!kernfs_get_active(of->kn))
+		return NULL;
+
+	return of;
+}
+
+static void kernfs_put_active_of(struct kernfs_open_file *of)
+{
+	return kernfs_put_active(of->kn);
+}
+
/**
* kernfs_deref_open_node_locked - Get kernfs_open_node corresponding to @kn
*
@@ -139,7 +157,7 @@ static void kernfs_seq_stop_active(struct seq_file *sf, void *v)
if (ops->seq_stop)
ops->seq_stop(sf, v);
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
}
static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
@@ -152,7 +170,7 @@ static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
* the ops aren't called concurrently for the same open file.
*/
mutex_lock(&of->mutex);
- if (!kernfs_get_active(of->kn))
+ if (!kernfs_get_active_of(of))
return ERR_PTR(-ENODEV);
ops = kernfs_ops(of->kn);
@@ -238,7 +256,7 @@ static ssize_t kernfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
* the ops aren't called concurrently for the same open file.
*/
mutex_lock(&of->mutex);
- if (!kernfs_get_active(of->kn)) {
+ if (!kernfs_get_active_of(of)) {
len = -ENODEV;
mutex_unlock(&of->mutex);
goto out_free;
@@ -252,7 +270,7 @@ static ssize_t kernfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
else
len = -EINVAL;
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
mutex_unlock(&of->mutex);
if (len < 0)
@@ -323,7 +341,7 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter)
* the ops aren't called concurrently for the same open file.
*/
mutex_lock(&of->mutex);
- if (!kernfs_get_active(of->kn)) {
+ if (!kernfs_get_active_of(of)) {
mutex_unlock(&of->mutex);
len = -ENODEV;
goto out_free;
@@ -335,7 +353,7 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter)
else
len = -EINVAL;
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
mutex_unlock(&of->mutex);
if (len > 0)
@@ -357,13 +375,13 @@ static void kernfs_vma_open(struct vm_area_struct *vma)
if (!of->vm_ops)
return;
- if (!kernfs_get_active(of->kn))
+ if (!kernfs_get_active_of(of))
return;
if (of->vm_ops->open)
of->vm_ops->open(vma);
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
}
static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf)
@@ -375,14 +393,14 @@ static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf)
if (!of->vm_ops)
return VM_FAULT_SIGBUS;
- if (!kernfs_get_active(of->kn))
+ if (!kernfs_get_active_of(of))
return VM_FAULT_SIGBUS;
ret = VM_FAULT_SIGBUS;
if (of->vm_ops->fault)
ret = of->vm_ops->fault(vmf);
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
return ret;
}
@@ -395,7 +413,7 @@ static vm_fault_t kernfs_vma_page_mkwrite(struct vm_fault *vmf)
if (!of->vm_ops)
return VM_FAULT_SIGBUS;
- if (!kernfs_get_active(of->kn))
+ if (!kernfs_get_active_of(of))
return VM_FAULT_SIGBUS;
ret = 0;
@@ -404,7 +422,7 @@ static vm_fault_t kernfs_vma_page_mkwrite(struct vm_fault *vmf)
else
file_update_time(file);
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
return ret;
}
@@ -418,14 +436,14 @@ static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
if (!of->vm_ops)
return -EINVAL;
- if (!kernfs_get_active(of->kn))
+ if (!kernfs_get_active_of(of))
return -EINVAL;
ret = -EINVAL;
if (of->vm_ops->access)
ret = of->vm_ops->access(vma, addr, buf, len, write);
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
return ret;
}
@@ -455,7 +473,7 @@ static int kernfs_fop_mmap(struct file *file, struct vm_area_struct *vma)
mutex_lock(&of->mutex);
rc = -ENODEV;
- if (!kernfs_get_active(of->kn))
+ if (!kernfs_get_active_of(of))
goto out_unlock;
ops = kernfs_ops(of->kn);
@@ -490,7 +508,7 @@ static int kernfs_fop_mmap(struct file *file, struct vm_area_struct *vma)
}
vma->vm_ops = &kernfs_vm_ops;
out_put:
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
out_unlock:
mutex_unlock(&of->mutex);
@@ -852,7 +870,7 @@ static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait)
struct kernfs_node *kn = kernfs_dentry_node(filp->f_path.dentry);
__poll_t ret;
- if (!kernfs_get_active(kn))
+ if (!kernfs_get_active_of(of))
return DEFAULT_POLLMASK|EPOLLERR|EPOLLPRI;
if (kn->attr.ops->poll)
@@ -860,7 +878,7 @@ static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait)
else
ret = kernfs_generic_poll(of, wait);
- kernfs_put_active(kn);
+ kernfs_put_active_of(of);
return ret;
}
@@ -875,7 +893,7 @@ static loff_t kernfs_fop_llseek(struct file *file, loff_t offset, int whence)
* the ops aren't called concurrently for the same open file.
*/
mutex_lock(&of->mutex);
- if (!kernfs_get_active(of->kn)) {
+ if (!kernfs_get_active_of(of)) {
mutex_unlock(&of->mutex);
return -ENODEV;
}
@@ -886,7 +904,7 @@ static loff_t kernfs_fop_llseek(struct file *file, loff_t offset, int whence)
else
ret = generic_file_llseek(file, offset, whence);
- kernfs_put_active(of->kn);
+ kernfs_put_active_of(of);
mutex_unlock(&of->mutex);
return ret;
}
--
2.34.1
* [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release
2025-08-22 7:07 [PATCH v2 0/2] kernfs: Fix UAF in PSI polling when open file is released Chen Ridong
2025-08-22 7:07 ` [PATCH v2 1/2] kernfs: Fix UAF in " Chen Ridong
@ 2025-08-22 7:07 ` Chen Ridong
2025-08-22 17:48 ` Tejun Heo
1 sibling, 1 reply; 11+ messages in thread
From: Chen Ridong @ 2025-08-22 7:07 UTC (permalink / raw)
To: gregkh, tj, hannes, mkoutny, peterz, zhouchengming
Cc: linux-kernel, cgroups, lujialin4, chenridong, libaokun1
From: Chen Ridong <chenridong@huawei.com>
Setting of->priv to NULL when the file is released enables earlier bug
detection. This allows potential bugs to manifest as NULL pointer
dereferences rather than use-after-free errors[1], which are generally more
difficult to diagnose.
[1] https://lore.kernel.org/cgroups/38ef3ff9-b380-44f0-9315-8b3714b0948d@huaweicloud.com/T/#m8a3b3f88f0ff3da5925d342e90043394f8b2091b
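The free-then-clear idea can be illustrated with a small user-space sketch. The demo_* names are hypothetical and exist only for this example. Once the pointer is cleared, a stale user sees NULL, which it can test for or which faults immediately, instead of silently reading freed memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-open context and the open file. */
struct demo_ctx {
	int trigger_id;
};

struct demo_file {
	struct demo_ctx *priv;
};

/* Release the context and clear the pointer so stale users see NULL. */
void demo_release(struct demo_file *f)
{
	free(f->priv);
	f->priv = NULL;
}

/* A late caller: returns -1 if the file was already released. */
int demo_poll(const struct demo_file *f)
{
	if (!f->priv)
		return -1;
	return f->priv->trigger_id;
}
```

In the kernel the stale access would typically show up as a NULL pointer dereference rather than an explicit check; the explicit test in demo_poll() is only there to keep the sketch safe to run.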
Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
kernel/cgroup/cgroup.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 312c6a8b55bb..d8b82afed181 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4159,6 +4159,7 @@ static void cgroup_file_release(struct kernfs_open_file *of)
 		cft->release(of);
 	put_cgroup_ns(ctx->ns);
 	kfree(ctx);
+	of->priv = NULL;
 }

 static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
--
2.34.1
* Re: [PATCH v2 1/2] kernfs: Fix UAF in polling when open file is released
2025-08-22 7:07 ` [PATCH v2 1/2] kernfs: Fix UAF in " Chen Ridong
@ 2025-08-22 7:47 ` Greg KH
2025-08-22 17:46 ` Tejun Heo
1 sibling, 0 replies; 11+ messages in thread
From: Greg KH @ 2025-08-22 7:47 UTC (permalink / raw)
To: Chen Ridong
Cc: tj, hannes, mkoutny, peterz, zhouchengming, linux-kernel, cgroups,
lujialin4, chenridong, libaokun1
On Fri, Aug 22, 2025 at 07:07:14AM +0000, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> A use-after-free (UAF) vulnerability was identified in the PSI (Pressure
> Stall Information) monitoring mechanism:
>
> BUG: KASAN: slab-use-after-free in psi_trigger_poll+0x3c/0x140
> Read of size 8 at addr ffff3de3d50bd308 by task systemd/1
>
> psi_trigger_poll+0x3c/0x140
> cgroup_pressure_poll+0x70/0xa0
> cgroup_file_poll+0x8c/0x100
> kernfs_fop_poll+0x11c/0x1c0
> ep_item_poll.isra.0+0x188/0x2c0
>
> Allocated by task 1:
> cgroup_file_open+0x88/0x388
> kernfs_fop_open+0x73c/0xaf0
> do_dentry_open+0x5fc/0x1200
> vfs_open+0xa0/0x3f0
> do_open+0x7e8/0xd08
> path_openat+0x2fc/0x6b0
> do_filp_open+0x174/0x368
>
> Freed by task 8462:
> cgroup_file_release+0x130/0x1f8
> kernfs_drain_open_files+0x17c/0x440
> kernfs_drain+0x2dc/0x360
> kernfs_show+0x1b8/0x288
> cgroup_file_show+0x150/0x268
> cgroup_pressure_write+0x1dc/0x340
> cgroup_file_write+0x274/0x548
>
> Reproduction Steps:
> 1. Open test/cpu.pressure and establish epoll monitoring
> 2. Disable monitoring: echo 0 > test/cgroup.pressure
> 3. Re-enable monitoring: echo 1 > test/cgroup.pressure
>
> The race condition occurs because:
> 1. When cgroup.pressure is disabled (echo 0 > cgroup.pressure), it:
> - Releases PSI triggers via cgroup_file_release()
> - Frees of->priv through kernfs_drain_open_files()
> 2. While epoll still holds reference to the file and continues polling
> 3. Re-enabling (echo 1 > cgroup.pressure) accesses freed of->priv
>
> epolling disable/enable cgroup.pressure
> fd=open(cpu.pressure)
> while(1)
> ...
> epoll_wait
> kernfs_fop_poll
> kernfs_get_active = true echo 0 > cgroup.pressure
> ... cgroup_file_show
> kernfs_show
> // inactive kn
> kernfs_drain_open_files
> cft->release(of);
> kfree(ctx);
> ...
> kernfs_get_active = false
> echo 1 > cgroup.pressure
> kernfs_show
> kernfs_activate_one(kn);
> kernfs_fop_poll
> kernfs_get_active = true
> cgroup_file_poll
> psi_trigger_poll
> // UAF
> ...
> end: close(fd)
>
> To address this issue, introduce kernfs_get_active_of() for kernfs open
> files to obtain active references. This function will fail if the open file
> has been released. Replace kernfs_get_active() with kernfs_get_active_of()
> to prevent further operations on released file descriptors.
>
> Fixes: 34f26a15611a ("sched/psi: Per-cgroup PSI accounting disable/re-enable interface")
> Reported-by: Zhang Zhaotian <zhangzhaotian@huawei.com>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
> fs/kernfs/file.c | 58 +++++++++++++++++++++++++++++++-----------------
> 1 file changed, 38 insertions(+), 20 deletions(-)
>
> diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
> index a6c692cac616..9adf36e6364b 100644
> --- a/fs/kernfs/file.c
> +++ b/fs/kernfs/file.c
> @@ -70,6 +70,24 @@ static struct kernfs_open_node *of_on(struct kernfs_open_file *of)
> !list_empty(&of->list));
> }
>
> +/* Get active reference to kernfs node for an open file */
> +static struct kernfs_open_file *kernfs_get_active_of(struct kernfs_open_file *of)
> +{
> + /* Skip if file was already released */
> + if (unlikely(of->released))
> + return NULL;
> +
> + if (!kernfs_get_active(of->kn))
> + return NULL;
> +
> + return of;
> +}
> +
> +static void kernfs_put_active_of(struct kernfs_open_file *of)
> +{
> + return kernfs_put_active(of->kn);
> +}
> +
> /**
> * kernfs_deref_open_node_locked - Get kernfs_open_node corresponding to @kn
> *
> @@ -139,7 +157,7 @@ static void kernfs_seq_stop_active(struct seq_file *sf, void *v)
>
> if (ops->seq_stop)
> ops->seq_stop(sf, v);
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> }
>
> static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
> @@ -152,7 +170,7 @@ static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
> * the ops aren't called concurrently for the same open file.
> */
> mutex_lock(&of->mutex);
> - if (!kernfs_get_active(of->kn))
> + if (!kernfs_get_active_of(of))
> return ERR_PTR(-ENODEV);
>
> ops = kernfs_ops(of->kn);
> @@ -238,7 +256,7 @@ static ssize_t kernfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
> * the ops aren't called concurrently for the same open file.
> */
> mutex_lock(&of->mutex);
> - if (!kernfs_get_active(of->kn)) {
> + if (!kernfs_get_active_of(of)) {
> len = -ENODEV;
> mutex_unlock(&of->mutex);
> goto out_free;
> @@ -252,7 +270,7 @@ static ssize_t kernfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
> else
> len = -EINVAL;
>
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> mutex_unlock(&of->mutex);
>
> if (len < 0)
> @@ -323,7 +341,7 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter)
> * the ops aren't called concurrently for the same open file.
> */
> mutex_lock(&of->mutex);
> - if (!kernfs_get_active(of->kn)) {
> + if (!kernfs_get_active_of(of)) {
> mutex_unlock(&of->mutex);
> len = -ENODEV;
> goto out_free;
> @@ -335,7 +353,7 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter)
> else
> len = -EINVAL;
>
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> mutex_unlock(&of->mutex);
>
> if (len > 0)
> @@ -357,13 +375,13 @@ static void kernfs_vma_open(struct vm_area_struct *vma)
> if (!of->vm_ops)
> return;
>
> - if (!kernfs_get_active(of->kn))
> + if (!kernfs_get_active_of(of))
> return;
>
> if (of->vm_ops->open)
> of->vm_ops->open(vma);
>
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> }
>
> static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf)
> @@ -375,14 +393,14 @@ static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf)
> if (!of->vm_ops)
> return VM_FAULT_SIGBUS;
>
> - if (!kernfs_get_active(of->kn))
> + if (!kernfs_get_active_of(of))
> return VM_FAULT_SIGBUS;
>
> ret = VM_FAULT_SIGBUS;
> if (of->vm_ops->fault)
> ret = of->vm_ops->fault(vmf);
>
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> return ret;
> }
>
> @@ -395,7 +413,7 @@ static vm_fault_t kernfs_vma_page_mkwrite(struct vm_fault *vmf)
> if (!of->vm_ops)
> return VM_FAULT_SIGBUS;
>
> - if (!kernfs_get_active(of->kn))
> + if (!kernfs_get_active_of(of))
> return VM_FAULT_SIGBUS;
>
> ret = 0;
> @@ -404,7 +422,7 @@ static vm_fault_t kernfs_vma_page_mkwrite(struct vm_fault *vmf)
> else
> file_update_time(file);
>
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> return ret;
> }
>
> @@ -418,14 +436,14 @@ static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
> if (!of->vm_ops)
> return -EINVAL;
>
> - if (!kernfs_get_active(of->kn))
> + if (!kernfs_get_active_of(of))
> return -EINVAL;
>
> ret = -EINVAL;
> if (of->vm_ops->access)
> ret = of->vm_ops->access(vma, addr, buf, len, write);
>
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> return ret;
> }
>
> @@ -455,7 +473,7 @@ static int kernfs_fop_mmap(struct file *file, struct vm_area_struct *vma)
> mutex_lock(&of->mutex);
>
> rc = -ENODEV;
> - if (!kernfs_get_active(of->kn))
> + if (!kernfs_get_active_of(of))
> goto out_unlock;
>
> ops = kernfs_ops(of->kn);
> @@ -490,7 +508,7 @@ static int kernfs_fop_mmap(struct file *file, struct vm_area_struct *vma)
> }
> vma->vm_ops = &kernfs_vm_ops;
> out_put:
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> out_unlock:
> mutex_unlock(&of->mutex);
>
> @@ -852,7 +870,7 @@ static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait)
> struct kernfs_node *kn = kernfs_dentry_node(filp->f_path.dentry);
> __poll_t ret;
>
> - if (!kernfs_get_active(kn))
> + if (!kernfs_get_active_of(of))
> return DEFAULT_POLLMASK|EPOLLERR|EPOLLPRI;
>
> if (kn->attr.ops->poll)
> @@ -860,7 +878,7 @@ static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait)
> else
> ret = kernfs_generic_poll(of, wait);
>
> - kernfs_put_active(kn);
> + kernfs_put_active_of(of);
> return ret;
> }
>
> @@ -875,7 +893,7 @@ static loff_t kernfs_fop_llseek(struct file *file, loff_t offset, int whence)
> * the ops aren't called concurrently for the same open file.
> */
> mutex_lock(&of->mutex);
> - if (!kernfs_get_active(of->kn)) {
> + if (!kernfs_get_active_of(of)) {
> mutex_unlock(&of->mutex);
> return -ENODEV;
> }
> @@ -886,7 +904,7 @@ static loff_t kernfs_fop_llseek(struct file *file, loff_t offset, int whence)
> else
> ret = generic_file_llseek(file, offset, whence);
>
> - kernfs_put_active(of->kn);
> + kernfs_put_active_of(of);
> mutex_unlock(&of->mutex);
> return ret;
> }
> --
> 2.34.1
>
Hi,
This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him
a patch that has triggered this response. He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created. Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.
You are receiving this message because of the following common error(s)
as indicated below:
- You have marked a patch with a "Fixes:" tag for a commit that is in an
older released kernel, yet you do not have a cc: stable line in the
signed-off-by area at all, which means that the patch will not be
applied to any older kernel releases. To properly fix this, please
follow the documented rules in the
Documentation/process/stable-kernel-rules.rst file for how to resolve
this.
If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.
thanks,
greg k-h's patch email bot
* Re: [PATCH v2 1/2] kernfs: Fix UAF in polling when open file is released
2025-08-22 7:07 ` [PATCH v2 1/2] kernfs: Fix UAF in " Chen Ridong
2025-08-22 7:47 ` Greg KH
@ 2025-08-22 17:46 ` Tejun Heo
2025-08-23 0:23 ` Chen Ridong
1 sibling, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2025-08-22 17:46 UTC (permalink / raw)
To: Chen Ridong
Cc: gregkh, hannes, mkoutny, peterz, zhouchengming, linux-kernel,
cgroups, lujialin4, chenridong, libaokun1
On Fri, Aug 22, 2025 at 07:07:14AM +0000, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> A use-after-free (UAF) vulnerability was identified in the PSI (Pressure
> Stall Information) monitoring mechanism:
>
> BUG: KASAN: slab-use-after-free in psi_trigger_poll+0x3c/0x140
> Read of size 8 at addr ffff3de3d50bd308 by task systemd/1
>
> psi_trigger_poll+0x3c/0x140
> cgroup_pressure_poll+0x70/0xa0
> cgroup_file_poll+0x8c/0x100
> kernfs_fop_poll+0x11c/0x1c0
> ep_item_poll.isra.0+0x188/0x2c0
>
> Allocated by task 1:
> cgroup_file_open+0x88/0x388
> kernfs_fop_open+0x73c/0xaf0
> do_dentry_open+0x5fc/0x1200
> vfs_open+0xa0/0x3f0
> do_open+0x7e8/0xd08
> path_openat+0x2fc/0x6b0
> do_filp_open+0x174/0x368
>
> Freed by task 8462:
> cgroup_file_release+0x130/0x1f8
> kernfs_drain_open_files+0x17c/0x440
> kernfs_drain+0x2dc/0x360
> kernfs_show+0x1b8/0x288
> cgroup_file_show+0x150/0x268
> cgroup_pressure_write+0x1dc/0x340
> cgroup_file_write+0x274/0x548
>
> Reproduction Steps:
> 1. Open test/cpu.pressure and establish epoll monitoring
> 2. Disable monitoring: echo 0 > test/cgroup.pressure
> 3. Re-enable monitoring: echo 1 > test/cgroup.pressure
>
> The race condition occurs because:
> 1. When cgroup.pressure is disabled (echo 0 > cgroup.pressure), it:
> - Releases PSI triggers via cgroup_file_release()
> - Frees of->priv through kernfs_drain_open_files()
> 2. While epoll still holds reference to the file and continues polling
> 3. Re-enabling (echo 1 > cgroup.pressure) accesses freed of->priv
>
> epolling disable/enable cgroup.pressure
> fd=open(cpu.pressure)
> while(1)
> ...
> epoll_wait
> kernfs_fop_poll
> kernfs_get_active = true echo 0 > cgroup.pressure
> ... cgroup_file_show
> kernfs_show
> // inactive kn
> kernfs_drain_open_files
> cft->release(of);
> kfree(ctx);
> ...
> kernfs_get_active = false
> echo 1 > cgroup.pressure
> kernfs_show
> kernfs_activate_one(kn);
> kernfs_fop_poll
> kernfs_get_active = true
> cgroup_file_poll
> psi_trigger_poll
> // UAF
> ...
> end: close(fd)
>
> To address this issue, introduce kernfs_get_active_of() for kernfs open
> files to obtain active references. This function will fail if the open file
> has been released. Replace kernfs_get_active() with kernfs_get_active_of()
> to prevent further operations on released file descriptors.
>
> Fixes: 34f26a15611a ("sched/psi: Per-cgroup PSI accounting disable/re-enable interface")
> Reported-by: Zhang Zhaotian <zhangzhaotian@huawei.com>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Thanks.
--
tejun
* Re: [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release
2025-08-22 7:07 ` [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release Chen Ridong
@ 2025-08-22 17:48 ` Tejun Heo
2025-08-23 6:43 ` Greg KH
0 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2025-08-22 17:48 UTC (permalink / raw)
To: Chen Ridong
Cc: gregkh, hannes, mkoutny, peterz, zhouchengming, linux-kernel,
cgroups, lujialin4, chenridong, libaokun1
On Fri, Aug 22, 2025 at 07:07:15AM +0000, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> Setting of->priv to NULL when the file is released enables earlier bug
> detection. This allows potential bugs to manifest as NULL pointer
> dereferences rather than use-after-free errors[1], which are generally more
> difficult to diagnose.
>
> [1] https://lore.kernel.org/cgroups/38ef3ff9-b380-44f0-9315-8b3714b0948d@huaweicloud.com/T/#m8a3b3f88f0ff3da5925d342e90043394f8b2091b
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
Applied to cgroup/for-6.17-fixes.
Thanks.
--
tejun
* Re: [PATCH v2 1/2] kernfs: Fix UAF in polling when open file is released
2025-08-22 17:46 ` Tejun Heo
@ 2025-08-23 0:23 ` Chen Ridong
0 siblings, 0 replies; 11+ messages in thread
From: Chen Ridong @ 2025-08-23 0:23 UTC (permalink / raw)
To: Tejun Heo
Cc: gregkh, hannes, mkoutny, peterz, zhouchengming, linux-kernel,
cgroups, lujialin4, chenridong, libaokun1
On 2025/8/23 1:46, Tejun Heo wrote:
> On Fri, Aug 22, 2025 at 07:07:14AM +0000, Chen Ridong wrote:
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> A use-after-free (UAF) vulnerability was identified in the PSI (Pressure
>> Stall Information) monitoring mechanism:
>>
>> BUG: KASAN: slab-use-after-free in psi_trigger_poll+0x3c/0x140
>> Read of size 8 at addr ffff3de3d50bd308 by task systemd/1
>>
>> psi_trigger_poll+0x3c/0x140
>> cgroup_pressure_poll+0x70/0xa0
>> cgroup_file_poll+0x8c/0x100
>> kernfs_fop_poll+0x11c/0x1c0
>> ep_item_poll.isra.0+0x188/0x2c0
>>
>> Allocated by task 1:
>> cgroup_file_open+0x88/0x388
>> kernfs_fop_open+0x73c/0xaf0
>> do_dentry_open+0x5fc/0x1200
>> vfs_open+0xa0/0x3f0
>> do_open+0x7e8/0xd08
>> path_openat+0x2fc/0x6b0
>> do_filp_open+0x174/0x368
>>
>> Freed by task 8462:
>> cgroup_file_release+0x130/0x1f8
>> kernfs_drain_open_files+0x17c/0x440
>> kernfs_drain+0x2dc/0x360
>> kernfs_show+0x1b8/0x288
>> cgroup_file_show+0x150/0x268
>> cgroup_pressure_write+0x1dc/0x340
>> cgroup_file_write+0x274/0x548
>>
>> Reproduction Steps:
>> 1. Open test/cpu.pressure and establish epoll monitoring
>> 2. Disable monitoring: echo 0 > test/cgroup.pressure
>> 3. Re-enable monitoring: echo 1 > test/cgroup.pressure
>>
>> The race condition occurs because:
>> 1. When cgroup.pressure is disabled (echo 0 > cgroup.pressure), it:
>> - Releases PSI triggers via cgroup_file_release()
>> - Frees of->priv through kernfs_drain_open_files()
>> 2. While epoll still holds reference to the file and continues polling
>> 3. Re-enabling (echo 1 > cgroup.pressure) accesses freed of->priv
>>
>> epolling disable/enable cgroup.pressure
>> fd=open(cpu.pressure)
>> while(1)
>> ...
>> epoll_wait
>> kernfs_fop_poll
>> kernfs_get_active = true echo 0 > cgroup.pressure
>> ... cgroup_file_show
>> kernfs_show
>> // inactive kn
>> kernfs_drain_open_files
>> cft->release(of);
>> kfree(ctx);
>> ...
>> kernfs_get_active = false
>> echo 1 > cgroup.pressure
>> kernfs_show
>> kernfs_activate_one(kn);
>> kernfs_fop_poll
>> kernfs_get_active = true
>> cgroup_file_poll
>> psi_trigger_poll
>> // UAF
>> ...
>> end: close(fd)
>>
>> To address this issue, introduce kernfs_get_active_of() for kernfs open
>> files to obtain active references. This function will fail if the open file
>> has been released. Replace kernfs_get_active() with kernfs_get_active_of()
>> to prevent further operations on released file descriptors.
>>
>> Fixes: 34f26a15611a ("sched/psi: Per-cgroup PSI accounting disable/re-enable interface")
>> Reported-by: Zhang Zhaotian <zhangzhaotian@huawei.com>
>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>
> Acked-by: Tejun Heo <tj@kernel.org>
>
> Thanks.
>
Thanks
--
Best regards,
Ridong
* Re: [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release
2025-08-22 17:48 ` Tejun Heo
@ 2025-08-23 6:43 ` Greg KH
2025-08-25 17:32 ` Tejun Heo
0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2025-08-23 6:43 UTC (permalink / raw)
To: Tejun Heo
Cc: Chen Ridong, hannes, mkoutny, peterz, zhouchengming, linux-kernel,
cgroups, lujialin4, chenridong, libaokun1
On Fri, Aug 22, 2025 at 07:48:08AM -1000, Tejun Heo wrote:
> On Fri, Aug 22, 2025 at 07:07:15AM +0000, Chen Ridong wrote:
> > From: Chen Ridong <chenridong@huawei.com>
> >
> > Setting of->priv to NULL when the file is released enables earlier bug
> > detection. This allows potential bugs to manifest as NULL pointer
> > dereferences rather than use-after-free errors[1], which are generally more
> > difficult to diagnose.
> >
> > [1] https://lore.kernel.org/cgroups/38ef3ff9-b380-44f0-9315-8b3714b0948d@huaweicloud.com/T/#m8a3b3f88f0ff3da5925d342e90043394f8b2091b
> > Signed-off-by: Chen Ridong <chenridong@huawei.com>
>
> Applied to cgroup/for-6.17-fixes.
Both or just this second patch? Should I take the first through the
driver-core tree, or do you want to take it through the cgroup tree? No
objection from me for you to take both :)
thanks,
greg k-h
* Re: [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release
2025-08-23 6:43 ` Greg KH
@ 2025-08-25 17:32 ` Tejun Heo
2025-09-01 1:38 ` Chen Ridong
0 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2025-08-25 17:32 UTC (permalink / raw)
To: Greg KH
Cc: Chen Ridong, hannes, mkoutny, peterz, zhouchengming, linux-kernel,
cgroups, lujialin4, chenridong, libaokun1
Hello, Greg.
On Sat, Aug 23, 2025 at 08:43:48AM +0200, Greg KH wrote:
> > Applied to cgroup/for-6.17-fixes.
>
> Both or just this second patch? Should I take the first through the
> driver-core tree, or do you want to take it through the cgroup tree? No
> objection from me for you to take both :)
Sorry about the lack of clarity. Just the second one. The first one looks
fine to me but it would probably be more appropriate if you take it.
Thanks!
--
tejun
* Re: [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release
2025-08-25 17:32 ` Tejun Heo
@ 2025-09-01 1:38 ` Chen Ridong
2025-09-01 6:06 ` Greg KH
0 siblings, 1 reply; 11+ messages in thread
From: Chen Ridong @ 2025-09-01 1:38 UTC (permalink / raw)
To: Tejun Heo, Greg KH
Cc: hannes, mkoutny, peterz, zhouchengming, linux-kernel, cgroups,
lujialin4, chenridong, libaokun1
On 2025/8/26 1:32, Tejun Heo wrote:
> Hello, Greg.
>
> On Sat, Aug 23, 2025 at 08:43:48AM +0200, Greg KH wrote:
>>> Applied to cgroup/for-6.17-fixes.
>>
>> Both or just this second patch? Should I take the first through the
>> driver-core tree, or do you want to take it through the cgroup tree? No
>> objection from me for you to take both :)
>
> Sorry about the lack of clarity. Just the second one. The first one looks
> fine to me but it would probably be more appropriate if you take it.
>
> Thanks!
>
Hello all,
Any other opinions? Can this patch be applied?
--
Best regards,
Ridong
* Re: [PATCH v2 2/2] cgroup/psi: Set of->priv to NULL upon file release
2025-09-01 1:38 ` Chen Ridong
@ 2025-09-01 6:06 ` Greg KH
0 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2025-09-01 6:06 UTC (permalink / raw)
To: Chen Ridong
Cc: Tejun Heo, hannes, mkoutny, peterz, zhouchengming, linux-kernel,
cgroups, lujialin4, chenridong, libaokun1
On Mon, Sep 01, 2025 at 09:38:49AM +0800, Chen Ridong wrote:
>
>
> On 2025/8/26 1:32, Tejun Heo wrote:
> > Hello, Greg.
> >
> > On Sat, Aug 23, 2025 at 08:43:48AM +0200, Greg KH wrote:
> >>> Applied to cgroup/for-6.17-fixes.
> >>
> >> Both or just this second patch? Should I take the first through the
> >> driver-core tree, or do you want to take it through the cgroup tree? No
> >> objection from me for you to take both :)
> >
> > Sorry about the lack of clarity. Just the second one. The first one looks
> > fine to me but it would probably be more appropriate if you take it.
> >
> > Thanks!
> >
>
> Hello all,
>
> Any other opinions? Can this patch be applied?
Please give us a chance to catch up with patch reviews :)