cgroups.vger.kernel.org archive mirror
From: Chen Ridong <chenridong@huaweicloud.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com,
	peterz@infradead.org, zhouchengming@bytedance.com,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	lujialin4@huawei.com, chenridong@huawei.com
Subject: Re: [PATCH] kernfs: Fix UAF in PSI polling when open file is released
Date: Fri, 15 Aug 2025 15:16:15 +0800
Message-ID: <269e3b81-3ad6-48b8-9756-4bc272f52fc4@huaweicloud.com>
In-Reply-To: <2025081558-patriot-goes-4559@gregkh>



On 2025/8/15 14:43, Greg KH wrote:
> On Fri, Aug 15, 2025 at 02:22:54PM +0800, Chen Ridong wrote:
>>
>>
>> On 2025/8/15 14:11, Greg KH wrote:
>>> On Fri, Aug 15, 2025 at 01:34:29AM +0000, Chen Ridong wrote:
>>>> From: Chen Ridong <chenridong@huawei.com>
>>>>
>>>> A use-after-free (UAF) vulnerability was identified in the PSI (Pressure
>>>> Stall Information) monitoring mechanism:
>>>>
>>>> BUG: KASAN: slab-use-after-free in psi_trigger_poll+0x3c/0x140
>>>> Read of size 8 at addr ffff3de3d50bd308 by task systemd/1
>>>>
>>>> psi_trigger_poll+0x3c/0x140
>>>> cgroup_pressure_poll+0x70/0xa0
>>>> cgroup_file_poll+0x8c/0x100
>>>> kernfs_fop_poll+0x11c/0x1c0
>>>> ep_item_poll.isra.0+0x188/0x2c0
>>>>
>>>> Allocated by task 1:
>>>> cgroup_file_open+0x88/0x388
>>>> kernfs_fop_open+0x73c/0xaf0
>>>> do_dentry_open+0x5fc/0x1200
>>>> vfs_open+0xa0/0x3f0
>>>> do_open+0x7e8/0xd08
>>>> path_openat+0x2fc/0x6b0
>>>> do_filp_open+0x174/0x368
>>>>
>>>> Freed by task 8462:
>>>> cgroup_file_release+0x130/0x1f8
>>>> kernfs_drain_open_files+0x17c/0x440
>>>> kernfs_drain+0x2dc/0x360
>>>> kernfs_show+0x1b8/0x288
>>>> cgroup_file_show+0x150/0x268
>>>> cgroup_pressure_write+0x1dc/0x340
>>>> cgroup_file_write+0x274/0x548
>>>>
>>>> Reproduction Steps:
>>>> 1. Open test/cpu.pressure and establish epoll monitoring
>>>> 2. Disable monitoring: echo 0 > test/cgroup.pressure
>>>> 3. Re-enable monitoring: echo 1 > test/cgroup.pressure
>>>>
>>>> The race condition occurs because:
>>>> 1. When cgroup.pressure is disabled (echo 0 > cgroup.pressure), it:
>>>>    - Releases PSI triggers via cgroup_file_release()
>>>>    - Frees of->priv through kernfs_drain_open_files()
>>>> 2. Meanwhile, epoll still holds a reference to the file and continues polling
>>>> 3. Re-enabling (echo 1 > cgroup.pressure) accesses freed of->priv
>>>>
>>>> epolling			disable/enable cgroup.pressure
>>>> fd=open(cpu.pressure)
>>>> while(1)
>>>> ...
>>>> epoll_wait
>>>> kernfs_fop_poll
>>>> kernfs_get_active = true	echo 0 > cgroup.pressure
>>>> ...				cgroup_file_show
>>>> 				kernfs_show
>>>> 				// inactive kn
>>>> 				kernfs_drain_open_files
>>>> 				cft->release(of);
>>>> 				kfree(ctx);
>>>> 				...
>>>> kernfs_get_active = false
>>>> 				echo 1 > cgroup.pressure
>>>> 				kernfs_show
>>>> 				kernfs_activate_one(kn);
>>>> kernfs_fop_poll
>>>> kernfs_get_active = true
>>>> cgroup_file_poll
>>>> psi_trigger_poll
>>>> // UAF
>>>> ...
>>>> end: close(fd)
>>>>
>>>> Fix this by adding a released-flag check in kernfs_fop_poll(), so that a
>>>> released open file is treated the same as an inactive kn.
>>>>
>>>> Fixes: 34f26a15611a ("sched/psi: Per-cgroup PSI accounting disable/re-enable interface")
>>>> Reported-by: Zhang Zhantian <zhangzhaotian@huawei.com>
>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>>>> ---
>>>>  fs/kernfs/file.c       | 2 +-
>>>>  kernel/cgroup/cgroup.c | 1 +
>>>>  2 files changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
>>>> index a6c692cac616..d5d01f0b9392 100644
>>>> --- a/fs/kernfs/file.c
>>>> +++ b/fs/kernfs/file.c
>>>> @@ -852,7 +852,7 @@ static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait)
>>>>  	struct kernfs_node *kn = kernfs_dentry_node(filp->f_path.dentry);
>>>>  	__poll_t ret;
>>>>  
>>>> -	if (!kernfs_get_active(kn))
>>>> +	if (of->released || !kernfs_get_active(kn))
>>>
>>> I can see why the cgroup change is needed, but why is this test for
>>> released() an issue here?  This feels like two different changes/fixes
>>> for different problems?  Why does testing for released matter at this
>>> point in time?
>>>
>>> thanks,
>>>
>>> greg k-h
>>
>> Thank you for your feedback.
>>
>> The cgroup changes can prevent the UAF (Use-After-Free) issue, but they will introduce a NULL
>> pointer access problem.
> 
> Where will the null pointer access happen?  kernfs_fop_poll() isn't
> caring about the released file, AND you need to properly lock things
> when testing that value (hint, what if it changes right after you tested
> it?)
> 
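On the locking hint: a bare test of of->released can go stale between the
check and the poll callback. As a rough userspace sketch only (struct
locked_file, locked_release() and locked_poll() are hypothetical names, and a
pthread mutex merely stands in for whatever lock kernfs would actually use),
the flag test and the use of priv can sit in the same critical section that
the release path takes:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in kernfs. */
struct locked_file {
	pthread_mutex_t lock;	/* serializes release against poll */
	bool released;		/* stands in for of->released */
	int *priv;		/* stands in for of->priv */
};

#define LOCKED_POLL_ERR (-1)

/* Release path: set the flag and free priv under the lock. */
static void locked_release(struct locked_file *f)
{
	pthread_mutex_lock(&f->lock);
	f->released = true;
	free(f->priv);
	f->priv = NULL;
	pthread_mutex_unlock(&f->lock);
}

/* Poll path: test the flag and read priv inside one critical section. */
static int locked_poll(struct locked_file *f)
{
	int ret = LOCKED_POLL_ERR;

	pthread_mutex_lock(&f->lock);
	if (!f->released)
		ret = *f->priv;
	pthread_mutex_unlock(&f->lock);
	return ret;
}
```

With this shape the flag cannot flip between the test and the dereference,
because locked_release() has to wait for the lock. Whether an equivalent lock
is acceptable in the kernfs poll path is a separate question that this model
does not answer.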


The issue occurs in psi_trigger_poll() during this sequence:

fd = open("cpu.pressure")
<...> // cgroup.pressure is disabled, then re-enabled
of->released flag gets set to true
kernfs_fop_poll()
└─ kernfs_get_active() returns true (due to the re-enable) <<<-- kernfs change: check of->released here
    └─ kn->attr.ops->poll()
        └─ cgroup_file_poll()
            └─ ctx = of->priv; // already released by .release() (cgroup_file_release)
                └─ psi_trigger_poll()
                    └─ smp_load_acquire(trigger_ptr);
// UAF: trigger_ptr is derived from the freed of->priv
// NULL pointer dereference once the cgroup change sets of->priv = NULL
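
To make the sequence concrete, here is a small userspace model of it. This is
not kernel code: every model_* name and both MODEL_* masks are made up for
illustration, and the node's active state is reduced to a bool argument. It
only shows why a poll path that trusts "node is active" alone ends up
touching priv after release, and how a released check in front avoids that.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
struct model_ctx {
	int trigger;			/* stands in for the PSI trigger */
};

struct model_open_file {
	bool released;			/* models of->released */
	struct model_ctx *priv;		/* models of->priv */
};

#define MODEL_POLL_IN	0x1u
#define MODEL_POLL_ERR	0x8u

/* Models the drain path: .release() has run and priv is gone. */
static void model_release(struct model_open_file *of)
{
	of->priv = NULL;	/* the real release path also frees the context */
	of->released = true;
}

/* Models the poll callback: it dereferences priv unconditionally. */
static unsigned int model_file_poll(struct model_open_file *of)
{
	return of->priv->trigger ? MODEL_POLL_IN : 0;
}

/* Models the outer poll entry point with the proposed check in place. */
static unsigned int model_fop_poll(struct model_open_file *of, bool node_active)
{
	/* A released open file is handled like an inactive node. */
	if (of->released || !node_active)
		return MODEL_POLL_ERR;

	return model_file_poll(of);
}
```

After model_release(), calling model_fop_poll(of, true) corresponds to the
"re-enabled" case in the trace above: the node is active again, but the
released check keeps model_file_poll() from being reached.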


>> If the open file has been drained (released), the poll callback can no longer be invoked safely.
>> Otherwise, it may still lead to either UAF or NULL pointer issues.
>>
>> Are you suggesting I should split this into two separate patches?
> 
> This feels like two separate issues for two different things.  The PSI
> change didn't cause the kernfs change to be required right?
> 
> thanks,
> 
> greg k-h

Correct. The cgroup modifications make the issue more easily observable, while the kernfs changes
provide the actual fix.
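
For anyone who wants to try the reproduction steps from the patch
description, step 1 (open test/cpu.pressure and establish epoll monitoring)
might look roughly like the sketch below. watch_pressure() is a hypothetical
helper, and the trigger string passed to it (for example
"some 150000 1000000") is an assumption based on the PSI documentation, not
something stated in this thread. Steps 2 and 3 are the two echo commands, run
from another shell while the caller loops on epoll_wait().

```c
#include <fcntl.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Open a pressure file, register a PSI trigger on it and add the fd to a
 * new epoll instance. Returns the epoll fd, or -1 on any failure. On
 * success the pressure fd is stored in *pressure_fd for the caller to close.
 */
static int watch_pressure(const char *path, const char *trigger,
			  int *pressure_fd)
{
	struct epoll_event ev = { .events = EPOLLPRI };
	int fd, epfd;

	fd = open(path, O_RDWR | O_NONBLOCK);
	if (fd < 0)
		return -1;

	/* The trigger is written as a NUL-terminated string. */
	if (write(fd, trigger, strlen(trigger) + 1) < 0)
		goto err_close;

	epfd = epoll_create1(0);
	if (epfd < 0)
		goto err_close;

	ev.data.fd = fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(epfd);
		goto err_close;
	}

	*pressure_fd = fd;
	return epfd;

err_close:
	close(fd);
	return -1;
}
```

A caller would pass the returned epoll fd to epoll_wait() in a loop and close
both descriptors at the end, matching the "end: close(fd)" step in the
diagram from the patch description.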

-- 
Best regards,
Ridong


Thread overview: 11+ messages
2025-08-15  1:34 [PATCH] kernfs: Fix UAF in PSI polling when open file is released Chen Ridong
2025-08-15  6:11 ` Greg KH
2025-08-15  6:22   ` Chen Ridong
2025-08-15  6:43     ` Greg KH
2025-08-15  7:16       ` Chen Ridong [this message]
2025-08-15 14:42   ` Michal Koutný
2025-08-18  7:41     ` Chen Ridong
     [not found] ` <0319ee9b-ce2c-4c02-a731-c538afcf008f@huawei.com>
2025-08-18  8:00   ` Chen Ridong
2025-08-18  8:22     ` Chen Ridong
2025-08-20  1:46     ` Tejun Heo
2025-08-22  1:57       ` Chen Ridong
