From: Waiman Long <longman@redhat.com>
To: "Tejun Heo" <tj@kernel.org>, "Zefan Li" <lizefan.x@bytedance.com>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Shuah Khan" <shuah@kernel.org>
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
linux-kselftest@vger.kernel.org,
Chen Ridong <chenridong@huawei.com>,
Waiman Long <longman@redhat.com>
Subject: [PATCH-cgroup 1/5] cgroup/cpuset: fix panic caused by partcmd_update
Date: Sun, 4 Aug 2024 21:30:15 -0400 [thread overview]
Message-ID: <20240805013019.724300-2-longman@redhat.com> (raw)
In-Reply-To: <20240805013019.724300-1-longman@redhat.com>
From: Chen Ridong <chenridong@huawei.com>
We find a bug as below:
BUG: unable to handle page fault for address: 00000003
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 358 Comm: bash Tainted: G W I 6.6.0-10893-g60d6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/4
RIP: 0010:partition_sched_domains_locked+0x483/0x600
Code: 01 48 85 d2 74 0d 48 83 05 29 3f f8 03 01 f3 48 0f bc c2 89 c0 48 9
RSP: 0018:ffffc90000fdbc58 EFLAGS: 00000202
RAX: 0000000100000003 RBX: ffff888100b3dfa0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000002fe80
RBP: ffff888100b3dfb0 R08: 0000000000000001 R09: 0000000000000000
R10: ffffc90000fdbcb0 R11: 0000000000000004 R12: 0000000000000002
R13: ffff888100a92b48 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f44a5425740(0000) GS:ffff888237d80000(0000) knlGS:0000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100030973 CR3: 000000010722c000 CR4: 00000000000006e0
Call Trace:
<TASK>
? show_regs+0x8c/0xa0
? __die_body+0x23/0xa0
? __die+0x3a/0x50
? page_fault_oops+0x1d2/0x5c0
? partition_sched_domains_locked+0x483/0x600
? search_module_extables+0x2a/0xb0
? search_exception_tables+0x67/0x90
? kernelmode_fixup_or_oops+0x144/0x1b0
? __bad_area_nosemaphore+0x211/0x360
? up_read+0x3b/0x50
? bad_area_nosemaphore+0x1a/0x30
? exc_page_fault+0x890/0xd90
? __lock_acquire.constprop.0+0x24f/0x8d0
? __lock_acquire.constprop.0+0x24f/0x8d0
? asm_exc_page_fault+0x26/0x30
? partition_sched_domains_locked+0x483/0x600
? partition_sched_domains_locked+0xf0/0x600
rebuild_sched_domains_locked+0x806/0xdc0
update_partition_sd_lb+0x118/0x130
cpuset_write_resmask+0xffc/0x1420
cgroup_file_write+0xb2/0x290
kernfs_fop_write_iter+0x194/0x290
new_sync_write+0xeb/0x160
vfs_write+0x16f/0x1d0
ksys_write+0x81/0x180
__x64_sys_write+0x21/0x30
x64_sys_call+0x2f25/0x4630
do_syscall_64+0x44/0xb0
entry_SYSCALL_64_after_hwframe+0x78/0xe2
RIP: 0033:0x7f44a553c887
It can be reproduced with cammands:
cd /sys/fs/cgroup/
mkdir test
cd test/
echo +cpuset > ../cgroup.subtree_control
echo root > cpuset.cpus.partition
cat /sys/fs/cgroup/cpuset.cpus.effective
0-3
echo 0-3 > cpuset.cpus // taking away all cpus from root
This issue is caused by the incorrect rebuilding of scheduling domains.
In this scenario, test/cpuset.cpus.partition should be an invalid root
and should not trigger the rebuilding of scheduling domains. When calling
update_parent_effective_cpumask with partcmd_update, if newmask is not
null, it should recheck newmask whether there are cpus is available
for parect/cs that has tasks.
Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 9066f9b4af24..f1846a08e245 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1978,6 +1978,8 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
part_error = PERR_CPUSEMPTY;
goto write_error;
}
+ /* Check newmask again, whether cpus are available for parent/cs */
+ nocpu |= tasks_nocpu_error(parent, cs, newmask);
/*
* partcmd_update with newmask:
--
2.43.5
next prev parent reply other threads:[~2024-08-05 1:30 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-05 1:30 [PATCH-cgroup 0/5] cgroup/cpuset: Miscellaneous cpuset updates for 6.12 Waiman Long
2024-08-05 1:30 ` Waiman Long [this message]
2024-08-05 1:30 ` [PATCH-cgroup 2/5] cgroup/cpuset: Clear effective_xcpus on cpus_allowed clearing only if cpus.exclusive not set Waiman Long
2024-08-05 20:53 ` Tejun Heo
2024-08-05 1:30 ` [PATCH-cgroup 3/5] cgroup/cpuset: Eliminate unncessary sched domains rebuilds in hotplug Waiman Long
2024-08-05 20:55 ` Tejun Heo
2024-08-05 1:30 ` [PATCH-cgroup 4/5] cgroup/cpuset: Check for partition roots with overlapping CPUs Waiman Long
2024-08-05 1:30 ` [PATCH-cgroup 5/5] selftest/cgroup: Add new test cases to test_cpuset_prs.sh Waiman Long
2024-08-05 20:58 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240805013019.724300-2-longman@redhat.com \
--to=longman@redhat.com \
--cc=cgroups@vger.kernel.org \
--cc=chenridong@huawei.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=lizefan.x@bytedance.com \
--cc=mkoutny@suse.com \
--cc=shuah@kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox