* Re: data-race in cgroup_get_tree / proc_cgroup_show
  [not found] <CAEHB249jcoG=sMGLUgqw3Yf+SjZ7ZkUfF_M+WcyQGCAe77o2kA@mail.gmail.com>
@ 2022-08-19  7:22 ` Christian Brauner
  2022-08-22 17:04   ` Gabriel Ryan
  0 siblings, 1 reply; 5+ messages in thread

From: Christian Brauner @ 2022-08-19  7:22 UTC (permalink / raw)
To: Abhishek Shah
Cc: linux-kernel, andrii, ast, bpf, cgroups, daniel, hannes,
    john.fastabend, kafai, kpsingh, lizefan.x, netdev,
    songliubraving, tj, yhs, Gabriel Ryan

On Thu, Aug 18, 2022 at 07:24:00PM -0400, Abhishek Shah wrote:
> Hi all,
>
> We found the following data race involving the cgrp_dfl_visible variable.
> We think it has security implications, as the racing variable controls the
> contents shown in /proc/<pid>/cgroup, which has been used in prior work
> <https://www.cyberark.com/resources/threat-research-blog/the-strange-case-of-how-we-escaped-the-docker-default-container>
> on container escapes. Please let us know what you think. Thanks!

One straightforward fix might be to use
cmpxchg(&cgrp_dfl_visible, false, true) in cgroup_get_tree()
and READ_ONCE(cgrp_dfl_visible) in proc_cgroup_show(), or something
like that. I'm not sure this is a real issue, but it might still be
nice to fix.
>
> -----------------------------Report--------------------------------------
> write to 0xffffffff881d0344 of 1 bytes by task 6542 on cpu 0:
>  cgroup_get_tree+0x30/0x1c0 kernel/cgroup/cgroup.c:2153
>  vfs_get_tree+0x53/0x1b0 fs/super.c:1497
>  do_new_mount+0x208/0x6a0 fs/namespace.c:3040
>  path_mount+0x4a0/0xbd0 fs/namespace.c:3370
>  do_mount fs/namespace.c:3383 [inline]
>  __do_sys_mount fs/namespace.c:3591 [inline]
>  __se_sys_mount+0x215/0x2d0 fs/namespace.c:3568
>  __x64_sys_mount+0x67/0x80 fs/namespace.c:3568
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> read to 0xffffffff881d0344 of 1 bytes by task 6541 on cpu 1:
>  proc_cgroup_show+0x1ec/0x4e0 kernel/cgroup/cgroup.c:6017
>  proc_single_show+0x96/0x120 fs/proc/base.c:777
>  seq_read_iter+0x2d2/0x8e0 fs/seq_file.c:230
>  seq_read+0x1c9/0x210 fs/seq_file.c:162
>  vfs_read+0x1b5/0x6e0 fs/read_write.c:480
>  ksys_read+0xde/0x190 fs/read_write.c:620
>  __do_sys_read fs/read_write.c:630 [inline]
>  __se_sys_read fs/read_write.c:628 [inline]
>  __x64_sys_read+0x43/0x50 fs/read_write.c:628
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 6541 Comm: syz-executor2-n Not tainted 5.18.0-rc5+ #107
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>
> Reproducing Inputs
>
> Input CPU 0:
> r0 = fsopen(&(0x7f0000000000)='cgroup2\x00', 0x0)
> fsconfig$FSCONFIG_CMD_CREATE(r0, 0x6, 0x0, 0x0, 0x0)
> fsmount(r0, 0x0, 0x83)
>
> Input CPU 1:
> r0 = syz_open_procfs(0x0, &(0x7f0000000040)='cgroup\x00')
> read$eventfd(r0, &(0x7f0000000080), 0x8)
* Re: data-race in cgroup_get_tree / proc_cgroup_show
  2022-08-19  7:22 ` data-race in cgroup_get_tree / proc_cgroup_show Christian Brauner
@ 2022-08-22 17:04   ` Gabriel Ryan
  2022-08-28 18:22     ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread

From: Gabriel Ryan @ 2022-08-22 17:04 UTC (permalink / raw)
To: Christian Brauner
Cc: Abhishek Shah, linux-kernel, andrii, ast, bpf, cgroups, daniel,
    hannes, john.fastabend, kafai, kpsingh, lizefan.x, netdev,
    songliubraving, tj, yhs

Hi Christian,

We ran a quick test and confirmed that your suggestion eliminates the
data race alert we observed. If the data race is benign (and it
appears to be), using WRITE_ONCE(cgrp_dfl_visible, true) instead of
cmpxchg() in cgroup_get_tree() would probably also be fine.

Best,

Gabe

On Fri, Aug 19, 2022 at 3:23 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, Aug 18, 2022 at 07:24:00PM -0400, Abhishek Shah wrote:
> > Hi all,
> >
> > We found the following data race involving the cgrp_dfl_visible variable.
> > We think it has security implications, as the racing variable controls the
> > contents shown in /proc/<pid>/cgroup, which has been used in prior work
> > <https://www.cyberark.com/resources/threat-research-blog/the-strange-case-of-how-we-escaped-the-docker-default-container>
> > on container escapes. Please let us know what you think. Thanks!
>
> One straightforward fix might be to use
> cmpxchg(&cgrp_dfl_visible, false, true) in cgroup_get_tree()
> and READ_ONCE(cgrp_dfl_visible) in proc_cgroup_show(), or something
> like that. I'm not sure this is a real issue, but it might still be
> nice to fix.
> >
> > -----------------------------Report--------------------------------------
> > write to 0xffffffff881d0344 of 1 bytes by task 6542 on cpu 0:
> >  cgroup_get_tree+0x30/0x1c0 kernel/cgroup/cgroup.c:2153
> >  vfs_get_tree+0x53/0x1b0 fs/super.c:1497
> >  do_new_mount+0x208/0x6a0 fs/namespace.c:3040
> >  path_mount+0x4a0/0xbd0 fs/namespace.c:3370
> >  do_mount fs/namespace.c:3383 [inline]
> >  __do_sys_mount fs/namespace.c:3591 [inline]
> >  __se_sys_mount+0x215/0x2d0 fs/namespace.c:3568
> >  __x64_sys_mount+0x67/0x80 fs/namespace.c:3568
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > read to 0xffffffff881d0344 of 1 bytes by task 6541 on cpu 1:
> >  proc_cgroup_show+0x1ec/0x4e0 kernel/cgroup/cgroup.c:6017
> >  proc_single_show+0x96/0x120 fs/proc/base.c:777
> >  seq_read_iter+0x2d2/0x8e0 fs/seq_file.c:230
> >  seq_read+0x1c9/0x210 fs/seq_file.c:162
> >  vfs_read+0x1b5/0x6e0 fs/read_write.c:480
> >  ksys_read+0xde/0x190 fs/read_write.c:620
> >  __do_sys_read fs/read_write.c:630 [inline]
> >  __se_sys_read fs/read_write.c:628 [inline]
> >  __x64_sys_read+0x43/0x50 fs/read_write.c:628
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 1 PID: 6541 Comm: syz-executor2-n Not tainted 5.18.0-rc5+ #107
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> >
> > Reproducing Inputs
> >
> > Input CPU 0:
> > r0 = fsopen(&(0x7f0000000000)='cgroup2\x00', 0x0)
> > fsconfig$FSCONFIG_CMD_CREATE(r0, 0x6, 0x0, 0x0, 0x0)
> > fsmount(r0, 0x0, 0x83)
> >
> > Input CPU 1:
> > r0 = syz_open_procfs(0x0, &(0x7f0000000040)='cgroup\x00')
> > read$eventfd(r0, &(0x7f0000000080), 0x8)

-- 
Gabriel Ryan
PhD Candidate at Columbia University
* Re: data-race in cgroup_get_tree / proc_cgroup_show
  2022-08-22 17:04 ` Gabriel Ryan
@ 2022-08-28 18:22   ` Tejun Heo
  2022-08-29  7:27     ` Christian Brauner
  0 siblings, 1 reply; 5+ messages in thread

From: Tejun Heo @ 2022-08-28 18:22 UTC (permalink / raw)
To: Gabriel Ryan
Cc: Christian Brauner, Abhishek Shah, linux-kernel, andrii, ast, bpf,
    cgroups, daniel, hannes, john.fastabend, kafai, kpsingh,
    lizefan.x, netdev, songliubraving, yhs

On Mon, Aug 22, 2022 at 01:04:58PM -0400, Gabriel Ryan wrote:
> Hi Christian,
>
> We ran a quick test and confirmed that your suggestion eliminates the
> data race alert we observed. If the data race is benign (and it
> appears to be), using WRITE_ONCE(cgrp_dfl_visible, true) instead of
> cmpxchg() in cgroup_get_tree() would probably also be fine.

I don't see how the data race can lead to anything, but would the
following work? Thanks.

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index ffaccd6373f1e..a90fdba881bdb 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2172,7 +2172,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	int ret;
 
-	cgrp_dfl_visible = true;
+	WRITE_ONCE(cgrp_dfl_visible, true);
 	cgroup_get_live(&cgrp_dfl_root.cgrp);
 	ctx->root = &cgrp_dfl_root;
@@ -6056,7 +6056,7 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
 		struct cgroup *cgrp;
 		int ssid, count = 0;
 
-		if (root == &cgrp_dfl_root && !cgrp_dfl_visible)
+		if (root == &cgrp_dfl_root && !READ_ONCE(cgrp_dfl_visible))
 			continue;
 
 		seq_printf(m, "%d:", root->hierarchy_id);

-- 
tejun
* Re: data-race in cgroup_get_tree / proc_cgroup_show
  2022-08-28 18:22 ` Tejun Heo
@ 2022-08-29  7:27   ` Christian Brauner
  2022-09-04 19:23     ` [PATCH cgroup/for-6.1] cgroup: Remove data-race around cgrp_dfl_visible Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread

From: Christian Brauner @ 2022-08-29  7:27 UTC (permalink / raw)
To: Tejun Heo
Cc: Gabriel Ryan, Abhishek Shah, linux-kernel, andrii, ast, bpf,
    cgroups, daniel, hannes, john.fastabend, kafai, kpsingh,
    lizefan.x, netdev, songliubraving, yhs

On Sun, Aug 28, 2022 at 08:22:02AM -1000, Tejun Heo wrote:
> On Mon, Aug 22, 2022 at 01:04:58PM -0400, Gabriel Ryan wrote:
> > Hi Christian,
> >
> > We ran a quick test and confirmed that your suggestion eliminates the
> > data race alert we observed. If the data race is benign (and it
> > appears to be), using WRITE_ONCE(cgrp_dfl_visible, true) instead of
> > cmpxchg() in cgroup_get_tree() would probably also be fine.
>
> I don't see how the data race can lead to anything, but would the
> following work?

Yep. You can take my,

Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>

when you turn it into a patch.
* [PATCH cgroup/for-6.1] cgroup: Remove data-race around cgrp_dfl_visible
  2022-08-29  7:27 ` Christian Brauner
@ 2022-09-04 19:23   ` Tejun Heo
  0 siblings, 0 replies; 5+ messages in thread

From: Tejun Heo @ 2022-09-04 19:23 UTC (permalink / raw)
To: Christian Brauner
Cc: Gabriel Ryan, Abhishek Shah, linux-kernel, andrii, ast, bpf,
    cgroups, daniel, hannes, john.fastabend, kafai, kpsingh,
    lizefan.x, netdev, songliubraving, yhs

From dc79ec1b232ad2c165d381d3dd2626df4ef9b5a4 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Sun, 4 Sep 2022 09:16:19 -1000

There's a seemingly harmless data-race around cgrp_dfl_visible detected
by the kernel concurrency sanitizer. Let's remove it by throwing
WRITE/READ_ONCE at it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Abhishek Shah <abhishek.shah@columbia.edu>
Cc: Gabriel Ryan <gabe@cs.columbia.edu>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Link: https://lore.kernel.org/netdev/20220819072256.fn7ctciefy4fc4cu@wittgenstein/
---
Applied to cgroup/for-6.1. Thanks.

 kernel/cgroup/cgroup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0005de2e2ed9..e0b72eb5d283 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2173,7 +2173,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	int ret;
 
-	cgrp_dfl_visible = true;
+	WRITE_ONCE(cgrp_dfl_visible, true);
 	cgroup_get_live(&cgrp_dfl_root.cgrp);
 	ctx->root = &cgrp_dfl_root;
@@ -6098,7 +6098,7 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
 		struct cgroup *cgrp;
 		int ssid, count = 0;
 
-		if (root == &cgrp_dfl_root && !cgrp_dfl_visible)
+		if (root == &cgrp_dfl_root && !READ_ONCE(cgrp_dfl_visible))
 			continue;
 
 		seq_printf(m, "%d:", root->hierarchy_id);
-- 
2.37.3