* [syzbot] [fs?] BUG: unable to handle kernel NULL pointer dereference in path_from_stashed
@ 2024-07-20 16:36 syzbot
2024-07-21 6:23 ` [PATCH] fs/pidfs: when time ns disabled add check for ioctl Edward Adam Davis
0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2024-07-20 16:36 UTC (permalink / raw)
To: brauner, jack, linux-fsdevel, linux-kernel, syzkaller-bugs, viro
Hello,
syzbot found the following issue on:
HEAD commit: 51835949dda3 Merge tag 'net-next-6.11' of git://git.kernel..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1325e60d980000
kernel config: https://syzkaller.appspot.com/x/.config?x=c1c1b0a8065e216
dashboard link: https://syzkaller.appspot.com/bug?extid=34a0ee986f61f15da35d
compiler: arm-linux-gnueabi-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11bd9a5e980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15726e95980000
Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/8ead8862021c/non_bootable_disk-51835949.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a129ae4ab997/vmlinux-51835949.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9339fe082652/zImage-51835949.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+34a0ee986f61f15da35d@syzkaller.appspotmail.com
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 00000008 when read
[00000008] *pgd=844c0003, *pmd=fe12e003
Internal error: Oops: 205 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 3011 Comm: syz-executor103 Not tainted 6.10.0-syzkaller #0
Hardware name: ARM-Versatile Express
PC is at path_from_stashed+0x1c/0x308 fs/libfs.c:2204
LR is at open_namespace+0x44/0xbc fs/nsfs.c:102
pc : [<8053af54>] lr : [<8054d6f8>] psr: 80000013
sp : df959e80 ip : 84183000 fp : df959ec4
r10: 84183000 r9 : 00000003 r8 : 843f3300
r7 : 82caa250 r6 : 84183000 r5 : 00000000 r4 : 82625878
r3 : df959ecc r2 : 00000008 r1 : 82c95800 r0 : 00000008
Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 30c5387d Table: 843f1a00 DAC: fffffffd
Register r0 information: non-paged memory
Register r1 information: slab kmalloc-1k start 82c95800 pointer offset 0 size 1024
Register r2 information: non-paged memory
Register r3 information: 2-page vmalloc region starting at 0xdf958000 allocated at kernel_clone+0xac/0x3e4 kernel/fork.c:2780
Register r4 information: non-slab/vmalloc memory
Register r5 information: NULL pointer
Register r6 information: slab task_struct start 84183000 pointer offset 0 size 3072
Register r7 information: slab mnt_cache start 82caa240 pointer offset 16 size 184
Register r8 information: slab filp start 843f3300 pointer offset 0 size 160
Register r9 information: non-paged memory
Register r10 information: slab task_struct start 84183000 pointer offset 0 size 3072
Register r11 information: 2-page vmalloc region starting at 0xdf958000 allocated at kernel_clone+0xac/0x3e4 kernel/fork.c:2780
Register r12 information: slab task_struct start 84183000 pointer offset 0 size 3072
Process syz-executor103 (pid: 3011, stack limit = 0xdf958000)
Stack: (0xdf959e80 to 0xdf95a000)
9e80: 841b2400 00000008 df959eb4 df959e98 806e6544 804065ac 00000009 82625878
9ea0: 00000000 84183000 841b2c80 843f3300 00000003 84183000 df959ef4 df959ec8
9ec0: 8054d6f8 8053af44 df959ef4 00000000 00000000 f9244696 82625878 82625878
9ee0: 841b2400 00000008 df959f14 df959ef8 8055271c 8054d6c0 0000ff07 00000000
9f00: 843f3300 00000000 df959fa4 df959f18 8051a7f0 805525a4 000001b2 8020029c
9f20: 84183000 000001b2 df959fac df959f38 8020ba70 8042c724 83f01500 df959f80
9f40: 00000000 843f3300 00000003 82cb0800 df959f7c df959f60 805283b8 8027aebc
9f60: 83f01500 00000003 83f01500 00000003 df959fa4 f9244696 8026b8b0 00000000
9f80: 00000000 0008e058 00000036 8020029c 84183000 00000036 00000000 df959fa8
9fa0: 80200060 8051a6c8 00000000 00000000 00000003 0000ff07 00000000 00000000
9fc0: 00000000 00000000 0008e058 00000036 7ee54e0c 00000000 00000001 00000000
9fe0: 7ee54c70 7ee54c60 0001064c 0002e7a0 00000010 00000003 00000000 00000000
Call trace:
[<8053af38>] (path_from_stashed) from [<8054d6f8>] (open_namespace+0x44/0xbc fs/nsfs.c:102)
r10:84183000 r9:00000003 r8:843f3300 r7:841b2c80 r6:84183000 r5:00000000
r4:82625878
[<8054d6b4>] (open_namespace) from [<8055271c>] (pidfd_ioctl+0x184/0x4c4 fs/pidfs.c:196)
r6:00000008 r5:841b2400 r4:82625878
[<80552598>] (pidfd_ioctl) from [<8051a7f0>] (vfs_ioctl fs/ioctl.c:51 [inline])
[<80552598>] (pidfd_ioctl) from [<8051a7f0>] (do_vfs_ioctl fs/ioctl.c:861 [inline])
[<80552598>] (pidfd_ioctl) from [<8051a7f0>] (__do_sys_ioctl fs/ioctl.c:905 [inline])
[<80552598>] (pidfd_ioctl) from [<8051a7f0>] (sys_ioctl+0x134/0xda4 fs/ioctl.c:893)
r7:00000000 r6:843f3300 r5:00000000 r4:0000ff07
[<8051a6bc>] (sys_ioctl) from [<80200060>] (ret_fast_syscall+0x0/0x1c arch/arm/mm/proc-v7.S:67)
Exception stack(0xdf959fa8 to 0xdf959ff0)
9fa0: 00000000 00000000 00000003 0000ff07 00000000 00000000
9fc0: 00000000 00000000 0008e058 00000036 7ee54e0c 00000000 00000001 00000000
9fe0: 7ee54c70 7ee54c60 0001064c 0002e7a0
r10:00000036 r9:84183000 r8:8020029c r7:00000036 r6:0008e058 r5:00000000
r4:00000000
Code: e24dd01c e1a07001 e5911004 ee1dcf70 (e5905000)
---[ end trace 0000000000000000 ]---
----------------
Code disassembly (best guess):
0: e24dd01c sub sp, sp, #28
4: e1a07001 mov r7, r1
8: e5911004 ldr r1, [r1, #4]
c: ee1dcf70 mrc 15, 0, ip, cr13, cr0, {3}
* 10: e5905000 ldr r5, [r0] <-- trapping instruction
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply [flat|nested] 6+ messages in thread* [PATCH] fs/pidfs: when time ns disabled add check for ioctl 2024-07-20 16:36 [syzbot] [fs?] BUG: unable to handle kernel NULL pointer dereference in path_from_stashed syzbot @ 2024-07-21 6:23 ` Edward Adam Davis 2024-07-21 10:59 ` Matthew Wilcox ` (2 more replies) 0 siblings, 3 replies; 6+ messages in thread From: Edward Adam Davis @ 2024-07-21 6:23 UTC (permalink / raw) To: syzbot+34a0ee986f61f15da35d Cc: brauner, jack, linux-fsdevel, linux-kernel, syzkaller-bugs, viro syzbot call pidfd_ioctl() with cmd "PIDFD_GET_TIME_NAMESPACE" and disabled CONFIG_TIME_NS, since time_ns is NULL, it will make NULL ponter deref in open_namespace. Reported-and-tested-by: syzbot+34a0ee986f61f15da35d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=34a0ee986f61f15da35d Signed-off-by: Edward Adam Davis <eadavis@qq.com> --- fs/pidfs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/pidfs.c b/fs/pidfs.c index c9cb14181def..fe0ddab48f57 100644 --- a/fs/pidfs.c +++ b/fs/pidfs.c @@ -168,6 +168,8 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) case PIDFD_GET_TIME_NAMESPACE: get_time_ns(nsp->time_ns); ns_common = to_ns_common(nsp->time_ns); + if (!nsp->time_ns) + return -EINVAL; break; case PIDFD_GET_TIME_FOR_CHILDREN_NAMESPACE: get_time_ns(nsp->time_ns_for_children); -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] fs/pidfs: when time ns disabled add check for ioctl 2024-07-21 6:23 ` [PATCH] fs/pidfs: when time ns disabled add check for ioctl Edward Adam Davis @ 2024-07-21 10:59 ` Matthew Wilcox 2024-07-22 8:00 ` Christian Brauner 2024-07-22 13:13 ` [PATCH 1/2] pidfs: handle kernels without namespaces cleanly Christian Brauner 2 siblings, 0 replies; 6+ messages in thread From: Matthew Wilcox @ 2024-07-21 10:59 UTC (permalink / raw) To: Edward Adam Davis Cc: syzbot+34a0ee986f61f15da35d, brauner, jack, linux-fsdevel, linux-kernel, syzkaller-bugs, viro On Sun, Jul 21, 2024 at 02:23:12PM +0800, Edward Adam Davis wrote: > syzbot call pidfd_ioctl() with cmd "PIDFD_GET_TIME_NAMESPACE" and disabled > CONFIG_TIME_NS, since time_ns is NULL, it will make NULL ponter deref in > open_namespace. what about PIDFD_GET_TIME_FOR_CHILDREN_NAMESPACE? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] fs/pidfs: when time ns disabled add check for ioctl 2024-07-21 6:23 ` [PATCH] fs/pidfs: when time ns disabled add check for ioctl Edward Adam Davis 2024-07-21 10:59 ` Matthew Wilcox @ 2024-07-22 8:00 ` Christian Brauner 2024-07-22 13:13 ` [PATCH 1/2] pidfs: handle kernels without namespaces cleanly Christian Brauner 2 siblings, 0 replies; 6+ messages in thread From: Christian Brauner @ 2024-07-22 8:00 UTC (permalink / raw) To: Edward Adam Davis Cc: syzbot+34a0ee986f61f15da35d, jack, linux-fsdevel, linux-kernel, syzkaller-bugs, viro On Sun, Jul 21, 2024 at 02:23:12PM GMT, Edward Adam Davis wrote: > syzbot call pidfd_ioctl() with cmd "PIDFD_GET_TIME_NAMESPACE" and disabled > CONFIG_TIME_NS, since time_ns is NULL, it will make NULL ponter deref in > open_namespace. > > Reported-and-tested-by: syzbot+34a0ee986f61f15da35d@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=34a0ee986f61f15da35d > Signed-off-by: Edward Adam Davis <eadavis@qq.com> > --- Hm, nsproxy is really messy in that regard. Some namespaces will always be set to init_<type>_ns and others will be set to NULL. That's an invitation for bugs such as this. Imho the correct fix is to change nsproxy to always set nsp-><type>_ns to init_<type>_ns and no code ever needs to worry about dereferencing NULL. But that'll require more changes so this seems an appropriate fix for now. ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] pidfs: handle kernels without namespaces cleanly 2024-07-21 6:23 ` [PATCH] fs/pidfs: when time ns disabled add check for ioctl Edward Adam Davis 2024-07-21 10:59 ` Matthew Wilcox 2024-07-22 8:00 ` Christian Brauner @ 2024-07-22 13:13 ` Christian Brauner 2024-07-22 13:13 ` [PATCH 2/2] pidfs: add selftests for new namespace ioctls Christian Brauner 2 siblings, 1 reply; 6+ messages in thread From: Christian Brauner @ 2024-07-22 13:13 UTC (permalink / raw) To: linux-fsdevel Cc: Christian Brauner, Edward Adam Davis, syzbot+34a0ee986f61f15da35d, jack, linux-kernel, syzkaller-bugs, viro The nsproxy structure contains nearly all of the namespaces associated with a task. When a given namespace type is not supported by this kernel the rules whether the corresponding pointer in struct nsproxy is NULL or always init_<ns_type>_ns differ per namespace. Ideally, that wouldn't be the case and for all namespace types we'd always set it to init_<ns_type>_ns when the corresponding namespace type isn't supported. Make sure we handle all namespaces where the pointer in struct nsproxy can be NULL when the namespace type isn't supported. Fixes: 5b08bd408534 ("pidfs: allow retrieval of namespace file descriptors") # mainline only Signed-off-by: Christian Brauner <brauner@kernel.org> --- fs/pidfs.c | 65 +++++++++++++++++++++++++++++++++++------------------- 1 file changed, 42 insertions(+), 23 deletions(-) diff --git a/fs/pidfs.c b/fs/pidfs.c index fe0ddab48f57..7ffdc88dfb52 100644 --- a/fs/pidfs.c +++ b/fs/pidfs.c @@ -119,7 +119,7 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) struct task_struct *task __free(put_task) = NULL; struct nsproxy *nsp __free(put_nsproxy) = NULL; struct pid *pid = pidfd_pid(file); - struct ns_common *ns_common; + struct ns_common *ns_common = NULL; if (arg) return -EINVAL; @@ -146,54 +146,73 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) switch (cmd) { /* Namespaces that hang of nsproxy. */ case PIDFD_GET_CGROUP_NAMESPACE: - get_cgroup_ns(nsp->cgroup_ns); - ns_common = to_ns_common(nsp->cgroup_ns); + if (IS_ENABLED(CONFIG_CGROUPS)) { + get_cgroup_ns(nsp->cgroup_ns); + ns_common = to_ns_common(nsp->cgroup_ns); + } break; case PIDFD_GET_IPC_NAMESPACE: - get_ipc_ns(nsp->ipc_ns); - ns_common = to_ns_common(nsp->ipc_ns); + if (IS_ENABLED(CONFIG_IPC_NS)) { + get_ipc_ns(nsp->ipc_ns); + ns_common = to_ns_common(nsp->ipc_ns); + } break; case PIDFD_GET_MNT_NAMESPACE: get_mnt_ns(nsp->mnt_ns); ns_common = to_ns_common(nsp->mnt_ns); break; case PIDFD_GET_NET_NAMESPACE: - ns_common = to_ns_common(nsp->net_ns); - get_net_ns(ns_common); + if (IS_ENABLED(CONFIG_NET_NS)) { + ns_common = to_ns_common(nsp->net_ns); + get_net_ns(ns_common); + } break; case PIDFD_GET_PID_FOR_CHILDREN_NAMESPACE: - get_pid_ns(nsp->pid_ns_for_children); - ns_common = to_ns_common(nsp->pid_ns_for_children); + if (IS_ENABLED(CONFIG_PID_NS)) { + get_pid_ns(nsp->pid_ns_for_children); + ns_common = to_ns_common(nsp->pid_ns_for_children); + } break; case PIDFD_GET_TIME_NAMESPACE: - get_time_ns(nsp->time_ns); - ns_common = to_ns_common(nsp->time_ns); - if (!nsp->time_ns) - return -EINVAL; + if (IS_ENABLED(CONFIG_TIME_NS)) { + get_time_ns(nsp->time_ns); + ns_common = to_ns_common(nsp->time_ns); + } break; case PIDFD_GET_TIME_FOR_CHILDREN_NAMESPACE: - get_time_ns(nsp->time_ns_for_children); - ns_common = to_ns_common(nsp->time_ns_for_children); + if (IS_ENABLED(CONFIG_TIME_NS)) { + get_time_ns(nsp->time_ns_for_children); + ns_common = to_ns_common(nsp->time_ns_for_children); + } break; case PIDFD_GET_UTS_NAMESPACE: - get_uts_ns(nsp->uts_ns); - ns_common = to_ns_common(nsp->uts_ns); + if (IS_ENABLED(CONFIG_UTS_NS)) { + get_uts_ns(nsp->uts_ns); + ns_common = to_ns_common(nsp->uts_ns); + } break; /* Namespaces that don't hang of nsproxy. */ case PIDFD_GET_USER_NAMESPACE: - rcu_read_lock(); - ns_common = to_ns_common(get_user_ns(task_cred_xxx(task, user_ns))); - rcu_read_unlock(); + if (IS_ENABLED(CONFIG_USER_NS)) { + rcu_read_lock(); + ns_common = to_ns_common(get_user_ns(task_cred_xxx(task, user_ns))); + rcu_read_unlock(); + } break; case PIDFD_GET_PID_NAMESPACE: - rcu_read_lock(); - ns_common = to_ns_common(get_pid_ns(task_active_pid_ns(task))); - rcu_read_unlock(); + if (IS_ENABLED(CONFIG_PID_NS)) { + rcu_read_lock(); + ns_common = to_ns_common( get_pid_ns(task_active_pid_ns(task))); + rcu_read_unlock(); + } break; default: return -ENOIOCTLCMD; } + if (!ns_common) + return -EOPNOTSUPP; + /* open_namespace() unconditionally consumes the reference */ return open_namespace(ns_common); } -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] pidfs: add selftests for new namespace ioctls 2024-07-22 13:13 ` [PATCH 1/2] pidfs: handle kernels without namespaces cleanly Christian Brauner @ 2024-07-22 13:13 ` Christian Brauner 0 siblings, 0 replies; 6+ messages in thread From: Christian Brauner @ 2024-07-22 13:13 UTC (permalink / raw) To: linux-fsdevel Cc: Christian Brauner, Edward Adam Davis, syzbot+34a0ee986f61f15da35d, jack, linux-kernel, syzkaller-bugs, viro Add selftests to verify that deriving namespace file descriptors from pidfd file descriptors works correctly. Signed-off-by: Christian Brauner <brauner@kernel.org> --- .../selftests/pidfd/pidfd_setns_test.c | 258 +++++++++++++++--- 1 file changed, 227 insertions(+), 31 deletions(-) diff --git a/tools/testing/selftests/pidfd/pidfd_setns_test.c b/tools/testing/selftests/pidfd/pidfd_setns_test.c index 47746b0c6acd..7c2a4349170a 100644 --- a/tools/testing/selftests/pidfd/pidfd_setns_test.c +++ b/tools/testing/selftests/pidfd/pidfd_setns_test.c @@ -16,11 +16,56 @@ #include <unistd.h> #include <sys/socket.h> #include <sys/stat.h> +#include <linux/ioctl.h> #include "pidfd.h" #include "../clone3/clone3_selftests.h" #include "../kselftest_harness.h" +#ifndef PIDFS_IOCTL_MAGIC +#define PIDFS_IOCTL_MAGIC 0xFF +#endif + +#ifndef PIDFD_GET_CGROUP_NAMESPACE +#define PIDFD_GET_CGROUP_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 1) +#endif + +#ifndef PIDFD_GET_IPC_NAMESPACE +#define PIDFD_GET_IPC_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 2) +#endif + +#ifndef PIDFD_GET_MNT_NAMESPACE +#define PIDFD_GET_MNT_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 3) +#endif + +#ifndef PIDFD_GET_NET_NAMESPACE +#define PIDFD_GET_NET_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 4) +#endif + +#ifndef PIDFD_GET_PID_NAMESPACE +#define PIDFD_GET_PID_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 5) +#endif + +#ifndef PIDFD_GET_PID_FOR_CHILDREN_NAMESPACE +#define PIDFD_GET_PID_FOR_CHILDREN_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 6) +#endif + +#ifndef PIDFD_GET_TIME_NAMESPACE +#define PIDFD_GET_TIME_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 7) +#endif + +#ifndef PIDFD_GET_TIME_FOR_CHILDREN_NAMESPACE +#define PIDFD_GET_TIME_FOR_CHILDREN_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 8) +#endif + +#ifndef PIDFD_GET_USER_NAMESPACE +#define PIDFD_GET_USER_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 9) +#endif + +#ifndef PIDFD_GET_UTS_NAMESPACE +#define PIDFD_GET_UTS_NAMESPACE _IO(PIDFS_IOCTL_MAGIC, 10) +#endif + enum { PIDFD_NS_USER, PIDFD_NS_MNT, @@ -31,22 +76,25 @@ enum { PIDFD_NS_CGROUP, PIDFD_NS_PIDCLD, PIDFD_NS_TIME, + PIDFD_NS_TIMECLD, PIDFD_NS_MAX }; const struct ns_info { const char *name; int flag; + unsigned int pidfd_ioctl; } ns_info[] = { - [PIDFD_NS_USER] = { "user", CLONE_NEWUSER, }, - [PIDFD_NS_MNT] = { "mnt", CLONE_NEWNS, }, - [PIDFD_NS_PID] = { "pid", CLONE_NEWPID, }, - [PIDFD_NS_UTS] = { "uts", CLONE_NEWUTS, }, - [PIDFD_NS_IPC] = { "ipc", CLONE_NEWIPC, }, - [PIDFD_NS_NET] = { "net", CLONE_NEWNET, }, - [PIDFD_NS_CGROUP] = { "cgroup", CLONE_NEWCGROUP, }, - [PIDFD_NS_PIDCLD] = { "pid_for_children", 0, }, - [PIDFD_NS_TIME] = { "time", CLONE_NEWTIME, }, + [PIDFD_NS_USER] = { "user", CLONE_NEWUSER, PIDFD_GET_USER_NAMESPACE, }, + [PIDFD_NS_MNT] = { "mnt", CLONE_NEWNS, PIDFD_GET_MNT_NAMESPACE, }, + [PIDFD_NS_PID] = { "pid", CLONE_NEWPID, PIDFD_GET_PID_NAMESPACE, }, + [PIDFD_NS_UTS] = { "uts", CLONE_NEWUTS, PIDFD_GET_UTS_NAMESPACE, }, + [PIDFD_NS_IPC] = { "ipc", CLONE_NEWIPC, PIDFD_GET_IPC_NAMESPACE, }, + [PIDFD_NS_NET] = { "net", CLONE_NEWNET, PIDFD_GET_NET_NAMESPACE, }, + [PIDFD_NS_CGROUP] = { "cgroup", CLONE_NEWCGROUP, PIDFD_GET_CGROUP_NAMESPACE, }, + [PIDFD_NS_TIME] = { "time", CLONE_NEWTIME, PIDFD_GET_TIME_NAMESPACE, }, + [PIDFD_NS_PIDCLD] = { "pid_for_children", 0, PIDFD_GET_PID_FOR_CHILDREN_NAMESPACE, }, + [PIDFD_NS_TIMECLD] = { "time_for_children", 0, PIDFD_GET_TIME_FOR_CHILDREN_NAMESPACE, }, }; FIXTURE(current_nsset) @@ -54,6 +102,7 @@ FIXTURE(current_nsset) pid_t pid; int pidfd; int nsfds[PIDFD_NS_MAX]; + int child_pidfd_derived_nsfds[PIDFD_NS_MAX]; pid_t child_pid_exited; int child_pidfd_exited; @@ -61,10 +110,12 @@ FIXTURE(current_nsset) pid_t child_pid1; int child_pidfd1; int child_nsfds1[PIDFD_NS_MAX]; + int child_pidfd_derived_nsfds1[PIDFD_NS_MAX]; pid_t child_pid2; int child_pidfd2; int child_nsfds2[PIDFD_NS_MAX]; + int child_pidfd_derived_nsfds2[PIDFD_NS_MAX]; }; static int sys_waitid(int which, pid_t pid, int options) @@ -128,9 +179,12 @@ FIXTURE_SETUP(current_nsset) char c; for (i = 0; i < PIDFD_NS_MAX; i++) { - self->nsfds[i] = -EBADF; - self->child_nsfds1[i] = -EBADF; - self->child_nsfds2[i] = -EBADF; + self->nsfds[i] = -EBADF; + self->child_nsfds1[i] = -EBADF; + self->child_nsfds2[i] = -EBADF; + self->child_pidfd_derived_nsfds[i] = -EBADF; + self->child_pidfd_derived_nsfds1[i] = -EBADF; + self->child_pidfd_derived_nsfds2[i] = -EBADF; } proc_fd = open("/proc/self/ns", O_DIRECTORY | O_CLOEXEC); @@ -139,6 +193,11 @@ FIXTURE_SETUP(current_nsset) } self->pid = getpid(); + self->pidfd = sys_pidfd_open(self->pid, 0); + EXPECT_GT(self->pidfd, 0) { + TH_LOG("%m - Failed to open pidfd for process %d", self->pid); + } + for (i = 0; i < PIDFD_NS_MAX; i++) { const struct ns_info *info = &ns_info[i]; self->nsfds[i] = openat(proc_fd, info->name, O_RDONLY | O_CLOEXEC); @@ -148,20 +207,27 @@ FIXTURE_SETUP(current_nsset) info->name, self->pid); } } - } - self->pidfd = sys_pidfd_open(self->pid, 0); - EXPECT_GT(self->pidfd, 0) { - TH_LOG("%m - Failed to open pidfd for process %d", self->pid); + self->child_pidfd_derived_nsfds[i] = ioctl(self->pidfd, info->pidfd_ioctl, 0); + if (self->child_pidfd_derived_nsfds[i] < 0) { + EXPECT_EQ(errno, EOPNOTSUPP) { + TH_LOG("%m - Failed to derive %s namespace from pidfd of process %d", + info->name, self->pid); + } + } } /* Create task that exits right away. */ - self->child_pid_exited = create_child(&self->child_pidfd_exited, - CLONE_NEWUSER | CLONE_NEWNET); + self->child_pid_exited = create_child(&self->child_pidfd_exited, 0); EXPECT_GE(self->child_pid_exited, 0); - if (self->child_pid_exited == 0) + if (self->child_pid_exited == 0) { + if (self->nsfds[PIDFD_NS_USER] >= 0 && unshare(CLONE_NEWUSER) < 0) + _exit(EXIT_FAILURE); + if (self->nsfds[PIDFD_NS_NET] >= 0 && unshare(CLONE_NEWNET) < 0) + _exit(EXIT_FAILURE); _exit(EXIT_SUCCESS); + } ASSERT_EQ(sys_waitid(P_PID, self->child_pid_exited, WEXITED | WNOWAIT), 0); @@ -174,18 +240,43 @@ FIXTURE_SETUP(current_nsset) EXPECT_EQ(ret, 0); /* Create tasks that will be stopped. */ - self->child_pid1 = create_child(&self->child_pidfd1, - CLONE_NEWUSER | CLONE_NEWNS | - CLONE_NEWCGROUP | CLONE_NEWIPC | - CLONE_NEWUTS | CLONE_NEWPID | - CLONE_NEWNET); + if (self->nsfds[PIDFD_NS_USER] >= 0 && self->nsfds[PIDFD_NS_PID] >= 0) + self->child_pid1 = create_child(&self->child_pidfd1, CLONE_NEWUSER | CLONE_NEWPID); + else if (self->nsfds[PIDFD_NS_PID] >= 0) + self->child_pid1 = create_child(&self->child_pidfd1, CLONE_NEWPID); + else if (self->nsfds[PIDFD_NS_USER] >= 0) + self->child_pid1 = create_child(&self->child_pidfd1, CLONE_NEWUSER); + else + self->child_pid1 = create_child(&self->child_pidfd1, 0); EXPECT_GE(self->child_pid1, 0); if (self->child_pid1 == 0) { close(ipc_sockets[0]); - if (!switch_timens()) + if (self->nsfds[PIDFD_NS_MNT] >= 0 && unshare(CLONE_NEWNS) < 0) { + TH_LOG("%m - Failed to unshare mount namespace for process %d", self->pid); _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_CGROUP] >= 0 && unshare(CLONE_NEWCGROUP) < 0) { + TH_LOG("%m - Failed to unshare cgroup namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_IPC] >= 0 && unshare(CLONE_NEWIPC) < 0) { + TH_LOG("%m - Failed to unshare ipc namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_UTS] >= 0 && unshare(CLONE_NEWUTS) < 0) { + TH_LOG("%m - Failed to unshare uts namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_NET] >= 0 && unshare(CLONE_NEWNET) < 0) { + TH_LOG("%m - Failed to unshare net namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_TIME] >= 0 && !switch_timens()) { + TH_LOG("%m - Failed to unshare time namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } if (write_nointr(ipc_sockets[1], "1", 1) < 0) _exit(EXIT_FAILURE); @@ -203,18 +294,43 @@ FIXTURE_SETUP(current_nsset) ret = socketpair(AF_LOCAL, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets); EXPECT_EQ(ret, 0); - self->child_pid2 = create_child(&self->child_pidfd2, - CLONE_NEWUSER | CLONE_NEWNS | - CLONE_NEWCGROUP | CLONE_NEWIPC | - CLONE_NEWUTS | CLONE_NEWPID | - CLONE_NEWNET); + if (self->nsfds[PIDFD_NS_USER] >= 0 && self->nsfds[PIDFD_NS_PID] >= 0) + self->child_pid2 = create_child(&self->child_pidfd2, CLONE_NEWUSER | CLONE_NEWPID); + else if (self->nsfds[PIDFD_NS_PID] >= 0) + self->child_pid2 = create_child(&self->child_pidfd2, CLONE_NEWPID); + else if (self->nsfds[PIDFD_NS_USER] >= 0) + self->child_pid2 = create_child(&self->child_pidfd2, CLONE_NEWUSER); + else + self->child_pid2 = create_child(&self->child_pidfd2, 0); EXPECT_GE(self->child_pid2, 0); if (self->child_pid2 == 0) { close(ipc_sockets[0]); - if (!switch_timens()) + if (self->nsfds[PIDFD_NS_MNT] >= 0 && unshare(CLONE_NEWNS) < 0) { + TH_LOG("%m - Failed to unshare mount namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_CGROUP] >= 0 && unshare(CLONE_NEWCGROUP) < 0) { + TH_LOG("%m - Failed to unshare cgroup namespace for process %d", self->pid); _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_IPC] >= 0 && unshare(CLONE_NEWIPC) < 0) { + TH_LOG("%m - Failed to unshare ipc namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_UTS] >= 0 && unshare(CLONE_NEWUTS) < 0) { + TH_LOG("%m - Failed to unshare uts namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_NET] >= 0 && unshare(CLONE_NEWNET) < 0) { + TH_LOG("%m - Failed to unshare net namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } + if (self->nsfds[PIDFD_NS_TIME] >= 0 && !switch_timens()) { + TH_LOG("%m - Failed to unshare time namespace for process %d", self->pid); + _exit(EXIT_FAILURE); + } if (write_nointr(ipc_sockets[1], "1", 1) < 0) _exit(EXIT_FAILURE); @@ -267,6 +383,22 @@ FIXTURE_SETUP(current_nsset) info->name, self->child_pid1); } } + + self->child_pidfd_derived_nsfds1[i] = ioctl(self->child_pidfd1, info->pidfd_ioctl, 0); + if (self->child_pidfd_derived_nsfds1[i] < 0) { + EXPECT_EQ(errno, EOPNOTSUPP) { + TH_LOG("%m - Failed to derive %s namespace from pidfd of process %d", + info->name, self->child_pid1); + } + } + + self->child_pidfd_derived_nsfds2[i] = ioctl(self->child_pidfd2, info->pidfd_ioctl, 0); + if (self->child_pidfd_derived_nsfds2[i] < 0) { + EXPECT_EQ(errno, EOPNOTSUPP) { + TH_LOG("%m - Failed to derive %s namespace from pidfd of process %d", + info->name, self->child_pid2); + } + } } close(proc_fd); @@ -288,6 +420,12 @@ FIXTURE_TEARDOWN(current_nsset) close(self->child_nsfds1[i]); if (self->child_nsfds2[i] >= 0) close(self->child_nsfds2[i]); + if (self->child_pidfd_derived_nsfds[i] >= 0) + close(self->child_pidfd_derived_nsfds[i]); + if (self->child_pidfd_derived_nsfds1[i] >= 0) + close(self->child_pidfd_derived_nsfds1[i]); + if (self->child_pidfd_derived_nsfds2[i] >= 0) + close(self->child_pidfd_derived_nsfds2[i]); } if (self->child_pidfd1 >= 0) @@ -446,6 +584,42 @@ TEST_F(current_nsset, nsfd_incremental_setns) } } +TEST_F(current_nsset, pidfd_derived_nsfd_incremental_setns) +{ + int i; + pid_t pid; + + pid = getpid(); + for (i = 0; i < PIDFD_NS_MAX; i++) { + const struct ns_info *info = &ns_info[i]; + int nsfd; + + if (self->child_pidfd_derived_nsfds1[i] < 0) + continue; + + if (info->flag) { + ASSERT_EQ(setns(self->child_pidfd_derived_nsfds1[i], info->flag), 0) { + TH_LOG("%m - Failed to setns to %s namespace of %d via nsfd %d", + info->name, self->child_pid1, + self->child_pidfd_derived_nsfds1[i]); + } + } + + /* Verify that we have changed to the correct namespaces. */ + if (info->flag == CLONE_NEWPID) + nsfd = self->child_pidfd_derived_nsfds[i]; + else + nsfd = self->child_pidfd_derived_nsfds1[i]; + ASSERT_EQ(in_same_namespace(nsfd, pid, info->name), 1) { + TH_LOG("setns failed to place us correctly into %s namespace of %d via nsfd %d", + info->name, self->child_pid1, + self->child_pidfd_derived_nsfds1[i]); + } + TH_LOG("Managed to correctly setns to %s namespace of %d via nsfd %d", + info->name, self->child_pid1, self->child_pidfd_derived_nsfds1[i]); + } +} + TEST_F(current_nsset, pidfd_one_shot_setns) { unsigned flags = 0; @@ -542,6 +716,28 @@ TEST_F(current_nsset, no_foul_play) info->name, self->child_pid2, self->child_nsfds2[i]); } + + /* + * Can't setns to a user namespace outside of our hierarchy since we + * don't have caps in there and didn't create it. That means that under + * no circumstances should we be able to setns to any of the other + * ones since they aren't owned by our user namespace. + */ + for (i = 0; i < PIDFD_NS_MAX; i++) { + const struct ns_info *info = &ns_info[i]; + + if (self->child_pidfd_derived_nsfds2[i] < 0 || !info->flag) + continue; + + ASSERT_NE(setns(self->child_pidfd_derived_nsfds2[i], info->flag), 0) { + TH_LOG("Managed to setns to %s namespace of %d via nsfd %d", + info->name, self->child_pid2, + self->child_pidfd_derived_nsfds2[i]); + } + TH_LOG("%m - Correctly failed to setns to %s namespace of %d via nsfd %d", + info->name, self->child_pid2, + self->child_pidfd_derived_nsfds2[i]); + } } TEST(setns_einval) -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-07-22 13:14 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-07-20 16:36 [syzbot] [fs?] BUG: unable to handle kernel NULL pointer dereference in path_from_stashed syzbot 2024-07-21 6:23 ` [PATCH] fs/pidfs: when time ns disabled add check for ioctl Edward Adam Davis 2024-07-21 10:59 ` Matthew Wilcox 2024-07-22 8:00 ` Christian Brauner 2024-07-22 13:13 ` [PATCH 1/2] pidfs: handle kernels without namespaces cleanly Christian Brauner 2024-07-22 13:13 ` [PATCH 2/2] pidfs: add selftests for new namespace ioctls Christian Brauner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).