linux-fsdevel.vger.kernel.org archive mirror
* [PATCH RFC 0/4] bpf: cgroup device guard for non-initial user namespace
@ 2023-08-14 14:26 Michael Weiß
  2023-08-14 14:26 ` [PATCH RFC 1/4] bpf: add cgroup device guard to flag a cgroup device prog Michael Weiß
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Michael Weiß @ 2023-08-14 14:26 UTC (permalink / raw)
  To: Alexander Mikhalitsyn, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Quentin Monnet, Alexander Viro
  Cc: bpf, linux-kernel, linux-fsdevel, gyroidos, Michael Weiß

Introduce the BPF_F_CGROUP_DEVICE_GUARD flag for BPF_PROG_LOAD,
which marks a cgroup device program as a device guard.
A device guard may be used to guard actions on device nodes in a
non-initial userns, e.g., mknod.

If a container manager restricts its unprivileged (user-namespaced)
children with a device cgroup, it is no longer necessary to deny
mknod. User space applications can then map devices at different
locations in the file system by calling mknod() inside the container.

One use case, which we also rely on in GyroidOS, is running virsh for
VMs inside an unprivileged container. virsh creates device nodes,
e.g., "/var/run/libvirt/qemu/11-fgfg.dev/null", which currently fails
in a non-initial userns even if a cgroup device whitelist entry with
the major and minor numbers of /dev/null exists. In this case, the
usual bind mounts or pre-populated device nodes under /dev are
therefore not sufficient.

To circumvent this limitation, we allow mknod() in the VFS if a
bpf cgroup device guard is enabled for the current task, and check
CAP_MKNOD in the current user namespace instead of the init userns.
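
The decision described above can be modelled in plain C as follows.
This is a user space sketch of the logic only, not the code from
fs/namei.c. struct task_model and mknod_allowed_model() are
hypothetical names introduced for this example, and the actual
per-device decision is still left to the cgroup device program.

```c
#include <stdbool.h>

/* Hypothetical model of the task state relevant to the check. */
struct task_model {
	bool cap_mknod_init_userns;    /* CAP_MKNOD in the initial userns */
	bool cap_mknod_current_userns; /* CAP_MKNOD in the task's own userns */
	bool devcg_guard_enabled;      /* cgroup device guard active for the task */
};

/*
 * Model of the capability check: with a device guard in place, the
 * capability is checked against the task's own user namespace,
 * otherwise against the initial one.
 */
bool mknod_allowed_model(const struct task_model *t)
{
	if (t->devcg_guard_enabled)
		return t->cap_mknod_current_userns;
	return t->cap_mknod_init_userns;
}
```

In this model, a container root that holds CAP_MKNOD only in its own
userns is denied without a guard and allowed once a guard is enabled.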

To avoid unusable device nodes on file systems mounted in a
non-initial user namespace, may_open_dev() ignores SB_I_NODEV
for cgroup device guarded tasks.
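
As a rough model of that open-time check, again in user space C rather
than kernel code: may_open_dev_model() and its parameters are
hypothetical names for this example. It is an assumption of this sketch
that the per-mount nodev option keeps applying to guarded tasks; only
the superblock flag SB_I_NODEV is described as ignored above.

```c
#include <stdbool.h>

/*
 * Model of the device-open check.
 *   mnt_nodev:  the mount carries the nodev option
 *   sb_i_nodev: the superblock has SB_I_NODEV set
 *   guarded:    the current task is cgroup device guarded
 */
bool may_open_dev_model(bool mnt_nodev, bool sb_i_nodev, bool guarded)
{
	if (mnt_nodev)
		return false;	/* assumption: nodev mounts stay restricted */
	if (sb_i_nodev && !guarded)
		return false;	/* unguarded tasks still honour SB_I_NODEV */
	return true;
}
```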

Tested for a GyroidOS container generated by the cmld using the
following user space patch: https://github.com/gyroidos/cml/pull/394

I discussed this internally with Christian in the UAPI group earlier.
I am posting it to the public list now, since the LXC/LXD folks have
also expressed interest in it.

This series applies to the latest mainline v6.5-rc6 tag.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
---
Michael Weiß (4):
      bpf: add cgroup device guard to flag a cgroup device prog
      bpf: provide cgroup_device_guard in bpf_prog_info to user space
      device_cgroup: wrapper for bpf cgroup device guard
      fs: allow mknod in non-initial userns using cgroup device guard

 fs/namei.c                     | 19 ++++++++++++++++---
 include/linux/bpf-cgroup.h     |  7 +++++++
 include/linux/bpf.h            |  1 +
 include/linux/device_cgroup.h  |  7 +++++++
 include/uapi/linux/bpf.h       |  8 +++++++-
 kernel/bpf/cgroup.c            | 30 ++++++++++++++++++++++++++++++
 kernel/bpf/syscall.c           |  6 +++++-
 security/device_cgroup.c       | 10 ++++++++++
 tools/bpf/bpftool/prog.c       |  2 ++
 tools/include/uapi/linux/bpf.h |  8 +++++++-
 10 files changed, 92 insertions(+), 6 deletions(-)
---
base-commit: 2ccdd1b13c591d306f0401d98dedc4bdcd02b421
change-id: 20230814-devcg_guard-5398ef84bf7b

Best regards,
-- 
Michael Weiß <michael.weiss@aisec.fraunhofer.de>



end of thread, other threads:[~2023-09-11 21:04 UTC | newest]

Thread overview: 18+ messages
2023-08-14 14:26 [PATCH RFC 0/4] bpf: cgroup device guard for non-initial user namespace Michael Weiß
2023-08-14 14:26 ` [PATCH RFC 1/4] bpf: add cgroup device guard to flag a cgroup device prog Michael Weiß
2023-08-14 15:54   ` Alexander Mikhalitsyn
2023-08-17 15:50     ` Michael Weiß
2023-08-15  8:59   ` Christian Brauner
2023-08-17 15:47     ` Michael Weiß
2023-08-17 22:11     ` Alexei Starovoitov
2023-08-29 13:35       ` Alexander Mikhalitsyn
2023-09-04 11:44         ` Christian Brauner
2023-09-11 10:38           ` Michael Weiß
2023-09-11 12:35             ` Christian Brauner
2023-09-11 19:20           ` Paul Moore
2023-08-14 14:26 ` [PATCH RFC 2/4] bpf: provide cgroup_device_guard in bpf_prog_info to user space Michael Weiß
2023-08-14 14:26 ` [PATCH RFC 3/4] device_cgroup: wrapper for bpf cgroup device guard Michael Weiß
2023-08-14 14:26 ` [PATCH RFC 4/4] fs: allow mknod in non-initial userns using " Michael Weiß
2023-08-14 15:24   ` Alexander Mikhalitsyn
2023-08-15  7:18   ` kernel test robot
2023-08-15  7:49     ` Alexander Mikhalitsyn
