From: Paul Moore <paul@paul-moore.com>
To: Andrii Nakryiko <andrii@kernel.org>, <bpf@vger.kernel.org>,
<netdev@vger.kernel.org>
Cc: <linux-fsdevel@vger.kernel.org>,
<linux-security-module@vger.kernel.org>, <keescook@chromium.org>,
<brauner@kernel.org>, <lennart@poettering.net>,
<kernel-team@meta.com>, <sargun@sargun.me>,
selinux@vger.kernel.org
Subject: Re: [PATCH v6 3/13] bpf: introduce BPF token object
Date: Tue, 10 Oct 2023 21:17:38 -0400
Message-ID: <53183ab045f8154ef94070039d53bbab.paul@paul-moore.com>
In-Reply-To: <20230927225809.2049655-4-andrii@kernel.org>
On Sep 27, 2023 Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Add new kind of BPF kernel object, BPF token. BPF token is meant to
> allow delegating privileged BPF functionality, like loading a BPF
> program or creating a BPF map, from privileged process to a *trusted*
> unprivileged process, all while having a good amount of control over which
> privileged operations could be performed using provided BPF token.
>
> This is achieved through mounting BPF FS instance with extra delegation
> mount options, which determine what operations are delegatable, and also
> constraining it to the owning user namespace (as mentioned in the
> previous patch).
>
> BPF token itself is just a derivative from BPF FS and can be created
> through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts
> a path specification (using the usual fd + string path combo) to a BPF
> FS mount. Currently, BPF token "inherits" delegated command, map types,
> prog type, and attach type bit sets from BPF FS as is. In the future,
> having a BPF token as a separate object with its own FD, we can allow
> to further restrict BPF token's allowable set of things either at the creation
> time or after the fact, allowing the process to guard itself further
> from, e.g., unintentionally trying to load undesired kind of BPF
> programs. But for now we keep things simple and just copy bit sets as is.
>
> When BPF token is created from BPF FS mount, we take reference to the
> BPF super block's owning user namespace, and then use that namespace for
> checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN}
> capabilities that are normally only checked against init userns (using
> capable()), but now we check them using ns_capable() instead (if BPF
> token is provided). See bpf_token_capable() for details.
>
> Such setup means that BPF token in itself is not sufficient to grant BPF
> functionality. User namespaced process has to *also* have necessary
> combination of capabilities inside that user namespace. So while
> previously CAP_BPF was useless when granted within user namespace, now
> it gains a meaning and allows container managers and sys admins to have
> a flexible control over which processes can and need to use BPF
> functionality within the user namespace (i.e., container in practice).
> And BPF FS delegation mount options and derived BPF tokens serve as
> a per-container "flag" to grant overall ability to use bpf() (plus further
> restrict on which parts of bpf() syscalls are treated as namespaced).
>
> The alternative to creating BPF token object was:
> a) not having any extra object and just passing BPF FS path to each
> relevant bpf() command. This seems suboptimal as it's racy (mount
> under the same path might change in between checking it and using it
> for bpf() command). And also less flexible if we'd like to further
> restrict ourselves compared to all the delegated functionality
> allowed on BPF FS.
> b) use non-bpf() interface, e.g., ioctl(), but otherwise also create
> a dedicated FD that would represent a token-like functionality. This
> doesn't seem superior to having a proper bpf() command, so
> BPF_TOKEN_CREATE was chosen.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> include/linux/bpf.h | 40 +++++++
> include/uapi/linux/bpf.h | 39 +++++++
> kernel/bpf/Makefile | 2 +-
> kernel/bpf/inode.c | 10 +-
> kernel/bpf/syscall.c | 17 +++
> kernel/bpf/token.c | 197 +++++++++++++++++++++++++++++++++
> tools/include/uapi/linux/bpf.h | 39 +++++++
> 7 files changed, 339 insertions(+), 5 deletions(-)
> create mode 100644 kernel/bpf/token.c
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index a5bd40f71fd0..c43131a24579 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1572,6 +1576,13 @@ struct bpf_mount_opts {
> u64 delegate_attachs;
> };
>
> +struct bpf_token {
> + struct work_struct work;
> + atomic64_t refcnt;
> + struct user_namespace *userns;
> + u64 allowed_cmds;

We'll also need a 'void *security' field to go along with the BPF token
allocation/creation/free hooks, see my comments below.  This is similar
to what we do for other kernel objects.

> +};
> +
...
> diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
> new file mode 100644
> index 000000000000..779aad5007a3
> --- /dev/null
> +++ b/kernel/bpf/token.c
> @@ -0,0 +1,197 @@
> +#include <linux/bpf.h>
> +#include <linux/vmalloc.h>
> +#include <linux/anon_inodes.h>

Probably don't need the anon_inodes.h include anymore.

> +#include <linux/fdtable.h>
> +#include <linux/file.h>
> +#include <linux/fs.h>
> +#include <linux/kernel.h>
> +#include <linux/idr.h>
> +#include <linux/namei.h>
> +#include <linux/user_namespace.h>
> +
> +bool bpf_token_capable(const struct bpf_token *token, int cap)
> +{
> + /* BPF token allows ns_capable() level of capabilities */
> + if (token) {

I think we want an LSM hook here before the token is used in the
capability check.  The LSM will see the capability check, but it will
not be able to distinguish it from the process which created the
delegation token.  This is arguably the purpose of the delegation, but
with the LSM we want to be able to control who can use the delegated
privilege.  How about something like this:

  if (security_bpf_token_capable(token, cap))
    return false;

> + if (ns_capable(token->userns, cap))
> + return true;
> + if (cap != CAP_SYS_ADMIN && ns_capable(token->userns, CAP_SYS_ADMIN))
> + return true;
> + }
> + /* otherwise fallback to capable() checks */
> + return capable(cap) || (cap != CAP_SYS_ADMIN && capable(CAP_SYS_ADMIN));
> +}
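
To make the ordering of the checks concrete, here is a minimal userspace sketch of the quoted function with the suggested hook in place. It is only a model under stated assumptions: the `model_*` names, the bitmask standing in for ns_capable()/capable(), and the "0 means allow" return convention of the hook are all hypothetical, not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Capability numbers as defined in the Linux UAPI; the MODEL_ prefix
 * marks everything else in this file as a hypothetical stand-in. */
enum {
	MODEL_CAP_NET_ADMIN = 12,
	MODEL_CAP_SYS_ADMIN = 21,
	MODEL_CAP_PERFMON   = 38,
	MODEL_CAP_BPF       = 39,
};

struct model_token {
	uint64_t userns_caps;	/* caps the caller holds in the token's userns */
	bool lsm_denies;	/* decision of the modeled LSM hook */
};

/* Assumed convention: 0 allows, non-zero denies. */
static int model_security_token_capable(const struct model_token *t, int cap)
{
	(void)cap;
	return t->lsm_denies ? -1 : 0;
}

static bool has_cap(uint64_t caps, int cap)
{
	return (caps & (1ULL << cap)) != 0;
}

/* init_caps models what capable() would see against the init userns. */
static bool model_token_capable(const struct model_token *t,
				uint64_t init_caps, int cap)
{
	if (t) {
		/* The hook runs first, so a denial short-circuits everything. */
		if (model_security_token_capable(t, cap))
			return false;
		if (has_cap(t->userns_caps, cap))
			return true;
		if (cap != MODEL_CAP_SYS_ADMIN &&
		    has_cap(t->userns_caps, MODEL_CAP_SYS_ADMIN))
			return true;
	}
	/* Otherwise fall back to the init userns view. */
	return has_cap(init_caps, cap) ||
	       (cap != MODEL_CAP_SYS_ADMIN &&
		has_cap(init_caps, MODEL_CAP_SYS_ADMIN));
}
```

Note that with the hook placed as suggested, a denial returns false even when the caller would have passed the plain capable() fallback; whether that is the desired behavior is a design choice this sketch does not settle.
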
> +
> +void bpf_token_inc(struct bpf_token *token)
> +{
> + atomic64_inc(&token->refcnt);
> +}
> +
> +static void bpf_token_free(struct bpf_token *token)
> +{

We should have an LSM hook here to handle freeing the LSM state
associated with the token.

  security_bpf_token_free(token);

> + put_user_ns(token->userns);
> + kvfree(token);
> +}
...
> +static struct bpf_token *bpf_token_alloc(void)
> +{
> + struct bpf_token *token;
> +
> + token = kvzalloc(sizeof(*token), GFP_USER);
> + if (!token)
> + return NULL;
> +
> + atomic64_set(&token->refcnt, 1);

We should have an LSM hook here to allocate the LSM state associated
with the token.

  if (security_bpf_token_alloc(token)) {
    kvfree(token);
    return NULL;
  }

> + return token;
> +}
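
The alloc and free hooks are meant to be paired around a 'void *security' field on the token. A rough userspace sketch of that pairing follows; the `model_*` names are hypothetical, calloc()/free() stand in for kvzalloc()/kvfree(), and the int-sized blob is a placeholder for whatever state an LSM would really attach.

```c
#include <stddef.h>
#include <stdlib.h>

struct model_token {
	long refcnt;
	void *security;		/* LSM-owned state, opaque to the BPF code */
};

/* Hypothetical alloc hook: returns 0 on success, non-zero on failure. */
static int model_security_token_alloc(struct model_token *t)
{
	t->security = calloc(1, sizeof(int));
	return t->security ? 0 : -1;
}

/* Hypothetical free hook: releases whatever the alloc hook attached. */
static void model_security_token_free(struct model_token *t)
{
	free(t->security);
	t->security = NULL;
}

static struct model_token *model_token_alloc(void)
{
	struct model_token *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->refcnt = 1;
	/* If the LSM cannot set up its state, the token is not handed out. */
	if (model_security_token_alloc(t)) {
		free(t);
		return NULL;
	}
	return t;
}

static void model_token_free(struct model_token *t)
{
	/* LSM state goes first, then the token itself. */
	model_security_token_free(t);
	free(t);
}
```
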
...
> +int bpf_token_create(union bpf_attr *attr)
> +{
> + struct bpf_mount_opts *mnt_opts;
> + struct bpf_token *token = NULL;
> + struct inode *inode;
> + struct file *file;
> + struct path path;
> + umode_t mode;
> + int err, fd;
> +
> + err = user_path_at(attr->token_create.bpffs_path_fd,
> + u64_to_user_ptr(attr->token_create.bpffs_pathname),
> + LOOKUP_FOLLOW | LOOKUP_EMPTY, &path);
> + if (err)
> + return err;
> +
> + if (path.mnt->mnt_root != path.dentry) {
> + err = -EINVAL;
> + goto out_path;
> + }
> + err = path_permission(&path, MAY_ACCESS);
> + if (err)
> + goto out_path;
> +
> + mode = S_IFREG | ((S_IRUSR | S_IWUSR) & ~current_umask());
> + inode = bpf_get_inode(path.mnt->mnt_sb, NULL, mode);
> + if (IS_ERR(inode)) {
> + err = PTR_ERR(inode);
> + goto out_path;
> + }
> +
> + inode->i_op = &bpf_token_iops;
> + inode->i_fop = &bpf_token_fops;
> + clear_nlink(inode); /* make sure it is unlinked */
> +
> + file = alloc_file_pseudo(inode, path.mnt, BPF_TOKEN_INODE_NAME, O_RDWR, &bpf_token_fops);
> + if (IS_ERR(file)) {
> + iput(inode);
> + err = PTR_ERR(file);
> + goto out_file;
> + }
> +
> + token = bpf_token_alloc();
> + if (!token) {
> + err = -ENOMEM;
> + goto out_file;
> + }
> +
> + /* remember bpffs owning userns for future ns_capable() checks */
> + token->userns = get_user_ns(path.dentry->d_sb->s_user_ns);
> +
> + mnt_opts = path.dentry->d_sb->s_fs_info;
> + token->allowed_cmds = mnt_opts->delegate_cmds;

I think we would want an LSM hook here, both to control the creation
of the token and mark it with the security attributes of the creating
process.  How about something like this:

  err = security_bpf_token_create(token);
  if (err)
    goto out_token;

> + fd = get_unused_fd_flags(O_CLOEXEC);
> + if (fd < 0) {
> + err = fd;
> + goto out_token;
> + }
> +
> + file->private_data = token;
> + fd_install(fd, file);
> +
> + path_put(&path);
> + return fd;
> +
> +out_token:
> + bpf_token_free(token);
> +out_file:
> + fput(file);
> +out_path:
> + path_put(&path);
> + return err;
> +}
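
A small userspace model of how a failing create hook would interact with the existing unwind path is sketched below. The counters and `model_*` names are hypothetical bookkeeping, the hook's -EPERM result is an assumed example error, and the returned fd value is a placeholder.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bookkeeping: counts references held by the create path. */
struct model_state {
	int path_refs;
	int file_refs;
	int token_refs;
	bool hook_denies;	/* decision of the modeled create hook */
};

static int model_security_token_create(const struct model_state *s)
{
	return s->hook_denies ? -EPERM : 0;
}

static int model_token_create(struct model_state *s)
{
	int err;

	s->path_refs++;		/* models user_path_at() */
	s->file_refs++;		/* models alloc_file_pseudo() */
	s->token_refs++;	/* models bpf_token_alloc() */

	err = model_security_token_create(s);
	if (err)
		goto out_token;

	s->path_refs--;		/* path is dropped; file and token live on with the fd */
	return 3;		/* placeholder fd */

out_token:
	s->token_refs--;	/* models bpf_token_free() */
	s->file_refs--;		/* models fput() */
	s->path_refs--;		/* models path_put() */
	return err;
}
```

Because the failure exits through out_token, the token free path still runs, so a free hook placed in bpf_token_free() would get the chance to release any state the alloc hook attached.
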
...
> +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd)
> +{
> + if (!token)
> + return false;
> +
> + return token->allowed_cmds & (1ULL << cmd);

Similar to bpf_token_capable(), I believe we want an LSM hook here to
control who is allowed to use the delegated privilege.

  bool bpf_token_allow_cmd(...)
  {
    if (token && (token->allowed_cmds & (1ULL << cmd)))
      return security_bpf_token_cmd(token, cmd);
    return false;
  }

> +}
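
For illustration, a self-contained userspace sketch of the bitmask check followed by a hook call is below. It assumes the hook returns 0 to allow and non-zero to deny, in which case its result has to be converted before being returned as a bool; that convention, the `model_*` names, and the extra `cmd >= 64` guard (added only to keep the shift well defined) are assumptions of this sketch, not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_token {
	uint64_t allowed_cmds;	/* one bit per delegated bpf() command */
};

/* Hypothetical hook type: 0 allows the command, non-zero denies it. */
typedef int (*model_cmd_hook)(const struct model_token *t, unsigned int cmd);

static int model_hook_allow(const struct model_token *t, unsigned int cmd)
{
	(void)t;
	(void)cmd;
	return 0;
}

static int model_hook_deny(const struct model_token *t, unsigned int cmd)
{
	(void)t;
	(void)cmd;
	return -1;
}

static bool model_token_allow_cmd(const struct model_token *t,
				  unsigned int cmd, model_cmd_hook hook)
{
	/* No token, or a command number that does not fit the 64-bit mask. */
	if (!t || cmd >= 64)
		return false;
	/* The command must have been delegated via the mount options. */
	if (!(t->allowed_cmds & (1ULL << cmd)))
		return false;
	/* Only then is the LSM consulted; 0 from the hook means "allowed". */
	return hook(t, cmd) == 0;
}
```
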
--
paul-moore.com