From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: "Joanne Koong" <joannelkoong@gmail.com>,
bpf <bpf@vger.kernel.org>, "Alexei Starovoitov" <ast@kernel.org>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
"Jesper Dangaard Brouer" <brouer@redhat.com>
Subject: Re: [PATCH bpf-next v3 03/13] bpf: Allow storing unreferenced kptr in map
Date: Fri, 25 Mar 2022 20:21:23 +0530 [thread overview]
Message-ID: <20220325145123.yrzpwlahkmo66s2o@apollo> (raw)
In-Reply-To: <CAEf4BzamUKigWrC+qwAKhRO9KqWfe=aYfkg8ZQo9ZSAW7n7S_g@mail.gmail.com>
On Wed, Mar 23, 2022 at 01:52:52AM IST, Andrii Nakryiko wrote:
> On Tue, Mar 22, 2022 at 12:05 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Tue, Mar 22, 2022 at 05:09:30AM IST, Joanne Koong wrote:
> > > On Sun, Mar 20, 2022 at 5:27 PM Kumar Kartikeya Dwivedi
> > > <memxor@gmail.com> wrote:
> > > >
> > > > This commit introduces a new pointer type 'kptr' which can be embedded
> > > > in a map value as holds a PTR_TO_BTF_ID stored by a BPF program during
> > > > its invocation. Storing to such a kptr, BPF program's PTR_TO_BTF_ID
> > > > register must have the same type as in the map value's BTF, and loading
> > > > a kptr marks the destination register as PTR_TO_BTF_ID with the correct
> > > > kernel BTF and BTF ID.
> > > >
> > > > Such kptr are unreferenced, i.e. by the time another invocation of the
> > > > BPF program loads this pointer, the object which the pointer points to
> > > > may not longer exist. Since PTR_TO_BTF_ID loads (using BPF_LDX) are
> > > > patched to PROBE_MEM loads by the verifier, it would safe to allow user
> > > > to still access such invalid pointer, but passing such pointers into
> > > > BPF helpers and kfuncs should not be permitted. A future patch in this
> > > > series will close this gap.
> > > >
> > > > The flexibility offered by allowing programs to dereference such invalid
> > > > pointers while being safe at runtime frees the verifier from doing
> > > > complex lifetime tracking. As long as the user may ensure that the
> > > > object remains valid, it can ensure data read by it from the kernel
> > > > object is valid.
> > > >
> > > > The user indicates that a certain pointer must be treated as kptr
> > > > capable of accepting stores of PTR_TO_BTF_ID of a certain type, by using
> > > > a BTF type tag 'kptr' on the pointed to type of the pointer. Then, this
> > > > information is recorded in the object BTF which will be passed into the
> > > > kernel by way of map's BTF information. The name and kind from the map
> > > > value BTF is used to look up the in-kernel type, and the actual BTF and
> > > > BTF ID is recorded in the map struct in a new kptr_off_tab member. For
> > > > now, only storing pointers to structs is permitted.
> > > >
> > > > An example of this specification is shown below:
> > > >
> > > > #define __kptr __attribute__((btf_type_tag("kptr")))
> > > >
> > > > struct map_value {
> > > > ...
> > > > struct task_struct __kptr *task;
> > > > ...
> > > > };
> > > >
> > > > Then, in a BPF program, user may store PTR_TO_BTF_ID with the type
> > > > task_struct into the map, and then load it later.
> > > >
> > > > Note that the destination register is marked PTR_TO_BTF_ID_OR_NULL, as
> > > > the verifier cannot know whether the value is NULL or not statically, it
> > > > must treat all potential loads at that map value offset as loading a
> > > > possibly NULL pointer.
> > > >
> > > > Only BPF_LDX, BPF_STX, and BPF_ST with insn->imm = 0 (to denote NULL)
> > > > are allowed instructions that can access such a pointer. On BPF_LDX, the
> > > > destination register is updated to be a PTR_TO_BTF_ID, and on BPF_STX,
> > > > it is checked whether the source register type is a PTR_TO_BTF_ID with
> > > > same BTF type as specified in the map BTF. The access size must always
> > > > be BPF_DW.
> > > >
> > > > For the map in map support, the kptr_off_tab for outer map is copied
> > > > from the inner map's kptr_off_tab. It was chosen to do a deep copy
> > > > instead of introducing a refcount to kptr_off_tab, because the copy only
> > > > needs to be done when paramterizing using inner_map_fd in the map in map
> > > > case, hence would be unnecessary for all other users.
> > > >
> > > > It is not permitted to use MAP_FREEZE command and mmap for BPF map
> > > > having kptr, similar to the bpf_timer case.
> > > >
> > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > ---
> > > > include/linux/bpf.h | 29 +++++++-
> > > > include/linux/btf.h | 2 +
> > > > kernel/bpf/btf.c | 161 ++++++++++++++++++++++++++++++++++------
> > > > kernel/bpf/map_in_map.c | 5 +-
> > > > kernel/bpf/syscall.c | 112 +++++++++++++++++++++++++++-
> > > > kernel/bpf/verifier.c | 120 ++++++++++++++++++++++++++++++
> > > > 6 files changed, 401 insertions(+), 28 deletions(-)
> > > >
> > > [...]
> > > > +
> > > > struct bpf_map *bpf_map_get(u32 ufd);
> > > > struct bpf_map *bpf_map_get_with_uref(u32 ufd);
> > > > struct bpf_map *__bpf_map_get(struct fd f);
> > > > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > > > index 36bc09b8e890..5b578dc81c04 100644
> > > > --- a/include/linux/btf.h
> > > > +++ b/include/linux/btf.h
> > > > @@ -123,6 +123,8 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
> > > > u32 expected_offset, u32 expected_size);
> > > > int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t);
> > > > int btf_find_timer(const struct btf *btf, const struct btf_type *t);
> > > > +struct bpf_map_value_off *btf_find_kptr(const struct btf *btf,
> > > > + const struct btf_type *t);
> > >
> > > nit: given that "btf_find_kptr" allocates memory as well, maybe the
> > > name "btf_parse_kptr" would be more reflective?
> > >
> >
> > Good point, will change.
> >
> > > > bool btf_type_is_void(const struct btf_type *t);
> > > > s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind);
> > > > const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
> > > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > > index 9e17af936a7a..92afbec0a887 100644
> > > > --- a/kernel/bpf/btf.c
> > > > +++ b/kernel/bpf/btf.c
> > > > @@ -3164,9 +3164,16 @@ static void btf_struct_log(struct btf_verifier_env *env,
> > > > enum {
> > > > BTF_FIELD_SPIN_LOCK,
> > > > BTF_FIELD_TIMER,
> > > > + BTF_FIELD_KPTR,
> > > > +};
> > > > +
> > > > +enum {
> > > > + BTF_FIELD_IGNORE = 0,
> > > > + BTF_FIELD_FOUND = 1,
> > > > };
> > > >
> > > > struct btf_field_info {
> > > > + const struct btf_type *type;
> > > > u32 off;
> > > > };
> > > >
> > > > @@ -3174,23 +3181,48 @@ static int btf_find_field_struct(const struct btf *btf, const struct btf_type *t
> > > > u32 off, int sz, struct btf_field_info *info)
> > > > {
> > > > if (!__btf_type_is_struct(t))
> > > > - return 0;
> > > > + return BTF_FIELD_IGNORE;
> > > > if (t->size != sz)
> > > > - return 0;
> > > > - if (info->off != -ENOENT)
> > > > - /* only one such field is allowed */
> > > > - return -E2BIG;
> > > > + return BTF_FIELD_IGNORE;
> > > > info->off = off;
> > > > - return 0;
> > > > + return BTF_FIELD_FOUND;
> > > > +}
> > > > +
> > > > +static int btf_find_field_kptr(const struct btf *btf, const struct btf_type *t,
> > > > + u32 off, int sz, struct btf_field_info *info)
> > > > +{
> > > > + /* For PTR, sz is always == 8 */
> > > > + if (!btf_type_is_ptr(t))
> > > > + return BTF_FIELD_IGNORE;
> > > > + t = btf_type_by_id(btf, t->type);
> > > > +
> > > > + if (!btf_type_is_type_tag(t))
> > > > + return BTF_FIELD_IGNORE;
> > > > + /* Reject extra tags */
> > > > + if (btf_type_is_type_tag(btf_type_by_id(btf, t->type)))
> > > > + return -EINVAL;
> > > > + if (strcmp("kptr", __btf_name_by_offset(btf, t->name_off)))
> > > > + return -EINVAL;
> > > > +
> > > > + /* Get the base type */
> > > > + if (btf_type_is_modifier(t))
> > > > + t = btf_type_skip_modifiers(btf, t->type, NULL);
> > > > + /* Only pointer to struct is allowed */
> > > > + if (!__btf_type_is_struct(t))
> > > > + return -EINVAL;
> > > > +
> > > > + info->type = t;
> > > > + info->off = off;
> > > > + return BTF_FIELD_FOUND;
> > > > }
> > > >
> > > > static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t,
> > > > const char *name, int sz, int align, int field_type,
> > > > - struct btf_field_info *info)
> > > > + struct btf_field_info *info, int info_cnt)
> > >
> > > From my understanding, this patch now modifies btf_find_struct_field
> > > and btf_find_datasec_var such that the "info" that is passed in has to
> > > be an array of size max possible + 1 while "info_cnt" is the max
> > > possible count, or we risk writing beyond the "info" array passed in.
> > > It seems like we could just modify the
> > > btf_find_struct_field/btf_find_datasec_var logic so that the user can
> > > just pass in info array of max possible size instead of max possible
> > > size + 1 - or is your concern that this would require more idx >=
> > > info_cnt checks inside the functions? Maybe we should include a
> > > comment here and in btf_find_datasec_var to document that "info"
> > > should always be max possible size + 1?
> > >
> >
> > So for some context on why this was changed, follow [0].
> >
> > I agree it's pretty ugly. My first thought was to check it inside the functions,
> > but that is also not very great. So I went with this, one more suggestion from
> > Alexei was to split it into a find and then fill info, because the error on
> > idx >= info_cnt should only happen after we find. Right now the find and fill
> > happens together, so to error out, you need an extra element it can fill before
> > you bail for ARRAY_SIZE - 1 (which is the actual max).
> >
> > TBH the find + fill split looks best to me, but open to more suggestions.
>
> I think there is much simpler way that doesn't require unnecessary
> copying or splitting anything:
>
> struct btf_field_info tmp;
>
> ...
>
> ret = btf_find_field_struct(btf, member_type, off, sz,
> idx < info_cnt ? &info[idx] : &tmp);
>
> ...
>
> That's it.
>
Indeed, not sure why I was overthinking this, it should work :).
> >
> > [0]: https://lore.kernel.org/bpf/20220319181538.nbqdkprjrzkxk7v4@ast-mbp.dhcp.thefacebook.com
> >
>
> [...]
--
Kartikeya
next prev parent reply other threads:[~2022-03-25 14:51 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-20 15:54 [PATCH bpf-next v3 00/13] Introduce typed pointer support in BPF maps Kumar Kartikeya Dwivedi
2022-03-20 15:54 ` [PATCH bpf-next v3 01/13] bpf: Make btf_find_field more generic Kumar Kartikeya Dwivedi
2022-03-20 15:54 ` [PATCH bpf-next v3 02/13] bpf: Move check_ptr_off_reg before check_map_access Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 03/13] bpf: Allow storing unreferenced kptr in map Kumar Kartikeya Dwivedi
2022-03-21 23:39 ` Joanne Koong
2022-03-22 7:04 ` Kumar Kartikeya Dwivedi
2022-03-22 20:22 ` Andrii Nakryiko
2022-03-25 14:51 ` Kumar Kartikeya Dwivedi [this message]
2022-03-22 5:45 ` Andrii Nakryiko
2022-03-22 7:16 ` Kumar Kartikeya Dwivedi
2022-03-22 7:43 ` Kumar Kartikeya Dwivedi
2022-03-22 18:52 ` Andrii Nakryiko
2022-03-25 14:42 ` Kumar Kartikeya Dwivedi
2022-03-25 22:59 ` Andrii Nakryiko
2022-03-22 18:06 ` Martin KaFai Lau
2022-03-25 14:45 ` Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 04/13] bpf: Indicate argument that will be released in bpf_func_proto Kumar Kartikeya Dwivedi
2022-03-22 1:47 ` Joanne Koong
2022-03-22 7:34 ` Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 05/13] bpf: Allow storing referenced kptr in map Kumar Kartikeya Dwivedi
2022-03-22 20:59 ` Martin KaFai Lau
2022-03-25 14:57 ` Kumar Kartikeya Dwivedi
2022-03-25 23:39 ` Martin KaFai Lau
2022-03-26 1:01 ` Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 06/13] bpf: Prevent escaping of kptr loaded from maps Kumar Kartikeya Dwivedi
2022-03-22 5:58 ` Andrii Nakryiko
2022-03-22 7:18 ` Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 07/13] bpf: Adapt copy_map_value for multiple offset case Kumar Kartikeya Dwivedi
2022-03-22 20:38 ` Andrii Nakryiko
2022-03-25 15:06 ` Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 08/13] bpf: Populate pairs of btf_id and destructor kfunc in btf Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 09/13] bpf: Wire up freeing of referenced kptr Kumar Kartikeya Dwivedi
2022-03-22 20:51 ` Andrii Nakryiko
2022-03-25 14:50 ` Kumar Kartikeya Dwivedi
2022-03-22 21:10 ` Alexei Starovoitov
2022-03-25 15:07 ` Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 10/13] bpf: Teach verifier about kptr_get kfunc helpers Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 11/13] libbpf: Add kptr type tag macros to bpf_helpers.h Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 12/13] selftests/bpf: Add C tests for kptr Kumar Kartikeya Dwivedi
2022-03-22 21:00 ` Andrii Nakryiko
2022-03-25 14:52 ` Kumar Kartikeya Dwivedi
2022-03-24 9:10 ` Jiri Olsa
2022-03-25 14:52 ` Kumar Kartikeya Dwivedi
2022-03-20 15:55 ` [PATCH bpf-next v3 13/13] selftests/bpf: Add verifier " Kumar Kartikeya Dwivedi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220325145123.yrzpwlahkmo66s2o@apollo \
--to=memxor@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=joannelkoong@gmail.com \
--cc=toke@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox