From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>
Subject: Re: [PATCH bpf-next v3 7/8] libbpf: Add MSan annotations
Date: Tue, 21 Feb 2023 01:46:18 +0100 [thread overview]
Message-ID: <b0ae5117d46948ca4d160157bc02e94c3b00fb19.camel@linux.ibm.com> (raw)
In-Reply-To: <CAEf4BzZcvuCZpjKwgT_-3WaKuM82CA1Uxg3X-4E63r2o6he+sA@mail.gmail.com>
On Thu, 2023-02-16 at 15:28 -0800, Andrii Nakryiko wrote:
> On Tue, Feb 14, 2023 at 3:12 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> >
> > MSan runs into a few false positives in libbpf. They all come from
> > the
> > fact that MSan does not know anything about the bpf syscall,
> > particularly, what it writes to.
> >
> > Add __libbpf_mark_mem_written() function to mark memory modified by
> > the
> > bpf syscall, and a few convenience wrappers. Use the abstract name
> > (it
> > could be e.g. libbpf_msan_unpoison()), because it can be used for
> > Valgrind in the future as well.
> >
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > ---
> > tools/lib/bpf/bpf.c | 161
> > ++++++++++++++++++++++++++++++--
> > tools/lib/bpf/btf.c | 1 +
> > tools/lib/bpf/libbpf.c | 1 +
> > tools/lib/bpf/libbpf_internal.h | 38 ++++++++
> > 4 files changed, 194 insertions(+), 7 deletions(-)
>
[...]
> > +/* Helper macros for telling memory checkers that an array pointed
> > to by
> > + * a struct bpf_{btf,link,map,prog}_info member is initialized.
> > Before doing
> > + * that, they make sure that kernel has provided the respective
> > member.
> > + */
> > +
> > +/* Handle arrays with a certain element size. */
> > +#define __MARK_INFO_ARRAY_WRITTEN(ptr, nr, elem_size) do
> > { \
> > + if (info_len >= offsetofend(typeof(*info), ptr)
> > && \
> > + info_len >= offsetofend(typeof(*info), nr)
> > && \
> > + info-
> > >ptr) \
> > + libbpf_mark_mem_written(u64_to_ptr(info-
> > >ptr), \
> > + info->nr *
> > elem_size); \
> > +} while (0)
> > +
> > +/* Handle arrays with a certain element type. */
> > +#define MARK_INFO_ARRAY_WRITTEN(ptr, nr,
> > type) \
> > + __MARK_INFO_ARRAY_WRITTEN(ptr, nr, sizeof(type))
> > +
> > +/* Handle arrays with element size defined by a struct member. */
> > +#define MARK_INFO_REC_ARRAY_WRITTEN(ptr, nr, rec_size) do
> > { \
> > + if (info_len >= offsetofend(typeof(*info),
> > rec_size)) \
> > + __MARK_INFO_ARRAY_WRITTEN(ptr, nr, info-
> > >rec_size); \
> > +} while (0)
> > +
> > +/* Handle null-terminated strings. */
> > +#define MARK_INFO_STR_WRITTEN(ptr, nr) do
> > { \
> > + if (info_len >= offsetofend(typeof(*info), ptr)
> > && \
> > + info_len >= offsetofend(typeof(*info), nr)
> > && \
> > + info-
> > >ptr) \
> > + libbpf_mark_mem_written(u64_to_ptr(info-
> > >ptr), \
> > + info->nr +
> > 1); \
> > +} while (0)
> > +
> > +/* Helper functions for telling memory checkers that arrays
> > pointed to by
> > + * bpf_{btf,link,map,prog}_info members are initialized.
> > + */
> > +
> > +static void mark_prog_info_written(struct bpf_prog_info *info,
> > __u32 info_len)
> > +{
> > + MARK_INFO_ARRAY_WRITTEN(map_ids, nr_map_ids, __u32);
> > + MARK_INFO_ARRAY_WRITTEN(jited_ksyms, nr_jited_ksyms,
> > __u64);
> > + MARK_INFO_ARRAY_WRITTEN(jited_func_lens,
> > nr_jited_func_lens, __u32);
> > + MARK_INFO_REC_ARRAY_WRITTEN(func_info, nr_func_info,
> > + func_info_rec_size);
> > + MARK_INFO_REC_ARRAY_WRITTEN(line_info, nr_line_info,
> > + line_info_rec_size);
> > + MARK_INFO_REC_ARRAY_WRITTEN(jited_line_info,
> > nr_jited_line_info,
> > + jited_line_info_rec_size);
> > + MARK_INFO_ARRAY_WRITTEN(prog_tags, nr_prog_tags,
> > __u8[BPF_TAG_SIZE]);
> > +}
> > +
> > +static void mark_btf_info_written(struct bpf_btf_info *info, __u32
> > info_len)
> > +{
> > + MARK_INFO_ARRAY_WRITTEN(btf, btf_size, __u8);
> > + MARK_INFO_STR_WRITTEN(name, name_len);
> > +}
> > +
> > +static void mark_link_info_written(struct bpf_link_info *info,
> > __u32 info_len)
> > +{
> > + switch (info->type) {
> > + case BPF_LINK_TYPE_RAW_TRACEPOINT:
> > + MARK_INFO_STR_WRITTEN(raw_tracepoint.tp_name,
> > + raw_tracepoint.tp_name_len);
> > + break;
> > + case BPF_LINK_TYPE_ITER:
> > + MARK_INFO_STR_WRITTEN(iter.target_name,
> > iter.target_name_len);
> > + break;
> > + default:
> > + break;
> > + }
> > +}
> > +
> > +#undef MARK_INFO_STR_WRITTEN
> > +#undef MARK_INFO_REC_ARRAY_WRITTEN
> > +#undef MARK_INFO_ARRAY_WRITTEN
> > +#undef __MARK_INFO_ARRAY_WRITTEN
>
> Ugh... I wasn't a big fan of adding all these "mark_mem_written"
> across a bunch of APIs to begin with, but this part is really putting
> me off.
>
> I like the bpf_{map,btf,prog,btf}_info_by_fd() improvements you did,
> but maybe adding these MSan annotations is a bit too much?
> Applications that really care about this whole "do I read
> uninitialized memory" business could do their own simpler wrappers on
> top of libbpf APIs, right?
>
> Maybe we should start there first and see if there is more demand to
> have built-in libbpf support?
I can try moving all this to selftests.
Alternatively this could be made a part of LLVM sanitizers, but then
we come back to the question of resolving fd types.
> BTW, is this all needed for ASan as well?
Not strictly needed, but this would help detecting bad writes.
> One more worry I have is that given we don't exercise all these
> sanitizers regularly in BPF CI, we'll keep forgetting adding new
> annotations and all this machinery will start bit rotting.
>
> So I'd say that we should first make sure that we have sanitizer
> builds/runs in BPF CI, before signing up for maintaining these
> "annotations".
I'll wait until LLVM folks review my patches, and then see if I can
add MSan to the CI. Configuring it locally wasn't too complicated,
the main difficulty is that one needs instrumented zlib and elfutils.
For the CI, they can be prebuilt and uploaded to S3, and then added
to the build environment and the image.
[...]
next prev parent reply other threads:[~2023-02-21 0:46 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-14 23:12 [PATCH bpf-next v3 0/8] Add Memory Sanitizer support Ilya Leoshkevich
2023-02-14 23:12 ` [PATCH bpf-next v3 1/8] libbpf: Introduce bpf_{btf,link,map,prog}_get_info_by_fd() Ilya Leoshkevich
2023-02-16 23:08 ` Andrii Nakryiko
2023-02-16 23:37 ` Andrii Nakryiko
2023-02-14 23:12 ` [PATCH bpf-next v3 2/8] libbpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() Ilya Leoshkevich
2023-02-14 23:12 ` [PATCH bpf-next v3 3/8] bpftool: " Ilya Leoshkevich
2023-02-15 12:53 ` Quentin Monnet
2023-02-14 23:12 ` [PATCH bpf-next v3 4/8] samples/bpf: " Ilya Leoshkevich
2023-02-14 23:12 ` [PATCH bpf-next v3 5/8] selftests/bpf: " Ilya Leoshkevich
2023-02-14 23:12 ` [PATCH bpf-next v3 6/8] libbpf: Factor out is_percpu_bpf_map_type() Ilya Leoshkevich
2023-02-14 23:12 ` [PATCH bpf-next v3 7/8] libbpf: Add MSan annotations Ilya Leoshkevich
2023-02-16 23:28 ` Andrii Nakryiko
2023-02-21 0:46 ` Ilya Leoshkevich [this message]
2023-02-27 21:23 ` Andrii Nakryiko
2023-02-14 23:12 ` [PATCH bpf-next v3 8/8] selftests/bpf: " Ilya Leoshkevich
2023-02-16 23:29 ` Andrii Nakryiko
2023-02-17 0:10 ` [PATCH bpf-next v3 0/8] Add Memory Sanitizer support patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b0ae5117d46948ca4d160157bc02e94c3b00fb19.camel@linux.ibm.com \
--to=iii@linux.ibm.com \
--cc=agordeev@linux.ibm.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=gor@linux.ibm.com \
--cc=hca@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox