All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiri Olsa <olsajiri@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Jiri Olsa <olsajiri@gmail.com>,
	Juri Lelli <juri.lelli@redhat.com>, bpf <bpf@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Artem Savkov <asavkov@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	"Jose E. Marchesi" <jose.marchesi@oracle.com>
Subject: Re: NULL pointer deref when running BPF monitor program (6.11.0-rc1)
Date: Thu, 15 Aug 2024 13:48:02 +0200	[thread overview]
Message-ID: <Zr3q8ihbe8cUdpfp@krava> (raw)
In-Reply-To: <CAADnVQLYxdKn-J2-2iXKKKTg=o6xkKWzV2WyYrnmQ-j62b9STA@mail.gmail.com>

On Thu, Aug 08, 2024 at 08:43:05AM -0700, Alexei Starovoitov wrote:
> On Thu, Aug 8, 2024 at 3:46 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Tue, Aug 06, 2024 at 11:44:52AM -0700, Alexei Starovoitov wrote:
> > > On Tue, Aug 6, 2024 at 6:24 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > >
> > > > > Jiri,
> > > > >
> > > > > the verifier removes the check because it assumes that pointers
> > > > > passed by the kernel into tracepoint are valid and trusted.
> > > > > In this case:
> > > > >         trace_sched_pi_setprio(p, pi_task);
> > > > >
> > > > > pi_task can be NULL.
> > > > >
> > > > > We cannot make all tracepoint pointers to be PTR_TRUSTED | PTR_MAYBE_NULL
> > > > > by default, since it will break a bunch of progs.
> > > > > Instead we can annotate this tracepoint arg as __nullable and
> > > > > teach the verifier to recognize such special arguments of tracepoints.
> > > >
> > > > ok, so you mean to be able to mark it in event header like:
> > > >
> > > >   TRACE_EVENT(sched_pi_setprio,
> > > >         TP_PROTO(struct task_struct *tsk, struct task_struct *pi_task __nullable),
> > > >
> > > > I guess we could make pahole to emit DECL_TAG for that argument,
> > > > but I'm not sure how to propagate that __nullable info to pahole
> > > >
> > > > while wondering about that, I tried the direct fix below ;-)
> > >
> > > We don't need to rush such a hack below.
> > > No need to add decl_tag and change pahole either.
> > > The arg name is already vmlinux BTF:
> > > [51371] FUNC_PROTO '(anon)' ret_type_id=0 vlen=3
> > >         '__data' type_id=61
> > >         'tsk' type_id=77
> > >         'pi_task' type_id=77
> > > [51372] FUNC '__bpf_trace_sched_pi_setprio' type_id=51371 linkage=static
> > >
> > > just need to rename "pi_task" to "pi_task__nullable"
> > > and teach the verifier.
> >
> > the problem is that btf_trace_<xxx> is typedef
> >
> >   typedef void (*btf_trace_##call)(void *__data, proto);
> >
> > and dwarf does not store argument names for subroutine type entry,
> > so it's not in BTF's TYPEDEF either
> >
> > it's the btf_trace_##call typedef ID that verifier has to work with,
> > I wonder we could somehow associate that ID with __bpf_trace_##call
> > subroutine entry which has the argument names
> >
> > we could store __bpf_trace_##call's BTF_ID in __bpf_raw_tp_map record,
> > but we'd need to do the lookup based on the tracepoint name when loading
> > the program .. ATM we do the lookup __bpf_raw_tp_map record only when
> > doing attach, so we would need to move it to program load time
> >
> > or we could 'fix' the argument names in pahole, but that'd probably
> > mean extra setup and hash lookup, so also not great
> 
> I would do a simple string search in vmlinux BTF for "__bpf_trace" + tp name.
> No need to add btf_id-s and waste memory to speed up the slow path.

I checked bit more and there are more tracepoints with the same issue,
the first diff stat looks like:

	 include/trace/events/afs.h                            | 44 ++++++++++++++++++++++----------------------
	 include/trace/events/cachefiles.h                     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------
	 include/trace/events/ext4.h                           |  6 +++---
	 include/trace/events/fib.h                            | 16 ++++++++--------
	 include/trace/events/filelock.h                       | 38 +++++++++++++++++++-------------------
	 include/trace/events/host1x.h                         | 10 +++++-----
	 include/trace/events/huge_memory.h                    | 24 ++++++++++++------------
	 include/trace/events/kmem.h                           | 18 +++++++++---------
	 include/trace/events/netfs.h                          | 16 ++++++++--------
	 include/trace/events/power.h                          |  6 +++---
	 include/trace/events/qdisc.h                          |  8 ++++----
	 include/trace/events/rxrpc.h                          | 12 ++++++------
	 include/trace/events/sched.h                          | 12 ++++++------
	 include/trace/events/sunrpc.h                         |  8 ++++----
	 include/trace/events/tcp.h                            | 14 +++++++-------
	 include/trace/events/tegra_apb_dma.h                  |  6 +++---
	 include/trace/events/timer_migration.h                | 10 +++++-----
	 include/trace/events/writeback.h                      | 16 ++++++++--------

plus there's one case where pointer needs to be checked with IS_ERR in
include/trace/events/rdma_core.h trace_mr_alloc/mr_integ_alloc

I'm not excited about the '_nullable' argument suffix, because it's lot
of extra changes/renames in TP_fast_assign and it does not solve the
IS_ERR case above

I checked on the type tag and with llvm build we get the TYPE_TAG info
nicely in BTF:

	[119148] TYPEDEF 'btf_trace_sched_pi_setprio' type_id=119149
	[119149] PTR '(anon)' type_id=119150
	[119150] FUNC_PROTO '(anon)' ret_type_id=0 vlen=3
		'(anon)' type_id=27
		'(anon)' type_id=678
		'(anon)' type_id=119152
	[119151] TYPE_TAG 'nullable' type_id=679
	[119152] PTR '(anon)' type_id=119151

	[679] STRUCT 'task_struct' size=15424 vlen=277

which we can easily check in verifier.. the tracepoint definition would look like:

	-       TP_PROTO(struct task_struct *tsk, struct task_struct *pi_task),
	+       TP_PROTO(struct task_struct *tsk, struct task_struct __nullable *pi_task),

and no other change in TP_fast_assign is needed

I think using the type tag for this is nicer, but I'm not sure where's
gcc at with btf_type_tag implementation, need to check on that

jirka

  reply	other threads:[~2024-08-15 11:48 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-05  9:20 NULL pointer deref when running BPF monitor program (6.11.0-rc1) Juri Lelli
2024-08-05 16:49 ` Jiri Olsa
2024-08-05 17:00   ` Alexei Starovoitov
2024-08-06  7:08     ` Juri Lelli
2024-08-06 13:17     ` Jiri Olsa
2024-08-06 13:24     ` Jiri Olsa
2024-08-06 18:44       ` Alexei Starovoitov
2024-08-08 10:46         ` Jiri Olsa
2024-08-08 15:43           ` Alexei Starovoitov
2024-08-15 11:48             ` Jiri Olsa [this message]
2024-08-15 12:37               ` Alexei Starovoitov
2024-08-16 14:10                 ` Steven Rostedt
2024-08-16 18:59                   ` Jiri Olsa
2024-08-16 19:30                     ` Steven Rostedt
2024-08-19 11:47                       ` Jiri Olsa
2024-08-19 14:05                         ` Jiri Olsa
2024-08-19 15:37                         ` Steven Rostedt
2024-08-20 10:17                           ` Jiri Olsa
2024-08-20 15:05                             ` Steven Rostedt
2024-10-02 16:30                               ` Jiri Olsa
2024-10-09 20:41                                 ` Jiri Olsa
2024-10-10  0:33                                   ` Josh Poimboeuf
2024-10-10  0:56                                     ` Steven Rostedt
2024-10-10  0:57                                       ` Steven Rostedt
2024-10-10  3:17                                         ` Josh Poimboeuf
2024-10-10  9:00                                           ` Jiri Olsa
2024-10-10 13:49                                             ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zr3q8ihbe8cUdpfp@krava \
    --to=olsajiri@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=asavkov@redhat.com \
    --cc=bpf@vger.kernel.org \
    --cc=jose.marchesi@oracle.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.