From: Steven Rostedt <rostedt@goodmis.org>
To: Alexei Starovoitov <ast@fb.com>
Cc: davem@davemloft.net, daniel@iogearbox.net,
torvalds@linux-foundation.org, peterz@infradead.org,
mathieu.desnoyers@efficios.com, netdev@vger.kernel.org,
kernel-team@fb.com, linux-api@vger.kernel.org
Subject: Re: [PATCH v7 bpf-next 07/10] bpf: introduce BPF_RAW_TRACEPOINT
Date: Wed, 28 Mar 2018 13:41:39 -0400
Message-ID: <20180328134139.0db1b5b5@gandalf.local.home>
In-Reply-To: <20180328021105.4061744-8-ast@fb.com>
On Tue, 27 Mar 2018 19:11:02 -0700
Alexei Starovoitov <ast@fb.com> wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> Introduce BPF_PROG_TYPE_RAW_TRACEPOINT bpf program type to access
> kernel internal arguments of the tracepoints in their raw form.
>
> From a bpf program's point of view, access to the arguments looks like:
> struct bpf_raw_tracepoint_args {
>     __u64 args[0];
> };
>
> int bpf_prog(struct bpf_raw_tracepoint_args *ctx)
> {
>     // program can read args[N] where N depends on tracepoint
>     // and statically verified at program load+attach time
> }
>
> The kprobe+bpf infrastructure allows programs to access function arguments.
> This feature allows programs to access raw tracepoint arguments.
>
> Similar to the proposed 'dynamic ftrace events', there are no ABI guarantees
> as to what the tracepoint arguments are or what they mean.
> The program needs to type cast args properly and use the bpf_probe_read()
> helper to access struct fields when an argument is a pointer.
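The "args[N], statically verified" idea quoted above can be pictured with a small user-space sketch. Everything here is an assumption made for illustration: the helper names raw_tp_access_ok() and raw_tp_read_arg() are hypothetical, and the rule that only aligned, whole 64-bit slots may be read is a simplification, not a statement of what the verifier in this patch actually enforces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical load/attach-time check: may a program read 'size' bytes at
 * byte offset 'off' of a context that holds 'num_args' 64-bit slots?
 * Assumption: only aligned, whole-slot reads are accepted.
 */
static bool raw_tp_access_ok(size_t off, size_t size, unsigned int num_args)
{
    if (size != sizeof(uint64_t))
        return false;
    if (off % sizeof(uint64_t) != 0)
        return false;
    return off / sizeof(uint64_t) < num_args;
}

/*
 * Hypothetical run-time read of slot 'n'. The caller is expected to have
 * passed the check above, so the bound is only asserted here.
 */
static uint64_t raw_tp_read_arg(const uint64_t *args, unsigned int num_args,
                                unsigned int n)
{
    assert(n < num_args);
    return args[n];
}
```

With a three-argument tracepoint such as xdp_exception, offsets 0, 8 and 16 pass this check and offset 24 does not. A program would then cast the slot it reads, for example (uint32_t)raw_tp_read_arg(args, 3, 2) for 'act'.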
>
> For every tracepoint __bpf_trace_##call function is prepared.
> In assembler it looks like:
> (gdb) disassemble __bpf_trace_xdp_exception
> Dump of assembler code for function __bpf_trace_xdp_exception:
> 0xffffffff81132080 <+0>: mov %ecx,%ecx
> 0xffffffff81132082 <+2>: jmpq 0xffffffff811231f0 <bpf_trace_run3>
>
> where
>
> TRACE_EVENT(xdp_exception,
> TP_PROTO(const struct net_device *dev,
> const struct bpf_prog *xdp, u32 act),
>
> The above assembler snippet zero-extends the 32-bit 'act' argument to 'u64'
> before passing it to bpf_trace_run3(), while 'dev' and 'xdp' are passed as-is.
> Each of the ~500 __bpf_trace_*() functions is only 5-10 bytes long,
> and in total this approach adds 7k bytes to .text.
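The cast described above can be mimicked in plain user-space C. This is a sketch under stated assumptions: trace_run3_sketch() is a stand-in that only records its three 64-bit arguments (the real bpf_trace_run3() runs a bpf program), and the two stub structs replace the kernel's struct net_device and struct bpf_prog.

```c
#include <stdint.h>

/* Stubs standing in for the kernel types named in TP_PROTO. */
struct net_device_stub { int ifindex; };
struct bpf_prog_stub { int id; };

/* Stand-in for bpf_trace_run3(): just stores the three raw u64 slots. */
static void trace_run3_sketch(uint64_t out[3],
                              uint64_t a0, uint64_t a1, uint64_t a2)
{
    out[0] = a0;
    out[1] = a1;
    out[2] = a2;
}

/*
 * Hypothetical equivalent of __bpf_trace_xdp_exception(): pointers are
 * passed through unchanged, and the 32-bit 'act' is widened to 64 bits.
 * Widening an unsigned 32-bit value zero-extends it, which is the effect
 * of the 'mov %ecx,%ecx' in the disassembly.
 */
static void trace_xdp_exception_sketch(uint64_t out[3],
                                       const struct net_device_stub *dev,
                                       const struct bpf_prog_stub *xdp,
                                       uint32_t act)
{
    trace_run3_sketch(out,
                      (uint64_t)(uintptr_t)dev,
                      (uint64_t)(uintptr_t)xdp,
                      (uint64_t)act);
}
```

Because the wrapper only widens and forwards, a compiler can reduce it to a register move plus a tail call, which is consistent with the 5-10 byte size quoted above.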
>
> This approach gives the lowest possible overhead
> when calling trace_xdp_exception() from kernel C code and
> transitioning into bpf land.
> Since tracepoint+bpf is used at rates of 1M+ events per second,
> this is a valuable optimization.
>
> The new BPF_RAW_TRACEPOINT_OPEN sys_bpf command is introduced
> that returns anon_inode FD of 'bpf-raw-tracepoint' object.
>
> The user-space side looks like:
>     // load bpf prog with BPF_PROG_TYPE_RAW_TRACEPOINT type
>     prog_fd = bpf_prog_load(...);
>     // receive anon_inode fd for given bpf_raw_tracepoint with prog attached
>     raw_tp_fd = bpf_raw_tracepoint_open("xdp_exception", prog_fd);
>
> Ctrl-C of a tracing daemon or cmdline tool that uses this feature
> will automatically detach the bpf program, unload it, and
> unregister the tracepoint probe.
>
> On the kernel side, the __bpf_raw_tp_map section holds pointers to each
> tracepoint definition and its __bpf_trace_*() probe function. It is used
> to find the tracepoint named "xdp_exception" and the
> corresponding __bpf_trace_xdp_exception() probe function,
> which are passed to tracepoint_probe_register() to connect the probe
> with the tracepoint.
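The name-to-probe lookup described above can be sketched in user space as a linear search over a table of (name, probe) pairs. The struct layout, the table contents and the function raw_tp_find() are all hypothetical: in the kernel the entries come from a linker section and carry a pointer to the tracepoint definition, which is represented here by a name string only.

```c
#include <stddef.h>
#include <string.h>

typedef void (*probe_fn)(void);

/* Hypothetical map entry: tracepoint name plus its probe function. */
struct raw_tp_map_entry {
    const char *name;
    probe_fn probe;
};

/* Empty stand-ins for the generated __bpf_trace_*() functions. */
static void probe_xdp_exception(void) {}
static void probe_sched_switch(void) {}

/* Stand-in for the contents of the __bpf_raw_tp_map section. */
static const struct raw_tp_map_entry raw_tp_map[] = {
    { "sched_switch",  probe_sched_switch },
    { "xdp_exception", probe_xdp_exception },
};

/* Return the entry whose name matches, or NULL if there is none. */
static const struct raw_tp_map_entry *raw_tp_find(const char *name)
{
    size_t n = sizeof(raw_tp_map) / sizeof(raw_tp_map[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(raw_tp_map[i].name, name) == 0)
            return &raw_tp_map[i];
    }
    return NULL;
}
```

In this sketch a successful raw_tp_find("xdp_exception") yields the pair that the quoted text says is handed to tracepoint_probe_register(); an unknown name yields NULL, which a caller would turn into an error.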
>
> The addition of bpf_raw_tracepoint doesn't interfere with the ftrace and perf
> tracepoint mechanisms. perf_event_open() can be used in parallel
> on the same tracepoint.
> Multiple bpf_raw_tracepoint_open("xdp_exception", prog_fd) calls are
> permitted, each with its own bpf program. The kernel will execute
> all tracepoint probes and all attached bpf programs.
>
> In the future bpf_raw_tracepoints can be extended with
> query/introspection logic.
>
> __bpf_raw_tp_map section logic was contributed by Steven Rostedt
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
Just an FYI, I applied all the patches up to and including this one
(made sure BPF_EVENTS was enabled in my config this time), built and
booted the kernel and ran a bunch of tests (not my full suite, but
enough).
It didn't affect any other tracing features that I can see.
-- Steve
Thread overview: 27+ messages
2018-03-28 2:10 [PATCH v7 bpf-next 00/10] bpf, tracing: introduce bpf raw tracepoints Alexei Starovoitov
2018-03-28 2:10 ` [PATCH v7 bpf-next 01/10] treewide: remove large struct-pass-by-value from tracepoint arguments Alexei Starovoitov
2018-03-28 2:10 ` [PATCH v7 bpf-next 02/10] net/mediatek: disambiguate mt76 vs mt7601u trace events Alexei Starovoitov
2018-03-28 2:10 ` [PATCH v7 bpf-next 03/10] net/mac802154: disambiguate mac80215 vs mac802154 " Alexei Starovoitov
2018-03-28 2:10 ` [PATCH v7 bpf-next 04/10] net/wireless/iwlwifi: fix iwlwifi_dev_ucode_error tracepoint Alexei Starovoitov
2018-03-28 2:11 ` [PATCH v7 bpf-next 05/10] macro: introduce COUNT_ARGS() macro Alexei Starovoitov
2018-03-28 2:11 ` [PATCH v7 bpf-next 06/10] tracepoint: compute num_args at build time Alexei Starovoitov
2018-03-28 13:49 ` Mathieu Desnoyers
2018-03-28 16:43 ` Alexei Starovoitov
2018-03-28 17:04 ` Steven Rostedt
2018-03-28 17:10 ` Alexei Starovoitov
2018-03-28 17:38 ` Steven Rostedt
2018-03-28 18:03 ` Alexei Starovoitov
2018-03-28 18:10 ` Steven Rostedt
2018-03-28 18:19 ` Alexei Starovoitov
2018-03-28 18:54 ` Steven Rostedt
2018-03-28 19:22 ` Mathieu Desnoyers
2018-03-28 19:25 ` Alexei Starovoitov
2018-03-28 19:32 ` Steven Rostedt
2018-03-28 19:38 ` Steven Rostedt
2018-03-28 19:47 ` Mathieu Desnoyers
2018-03-28 17:14 ` Mathieu Desnoyers
2018-03-28 2:11 ` [PATCH v7 bpf-next 07/10] bpf: introduce BPF_RAW_TRACEPOINT Alexei Starovoitov
2018-03-28 17:41 ` Steven Rostedt [this message]
2018-03-28 2:11 ` [PATCH v7 bpf-next 08/10] libbpf: add bpf_raw_tracepoint_open helper Alexei Starovoitov
2018-03-28 2:11 ` [PATCH v7 bpf-next 09/10] samples/bpf: raw tracepoint test Alexei Starovoitov
2018-03-28 2:11 ` [PATCH v7 bpf-next 10/10] selftests/bpf: test for bpf_get_stackid() from raw tracepoints Alexei Starovoitov