From: Namhyung Kim <namhyung@kernel.org>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Ingo Molnar <mingo@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Steven Rostedt <rostedt@goodmis.org>,
Daniel Borkmann <dborkman@redhat.com>,
Chema Gonzalez <chema@google.com>,
Eric Dumazet <edumazet@google.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Arnaldo Carvalho de Melo <acme@infradead.org>,
Jiri Olsa <jolsa@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Andrew Morton <akpm@linux-foundation.org>,
Kees Cook <keescook@chromium.org>,
Linux API <linux-api@vger.kernel.org>,
Network Development <netdev@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC net-next 11/14] tracing: allow eBPF programs to be attached to events
Date: Wed, 2 Jul 2014 15:39:18 +0900
Message-ID: <CAM9d7cisUq9nJbSXZMtD+nQ_g+ZCKSsu_ghRDZY9vvkq51oiKQ@mail.gmail.com>
In-Reply-To: <CAMEtUuzHakFzJYRR0WgwRBnsQL24KGsNax5DTfRNMJrGhvfQMA@mail.gmail.com>
On Wed, Jul 2, 2014 at 3:14 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> On Tue, Jul 1, 2014 at 10:32 PM, Namhyung Kim <namhyung@gmail.com> wrote:
>> On Fri, 27 Jun 2014 17:06:03 -0700, Alexei Starovoitov wrote:
>>> User interface:
>>> cat bpf_123 > /sys/kernel/debug/tracing/__event__/filter
>>>
>>> where 123 is the id of a previously loaded eBPF program.
>>> __event__ is a static tracepoint event.
>>> (kprobe events will be supported in the future patches)
>>>
>>> eBPF programs can call in-kernel helper functions to:
>>> - lookup/update/delete elements in maps
>>> - memcmp
>>> - trace_printk
>>
>> ISTR Steve doesn't like to use trace_printk() (at least for production
>> kernels) anymore. And I'm not sure it'd work if there's no existing
>> trace_printk() on a system.
>
> yes. I saw the big warning that trace_printk_init_buffers() emits.
> The idea here is to use eBPF programs for live kernel debugging.
> Instead of adding printk() and recompiling, just write a program,
> attach it to some event, and printk whatever is interesting.
> My only concern about printk() was that it dumps things into trace
> buffers (which is still better than dumping stuff to syslog), but now
> (since Andy almost convinced me to switch to an 'fd' based interface)
> we can have a seq_printk-like helper that prints into a special
> buffer, so that user space does 'read(ufd)' and receives whatever the
> program has printed. I think that would be much cleaner.
>
>>> + if (unlikely(ftrace_file->flags & FTRACE_EVENT_FL_FILTERED) && \
>>> + unlikely(ftrace_file->event_call->flags & TRACE_EVENT_FL_BPF)) { \
>>> + struct bpf_context __ctx; \
>>> + \
>>> + populate_bpf_context(&__ctx, args, 0, 0, 0, 0, 0); \
>>> + trace_filter_call_bpf(ftrace_file->filter, &__ctx); \
>>> + return; \
>>> + } \
>>> + \
>>
>> Hmm.. But it seems the eBPF prog is not a filter - it'd always drop the
>> event. And I think it's better to use a recorded entry rather than args
>> as a bpf_context so that tools like perf can manipulate it at compile
>> time based on the event format.
>
> Can manipulate what at compile time? Entry records of tracepoints are
> hard coded based on the event. For the verifier it's easier to treat all
> tracepoint events as if they received the same 'struct bpf_context'
> of N arguments; then the same program can be attached to multiple
> tracepoint events at the same time.

I was thinking of perf creating a bpf program to filter events, e.g.
recording kfree_skb only if protocol == xx. perf could then calculate
the offset and size of the protocol field from the event format and
generate the appropriate insns for the filter.

Maybe the event format needs to be passed to the verifier somehow then.
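
To make that concrete, here is a rough userspace sketch in C. It is not
from the patch set: struct field_desc, parse_format_line(), load_field()
and filter_field_eq() are made-up names, and the format line shown in the
comment is only illustrative. It looks up a field's offset and size from
one line of an event format description and compares that field in a
recorded entry against a wanted value:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one field inside a recorded trace entry. */
struct field_desc {
	unsigned int offset;	/* byte offset inside the entry */
	unsigned int size;	/* field size in bytes */
};

/*
 * Parse one line of an event format description, for example
 *   "field:unsigned short protocol;\toffset:24;\tsize:2;\tsigned:0;"
 * (illustrative values) and fill @fd if the line describes @name.
 * Returns 0 on success, -1 otherwise.
 */
static int parse_format_line(const char *line, const char *name,
			     struct field_desc *fd)
{
	const char *decl = strstr(line, "field:");
	const char *end, *p, *off, *sz;
	size_t len = strlen(name);

	if (!decl)
		return -1;
	decl += strlen("field:");
	end = strchr(decl, ';');
	if (!end || (size_t)(end - decl) < len)
		return -1;

	/* The field name is assumed to be the last token of the declaration. */
	p = end - len;
	if (strncmp(p, name, len) != 0)
		return -1;
	if (p != decl && p[-1] != ' ' && p[-1] != '*')
		return -1;

	off = strstr(end, "offset:");
	sz = strstr(end, "size:");
	if (!off || !sz)
		return -1;
	if (sscanf(off, "offset:%u", &fd->offset) != 1 ||
	    sscanf(sz, "size:%u", &fd->size) != 1)
		return -1;
	return 0;
}

/* Load the field as an unsigned value, read in host byte order. */
static int load_field(const void *entry, size_t entry_len,
		      const struct field_desc *fd, uint64_t *val)
{
	const unsigned char *p = entry;
	uint16_t v16;
	uint32_t v32;
	uint64_t v64;

	/* Reject fields that do not fit inside the entry. */
	if (fd->offset > entry_len || fd->size > entry_len - fd->offset)
		return -1;

	switch (fd->size) {
	case 1: *val = p[fd->offset]; return 0;
	case 2: memcpy(&v16, p + fd->offset, 2); *val = v16; return 0;
	case 4: memcpy(&v32, p + fd->offset, 4); *val = v32; return 0;
	case 8: memcpy(&v64, p + fd->offset, 8); *val = v64; return 0;
	}
	return -1;
}

/* The filter itself: nonzero means "record this event". */
static int filter_field_eq(const void *entry, size_t entry_len,
			   const struct field_desc *fd, uint64_t wanted)
{
	uint64_t v;

	return load_field(entry, entry_len, fd, &v) == 0 && v == wanted;
}
```

A tool would do the lookup once per event and then emit a load of
fd.size bytes at fd.offset followed by a compare, which is all the
filter needs to know about the entry layout.
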
> I thought about making the verifier specific for _every_ tracepoint event,
> but it complicates the user interface, since 'bpf_context' is now different
> for every program. I think args are much easier to deal with from a C
> programming point of view, since the program can go and fetch the same
> fields that the tracepoint 'fast_assign' macro does.
> Also skipping buffer allocation and fast_assign gives a very sizable
> performance boost, since the program will access only what it needs to.
>
> The return value of the eBPF program is ignored, since I couldn't think
> of a use case for it. We can change it to be more 'filter'-like and interpret
> the return value as true/false: whether to record this event or not. Thoughts?

Your scenario looks like just calling a bpf program when an event is
hit. IMHO event triggering could be used for that purpose.

But for filtering, the return value needs to be checked.
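
Something like the following is what I have in mind. This is only a
plain C model, not kernel code: struct sample_ctx, struct sample_event,
event_hit() and prog_arg0_nonzero() are hypothetical names, and a
function pointer stands in for the attached eBPF program. It assumes
that a trigger-style attachment just runs the program and leaves
recording alone, while a filter-style attachment lets the return value
decide whether the event is recorded:

```c
#include <stddef.h>

/* Hypothetical stand-in for the context handed to the program. */
struct sample_ctx {
	unsigned long args[6];
};

/* A function pointer models the attached program in this sketch. */
typedef int (*sample_prog_fn)(const struct sample_ctx *ctx);

struct sample_event {
	sample_prog_fn prog;	/* attached program, or NULL */
	int as_filter;		/* nonzero: return value decides recording */
	unsigned long recorded;	/* number of events recorded so far */
};

/* Example program: accept the event only if the first argument is set. */
static int prog_arg0_nonzero(const struct sample_ctx *ctx)
{
	return ctx->args[0] != 0;
}

/* Called whenever the event is hit. */
static void event_hit(struct sample_event *ev, const struct sample_ctx *ctx)
{
	if (ev->prog) {
		int ret = ev->prog(ctx);

		/* Filter mode: a zero return value drops the event. */
		if (ev->as_filter && !ret)
			return;
	}

	/* Trigger mode, or the filter accepted: record as usual. */
	ev->recorded++;
}
```

With as_filter set, only the events the program accepts are counted;
with it cleared, the program still runs on every hit but nothing is
dropped.
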
Thanks,
Namhyung