All of lore.kernel.org
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Arnaldo Carvalho de Melo <acme@infradead.org>,
	Jiri Olsa <jolsa@redhat.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	linux-api@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 linux-trace 1/8] tracing: attach eBPF programs to tracepoints and syscalls
Date: Thu, 29 Jan 2015 09:46:31 +0900	[thread overview]
Message-ID: <20150129004631.GA24182@sejong> (raw)
In-Reply-To: <1422417973-10195-2-git-send-email-ast@plumgrid.com>

Hi Alexei,

On Tue, Jan 27, 2015 at 08:06:06PM -0800, Alexei Starovoitov wrote:
> User interface:
> fd = open("/sys/kernel/debug/tracing/__event__/filter")
> 
> write(fd, "bpf_123")
> 
> where 123 is process local FD associated with eBPF program previously loaded.
> __event__ is static tracepoint event or syscall.
> (kprobe support is in next patch)
> Once program is successfully attached to tracepoint event, the tracepoint
> will be auto-enabled
> 
> close(fd)
> auto-disables tracepoint event and detaches eBPF program from it
> 
> eBPF programs can call in-kernel helper functions to:
> - lookup/update/delete elements in maps
> - memcmp
> - fetch_ptr/u64/u32/u16/u8 values from unsafe address via probe_kernel_read(),
>   so that eBPF program can walk any kernel data structures
> 
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
> ---

[SNIP]
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index b03a0ea77b99..70482817231a 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1084,6 +1084,26 @@ event_filter_read(struct file *filp, char __user *ubuf, size_t cnt,
>  	return r;
>  }
>  
> +static int event_filter_release(struct inode *inode, struct file *filp)
> +{
> +	struct ftrace_event_file *file;
> +	char buf[2] = "0";
> +
> +	mutex_lock(&event_mutex);
> +	file = event_file_data(filp);
> +	if (file) {
> +		if (file->flags & TRACE_EVENT_FL_BPF) {
> +			/* auto-disable the filter */
> +			ftrace_event_enable_disable(file, 0);

Hmm.. what if user already enabled an event, attached a bpf filter and
then detached the filter - I'm not sure we can always auto-disable
it..


> +
> +			/* if BPF filter was used, clear it on fd close */
> +			apply_event_filter(file, buf);
> +		}
> +	}
> +	mutex_unlock(&event_mutex);
> +	return 0;
> +}
> +
>  static ssize_t
>  event_filter_write(struct file *filp, const char __user *ubuf, size_t cnt,
>  		   loff_t *ppos)
> @@ -1107,8 +1127,18 @@ event_filter_write(struct file *filp, const char __user *ubuf, size_t cnt,
>  
>  	mutex_lock(&event_mutex);
>  	file = event_file_data(filp);
> -	if (file)
> +	if (file) {
> +		/*
> +		 * note to user space tools:
> +		 * write() into debugfs/tracing/events/xxx/filter file
> +		 * must be done with the same privilege level as open()
> +		 */
>  		err = apply_event_filter(file, buf);
> +		if (!err && file->flags & TRACE_EVENT_FL_BPF)
> +			/* once filter is applied, auto-enable it */
> +			ftrace_event_enable_disable(file, 1);
> +	}
> +
>  	mutex_unlock(&event_mutex);
>  
>  	free_page((unsigned long) buf);
> @@ -1363,6 +1393,7 @@ static const struct file_operations ftrace_event_filter_fops = {
>  	.open = tracing_open_generic,
>  	.read = event_filter_read,
>  	.write = event_filter_write,
> +	.release = event_filter_release,
>  	.llseek = default_llseek,
>  };
>  
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index ced69da0ff55..e0303b3cc9fb 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -23,6 +23,9 @@
>  #include <linux/mutex.h>
>  #include <linux/perf_event.h>
>  #include <linux/slab.h>
> +#include <linux/bpf.h>
> +#include <trace/bpf_trace.h>
> +#include <linux/filter.h>
>  
>  #include "trace.h"
>  #include "trace_output.h"
> @@ -541,6 +544,21 @@ static int filter_match_preds_cb(enum move_type move, struct filter_pred *pred,
>  	return WALK_PRED_DEFAULT;
>  }
>  
> +unsigned int trace_filter_call_bpf(struct event_filter *filter, void *ctx)
> +{
> +	unsigned int ret;
> +
> +	if (in_nmi()) /* not supported yet */
> +		return 0;

But doesn't this mean to auto-disable all attached events during NMI
as returning 0 will prevent the event going to ring buffer?

I think it'd be better to keep an attached event in a soft-disabled
state like event trigger and give control of enabling to users..

Thanks,
Namhyung


> +
> +	rcu_read_lock();
> +	ret = BPF_PROG_RUN(filter->prog, ctx);
> +	rcu_read_unlock();
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(trace_filter_call_bpf);

  reply	other threads:[~2015-01-29  0:46 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-28  4:06 [PATCH v2 linux-trace 0/8] tracing: attach eBPF programs to tracepoints/syscalls/kprobe Alexei Starovoitov
2015-01-28  4:06 ` [PATCH v2 linux-trace 1/8] tracing: attach eBPF programs to tracepoints and syscalls Alexei Starovoitov
2015-01-29  0:46   ` Namhyung Kim [this message]
2015-01-28  4:06 ` [PATCH v2 linux-trace 2/8] tracing: allow eBPF programs to call ktime_get_ns() Alexei Starovoitov
2015-01-28  4:06 ` [PATCH v2 linux-trace 3/8] samples: bpf: simple tracing example in eBPF assembler Alexei Starovoitov
2015-01-28  4:06 ` [PATCH v2 linux-trace 4/8] samples: bpf: simple tracing example in C Alexei Starovoitov
2015-01-28 16:24   ` Arnaldo Carvalho de Melo
2015-01-28 16:25     ` Arnaldo Carvalho de Melo
     [not found]       ` <20150128162557.GP7220-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-01-28 16:42         ` Alexei Starovoitov
2015-01-28 16:42           ` Alexei Starovoitov
2015-01-28 20:44           ` Arnaldo Carvalho de Melo
2015-01-28  4:06 ` [PATCH v2 linux-trace 5/8] samples: bpf: counting example for kfree_skb tracepoint and write syscall Alexei Starovoitov
2015-01-28  4:06 ` [PATCH v2 linux-trace 6/8] samples: bpf: IO latency analysis (iosnoop/heatmap) Alexei Starovoitov
2015-01-28  4:06 ` [PATCH v2 linux-trace 7/8] tracing: attach eBPF programs to kprobe/kretprobe Alexei Starovoitov
     [not found] ` <1422417973-10195-1-git-send-email-ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
2015-01-28  4:06   ` [PATCH v2 linux-trace 8/8] samples: bpf: simple kprobe example Alexei Starovoitov
2015-01-28  4:06     ` Alexei Starovoitov
  -- strict thread matches above, loose matches on Subject: below --
2015-01-29  4:40 [PATCH v2 linux-trace 1/8] tracing: attach eBPF programs to tracepoints and syscalls Alexei Starovoitov
     [not found] ` <CAMEtUuz+N52ms4ZFh8+fSGhg1UaXpBdk9RVPU_zBDCjaZ4bj4A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-29  5:39   ` Masami Hiramatsu
2015-01-29  5:39     ` Masami Hiramatsu
2015-01-29  6:41 ` Namhyung Kim
2015-01-29  6:39 Alexei Starovoitov
2015-01-29  7:04 Alexei Starovoitov
2015-01-29 12:35 ` Namhyung Kim
2015-01-29 12:35   ` Namhyung Kim
2015-01-29 18:55 Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150129004631.GA24182@sejong \
    --to=namhyung@kernel.org \
    --cc=acme@infradead.org \
    --cc=ast@plumgrid.com \
    --cc=jolsa@redhat.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mingo@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.