Re: [RFC PATCH 1/5] bpf: Put perf_events check ahead of bpf prog

public inbox for linux-kernel@vger.kernel.org

From: "Wangnan (F)" <wangnan0@huawei.com>
To: Alexei Starovoitov <ast@plumgrid.com>,
	He Kuang <hekuang@huawei.com>, <rostedt@goodmis.org>,
	<masami.hiramatsu.pt@hitachi.com>, <mingo@redhat.com>,
	<acme@redhat.com>, <a.p.zijlstra@chello.nl>, <jolsa@kernel.org>,
	<namhyung@kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 1/5] bpf: Put perf_events check ahead of bpf prog
Date: Thu, 2 Jul 2015 13:52:42 +0800
Message-ID: <5594D1AA.9040803@huawei.com>
In-Reply-To: <5594B50A.3010705@plumgrid.com>

On 2015/7/2 11:50, Alexei Starovoitov wrote:
> On 6/30/15 7:57 PM, He Kuang wrote:
>> When we add a kprobe point and record events by perf, the execution path
>> of all threads on each cpu will enter this point, but perf may only
>> record events on a particular thread or cpu at this kprobe point, a
>> check on call->perf_events list filters out the threads which perf is
>> not recording.
>
> I think there is a better way to do that. You're adding artificial
> per_cpu filtering whereas you really need per_pid filtering.

I think the difference between your view and He Kuang's is the order of
filtering. In He Kuang's view, perf's original filtering mechanisms
(implicit or explicit) should take precedence over the BPF filter, because
what the user wants is to filter events with *an additional* BPF filter.
So the filters should run in the following order:

  event -> X -> Y -> Z  -> BPF filter +-> perf.data
                                      |
                                      `-> dropped

(In the above diagram, X represents limitations which prevent an event
from being triggered, for example kprobe re-entry. Y represents implicit
filters, such as the check of call->perf_events, which filters out
events from other CPUs (per-pid perf event filtering is also done by it).
Z represents the explicit filter which the user sets with
PERF_EVENT_IOC_SET_FILTER.)

So only those events which perf would collect without a BPF filter
should be passed to the BPF program.

The above is our understanding of an ideal BPF filter.

Therefore, to create an ideal BPF filter, it would be better to put BPF
filters into perf_tp_filter_match().

In the current implementation, BPF filters take effect in the middle
of kprobe event processing:

  event -> X -> BPF filter -> Y -> Z +-> perf.data
                                     |
                                     `-> dropped

And this patch changes the ordering to:

  event -> X -> Y -> BPF filter -> Z +-> perf.data
                                     |
                                     `-> dropped

Neither is ideal, but He Kuang's patch moves the BPF filter in the right
direction. It uses a relatively low-cost operation (the check of
call->perf_events) to reduce the number of calls into BPF filters.
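
To illustrate the cost argument, here is a minimal user-space sketch in C.
It is not kernel code: struct event, enum stage, run_pipeline() and the
sample data are all hypothetical names made up for this illustration, and
the BPF program is modelled as a pure predicate with no side effects
(an assumption that real BPF programs need not satisfy). It only counts
how often the BPF stage is reached under each of the three orderings
drawn above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event model: each flag says whether that stage passes. */
struct event {
	bool pass_x;   /* X: not blocked, e.g. by kprobe re-entry */
	bool pass_y;   /* Y: implicit filter (call->perf_events check) */
	bool pass_z;   /* Z: explicit filter (PERF_EVENT_IOC_SET_FILTER) */
	bool pass_bpf; /* verdict the BPF filter would return */
};

enum stage { STAGE_X, STAGE_Y, STAGE_Z, STAGE_BPF };

struct result {
	size_t recorded;  /* events that would reach perf.data */
	size_t bpf_calls; /* how many times the BPF stage was run */
};

/* Evaluate one stage; count the call if it is the BPF stage. */
static bool run_stage(enum stage s, const struct event *ev, size_t *bpf_calls)
{
	switch (s) {
	case STAGE_X:
		return ev->pass_x;
	case STAGE_Y:
		return ev->pass_y;
	case STAGE_Z:
		return ev->pass_z;
	case STAGE_BPF:
		(*bpf_calls)++;
		return ev->pass_bpf;
	}
	return false;
}

/* Run every event through the stages in 'order', stopping at the first drop. */
static struct result run_pipeline(const enum stage order[4],
				  const struct event *evs, size_t n)
{
	struct result r = { 0, 0 };

	assert(evs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		bool keep = true;

		for (size_t s = 0; s < 4 && keep; s++)
			keep = run_stage(order[s], &evs[i], &r.bpf_calls);
		if (keep)
			r.recorded++;
	}
	return r;
}

/* The three orderings from the diagrams in this mail. */
static const enum stage ORDER_CURRENT[4] = { STAGE_X, STAGE_BPF, STAGE_Y, STAGE_Z };
static const enum stage ORDER_PATCHED[4] = { STAGE_X, STAGE_Y, STAGE_BPF, STAGE_Z };
static const enum stage ORDER_IDEAL[4]   = { STAGE_X, STAGE_Y, STAGE_Z, STAGE_BPF };

/* Made-up sample: most events are ones perf is not recording. */
static const struct event SAMPLE[] = {
	{ true,  true,  true,  true  }, /* passes every stage */
	{ true,  true,  true,  false }, /* dropped by BPF */
	{ true,  true,  false, true  }, /* dropped by Z */
	{ true,  false, true,  true  }, /* dropped by Y */
	{ true,  false, false, true  }, /* dropped by Y */
	{ false, true,  true,  true  }, /* blocked by X */
};
#define SAMPLE_LEN (sizeof(SAMPLE) / sizeof(SAMPLE[0]))
```

With these six sample events, run_pipeline(ORDER_CURRENT, SAMPLE, SAMPLE_LEN)
reaches the BPF stage 5 times, ORDER_PATCHED 3 times and ORDER_IDEAL 2 times,
while exactly one event is recorded in each case. The recorded set is the
same only because the BPF stage is treated as a pure predicate here.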

I'd like to discuss with you whether our understanding is correct.
Do you have any strong reason to put BPF filters at such an early
stage?

Thank you.

Thread overview: 24+ messages
2015-07-01  2:57 [RFC PATCH 0/5] Make eBPF programs output data to perf event He Kuang
2015-07-01  2:57 ` [RFC PATCH 1/5] bpf: Put perf_events check ahead of bpf prog He Kuang
2015-07-02  3:50   ` Alexei Starovoitov
2015-07-02  5:52     ` Wangnan (F) [this message]
2015-07-02 18:02       ` Alexei Starovoitov
2015-07-01  2:57 ` [RFC PATCH 2/5] perf/trace: Add perf extra percpu trace buffer He Kuang
2015-07-01  2:57 ` [RFC PATCH 3/5] tracing/kprobe: Separate inc recursion count out of perf_trace_buf_prepare He Kuang
2015-07-01  2:57 ` [RFC PATCH 4/5] bpf: Introduce function for outputing sample data to perf event He Kuang
2015-07-01  2:57 ` [RFC PATCH 5/5] tracing/kprobe: Combine extra trace buf into perf trace buf He Kuang
2015-07-01  5:44 ` [RFC PATCH 0/5] Make eBPF programs output data to perf event Peter Zijlstra
2015-07-01  6:21   ` Wangnan (F)
2015-07-01 11:58     ` Peter Zijlstra
2015-07-02  2:48       ` Alexei Starovoitov
2015-07-02  3:38         ` He Kuang
2015-07-02  3:52           ` Alexei Starovoitov
2015-07-02  9:24             ` Wangnan (F)
2015-07-02 18:37               ` Alexei Starovoitov
2015-07-02  9:31         ` Peter Zijlstra
2015-07-02 13:50     ` [RFC PATCH v2 0/4] " He Kuang
2015-07-02 13:50       ` [RFC PATCH v2 1/4] bpf: Put perf_events check ahead of bpf prog He Kuang
2015-07-02 18:41         ` Alexei Starovoitov
2015-07-02 13:50       ` [RFC PATCH v2 2/4] tracing/kprobe: Separate inc recursion count out of perf_trace_buf_prepare He Kuang
2015-07-02 13:50       ` [RFC PATCH v2 3/4] bpf: Introduce function for outputing data to perf event He Kuang
2015-07-02 13:50       ` [RFC PATCH v2 4/4] tracing/kprobe: Combine bpf output and perf event output He Kuang
