From: He Kuang <hekuang@huawei.com>
To: Alexei Starovoitov <ast@plumgrid.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Wangnan (F)" <wangnan0@huawei.com>
Cc: <rostedt@goodmis.org>, <masami.hiramatsu.pt@hitachi.com>,
	<mingo@redhat.com>, <acme@redhat.com>, <jolsa@kernel.org>,
	<namhyung@kernel.org>, <linux-kernel@vger.kernel.org>,
	pi3orama <pi3orama@163.com>
Subject: Re: [RFC PATCH 0/5] Make eBPF programs output data to perf event
Date: Thu, 2 Jul 2015 11:38:23 +0800
Message-ID: <5594B22F.8090500@huawei.com>
In-Reply-To: <5594A679.7010108@plumgrid.com>



On 2015/7/2 10:48, Alexei Starovoitov wrote:
> On 7/1/15 4:58 AM, Peter Zijlstra wrote:
>>
>> But why create a separate trace buffer, it should go into the regular
>> perf buffer.
>
> +1
>
> I think
> +static char __percpu *perf_extra_trace_buf[PERF_NR_CONTEXTS];
> is redundant.
> It adds quite a bit of unnecessary complexity to the whole patch set.
>
> Also the call to bpf_output_sample() is not effective unless program
> returns 1. It's a confusing user interface.
>
> Also you cannot ever do:
>       BPF_FUNC_probe_read,
> +    BPF_FUNC_output_sample,
>       BPF_FUNC_ktime_get_ns,
> new functions must be added to the end.
>
> Why not just do:
> perf_trace_buf_prepare() + perf_trace_buf_submit() from the helper?
> No changes to current code.
> No need to call __get_data_size() and other overhead.
> The helper can be called multiple times from the same program.
> imo much cleaner.
>

Invoking perf_trace_buf_submit() will generate a second perf
event entry (header->type = PERF_RECORD_SAMPLE), separate from the
event entry output by the original kprobe. So the final result of
the example in the 00/00 patch may look like this:

   sample entry 1 (from bpf_prog):
     comm timestamp1 generic_perform_write pmu_value=0x1234

   sample entry 2 (from original kprobe):
     comm timestamp2 generic_perform_write: (ffffffff81140b60)

Compare this with the current implementation:

   combined sample entry:
     comm timestamp generic_perform_write: (ffffffff81140b60) pmu_value=0x1234

The two separate entries may not be adjacent in the output, since
multiple threads and kprobes are being recorded, and there is a
chance that one entry is lost while the other is recorded. What we
need is the pmu_value read at the moment 'generic_perform_write' is
entered. The two-entry result is not intuitive, and userspace tools
would have to find and combine the two sample entries to get the
result.
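
To make the userspace cost concrete, here is a rough sketch of the
pairing a tool would need if the two entries were emitted separately.
Everything in it is hypothetical and only for illustration: struct
sample_rec, pair_samples() and max_gap_ns are not part of perf or of
this patch set, and the sketch assumes the records have already been
decoded and sorted by timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded record; NOT a real perf ABI structure. */
enum rec_kind { REC_KPROBE, REC_BPF_OUTPUT };

struct sample_rec {
	enum rec_kind kind;
	uint32_t tid;        /* thread that produced the record */
	uint64_t time_ns;    /* sample timestamp */
	uint64_t pmu_value;  /* only meaningful for REC_BPF_OUTPUT */
	int partner;         /* index of the matching record, or -1 */
};

/*
 * Pair each kprobe record with the nearest earlier, still unpaired
 * BPF output record from the same thread, no older than max_gap_ns.
 * Assumes recs[] is sorted by time_ns in ascending order.
 * Returns the number of pairs found.
 */
size_t pair_samples(struct sample_rec *recs, size_t n, uint64_t max_gap_ns)
{
	size_t paired = 0;

	assert(recs != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		recs[i].partner = -1;

	for (size_t i = 0; i < n; i++) {
		if (recs[i].kind != REC_KPROBE)
			continue;

		/* Walk backwards in time looking for a candidate. */
		for (size_t j = i; j-- > 0; ) {
			struct sample_rec *c = &recs[j];

			if (c->kind != REC_BPF_OUTPUT ||
			    c->tid != recs[i].tid || c->partner >= 0)
				continue;

			/* Sorted input: everything earlier is older still. */
			if (recs[i].time_ns - c->time_ns > max_gap_ns)
				break;

			recs[i].partner = (int)j;
			c->partner = (int)i;
			paired++;
			break;
		}
	}
	return paired;
}
```

Even this much is only a heuristic. If the kprobe entry of one hit and
the bpf entry of the next hit on the same thread are both lost, the
leftover records can still be paired with each other inside the time
window, and the tool has no way to notice. With a combined entry no
matching is needed at all.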

Thank you.

> Also how about calling this helper:
> bpf_trace_buf_submit(void *stack_ptr, int size) ?
> bpf_output_sample, I think, is odd name. It's not a sample.
> May be other name?
>
>

