From: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Arnaldo Carvalho de Melo <acme@infradead.org>,
	Jiri Olsa <jolsa@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <dborkman@redhat.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Brendan Gregg <brendan.d.gregg@gmail.com>,
	Linux API <linux-api@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"zhangwei(Jovi)" <jovi.zhangwei@huawei.com>,
	"yrl.pp-manager.tt@hitachi.com" <yrl.pp-manager.tt@hitachi.com>
Subject: Re: Re: Re: [PATCH tip 0/9] tracing: attach eBPF programs to tracepoints/syscalls/kprobe
Date: Tue, 20 Jan 2015 20:57:18 +0900
Message-ID: <54BE429E.9020108@hitachi.com>
In-Reply-To: <CAMEtUuwrpqRG4a=Hqnj3JBKuLbC4yV+trVAZhevKLbCsm_6U4Q@mail.gmail.com>

(2015/01/20 12:55), Alexei Starovoitov wrote:
> On Mon, Jan 19, 2015 at 6:58 PM, Masami Hiramatsu
> <masami.hiramatsu.pt@hitachi.com> wrote:
>>>
>>> it's done already... one can do the same skb->dev->name logic
>>> in kprobe attached program... so from bpf program point of view,
>>> tracepoints and kprobes feature-wise are exactly the same.
>>> Only input is different.
>>
>> No, I meant that the input should also be same, at least for the first step.
>> I guess it is easy to hook the ring buffer committing and fetch arguments
>> from the event entry.
> 
> No. That would be very slow. See my comment to Steven
> and more detailed numbers below.

Thank you for measuring the performance differences.
Indeed, the ring buffer looks slow.

> Allocating ring buffer takes too much time.
> 
>> And what I expected scenario was
>>
>> 1. setup kprobe traceevent with fd, buf, count by using perf-probe.
>> 2. load bpf module
>> 3. the module processes given event arguments.
> 
> from ring buffer? that's too slow.

OK. BTW, do you think it would be possible to use a small reusable scratchpad
memory for passing arguments? (just a thought)

> It's not usable for high frequency events which
> need this in-kernel aggregation.
> If events are rare, then just dumping everything
> into trace buffer is just fine. No in-kernel program is needed.

Hmm, let me confirm your point: the performance numbers are the reason why
we need to do it in the kernel, right? Mainly for speed, not for flexibility.

>> Hmm, it sounds making another systemtap on top of tracepoint and kprobes.
>> Why don't you just reuse the existing facilities (perftools and ftrace)
>> instead of co-exist?
> 
> hmm. I don't think we're on the same page yet...
> ring buffer and tracing interface is fully reused.
> programs are run as soon as event triggers.
> They can return non-zero and kernel will allocate ring
> buffer which user space will consume.
> Please take a look at tracex1

I see, this code itself is not a destructive change.

>>> Just look how ktap scripts look alike for kprobes and tracepoints.
>>
>> Ktap is a good example, it provides only a language parser and a runtime engine.
>> Actually, currently it lacks a feature to execute "perf-probe" helper from
>> script, but it is easy to add such feature.
> ...
>> For this usecase, I've made --output option for perf probe
>> https://lkml.org/lkml/2014/10/31/210
> 
> you're proposing to call perf binary from ktap binary?

Yes, that's right :)

> I think packaging headaches and error conditions
> will make such approach very hard to use.

No, I don't think so. perf can act as a "buffer" between the kernel API
and the command-line API. If you need clearer errors, you can also
join the upstream development.

> it would be much cleaner to have ktap as part of perf
> generating bpf on the fly and feeding into kernel.
> 'perf probe' parsing and functions don't belong in kernel
> when userspace can generate them in more efficient way.

No, perf probe is still needed for users who don't choose "injecting
binary blob" tracing. Efficiency is NOT the only criterion.

- perf probe and kprobe-event give us a completely understandable
  interface for what will be recorded and where.
  (we can see the event definitions via the kprobe_events interface,
  without any tools)
- kprobe-event gives exactly the same interface as other tracepoint
  events.
- it also doesn't require any binary build step :) nor special tools.
  We can play with ftrace on just a small busybox.
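
For illustration only, here is a minimal sketch of how such an event
definition line could be assembled before it is written to the
kprobe_events file. The helper name build_kprobe_def, the event name
"mywrite" and the x86-64 register names (%di, %si, %dx for fd, buf,
count) are assumptions made up for this example; they are not part of
perf or ftrace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an existing perf/ftrace function): format a
 * kprobe event definition in the "p:EVENT SYMBOL NAME=FETCHARG ..."
 * syntax accepted by the kprobe_events file.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int build_kprobe_def(char *buf, size_t len, const char *event,
                            const char *symbol, const char *const *args,
                            size_t nargs)
{
    assert(buf && event && symbol);

    /* "p:" declares a probe at function entry. */
    int n = snprintf(buf, len, "p:%s %s", event, symbol);
    if (n < 0 || (size_t)n >= len)
        return -1;

    size_t used = (size_t)n;
    for (size_t i = 0; i < nargs; i++) {
        /* Append each fetch argument, separated by one space. */
        n = snprintf(buf + used, len - used, " %s", args[i]);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Called with event "mywrite", symbol "sys_write" and the three fetch
arguments "fd=%di", "buf=%si", "count=%dx", this produces the line
"p:mywrite sys_write fd=%di buf=%si count=%dx", which stays readable
through kprobe_events after it is registered.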

However, this does NOT interfere with upstreaming your patch. I am just saying
that the current ftrace method is also meaningful for these reasons :)


By the way, I am concerned that the bpf compiler could become another systemtap,
especially if you build it on llvm. Do you plan to develop it in the kernel
tree, or apart from the kernel-side development?
I think it is hard to keep the development in sync if you do it out-of-tree.


Thank you,



-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
