Re: [RFC PATCH 2/2] bpf: Implement bpf_perf_event_sample_enable/disable() helpers

From: "Wangnan (F)" <wangnan0@huawei.com>
To: Alexei Starovoitov <ast@plumgrid.com>,
	Kaixu Xia <xiakaixu@huawei.com>, <davem@davemloft.net>,
	<acme@kernel.org>, <mingo@redhat.com>, <a.p.zijlstra@chello.nl>,
	<masami.hiramatsu.pt@hitachi.com>, <jolsa@kernel.org>,
	<daniel@iogearbox.net>
Cc: <linux-kernel@vger.kernel.org>, <pi3orama@163.com>,
	<hekuang@huawei.com>, <netdev@vger.kernel.org>
Subject: Re: [RFC PATCH 2/2] bpf: Implement bpf_perf_event_sample_enable/disable() helpers
Date: Tue, 13 Oct 2015 12:34:52 +0800
Message-ID: <561C89EC.8030303@huawei.com>
In-Reply-To: <561C85BB.3000505@plumgrid.com>

On 2015/10/13 12:16, Alexei Starovoitov wrote:
> On 10/12/15 8:51 PM, Wangnan (F) wrote:
>>> why 'set disable' is needed ?
>>> the example given in cover letter shows the use case where you want
>>> to receive samples only within sys_write() syscall.
>>> The example makes sense, but sys_write() is running on this cpu, so just
>>> disabling it on the current one is enough.
>>>
>>
>> Our real use case is control of system-wide sampling. For example,
>> we need to sample all CPUs when a smartphone starts refreshing its display.
>> We need all CPUs because in an Android system plenty of threads
>> get involved in this behavior. We can't achieve this by controlling
>> sampling on only one CPU. This is the reason we need 'set enable'
>> and 'set disable'.
>
> ok, but that use case may have different enable/disable pattern.
> In the sys_write example ultra-fast enable/disable is a must-have, since
> the whole syscall is fast and overhead should be minimal.
> but for display refresh? we're talking milliseconds, no?
> Can you just ioctl() it from user space?
> If cost of enable/disable is high or the time range between toggling is
> long, then doing it from the bpf program doesn't make sense. Instead
> the program can do bpf_perf_event_output() to send a notification to
> user space that condition is met and the user space can ioctl() events.
>
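
For reference, the user-space alternative suggested in the quoted text could
look roughly like the sketch below. It assumes the tool already holds one perf
event file descriptor per CPU; the function name toggle_events() and the array
layout are illustrative assumptions, not part of perf or of this patchset.
Only the PERF_EVENT_IOC_ENABLE and PERF_EVENT_IOC_DISABLE requests come from
the existing perf ABI.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: enable (on != 0) or disable every perf event fd
 * in the array. Returns the number of ioctl() calls that failed.
 */
int toggle_events(const int *fds, size_t nfds, int on)
{
    unsigned long req = on ? PERF_EVENT_IOC_ENABLE : PERF_EVENT_IOC_DISABLE;
    int failed = 0;

    assert(fds != NULL || nfds == 0);

    /* One ioctl() per event, so a system-wide session costs one call per CPU. */
    for (size_t i = 0; i < nfds; i++) {
        if (ioctl(fds[i], req, 0) < 0)
            failed++;
    }
    return failed;
}
```

Each toggle here is a round trip through user space for every event, which is
the cost being weighed against doing the switch inside the BPF program.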

OK. I think I understand your design principle: everything inside BPF
should be as fast as possible.

Making userspace control events using ioctl makes things harder. You know that
'perf record' itself doesn't care much about the events it receives. It only
copies data to perf.data, but what we want is to use perf record simply like
this:

  # perf record -e evt=cycles -e control.o/pmu=evt/ -a sleep 100

And in control.o we create uprobe points to mark the start and finish of
a frame:

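  /* 'pmu' refers to the event named 'evt' via the pmu=evt term above. */

  SEC("target=/a/b/c.o\nstartFrame=0x123456")
  int startFrame(void *ctx) {
    bpf_pmu_enable(pmu);    /* proposed helper: start sampling */
    return 1;
  }

  SEC("target=/a/b/c.o\nfinishFrame=0x234568")
  int finishFrame(void *ctx) {
    bpf_pmu_disable(pmu);   /* proposed helper: stop sampling */
    return 1;
  }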

I think this makes sense too.

I still think perf events need not be independent of each other. You know
we have PERF_EVENT_IOC_SET_OUTPUT, which can make multiple events output
through one ring buffer. In this way perf events are already connected.
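
As a reminder of how that existing connection is made, here is a minimal
sketch of PERF_EVENT_IOC_SET_OUTPUT in user space. The helper names
open_event() and share_ring_buffer() and the sample period are assumptions
chosen for illustration. The sketch also assumes both events are opened on
the same CPU and that the target event already has its ring buffer mmap()ed.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: open one disabled sampling event on a given CPU
 * for all tasks. glibc has no perf_event_open() wrapper, so use syscall(2).
 * Returns an fd, or -1 on error (for example, insufficient privileges).
 */
int open_event(unsigned int type, unsigned long long config, int cpu)
{
    struct perf_event_attr attr;

    assert(cpu >= 0);   /* system-wide events are opened per CPU */

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;
    attr.config = config;
    attr.sample_period = 100000;   /* arbitrary period for the sketch */
    attr.disabled = 1;

    return (int)syscall(SYS_perf_event_open, &attr, -1, cpu, -1, 0UL);
}

/* Make 'fd' write its samples into the ring buffer owned by 'target_fd'. */
int share_ring_buffer(int fd, int target_fd)
{
    return ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, target_fd);
}
```

After share_ring_buffer() succeeds, a reader only has to drain the target's
buffer to see samples from both events.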

I think the 'set disable/enable' design in this patchset satisfies the
design goal that in a BPF program we only do simple and fast things. The
only inconvenience is that we add something into the map, which is ugly.
What about an implementation similar to PERF_EVENT_IOC_SET_OUTPUT: create
a new ioctl such as PERF_EVENT_IOC_SET_ENABLER and let perf select an event
as the 'enabler'? Then BPF can still control one atomic variable to
enable/disable a set of events.
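
To make the 'enabler' idea concrete, here is a small user-space model in C11.
It is not kernel code and not the patchset's implementation: struct enabler,
struct event, and all function names are hypothetical, and
PERF_EVENT_IOC_SET_ENABLER itself is only a proposal. The model shows one
atomic flag gating sample output for every event attached to it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one flag shared by a whole set of events. */
struct enabler {
    atomic_bool on;
};

struct event {
    struct enabler *enabler;   /* NULL means "not gated, always sample" */
    unsigned long samples;     /* number of samples actually recorded */
};

/* Models what the proposed ioctl would do: bind an event to an enabler. */
void event_attach(struct event *ev, struct enabler *en)
{
    assert(ev != NULL);
    ev->enabler = en;
}

/* Models the BPF side: a single cheap atomic store flips the whole set. */
void enabler_set(struct enabler *en, bool on)
{
    atomic_store_explicit(&en->on, on, memory_order_relaxed);
}

/* Models the sampling path: record a sample only if the enabler allows it. */
bool event_overflow(struct event *ev)
{
    if (ev->enabler &&
        !atomic_load_explicit(&ev->enabler->on, memory_order_relaxed))
        return false;
    ev->samples++;
    return true;
}
```

In this model the cost on the toggling side is one store regardless of how
many events share the enabler, which is the property the proposal is after.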

Thank you.

