linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Brendan Gregg <bgregg@netflix.com>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	linux-kernel@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Wang Nan <wangnan0@huawei.com>
Subject: Re: [PATCH v2 1/3] perf/core: Add a tracepoint for perf sampling
Date: Fri, 5 Aug 2016 12:52:09 +0200	[thread overview]
Message-ID: <20160805105209.GR6879@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <20160805052404.GA57782@ast-mbp.thefacebook.com>

On Thu, Aug 04, 2016 at 10:24:06PM -0700, Alexei Starovoitov wrote:
> tracepoints are actually zero overhead already via static-key mechanism.
> I don't think Peter's objection for the tracepoint was due to overhead.

Almost 0, they still have some I$ footprint, but yes. My main worry is
that we can feed tracepoints into perf, so having tracepoints in perf is
tricky.

I also don't much like this tracepoint being specific to the hrtimer
bits, I can well imagine people wanting to do the same thing for
hardware based samples or whatnot.

> > The perf:perf_hrtimer probe point is also reading state mid-way
> > through a function, so it's not quite as simple as wrapping the
> > function pointer. I do like that idea, though, but for things like
> > struct file_operations.

So what additional state to you need?

> > > Currently overflow_handler is set at event alloc time. If we start
> > > changing it on the fly with atomic xchg(), afaik things shouldn't
> > > break, since each overflow_handler is run to completion and doesn't
> > > change global state, right?

Yes, or even a simple WRITE_ONCE() to replace it, as long as we make
sure to use a READ_ONCE() to load the pointer.

As long as we're sure to limit this poking to a single user its fairly
simple to get right. The moment there can be concurrency a lot of fail
can happen.

> instead of adding a tracepoint to perf_swevent_hrtimer we can replace
> overflow_handler for that particular event with some form of bpf wrapper.
> (probably new bpf program type). Then not only periodic events
> will be triggering bpf prog, but pmu events as well.

Exactly.

> So instead of normal __perf_event_output() writing into ringbuffer,
> a bpf prog will be called that can optionally write into different
> rb via bpf_perf_event_output. 

It could even chain and call into the original function once its done
and have both outputs.

> The question is what to pass into the
> program to make the most use out of it. 'struct pt_regs' is done deal.
> but perf_sample_data we cannot pass as-is, since it's kernel internal.

Urgh, does it have to be stable API? Can't we simply rely on the kernel
headers to provide the right structure definition?

  reply	other threads:[~2016-08-05 10:52 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-03  2:47 [PATCH v2 0/3] Add a tracepoint for BPF perf sampling Brendan Gregg
2016-08-03  2:47 ` [PATCH v2 1/3] perf/core: Add a tracepoint for " Brendan Gregg
2016-08-03  9:48   ` Peter Zijlstra
2016-08-03 18:57     ` Brendan Gregg
2016-08-04 14:28       ` Peter Zijlstra
2016-08-05  1:43         ` Alexei Starovoitov
2016-08-05  4:13           ` Brendan Gregg
2016-08-05  5:24             ` Alexei Starovoitov
2016-08-05 10:52               ` Peter Zijlstra [this message]
2016-08-05 17:22                 ` Brendan Gregg
2016-08-08  9:57                   ` Peter Zijlstra
2016-08-05 19:45                 ` Alexei Starovoitov
2016-08-08 10:03                   ` Peter Zijlstra
2016-08-03  2:47 ` [PATCH v2 2/3] samples/bpf: Add a sampling BPF example Brendan Gregg
2016-08-03  2:47 ` [PATCH v2 3/3] perf script: add bpf-output field to usage message Brendan Gregg
2016-08-09 19:16   ` [tip:perf/urgent] perf script: Add 'bpf-output' " tip-bot for Brendan Gregg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160805105209.GR6879@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=bgregg@netflix.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=wangnan0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).