From: Peter Zijlstra <peterz@infradead.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jiri Olsa <jolsa@kernel.org>,
	Kan Liang <kan.liang@linux.intel.com>,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	bpf@vger.kernel.org
Subject: Re: [PATCH 2/3] perf/core: Set data->sample_flags in perf_prepare_sample()
Date: Mon, 9 Jan 2023 13:14:31 +0100
Message-ID: <Y7wFJ+NF0NwnmzLa@hirez.programming.kicks-ass.net>
In-Reply-To: <20221229204101.1099430-2-namhyung@kernel.org>

On Thu, Dec 29, 2022 at 12:41:00PM -0800, Namhyung Kim wrote:

So I like the general idea; I just think it's turned into a bit of a
mess. That code is already overly branchy, which is known to hurt
performance; we should really try not to make it worse than absolutely
needed.

>  kernel/events/core.c | 86 ++++++++++++++++++++++++++++++++------------
>  1 file changed, 63 insertions(+), 23 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index eacc3702654d..70bff8a04583 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7582,14 +7582,21 @@ void perf_prepare_sample(struct perf_event_header *header,
>  	filtered_sample_type = sample_type & ~data->sample_flags;
>  	__perf_event_header__init_id(header, data, event, filtered_sample_type);
>  
> -	if (sample_type & (PERF_SAMPLE_IP | PERF_SAMPLE_CODE_PAGE_SIZE))
> -		data->ip = perf_instruction_pointer(regs);
> +	if (sample_type & (PERF_SAMPLE_IP | PERF_SAMPLE_CODE_PAGE_SIZE)) {
> +		/* attr.sample_type may not have PERF_SAMPLE_IP */

Right, but that shouldn't matter; IIRC it's OK to have more bits set in
data->sample_flags than we have set in attr.sample_type. It just means
we have data available for sample types we're (possibly) not using.

That is, I think you can simply write this like:

> +		if (!(data->sample_flags & PERF_SAMPLE_IP)) {
> +			data->ip = perf_instruction_pointer(regs);
> +			data->sample_flags |= PERF_SAMPLE_IP;
> +		}
> +	}

	if (filtered_sample_type & (PERF_SAMPLE_IP | PERF_SAMPLE_CODE_PAGE_SIZE)) {
		data->ip = perf_instruction_pointer(regs);
		data->sample_flags |= PERF_SAMPLE_IP;
	}

	...

	if (filtered_sample_type & PERF_SAMPLE_CODE_PAGE_SIZE) {
		data->code_page_size = perf_get_page_size(data->ip);
		data->sample_flags |= PERF_SAMPLE_CODE_PAGE_SIZE;
	}

Then after a single perf_prepare_sample() run we have:

  pre			|	post
  ----------------------------------------
  0			|	0
  IP			|	IP
  CODE_PAGE_SIZE	|	IP|CODE_PAGE_SIZE
  IP|CODE_PAGE_SIZE	|	IP|CODE_PAGE_SIZE

So while data->sample_flags will have an extra bit set in the 3rd case,
that will not affect perf_output_sample(), which only looks at data->type
(== attr.sample_type).

And since data->sample_flags will have both bits set, a second run will
filter out both and avoid the extra work (except doing that will mess up
the branch predictors).
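
The table and the second-run behaviour can be checked with a small
userspace model. This is only a sketch under stated assumptions: the
SAMPLE_* bit values, struct sample_data and the lookup counters are
stand-ins invented for illustration, not the kernel's definitions, and
the counters merely record how often each branch body ran.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in bit values for this sketch; not the kernel's definitions. */
#define SAMPLE_IP             (1ULL << 0)
#define SAMPLE_CODE_PAGE_SIZE (1ULL << 1)

/* Hypothetical, pared-down stand-in for struct perf_sample_data. */
struct sample_data {
	uint64_t sample_flags;		/* which fields are already filled in */
	int ip_lookups;			/* times the IP branch body ran */
	int page_size_lookups;		/* times the page size branch body ran */
};

/* Models only the two branches discussed above. */
void prepare_sample(struct sample_data *data, uint64_t sample_type)
{
	uint64_t filtered = sample_type & ~data->sample_flags;

	/* The page size is looked up from the IP, so either bit forces the IP. */
	if (filtered & (SAMPLE_IP | SAMPLE_CODE_PAGE_SIZE)) {
		data->ip_lookups++;
		data->sample_flags |= SAMPLE_IP;
	}

	if (filtered & SAMPLE_CODE_PAGE_SIZE) {
		data->page_size_lookups++;
		data->sample_flags |= SAMPLE_CODE_PAGE_SIZE;
	}
}

/* Starts from empty flags, runs prepare_sample() twice, returns the flags. */
uint64_t run_twice(uint64_t sample_type, struct sample_data *data)
{
	data->sample_flags = 0;
	data->ip_lookups = 0;
	data->page_size_lookups = 0;

	prepare_sample(data, sample_type);
	prepare_sample(data, sample_type);	/* everything is filtered out here */
	return data->sample_flags;
}

/* Checks the four rows of the pre/post table. */
void check_table(void)
{
	const uint64_t both = SAMPLE_IP | SAMPLE_CODE_PAGE_SIZE;
	struct sample_data d;

	assert(run_twice(0, &d) == 0);
	assert(run_twice(SAMPLE_IP, &d) == SAMPLE_IP);
	assert(run_twice(SAMPLE_CODE_PAGE_SIZE, &d) == both);
	assert(run_twice(both, &d) == both);
}
```

In the CODE_PAGE_SIZE-only row the model ends up with both bits set,
and the second call does no lookups at all because filtered is 0.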


>  	if (sample_type & PERF_SAMPLE_CALLCHAIN) {
>  		int size = 1;
>  
> -		if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
> +		if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN) {
>  			data->callchain = perf_callchain(event, regs);
> +			data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
> +		}
>  
>  		size += data->callchain->nr;
>  

This, why can't this be:

	if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN) {
		data->callchain = perf_callchain(event, regs);
		data->sample_flags |= PERF_SAMPLE_CALLCHAIN;

		header->size += (1 + data->callchain->nr) * sizeof(u64);
	}

I suppose this is because perf_event_header lives on the stack of the
overflow handler and all that isn't available / relevant for the BPF
thing.

And we can't pull that out into another function without adding yet
another branch fest.

However; inspired by your next patch; we can do something like so:

	if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN) {
		data->callchain = perf_callchain(event, regs);
		data->sample_flags |= PERF_SAMPLE_CALLCHAIN;

		data->size += (1 + data->callchain->nr) * sizeof(u64);
	}

And then have __perf_event_output() (or something thereabout) do:

	perf_prepare_sample(data, event, regs);
	perf_prepare_header(&header, data, event);
	err = output_begin(&handle, data, event, header.size);
	if (err)
		goto exit;
	perf_output_sample(&handle, &header, data, event);
	perf_output_end(&handle);

With perf_prepare_header() being something like:

	header->type = PERF_RECORD_SAMPLE;
	header->size = sizeof(*header) + event->header_size + data->size;
	header->misc = perf_misc_flags(regs);
	...

Hmm ?

(same for all the other sites)
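
To make the split concrete, here is a userspace model of the idea: the
sample-preparation step accumulates the dynamic size in the data
structure, and a separate header step adds the static parts. It is a
sketch, not kernel code. Every type, field and constant below
(struct sample, struct event, struct header, SAMPLE_CALLCHAIN,
RECORD_SAMPLE) is a simplified stand-in made up for the example, and
misc is just zeroed instead of being derived from regs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values for this sketch; not the kernel's definitions. */
#define SAMPLE_CALLCHAIN (1ULL << 0)
#define RECORD_SAMPLE    9

struct callchain {
	uint64_t nr;			/* number of entries in the chain */
};

struct header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct event {
	uint64_t sample_type;
	uint16_t header_size;		/* static part, known per event */
};

struct sample {
	uint64_t sample_flags;
	uint64_t size;			/* dynamic part, grows as fields are filled */
	const struct callchain *callchain;
};

/* Fills in missing fields and accounts for their size; no header needed. */
void prepare_sample_data(struct sample *data, const struct event *event,
			 const struct callchain *chain)
{
	uint64_t filtered = event->sample_type & ~data->sample_flags;

	if (filtered & SAMPLE_CALLCHAIN) {
		data->callchain = chain;
		data->sample_flags |= SAMPLE_CALLCHAIN;

		/* one u64 for nr plus one u64 per entry */
		data->size += (1 + chain->nr) * sizeof(uint64_t);
	}
}

/* Builds the header from the sizes gathered earlier. */
void prepare_header(struct header *header, const struct sample *data,
		    const struct event *event)
{
	uint64_t total = sizeof(*header) + event->header_size + data->size;

	assert(total <= UINT16_MAX);	/* header->size is only 16 bits wide */

	header->type = RECORD_SAMPLE;
	header->misc = 0;		/* placeholder; the real value comes from regs */
	header->size = (uint16_t)total;
}
```

Because the size is added inside the filtered branch, calling
prepare_sample_data() a second time on the same data leaves the size
unchanged, which is what lets the BPF path prepare the data without
having a header around.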
