BPF List
 help / color / mirror / Atom feed
* [PATCHSET 0/8] perf/core: Prepare sample data for BPF
@ 2023-01-12 21:40 Namhyung Kim
  2023-01-12 21:40 ` [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper Namhyung Kim
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Namhyung Kim @ 2023-01-12 21:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, Song Liu, bpf

Hello,

The perf_prepare_sample() is to fill the perf sample data and update the
header info before sending it to the ring buffer.  But we want to use it
for BPF overflow handler so that it can access the sample data to filter
relevant ones.

Changes in v2)
 * the layout change is merged
 * reduce branches using __cond_set  (PeterZ)
 * add helpers to set dynamic sample data  (PeterZ)
 * introduce perf_prepare_header()  (PeterZ)
 * call perf_prepare_sample() before bpf_overflow_handler unconditionally

This means the perf_prepare_handler() can be called more than once.  To
avoid duplicate work, use the data->sample_flags and save the data size.

I also added a few of helpers to set those information accordingly.
But it looks some fields like REGS_USER, STACK_USER and AUX are saved in
the perf_prepare_sample() so I didn't add the helpers for them.

After than we can just check the filtered_sample_type flags begin zero
to determine if it has more work.  In that case, it needs to update the
data->type since it's possible to miss when PMU driver sets all required
sample flags before calling perf_prepare_sample().

The code is also available at 'perf/prepare-sample-v2' branch in

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org


Namhyung Kim (8):
  perf/core: Save the dynamic parts of sample data size
  perf/core: Add perf_sample_save_callchain() helper
  perf/core: Add perf_sample_save_raw_data() helper
  perf/core: Add perf_sample_save_brstack() helper
  perf/core: Set data->sample_flags in perf_prepare_sample()
  perf/core: Do not pass header for sample id init
  perf/core: Introduce perf_prepare_header()
  perf/core: Call perf_prepare_sample() before running BPF

 arch/powerpc/perf/core-book3s.c    |   3 +-
 arch/s390/kernel/perf_cpum_cf.c    |   4 +-
 arch/s390/kernel/perf_cpum_sf.c    |   3 +-
 arch/s390/kernel/perf_pai_crypto.c |   4 +-
 arch/s390/kernel/perf_pai_ext.c    |   4 +-
 arch/x86/events/amd/core.c         |   6 +-
 arch/x86/events/amd/ibs.c          |   9 +-
 arch/x86/events/intel/core.c       |   6 +-
 arch/x86/events/intel/ds.c         |  24 ++--
 include/linux/perf_event.h         | 133 +++++++++++++-----
 kernel/events/core.c               | 207 ++++++++++++++++-------------
 kernel/trace/bpf_trace.c           |   6 +-
 12 files changed, 236 insertions(+), 173 deletions(-)


base-commit: 9fcad995c6c52cc9791f7ee9f1386a5684055f9c
-- 
2.39.0.314.g84b9a713c41-goog


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper
  2023-01-12 21:40 [PATCHSET 0/8] perf/core: Prepare sample data for BPF Namhyung Kim
@ 2023-01-12 21:40 ` Namhyung Kim
  2023-01-13 11:19   ` Peter Zijlstra
  2023-01-13 21:01   ` Song Liu
  2023-01-12 21:40 ` [PATCH 8/8] perf/core: Call perf_prepare_sample() before running BPF Namhyung Kim
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 11+ messages in thread
From: Namhyung Kim @ 2023-01-12 21:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, Song Liu, linux-s390,
	x86, bpf

When it saves the raw_data to the perf sample data, it needs to update
the sample flags and the dynamic size.  To make sure this, add the
perf_sample_save_raw_data() helper and convert all call sites.

Cc: linux-s390@vger.kernel.org
Cc: x86@kernel.org
Cc: bpf@vger.kernel.org
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 arch/s390/kernel/perf_cpum_cf.c    |  4 +---
 arch/s390/kernel/perf_pai_crypto.c |  4 +---
 arch/s390/kernel/perf_pai_ext.c    |  4 +---
 arch/x86/events/amd/ibs.c          |  3 +--
 include/linux/perf_event.h         | 33 +++++++++++++++++++++++++-----
 kernel/events/core.c               | 31 +++++-----------------------
 kernel/trace/bpf_trace.c           |  6 ++----
 7 files changed, 39 insertions(+), 46 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index f043a7ff220b..aa38649c7c27 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -662,9 +662,7 @@ static int cfdiag_push_sample(struct perf_event *event,
 	if (event->attr.sample_type & PERF_SAMPLE_RAW) {
 		raw.frag.size = cpuhw->usedss;
 		raw.frag.data = cpuhw->stop;
-		raw.size = raw.frag.size;
-		data.raw = &raw;
-		data.sample_flags |= PERF_SAMPLE_RAW;
+		perf_sample_save_raw_data(&data, &raw);
 	}
 
 	overflow = perf_event_overflow(event, &data, &regs);
diff --git a/arch/s390/kernel/perf_pai_crypto.c b/arch/s390/kernel/perf_pai_crypto.c
index 985e243a2ed8..a7b339c4fd7c 100644
--- a/arch/s390/kernel/perf_pai_crypto.c
+++ b/arch/s390/kernel/perf_pai_crypto.c
@@ -362,9 +362,7 @@ static int paicrypt_push_sample(void)
 	if (event->attr.sample_type & PERF_SAMPLE_RAW) {
 		raw.frag.size = rawsize;
 		raw.frag.data = cpump->save;
-		raw.size = raw.frag.size;
-		data.raw = &raw;
-		data.sample_flags |= PERF_SAMPLE_RAW;
+		perf_sample_save_raw_data(&data, &raw);
 	}
 
 	overflow = perf_event_overflow(event, &data, &regs);
diff --git a/arch/s390/kernel/perf_pai_ext.c b/arch/s390/kernel/perf_pai_ext.c
index 1138f57baae3..555597222bad 100644
--- a/arch/s390/kernel/perf_pai_ext.c
+++ b/arch/s390/kernel/perf_pai_ext.c
@@ -451,9 +451,7 @@ static int paiext_push_sample(void)
 	if (event->attr.sample_type & PERF_SAMPLE_RAW) {
 		raw.frag.size = rawsize;
 		raw.frag.data = cpump->save;
-		raw.size = raw.frag.size;
-		data.raw = &raw;
-		data.sample_flags |= PERF_SAMPLE_RAW;
+		perf_sample_save_raw_data(&data, &raw);
 	}
 
 	overflow = perf_event_overflow(event, &data, &regs);
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 417c80bd3274..64582954b5f6 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1110,8 +1110,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
 				.data = ibs_data.data,
 			},
 		};
-		data.raw = &raw;
-		data.sample_flags |= PERF_SAMPLE_RAW;
+		perf_sample_save_raw_data(&data, &raw);
 	}
 
 	if (perf_ibs == &perf_ibs_op)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a9419608402b..569dfac5887f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -95,6 +95,11 @@ struct perf_raw_record {
 	u32				size;
 };
 
+static __always_inline bool perf_raw_frag_last(const struct perf_raw_frag *frag)
+{
+	return frag->pad < sizeof(u64);
+}
+
 /*
  * branch stack layout:
  *  nr: number of taken branches stored in entries[]
@@ -1182,6 +1187,29 @@ static inline void perf_sample_save_callchain(struct perf_sample_data *data,
 	data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
 }
 
+static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
+					     struct perf_raw_record *raw)
+{
+	struct perf_raw_frag *frag = &raw->frag;
+	u32 sum = 0;
+	int size;
+
+	do {
+		sum += frag->size;
+		if (perf_raw_frag_last(frag))
+			break;
+		frag = frag->next;
+	} while (1);
+
+	size = round_up(sum + sizeof(u32), sizeof(u64));
+	raw->size = size - sizeof(u32);
+	frag->pad = raw->size - sum;
+
+	data->raw = raw;
+	data->dyn_size += size;
+	data->sample_flags |= PERF_SAMPLE_RAW;
+}
+
 /*
  * Clear all bitfields in the perf_branch_entry.
  * The to and from fields are not cleared because they are
@@ -1690,11 +1718,6 @@ extern void perf_restore_debug_store(void);
 static inline void perf_restore_debug_store(void)			{ }
 #endif
 
-static __always_inline bool perf_raw_frag_last(const struct perf_raw_frag *frag)
-{
-	return frag->pad < sizeof(u64);
-}
-
 #define perf_output_put(handle, x) perf_output_copy((handle), &(x), sizeof(x))
 
 struct perf_pmu_events_attr {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0fba98b9cd65..133894ae5e30 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7588,30 +7588,10 @@ void perf_prepare_sample(struct perf_event_header *header,
 	if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
 		perf_sample_save_callchain(data, event, regs);
 
-	if (sample_type & PERF_SAMPLE_RAW) {
-		struct perf_raw_record *raw = data->raw;
-		int size;
-
-		if (raw && (data->sample_flags & PERF_SAMPLE_RAW)) {
-			struct perf_raw_frag *frag = &raw->frag;
-			u32 sum = 0;
-
-			do {
-				sum += frag->size;
-				if (perf_raw_frag_last(frag))
-					break;
-				frag = frag->next;
-			} while (1);
-
-			size = round_up(sum + sizeof(u32), sizeof(u64));
-			raw->size = size - sizeof(u32);
-			frag->pad = raw->size - sum;
-		} else {
-			size = sizeof(u64);
-			data->raw = NULL;
-		}
-
-		data->dyn_size += size;
+	if (filtered_sample_type & PERF_SAMPLE_RAW) {
+		data->raw = NULL;
+		data->dyn_size += sizeof(u64);
+		data->sample_flags |= PERF_SAMPLE_RAW;
 	}
 
 	if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
@@ -10127,8 +10107,7 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
 	};
 
 	perf_sample_data_init(&data, 0, 0);
-	data.raw = &raw;
-	data.sample_flags |= PERF_SAMPLE_RAW;
+	perf_sample_save_raw_data(&data, &raw);
 
 	perf_trace_buf_update(record, event_type);
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3bbd3f0c810c..ad37608afc35 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -687,8 +687,7 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
 	}
 
 	perf_sample_data_init(sd, 0, 0);
-	sd->raw = &raw;
-	sd->sample_flags |= PERF_SAMPLE_RAW;
+	perf_sample_save_raw_data(sd, &raw);
 
 	err = __bpf_perf_event_output(regs, map, flags, sd);
 
@@ -746,8 +745,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
 
 	perf_fetch_caller_regs(regs);
 	perf_sample_data_init(sd, 0, 0);
-	sd->raw = &raw;
-	sd->sample_flags |= PERF_SAMPLE_RAW;
+	perf_sample_save_raw_data(sd, &raw);
 
 	ret = __bpf_perf_event_output(regs, map, flags, sd);
 out:
-- 
2.39.0.314.g84b9a713c41-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 8/8] perf/core: Call perf_prepare_sample() before running BPF
  2023-01-12 21:40 [PATCHSET 0/8] perf/core: Prepare sample data for BPF Namhyung Kim
  2023-01-12 21:40 ` [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper Namhyung Kim
@ 2023-01-12 21:40 ` Namhyung Kim
  2023-01-13 21:06   ` Song Liu
  2023-01-13 11:21 ` [PATCHSET 0/8] perf/core: Prepare sample data for BPF Peter Zijlstra
  2023-01-13 13:25 ` Jiri Olsa
  3 siblings, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2023-01-12 21:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, Song Liu, bpf

As BPF can access sample data, it needs to populate the data.  Also
remove the logic to get the callchain specifically as it's covered by
the perf_prepare_sample() now.

Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 kernel/events/core.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5c4f3fa3d2b7..af8365fb639a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10363,13 +10363,7 @@ static void bpf_overflow_handler(struct perf_event *event,
 	rcu_read_lock();
 	prog = READ_ONCE(event->prog);
 	if (prog) {
-		if (prog->call_get_stack &&
-		    (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
-		    !(data->sample_flags & PERF_SAMPLE_CALLCHAIN)) {
-			data->callchain = perf_callchain(event, regs);
-			data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
-		}
-
+		perf_prepare_sample(data, event, regs);
 		ret = bpf_prog_run(prog, &ctx);
 	}
 	rcu_read_unlock();
-- 
2.39.0.314.g84b9a713c41-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper
  2023-01-12 21:40 ` [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper Namhyung Kim
@ 2023-01-13 11:19   ` Peter Zijlstra
  2023-01-13 21:01   ` Song Liu
  1 sibling, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2023-01-13 11:19 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, Song Liu, linux-s390,
	x86, bpf

On Thu, Jan 12, 2023 at 01:40:10PM -0800, Namhyung Kim wrote:
> @@ -1182,6 +1187,29 @@ static inline void perf_sample_save_callchain(struct perf_sample_data *data,
>  	data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
>  }
>  
> +static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
> +					     struct perf_raw_record *raw)
> +{
> +	struct perf_raw_frag *frag = &raw->frag;
> +	u32 sum = 0;
> +	int size;
> +
> +	do {
> +		sum += frag->size;
> +		if (perf_raw_frag_last(frag))
> +			break;
> +		frag = frag->next;
> +	} while (1);
> +
> +	size = round_up(sum + sizeof(u32), sizeof(u64));
> +	raw->size = size - sizeof(u32);
> +	frag->pad = raw->size - sum;
> +
> +	data->raw = raw;
> +	data->dyn_size += size;
> +	data->sample_flags |= PERF_SAMPLE_RAW;
> +}

This might be a wee big for inline, but I suppose it doesn't matter too
much.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHSET 0/8] perf/core: Prepare sample data for BPF
  2023-01-12 21:40 [PATCHSET 0/8] perf/core: Prepare sample data for BPF Namhyung Kim
  2023-01-12 21:40 ` [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper Namhyung Kim
  2023-01-12 21:40 ` [PATCH 8/8] perf/core: Call perf_prepare_sample() before running BPF Namhyung Kim
@ 2023-01-13 11:21 ` Peter Zijlstra
  2023-01-13 13:25 ` Jiri Olsa
  3 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2023-01-13 11:21 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, Song Liu, bpf

On Thu, Jan 12, 2023 at 01:40:07PM -0800, Namhyung Kim wrote:

> Namhyung Kim (8):
>   perf/core: Save the dynamic parts of sample data size
>   perf/core: Add perf_sample_save_callchain() helper
>   perf/core: Add perf_sample_save_raw_data() helper
>   perf/core: Add perf_sample_save_brstack() helper
>   perf/core: Set data->sample_flags in perf_prepare_sample()
>   perf/core: Do not pass header for sample id init
>   perf/core: Introduce perf_prepare_header()
>   perf/core: Call perf_prepare_sample() before running BPF

Aside from a few small niggles, this looks really good, Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHSET 0/8] perf/core: Prepare sample data for BPF
  2023-01-12 21:40 [PATCHSET 0/8] perf/core: Prepare sample data for BPF Namhyung Kim
                   ` (2 preceding siblings ...)
  2023-01-13 11:21 ` [PATCHSET 0/8] perf/core: Prepare sample data for BPF Peter Zijlstra
@ 2023-01-13 13:25 ` Jiri Olsa
  2023-01-17  8:12   ` Namhyung Kim
  3 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2023-01-13 13:25 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, LKML, Song Liu, bpf

On Thu, Jan 12, 2023 at 01:40:07PM -0800, Namhyung Kim wrote:
> Hello,
> 
> The perf_prepare_sample() is to fill the perf sample data and update the
> header info before sending it to the ring buffer.  But we want to use it
> for BPF overflow handler so that it can access the sample data to filter
> relevant ones.
> 
> Changes in v2)
>  * the layout change is merged
>  * reduce branches using __cond_set  (PeterZ)
>  * add helpers to set dynamic sample data  (PeterZ)
>  * introduce perf_prepare_header()  (PeterZ)
>  * call perf_prepare_sample() before bpf_overflow_handler unconditionally
> 
> This means the perf_prepare_handler() can be called more than once.  To
> avoid duplicate work, use the data->sample_flags and save the data size.
> 
> I also added a few of helpers to set those information accordingly.
> But it looks some fields like REGS_USER, STACK_USER and AUX are saved in
> the perf_prepare_sample() so I didn't add the helpers for them.
> 
> After than we can just check the filtered_sample_type flags begin zero
> to determine if it has more work.  In that case, it needs to update the
> data->type since it's possible to miss when PMU driver sets all required
> sample flags before calling perf_prepare_sample().
> 
> The code is also available at 'perf/prepare-sample-v2' branch in
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> 
> Thanks,
> Namhyung
> 
> 
> Cc: Song Liu <song@kernel.org>
> Cc: bpf@vger.kernel.org
> 
> 
> Namhyung Kim (8):
>   perf/core: Save the dynamic parts of sample data size
>   perf/core: Add perf_sample_save_callchain() helper
>   perf/core: Add perf_sample_save_raw_data() helper
>   perf/core: Add perf_sample_save_brstack() helper
>   perf/core: Set data->sample_flags in perf_prepare_sample()
>   perf/core: Do not pass header for sample id init
>   perf/core: Introduce perf_prepare_header()
>   perf/core: Call perf_prepare_sample() before running BPF

lgtm, I ran the bpf selftests on top of that and it's ok

jirka

> 
>  arch/powerpc/perf/core-book3s.c    |   3 +-
>  arch/s390/kernel/perf_cpum_cf.c    |   4 +-
>  arch/s390/kernel/perf_cpum_sf.c    |   3 +-
>  arch/s390/kernel/perf_pai_crypto.c |   4 +-
>  arch/s390/kernel/perf_pai_ext.c    |   4 +-
>  arch/x86/events/amd/core.c         |   6 +-
>  arch/x86/events/amd/ibs.c          |   9 +-
>  arch/x86/events/intel/core.c       |   6 +-
>  arch/x86/events/intel/ds.c         |  24 ++--
>  include/linux/perf_event.h         | 133 +++++++++++++-----
>  kernel/events/core.c               | 207 ++++++++++++++++-------------
>  kernel/trace/bpf_trace.c           |   6 +-
>  12 files changed, 236 insertions(+), 173 deletions(-)
> 
> 
> base-commit: 9fcad995c6c52cc9791f7ee9f1386a5684055f9c
> -- 
> 2.39.0.314.g84b9a713c41-goog
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper
  2023-01-12 21:40 ` [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper Namhyung Kim
  2023-01-13 11:19   ` Peter Zijlstra
@ 2023-01-13 21:01   ` Song Liu
  2023-01-13 21:56     ` Namhyung Kim
  1 sibling, 1 reply; 11+ messages in thread
From: Song Liu @ 2023-01-13 21:01 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, linux-s390, x86, bpf

On Thu, Jan 12, 2023 at 1:40 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> When it saves the raw_data to the perf sample data, it needs to update
> the sample flags and the dynamic size.  To make sure this, add the
> perf_sample_save_raw_data() helper and convert all call sites.
>
> Cc: linux-s390@vger.kernel.org
> Cc: x86@kernel.org
> Cc: bpf@vger.kernel.org
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  arch/s390/kernel/perf_cpum_cf.c    |  4 +---
>  arch/s390/kernel/perf_pai_crypto.c |  4 +---
>  arch/s390/kernel/perf_pai_ext.c    |  4 +---
>  arch/x86/events/amd/ibs.c          |  3 +--
>  include/linux/perf_event.h         | 33 +++++++++++++++++++++++++-----
>  kernel/events/core.c               | 31 +++++-----------------------
>  kernel/trace/bpf_trace.c           |  6 ++----
>  7 files changed, 39 insertions(+), 46 deletions(-)
>
> diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
> index f043a7ff220b..aa38649c7c27 100644
> --- a/arch/s390/kernel/perf_cpum_cf.c
> +++ b/arch/s390/kernel/perf_cpum_cf.c
> @@ -662,9 +662,7 @@ static int cfdiag_push_sample(struct perf_event *event,
>         if (event->attr.sample_type & PERF_SAMPLE_RAW) {
>                 raw.frag.size = cpuhw->usedss;
>                 raw.frag.data = cpuhw->stop;
> -               raw.size = raw.frag.size;
> -               data.raw = &raw;
> -               data.sample_flags |= PERF_SAMPLE_RAW;
> +               perf_sample_save_raw_data(&data, &raw);
>         }
>
>         overflow = perf_event_overflow(event, &data, &regs);
> diff --git a/arch/s390/kernel/perf_pai_crypto.c b/arch/s390/kernel/perf_pai_crypto.c
> index 985e243a2ed8..a7b339c4fd7c 100644
> --- a/arch/s390/kernel/perf_pai_crypto.c
> +++ b/arch/s390/kernel/perf_pai_crypto.c
> @@ -362,9 +362,7 @@ static int paicrypt_push_sample(void)
>         if (event->attr.sample_type & PERF_SAMPLE_RAW) {
>                 raw.frag.size = rawsize;
>                 raw.frag.data = cpump->save;
> -               raw.size = raw.frag.size;
> -               data.raw = &raw;
> -               data.sample_flags |= PERF_SAMPLE_RAW;
> +               perf_sample_save_raw_data(&data, &raw);
>         }
>
>         overflow = perf_event_overflow(event, &data, &regs);
> diff --git a/arch/s390/kernel/perf_pai_ext.c b/arch/s390/kernel/perf_pai_ext.c
> index 1138f57baae3..555597222bad 100644
> --- a/arch/s390/kernel/perf_pai_ext.c
> +++ b/arch/s390/kernel/perf_pai_ext.c
> @@ -451,9 +451,7 @@ static int paiext_push_sample(void)
>         if (event->attr.sample_type & PERF_SAMPLE_RAW) {
>                 raw.frag.size = rawsize;
>                 raw.frag.data = cpump->save;
> -               raw.size = raw.frag.size;
> -               data.raw = &raw;
> -               data.sample_flags |= PERF_SAMPLE_RAW;
> +               perf_sample_save_raw_data(&data, &raw);
>         }
>
>         overflow = perf_event_overflow(event, &data, &regs);
> diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
> index 417c80bd3274..64582954b5f6 100644
> --- a/arch/x86/events/amd/ibs.c
> +++ b/arch/x86/events/amd/ibs.c
> @@ -1110,8 +1110,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
>                                 .data = ibs_data.data,
>                         },
>                 };
> -               data.raw = &raw;
> -               data.sample_flags |= PERF_SAMPLE_RAW;
> +               perf_sample_save_raw_data(&data, &raw);
>         }
>
>         if (perf_ibs == &perf_ibs_op)
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index a9419608402b..569dfac5887f 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -95,6 +95,11 @@ struct perf_raw_record {
>         u32                             size;
>  };
>
> +static __always_inline bool perf_raw_frag_last(const struct perf_raw_frag *frag)
> +{
> +       return frag->pad < sizeof(u64);
> +}
> +
>  /*
>   * branch stack layout:
>   *  nr: number of taken branches stored in entries[]
> @@ -1182,6 +1187,29 @@ static inline void perf_sample_save_callchain(struct perf_sample_data *data,
>         data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
>  }
>
> +static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
> +                                            struct perf_raw_record *raw)
> +{
> +       struct perf_raw_frag *frag = &raw->frag;
> +       u32 sum = 0;
> +       int size;
> +
> +       do {
> +               sum += frag->size;
> +               if (perf_raw_frag_last(frag))
> +                       break;
> +               frag = frag->next;
> +       } while (1);
> +
> +       size = round_up(sum + sizeof(u32), sizeof(u64));
> +       raw->size = size - sizeof(u32);
> +       frag->pad = raw->size - sum;
> +
> +       data->raw = raw;
> +       data->dyn_size += size;
> +       data->sample_flags |= PERF_SAMPLE_RAW;
> +}
> +
>  /*
>   * Clear all bitfields in the perf_branch_entry.
>   * The to and from fields are not cleared because they are
> @@ -1690,11 +1718,6 @@ extern void perf_restore_debug_store(void);
>  static inline void perf_restore_debug_store(void)                      { }
>  #endif
>
> -static __always_inline bool perf_raw_frag_last(const struct perf_raw_frag *frag)
> -{
> -       return frag->pad < sizeof(u64);
> -}
> -
>  #define perf_output_put(handle, x) perf_output_copy((handle), &(x), sizeof(x))
>
>  struct perf_pmu_events_attr {
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 0fba98b9cd65..133894ae5e30 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7588,30 +7588,10 @@ void perf_prepare_sample(struct perf_event_header *header,
>         if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
>                 perf_sample_save_callchain(data, event, regs);
>
> -       if (sample_type & PERF_SAMPLE_RAW) {
> -               struct perf_raw_record *raw = data->raw;
> -               int size;
> -
> -               if (raw && (data->sample_flags & PERF_SAMPLE_RAW)) {
> -                       struct perf_raw_frag *frag = &raw->frag;
> -                       u32 sum = 0;
> -
> -                       do {
> -                               sum += frag->size;
> -                               if (perf_raw_frag_last(frag))
> -                                       break;
> -                               frag = frag->next;
> -                       } while (1);
> -
> -                       size = round_up(sum + sizeof(u32), sizeof(u64));
> -                       raw->size = size - sizeof(u32);
> -                       frag->pad = raw->size - sum;
> -               } else {
> -                       size = sizeof(u64);
> -                       data->raw = NULL;
> -               }
> -
> -               data->dyn_size += size;
> +       if (filtered_sample_type & PERF_SAMPLE_RAW) {
> +               data->raw = NULL;
> +               data->dyn_size += sizeof(u64);
> +               data->sample_flags |= PERF_SAMPLE_RAW;
>         }

I don't quite follow this change, and the commit log doesn't seem
to cover this part.

>
>         if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
> @@ -10127,8 +10107,7 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
>         };
>
>         perf_sample_data_init(&data, 0, 0);
> -       data.raw = &raw;
> -       data.sample_flags |= PERF_SAMPLE_RAW;
> +       perf_sample_save_raw_data(&data, &raw);
>
>         perf_trace_buf_update(record, event_type);
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 3bbd3f0c810c..ad37608afc35 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -687,8 +687,7 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
>         }
>
>         perf_sample_data_init(sd, 0, 0);
> -       sd->raw = &raw;
> -       sd->sample_flags |= PERF_SAMPLE_RAW;
> +       perf_sample_save_raw_data(sd, &raw);
>
>         err = __bpf_perf_event_output(regs, map, flags, sd);
>
> @@ -746,8 +745,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
>
>         perf_fetch_caller_regs(regs);
>         perf_sample_data_init(sd, 0, 0);
> -       sd->raw = &raw;
> -       sd->sample_flags |= PERF_SAMPLE_RAW;
> +       perf_sample_save_raw_data(sd, &raw);
>
>         ret = __bpf_perf_event_output(regs, map, flags, sd);
>  out:
> --
> 2.39.0.314.g84b9a713c41-goog
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 8/8] perf/core: Call perf_prepare_sample() before running BPF
  2023-01-12 21:40 ` [PATCH 8/8] perf/core: Call perf_prepare_sample() before running BPF Namhyung Kim
@ 2023-01-13 21:06   ` Song Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Song Liu @ 2023-01-13 21:06 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, bpf

On Thu, Jan 12, 2023 at 1:40 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> As BPF can access sample data, it needs to populate the data.  Also
> remove the logic to get the callchain specifically as it's covered by
> the perf_prepare_sample() now.
>
> Cc: Song Liu <song@kernel.org>
> Cc: bpf@vger.kernel.org
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Acked-by: Song Liu <song@kernel.org>

> ---
>  kernel/events/core.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 5c4f3fa3d2b7..af8365fb639a 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10363,13 +10363,7 @@ static void bpf_overflow_handler(struct perf_event *event,
>         rcu_read_lock();
>         prog = READ_ONCE(event->prog);
>         if (prog) {
> -               if (prog->call_get_stack &&
> -                   (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
> -                   !(data->sample_flags & PERF_SAMPLE_CALLCHAIN)) {
> -                       data->callchain = perf_callchain(event, regs);
> -                       data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
> -               }
> -
> +               perf_prepare_sample(data, event, regs);
>                 ret = bpf_prog_run(prog, &ctx);
>         }
>         rcu_read_unlock();
> --
> 2.39.0.314.g84b9a713c41-goog
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper
  2023-01-13 21:01   ` Song Liu
@ 2023-01-13 21:56     ` Namhyung Kim
  2023-01-13 22:57       ` Song Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2023-01-13 21:56 UTC (permalink / raw)
  To: Song Liu
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, linux-s390, x86, bpf

Hi Song,

On Fri, Jan 13, 2023 at 1:01 PM Song Liu <song@kernel.org> wrote:
>
> On Thu, Jan 12, 2023 at 1:40 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > When it saves the raw_data to the perf sample data, it needs to update
> > the sample flags and the dynamic size.  To make sure this, add the
> > perf_sample_save_raw_data() helper and convert all call sites.
> >
> > Cc: linux-s390@vger.kernel.org
> > Cc: x86@kernel.org
> > Cc: bpf@vger.kernel.org
> > Suggested-by: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---

[SNIP]
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 0fba98b9cd65..133894ae5e30 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -7588,30 +7588,10 @@ void perf_prepare_sample(struct perf_event_header *header,
> >         if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
> >                 perf_sample_save_callchain(data, event, regs);
> >
> > -       if (sample_type & PERF_SAMPLE_RAW) {
> > -               struct perf_raw_record *raw = data->raw;
> > -               int size;
> > -
> > -               if (raw && (data->sample_flags & PERF_SAMPLE_RAW)) {
> > -                       struct perf_raw_frag *frag = &raw->frag;
> > -                       u32 sum = 0;
> > -
> > -                       do {
> > -                               sum += frag->size;
> > -                               if (perf_raw_frag_last(frag))
> > -                                       break;
> > -                               frag = frag->next;
> > -                       } while (1);
> > -
> > -                       size = round_up(sum + sizeof(u32), sizeof(u64));
> > -                       raw->size = size - sizeof(u32);
> > -                       frag->pad = raw->size - sum;
> > -               } else {
> > -                       size = sizeof(u64);
> > -                       data->raw = NULL;
> > -               }
> > -
> > -               data->dyn_size += size;
> > +       if (filtered_sample_type & PERF_SAMPLE_RAW) {
> > +               data->raw = NULL;
> > +               data->dyn_size += sizeof(u64);
> > +               data->sample_flags |= PERF_SAMPLE_RAW;
> >         }
>
> I don't quite follow this change, and the commit log doesn't seem
> to cover this part.

It's for when the user requested RAW but no actual data.
It assumes PMU drivers call perf_sample_save_raw_data()
before perf_prepare_sample() if there's RAW data.
So we can only handle the 'else' part in the original code.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper
  2023-01-13 21:56     ` Namhyung Kim
@ 2023-01-13 22:57       ` Song Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Song Liu @ 2023-01-13 22:57 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, LKML, linux-s390, x86, bpf

On Fri, Jan 13, 2023 at 1:56 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Song,
>
> On Fri, Jan 13, 2023 at 1:01 PM Song Liu <song@kernel.org> wrote:
> >
> > On Thu, Jan 12, 2023 at 1:40 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > When it saves the raw_data to the perf sample data, it needs to update
> > > the sample flags and the dynamic size.  To make sure this, add the
> > > perf_sample_save_raw_data() helper and convert all call sites.
> > >
> > > Cc: linux-s390@vger.kernel.org
> > > Cc: x86@kernel.org
> > > Cc: bpf@vger.kernel.org
> > > Suggested-by: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > ---
>
> [SNIP]
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index 0fba98b9cd65..133894ae5e30 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -7588,30 +7588,10 @@ void perf_prepare_sample(struct perf_event_header *header,
> > >         if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
> > >                 perf_sample_save_callchain(data, event, regs);
> > >
> > > -       if (sample_type & PERF_SAMPLE_RAW) {
> > > -               struct perf_raw_record *raw = data->raw;
> > > -               int size;
> > > -
> > > -               if (raw && (data->sample_flags & PERF_SAMPLE_RAW)) {
> > > -                       struct perf_raw_frag *frag = &raw->frag;
> > > -                       u32 sum = 0;
> > > -
> > > -                       do {
> > > -                               sum += frag->size;
> > > -                               if (perf_raw_frag_last(frag))
> > > -                                       break;
> > > -                               frag = frag->next;
> > > -                       } while (1);
> > > -
> > > -                       size = round_up(sum + sizeof(u32), sizeof(u64));
> > > -                       raw->size = size - sizeof(u32);
> > > -                       frag->pad = raw->size - sum;
> > > -               } else {
> > > -                       size = sizeof(u64);
> > > -                       data->raw = NULL;
> > > -               }
> > > -
> > > -               data->dyn_size += size;
> > > +       if (filtered_sample_type & PERF_SAMPLE_RAW) {
> > > +               data->raw = NULL;
> > > +               data->dyn_size += sizeof(u64);
> > > +               data->sample_flags |= PERF_SAMPLE_RAW;
> > >         }
> >
> > I don't quite follow this change, and the commit log doesn't seem
> > to cover this part.
>
> It's for when the user requested RAW but no actual data.
> It assumes PMU drivers call perf_sample_save_raw_data()
> before perf_prepare_sample() if there's RAW data.
> So we can only handle the 'else' part in the original code.

Got it. Thanks for the explanation.

Song

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHSET 0/8] perf/core: Prepare sample data for BPF
  2023-01-13 13:25 ` Jiri Olsa
@ 2023-01-17  8:12   ` Namhyung Kim
  0 siblings, 0 replies; 11+ messages in thread
From: Namhyung Kim @ 2023-01-17  8:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Arnaldo Carvalho de Melo, LKML, Song Liu, bpf

Hi Jiri,

On Fri, Jan 13, 2023 at 5:26 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Thu, Jan 12, 2023 at 01:40:07PM -0800, Namhyung Kim wrote:
> > Hello,
> >
> > The perf_prepare_sample() is to fill the perf sample data and update the
> > header info before sending it to the ring buffer.  But we want to use it
> > for BPF overflow handler so that it can access the sample data to filter
> > relevant ones.
> >
> > Changes in v2)
> >  * the layout change is merged
> >  * reduce branches using __cond_set  (PeterZ)
> >  * add helpers to set dynamic sample data  (PeterZ)
> >  * introduce perf_prepare_header()  (PeterZ)
> >  * call perf_prepare_sample() before bpf_overflow_handler unconditionally
> >
> > This means the perf_prepare_handler() can be called more than once.  To
> > avoid duplicate work, use the data->sample_flags and save the data size.
> >
> > I also added a few of helpers to set those information accordingly.
> > But it looks some fields like REGS_USER, STACK_USER and AUX are saved in
> > the perf_prepare_sample() so I didn't add the helpers for them.
> >
> > After than we can just check the filtered_sample_type flags begin zero
> > to determine if it has more work.  In that case, it needs to update the
> > data->type since it's possible to miss when PMU driver sets all required
> > sample flags before calling perf_prepare_sample().
> >
> > The code is also available at 'perf/prepare-sample-v2' branch in
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> >
> > Thanks,
> > Namhyung
> >
> >
> > Cc: Song Liu <song@kernel.org>
> > Cc: bpf@vger.kernel.org
> >
> >
> > Namhyung Kim (8):
> >   perf/core: Save the dynamic parts of sample data size
> >   perf/core: Add perf_sample_save_callchain() helper
> >   perf/core: Add perf_sample_save_raw_data() helper
> >   perf/core: Add perf_sample_save_brstack() helper
> >   perf/core: Set data->sample_flags in perf_prepare_sample()
> >   perf/core: Do not pass header for sample id init
> >   perf/core: Introduce perf_prepare_header()
> >   perf/core: Call perf_prepare_sample() before running BPF
>
> lgtm, I ran the bpf selftests on top of that and it's ok

Thanks for your review.  I'll add your Acked-by and Tested-by
in the v3. :)

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-01-17  8:13 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-01-12 21:40 [PATCHSET 0/8] perf/core: Prepare sample data for BPF Namhyung Kim
2023-01-12 21:40 ` [PATCH 3/8] perf/core: Add perf_sample_save_raw_data() helper Namhyung Kim
2023-01-13 11:19   ` Peter Zijlstra
2023-01-13 21:01   ` Song Liu
2023-01-13 21:56     ` Namhyung Kim
2023-01-13 22:57       ` Song Liu
2023-01-12 21:40 ` [PATCH 8/8] perf/core: Call perf_prepare_sample() before running BPF Namhyung Kim
2023-01-13 21:06   ` Song Liu
2023-01-13 11:21 ` [PATCHSET 0/8] perf/core: Prepare sample data for BPF Peter Zijlstra
2023-01-13 13:25 ` Jiri Olsa
2023-01-17  8:12   ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox