linux-perf-users.vger.kernel.org archive mirror
* [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2)
@ 2024-09-17 22:28 Namhyung Kim
  2024-09-17 22:28 ` [PATCH 1/5] perf tools: Sync UAPI perf_event.h header Namhyung Kim
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-09-17 22:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Josh Poimboeuf, Steven Rostedt,
	Mathieu Desnoyers, Indu Bhagat, linux-toolchains

Hello,

This is the perf tools counterpart to Josh's kernel change v2 [1], which
adds support for deferred user callchains.  The change is meant to be
transparent: users should not notice any difference with it.

  $ perf record -g sleep 1

I added a --[no-]merge-callchains option to control the output of perf
script.  You can verify that the data has deferred callchains like this:

  $ perf script --no-merge-callchains
  perf     801 [000]    18.031793:          1 cycles:P:
          ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
          ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
          ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
          ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
          ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
          ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
          ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
          ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
          ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
          ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
          ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
          ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
          ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])

  perf     801 [000]    18.031814: DEFERRED CALLCHAIN
                  7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)

  ...

When the callchains are merged (the default), the output looks like this:

  $ perf script
  perf     801 [000]    18.031793:          1 cycles:P:
          ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
          ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
          ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
          ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
          ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
          ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
          ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
          ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
          ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
          ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
          ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
          ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
          ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
                  7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)

  ...

Notice the last line: __GI___ioctl now appears in the same callchain as
the kernel entries.  It should work with other tools like perf report too.

The code is available on the 'perf/defer-callchain-v2' branch at
https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung

[1] https://lore.kernel.org/lkml/cover.1726268190.git.jpoimboe@kernel.org


Namhyung Kim (5):
  perf tools: Sync UAPI perf_event.h header
  perf tools: Minimal DEFERRED_CALLCHAIN support
  perf record: Enable defer_callchain for user callchains
  perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED
  perf tools: Merge deferred user callchains

 tools/include/uapi/linux/perf_event.h     | 21 +++++-
 tools/lib/perf/include/perf/event.h       |  7 ++
 tools/perf/Documentation/perf-script.txt  |  5 ++
 tools/perf/builtin-script.c               | 92 +++++++++++++++++++++++
 tools/perf/util/callchain.c               | 24 ++++++
 tools/perf/util/callchain.h               |  3 +
 tools/perf/util/event.c                   |  1 +
 tools/perf/util/evlist.c                  |  1 +
 tools/perf/util/evlist.h                  |  1 +
 tools/perf/util/evsel.c                   | 32 +++++++-
 tools/perf/util/evsel.h                   |  1 +
 tools/perf/util/machine.c                 |  1 +
 tools/perf/util/perf_event_attr_fprintf.c |  1 +
 tools/perf/util/sample.h                  |  3 +-
 tools/perf/util/session.c                 | 78 +++++++++++++++++++
 tools/perf/util/tool.c                    |  2 +
 tools/perf/util/tool.h                    |  4 +-
 17 files changed, 273 insertions(+), 4 deletions(-)

-- 
2.46.0.792.g87dc391469-goog



* [PATCH 1/5] perf tools: Sync UAPI perf_event.h header
  2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
@ 2024-09-17 22:28 ` Namhyung Kim
  2024-09-17 22:28 ` [PATCH 2/5] perf tools: Minimal DEFERRED_CALLCHAIN support Namhyung Kim
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-09-17 22:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Josh Poimboeuf, Steven Rostedt,
	Mathieu Desnoyers, Indu Bhagat, linux-toolchains

Sync the UAPI header with the kernel to import the defer_callchain changes.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/include/uapi/linux/perf_event.h | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 4842c36fdf801996..a7f875eb29dd049a 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -460,7 +460,8 @@ struct perf_event_attr {
 				inherit_thread :  1, /* children only inherit if cloned with CLONE_THREAD */
 				remove_on_exec :  1, /* event is removed from task on exec */
 				sigtrap        :  1, /* send synchronous SIGTRAP on event */
-				__reserved_1   : 26;
+				defer_callchain:  1, /* generate PERF_RECORD_CALLCHAIN_DEFERRED records */
+				__reserved_1   : 25;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -1217,6 +1218,23 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_AUX_OUTPUT_HW_ID		= 21,
 
+	/*
+	 * This user callchain capture was deferred until shortly before
+	 * returning to user space.  Previous samples would have kernel
+	 * callchains only and they need to be stitched with this to make full
+	 * callchains.
+	 *
+	 * TODO: do PERF_SAMPLE_{REGS,STACK}_USER also need deferral?
+	 *
+	 * struct {
+	 *	struct perf_event_header	header;
+	 *	u64				nr;
+	 *	u64				ips[nr];
+	 *	struct sample_id		sample_id;
+	 * };
+	 */
+	PERF_RECORD_CALLCHAIN_DEFERRED		= 22,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
 
@@ -1247,6 +1265,7 @@ enum perf_callchain_context {
 	PERF_CONTEXT_HV			= (__u64)-32,
 	PERF_CONTEXT_KERNEL		= (__u64)-128,
 	PERF_CONTEXT_USER		= (__u64)-512,
+	PERF_CONTEXT_USER_DEFERRED	= (__u64)-640,
 
 	PERF_CONTEXT_GUEST		= (__u64)-2048,
 	PERF_CONTEXT_GUEST_KERNEL	= (__u64)-2176,
-- 
2.46.0.792.g87dc391469-goog



* [PATCH 2/5] perf tools: Minimal DEFERRED_CALLCHAIN support
  2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
  2024-09-17 22:28 ` [PATCH 1/5] perf tools: Sync UAPI perf_event.h header Namhyung Kim
@ 2024-09-17 22:28 ` Namhyung Kim
  2024-09-17 22:28 ` [PATCH 3/5] perf record: Enable defer_callchain for user callchains Namhyung Kim
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-09-17 22:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Josh Poimboeuf, Steven Rostedt,
	Mathieu Desnoyers, Indu Bhagat, linux-toolchains

Add a new event type for deferred callchains and a new callback in
struct perf_tool.  For now this doesn't actually handle the deferred
callchains; it only marks the sample if the callchain array ends with
PERF_CONTEXT_USER_DEFERRED.

With this change, perf report can at least dump the raw data.  Recording
such data requires the next commit, which enables attr.defer_callchain,
but if you already have a data file, it shows the following result.

  $ perf report -D
  ...
  0x5fe0@perf.data [0x40]: event: 22
  .
  . ... raw event: size 64 bytes
  .  0000:  16 00 00 00 02 00 40 00 02 00 00 00 00 00 00 00  ......@.........
  .  0010:  00 fe ff ff ff ff ff ff 4b d3 3f 25 45 7f 00 00  ........K.?%E...
  .  0020:  21 03 00 00 21 03 00 00 43 02 12 ab 05 00 00 00  !...!...C.......
  .  0030:  00 00 00 00 00 00 00 00 09 00 00 00 00 00 00 00  ................

  0 24344920643 0x5fe0 [0x40]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2): 801/801: 0
  ... FP chain: nr:2
  .....  0: fffffffffffffe00
  .....  1: 00007f45253fd34b
  : unhandled!

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/include/perf/event.h       |  7 +++++++
 tools/perf/util/event.c                   |  1 +
 tools/perf/util/evsel.c                   | 15 +++++++++++++++
 tools/perf/util/machine.c                 |  1 +
 tools/perf/util/perf_event_attr_fprintf.c |  1 +
 tools/perf/util/sample.h                  |  3 ++-
 tools/perf/util/session.c                 | 17 +++++++++++++++++
 tools/perf/util/tool.c                    |  1 +
 tools/perf/util/tool.h                    |  3 ++-
 9 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
index 37bb7771d9143466..f643a6a2b9fc2279 100644
--- a/tools/lib/perf/include/perf/event.h
+++ b/tools/lib/perf/include/perf/event.h
@@ -151,6 +151,12 @@ struct perf_record_switch {
 	__u32			 next_prev_tid;
 };
 
+struct perf_record_callchain_deferred {
+	struct perf_event_header header;
+	__u64			 nr;
+	__u64			 ips[];
+};
+
 struct perf_record_header_attr {
 	struct perf_event_header header;
 	struct perf_event_attr	 attr;
@@ -494,6 +500,7 @@ union perf_event {
 	struct perf_record_read			read;
 	struct perf_record_throttle		throttle;
 	struct perf_record_sample		sample;
+	struct perf_record_callchain_deferred	callchain_deferred;
 	struct perf_record_bpf_event		bpf;
 	struct perf_record_ksymbol		ksymbol;
 	struct perf_record_text_poke_event	text_poke;
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index aac96d5d19170091..8cdec373db44deac 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -58,6 +58,7 @@ static const char *perf_event__names[] = {
 	[PERF_RECORD_CGROUP]			= "CGROUP",
 	[PERF_RECORD_TEXT_POKE]			= "TEXT_POKE",
 	[PERF_RECORD_AUX_OUTPUT_HW_ID]		= "AUX_OUTPUT_HW_ID",
+	[PERF_RECORD_CALLCHAIN_DEFERRED]	= "CALLCHAIN_DEFERRED",
 	[PERF_RECORD_HEADER_ATTR]		= "ATTR",
 	[PERF_RECORD_HEADER_EVENT_TYPE]		= "EVENT_TYPE",
 	[PERF_RECORD_HEADER_TRACING_DATA]	= "TRACING_DATA",
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dbf9c8cee3c5658f..701092d6b1b64124 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2676,6 +2676,18 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 	data->data_src = PERF_MEM_DATA_SRC_NONE;
 	data->vcpu = -1;
 
+	if (event->header.type == PERF_RECORD_CALLCHAIN_DEFERRED) {
+		const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);
+
+		data->callchain = (struct ip_callchain *)&event->callchain_deferred.nr;
+		if (data->callchain->nr > max_callchain_nr)
+			return -EFAULT;
+
+		if (evsel->core.attr.sample_id_all)
+			perf_evsel__parse_id_sample(evsel, event, data);
+		return 0;
+	}
+
 	if (event->header.type != PERF_RECORD_SAMPLE) {
 		if (!evsel->core.attr.sample_id_all)
 			return 0;
@@ -2806,6 +2818,9 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 		if (data->callchain->nr > max_callchain_nr)
 			return -EFAULT;
 		sz = data->callchain->nr * sizeof(u64);
+		if (evsel->core.attr.defer_callchain && data->callchain->nr >= 1 &&
+		    data->callchain->ips[data->callchain->nr - 1] == PERF_CONTEXT_USER_DEFERRED)
+			data->deferred_callchain = true;
 		OVERFLOW_CHECK(array, sz, max_size);
 		array = (void *)array + sz;
 	}
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index fad227b625d155c5..f367577c91ffa016 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2085,6 +2085,7 @@ static int add_callchain_ip(struct thread *thread,
 				*cpumode = PERF_RECORD_MISC_KERNEL;
 				break;
 			case PERF_CONTEXT_USER:
+			case PERF_CONTEXT_USER_DEFERRED:
 				*cpumode = PERF_RECORD_MISC_USER;
 				break;
 			default:
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index 59fbbba796974058..113845b35110262a 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -321,6 +321,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(inherit_thread, p_unsigned);
 	PRINT_ATTRf(remove_on_exec, p_unsigned);
 	PRINT_ATTRf(sigtrap, p_unsigned);
+	PRINT_ATTRf(defer_callchain, p_unsigned);
 
 	PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned, false);
 	PRINT_ATTRf(bp_type, p_unsigned);
diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
index 70b2c3135555ec26..010659dc80f88652 100644
--- a/tools/perf/util/sample.h
+++ b/tools/perf/util/sample.h
@@ -108,7 +108,8 @@ struct perf_sample {
 		u16 p_stage_cyc;
 		u16 retire_lat;
 	};
-	bool no_hw_idx;		/* No hw_idx collected in branch_stack */
+	bool no_hw_idx;			/* No hw_idx collected in branch_stack */
+	bool deferred_callchain;	/* Has deferred user callchains */
 	char insn[MAX_INSN];
 	void *raw_data;
 	struct ip_callchain *callchain;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index dbaf07bf6c5fb88c..1248a0317a2f164a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -714,6 +714,7 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_CGROUP]		  = perf_event__cgroup_swap,
 	[PERF_RECORD_TEXT_POKE]		  = perf_event__text_poke_swap,
 	[PERF_RECORD_AUX_OUTPUT_HW_ID]	  = perf_event__all64_swap,
+	[PERF_RECORD_CALLCHAIN_DEFERRED]  = perf_event__all64_swap,
 	[PERF_RECORD_HEADER_ATTR]	  = perf_event__hdr_attr_swap,
 	[PERF_RECORD_HEADER_EVENT_TYPE]	  = perf_event__event_type_swap,
 	[PERF_RECORD_HEADER_TRACING_DATA] = perf_event__tracing_data_swap,
@@ -1107,6 +1108,19 @@ static void dump_sample(struct evsel *evsel, union perf_event *event,
 		sample_read__printf(sample, evsel->core.attr.read_format);
 }
 
+static void dump_deferred_callchain(struct evsel *evsel, union perf_event *event,
+				    struct perf_sample *sample)
+{
+	if (!dump_trace)
+		return;
+
+	printf("(IP, 0x%x): %d/%d: %#" PRIx64 "\n",
+	       event->header.misc, sample->pid, sample->tid, sample->ip);
+
+	if (evsel__has_callchain(evsel))
+		callchain__printf(evsel, sample);
+}
+
 static void dump_read(struct evsel *evsel, union perf_event *event)
 {
 	struct perf_record_read *read_event = &event->read;
@@ -1327,6 +1341,9 @@ static int machines__deliver_event(struct machines *machines,
 		return tool->text_poke(tool, event, sample, machine);
 	case PERF_RECORD_AUX_OUTPUT_HW_ID:
 		return tool->aux_output_hw_id(tool, event, sample, machine);
+	case PERF_RECORD_CALLCHAIN_DEFERRED:
+		dump_deferred_callchain(evsel, event, sample);
+		return tool->callchain_deferred(tool, event, sample, evsel, machine);
 	default:
 		++evlist->stats.nr_unknown_events;
 		return -1;
diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c
index 3b7f390f26eb427d..e78f16de912ed9e2 100644
--- a/tools/perf/util/tool.c
+++ b/tools/perf/util/tool.c
@@ -259,6 +259,7 @@ void perf_tool__init(struct perf_tool *tool, bool ordered_events)
 	tool->read = process_event_sample_stub;
 	tool->throttle = process_event_stub;
 	tool->unthrottle = process_event_stub;
+	tool->callchain_deferred = process_event_sample_stub;
 	tool->attr = process_event_synth_attr_stub;
 	tool->event_update = process_event_synth_event_update_stub;
 	tool->tracing_data = process_event_synth_tracing_data_stub;
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index db1c7642b0d1564d..9987bbde6d5e0565 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -42,7 +42,8 @@ enum show_feature_header {
 
 struct perf_tool {
 	event_sample	sample,
-			read;
+			read,
+			callchain_deferred;
 	event_op	mmap,
 			mmap2,
 			comm,
-- 
2.46.0.792.g87dc391469-goog



* [PATCH 3/5] perf record: Enable defer_callchain for user callchains
  2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
  2024-09-17 22:28 ` [PATCH 1/5] perf tools: Sync UAPI perf_event.h header Namhyung Kim
  2024-09-17 22:28 ` [PATCH 2/5] perf tools: Minimal DEFERRED_CALLCHAIN support Namhyung Kim
@ 2024-09-17 22:28 ` Namhyung Kim
  2024-09-17 22:28 ` [PATCH 4/5] perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED Namhyung Kim
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-09-17 22:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Josh Poimboeuf, Steven Rostedt,
	Mathieu Desnoyers, Indu Bhagat, linux-toolchains

Set attr.defer_callchain when recording frame-pointer user callchains,
and add the missing feature detection logic to clear the flag on old
kernels.

  $ perf record -g -vv true
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    size                             136
    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CALLCHAIN|PERIOD
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    sample_id_all                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
    defer_callchain                  1
  ------------------------------------------------------------
  sys_perf_event_open: pid 162755  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off deferred callchain support

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evsel.c | 17 ++++++++++++++++-
 tools/perf/util/evsel.h |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 701092d6b1b64124..ad89644b32f23035 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -912,6 +912,14 @@ static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *o
 		}
 	}
 
+	if (param->record_mode == CALLCHAIN_FP && !attr->exclude_callchain_user) {
+		/*
+		 * Enable deferred callchains optimistically.  It'll be switched
+		 * off later if the kernel doesn't support it.
+		 */
+		attr->defer_callchain = 1;
+	}
+
 	if (function) {
 		pr_info("Disabling user space callchains for function trace event.\n");
 		attr->exclude_callchain_user = 1;
@@ -2089,6 +2097,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 
 static void evsel__disable_missing_features(struct evsel *evsel)
 {
+	if (perf_missing_features.defer_callchain)
+		evsel->core.attr.defer_callchain = 0;
 	if (perf_missing_features.branch_counters)
 		evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_COUNTERS;
 	if (perf_missing_features.read_lost)
@@ -2144,7 +2154,12 @@ bool evsel__detect_missing_features(struct evsel *evsel)
 	 * Must probe features in the order they were added to the
 	 * perf_event_attr interface.
 	 */
-	if (!perf_missing_features.branch_counters &&
+	if (!perf_missing_features.defer_callchain &&
+	    evsel->core.attr.defer_callchain) {
+		perf_missing_features.defer_callchain = true;
+		pr_debug2("switching off deferred callchain support\n");
+		return true;
+	} else if (!perf_missing_features.branch_counters &&
 	    (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS)) {
 		perf_missing_features.branch_counters = true;
 		pr_debug2("switching off branch counters support\n");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 15e745a9a798fa29..f0a1e1d789420a94 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -221,6 +221,7 @@ struct perf_missing_features {
 	bool weight_struct;
 	bool read_lost;
 	bool branch_counters;
+	bool defer_callchain;
 };
 
 extern struct perf_missing_features perf_missing_features;
-- 
2.46.0.792.g87dc391469-goog



* [PATCH 4/5] perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED
  2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
                   ` (2 preceding siblings ...)
  2024-09-17 22:28 ` [PATCH 3/5] perf record: Enable defer_callchain for user callchains Namhyung Kim
@ 2024-09-17 22:28 ` Namhyung Kim
  2024-09-17 22:28 ` [PATCH 5/5] perf tools: Merge deferred user callchains Namhyung Kim
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-09-17 22:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Josh Poimboeuf, Steven Rostedt,
	Mathieu Desnoyers, Indu Bhagat, linux-toolchains

Handle the deferred callchains in the script output.

  $ perf script
  perf     801 [000]    18.031793:          1 cycles:P:
          ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
          ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
          ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
          ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
          ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
          ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
          ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
          ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
          ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
          ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
          ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
          ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
          ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])

  perf     801 [000]    18.031814: DEFERRED CALLCHAIN
              7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-script.c | 89 +++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index a644787fa9e1dc25..311580e25f5b2008 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2540,6 +2540,93 @@ static int process_sample_event(const struct perf_tool *tool,
 	return ret;
 }
 
+static int process_deferred_sample_event(const struct perf_tool *tool,
+					 union perf_event *event,
+					 struct perf_sample *sample,
+					 struct evsel *evsel,
+					 struct machine *machine)
+{
+	struct perf_script *scr = container_of(tool, struct perf_script, tool);
+	struct perf_event_attr *attr = &evsel->core.attr;
+	struct evsel_script *es = evsel->priv;
+	unsigned int type = output_type(attr->type);
+	struct addr_location al;
+	FILE *fp = es->fp;
+	int ret = 0;
+
+	if (output[type].fields == 0)
+		return 0;
+
+	/* Set thread to NULL to indicate al is not initialized */
+	addr_location__init(&al);
+
+	if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num,
+					  sample->time)) {
+		goto out_put;
+	}
+
+	if (debug_mode) {
+		if (sample->time < last_timestamp) {
+			pr_err("Samples misordered, previous: %" PRIu64
+				" this: %" PRIu64 "\n", last_timestamp,
+				sample->time);
+			nr_unordered++;
+		}
+		last_timestamp = sample->time;
+		goto out_put;
+	}
+
+	if (filter_cpu(sample))
+		goto out_put;
+
+	if (machine__resolve(machine, &al, sample) < 0) {
+		pr_err("problem processing %d event, skipping it.\n",
+		       event->header.type);
+		ret = -1;
+		goto out_put;
+	}
+
+	if (al.filtered)
+		goto out_put;
+
+	if (!show_event(sample, evsel, al.thread, &al, NULL))
+		goto out_put;
+
+	if (evswitch__discard(&scr->evswitch, evsel))
+		goto out_put;
+
+	perf_sample__fprintf_start(scr, sample, al.thread, evsel,
+				   PERF_RECORD_CALLCHAIN_DEFERRED, fp);
+	fprintf(fp, "DEFERRED CALLCHAIN");
+
+	if (PRINT_FIELD(IP)) {
+		struct callchain_cursor *cursor = NULL;
+
+		if (symbol_conf.use_callchain && sample->callchain) {
+			cursor = get_tls_callchain_cursor();
+			if (thread__resolve_callchain(al.thread, cursor, evsel,
+						      sample, NULL, NULL,
+						      scripting_max_stack)) {
+				pr_info("cannot resolve deferred callchains\n");
+				cursor = NULL;
+			}
+		}
+
+		fputc(cursor ? '\n' : ' ', fp);
+		sample__fprintf_sym(sample, &al, 0, output[type].print_ip_opts,
+				    cursor, symbol_conf.bt_stop_list, fp);
+	}
+
+	fprintf(fp, "\n");
+
+	if (verbose > 0)
+		fflush(fp);
+
+out_put:
+	addr_location__exit(&al);
+	return ret;
+}
+
 // Used when scr->per_event_dump is not set
 static struct evsel_script es_stdout;
 
@@ -4325,6 +4412,7 @@ int cmd_script(int argc, const char **argv)
 
 	perf_tool__init(&script.tool, !unsorted_dump);
 	script.tool.sample		 = process_sample_event;
+	script.tool.callchain_deferred	 = process_deferred_sample_event;
 	script.tool.mmap		 = perf_event__process_mmap;
 	script.tool.mmap2		 = perf_event__process_mmap2;
 	script.tool.comm		 = perf_event__process_comm;
@@ -4351,6 +4439,7 @@ int cmd_script(int argc, const char **argv)
 	script.tool.throttle		 = process_throttle_event;
 	script.tool.unthrottle		 = process_throttle_event;
 	script.tool.ordering_requires_timestamps = true;
+	script.tool.merge_deferred_callchains = false;
 	session = perf_session__new(&data, &script.tool);
 	if (IS_ERR(session))
 		return PTR_ERR(session);
-- 
2.46.0.792.g87dc391469-goog



* [PATCH 5/5] perf tools: Merge deferred user callchains
  2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
                   ` (3 preceding siblings ...)
  2024-09-17 22:28 ` [PATCH 4/5] perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED Namhyung Kim
@ 2024-09-17 22:28 ` Namhyung Kim
  2024-09-18  6:38 ` [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Ian Rogers
  2024-09-18 20:26 ` Liang, Kan
  6 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-09-17 22:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Josh Poimboeuf, Steven Rostedt,
	Mathieu Desnoyers, Indu Bhagat, linux-toolchains

Save samples with deferred callchains in a separate list and deliver
them after merging the user callchains.  If users don't want to merge
they can set tool->merge_deferred_callchains to false to prevent the
behavior.

With the previous data file, perf script now shows the merged callchains.

  $ perf script
  perf     801 [000]    18.031793:          1 cycles:P:
          ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
          ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
          ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
          ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
          ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
          ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
          ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
          ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
          ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
          ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
          ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
          ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
          ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
              7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
  ...

The old output can be obtained using the --no-merge-callchains option.
perf report also gets the user callchain entry at the end.

  $ perf report --no-children --percent-limit=0 --stdio -q -S __intel_pmu_enable_all.isra.0
  # symbol: __intel_pmu_enable_all.isra.0
       0.00%  perf     [kernel.kallsyms]
              |
              ---__intel_pmu_enable_all.isra.0
                 perf_ctx_enable
                 event_function
                 remote_function
                 generic_exec_single
                 smp_call_function_single
                 event_function_call
                 perf_event_for_each_child
                 _perf_ioctl
                 perf_ioctl
                 __x64_sys_ioctl
                 do_syscall_64
                 entry_SYSCALL_64
                 __GI___ioctl

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-script.txt |  5 ++
 tools/perf/builtin-script.c              |  5 +-
 tools/perf/util/callchain.c              | 24 +++++++++
 tools/perf/util/callchain.h              |  3 ++
 tools/perf/util/evlist.c                 |  1 +
 tools/perf/util/evlist.h                 |  1 +
 tools/perf/util/session.c                | 63 +++++++++++++++++++++++-
 tools/perf/util/tool.c                   |  1 +
 tools/perf/util/tool.h                   |  1 +
 9 files changed, 102 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index b72866ef270b9068..69f018b3d1993716 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -518,6 +518,11 @@ include::itrace.txt[]
 	The known limitations include exception handing such as
 	setjmp/longjmp will have calls/returns not match.
 
+--merge-callchains::
+	Enable merging deferred user callchains if available.  This is the
+	default behavior.  If you want to see separate CALLCHAIN_DEFERRED
+	records for some reason, use --no-merge-callchains explicitly.
+
 :GMEXAMPLECMD: script
 :GMEXAMPLESUBCMD:
 include::guest-files.txt[]
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 311580e25f5b2008..e3acf4979c36d902 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -4031,6 +4031,7 @@ int cmd_script(int argc, const char **argv)
 	bool header_only = false;
 	bool script_started = false;
 	bool unsorted_dump = false;
+	bool merge_deferred_callchains = true;
 	char *rec_script_path = NULL;
 	char *rep_script_path = NULL;
 	struct perf_session *session;
@@ -4184,6 +4185,8 @@ int cmd_script(int argc, const char **argv)
 		    "Guest code can be found in hypervisor process"),
 	OPT_BOOLEAN('\0', "stitch-lbr", &script.stitch_lbr,
 		    "Enable LBR callgraph stitching approach"),
+	OPT_BOOLEAN('\0', "merge-callchains", &merge_deferred_callchains,
+		    "Enable merging of deferred user callchains"),
 	OPTS_EVSWITCH(&script.evswitch),
 	OPT_END()
 	};
@@ -4439,7 +4442,7 @@ int cmd_script(int argc, const char **argv)
 	script.tool.throttle		 = process_throttle_event;
 	script.tool.unthrottle		 = process_throttle_event;
 	script.tool.ordering_requires_timestamps = true;
-	script.tool.merge_deferred_callchains = false;
+	script.tool.merge_deferred_callchains = merge_deferred_callchains;
 	session = perf_session__new(&data, &script.tool);
 	if (IS_ERR(session))
 		return PTR_ERR(session);
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 0c7564747a14e539..d1114491c3da5d0a 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1832,3 +1832,27 @@ int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
 	}
 	return 0;
 }
+
+int sample__merge_deferred_callchain(struct perf_sample *sample_orig,
+				     struct perf_sample *sample_callchain)
+{
+	u64 nr_orig = sample_orig->callchain->nr - 1;
+	u64 nr_deferred = sample_callchain->callchain->nr;
+	struct ip_callchain *callchain;
+
+	callchain = calloc(1 + nr_orig + nr_deferred, sizeof(u64));
+	if (callchain == NULL) {
+		sample_orig->deferred_callchain = false;
+		return -ENOMEM;
+	}
+
+	callchain->nr = nr_orig + nr_deferred;
+	/* copy except for the last PERF_CONTEXT_USER_DEFERRED */
+	memcpy(callchain->ips, sample_orig->callchain->ips, nr_orig * sizeof(u64));
+	/* copy deferred user callchains */
+	memcpy(&callchain->ips[nr_orig], sample_callchain->callchain->ips,
+	       nr_deferred * sizeof(u64));
+
+	sample_orig->callchain = callchain;
+	return 0;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 86ed9e4d04f9ee7b..89785125ed25783d 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -317,4 +317,7 @@ int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
 				    struct perf_sample *sample, int max_stack,
 				    bool symbols, callchain_iter_fn cb, void *data);
 
+int sample__merge_deferred_callchain(struct perf_sample *sample_orig,
+				     struct perf_sample *sample_callchain);
+
 #endif	/* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f14b7e6ff1dcc2cd..f27d8c4a22aadde9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -81,6 +81,7 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_map *cpus,
 	evlist->ctl_fd.ack = -1;
 	evlist->ctl_fd.pos = -1;
 	evlist->nr_br_cntr = -1;
+	INIT_LIST_HEAD(&evlist->deferred_samples);
 }
 
 struct evlist *evlist__new(void)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bcc1c6984bb58a9d..c26379366554cf09 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -84,6 +84,7 @@ struct evlist {
 		int	pos;	/* index at evlist core object to check signals */
 	} ctl_fd;
 	struct event_enable_timer *eet;
+	struct list_head deferred_samples;
 };
 
 struct evsel_str_handler {
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 1248a0317a2f164a..e0a21b896b5784f3 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1256,6 +1256,56 @@ static int evlist__deliver_sample(struct evlist *evlist, const struct perf_tool
 					    &sample->read.one, machine);
 }
 
+struct deferred_event {
+	struct list_head list;
+	union perf_event *event;
+};
+
+static int evlist__deliver_deferred_samples(struct evlist *evlist,
+					    const struct perf_tool *tool,
+					    union  perf_event *event,
+					    struct perf_sample *sample,
+					    struct machine *machine)
+{
+	struct deferred_event *de, *tmp;
+	struct evsel *evsel;
+	int ret = 0;
+
+	if (!tool->merge_deferred_callchains) {
+		evsel = evlist__id2evsel(evlist, sample->id);
+		return tool->callchain_deferred(tool, event, sample,
+						evsel, machine);
+	}
+
+	list_for_each_entry_safe(de, tmp, &evlist->deferred_samples, list) {
+		struct perf_sample orig_sample;
+
+		ret = evlist__parse_sample(evlist, de->event, &orig_sample);
+		if (ret < 0) {
+			pr_err("failed to parse original sample\n");
+			break;
+		}
+
+		if (sample->tid != orig_sample.tid)
+			continue;
+
+		evsel = evlist__id2evsel(evlist, orig_sample.id);
+		sample__merge_deferred_callchain(&orig_sample, sample);
+		ret = evlist__deliver_sample(evlist, tool, de->event,
+					     &orig_sample, evsel, machine);
+
+		if (orig_sample.deferred_callchain)
+			free(orig_sample.callchain);
+
+		list_del(&de->list);
+		free(de);
+
+		if (ret)
+			break;
+	}
+	return ret;
+}
+
 static int machines__deliver_event(struct machines *machines,
 				   struct evlist *evlist,
 				   union perf_event *event,
@@ -1284,6 +1334,16 @@ static int machines__deliver_event(struct machines *machines,
 			return 0;
 		}
 		dump_sample(evsel, event, sample, perf_env__arch(machine->env));
+		if (sample->deferred_callchain && tool->merge_deferred_callchains) {
+			struct deferred_event *de = malloc(sizeof(*de));
+
+			if (de == NULL)
+				return -ENOMEM;
+
+			de->event = event;
+			list_add_tail(&de->list, &evlist->deferred_samples);
+			return 0;
+		}
 		return evlist__deliver_sample(evlist, tool, event, sample, evsel, machine);
 	case PERF_RECORD_MMAP:
 		return tool->mmap(tool, event, sample, machine);
@@ -1343,7 +1403,8 @@ static int machines__deliver_event(struct machines *machines,
 		return tool->aux_output_hw_id(tool, event, sample, machine);
 	case PERF_RECORD_CALLCHAIN_DEFERRED:
 		dump_deferred_callchain(evsel, event, sample);
-		return tool->callchain_deferred(tool, event, sample, evsel, machine);
+		return evlist__deliver_deferred_samples(evlist, tool, event,
+							sample, machine);
 	default:
 		++evlist->stats.nr_unknown_events;
 		return -1;
diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c
index e78f16de912ed9e2..385043e06627d269 100644
--- a/tools/perf/util/tool.c
+++ b/tools/perf/util/tool.c
@@ -238,6 +238,7 @@ void perf_tool__init(struct perf_tool *tool, bool ordered_events)
 	tool->cgroup_events = false;
 	tool->no_warn = false;
 	tool->show_feat_hdr = SHOW_FEAT_NO_HEADER;
+	tool->merge_deferred_callchains = true;
 
 	tool->sample = process_event_sample_stub;
 	tool->mmap = process_event_stub;
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 9987bbde6d5e0565..d06580478ab17a88 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -87,6 +87,7 @@ struct perf_tool {
 	bool		cgroup_events;
 	bool		no_warn;
 	bool		dont_split_sample_group;
+	bool		merge_deferred_callchains;
 	enum show_feature_header show_feat_hdr;
 };
 
-- 
2.46.0.792.g87dc391469-goog



* Re: [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2)
  2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
                   ` (4 preceding siblings ...)
  2024-09-17 22:28 ` [PATCH 5/5] perf tools: Merge deferred user callchains Namhyung Kim
@ 2024-09-18  6:38 ` Ian Rogers
  2024-09-18  9:38   ` Namhyung Kim
  2024-09-18 20:26 ` Liang, Kan
  6 siblings, 1 reply; 12+ messages in thread
From: Ian Rogers @ 2024-09-18  6:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Kan Liang, Jiri Olsa, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
	Josh Poimboeuf, Steven Rostedt, Mathieu Desnoyers, Indu Bhagat,
	linux-toolchains

On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> This is a counterpart for Josh's kernel change v2 [1] to support deferred
> user callchains.  The change is transparent and users should not notice
> anything with the deferred callchains.
>
>   $ perf record -g sleep 1
>
> I added --[no-]merge-callchains option to control output of perf script.
> You can verify it has the deferred callchains like this:
>
>   $ perf script --no-merge-callchains
>   perf     801 [000]    18.031793:          1 cycles:P:
>           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
>           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
>           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
>           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
>           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
>           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
>           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
>           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
>           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
>           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
>           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
>           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
>           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
>
>   perf     801 [000]    18.031814: DEFERRED CALLCHAIN
>                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
>
>   ...
>
> When the callchain is merged (it's the default) it'd look like below:
>
>   $ perf script
>   perf     801 [000]    18.031793:          1 cycles:P:
>           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
>           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
>           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
>           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
>           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
>           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
>           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
>           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
>           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
>           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
>           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
>           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
>           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
>                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
>
>   ...
>
> Notice that the last line and it has the __GI___ioctl in the same
> callchain.  It should work with other tools like perf report.

Hi Namhyung, I think this is interesting work!

The issue feels similar to leader sampling and some of the unpicking
of that we've been dealing with. With leader sampling it was added and
then the dispatch of events modified so that tools wouldn't see leader
samples, instead new events would be synthesized based on the leader
sample data. However, the leader sample event wasn't changed and so
now we have multiple repeated events and perf inject wouldn't just
pass through a perf data file.

What I'm expecting based on this description is that a deferred call
chain will be merged with a regular one, however, perf inject isn't
updated to drop the deferred callchain so now we have the deferred
callchain event twice.
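
For reference, the merge in question can be sketched roughly as below.
This is only an illustration, not the code from the patch: struct chain,
SKETCH_DEFERRED_MARKER, merge_chains() and make_chain() are hypothetical
names standing in for struct ip_callchain, PERF_CONTEXT_USER_DEFERRED and
the helper added in callchain.c, and the marker value is made up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a callchain: a count followed by addresses. */
struct chain {
	uint64_t nr;
	uint64_t ips[];
};

/* Made-up marker value meaning "user callchain will arrive later". */
#define SKETCH_DEFERRED_MARKER ((uint64_t)0xfffffffffffffd80ULL)

/* Build a heap-allocated chain from an array (helper for the sketch). */
static struct chain *make_chain(const uint64_t *ips, uint64_t nr)
{
	struct chain *c = calloc(1, sizeof(*c) + nr * sizeof(uint64_t));

	if (c == NULL)
		return NULL;
	c->nr = nr;
	memcpy(c->ips, ips, nr * sizeof(uint64_t));
	return c;
}

/*
 * Return a new chain holding the original entries without the trailing
 * marker, followed by the deferred entries.  Returns NULL on allocation
 * failure.  The caller frees the result.
 */
static struct chain *merge_chains(const struct chain *orig,
				  const struct chain *deferred)
{
	uint64_t nr_orig;
	struct chain *merged;

	/* The sketch assumes the original chain ends with the marker. */
	assert(orig->nr > 0);
	assert(orig->ips[orig->nr - 1] == SKETCH_DEFERRED_MARKER);
	nr_orig = orig->nr - 1;

	merged = calloc(1, sizeof(*merged) +
			   (nr_orig + deferred->nr) * sizeof(uint64_t));
	if (merged == NULL)
		return NULL;

	merged->nr = nr_orig + deferred->nr;
	memcpy(merged->ips, orig->ips, nr_orig * sizeof(uint64_t));
	memcpy(&merged->ips[nr_orig], deferred->ips,
	       deferred->nr * sizeof(uint64_t));
	return merged;
}
```

Note that the sketch only builds the merged chain.  Whether the separate
deferred record is still passed on afterwards is a decision for the
caller, which is exactly the perf inject concern.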

My feeling is that making the dispatch of events to tools "smart" is a
false economy. Tools can add handlers for these events easily enough.
What's harder is undoing the smartness when it does things that lead
to duplicated events and the like. I'm not a fan of how leader
sampling was implemented and I still think it odd that with perf
script we see invented events when trying to just dump the contents of
a perf.data file.

Thanks,
Ian

> The code is available at 'perf/defer-callchain-v2' branch in
> https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
> [1] https://lore.kernel.org/lkml/cover.1726268190.git.jpoimboe@kernel.org
>
>
> Namhyung Kim (5):
>   perf tools: Sync UAPI perf_event.h header
>   perf tools: Minimal DEFERRED_CALLCHAIN support
>   perf record: Enable defer_callchain for user callchains
>   perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED
>   perf tools: Merge deferred user callchains
>
>  tools/include/uapi/linux/perf_event.h     | 21 +++++-
>  tools/lib/perf/include/perf/event.h       |  7 ++
>  tools/perf/Documentation/perf-script.txt  |  5 ++
>  tools/perf/builtin-script.c               | 92 +++++++++++++++++++++++
>  tools/perf/util/callchain.c               | 24 ++++++
>  tools/perf/util/callchain.h               |  3 +
>  tools/perf/util/event.c                   |  1 +
>  tools/perf/util/evlist.c                  |  1 +
>  tools/perf/util/evlist.h                  |  1 +
>  tools/perf/util/evsel.c                   | 32 +++++++-
>  tools/perf/util/evsel.h                   |  1 +
>  tools/perf/util/machine.c                 |  1 +
>  tools/perf/util/perf_event_attr_fprintf.c |  1 +
>  tools/perf/util/sample.h                  |  3 +-
>  tools/perf/util/session.c                 | 78 +++++++++++++++++++
>  tools/perf/util/tool.c                    |  2 +
>  tools/perf/util/tool.h                    |  4 +-
>  17 files changed, 273 insertions(+), 4 deletions(-)
>
> --
> 2.46.0.792.g87dc391469-goog
>


* Re: [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2)
  2024-09-18  6:38 ` [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Ian Rogers
@ 2024-09-18  9:38   ` Namhyung Kim
  2024-09-18 13:39     ` Ian Rogers
  0 siblings, 1 reply; 12+ messages in thread
From: Namhyung Kim @ 2024-09-18  9:38 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Kan Liang, Jiri Olsa, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
	Josh Poimboeuf, Steven Rostedt, Mathieu Desnoyers, Indu Bhagat,
	linux-toolchains

Hi Ian,

On Wed, Sep 18, 2024 at 08:38:22AM +0200, Ian Rogers wrote:
> On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hello,
> >
> > This is a counterpart for Josh's kernel change v2 [1] to support deferred
> > user callchains.  The change is transparent and users should not notice
> > anything with the deferred callchains.
> >
> >   $ perf record -g sleep 1
> >
> > I added --[no-]merge-callchains option to control output of perf script.
> > You can verify it has the deferred callchains like this:
> >
> >   $ perf script --no-merge-callchains
> >   perf     801 [000]    18.031793:          1 cycles:P:
> >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> >
> >   perf     801 [000]    18.031814: DEFERRED CALLCHAIN
> >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> >
> >   ...
> >
> > When the callchain is merged (it's the default) it'd look like below:
> >
> >   $ perf script
> >   perf     801 [000]    18.031793:          1 cycles:P:
> >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> >
> >   ...
> >
> > Notice that the last line and it has the __GI___ioctl in the same
> > callchain.  It should work with other tools like perf report.
> 
> Hi Namhyung, I think this is interesting work!
> 
> The issue feels similar to leader sampling and some of the unpicking
> of that we've been dealing with. With leader sampling it was added and
> then the dispatch of events modified so that tools wouldn't see leader
> samples, instead new events would be synthesized based on the leader
> sample data. However, the leader sample event wasn't changed and so
> now we have multiple repeated events and perf inject wouldn't just
> pass through a perf data file.
> 
> What I'm expecting based on this description is that a deferred call
> chain will be merged with a regular one, however, perf inject isn't
> updated to drop the deferred callchain so now we have the deferred
> callchain event twice.
> 
> My feeling is that making the dispatch of events to tools "smart" is a
> false economy. Tools can add handlers for these events easily enough.
> What's harder is undoing the smartness when it does things that lead
> to duplicated events and the like. I'm not a fan of how leader
> sampling was implemented and I still think it odd that with perf
> script we see invented events when trying to just dump the contents of
> a perf.data file.

That's why I added the perf_tool.merge_deferred_callchains flag to control
the behavior.  I haven't implemented it for perf inject because it covers
a couple of different use cases.  I believe the default behavior is to
not invoke the callback for deferred callchains during perf inject, so
each sample will get the full callchain.  But you can add a new
callback and set perf_tool.merge_deferred_callchains to false.
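
To make the effect of the flag concrete, here is a simplified sketch of
the dispatch logic.  It is not the session.c code: every name in it
(struct rec, struct tool, struct dispatcher, deliver(), the counting
callbacks) is hypothetical, the pending list is a fixed array, and the
actual callchain merge is reduced to a "merged" argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rec_type { REC_SAMPLE, REC_CALLCHAIN_DEFERRED };

struct rec {
	enum rec_type type;
	int tid;
	bool deferred;	/* sample whose user callchain arrives later */
};

struct tool {
	bool merge_deferred;	/* role of merge_deferred_callchains */
	int (*sample)(const struct rec *r, bool merged);
	int (*callchain_deferred)(const struct rec *r);
};

#define MAX_PENDING 16

struct dispatcher {
	const struct rec *pending[MAX_PENDING];
	size_t nr_pending;
};

/* Counting callbacks, only here so the behavior can be observed. */
static int nr_plain, nr_merged, nr_deferred_cb;

static int on_sample(const struct rec *r, bool merged)
{
	(void)r;
	if (merged)
		nr_merged++;
	else
		nr_plain++;
	return 0;
}

static int on_deferred(const struct rec *r)
{
	(void)r;
	nr_deferred_cb++;
	return 0;
}

static int deliver(struct dispatcher *d, const struct tool *t,
		   const struct rec *r)
{
	size_t i, kept = 0;
	int ret = 0;

	assert(t->sample != NULL && t->callchain_deferred != NULL);

	if (r->type == REC_SAMPLE) {
		/* Hold the sample back until its user callchain shows up. */
		if (r->deferred && t->merge_deferred) {
			if (d->nr_pending == MAX_PENDING)
				return -1;
			d->pending[d->nr_pending++] = r;
			return 0;
		}
		return t->sample(r, false);
	}

	/* Deferred callchain record: pass it through when not merging. */
	if (!t->merge_deferred)
		return t->callchain_deferred(r);

	/* Flush pending samples of the same thread, keep the others. */
	for (i = 0; i < d->nr_pending; i++) {
		const struct rec *s = d->pending[i];

		if (ret == 0 && s->tid == r->tid)
			ret = t->sample(s, true);
		else
			d->pending[kept++] = s;
	}
	d->nr_pending = kept;
	return ret;
}
```

With merge_deferred set, the tool never sees the deferred record itself,
only the held-back samples.  With it cleared, both records reach the tool
unchanged.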

Thanks,
Namhyung


* Re: [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2)
  2024-09-18  9:38   ` Namhyung Kim
@ 2024-09-18 13:39     ` Ian Rogers
  2024-09-23 23:07       ` Namhyung Kim
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Rogers @ 2024-09-18 13:39 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Kan Liang, Jiri Olsa, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
	Josh Poimboeuf, Steven Rostedt, Mathieu Desnoyers, Indu Bhagat,
	linux-toolchains

On Wed, Sep 18, 2024 at 11:38 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Wed, Sep 18, 2024 at 08:38:22AM +0200, Ian Rogers wrote:
> > On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > Hello,
> > >
> > > This is a counterpart for Josh's kernel change v2 [1] to support deferred
> > > user callchains.  The change is transparent and users should not notice
> > > anything with the deferred callchains.
> > >
> > >   $ perf record -g sleep 1
> > >
> > > I added --[no-]merge-callchains option to control output of perf script.
> > > You can verify it has the deferred callchains like this:
> > >
> > >   $ perf script --no-merge-callchains
> > >   perf     801 [000]    18.031793:          1 cycles:P:
> > >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> > >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> > >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> > >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> > >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> > >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> > >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> > >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> > >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> > >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> > >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> > >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> > >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> > >
> > >   perf     801 [000]    18.031814: DEFERRED CALLCHAIN
> > >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > >
> > >   ...
> > >
> > > When the callchain is merged (it's the default) it'd look like below:
> > >
> > >   $ perf script
> > >   perf     801 [000]    18.031793:          1 cycles:P:
> > >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> > >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> > >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> > >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> > >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> > >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> > >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> > >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> > >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> > >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> > >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> > >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> > >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> > >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > >
> > >   ...
> > >
> > > Notice that the last line and it has the __GI___ioctl in the same
> > > callchain.  It should work with other tools like perf report.
> >
> > Hi Namhyung, I think this is interesting work!
> >
> > The issue feels similar to leader sampling and some of the unpicking
> > of that we've been dealing with. With leader sampling it was added and
> > then the dispatch of events modified so that tools wouldn't see leader
> > samples, instead new events would be synthesized based on the leader
> > sample data. However, the leader sample event wasn't changed and so
> > now we have multiple repeated events and perf inject wouldn't just
> > pass through a perf data file.
> >
> > What I'm expecting based on this description is that a deferred call
> > chain will be merged with a regular one, however, perf inject isn't
> > updated to drop the deferred callchain so now we have the deferred
> > callchain event twice.
> >
> > My feeling is that making the dispatch of events to tools "smart" is a
> > false economy. Tools can add handlers for these events easily enough.
> > What's harder is undoing the smartness when it does things that lead
> > to duplicated events and the like. I'm not a fan of how leader
> > sampling was implemented and I still think it odd that with perf
> > script we see invented events when trying to just dump the contents of
> > a perf.data file.
>
> That's why I added the perf_tool.merge_deferred_callchains flag to control
> the behavior.  I haven't implemented it for perf inject because it covers
> a couple of different use cases.  I believe the default behavior is to
> not invoke the callback for deferred callchains during perf inject, so
> each sample will get the full callchain.  But you can add a new
> callback and set perf_tool.merge_deferred_callchains to false.

I wonder if there is a different strategy for handling this. Normally
with a visitor pattern you fail when you call an unimplemented
visitor; this is then a signal that (in our case) the tool needs to
handle the new case. This avoids naively doing things like making perf
inject duplicate events. The equivalent in the perf code would be to
initialize the callbacks in the tool constructor to stubs that
abort, then explicitly initialize and use things like callchain
merging as appropriate. The whole idea of booleans next to the
callbacks feels like a kludge and is likely to hide bugs. It is also
marginally less efficient.
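
A minimal sketch of that idea, with made-up names (struct tool here is
not perf's struct perf_tool, and tool_init(), unhandled_event(),
count_event() and tool_deliver_deferred() are hypothetical):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical event type numbers, for the sketch only. */
enum { EV_SAMPLE = 1, EV_CALLCHAIN_DEFERRED = 2 };

struct tool;
typedef int (*event_cb)(const struct tool *tool, int event_type);

struct tool {
	event_cb sample;
	event_cb callchain_deferred;
};

/* Default stub: reaching it means the tool forgot to handle an event. */
static int unhandled_event(const struct tool *tool, int event_type)
{
	(void)tool;
	fprintf(stderr, "tool has no handler for event type %d\n",
		event_type);
	abort();
}

/* The "constructor": every callback starts out as the aborting stub. */
static void tool_init(struct tool *tool)
{
	tool->sample = unhandled_event;
	tool->callchain_deferred = unhandled_event;
}

/* Example handler a tool would install explicitly. */
static int nr_handled;

static int count_event(const struct tool *tool, int event_type)
{
	(void)tool;
	(void)event_type;
	nr_handled++;
	return 0;
}

/* Dispatch needs no boolean: it just calls whatever is installed. */
static int tool_deliver_deferred(const struct tool *tool)
{
	assert(tool->callchain_deferred != NULL);
	return tool->callchain_deferred(tool, EV_CALLCHAIN_DEFERRED);
}
```

A tool that cares about deferred callchains sets callchain_deferred
itself (possibly to a shared merging helper); one that does not fails
loudly on the first such event instead of silently duplicating it.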

Thanks,
Ian


* Re: [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2)
  2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
                   ` (5 preceding siblings ...)
  2024-09-18  6:38 ` [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Ian Rogers
@ 2024-09-18 20:26 ` Liang, Kan
  2024-09-23 23:08   ` Namhyung Kim
  6 siblings, 1 reply; 12+ messages in thread
From: Liang, Kan @ 2024-09-18 20:26 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Josh Poimboeuf, Steven Rostedt,
	Mathieu Desnoyers, Indu Bhagat, linux-toolchains



On 2024-09-17 6:28 p.m., Namhyung Kim wrote:
> Hello,
> 
> This is a counterpart for Josh's kernel change v2 [1] to support deferred
> user callchains.  The change is transparent and users should not notice
> anything with the deferred callchains.
> 
>   $ perf record -g sleep 1
> 
> I added --[no-]merge-callchains option to control output of perf script.
> You can verify it has the deferred callchains like this:
> 
>   $ perf script --no-merge-callchains
>   perf     801 [000]    18.031793:          1 cycles:P:
>           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
>           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
>           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
>           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
>           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
>           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
>           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
>           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
>           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
>           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
>           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
>           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
>           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> 
>   perf     801 [000]    18.031814: DEFERRED CALLCHAIN
>                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> 
>   ...
> 
> When the callchain is merged (it's the default) it'd look like below:
> 
>   $ perf script
>   perf     801 [000]    18.031793:          1 cycles:P:
>           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
>           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
>           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
>           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
>           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
>           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
>           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
>           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
>           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
>           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
>           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
>           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
>           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
>                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> 
>   ...
> 
> Notice that the last line and it has the __GI___ioctl in the same
> callchain.  It should work with other tools like perf report.


It seems to work only with perf report -D when I test it on a
non-hybrid machine.
$ perf record -e branches -g -c 3000000 ~/tchain_edit
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.397 MB perf.data ]
$ perf report -D | tail -n 17

Aggregated stats:
               TOTAL events:       8235
                MMAP events:         78  ( 0.9%)
                COMM events:          2  ( 0.0%)
                EXIT events:          1  ( 0.0%)
              SAMPLE events:       4060  (49.3%)
               MMAP2 events:          2  ( 0.0%)
             KSYMBOL events:         12  ( 0.1%)
           BPF_EVENT events:         12  ( 0.1%)
  CALLCHAIN_DEFERRED events:       4060  (49.3%)
      FINISHED_ROUND events:          3  ( 0.0%)
            ID_INDEX events:          1  ( 0.0%)
          THREAD_MAP events:          1  ( 0.0%)
             CPU_MAP events:          1  ( 0.0%)
           TIME_CONV events:          1  ( 0.0%)
       FINISHED_INIT events:          1  ( 0.0%)
$ perf report
Error:
The perf.data data has no samples!
# To display the perf.data header info, please use
--header/--header-only options.
#


On a hybrid machine, perf record errors out.

$ perf record -g true
[ perf record: Woken up 1 times to write data ]
0x58a8 [0x38]: failed to process type: 22 [Bad address]
[ perf record: Captured and wrote 0.022 MB perf.data ]

Thanks,
Kan
> 
> The code is available at 'perf/defer-callchain-v2' branch in
> https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> 
> Thanks,
> Namhyung
> 
> [1] https://lore.kernel.org/lkml/cover.1726268190.git.jpoimboe@kernel.org
> 
> 
> Namhyung Kim (5):
>   perf tools: Sync UAPI perf_event.h header
>   perf tools: Minimal DEFERRED_CALLCHAIN support
>   perf record: Enable defer_callchain for user callchains
>   perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED
>   perf tools: Merge deferred user callchains
> 
>  tools/include/uapi/linux/perf_event.h     | 21 +++++-
>  tools/lib/perf/include/perf/event.h       |  7 ++
>  tools/perf/Documentation/perf-script.txt  |  5 ++
>  tools/perf/builtin-script.c               | 92 +++++++++++++++++++++++
>  tools/perf/util/callchain.c               | 24 ++++++
>  tools/perf/util/callchain.h               |  3 +
>  tools/perf/util/event.c                   |  1 +
>  tools/perf/util/evlist.c                  |  1 +
>  tools/perf/util/evlist.h                  |  1 +
>  tools/perf/util/evsel.c                   | 32 +++++++-
>  tools/perf/util/evsel.h                   |  1 +
>  tools/perf/util/machine.c                 |  1 +
>  tools/perf/util/perf_event_attr_fprintf.c |  1 +
>  tools/perf/util/sample.h                  |  3 +-
>  tools/perf/util/session.c                 | 78 +++++++++++++++++++
>  tools/perf/util/tool.c                    |  2 +
>  tools/perf/util/tool.h                    |  4 +-
>  17 files changed, 273 insertions(+), 4 deletions(-)
> 


* Re: [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2)
  2024-09-18 13:39     ` Ian Rogers
@ 2024-09-23 23:07       ` Namhyung Kim
  0 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-09-23 23:07 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Kan Liang, Jiri Olsa, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
	Josh Poimboeuf, Steven Rostedt, Mathieu Desnoyers, Indu Bhagat,
	linux-toolchains

Hi Ian,

On Wed, Sep 18, 2024 at 03:39:31PM +0200, Ian Rogers wrote:
> On Wed, Sep 18, 2024 at 11:38 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Wed, Sep 18, 2024 at 08:38:22AM +0200, Ian Rogers wrote:
> > > On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > Hello,
> > > >
> > > > This is a counterpart for Josh's kernel change v2 [1] to support deferred
> > > > user callchains.  The change is transparent and users should not notice
> > > > anything with the deferred callchains.
> > > >
> > > >   $ perf record -g sleep 1
> > > >
> > > > I added --[no-]merge-callchains option to control output of perf script.
> > > > You can verify it has the deferred callchains like this:
> > > >
> > > >   $ perf script --no-merge-callchains
> > > >   perf     801 [000]    18.031793:          1 cycles:P:
> > > >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> > > >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> > > >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> > > >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> > > >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> > > >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> > > >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> > > >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> > > >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> > > >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> > > >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> > > >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> > > >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> > > >
> > > >   perf     801 [000]    18.031814: DEFERRED CALLCHAIN
> > > >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > > >
> > > >   ...
> > > >
> > > > When the callchain is merged (it's the default) it'd look like below:
> > > >
> > > >   $ perf script
> > > >   perf     801 [000]    18.031793:          1 cycles:P:
> > > >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> > > >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> > > >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> > > >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> > > >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> > > >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> > > >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> > > >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> > > >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> > > >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> > > >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> > > >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> > > >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> > > >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > > >
> > > >   ...
> > > >
> > > > Notice the last line: it has __GI___ioctl in the same
> > > > callchain.  It should work with other tools like perf report.
> > >
> > > Hi Namhyung, I think this is interesting work!
> > >
> > > The issue feels similar to leader sampling and some of the unpicking
> > > of that we've been dealing with. With leader sampling it was added and
> > > then the dispatch of events modified so that tools wouldn't see leader
> > > samples, instead new events would be synthesized based on the leader
> > > sample data. However, the leader sample event wasn't changed and so
> > > now we have multiple repeated events and perf inject wouldn't just
> > > pass through a perf data file.
> > >
> > > What I'm expecting based on this description is that a deferred call
> > > chain will be merged with a regular one, however, perf inject isn't
> > > updated to drop the deferred callchain so now we have the deferred
> > > callchain event twice.
> > >
> > > My feeling is that making the dispatch of events to tools "smart" is a
> > > false economy. Tools can add handlers for these events easily enough.
> > > What's harder is undoing the smartness when it does things that lead
> > > to duplicated events and the like. I'm not a fan of how leader
> > > sampling was implemented and I still think it odd that with perf
> > > script we see invented events when trying to just dump the contents of
> > > a perf.data file.
> >
> > That's why I added perf_tool.merge_deferred_callchains flag to control
> > the behavior.  I haven't implemented it to perf inject because it covers
> > a couple of different use cases.  I believe the default behavior is to
> > not invoke the callback for deferred callchains during perf inject and
> > each sample will get the full callchains.  But you can add a new
> > callback and set perf_tool.merge_deferred_callchains to false.
> 
> I wonder if there is a different strategy for handling this. Normally
> with a visitor pattern you fail when you call an unimplemented
> visitor, this is then a signal the (in our case) tool needs to handle
> the new case. This avoids naively doing things like making perf inject
> duplicate events. The equivalent in the perf code would be to
> initialize the callbacks in the tool constructor to stubs that
> abort, then explicitly initialize and use things like callchain
> merging as appropriate. The whole booleans next to the callbacks feels
> like a kludge and likely to hide bugs. It is also marginally less
> efficient.

Well, we might change it that way later, but for this series I just
wanted to test the deferred callchains quickly.
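
To make the alternative discussed above concrete, here is a minimal
sketch of a tool whose callbacks all start out as stubs that refuse the
event, so a tool has to opt in to each event type explicitly.  All names
(demo_tool, demo_unhandled, ...) are hypothetical and are not the real
perf_tool API.  The stub returns an error here so the sketch stays
testable; a stricter variant could call abort() instead.

```c
#include <stdio.h>

/* Hypothetical event types; not the real PERF_RECORD_* values. */
enum demo_event_type {
	DEMO_EVENT_SAMPLE,
	DEMO_EVENT_CALLCHAIN_DEFERRED,
	DEMO_EVENT_MAX,
};

struct demo_tool;
typedef int (*demo_handler_t)(struct demo_tool *tool,
			      enum demo_event_type type);

struct demo_tool {
	demo_handler_t handlers[DEMO_EVENT_MAX];
	int handled;	/* number of events a real handler processed */
};

/* Default stub: report the unhandled event and fail the dispatch. */
static int demo_unhandled(struct demo_tool *tool, enum demo_event_type type)
{
	(void)tool;
	fprintf(stderr, "unhandled event type %d\n", (int)type);
	return -1;
}

/* Example of a real handler a tool would install explicitly. */
static int demo_count(struct demo_tool *tool, enum demo_event_type type)
{
	(void)type;
	tool->handled++;
	return 0;
}

/* "Constructor": every callback starts as the failing stub. */
static void demo_tool_init(struct demo_tool *tool)
{
	int i;

	for (i = 0; i < DEMO_EVENT_MAX; i++)
		tool->handlers[i] = demo_unhandled;
	tool->handled = 0;
}

/* Dispatch never checks a side flag; it only calls the handler. */
static int demo_tool_dispatch(struct demo_tool *tool,
			      enum demo_event_type type)
{
	return tool->handlers[type](tool, type);
}
```

With this shape, a tool that forgets to install a handler for a new
event type gets an immediate error instead of silently passing the
event through.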

Thanks,
Namhyung



* Re: [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2)
  2024-09-18 20:26 ` Liang, Kan
@ 2024-09-23 23:08   ` Namhyung Kim
  0 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-09-23 23:08 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Arnaldo Carvalho de Melo, Ian Rogers, Jiri Olsa, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
	Josh Poimboeuf, Steven Rostedt, Mathieu Desnoyers, Indu Bhagat,
	linux-toolchains

Hi Kan,

On Wed, Sep 18, 2024 at 04:26:56PM -0400, Liang, Kan wrote:
> 
> 
> On 2024-09-17 6:28 p.m., Namhyung Kim wrote:
> > Hello,
> > 
> > This is a counterpart for Josh's kernel change v2 [1] to support deferred
> > user callchains.  The change is transparent and users should not notice
> > anything with the deferred callchains.
> > 
> >   $ perf record -g sleep 1
> > 
> > I added --[no-]merge-callchains option to control output of perf script.
> > You can verify it has the deferred callchains like this:
> > 
> >   $ perf script --no-merge-callchains
> >   perf     801 [000]    18.031793:          1 cycles:P:
> >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> > 
> >   perf     801 [000]    18.031814: DEFERRED CALLCHAIN
> >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > 
> >   ...
> > 
> > When the callchain is merged (it's the default) it'd look like below:
> > 
> >   $ perf script
> >   perf     801 [000]    18.031793:          1 cycles:P:
> >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > 
> >   ...
> > 
> > Notice the last line: it has __GI___ioctl in the same
> > callchain.  It should work with other tools like perf report.
> 
> 
> It seems it only works with perf report -D, when I test it on a
> non-hybrid machine.
> $ perf record -e branches -g -c 3000000 ~/tchain_edit
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.397 MB perf.data ]
> $ perf report -D | tail -n 17
> 
> Aggregated stats:
>                TOTAL events:       8235
>                 MMAP events:         78  ( 0.9%)
>                 COMM events:          2  ( 0.0%)
>                 EXIT events:          1  ( 0.0%)
>               SAMPLE events:       4060  (49.3%)
>                MMAP2 events:          2  ( 0.0%)
>              KSYMBOL events:         12  ( 0.1%)
>            BPF_EVENT events:         12  ( 0.1%)
>   CALLCHAIN_DEFERRED events:       4060  (49.3%)
>       FINISHED_ROUND events:          3  ( 0.0%)
>             ID_INDEX events:          1  ( 0.0%)
>           THREAD_MAP events:          1  ( 0.0%)
>              CPU_MAP events:          1  ( 0.0%)
>            TIME_CONV events:          1  ( 0.0%)
>        FINISHED_INIT events:          1  ( 0.0%)
> $ perf report
> Error:
> The perf.data data has no samples!
> # To display the perf.data header info, please use
> --header/--header-only options.
> #
> 
> 
> On a hybrid machine, perf record errors out.
> 
> $ perf record -g true
> [ perf record: Woken up 1 times to write data ]
> 0x58a8 [0x38]: failed to process type: 22 [Bad address]
> [ perf record: Captured and wrote 0.022 MB perf.data ]

Thanks for testing.  I'll take a look at what I missed.
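
For reference, the merged output quoted above amounts to appending the
deferred user frames to the kernel frames of the matching sample.  The
following is only a minimal sketch of that idea, assuming the sample and
its deferred record have already been paired up.  The demo_callchain
type and both helpers are hypothetical, and the sketch ignores the
context markers that real perf callchains carry.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callchain container; not perf's own structure. */
struct demo_callchain {
	size_t nr;	/* number of entries in ips[] */
	uint64_t *ips;	/* instruction pointers, innermost first */
};

/*
 * Return a newly allocated callchain holding the kernel frames followed
 * by the deferred user frames, or NULL if allocation fails.  The inputs
 * are left untouched.
 */
static struct demo_callchain *
demo_merge(const struct demo_callchain *kernel,
	   const struct demo_callchain *deferred)
{
	struct demo_callchain *merged = malloc(sizeof(*merged));

	if (!merged)
		return NULL;

	merged->nr = kernel->nr + deferred->nr;
	/* Allocate at least one slot so an empty merge is still valid. */
	merged->ips = calloc(merged->nr ? merged->nr : 1,
			     sizeof(*merged->ips));
	if (!merged->ips) {
		free(merged);
		return NULL;
	}

	if (kernel->nr)
		memcpy(merged->ips, kernel->ips,
		       kernel->nr * sizeof(*kernel->ips));
	if (deferred->nr)
		memcpy(merged->ips + kernel->nr, deferred->ips,
		       deferred->nr * sizeof(*deferred->ips));
	return merged;
}

/* Release a callchain returned by demo_merge(). */
static void demo_callchain_free(struct demo_callchain *cc)
{
	if (cc) {
		free(cc->ips);
		free(cc);
	}
}
```

A report tool that only ever sees the merged chain would then attribute
the user frame to the same sample as the kernel frames.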

Thanks,
Namhyung


end of thread, other threads:[~2024-09-23 23:08 UTC | newest]

Thread overview: 12+ messages
2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
2024-09-17 22:28 ` [PATCH 1/5] perf tools: Sync UAPI perf_event.h header Namhyung Kim
2024-09-17 22:28 ` [PATCH 2/5] perf tools: Minimal DEFERRED_CALLCHAIN support Namhyung Kim
2024-09-17 22:28 ` [PATCH 3/5] perf record: Enable defer_callchain for user callchains Namhyung Kim
2024-09-17 22:28 ` [PATCH 4/5] perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED Namhyung Kim
2024-09-17 22:28 ` [PATCH 5/5] perf tools: Merge deferred user callchains Namhyung Kim
2024-09-18  6:38 ` [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Ian Rogers
2024-09-18  9:38   ` Namhyung Kim
2024-09-18 13:39     ` Ian Rogers
2024-09-23 23:07       ` Namhyung Kim
2024-09-18 20:26 ` Liang, Kan
2024-09-23 23:08   ` Namhyung Kim
