public inbox for linux-s390@vger.kernel.org
* [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols
@ 2026-02-19 11:38 Thomas Richter
  2026-02-19 11:55 ` Jan Polensky
                   ` (2 more replies)
  0 siblings, 3 replies; 80+ messages in thread
From: Thomas Richter @ 2026-02-19 11:38 UTC (permalink / raw)
  To: linux-kernel, linux-s390, linux-perf-users, acme, namhyung
  Cc: agordeev, gor, sumanthk, hca, japo, Thomas Richter

Commit fa2ae4a377c0 ("s390/idle: Rewrite psw_idle() in C")
removes the symbols psw_idle() and psw_idle_exit() from the Linux
kernel for s390. Remove them from the perf tool's list of idle
functions. They cannot be detected anymore.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
---
 tools/perf/util/symbol.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 814f960fa8f8..575951d98b1b 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -752,8 +752,6 @@ static bool symbol__is_idle(const char *name)
 		"poll_idle",
 		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
 		NULL
 	};
 	int i;
-- 
2.53.0



* Re: [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols
  2026-02-19 11:38 [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols Thomas Richter
@ 2026-02-19 11:55 ` Jan Polensky
  2026-02-23 21:46 ` Namhyung Kim
  2026-03-02 23:43 ` [PATCH v1] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
  2 siblings, 0 replies; 80+ messages in thread
From: Jan Polensky @ 2026-02-19 11:55 UTC (permalink / raw)
  To: Thomas Richter, linux-kernel, linux-s390, linux-perf-users, acme,
	namhyung
  Cc: agordeev, gor, sumanthk, hca

On Thu, Feb 19, 2026 at 12:38:50PM +0100, Thomas Richter wrote:
> Commit fa2ae4a377c0 ("s390/idle: Rewrite psw_idle() in C")
>
> removes symbols psw_idle() and psw_idle_exit() from the linux
> kernel for s390. Remove them in perf tool's list of idle
> functions. They can not be detected anymore.
>
> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
> Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>



* Re: [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols
  2026-02-19 11:38 [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols Thomas Richter
  2026-02-19 11:55 ` Jan Polensky
@ 2026-02-23 21:46 ` Namhyung Kim
  2026-02-23 23:14   ` Arnaldo Melo
  2026-03-02 18:43   ` Arnaldo Carvalho de Melo
  2026-03-02 23:43 ` [PATCH v1] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
  2 siblings, 2 replies; 80+ messages in thread
From: Namhyung Kim @ 2026-02-23 21:46 UTC (permalink / raw)
  To: Thomas Richter
  Cc: linux-kernel, linux-s390, linux-perf-users, acme, agordeev, gor,
	sumanthk, hca, japo

On Thu, Feb 19, 2026 at 12:38:50PM +0100, Thomas Richter wrote:
> Commit fa2ae4a377c0 ("s390/idle: Rewrite psw_idle() in C")
> 
> removes symbols psw_idle() and psw_idle_exit() from the linux
> kernel for s390. Remove them in perf tool's list of idle
> functions. They can not be detected anymore.

But I think old kernels may still run somewhere.  It seems the above
commit was merged to v6.10.  Maybe we should wait some more time before
removing it in the tool.

Thanks,
Namhyung

> 
> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
> Suggested-by: Heiko Carstens <hca@linux.ibm.com>
> ---
>  tools/perf/util/symbol.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 814f960fa8f8..575951d98b1b 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -752,8 +752,6 @@ static bool symbol__is_idle(const char *name)
>  		"poll_idle",
>  		"ppc64_runlatch_off",
>  		"pseries_dedicated_idle_sleep",
> -		"psw_idle",
> -		"psw_idle_exit",
>  		NULL
>  	};
>  	int i;
> -- 
> 2.53.0
> 


* Re: [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols
  2026-02-23 21:46 ` Namhyung Kim
@ 2026-02-23 23:14   ` Arnaldo Melo
  2026-03-02 18:43   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 80+ messages in thread
From: Arnaldo Melo @ 2026-02-23 23:14 UTC (permalink / raw)
  To: Namhyung Kim, Thomas Richter
  Cc: linux-kernel, linux-s390, linux-perf-users, acme, agordeev, gor,
	sumanthk, hca, japo



On February 23, 2026 6:46:21 PM GMT-03:00, Namhyung Kim <namhyung@kernel.org> wrote:
>On Thu, Feb 19, 2026 at 12:38:50PM +0100, Thomas Richter wrote:
>> Commit fa2ae4a377c0 ("s390/idle: Rewrite psw_idle() in C")
>> 
>> removes symbols psw_idle() and psw_idle_exit() from the linux
>> kernel for s390. Remove them in perf tool's list of idle
>> functions. They can not be detected anymore.
>
>But I think old kernels may still run somewhere.  It seems the above
>commit was merged to v6.10.  Maybe we should wait some more time before
>removing it in the tool.

Right, people keep asking whether one can use a new version of perf on an old kernel and vice versa.

So I think we should not apply this patch.

There have been efforts in the past to have some info per sample indicating the "context" for a sample, e.g. whether it was in idle processing, hard/soft irq processing, etc., but that hasn't come to fruition so far.

With that we could get rid of this flaky heuristic of looking at a symbol name.

- Arnaldo


>
>Thanks,
>Namhyung
>
>> 
>> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
>> Suggested-by: Heiko Carstens <hca@linux.ibm.com>
>> ---
>>  tools/perf/util/symbol.c | 2 --
>>  1 file changed, 2 deletions(-)
>> 
>> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
>> index 814f960fa8f8..575951d98b1b 100644
>> --- a/tools/perf/util/symbol.c
>> +++ b/tools/perf/util/symbol.c
>> @@ -752,8 +752,6 @@ static bool symbol__is_idle(const char *name)
>>  		"poll_idle",
>>  		"ppc64_runlatch_off",
>>  		"pseries_dedicated_idle_sleep",
>> -		"psw_idle",
>> -		"psw_idle_exit",
>>  		NULL
>>  	};
>>  	int i;
>> -- 
>> 2.53.0
>> 

- Arnaldo


* Re: [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols
  2026-02-23 21:46 ` Namhyung Kim
  2026-02-23 23:14   ` Arnaldo Melo
@ 2026-03-02 18:43   ` Arnaldo Carvalho de Melo
  2026-03-02 19:44     ` Ian Rogers
  1 sibling, 1 reply; 80+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-03-02 18:43 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Thomas Richter, linux-kernel, linux-s390, linux-perf-users,
	agordeev, gor, sumanthk, hca, japo

On Mon, Feb 23, 2026 at 01:46:21PM -0800, Namhyung Kim wrote:
> On Thu, Feb 19, 2026 at 12:38:50PM +0100, Thomas Richter wrote:
> > Commit fa2ae4a377c0 ("s390/idle: Rewrite psw_idle() in C")
> > 
> > removes symbols psw_idle() and psw_idle_exit() from the linux
> > kernel for s390. Remove them in perf tool's list of idle
> > functions. They can not be detected anymore.
> 
> But I think old kernels may still run somewhere.  It seems the above
> commit was merged to v6.10.  Maybe we should wait some more time before
> removing it in the tool.

Agreed. A new perf tool, say one built from the tarballs made
available at:

https://www.kernel.org/pub/linux/kernel/tools/perf/v7.0.0/perf-7.0.0-rc1.tar.xz

(I will not make an rc2 available since there are no changes to the
tools/perf codebase in this rc)

should still ignore those functions when run on older kernels.

A suggestion for work in this area instead is to get those samples into
a special bucket, the "idle" one, and show it at some place in the
screen.

Thanks,

- Arnaldo


* Re: [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols
  2026-03-02 18:43   ` Arnaldo Carvalho de Melo
@ 2026-03-02 19:44     ` Ian Rogers
  2026-03-04 14:34       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-02 19:44 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Thomas Richter, linux-kernel, linux-s390,
	linux-perf-users, agordeev, gor, sumanthk, hca, japo

On Mon, Mar 2, 2026 at 10:43 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Mon, Feb 23, 2026 at 01:46:21PM -0800, Namhyung Kim wrote:
> > On Thu, Feb 19, 2026 at 12:38:50PM +0100, Thomas Richter wrote:
> > > Commit fa2ae4a377c0 ("s390/idle: Rewrite psw_idle() in C")
> > >
> > > removes symbols psw_idle() and psw_idle_exit() from the linux
> > > kernel for s390. Remove them in perf tool's list of idle
> > > functions. They can not be detected anymore.
> >
> > But I think old kernels may still run somewhere.  It seems the above
> > commit was merged to v6.10.  Maybe we should wait some more time before
> > removing it in the tool.
>
> Agreed, using a new perf tool, say built from the tarballs made
> available at:
>
> https://www.kernel.org/pub/linux/kernel/tools/perf/v7.0.0/perf-7.0.0-rc1.tar.xz
>
> (I will not make a rc2 available since there are no changes to the
> tools/perf codebase in this rc).
>
> On older kernels should still ignore those functions.
>
> A suggestion for work in this area instead is to get those samples into
> a special bucket, the "idle" one, and show it at some place in the
> screen.

Would it also be sensible to pass the perf_env:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/env.h?h=perf-tools-next#n74
into symbol__is_idle? The contents of the perf_env are shown by `perf
report --header`:
```
# ========
# captured on    : Mon Mar  2 11:34:47 2026
# header version : 1
# data offset    : 904
# data size      : 4268216
# feat offset    : 4269120
# hostname : google.com
# os release : 6.17.13-1rodete1-amd64
# perf version : 7.0.rc1.g982b63f6380b
# arch : x86_64
# nrcpus online : 28
# nrcpus avail : 28
# cpudesc : Intel(R) Core(TM) i7-14700
# cpuid : GenuineIntel,6,183,1
...
# e_machine : 62
#   e_flags : 0
...
```
The kernel version is in the release and the e_machine/arch captures
the CPU type.
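
A minimal sketch of how the release string could be turned into a
version check, assuming the "major.minor" prefix seen in the header
above. The helper name is hypothetical, not an existing perf function,
and treating an unparseable string as "not older" is just one possible
policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns true when a kernel release string such as
 * "6.17.13-1rodete1-amd64" is older than major.minor. Strings that do not
 * start with "<int>.<int>" are treated as "not older".
 */
bool release_older_than(const char *release, int major, int minor)
{
	int rel_major = 0, rel_minor = 0;

	assert(release != NULL);
	if (sscanf(release, "%d.%d", &rel_major, &rel_minor) != 2)
		return false;

	return rel_major < major || (rel_major == major && rel_minor < minor);
}
```

With this, an s390 check for the pre-v6.10 psw_idle symbols would reduce
to release_older_than(release, 6, 10) combined with a test of e_machine.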

Thanks,
Ian

> Thanks,
>
> - Arnaldo
>


* [PATCH v1] perf symbol: Lazily compute idle and use the perf_env
  2026-02-19 11:38 [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols Thomas Richter
  2026-02-19 11:55 ` Jan Polensky
  2026-02-23 21:46 ` Namhyung Kim
@ 2026-03-02 23:43 ` Ian Rogers
  2026-03-24 17:14   ` Ian Rogers
  2026-03-25 16:18   ` [PATCH v2] " Ian Rogers
  2 siblings, 2 replies; 80+ messages in thread
From: Ian Rogers @ 2026-03-02 23:43 UTC (permalink / raw)
  To: tmricht
  Cc: acme, agordeev, gor, hca, japo, linux-kernel, linux-perf-users,
	linux-s390, namhyung, sumanthk, Ian Rogers

Replace the idle boolean with a symbol__is_idle() helper. The helper
lazily computes, and caches in the symbol, whether a symbol is an idle
function, taking into consideration the kernel version and architecture
of the machine. As __symbols__insert() no longer needs to know whether
a symbol is for the kernel, remove the argument.

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-top.c     |   6 +-
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 106 ++++++++++++++++++++++-------------
 tools/perf/util/symbol.h     |  15 +++--
 4 files changed, 85 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 710604c4f6f6..bc3c8e3b6ec0 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -750,6 +750,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 {
 	struct perf_top *top = container_of(tool, struct perf_top, tool);
 	struct addr_location al;
+	struct dso *dso = NULL;
 
 	if (!machine && perf_guest) {
 		static struct intlist *seen;
@@ -829,7 +830,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		}
 	}
 
-	if (al.sym == NULL || !al.sym->idle) {
+	if (al.map)
+		dso = map__dso(al.map);
+
+	if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 76912c62b6a0..6bb46384aa0c 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1725,7 +1725,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 8662001e1e25..6155f509ca70 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -25,6 +25,8 @@
 #include "demangle-ocaml.h"
 #include "demangle-rust-v0.h"
 #include "dso.h"
+#include "dwarf-regs.h"
+#include "env.h"
 #include "util.h" // lsdir()
 #include "debug.h"
 #include "event.h"
@@ -51,7 +53,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
 
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		sym->idle = symbol__is_idle(name);
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -716,47 +705,88 @@ int modules__parse(const char *filename, void *arg,
 	return err;
 }
 
+static int sym_name_cmp(const void *a, const void *b)
+{
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
 /*
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env)
 {
-	const char * const idle_symbols[] = {
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = env ? env->e_machine : EM_HOST;
+
+	if (sym->idle)
+		return sym->idle == SYMBOL_IDLE__IDLE;
+
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		sym->idle = SYMBOL_IDLE__NOT_IDLE;
+		return false;
+	}
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
+
+
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
+
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (e_machine == EM_S390) {
+		int major = 0, minor = 0;
+		const char *release = env && env->os_release
+			? env->os_release : perf_version_string;
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+		sscanf(release, "%d.%d", &major, &minor);
 
-	return strlist__has_entry(idle_symbols_list, name);
+		/* Before v6.10, s390 used psw_idle. */
+		if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	sym->idle = SYMBOL_IDLE__NOT_IDLE;
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -785,7 +815,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 3fb5d146d9b1..508dd9f336e9 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -24,6 +24,7 @@ struct dso;
 struct map;
 struct maps;
 struct option;
+struct perf_env;
 struct build_id;
 
 /*
@@ -41,6 +42,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -56,8 +63,8 @@ struct symbol {
 	u8		type:4;
 	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
 	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
+	/** Cache for symbol__is_idle. */
+	enum symbol_idle_kind idle:2;
 	/** Resolvable but tools ignore it (e.g. idle routines). */
 	u8		ignore:1;
 	/** Symbol for an inlined function. */
@@ -184,8 +191,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
@@ -269,5 +275,6 @@ enum {
 };
 
 int symbol__validate_sym_arguments(void);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env);
 
 #endif /* __PERF_SYMBOL */
-- 
2.53.0.473.g4a7958ca14-goog
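
For readers following along, here is a stand-alone sketch of the two
techniques the patch combines: a tri-state cache whose zero value means
"not computed yet", and bsearch() over a sorted string table. All names
(toy_symbol, toy_idle_names, toy_lookups) are made up for illustration,
and the arch and kernel-version checks from the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum idle_kind { IDLE_UNKNOWN = 0, IDLE_NO = 1, IDLE_YES = 2 };

struct toy_symbol {
	const char *name;
	enum idle_kind idle;	/* zero-initialised means "not computed yet" */
};

/* Must stay sorted in strcmp() order, otherwise bsearch() misbehaves. */
static const char *const toy_idle_names[] = {
	"arch_cpu_idle",
	"default_idle",
	"poll_idle",
};

/* bsearch() passes the key first and a pointer to an array element second. */
static int name_cmp(const void *key, const void *elem)
{
	return strcmp(key, *(const char *const *)elem);
}

int toy_lookups;	/* counts how often the table is actually searched */

bool toy_symbol__is_idle(struct toy_symbol *sym)
{
	assert(sym != NULL && sym->name != NULL);

	/* Fast path: the answer was cached by an earlier call. */
	if (sym->idle != IDLE_UNKNOWN)
		return sym->idle == IDLE_YES;

	toy_lookups++;
	sym->idle = bsearch(sym->name, toy_idle_names,
			    sizeof(toy_idle_names) / sizeof(toy_idle_names[0]),
			    sizeof(toy_idle_names[0]), name_cmp)
		? IDLE_YES : IDLE_NO;
	return sym->idle == IDLE_YES;
}
```

The separate "not idle" state is what lets a negative answer be cached
too, so the table is searched at most once per symbol.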



* Re: [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols
  2026-03-02 19:44     ` Ian Rogers
@ 2026-03-04 14:34       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 80+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-03-04 14:34 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Namhyung Kim, Thomas Richter, linux-kernel, linux-s390,
	linux-perf-users, agordeev, gor, sumanthk, hca, japo

On Mon, Mar 02, 2026 at 11:44:19AM -0800, Ian Rogers wrote:
> On Mon, Mar 2, 2026 at 10:43 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Mon, Feb 23, 2026 at 01:46:21PM -0800, Namhyung Kim wrote:
> > > On Thu, Feb 19, 2026 at 12:38:50PM +0100, Thomas Richter wrote:
> > > > Commit fa2ae4a377c0 ("s390/idle: Rewrite psw_idle() in C")
> > > >
> > > > removes symbols psw_idle() and psw_idle_exit() from the linux
> > > > kernel for s390. Remove them in perf tool's list of idle
> > > > functions. They can not be detected anymore.
> > >
> > > But I think old kernels may still run somewhere.  It seems the above
> > > commit was merged to v6.10.  Maybe we should wait some more time before
> > > removing it in the tool.
> >
> > Agreed, using a new perf tool, say built from the tarballs made
> > available at:
> >
> > https://www.kernel.org/pub/linux/kernel/tools/perf/v7.0.0/perf-7.0.0-rc1.tar.xz
> >
> > (I will not make a rc2 available since there are no changes to the
> > tools/perf codebase in this rc).
> >
> > On older kernels should still ignore those functions.
> >
> > A suggestion for work in this area instead is to get those samples into
> > a special bucket, the "idle" one, and show it at some place in the
> > screen.
> 
> Would it also be sensible to pass the perf_env:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/env.h?h=perf-tools-next#n74
> into symbol__is_idle? The contents of the perf_env are shown by `perf
> report --header`:
> ```
> # ========
> # captured on    : Mon Mar  2 11:34:47 2026
> # header version : 1
> # data offset    : 904
> # data size      : 4268216
> # feat offset    : 4269120
> # hostname : google.com
> # os release : 6.17.13-1rodete1-amd64
> # perf version : 7.0.rc1.g982b63f6380b
> # arch : x86_64
> # nrcpus online : 28
> # nrcpus avail : 28
> # cpudesc : Intel(R) Core(TM) i7-14700
> # cpuid : GenuineIntel,6,183,1
> ...
> # e_machine : 62
> #   e_flags : 0
> ...
> ```
> The kernel version is in the release and the e_machine/arch captures
> the CPU type.

Yeah, I think it is a good improvement. Do you mean that we should
have per-arch idle symbol lists?
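
One possible shape for per-arch lists, as a sketch only: a table keyed on
the ELF e_machine value. The table, the ARCH_* names and the function are
hypothetical; the numeric values mirror EM_386, EM_PPC64, EM_S390 and
EM_X86_64 from <elf.h>, and the symbol names are taken from the existing
idle list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local copies of ELF e_machine values, matching EM_* in <elf.h>. */
enum { ARCH_386 = 3, ARCH_PPC64 = 21, ARCH_S390 = 22, ARCH_X86_64 = 62 };

struct arch_idle_syms {
	int e_machine;
	const char *const *names;	/* NULL-terminated list */
};

static const char *const x86_idle[] = { "intel_idle", "mwait_idle", NULL };
static const char *const ppc64_idle[] = { "ppc64_runlatch_off", NULL };
static const char *const s390_idle[] = { "psw_idle", "psw_idle_exit", NULL };

static const struct arch_idle_syms arch_idle_table[] = {
	{ ARCH_386, x86_idle },
	{ ARCH_X86_64, x86_idle },
	{ ARCH_PPC64, ppc64_idle },
	{ ARCH_S390, s390_idle },
};

/* True when name is listed as an idle symbol for the given e_machine. */
bool arch_is_idle_symbol(int e_machine, const char *name)
{
	size_t nr = sizeof(arch_idle_table) / sizeof(arch_idle_table[0]);

	assert(name != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (arch_idle_table[i].e_machine != e_machine)
			continue;
		for (const char *const *n = arch_idle_table[i].names; *n; n++) {
			if (!strcmp(name, *n))
				return true;
		}
	}
	return false;
}
```

Arch-independent symbols would still need a shared list that is checked
for every e_machine.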

- Arnaldo


* Re: [PATCH v1] perf symbol: Lazily compute idle and use the perf_env
  2026-03-02 23:43 ` [PATCH v1] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
@ 2026-03-24 17:14   ` Ian Rogers
  2026-03-25  6:58     ` Namhyung Kim
  2026-03-25 16:18   ` [PATCH v2] " Ian Rogers
  1 sibling, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-24 17:14 UTC (permalink / raw)
  To: tmricht, namhyung, acme
  Cc: agordeev, gor, hca, japo, linux-kernel, linux-perf-users,
	linux-s390, sumanthk

On Mon, Mar 2, 2026 at 3:43 PM Ian Rogers <irogers@google.com> wrote:
>
> Move the idle boolean to a helper symbol__is_idle function. In the
> function lazily compute whether a symbol is an idle function taking
> into consideration the kernel version and architecture of the
> machine. As symbols__insert no longer needs to know if a symbol is for
> the kernel, remove the argument.
>
> This change is inspired by mailing list discussion, particularly from
> Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
> <hca@linux.ibm.com>:
> https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/
>
> Signed-off-by: Ian Rogers <irogers@google.com>

Ping.

Thanks,
Ian

> ---
>  tools/perf/builtin-top.c     |   6 +-
>  tools/perf/util/symbol-elf.c |   2 +-
>  tools/perf/util/symbol.c     | 106 ++++++++++++++++++++++-------------
>  tools/perf/util/symbol.h     |  15 +++--
>  4 files changed, 85 insertions(+), 44 deletions(-)
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 710604c4f6f6..bc3c8e3b6ec0 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -750,6 +750,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
>  {
>         struct perf_top *top = container_of(tool, struct perf_top, tool);
>         struct addr_location al;
> +       struct dso *dso = NULL;
>
>         if (!machine && perf_guest) {
>                 static struct intlist *seen;
> @@ -829,7 +830,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
>                 }
>         }
>
> -       if (al.sym == NULL || !al.sym->idle) {
> +       if (al.map)
> +               dso = map__dso(al.map);
> +
> +       if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
>                 struct hists *hists = evsel__hists(evsel);
>                 struct hist_entry_iter iter = {
>                         .evsel          = evsel,
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 76912c62b6a0..6bb46384aa0c 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1725,7 +1725,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>
>                 arch__sym_update(f, &sym);
>
> -               __symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
> +               __symbols__insert(dso__symbols(curr_dso), f);
>                 nr++;
>         }
>         dso__put(curr_dso);
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 8662001e1e25..6155f509ca70 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -25,6 +25,8 @@
>  #include "demangle-ocaml.h"
>  #include "demangle-rust-v0.h"
>  #include "dso.h"
> +#include "dwarf-regs.h"
> +#include "env.h"
>  #include "util.h" // lsdir()
>  #include "debug.h"
>  #include "event.h"
> @@ -51,7 +53,6 @@
>
>  static int dso__load_kernel_sym(struct dso *dso, struct map *map);
>  static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
> -static bool symbol__is_idle(const char *name);
>
>  int vmlinux_path__nr_entries;
>  char **vmlinux_path;
> @@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
>         }
>  }
>
> -void __symbols__insert(struct rb_root_cached *symbols,
> -                      struct symbol *sym, bool kernel)
> +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
>  {
>         struct rb_node **p = &symbols->rb_root.rb_node;
>         struct rb_node *parent = NULL;
> @@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
>         struct symbol *s;
>         bool leftmost = true;
>
> -       if (kernel) {
> -               const char *name = sym->name;
> -               /*
> -                * ppc64 uses function descriptors and appends a '.' to the
> -                * start of every instruction address. Remove it.
> -                */
> -               if (name[0] == '.')
> -                       name++;
> -               sym->idle = symbol__is_idle(name);
> -       }
> -
>         while (*p != NULL) {
>                 parent = *p;
>                 s = rb_entry(parent, struct symbol, rb_node);
> @@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
>
>  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
>  {
> -       __symbols__insert(symbols, sym, false);
> +       __symbols__insert(symbols, sym);
>  }
>
>  static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
> @@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
>
>  void dso__insert_symbol(struct dso *dso, struct symbol *sym)
>  {
> -       __symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
> +       __symbols__insert(dso__symbols(dso), sym);
>
>         /* update the symbol cache if necessary */
>         if (dso__last_find_result_addr(dso) >= sym->start &&
> @@ -716,47 +705,88 @@ int modules__parse(const char *filename, void *arg,
>         return err;
>  }
>
> +static int sym_name_cmp(const void *a, const void *b)
> +{
> +       const char *name = a;
> +       const char *const *sym = b;
> +
> +       return strcmp(name, *sym);
> +}
> +
>  /*
>   * These are symbols in the kernel image, so make sure that
>   * sym is from a kernel DSO.
>   */
> -static bool symbol__is_idle(const char *name)
> +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env)
>  {
> -       const char * const idle_symbols[] = {
> +       static const char * const idle_symbols[] = {
>                 "acpi_idle_do_entry",
>                 "acpi_processor_ffh_cstate_enter",
>                 "arch_cpu_idle",
>                 "cpu_idle",
>                 "cpu_startup_entry",
> -               "idle_cpu",
> -               "intel_idle",
> -               "intel_idle_ibrs",
>                 "default_idle",
> -               "native_safe_halt",
>                 "enter_idle",
>                 "exit_idle",
> -               "mwait_idle",
> -               "mwait_idle_with_hints",
> -               "mwait_idle_with_hints.constprop.0",
> +               "idle_cpu",
> +               "native_safe_halt",
>                 "poll_idle",
> -               "ppc64_runlatch_off",
>                 "pseries_dedicated_idle_sleep",
> -               "psw_idle",
> -               "psw_idle_exit",
> -               NULL
>         };
> -       int i;
> -       static struct strlist *idle_symbols_list;
> +       const char *name = sym->name;
> +       uint16_t e_machine = env ? env->e_machine : EM_HOST;
> +
> +       if (sym->idle)
> +               return sym->idle == SYMBOL_IDLE__IDLE;
> +
> +       if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
> +               sym->idle = SYMBOL_IDLE__NOT_IDLE;
> +               return false;
> +       }
>
> -       if (idle_symbols_list)
> -               return strlist__has_entry(idle_symbols_list, name);
> +       /*
> +        * ppc64 uses function descriptors and appends a '.' to the
> +        * start of every instruction address. Remove it.
> +        */
> +       if (name[0] == '.')
> +               name++;
> +
> +
> +       if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
> +                   sizeof(idle_symbols[0]), sym_name_cmp)) {
> +               sym->idle = SYMBOL_IDLE__IDLE;
> +               return true;
> +       }
> +
> +       if (e_machine == EM_386 || e_machine == EM_X86_64) {
> +               if (strstarts(name, "mwait_idle") ||
> +                   strstarts(name, "intel_idle")) {
> +                       sym->idle = SYMBOL_IDLE__IDLE;
> +                       return true;
> +               }
> +       }
> +
> +       if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
> +               sym->idle = SYMBOL_IDLE__IDLE;
> +               return true;
> +       }
>
> -       idle_symbols_list = strlist__new(NULL, NULL);
> +       if (e_machine == EM_S390) {
> +               int major = 0, minor = 0;
> +               const char *release = env && env->os_release
> +                       ? env->os_release : perf_version_string;
>
> -       for (i = 0; idle_symbols[i]; i++)
> -               strlist__add(idle_symbols_list, idle_symbols[i]);
> +               sscanf(release, "%d.%d", &major, &minor);
>
> -       return strlist__has_entry(idle_symbols_list, name);
> +               /* Before v6.10, s390 used psw_idle. */
> +               if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
> +                       sym->idle = SYMBOL_IDLE__IDLE;
> +                       return true;
> +               }
> +       }
> +
> +       sym->idle = SYMBOL_IDLE__NOT_IDLE;
> +       return false;
>  }
>
>  static int map__process_kallsym_symbol(void *arg, const char *name,
> @@ -785,7 +815,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
>          * We will pass the symbols to the filter later, in
>          * map__split_kallsyms, when we have split the maps per module
>          */
> -       __symbols__insert(root, sym, !strchr(name, '['));
> +       __symbols__insert(root, sym);
>
>         return 0;
>  }
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 3fb5d146d9b1..508dd9f336e9 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -24,6 +24,7 @@ struct dso;
>  struct map;
>  struct maps;
>  struct option;
> +struct perf_env;
>  struct build_id;
>
>  /*
> @@ -41,6 +42,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
>                              GElf_Shdr *shp, const char *name, size_t *idx);
>  #endif
>
> +enum symbol_idle_kind {
> +       SYMBOL_IDLE__UNKNOWN = 0,
> +       SYMBOL_IDLE__NOT_IDLE = 1,
> +       SYMBOL_IDLE__IDLE = 2,
> +};
> +
>  /**
>   * A symtab entry. When allocated this may be preceded by an annotation (see
>   * symbol__annotation) and/or a browser_index (see symbol__browser_index).
> @@ -56,8 +63,8 @@ struct symbol {
>         u8              type:4;
>         /** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
>         u8              binding:4;
> -       /** Set true for kernel symbols of idle routines. */
> -       u8              idle:1;
> +       /** Cache for symbol__is_idle. */
> +       enum symbol_idle_kind idle:2;
>         /** Resolvable but tools ignore it (e.g. idle routines). */
>         u8              ignore:1;
>         /** Symbol for an inlined function. */
> @@ -184,8 +191,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
>
>  char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
>
> -void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
> -                      bool kernel);
> +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
>  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
>  void symbols__fixup_duplicate(struct rb_root_cached *symbols);
>  void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
> @@ -269,5 +275,6 @@ enum {
>  };
>
>  int symbol__validate_sym_arguments(void);
> +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env);
>
>  #endif /* __PERF_SYMBOL */
> --
> 2.53.0.473.g4a7958ca14-goog
>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v1] perf symbol: Lazily compute idle and use the perf_env
  2026-03-24 17:14   ` Ian Rogers
@ 2026-03-25  6:58     ` Namhyung Kim
  2026-03-25 15:58       ` Ian Rogers
  0 siblings, 1 reply; 80+ messages in thread
From: Namhyung Kim @ 2026-03-25  6:58 UTC (permalink / raw)
  To: Ian Rogers
  Cc: tmricht, acme, agordeev, gor, hca, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Hi Ian,

Sorry for the delay.

On Tue, Mar 24, 2026 at 10:14:01AM -0700, Ian Rogers wrote:
> On Mon, Mar 2, 2026 at 3:43 PM Ian Rogers <irogers@google.com> wrote:
[SNIP]
> > -       if (idle_symbols_list)
> > -               return strlist__has_entry(idle_symbols_list, name);
> > +       /*
> > +        * ppc64 uses function descriptors and appends a '.' to the
> > +        * start of every instruction address. Remove it.
> > +        */
> > +       if (name[0] == '.')

Then e_machine == EM_PPC64 can be checked here.

> > +               name++;
> > +
> > +

Two blank lines.

> > +       if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
> > +                   sizeof(idle_symbols[0]), sym_name_cmp)) {
> > +               sym->idle = SYMBOL_IDLE__IDLE;
> > +               return true;
> > +       }
> > +
> > +       if (e_machine == EM_386 || e_machine == EM_X86_64) {
> > +               if (strstarts(name, "mwait_idle") ||
> > +                   strstarts(name, "intel_idle")) {
> > +                       sym->idle = SYMBOL_IDLE__IDLE;
> > +                       return true;
> > +               }
> > +       }
> > +
> > +       if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
> > +               sym->idle = SYMBOL_IDLE__IDLE;
> > +               return true;
> > +       }
> >
> > -       idle_symbols_list = strlist__new(NULL, NULL);
> > +       if (e_machine == EM_S390) {
> > +               int major = 0, minor = 0;
> > +               const char *release = env && env->os_release
> > +                       ? env->os_release : perf_version_string;
> >
> > -       for (i = 0; idle_symbols[i]; i++)
> > -               strlist__add(idle_symbols_list, idle_symbols[i]);
> > +               sscanf(release, "%d.%d", &major, &minor);
> >
> > -       return strlist__has_entry(idle_symbols_list, name);
> > +               /* Before v6.10, s390 used psw_idle. */
> > +               if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
> > +                       sym->idle = SYMBOL_IDLE__IDLE;
> > +                       return true;
> > +               }
> > +       }
> > +
> > +       sym->idle = SYMBOL_IDLE__NOT_IDLE;
> > +       return false;
> >  }
> >
> >  static int map__process_kallsym_symbol(void *arg, const char *name,
> > @@ -785,7 +815,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
> >          * We will pass the symbols to the filter later, in
> >          * map__split_kallsyms, when we have split the maps per module
> >          */
> > -       __symbols__insert(root, sym, !strchr(name, '['));
> > +       __symbols__insert(root, sym);
> >
> >         return 0;
> >  }
> > diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> > index 3fb5d146d9b1..508dd9f336e9 100644
> > --- a/tools/perf/util/symbol.h
> > +++ b/tools/perf/util/symbol.h
> > @@ -24,6 +24,7 @@ struct dso;
> >  struct map;
> >  struct maps;
> >  struct option;
> > +struct perf_env;
> >  struct build_id;
> >
> >  /*
> > @@ -41,6 +42,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
> >                              GElf_Shdr *shp, const char *name, size_t *idx);
> >  #endif
> >
> > +enum symbol_idle_kind {
> > +       SYMBOL_IDLE__UNKNOWN = 0,
> > +       SYMBOL_IDLE__NOT_IDLE = 1,
> > +       SYMBOL_IDLE__IDLE = 2,
> > +};
> > +
> >  /**
> >   * A symtab entry. When allocated this may be preceded by an annotation (see
> >   * symbol__annotation) and/or a browser_index (see symbol__browser_index).
> > @@ -56,8 +63,8 @@ struct symbol {
> >         u8              type:4;
> >         /** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
> >         u8              binding:4;
> > -       /** Set true for kernel symbols of idle routines. */
> > -       u8              idle:1;
> > +       /** Cache for symbol__is_idle. */
> > +       enum symbol_idle_kind idle:2;

I'm curious if bitfields with different types (u8 and enum) can be
placed consecutively bitwise.  There can be a lot of symbols so it
could be a concern.

Thanks,
Namhyung


> >         /** Resolvable but tools ignore it (e.g. idle routines). */
> >         u8              ignore:1;
> >         /** Symbol for an inlined function. */
> > @@ -184,8 +191,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
> >
> >  char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
> >
> > -void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
> > -                      bool kernel);
> > +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
> >  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
> >  void symbols__fixup_duplicate(struct rb_root_cached *symbols);
> >  void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
> > @@ -269,5 +275,6 @@ enum {
> >  };
> >
> >  int symbol__validate_sym_arguments(void);
> > +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env);
> >
> >  #endif /* __PERF_SYMBOL */
> > --
> > 2.53.0.473.g4a7958ca14-goog
> >

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v1] perf symbol: Lazily compute idle and use the perf_env
  2026-03-25  6:58     ` Namhyung Kim
@ 2026-03-25 15:58       ` Ian Rogers
  0 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-03-25 15:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: tmricht, acme, agordeev, gor, hca, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

On Tue, Mar 24, 2026 at 11:58 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> Sorry for the delay.
>
> On Tue, Mar 24, 2026 at 10:14:01AM -0700, Ian Rogers wrote:
> > On Mon, Mar 2, 2026 at 3:43 PM Ian Rogers <irogers@google.com> wrote:
> [SNIP]
> > > -       if (idle_symbols_list)
> > > -               return strlist__has_entry(idle_symbols_list, name);
> > > +       /*
> > > +        * ppc64 uses function descriptors and appends a '.' to the
> > > +        * start of every instruction address. Remove it.
> > > +        */
> > > +       if (name[0] == '.')
>
> Then e_machine == EM_PPC64 can be checked here.

Agreed, but this is potentially load-bearing for more than just PPC, so
I'd rather leave it as it is.
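
For illustration only, the unconditional '.' skip followed by the sorted
table lookup can be sketched as below. This is a minimal sketch, not the
perf code: name_is_idle(), name_cmp() and the shortened idle_names[] table
are hypothetical names made up for the example. The one hard requirement
it shows is that bsearch() only works if the table stays in strcmp() order.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, shortened table. It must stay sorted in strcmp() order,
 * otherwise bsearch() can miss entries that are present.
 */
static const char *const idle_names[] = {
	"arch_cpu_idle",
	"cpu_idle",
	"default_idle",
	"poll_idle",
};

/* bsearch() passes the key first and a pointer to the array element second. */
static int name_cmp(const void *key, const void *elem)
{
	return strcmp(key, *(const char *const *)elem);
}

/* Hypothetical helper, not part of perf. */
static bool name_is_idle(const char *name)
{
	/* Skip a leading '.' regardless of the architecture. */
	if (name[0] == '.')
		name++;

	return bsearch(name, idle_names,
		       sizeof(idle_names) / sizeof(idle_names[0]),
		       sizeof(idle_names[0]), name_cmp) != NULL;
}
```

With this shape, ".cpu_idle" and "cpu_idle" both match, while a longer
name such as "cpu_idle_exit" does not, because the lookup is an exact
match rather than a prefix match.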

> > > +               name++;
> > > +
> > > +
>
> Two blank lines.

Will fix in v2.

> > > +       if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
> > > +                   sizeof(idle_symbols[0]), sym_name_cmp)) {
> > > +               sym->idle = SYMBOL_IDLE__IDLE;
> > > +               return true;
> > > +       }
> > > +
> > > +       if (e_machine == EM_386 || e_machine == EM_X86_64) {
> > > +               if (strstarts(name, "mwait_idle") ||
> > > +                   strstarts(name, "intel_idle")) {
> > > +                       sym->idle = SYMBOL_IDLE__IDLE;
> > > +                       return true;
> > > +               }
> > > +       }
> > > +
> > > +       if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
> > > +               sym->idle = SYMBOL_IDLE__IDLE;
> > > +               return true;
> > > +       }
> > >
> > > -       idle_symbols_list = strlist__new(NULL, NULL);
> > > +       if (e_machine == EM_S390) {
> > > +               int major = 0, minor = 0;
> > > +               const char *release = env && env->os_release
> > > +                       ? env->os_release : perf_version_string;
> > >
> > > -       for (i = 0; idle_symbols[i]; i++)
> > > -               strlist__add(idle_symbols_list, idle_symbols[i]);
> > > +               sscanf(release, "%d.%d", &major, &minor);
> > >
> > > -       return strlist__has_entry(idle_symbols_list, name);
> > > +               /* Before v6.10, s390 used psw_idle. */
> > > +               if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
> > > +                       sym->idle = SYMBOL_IDLE__IDLE;
> > > +                       return true;
> > > +               }
> > > +       }
> > > +
> > > +       sym->idle = SYMBOL_IDLE__NOT_IDLE;
> > > +       return false;
> > >  }
> > >
> > >  static int map__process_kallsym_symbol(void *arg, const char *name,
> > > @@ -785,7 +815,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
> > >          * We will pass the symbols to the filter later, in
> > >          * map__split_kallsyms, when we have split the maps per module
> > >          */
> > > -       __symbols__insert(root, sym, !strchr(name, '['));
> > > +       __symbols__insert(root, sym);
> > >
> > >         return 0;
> > >  }
> > > diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> > > index 3fb5d146d9b1..508dd9f336e9 100644
> > > --- a/tools/perf/util/symbol.h
> > > +++ b/tools/perf/util/symbol.h
> > > @@ -24,6 +24,7 @@ struct dso;
> > >  struct map;
> > >  struct maps;
> > >  struct option;
> > > +struct perf_env;
> > >  struct build_id;
> > >
> > >  /*
> > > @@ -41,6 +42,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
> > >                              GElf_Shdr *shp, const char *name, size_t *idx);
> > >  #endif
> > >
> > > +enum symbol_idle_kind {
> > > +       SYMBOL_IDLE__UNKNOWN = 0,
> > > +       SYMBOL_IDLE__NOT_IDLE = 1,
> > > +       SYMBOL_IDLE__IDLE = 2,
> > > +};
> > > +
> > >  /**
> > >   * A symtab entry. When allocated this may be preceded by an annotation (see
> > >   * symbol__annotation) and/or a browser_index (see symbol__browser_index).
> > > @@ -56,8 +63,8 @@ struct symbol {
> > >         u8              type:4;
> > >         /** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
> > >         u8              binding:4;
> > > -       /** Set true for kernel symbols of idle routines. */
> > > -       u8              idle:1;
> > > +       /** Cache for symbol__is_idle. */
> > > +       enum symbol_idle_kind idle:2;
>
> I'm curious if bitfields with different types (u8 and enum) can be
> placed consecutively bitwise.  There can be a lot of symbols so it
> could be a concern.

pahole says no size difference:

Before:
```
struct symbol {
       struct rb_node             rb_node __attribute__((__aligned__(8))); /*     0    24 */
       u64                        start;                /*    24     8 */
       u64                        end;                  /*    32     8 */
       u16                        namelen;              /*    40     2 */
       u8                         type:4;               /*    42: 0  1 */
       u8                         binding:4;            /*    42: 4  1 */
       u8                         idle:1;               /*    43: 0  1 */
       u8                         ignore:1;             /*    43: 1  1 */
       u8                         inlined:1;            /*    43: 2  1 */
       u8                         annotate2:1;          /*    43: 3  1 */
       u8                         ifunc_alias:1;        /*    43: 4  1 */

       /* XXX 3 bits hole, try to pack */

       u8                         arch_sym;             /*    44     1 */
       char                       name[];               /*    45     0 */

       /* size: 48, cachelines: 1, members: 13 */
       /* sum members: 43 */
       /* sum bitfield members: 13 bits, bit holes: 1, sum bit holes: 3 bits */
       /* padding: 3 */
       /* forced alignments: 1 */
       /* last cacheline: 48 bytes */
} __attribute__((__aligned__(8)));
```

After:
```
struct symbol {
       struct rb_node             rb_node __attribute__((__aligned__(8))); /*     0    24 */
       u64                        start;                /*    24     8 */
       u64                        end;                  /*    32     8 */
       u16                        namelen;              /*    40     2 */
       u8                         type:4;               /*    42: 0  1 */
       u8                         binding:4;            /*    42: 4  1 */

       /* Bitfield combined with previous fields */

       enum symbol_idle_kind      idle:2;               /*    40:24  4 */

       /* Bitfield combined with next fields */

       u8                         ignore:1;             /*    43: 2  1 */
       u8                         inlined:1;            /*    43: 3  1 */
       u8                         annotate2:1;          /*    43: 4  1 */
       u8                         ifunc_alias:1;        /*    43: 5  1 */

       /* XXX 2 bits hole, try to pack */

       u8                         arch_sym;             /*    44     1 */
       char                       name[];               /*    45     0 */

       /* size: 48, cachelines: 1, members: 13 */
       /* sum members: 43 */
       /* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */
       /* padding: 3 */
       /* forced alignments: 1 */
       /* last cacheline: 48 bytes */
} __attribute__((__aligned__(8)));
```
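
The same check can be reproduced without pahole. The following is a
minimal sketch under assumptions: sym_before, sym_after, idle_kind and
layout_unchanged() are hypothetical, cut-down stand-ins for struct symbol,
not the real definitions. Bit-field layout is implementation-defined, so
the result only describes the compiler and ABI it is built with; GCC and
clang on the common Linux ABIs are expected to pack the enum bit-field
next to the u8 ones, other compilers may not.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum symbol_idle_kind. */
enum idle_kind { IDLE_UNKNOWN = 0, IDLE_NOT_IDLE = 1, IDLE_IDLE = 2 };

/* Cut-down layout with a one-bit u8 idle flag. */
struct sym_before {
	uint64_t start;
	uint16_t namelen;
	uint8_t  type:4;
	uint8_t  binding:4;
	uint8_t  idle:1;
	uint8_t  ignore:1;
	uint8_t  arch_sym;
};

/* Same layout with idle widened to a two-bit enum bit-field. */
struct sym_after {
	uint64_t start;
	uint16_t namelen;
	uint8_t  type:4;
	uint8_t  binding:4;
	enum idle_kind idle:2;
	uint8_t  ignore:1;
	uint8_t  arch_sym;
};

/*
 * True if mixing the enum bit-field with the u8 bit-fields neither grew
 * the struct nor moved the first member that follows the bit-fields.
 */
static bool layout_unchanged(void)
{
	return sizeof(struct sym_after) == sizeof(struct sym_before) &&
	       offsetof(struct sym_after, arch_sym) ==
	       offsetof(struct sym_before, arch_sym);
}
```

If layout_unchanged() returned false on some toolchain, declaring the
field as "u8 idle:2" and casting to the enum at the use sites would be one
way to keep the packing independent of how enum bit-fields are laid out.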

Thanks,
Ian

> Thanks,
> Namhyung
>
>
> > >         /** Resolvable but tools ignore it (e.g. idle routines). */
> > >         u8              ignore:1;
> > >         /** Symbol for an inlined function. */
> > > @@ -184,8 +191,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
> > >
> > >  char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
> > >
> > > -void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
> > > -                      bool kernel);
> > > +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
> > >  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
> > >  void symbols__fixup_duplicate(struct rb_root_cached *symbols);
> > >  void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
> > > @@ -269,5 +275,6 @@ enum {
> > >  };
> > >
> > >  int symbol__validate_sym_arguments(void);
> > > +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env);
> > >
> > >  #endif /* __PERF_SYMBOL */
> > > --
> > > 2.53.0.473.g4a7958ca14-goog
> > >

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v2] perf symbol: Lazily compute idle and use the perf_env
  2026-03-02 23:43 ` [PATCH v1] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
  2026-03-24 17:14   ` Ian Rogers
@ 2026-03-25 16:18   ` Ian Rogers
  2026-03-26  7:20     ` Honglei Wang
  1 sibling, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-25 16:18 UTC (permalink / raw)
  To: acme, namhyung, tmricht
  Cc: irogers, agordeev, gor, hca, japo, linux-kernel, linux-perf-users,
	linux-s390, sumanthk, jameshongleiwang

Move the idle boolean to a helper symbol__is_idle function. In the
function lazily compute whether a symbol is an idle function taking
into consideration the kernel version and architecture of the
machine. As symbols__insert no longer needs to know if a symbol is for
the kernel, remove the argument.

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/

The change switches x86 matches to use strstarts which means
intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
change suggested by Honglei Wang <jameshongleiwang@126.com> in:
https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/

Signed-off-by: Ian Rogers <irogers@google.com>
---
v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/
---
 tools/perf/builtin-top.c     |   6 +-
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 105 ++++++++++++++++++++++-------------
 tools/perf/util/symbol.h     |  15 +++--
 4 files changed, 84 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 37950efb28ac..bdc1c761cd61 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -751,6 +751,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 {
 	struct perf_top *top = container_of(tool, struct perf_top, tool);
 	struct addr_location al;
+	struct dso *dso = NULL;
 
 	if (!machine && perf_guest) {
 		static struct intlist *seen;
@@ -830,7 +831,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		}
 	}
 
-	if (al.sym == NULL || !al.sym->idle) {
+	if (al.map)
+		dso = map__dso(al.map);
+
+	if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 3cd4e5a03cc5..9fabf5146d89 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1723,7 +1723,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index ce9195717f44..1a357af93a0a 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -25,6 +25,8 @@
 #include "demangle-ocaml.h"
 #include "demangle-rust-v0.h"
 #include "dso.h"
+#include "dwarf-regs.h"
+#include "env.h"
 #include "util.h" // lsdir()
 #include "event.h"
 #include "machine.h"
@@ -50,7 +52,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
 
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		sym->idle = symbol__is_idle(name);
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -716,47 +705,87 @@ int modules__parse(const char *filename, void *arg,
 	return err;
 }
 
+static int sym_name_cmp(const void *a, const void *b)
+{
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
 /*
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env)
 {
-	const char * const idle_symbols[] = {
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = env ? env->e_machine : EM_HOST;
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	if (sym->idle)
+		return sym->idle == SYMBOL_IDLE__IDLE;
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		sym->idle = SYMBOL_IDLE__NOT_IDLE;
+		return false;
+	}
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
 
-	return strlist__has_entry(idle_symbols_list, name);
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
+
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
+
+	if (e_machine == EM_S390) {
+		int major = 0, minor = 0;
+		const char *release = env && env->os_release
+			? env->os_release : perf_version_string;
+
+		sscanf(release, "%d.%d", &major, &minor);
+
+		/* Before v6.10, s390 used psw_idle. */
+		if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	sym->idle = SYMBOL_IDLE__NOT_IDLE;
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -785,7 +814,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index c67814d6d6d6..f26f67bd7982 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -25,6 +25,7 @@ struct dso;
 struct map;
 struct maps;
 struct option;
+struct perf_env;
 struct build_id;
 
 /*
@@ -42,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -57,8 +64,8 @@ struct symbol {
 	u8		type:4;
 	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
 	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
+	/** Cache for symbol__is_idle. */
+	enum symbol_idle_kind idle:2;
 	/** Resolvable but tools ignore it (e.g. idle routines). */
 	u8		ignore:1;
 	/** Symbol for an inlined function. */
@@ -202,8 +209,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
@@ -286,5 +292,6 @@ enum {
 };
 
 int symbol__validate_sym_arguments(void);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env);
 
 #endif /* __PERF_SYMBOL */
-- 
2.53.0.1018.g2bb0e51243-goog
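
As a side note, the s390 release gate in the patch above boils down to
parsing "major.minor" from the release string and comparing it against
6.10. A minimal sketch of that comparison follows; release_before() is a
hypothetical helper written for this example and is not part of the patch.
As in the patch, a string that cannot be parsed ends up as 0.0 and is
therefore treated as older than any real release.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: true if the "major.minor" prefix of release is
 * older than want_major.want_minor.
 */
static bool release_before(const char *release, int want_major, int want_minor)
{
	int major = 0, minor = 0;

	/* Nothing parsed: keep 0.0, which compares as older than anything. */
	if (sscanf(release, "%d.%d", &major, &minor) < 1)
		major = minor = 0;

	return major < want_major ||
	       (major == want_major && minor < want_minor);
}
```

For example, release_before("6.9.0-rc1", 6, 10) is true and
release_before("6.10.0", 6, 10) is false. Comparing major and minor as
integers matters here: a plain string comparison would order "6.10" before
"6.9".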


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [PATCH v2] perf symbol: Lazily compute idle and use the perf_env
  2026-03-25 16:18   ` [PATCH v2] " Ian Rogers
@ 2026-03-26  7:20     ` Honglei Wang
  2026-03-26 15:11       ` Ian Rogers
  0 siblings, 1 reply; 80+ messages in thread
From: Honglei Wang @ 2026-03-26  7:20 UTC (permalink / raw)
  To: Ian Rogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, japo, linux-kernel, linux-perf-users,
	linux-s390, sumanthk

Hi Ian,

On 3/26/26 12:18 AM, Ian Rogers wrote:
> Move the idle boolean to a helper symbol__is_idle function. In the
> function lazily compute whether a symbol is an idle function taking
> into consideration the kernel version and architecture of the
> machine. As symbols__insert no longer needs to know if a symbol is for
> the kernel, remove the argument.
> 
> This change is inspired by mailing list discussion, particularly from
> Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
> <hca@linux.ibm.com>:
> https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/
> 
> The change switches x86 matches to use strstarts which means
> intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
> change suggested by Honglei Wang <jameshongleiwang@126.com> in:
> https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/
> ---
>  tools/perf/builtin-top.c     |   6 +-
>  tools/perf/util/symbol-elf.c |   2 +-
>  tools/perf/util/symbol.c     | 105 ++++++++++++++++++++++-------------
>  tools/perf/util/symbol.h     |  15 +++--
>  4 files changed, 84 insertions(+), 44 deletions(-)
> 
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 37950efb28ac..bdc1c761cd61 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -751,6 +751,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
>  {
>  	struct perf_top *top = container_of(tool, struct perf_top, tool);
>  	struct addr_location al;
> +	struct dso *dso = NULL;
>  
>  	if (!machine && perf_guest) {
>  		static struct intlist *seen;
> @@ -830,7 +831,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
>  		}
>  	}
>  
> -	if (al.sym == NULL || !al.sym->idle) {
> +	if (al.map)
> +		dso = map__dso(al.map);
> +
> +	if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
>  		struct hists *hists = evsel__hists(evsel);
>  		struct hist_entry_iter iter = {
>  			.evsel		= evsel,
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 3cd4e5a03cc5..9fabf5146d89 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1723,7 +1723,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>  
>  		arch__sym_update(f, &sym);
>  
> -		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
> +		__symbols__insert(dso__symbols(curr_dso), f);
>  		nr++;
>  	}
>  	dso__put(curr_dso);
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index ce9195717f44..1a357af93a0a 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -25,6 +25,8 @@
>  #include "demangle-ocaml.h"
>  #include "demangle-rust-v0.h"
>  #include "dso.h"
> +#include "dwarf-regs.h"
> +#include "env.h"
>  #include "util.h" // lsdir()
>  #include "event.h"
>  #include "machine.h"
> @@ -50,7 +52,6 @@
>  
>  static int dso__load_kernel_sym(struct dso *dso, struct map *map);
>  static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
> -static bool symbol__is_idle(const char *name);
>  
>  int vmlinux_path__nr_entries;
>  char **vmlinux_path;
> @@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
>  	}
>  }
>  
> -void __symbols__insert(struct rb_root_cached *symbols,
> -		       struct symbol *sym, bool kernel)
> +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
>  {
>  	struct rb_node **p = &symbols->rb_root.rb_node;
>  	struct rb_node *parent = NULL;
> @@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
>  	struct symbol *s;
>  	bool leftmost = true;
>  
> -	if (kernel) {
> -		const char *name = sym->name;
> -		/*
> -		 * ppc64 uses function descriptors and appends a '.' to the
> -		 * start of every instruction address. Remove it.
> -		 */
> -		if (name[0] == '.')
> -			name++;
> -		sym->idle = symbol__is_idle(name);
> -	}
> -
>  	while (*p != NULL) {
>  		parent = *p;
>  		s = rb_entry(parent, struct symbol, rb_node);
> @@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
>  
>  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
>  {
> -	__symbols__insert(symbols, sym, false);
> +	__symbols__insert(symbols, sym);
>  }
>  
>  static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
> @@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
>  
>  void dso__insert_symbol(struct dso *dso, struct symbol *sym)
>  {
> -	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
> +	__symbols__insert(dso__symbols(dso), sym);
>  
>  	/* update the symbol cache if necessary */
>  	if (dso__last_find_result_addr(dso) >= sym->start &&
> @@ -716,47 +705,87 @@ int modules__parse(const char *filename, void *arg,
>  	return err;
>  }
>  
> +static int sym_name_cmp(const void *a, const void *b)
> +{
> +	const char *name = a;
> +	const char *const *sym = b;
> +
> +	return strcmp(name, *sym);
> +}
> +
>  /*
>   * These are symbols in the kernel image, so make sure that
>   * sym is from a kernel DSO.
>   */
> -static bool symbol__is_idle(const char *name)
> +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env)
>  {
> -	const char * const idle_symbols[] = {
> +	static const char * const idle_symbols[] = {
>  		"acpi_idle_do_entry",
>  		"acpi_processor_ffh_cstate_enter",
>  		"arch_cpu_idle",
>  		"cpu_idle",
>  		"cpu_startup_entry",
> -		"idle_cpu",
> -		"intel_idle",
> -		"intel_idle_ibrs",
>  		"default_idle",
> -		"native_safe_halt",
>  		"enter_idle",
>  		"exit_idle",
> -		"mwait_idle",
> -		"mwait_idle_with_hints",
> -		"mwait_idle_with_hints.constprop.0",
> +		"idle_cpu",
> +		"native_safe_halt",
>  		"poll_idle",
> -		"ppc64_runlatch_off",
>  		"pseries_dedicated_idle_sleep",
> -		"psw_idle",
> -		"psw_idle_exit",
> -		NULL
>  	};
> -	int i;
> -	static struct strlist *idle_symbols_list;
> +	const char *name = sym->name;
> +	uint16_t e_machine = env ? env->e_machine : EM_HOST;
>  
> -	if (idle_symbols_list)
> -		return strlist__has_entry(idle_symbols_list, name);
> +	if (sym->idle)
> +		return sym->idle == SYMBOL_IDLE__IDLE;
>  
> -	idle_symbols_list = strlist__new(NULL, NULL);
> +	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
> +		sym->idle = SYMBOL_IDLE__NOT_IDLE;
> +		return false;
> +	}
>  
> -	for (i = 0; idle_symbols[i]; i++)
> -		strlist__add(idle_symbols_list, idle_symbols[i]);
> +	/*
> +	 * ppc64 uses function descriptors and appends a '.' to the
> +	 * start of every instruction address. Remove it.
> +	 */
> +	if (name[0] == '.')
> +		name++;
>  
> -	return strlist__has_entry(idle_symbols_list, name);
> +	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
> +		    sizeof(idle_symbols[0]), sym_name_cmp)) {
> +		sym->idle = SYMBOL_IDLE__IDLE;
> +		return true;
> +	}
> +
> +	if (e_machine == EM_386 || e_machine == EM_X86_64) {

As said in another thread, intel_idle_irq was still there on my test
machine. I did a bit of debugging and found e_machine == 0, so it couldn't
reach this branch. After digging further, it looks like
deliver_event()->perf_session__find_machine() returns a struct machine
whose env->e_machine is 0. I'm still busy today and can't do more; I hope
this clue helps.

Thanks,
Honglei

> +		if (strstarts(name, "mwait_idle") ||
> +		    strstarts(name, "intel_idle")) {
> +			sym->idle = SYMBOL_IDLE__IDLE;
> +			return true;
> +		}
> +	}
> +
> +	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
> +		sym->idle = SYMBOL_IDLE__IDLE;
> +		return true;
> +	}
> +
> +	if (e_machine == EM_S390) {
> +		int major = 0, minor = 0;
> +		const char *release = env && env->os_release
> +			? env->os_release : perf_version_string;
> +
> +		sscanf(release, "%d.%d", &major, &minor);
> +
> +		/* Before v6.10, s390 used psw_idle. */
> +		if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
> +			sym->idle = SYMBOL_IDLE__IDLE;
> +			return true;
> +		}
> +	}
> +
> +	sym->idle = SYMBOL_IDLE__NOT_IDLE;
> +	return false;
>  }
>  
>  static int map__process_kallsym_symbol(void *arg, const char *name,
> @@ -785,7 +814,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
>  	 * We will pass the symbols to the filter later, in
>  	 * map__split_kallsyms, when we have split the maps per module
>  	 */
> -	__symbols__insert(root, sym, !strchr(name, '['));
> +	__symbols__insert(root, sym);
>  
>  	return 0;
>  }
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index c67814d6d6d6..f26f67bd7982 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -25,6 +25,7 @@ struct dso;
>  struct map;
>  struct maps;
>  struct option;
> +struct perf_env;
>  struct build_id;
>  
>  /*
> @@ -42,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
>  			     GElf_Shdr *shp, const char *name, size_t *idx);
>  #endif
>  
> +enum symbol_idle_kind {
> +	SYMBOL_IDLE__UNKNOWN = 0,
> +	SYMBOL_IDLE__NOT_IDLE = 1,
> +	SYMBOL_IDLE__IDLE = 2,
> +};
> +
>  /**
>   * A symtab entry. When allocated this may be preceded by an annotation (see
>   * symbol__annotation) and/or a browser_index (see symbol__browser_index).
> @@ -57,8 +64,8 @@ struct symbol {
>  	u8		type:4;
>  	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
>  	u8		binding:4;
> -	/** Set true for kernel symbols of idle routines. */
> -	u8		idle:1;
> +	/** Cache for symbol__is_idle. */
> +	enum symbol_idle_kind idle:2;
>  	/** Resolvable but tools ignore it (e.g. idle routines). */
>  	u8		ignore:1;
>  	/** Symbol for an inlined function. */
> @@ -202,8 +209,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
>  
>  char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
>  
> -void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
> -		       bool kernel);
> +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
>  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
>  void symbols__fixup_duplicate(struct rb_root_cached *symbols);
>  void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
> @@ -286,5 +292,6 @@ enum {
>  };
>  
>  int symbol__validate_sym_arguments(void);
> +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env);
>  
>  #endif /* __PERF_SYMBOL */


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2] perf symbol: Lazily compute idle and use the perf_env
  2026-03-26  7:20     ` Honglei Wang
@ 2026-03-26 15:11       ` Ian Rogers
  2026-03-26 17:45         ` [PATCH v3 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  0 siblings, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-26 15:11 UTC (permalink / raw)
  To: Honglei Wang
  Cc: acme, namhyung, tmricht, agordeev, gor, hca, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

On Thu, Mar 26, 2026 at 12:20 AM Honglei Wang <jameshongleiwang@126.com> wrote:
>
> Hi Ian,
>
> On 3/26/26 12:18 AM, Ian Rogers wrote:
> > Move the idle boolean to a helper symbol__is_idle function. In the
> > function lazily compute whether a symbol is an idle function taking
> > into consideration the kernel version and architecture of the
> > machine. As symbols__insert no longer needs to know if a symbol is for
> > the kernel, remove the argument.
> >
> > This change is inspired by mailing list discussion, particularly from
> > Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
> > <hca@linux.ibm.com>:
> > https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/
> >
> > The change switches x86 matches to use strstarts which means
> > intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
> > change suggested by Honglei Wang <jameshongleiwang@126.com> in:
> > https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> > v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/
> > ---
> >  tools/perf/builtin-top.c     |   6 +-
> >  tools/perf/util/symbol-elf.c |   2 +-
> >  tools/perf/util/symbol.c     | 105 ++++++++++++++++++++++-------------
> >  tools/perf/util/symbol.h     |  15 +++--
> >  4 files changed, 84 insertions(+), 44 deletions(-)
> >
> > diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> > index 37950efb28ac..bdc1c761cd61 100644
> > --- a/tools/perf/builtin-top.c
> > +++ b/tools/perf/builtin-top.c
> > @@ -751,6 +751,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
> >  {
> >       struct perf_top *top = container_of(tool, struct perf_top, tool);
> >       struct addr_location al;
> > +     struct dso *dso = NULL;
> >
> >       if (!machine && perf_guest) {
> >               static struct intlist *seen;
> > @@ -830,7 +831,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
> >               }
> >       }
> >
> > -     if (al.sym == NULL || !al.sym->idle) {
> > +     if (al.map)
> > +             dso = map__dso(al.map);
> > +
> > +     if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
> >               struct hists *hists = evsel__hists(evsel);
> >               struct hist_entry_iter iter = {
> >                       .evsel          = evsel,
> > diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> > index 3cd4e5a03cc5..9fabf5146d89 100644
> > --- a/tools/perf/util/symbol-elf.c
> > +++ b/tools/perf/util/symbol-elf.c
> > @@ -1723,7 +1723,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
> >
> >               arch__sym_update(f, &sym);
> >
> > -             __symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
> > +             __symbols__insert(dso__symbols(curr_dso), f);
> >               nr++;
> >       }
> >       dso__put(curr_dso);
> > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> > index ce9195717f44..1a357af93a0a 100644
> > --- a/tools/perf/util/symbol.c
> > +++ b/tools/perf/util/symbol.c
> > @@ -25,6 +25,8 @@
> >  #include "demangle-ocaml.h"
> >  #include "demangle-rust-v0.h"
> >  #include "dso.h"
> > +#include "dwarf-regs.h"
> > +#include "env.h"
> >  #include "util.h" // lsdir()
> >  #include "event.h"
> >  #include "machine.h"
> > @@ -50,7 +52,6 @@
> >
> >  static int dso__load_kernel_sym(struct dso *dso, struct map *map);
> >  static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
> > -static bool symbol__is_idle(const char *name);
> >
> >  int vmlinux_path__nr_entries;
> >  char **vmlinux_path;
> > @@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
> >       }
> >  }
> >
> > -void __symbols__insert(struct rb_root_cached *symbols,
> > -                    struct symbol *sym, bool kernel)
> > +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
> >  {
> >       struct rb_node **p = &symbols->rb_root.rb_node;
> >       struct rb_node *parent = NULL;
> > @@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
> >       struct symbol *s;
> >       bool leftmost = true;
> >
> > -     if (kernel) {
> > -             const char *name = sym->name;
> > -             /*
> > -              * ppc64 uses function descriptors and appends a '.' to the
> > -              * start of every instruction address. Remove it.
> > -              */
> > -             if (name[0] == '.')
> > -                     name++;
> > -             sym->idle = symbol__is_idle(name);
> > -     }
> > -
> >       while (*p != NULL) {
> >               parent = *p;
> >               s = rb_entry(parent, struct symbol, rb_node);
> > @@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
> >
> >  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
> >  {
> > -     __symbols__insert(symbols, sym, false);
> > +     __symbols__insert(symbols, sym);
> >  }
> >
> >  static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
> > @@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
> >
> >  void dso__insert_symbol(struct dso *dso, struct symbol *sym)
> >  {
> > -     __symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
> > +     __symbols__insert(dso__symbols(dso), sym);
> >
> >       /* update the symbol cache if necessary */
> >       if (dso__last_find_result_addr(dso) >= sym->start &&
> > @@ -716,47 +705,87 @@ int modules__parse(const char *filename, void *arg,
> >       return err;
> >  }
> >
> > +static int sym_name_cmp(const void *a, const void *b)
> > +{
> > +     const char *name = a;
> > +     const char *const *sym = b;
> > +
> > +     return strcmp(name, *sym);
> > +}
> > +
> >  /*
> >   * These are symbols in the kernel image, so make sure that
> >   * sym is from a kernel DSO.
> >   */
> > -static bool symbol__is_idle(const char *name)
> > +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env)
> >  {
> > -     const char * const idle_symbols[] = {
> > +     static const char * const idle_symbols[] = {
> >               "acpi_idle_do_entry",
> >               "acpi_processor_ffh_cstate_enter",
> >               "arch_cpu_idle",
> >               "cpu_idle",
> >               "cpu_startup_entry",
> > -             "idle_cpu",
> > -             "intel_idle",
> > -             "intel_idle_ibrs",
> >               "default_idle",
> > -             "native_safe_halt",
> >               "enter_idle",
> >               "exit_idle",
> > -             "mwait_idle",
> > -             "mwait_idle_with_hints",
> > -             "mwait_idle_with_hints.constprop.0",
> > +             "idle_cpu",
> > +             "native_safe_halt",
> >               "poll_idle",
> > -             "ppc64_runlatch_off",
> >               "pseries_dedicated_idle_sleep",
> > -             "psw_idle",
> > -             "psw_idle_exit",
> > -             NULL
> >       };
> > -     int i;
> > -     static struct strlist *idle_symbols_list;
> > +     const char *name = sym->name;
> > +     uint16_t e_machine = env ? env->e_machine : EM_HOST;
> >
> > -     if (idle_symbols_list)
> > -             return strlist__has_entry(idle_symbols_list, name);
> > +     if (sym->idle)
> > +             return sym->idle == SYMBOL_IDLE__IDLE;
> >
> > -     idle_symbols_list = strlist__new(NULL, NULL);
> > +     if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
> > +             sym->idle = SYMBOL_IDLE__NOT_IDLE;
> > +             return false;
> > +     }
> >
> > -     for (i = 0; idle_symbols[i]; i++)
> > -             strlist__add(idle_symbols_list, idle_symbols[i]);
> > +     /*
> > +      * ppc64 uses function descriptors and appends a '.' to the
> > +      * start of every instruction address. Remove it.
> > +      */
> > +     if (name[0] == '.')
> > +             name++;
> >
> > -     return strlist__has_entry(idle_symbols_list, name);
> > +     if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
> > +                 sizeof(idle_symbols[0]), sym_name_cmp)) {
> > +             sym->idle = SYMBOL_IDLE__IDLE;
> > +             return true;
> > +     }
> > +
> > +     if (e_machine == EM_386 || e_machine == EM_X86_64) {
>
> As said in another thread, intel_idle_irq was still there on my test
> machine. I did a bit of debugging and found e_machine == 0, so it couldn't
> reach this branch. After digging further, it looks like
> deliver_event()->perf_session__find_machine() returns a struct machine
> whose env->e_machine is 0. I'm still busy today and can't do more; I hope
> this clue helps.

I can see this: the env's e_machine isn't being lazily initialized for
the host like the arch is. I'll add a patch for this.

Thanks,
Ian

> Thanks,
> Honglei
>
> > +             if (strstarts(name, "mwait_idle") ||
> > +                 strstarts(name, "intel_idle")) {
> > +                     sym->idle = SYMBOL_IDLE__IDLE;
> > +                     return true;
> > +             }
> > +     }
> > +
> > +     if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
> > +             sym->idle = SYMBOL_IDLE__IDLE;
> > +             return true;
> > +     }
> > +
> > +     if (e_machine == EM_S390) {
> > +             int major = 0, minor = 0;
> > +             const char *release = env && env->os_release
> > +                     ? env->os_release : perf_version_string;
> > +
> > +             sscanf(release, "%d.%d", &major, &minor);
> > +
> > +             /* Before v6.10, s390 used psw_idle. */
> > +             if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
> > +                     sym->idle = SYMBOL_IDLE__IDLE;
> > +                     return true;
> > +             }
> > +     }
> > +
> > +     sym->idle = SYMBOL_IDLE__NOT_IDLE;
> > +     return false;
> >  }
> >
> >  static int map__process_kallsym_symbol(void *arg, const char *name,
> > @@ -785,7 +814,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
> >        * We will pass the symbols to the filter later, in
> >        * map__split_kallsyms, when we have split the maps per module
> >        */
> > -     __symbols__insert(root, sym, !strchr(name, '['));
> > +     __symbols__insert(root, sym);
> >
> >       return 0;
> >  }
> > diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> > index c67814d6d6d6..f26f67bd7982 100644
> > --- a/tools/perf/util/symbol.h
> > +++ b/tools/perf/util/symbol.h
> > @@ -25,6 +25,7 @@ struct dso;
> >  struct map;
> >  struct maps;
> >  struct option;
> > +struct perf_env;
> >  struct build_id;
> >
> >  /*
> > @@ -42,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
> >                            GElf_Shdr *shp, const char *name, size_t *idx);
> >  #endif
> >
> > +enum symbol_idle_kind {
> > +     SYMBOL_IDLE__UNKNOWN = 0,
> > +     SYMBOL_IDLE__NOT_IDLE = 1,
> > +     SYMBOL_IDLE__IDLE = 2,
> > +};
> > +
> >  /**
> >   * A symtab entry. When allocated this may be preceded by an annotation (see
> >   * symbol__annotation) and/or a browser_index (see symbol__browser_index).
> > @@ -57,8 +64,8 @@ struct symbol {
> >       u8              type:4;
> >       /** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
> >       u8              binding:4;
> > -     /** Set true for kernel symbols of idle routines. */
> > -     u8              idle:1;
> > +     /** Cache for symbol__is_idle. */
> > +     enum symbol_idle_kind idle:2;
> >       /** Resolvable but tools ignore it (e.g. idle routines). */
> >       u8              ignore:1;
> >       /** Symbol for an inlined function. */
> > @@ -202,8 +209,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
> >
> >  char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
> >
> > -void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
> > -                    bool kernel);
> > +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
> >  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
> >  void symbols__fixup_duplicate(struct rb_root_cached *symbols);
> >  void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
> > @@ -286,5 +292,6 @@ enum {
> >  };
> >
> >  int symbol__validate_sym_arguments(void);
> > +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, const struct perf_env *env);
> >
> >  #endif /* __PERF_SYMBOL */
>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v3 0/2] perf symbol/env: ELF machine clean up and lazy idle computation
  2026-03-26 15:11       ` Ian Rogers
@ 2026-03-26 17:45         ` Ian Rogers
  2026-03-26 17:45           ` [PATCH v3 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
                             ` (3 more replies)
  0 siblings, 4 replies; 80+ messages in thread
From: Ian Rogers @ 2026-03-26 17:45 UTC (permalink / raw)
  To: irogers
  Cc: acme, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, namhyung, sumanthk, tmricht

Add a helper to perf_env to compute the e_machine if it is
EM_NONE. Derive the value from the arch string if available. Similarly
derive the arch string from the ELF machine if available, for
consistency. This means perf's arch (machine type) is no longer
determined by uname but set to match that of the perf ELF executable.

Switch the idle computation to the point of use and lazily compute it,
rather than computing it for every symbol. Currently the only user is
`perf top`. At the point of use the perf_env is available and this can
be used to make sure the idle function computation is machine and
kernel version dependent.
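
As an illustration only (not part of the series), the lazy computation
described above can be sketched as a tri-state cache stored on the
symbol. All names here (`struct sym`, `sym_is_idle`, the two-entry name
check, the `lookups` counter) are made up for the sketch and are not
perf's types; the real check also takes the dso and perf_env into
account.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical names throughout; this mirrors the idea, not perf's code. */
enum idle_kind {
	IDLE_UNKNOWN = 0,	/* zero-initialized symbols start here */
	IDLE_NO = 1,
	IDLE_YES = 2,
};

struct sym {
	const char *name;
	enum idle_kind idle;	/* cached answer, filled in on first query */
	unsigned int lookups;	/* counts real name lookups, for illustration */
};

/* Stand-in for the real name matching; only two names are checked here. */
static bool name_is_idle(const char *name)
{
	return !strcmp(name, "default_idle") || !strcmp(name, "poll_idle");
}

static bool sym_is_idle(struct sym *s)
{
	/* Fast path: the answer was already computed for this symbol. */
	if (s->idle != IDLE_UNKNOWN)
		return s->idle == IDLE_YES;

	/* Slow path: compute once and remember the result. */
	s->lookups++;
	s->idle = name_is_idle(s->name) ? IDLE_YES : IDLE_NO;
	return s->idle == IDLE_YES;
}
```

Because the unknown state is 0, a zero-initialized symbol needs no work
at insertion time, and only symbols that are actually queried pay for
the name comparison.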

v3: Properly set up the e_machine coming from the perf_env as reported
    by Honglei Wang.

v2: Some minor white space clean up:
    https://lore.kernel.org/lkml/20260325161836.1029457-1-irogers@google.com/

v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/

Ian Rogers (2):
  perf env: Add perf_env__e_machine helper and use in perf_env__arch
  perf symbol: Lazily compute idle and use the perf_env

 tools/perf/builtin-top.c     |   6 +-
 tools/perf/util/env.c        | 179 +++++++++++++++++++++++++++--------
 tools/perf/util/env.h        |   1 +
 tools/perf/util/session.c    |  14 +--
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 105 ++++++++++++--------
 tools/perf/util/symbol.h     |  15 ++-
 7 files changed, 235 insertions(+), 87 deletions(-)

-- 
2.53.0.1018.g2bb0e51243-goog


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v3 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-03-26 17:45         ` [PATCH v3 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-03-26 17:45           ` Ian Rogers
  2026-03-26 17:45           ` [PATCH v3 2/2] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-03-26 17:45 UTC (permalink / raw)
  To: irogers
  Cc: acme, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, namhyung, sumanthk, tmricht

Add a helper that lazily computes the e_machine and falls back to
EM_HOST. Use the perf_env's arch to compute the e_machine if
available. Use a binary search for some efficiency in this, but handle
the somewhat complex rules for prefixes shared by several machines.
Switch perf_env__arch to be derived from the e_machine for consistency.
This switches arch from being uname derived to matching that of the perf
binary (via EM_HOST). Update session to use the helper, which may mean
using EM_HOST when no threads are available. This also updates the perf
data file header that gets the e_machine/e_flags from the session.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c     | 179 ++++++++++++++++++++++++++++++--------
 tools/perf/util/env.h     |   1 +
 tools/perf/util/session.c |  14 +--
 3 files changed, 151 insertions(+), 43 deletions(-)
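
Illustration only, not part of the patch: a minimal sketch of a
prefix-matching bsearch followed by a disambiguation step, as the commit
message describes. The `enum mach` values, the shortened table and
`arch_to_mach()` are hypothetical stand-ins for the ELF machine
constants and the helper added below. The sketch assumes the table is
kept in strcmp() order, where digits sort before letters (so "s390"
comes before "sa110").

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the EM_* constants. */
enum mach { MACH_NONE, MACH_ARM, MACH_ARM64, MACH_S390, MACH_X86_32, MACH_X86_64 };

struct prefix_map {
	const char *prefix;
	enum mach mach;
};

/* Must stay sorted in strcmp() order, otherwise bsearch can miss entries. */
static const struct prefix_map table[] = {
	{ "aarch64", MACH_ARM64 },
	{ "arm",     MACH_ARM },	/* also a prefix of "arm64" */
	{ "i386",    MACH_X86_32 },
	{ "i686",    MACH_X86_32 },
	{ "s390",    MACH_S390 },
	{ "sa110",   MACH_ARM },
	{ "x86",     MACH_X86_64 },	/* also a prefix of "x86_64" */
};

/* Compare only the first strlen(prefix) characters of the key. */
static int cmp_prefix(const void *key, const void *elem)
{
	const struct prefix_map *m = elem;

	return strncmp(key, m->prefix, strlen(m->prefix));
}

static enum mach arch_to_mach(const char *arch)
{
	const struct prefix_map *m;

	m = bsearch(arch, table, sizeof(table) / sizeof(table[0]),
		    sizeof(table[0]), cmp_prefix);
	if (!m)
		return MACH_NONE;

	/* Disambiguate prefixes shared by more than one machine. */
	switch (m->mach) {
	case MACH_ARM:
		return !strcmp(arch, "arm64") ? MACH_ARM64 : MACH_ARM;
	case MACH_X86_64:
		return !strcmp(arch, "x86_64") ? MACH_X86_64 : MACH_X86_32;
	default:
		return m->mach;
	}
}
```

With this table, uname-style strings such as "s390x" or "armv7l" and
perf-style strings such as "arm64" resolve through the same lookup; an
arch with no matching prefix yields MACH_NONE.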

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 93d475a80f14..304bd8245485 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,10 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "cpumap.h"
+#include "dwarf-regs.h"
 #include "debug.h"
 #include "env.h"
 #include "util/header.h"
 #include "util/rwsem.h"
 #include <linux/compiler.h>
+#include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/rbtree.h>
 #include <linux/string.h>
@@ -588,51 +590,154 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
 	zfree(&cache->size);
 }
 
+struct arch_to_e_machine {
+	const char *prefix;
+	uint16_t e_machine;
+};
+
 /*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
+ * A mapping from an arch prefix string to an ELF machine that can be used in a
+ * bsearch. Some arch prefixes are shared and need additional processing as
+ * marked next to the architecture. The prefixes handle both perf's architecture
+ * naming and those from uname.
  */
-static const char *normalize_arch(char *arch)
-{
-	if (!strcmp(arch, "x86_64"))
-		return "x86";
-	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-		return "x86";
-	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-		return "sparc";
-	if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
-		return "arm64";
-	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-		return "arm";
-	if (!strncmp(arch, "s390", 4))
-		return "s390";
-	if (!strncmp(arch, "parisc", 6))
-		return "parisc";
-	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-		return "powerpc";
-	if (!strncmp(arch, "mips", 4))
-		return "mips";
-	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-		return "sh";
-	if (!strncmp(arch, "loongarch", 9))
-		return "loongarch";
-
-	return arch;
+static const struct arch_to_e_machine prefix_to_e_machine[] = {
+	{"aarch64", EM_AARCH64},
+	{"alpha", EM_ALPHA},
+	{"arc", EM_ARC},
+	{"arm", EM_ARM}, /* Check also for EM_AARCH64. */
+	{"avr", EM_AVR},  /* Check also for EM_AVR32. */
+	{"bfin", EM_BLACKFIN},
+	{"blackfin", EM_BLACKFIN},
+	{"cris", EM_CRIS},
+	{"csky", EM_CSKY},
+	{"hppa", EM_PARISC},
+	{"i386", EM_386},
+	{"i486", EM_386},
+	{"i586", EM_386},
+	{"i686", EM_386},
+	{"loongarch", EM_LOONGARCH},
+	{"m32r", EM_M32R},
+	{"m68k", EM_68K},
+	{"microblaze", EM_MICROBLAZE},
+	{"mips", EM_MIPS},
+	{"msp430", EM_MSP430},
+	{"parisc", EM_PARISC},
+	{"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"ppc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"riscv", EM_RISCV},
+	{"s390", EM_S390},
+	{"sa110", EM_ARM},
+	{"sh", EM_SH},
+	{"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
+	{"sun4u", EM_SPARC},
+	{"x86", EM_X86_64}, /* Check also for EM_386. */
+	{"xtensa", EM_XTENSA},
+};
+
+static int compare_prefix(const void *key, const void *element)
+{
+	const char *search_key = key;
+	const struct arch_to_e_machine *map_element = element;
+	size_t prefix_len = strlen(map_element->prefix);
+
+	return strncmp(search_key, map_element->prefix, prefix_len);
+}
+
+static uint16_t perf_arch_to_e_machine(const char *perf_arch, bool is_64_bit)
+{
+	/* Binary search for a matching prefix. */
+	const struct arch_to_e_machine *result;
+
+	if (!perf_arch)
+		return EM_HOST;
+
+	result = bsearch(perf_arch,
+			 prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
+			 sizeof(prefix_to_e_machine[0]),
+			 compare_prefix);
+
+	if (!result) {
+		pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
+		return EM_NONE;
+	}
+
+	/* Handle conflicting prefixes. */
+	switch (result->e_machine) {
+	case EM_ARM:
+		return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
+	case EM_AVR:
+		return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
+	case EM_PPC:
+		return is_64_bit || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;
+	case EM_SPARC:
+		return is_64_bit || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
+	case EM_X86_64:
+		return is_64_bit || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
+	default:
+		return result->e_machine;
+	}
+}
+
+static const char *e_machine_to_perf_arch(uint16_t e_machine)
+{
+	/*
+	 * Table for if either the perf arch string differs from uname or there
+	 * are >1 ELF machine with the prefix.
+	 */
+	static const struct arch_to_e_machine extras[] = {
+		{"arm64", EM_AARCH64},
+		{"avr32", EM_AVR32},
+		{"powerpc", EM_PPC},
+		{"powerpc", EM_PPC64},
+		{"sparc", EM_SPARCV9},
+		{"x86", EM_386},
+		{"x86", EM_X86_64},
+		{"none", EM_NONE},
+	};
+
+	for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
+		if (extras[i].e_machine == e_machine)
+			return extras[i].prefix;
+	}
+
+	for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
+		if (prefix_to_e_machine[i].e_machine == e_machine)
+			return prefix_to_e_machine[i].prefix;
+
+	}
+	return "unknown";
+}
+
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
+{
+	if (!env) {
+		if (e_flags)
+			*e_flags = EF_HOST;
+
+		return EM_HOST;
+	}
+	if (env->e_machine == EM_NONE) {
+		env->e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
+
+		if (env->e_machine == EM_HOST)
+			env->e_flags = EF_HOST;
+	}
+	if (e_flags)
+		*e_flags = env->e_flags;
+
+	return env->e_machine;
 }
 
 const char *perf_env__arch(struct perf_env *env)
 {
-	char *arch_name;
+	if (!env)
+		return e_machine_to_perf_arch(EM_HOST);
 
-	if (!env || !env->arch) { /* Assume local operation */
-		static struct utsname uts = { .machine[0] = '\0', };
-		if (uts.machine[0] == '\0' && uname(&uts) < 0)
-			return NULL;
-		arch_name = uts.machine;
-	} else
-		arch_name = env->arch;
+	if (!env->arch)
+		env->arch = strdup(e_machine_to_perf_arch(perf_env__e_machine(env, /*e_flags=*/NULL)));
 
-	return normalize_arch(arch_name);
+	return env->arch;
 }
 
 #if defined(HAVE_LIBTRACEEVENT)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index a4501cbca375..91ff252712f4 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -186,6 +186,7 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
 
 void cpu_cache_level__free(struct cpu_cache_level *cache);
 
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(struct perf_env *env, int err);
 const char *perf_env__cpuid(struct perf_env *env);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 4b465abfa36c..dcc9bef303aa 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2996,14 +2996,16 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 		return EM_HOST;
 	}
 
+	/* Is the env caching an e_machine? */
 	env = perf_session__env(session);
-	if (env && env->e_machine != EM_NONE) {
-		if (e_flags)
-			*e_flags = env->e_flags;
-
-		return env->e_machine;
-	}
+	if (env && env->e_machine != EM_NONE)
+		return perf_env__e_machine(env, e_flags);
 
+	/*
+	 * Compute from threads, note this is more accurate than
+	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
+	 * mixed 32-bit and 64-bit threads.
+	 */
 	machines__for_each_thread(&session->machines,
 				  perf_session__e_machine_cb,
 				  &args);
-- 
2.53.0.1018.g2bb0e51243-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 2/2] perf symbol: Lazily compute idle and use the perf_env
  2026-03-26 17:45         ` [PATCH v3 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-03-26 17:45           ` [PATCH v3 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-03-26 17:45           ` Ian Rogers
  2026-03-27  6:56             ` Honglei Wang
  2026-03-27  4:50           ` [PATCH v4 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-03-27  6:00           ` [PATCH v2] perf tests task-analyzer: Write test files to tmpdir Ian Rogers
  3 siblings, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-26 17:45 UTC (permalink / raw)
  To: irogers
  Cc: acme, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, namhyung, sumanthk, tmricht

Move the idle boolean to a helper symbol__is_idle function. In the
function lazily compute whether a symbol is an idle function taking
into consideration the kernel version and architecture of the
machine. As symbols__insert no longer needs to know if a symbol is for
the kernel, remove the argument.

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/

The change switches x86 matches to use strstarts which means
intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
change suggested by Honglei Wang <jameshongleiwang@126.com> in:
https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
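
A minimal sketch of the prefix matching, assuming a hypothetical
demo_starts_with() helper that behaves like strstarts() (the helper and
demo_x86_idle_name are illustration only, not part of the patch):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical equivalent of the strstarts() helper used by the patch. */
static bool demo_starts_with(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Mirrors the x86 branch: any symbol with one of these prefixes matches. */
static bool demo_x86_idle_name(const char *name)
{
	return demo_starts_with(name, "mwait_idle") ||
	       demo_starts_with(name, "intel_idle");
}
```

With this, intel_idle_irq, intel_idle_ibrs and
mwait_idle_with_hints.constprop.0 all match without being listed
individually. The trade-off is that every symbol sharing one of the
prefixes is treated as idle.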

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-top.c     |   6 +-
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 105 ++++++++++++++++++++++-------------
 tools/perf/util/symbol.h     |  15 +++--
 4 files changed, 84 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 37950efb28ac..bdc1c761cd61 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -751,6 +751,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 {
 	struct perf_top *top = container_of(tool, struct perf_top, tool);
 	struct addr_location al;
+	struct dso *dso = NULL;
 
 	if (!machine && perf_guest) {
 		static struct intlist *seen;
@@ -830,7 +831,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		}
 	}
 
-	if (al.sym == NULL || !al.sym->idle) {
+	if (al.map)
+		dso = map__dso(al.map);
+
+	if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 3cd4e5a03cc5..9fabf5146d89 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1723,7 +1723,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index ce9195717f44..92bc28934f36 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -25,6 +25,8 @@
 #include "demangle-ocaml.h"
 #include "demangle-rust-v0.h"
 #include "dso.h"
+#include "dwarf-regs.h"
+#include "env.h"
 #include "util.h" // lsdir()
 #include "event.h"
 #include "machine.h"
@@ -50,7 +52,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
 
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		sym->idle = symbol__is_idle(name);
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -716,47 +705,87 @@ int modules__parse(const char *filename, void *arg,
 	return err;
 }
 
+static int sym_name_cmp(const void *a, const void *b)
+{
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
 /*
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env)
 {
-	const char * const idle_symbols[] = {
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	if (sym->idle)
+		return sym->idle == SYMBOL_IDLE__IDLE;
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		sym->idle = SYMBOL_IDLE__NOT_IDLE;
+		return false;
+	}
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
 
-	return strlist__has_entry(idle_symbols_list, name);
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
+
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
+
+	if (e_machine == EM_S390) {
+		int major = 0, minor = 0;
+		const char *release = env && env->os_release
+			? env->os_release : perf_version_string;
+
+		sscanf(release, "%d.%d", &major, &minor);
+
+		/* Before v6.10, s390 used psw_idle. */
+		if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	sym->idle = SYMBOL_IDLE__NOT_IDLE;
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -785,7 +814,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index c67814d6d6d6..65422c1c8fdb 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -25,6 +25,7 @@ struct dso;
 struct map;
 struct maps;
 struct option;
+struct perf_env;
 struct build_id;
 
 /*
@@ -42,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -57,8 +64,8 @@ struct symbol {
 	u8		type:4;
 	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
 	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
+	/** Cache for symbol__is_idle. */
+	enum symbol_idle_kind idle:2;
 	/** Resolvable but tools ignore it (e.g. idle routines). */
 	u8		ignore:1;
 	/** Symbol for an inlined function. */
@@ -202,8 +209,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
@@ -286,5 +292,6 @@ enum {
 };
 
 int symbol__validate_sym_arguments(void);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env);
 
 #endif /* __PERF_SYMBOL */
-- 
2.53.0.1018.g2bb0e51243-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v4 0/2] perf symbol/env: ELF machine clean up and lazy idle computation
  2026-03-26 17:45         ` [PATCH v3 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-03-26 17:45           ` [PATCH v3 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
  2026-03-26 17:45           ` [PATCH v3 2/2] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
@ 2026-03-27  4:50           ` Ian Rogers
  2026-03-27  4:50             ` [PATCH v4 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
  2026-03-27  4:50             ` [PATCH v4 2/2] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
  2026-03-27  6:00           ` [PATCH v2] perf tests task-analyzer: Write test files to tmpdir Ian Rogers
  3 siblings, 2 replies; 80+ messages in thread
From: Ian Rogers @ 2026-03-27  4:50 UTC (permalink / raw)
  To: acme, namhyung, tmricht
  Cc: irogers, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add a helper to perf_env to compute the e_machine if it is
EM_NONE. Derive the value from the arch string if available. Similarly
derive the arch string from the ELF machine if available, for
consistency. This means perf's arch (machine type) is no longer
determined by uname but set to match that of the perf ELF executable.

Switch the idle computation to the point of use and lazily compute it,
rather than computing it for every symbol. Currently the only user is
`perf top`. At the point of use the perf_env is available and this can
be used to make sure the idle function computation is machine and
kernel version dependent.

v4: Fix Sashiko issues: an array element wasn't sorted properly, the
    e_flags weren't returned properly, the idle type is changed to a
    u8 rather than an enum value, and the s390 version check for
    psw_idle is slightly reordered and tweaked.

v3: Properly set up the e_machine coming from the perf_env as reported
    by Honglei Wang.
    https://lore.kernel.org/lkml/20260326174521.1829203-1-irogers@google.com/

v2: Some minor white space clean up:
    https://lore.kernel.org/lkml/20260325161836.1029457-1-irogers@google.com/

v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/

Ian Rogers (2):
  perf env: Add perf_env__e_machine helper and use in perf_env__arch
  perf symbol: Lazily compute idle and use the perf_env

 tools/perf/builtin-top.c     |   6 +-
 tools/perf/util/env.c        | 185 ++++++++++++++++++++++++++++-------
 tools/perf/util/env.h        |   1 +
 tools/perf/util/session.c    |  14 +--
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 104 +++++++++++++-------
 tools/perf/util/symbol.h     |  15 ++-
 7 files changed, 240 insertions(+), 87 deletions(-)

-- 
2.53.0.1018.g2bb0e51243-goog


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v4 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-03-27  4:50           ` [PATCH v4 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-03-27  4:50             ` Ian Rogers
  2026-04-06  5:05               ` Namhyung Kim
  2026-03-27  4:50             ` [PATCH v4 2/2] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
  1 sibling, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-27  4:50 UTC (permalink / raw)
  To: acme, namhyung, tmricht
  Cc: irogers, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add a helper that lazily computes the e_machine and falls back to
EM_HOST. Use the perf_env's arch to compute the e_machine if
available. Use a binary search for some efficiency in this, but handle
the somewhat complex duplicate rules. Switch perf_env__arch to be
derived from the e_machine for consistency. This switches arch from
being uname derived to matching that of the perf binary (via EM_HOST).
Update session to use the helper, which may mean using EM_HOST when no
threads are available. This also updates the perf data file header
that gets the e_machine/e_flags from the session.
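
For illustration, the prefix binary search with duplicate handling can
be sketched as below. This is a reduced sketch, not the patch's table:
the demo_* names and the DEMO_* machine ids are hypothetical (the patch
uses the EM_* constants), and only a handful of prefixes are included:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical machine ids standing in for the EM_* constants. */
enum demo_machine {
	DEMO_NONE,
	DEMO_AARCH64,
	DEMO_ARM,
	DEMO_S390,
	DEMO_386,
	DEMO_X86_64,
};

struct demo_prefix {
	const char *prefix;
	enum demo_machine machine;
};

/* Must stay sorted by prefix, otherwise bsearch() misses entries. */
static const struct demo_prefix demo_prefixes[] = {
	{"aarch64", DEMO_AARCH64},
	{"arm", DEMO_ARM},	/* Also matches "arm64". */
	{"s390", DEMO_S390},
	{"x86", DEMO_X86_64},	/* Also matches 32-bit "x86". */
};

/* Compare only as many characters as the table prefix has. */
static int demo_compare_prefix(const void *key, const void *element)
{
	const struct demo_prefix *e = element;

	return strncmp(key, e->prefix, strlen(e->prefix));
}

static enum demo_machine demo_arch_to_machine(const char *arch, int is_64_bit)
{
	const struct demo_prefix *hit;

	hit = bsearch(arch, demo_prefixes,
		      sizeof(demo_prefixes) / sizeof(demo_prefixes[0]),
		      sizeof(demo_prefixes[0]), demo_compare_prefix);
	if (!hit)
		return DEMO_NONE;

	/* Resolve prefixes that are shared by more than one machine. */
	switch (hit->machine) {
	case DEMO_ARM:
		return !strcmp(arch, "arm64") ? DEMO_AARCH64 : DEMO_ARM;
	case DEMO_X86_64:
		return is_64_bit || !strcmp(arch, "x86_64") ?
			DEMO_X86_64 : DEMO_386;
	default:
		return hit->machine;
	}
}
```

The comparator treats a key as equal to an entry when the entry is a
prefix of the key, so uname style strings such as "s390x" or "armv7l"
land on the "s390" and "arm" entries. This relies on no table entry
being a prefix of another entry.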

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c     | 185 ++++++++++++++++++++++++++++++--------
 tools/perf/util/env.h     |   1 +
 tools/perf/util/session.c |  14 +--
 3 files changed, 157 insertions(+), 43 deletions(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 93d475a80f14..ae08178870d7 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,10 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "cpumap.h"
+#include "dwarf-regs.h"
 #include "debug.h"
 #include "env.h"
 #include "util/header.h"
 #include "util/rwsem.h"
 #include <linux/compiler.h>
+#include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/rbtree.h>
 #include <linux/string.h>
@@ -588,51 +590,160 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
 	zfree(&cache->size);
 }
 
+struct arch_to_e_machine {
+	const char *prefix;
+	uint16_t e_machine;
+};
+
 /*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
+ * A mapping from an arch prefix string to an ELF machine that can be used in a
+ * bsearch. Some arch prefixes are shared and need additional processing as
+ * marked next to the architecture. The prefixes handle both perf's architecture
+ * naming and those from uname.
  */
-static const char *normalize_arch(char *arch)
-{
-	if (!strcmp(arch, "x86_64"))
-		return "x86";
-	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-		return "x86";
-	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-		return "sparc";
-	if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
-		return "arm64";
-	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-		return "arm";
-	if (!strncmp(arch, "s390", 4))
-		return "s390";
-	if (!strncmp(arch, "parisc", 6))
-		return "parisc";
-	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-		return "powerpc";
-	if (!strncmp(arch, "mips", 4))
-		return "mips";
-	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-		return "sh";
-	if (!strncmp(arch, "loongarch", 9))
-		return "loongarch";
-
-	return arch;
+static const struct arch_to_e_machine prefix_to_e_machine[] = {
+	{"aarch64", EM_AARCH64},
+	{"alpha", EM_ALPHA},
+	{"arc", EM_ARC},
+	{"arm", EM_ARM}, /* Check also for EM_AARCH64. */
+	{"avr", EM_AVR},  /* Check also for EM_AVR32. */
+	{"bfin", EM_BLACKFIN},
+	{"blackfin", EM_BLACKFIN},
+	{"cris", EM_CRIS},
+	{"csky", EM_CSKY},
+	{"hppa", EM_PARISC},
+	{"i386", EM_386},
+	{"i486", EM_386},
+	{"i586", EM_386},
+	{"i686", EM_386},
+	{"loongarch", EM_LOONGARCH},
+	{"m32r", EM_M32R},
+	{"m68k", EM_68K},
+	{"microblaze", EM_MICROBLAZE},
+	{"mips", EM_MIPS},
+	{"msp430", EM_MSP430},
+	{"parisc", EM_PARISC},
+	{"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"ppc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"riscv", EM_RISCV},
+	{"s390", EM_S390},
+	{"sa110", EM_ARM},
+	{"sh", EM_SH},
+	{"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
+	{"sun4u", EM_SPARC},
+	{"x86", EM_X86_64}, /* Check also for EM_386. */
+	{"xtensa", EM_XTENSA},
+};
+
+static int compare_prefix(const void *key, const void *element)
+{
+	const char *search_key = key;
+	const struct arch_to_e_machine *map_element = element;
+	size_t prefix_len = strlen(map_element->prefix);
+
+	return strncmp(search_key, map_element->prefix, prefix_len);
+}
+
+static uint16_t perf_arch_to_e_machine(const char *perf_arch, bool is_64_bit)
+{
+	/* Binary search for a matching prefix. */
+	const struct arch_to_e_machine *result;
+
+	if (!perf_arch)
+		return EM_HOST;
+
+	result = bsearch(perf_arch,
+			 prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
+			 sizeof(prefix_to_e_machine[0]),
+			 compare_prefix);
+
+	if (!result) {
+		pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
+		return EM_NONE;
+	}
+
+	/* Handle conflicting prefixes. */
+	switch (result->e_machine) {
+	case EM_ARM:
+		return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
+	case EM_AVR:
+		return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
+	case EM_PPC:
+		return is_64_bit || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;
+	case EM_SPARC:
+		return is_64_bit || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
+	case EM_X86_64:
+		return is_64_bit || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
+	default:
+		return result->e_machine;
+	}
+}
+
+static const char *e_machine_to_perf_arch(uint16_t e_machine)
+{
+	/*
+	 * Table for if either the perf arch string differs from uname or there
+	 * are >1 ELF machine with the prefix.
+	 */
+	static const struct arch_to_e_machine extras[] = {
+		{"arm64", EM_AARCH64},
+		{"avr32", EM_AVR32},
+		{"powerpc", EM_PPC},
+		{"powerpc", EM_PPC64},
+		{"sparc", EM_SPARCV9},
+		{"x86", EM_386},
+		{"x86", EM_X86_64},
+		{"none", EM_NONE},
+	};
+
+	for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
+		if (extras[i].e_machine == e_machine)
+			return extras[i].prefix;
+	}
+
+	for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
+		if (prefix_to_e_machine[i].e_machine == e_machine)
+			return prefix_to_e_machine[i].prefix;
+
+	}
+	return "unknown";
+}
+
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
+{
+	if (!env) {
+		if (e_flags)
+			*e_flags = EF_HOST;
+
+		return EM_HOST;
+	}
+	if (env->e_machine == EM_NONE) {
+		env->e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
+
+		if (env->e_machine == EM_HOST)
+			env->e_flags = EF_HOST;
+	}
+	if (e_flags)
+		*e_flags = env->e_flags;
+
+	return env->e_machine;
 }
 
 const char *perf_env__arch(struct perf_env *env)
 {
-	char *arch_name;
+	if (!env)
+		return e_machine_to_perf_arch(EM_HOST);
 
-	if (!env || !env->arch) { /* Assume local operation */
-		static struct utsname uts = { .machine[0] = '\0', };
-		if (uts.machine[0] == '\0' && uname(&uts) < 0)
-			return NULL;
-		arch_name = uts.machine;
-	} else
-		arch_name = env->arch;
+	if (!env->arch) {
+		/*
+		 * Lazily compute/allocate arch. The e_machine may have been
+		 * read from a data file and so may not be EM_HOST.
+		 */
+		uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	return normalize_arch(arch_name);
+		env->arch = strdup(e_machine_to_perf_arch(e_machine));
+	}
+	return env->arch;
 }
 
 #if defined(HAVE_LIBTRACEEVENT)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index a4501cbca375..91ff252712f4 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -186,6 +186,7 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
 
 void cpu_cache_level__free(struct cpu_cache_level *cache);
 
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(struct perf_env *env, int err);
 const char *perf_env__cpuid(struct perf_env *env);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 4b465abfa36c..dcc9bef303aa 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2996,14 +2996,16 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 		return EM_HOST;
 	}
 
+	/* Is the env caching an e_machine? */
 	env = perf_session__env(session);
-	if (env && env->e_machine != EM_NONE) {
-		if (e_flags)
-			*e_flags = env->e_flags;
-
-		return env->e_machine;
-	}
+	if (env && env->e_machine != EM_NONE)
+		return perf_env__e_machine(env, e_flags);
 
+	/*
+	 * Compute from threads, note this is more accurate than
+	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
+	 * mixed 32-bit and 64-bit threads.
+	 */
 	machines__for_each_thread(&session->machines,
 				  perf_session__e_machine_cb,
 				  &args);
-- 
2.53.0.1018.g2bb0e51243-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v4 2/2] perf symbol: Lazily compute idle and use the perf_env
  2026-03-27  4:50           ` [PATCH v4 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-03-27  4:50             ` [PATCH v4 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-03-27  4:50             ` Ian Rogers
  2026-04-06  5:10               ` Namhyung Kim
  1 sibling, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-27  4:50 UTC (permalink / raw)
  To: acme, namhyung, tmricht
  Cc: irogers, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Move the idle boolean to a helper symbol__is_idle function. In the
function lazily compute whether a symbol is an idle function taking
into consideration the kernel version and architecture of the
machine. As symbols__insert no longer needs to know if a symbol is for
the kernel, remove the argument.
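
As a sketch of the kernel version dependence on s390, the release check
in this version of the patch can be isolated as follows. The function
name demo_release_before_6_10 is hypothetical; the logic follows the
hunk below, where a release string that cannot be parsed is treated
like an old kernel:

```c
#include <stdbool.h>
#include <stdio.h>

/* True if the release string names a kernel older than v6.10. */
static bool demo_release_before_6_10(const char *release)
{
	int major = 0, minor = 0;

	/* An unparseable release is handled like an old kernel. */
	if (sscanf(release, "%d.%d", &major, &minor) != 2)
		return true;

	return major < 6 || (major == 6 && minor < 10);
}
```

For example "6.9.0" and "5.15.0" count as before v6.10, while
"6.10.0-rc1" and "7.0" do not.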

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/

The change switches x86 matches to use strstarts which means
intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
change suggested by Honglei Wang <jameshongleiwang@126.com> in:
https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-top.c     |   6 +-
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 104 ++++++++++++++++++++++-------------
 tools/perf/util/symbol.h     |  15 +++--
 4 files changed, 83 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 37950efb28ac..bdc1c761cd61 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -751,6 +751,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 {
 	struct perf_top *top = container_of(tool, struct perf_top, tool);
 	struct addr_location al;
+	struct dso *dso = NULL;
 
 	if (!machine && perf_guest) {
 		static struct intlist *seen;
@@ -830,7 +831,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		}
 	}
 
-	if (al.sym == NULL || !al.sym->idle) {
+	if (al.map)
+		dso = map__dso(al.map);
+
+	if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 3cd4e5a03cc5..9fabf5146d89 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1723,7 +1723,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index ce9195717f44..9ff709edeb88 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -25,6 +25,8 @@
 #include "demangle-ocaml.h"
 #include "demangle-rust-v0.h"
 #include "dso.h"
+#include "dwarf-regs.h"
+#include "env.h"
 #include "util.h" // lsdir()
 #include "event.h"
 #include "machine.h"
@@ -50,7 +52,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
 
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		sym->idle = symbol__is_idle(name);
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -716,47 +705,86 @@ int modules__parse(const char *filename, void *arg,
 	return err;
 }
 
+static int sym_name_cmp(const void *a, const void *b)
+{
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
 /*
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env)
 {
-	const char * const idle_symbols[] = {
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+
+	if (sym->idle)
+		return sym->idle == SYMBOL_IDLE__IDLE;
+
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		sym->idle = SYMBOL_IDLE__NOT_IDLE;
+		return false;
+	}
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
 
-	return strlist__has_entry(idle_symbols_list, name);
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
+
+	if (e_machine == EM_S390 && strstarts(name, "psw_idle")) {
+		int major = 0, minor = 0;
+		const char *release = env && env->os_release
+			? env->os_release : perf_version_string;
+
+		/* Before v6.10, s390 used psw_idle. */
+		if (sscanf(release, "%d.%d", &major, &minor) != 2 ||
+		    major < 6 || (major == 6 && minor < 10)) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	sym->idle = SYMBOL_IDLE__NOT_IDLE;
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -785,7 +813,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index c67814d6d6d6..2f5f90f547aa 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -25,6 +25,7 @@ struct dso;
 struct map;
 struct maps;
 struct option;
+struct perf_env;
 struct build_id;
 
 /*
@@ -42,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -57,8 +64,8 @@ struct symbol {
 	u8		type:4;
 	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
 	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
+	/** Cache for symbol__is_idle holding enum symbol_idle_kind values. */
+	u8		idle:2;
 	/** Resolvable but tools ignore it (e.g. idle routines). */
 	u8		ignore:1;
 	/** Symbol for an inlined function. */
@@ -202,8 +209,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
@@ -286,5 +292,6 @@ enum {
 };
 
 int symbol__validate_sym_arguments(void);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env);
 
 #endif /* __PERF_SYMBOL */
-- 
2.53.0.1018.g2bb0e51243-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v2] perf tests task-analyzer: Write test files to tmpdir
  2026-03-26 17:45         ` [PATCH v3 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                             ` (2 preceding siblings ...)
  2026-03-27  4:50           ` [PATCH v4 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-03-27  6:00           ` Ian Rogers
  2026-03-31  7:22             ` Namhyung Kim
  3 siblings, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-27  6:00 UTC (permalink / raw)
  To: acme, namhyung
  Cc: irogers, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk, tmricht

Writing the test output files to the current working directory can
fail in various contexts, such as continuous testing. Other tests write
to a mktemp-ed file; make the "perf script task-analyzer tests" follow
this convention too. Currently this isn't possible for the perf.data
file due to a lack of perf script support, so add a variable for when
this support is available.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/shell/test_task_analyzer.sh | 38 +++++++++++---------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/tools/perf/tests/shell/test_task_analyzer.sh b/tools/perf/tests/shell/test_task_analyzer.sh
index e194fcf61df3..b1a6a7e017e4 100755
--- a/tools/perf/tests/shell/test_task_analyzer.sh
+++ b/tools/perf/tests/shell/test_task_analyzer.sh
@@ -3,6 +3,11 @@
 # SPDX-License-Identifier: GPL-2.0
 
 tmpdir=$(mktemp -d /tmp/perf-script-task-analyzer-XXXXX)
+# TODO: perf script report only supports input from the CWD perf.data file, make
+# it support input from any file.
+perfdata="perf.data"
+csv="$tmpdir/csv"
+csvsummary="$tmpdir/csvsummary"
 err=0
 
 # set PERF_EXEC_PATH to find scripts in the source directory
@@ -15,11 +20,10 @@ fi
 export ASAN_OPTIONS=detect_leaks=0
 
 cleanup() {
-  rm -f perf.data
-  rm -f perf.data.old
-  rm -f csv
-  rm -f csvsummary
+  rm -f "${perfdata}"
+  rm -f "${perfdata}".old
   rm -rf "$tmpdir"
+
   trap - exit term int
 }
 
@@ -61,7 +65,7 @@ skip_no_probe_record_support() {
 
 prepare_perf_data() {
 	# 1s should be sufficient to catch at least some switches
-	perf record -e sched:sched_switch -a -- sleep 1 > /dev/null 2>&1
+	perf record -e sched:sched_switch -a -o "${perfdata}" -- sleep 1 > /dev/null 2>&1
 	# check if perf data file got created in above step.
 	if [ ! -e "perf.data" ]; then
 		printf "FAIL: perf record failed to create \"perf.data\" \n"
@@ -130,28 +134,28 @@ test_extended_times_summary_ns() {
 }
 
 test_csv() {
-	perf script report task-analyzer --csv csv > /dev/null
-	check_exec_0 "perf script report task-analyzer --csv csv"
-	find_str_or_fail "Comm;" csv "${FUNCNAME[0]}"
+	perf script report task-analyzer --csv "${csv}" > /dev/null
+	check_exec_0 "perf script report task-analyzer --csv ${csv}"
+	find_str_or_fail "Comm;" "${csv}" "${FUNCNAME[0]}"
 }
 
 test_csv_extended_times() {
-	perf script report task-analyzer --csv csv --extended-times > /dev/null
-	check_exec_0 "perf script report task-analyzer --csv csv --extended-times"
-	find_str_or_fail "Out-Out;" csv "${FUNCNAME[0]}"
+	perf script report task-analyzer --csv "${csv}" --extended-times > /dev/null
+	check_exec_0 "perf script report task-analyzer --csv ${csv} --extended-times"
+	find_str_or_fail "Out-Out;" "${csv}" "${FUNCNAME[0]}"
 }
 
 test_csvsummary() {
-	perf script report task-analyzer --csv-summary csvsummary > /dev/null
-	check_exec_0 "perf script report task-analyzer --csv-summary csvsummary"
-	find_str_or_fail "Comm;" csvsummary "${FUNCNAME[0]}"
+	perf script report task-analyzer --csv-summary "${csvsummary}" > /dev/null
+	check_exec_0 "perf script report task-analyzer --csv-summary ${csvsummary}"
+	find_str_or_fail "Comm;" "${csvsummary}" "${FUNCNAME[0]}"
 }
 
 test_csvsummary_extended() {
-	perf script report task-analyzer --csv-summary csvsummary --summary-extended \
+	perf script report task-analyzer --csv-summary "${csvsummary}" --summary-extended \
 	>/dev/null
-	check_exec_0 "perf script report task-analyzer --csv-summary csvsummary --summary-extended"
-	find_str_or_fail "Out-Out;" csvsummary "${FUNCNAME[0]}"
+	check_exec_0 "perf script report task-analyzer --csv-summary ${csvsummary} --summary-extended"
+	find_str_or_fail "Out-Out;" "${csvsummary}" "${FUNCNAME[0]}"
 }
 
 skip_no_probe_record_support
-- 
2.53.0.1018.g2bb0e51243-goog



* Re: [PATCH v3 2/2] perf symbol: Lazily compute idle and use the perf_env
  2026-03-26 17:45           ` [PATCH v3 2/2] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
@ 2026-03-27  6:56             ` Honglei Wang
  0 siblings, 0 replies; 80+ messages in thread
From: Honglei Wang @ 2026-03-27  6:56 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, agordeev, gor, hca, japo, linux-kernel, linux-perf-users,
	linux-s390, namhyung, sumanthk, tmricht

Hi Ian,

FYI. It works on my icx machine with 'perf top'.

Thanks,
Honglei

On 3/27/26 1:45 AM, Ian Rogers wrote:
> Move the idle boolean to a helper symbol__is_idle function. In the
> function lazily compute whether a symbol is an idle function taking
> into consideration the kernel version and architecture of the
> machine. As symbols__insert no longer needs to know if a symbol is for
> the kernel, remove the argument.
> 
> This change is inspired by mailing list discussion, particularly from
> Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
> <hca@linux.ibm.com>:
> https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/
> 
> The change switches x86 matches to use strstarts which means
> intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
> change suggested by Honglei Wang <jameshongleiwang@126.com> in:
> https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/builtin-top.c     |   6 +-
>  tools/perf/util/symbol-elf.c |   2 +-
>  tools/perf/util/symbol.c     | 105 ++++++++++++++++++++++-------------
>  tools/perf/util/symbol.h     |  15 +++--
>  4 files changed, 84 insertions(+), 44 deletions(-)
> 
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 37950efb28ac..bdc1c761cd61 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -751,6 +751,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
>  {
>  	struct perf_top *top = container_of(tool, struct perf_top, tool);
>  	struct addr_location al;
> +	struct dso *dso = NULL;
>  
>  	if (!machine && perf_guest) {
>  		static struct intlist *seen;
> @@ -830,7 +831,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
>  		}
>  	}
>  
> -	if (al.sym == NULL || !al.sym->idle) {
> +	if (al.map)
> +		dso = map__dso(al.map);
> +
> +	if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
>  		struct hists *hists = evsel__hists(evsel);
>  		struct hist_entry_iter iter = {
>  			.evsel		= evsel,
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 3cd4e5a03cc5..9fabf5146d89 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1723,7 +1723,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>  
>  		arch__sym_update(f, &sym);
>  
> -		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
> +		__symbols__insert(dso__symbols(curr_dso), f);
>  		nr++;
>  	}
>  	dso__put(curr_dso);
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index ce9195717f44..92bc28934f36 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -25,6 +25,8 @@
>  #include "demangle-ocaml.h"
>  #include "demangle-rust-v0.h"
>  #include "dso.h"
> +#include "dwarf-regs.h"
> +#include "env.h"
>  #include "util.h" // lsdir()
>  #include "event.h"
>  #include "machine.h"
> @@ -50,7 +52,6 @@
>  
>  static int dso__load_kernel_sym(struct dso *dso, struct map *map);
>  static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
> -static bool symbol__is_idle(const char *name);
>  
>  int vmlinux_path__nr_entries;
>  char **vmlinux_path;
> @@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
>  	}
>  }
>  
> -void __symbols__insert(struct rb_root_cached *symbols,
> -		       struct symbol *sym, bool kernel)
> +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
>  {
>  	struct rb_node **p = &symbols->rb_root.rb_node;
>  	struct rb_node *parent = NULL;
> @@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
>  	struct symbol *s;
>  	bool leftmost = true;
>  
> -	if (kernel) {
> -		const char *name = sym->name;
> -		/*
> -		 * ppc64 uses function descriptors and appends a '.' to the
> -		 * start of every instruction address. Remove it.
> -		 */
> -		if (name[0] == '.')
> -			name++;
> -		sym->idle = symbol__is_idle(name);
> -	}
> -
>  	while (*p != NULL) {
>  		parent = *p;
>  		s = rb_entry(parent, struct symbol, rb_node);
> @@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
>  
>  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
>  {
> -	__symbols__insert(symbols, sym, false);
> +	__symbols__insert(symbols, sym);
>  }
>  
>  static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
> @@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
>  
>  void dso__insert_symbol(struct dso *dso, struct symbol *sym)
>  {
> -	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
> +	__symbols__insert(dso__symbols(dso), sym);
>  
>  	/* update the symbol cache if necessary */
>  	if (dso__last_find_result_addr(dso) >= sym->start &&
> @@ -716,47 +705,87 @@ int modules__parse(const char *filename, void *arg,
>  	return err;
>  }
>  
> +static int sym_name_cmp(const void *a, const void *b)
> +{
> +	const char *name = a;
> +	const char *const *sym = b;
> +
> +	return strcmp(name, *sym);
> +}
> +
>  /*
>   * These are symbols in the kernel image, so make sure that
>   * sym is from a kernel DSO.
>   */
> -static bool symbol__is_idle(const char *name)
> +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env)
>  {
> -	const char * const idle_symbols[] = {
> +	static const char * const idle_symbols[] = {
>  		"acpi_idle_do_entry",
>  		"acpi_processor_ffh_cstate_enter",
>  		"arch_cpu_idle",
>  		"cpu_idle",
>  		"cpu_startup_entry",
> -		"idle_cpu",
> -		"intel_idle",
> -		"intel_idle_ibrs",
>  		"default_idle",
> -		"native_safe_halt",
>  		"enter_idle",
>  		"exit_idle",
> -		"mwait_idle",
> -		"mwait_idle_with_hints",
> -		"mwait_idle_with_hints.constprop.0",
> +		"idle_cpu",
> +		"native_safe_halt",
>  		"poll_idle",
> -		"ppc64_runlatch_off",
>  		"pseries_dedicated_idle_sleep",
> -		"psw_idle",
> -		"psw_idle_exit",
> -		NULL
>  	};
> -	int i;
> -	static struct strlist *idle_symbols_list;
> +	const char *name = sym->name;
> +	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
>  
> -	if (idle_symbols_list)
> -		return strlist__has_entry(idle_symbols_list, name);
> +	if (sym->idle)
> +		return sym->idle == SYMBOL_IDLE__IDLE;
>  
> -	idle_symbols_list = strlist__new(NULL, NULL);
> +	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
> +		sym->idle = SYMBOL_IDLE__NOT_IDLE;
> +		return false;
> +	}
>  
> -	for (i = 0; idle_symbols[i]; i++)
> -		strlist__add(idle_symbols_list, idle_symbols[i]);
> +	/*
> +	 * ppc64 uses function descriptors and appends a '.' to the
> +	 * start of every instruction address. Remove it.
> +	 */
> +	if (name[0] == '.')
> +		name++;
>  
> -	return strlist__has_entry(idle_symbols_list, name);
> +	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
> +		    sizeof(idle_symbols[0]), sym_name_cmp)) {
> +		sym->idle = SYMBOL_IDLE__IDLE;
> +		return true;
> +	}
> +
> +	if (e_machine == EM_386 || e_machine == EM_X86_64) {
> +		if (strstarts(name, "mwait_idle") ||
> +		    strstarts(name, "intel_idle")) {
> +			sym->idle = SYMBOL_IDLE__IDLE;
> +			return true;
> +		}
> +	}
> +
> +	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
> +		sym->idle = SYMBOL_IDLE__IDLE;
> +		return true;
> +	}
> +
> +	if (e_machine == EM_S390) {
> +		int major = 0, minor = 0;
> +		const char *release = env && env->os_release
> +			? env->os_release : perf_version_string;
> +
> +		sscanf(release, "%d.%d", &major, &minor);
> +
> +		/* Before v6.10, s390 used psw_idle. */
> +		if ((major < 6 || (major == 6 && minor < 10)) && strstarts(name, "psw_idle")) {
> +			sym->idle = SYMBOL_IDLE__IDLE;
> +			return true;
> +		}
> +	}
> +
> +	sym->idle = SYMBOL_IDLE__NOT_IDLE;
> +	return false;
>  }
>  
>  static int map__process_kallsym_symbol(void *arg, const char *name,
> @@ -785,7 +814,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
>  	 * We will pass the symbols to the filter later, in
>  	 * map__split_kallsyms, when we have split the maps per module
>  	 */
> -	__symbols__insert(root, sym, !strchr(name, '['));
> +	__symbols__insert(root, sym);
>  
>  	return 0;
>  }
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index c67814d6d6d6..65422c1c8fdb 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -25,6 +25,7 @@ struct dso;
>  struct map;
>  struct maps;
>  struct option;
> +struct perf_env;
>  struct build_id;
>  
>  /*
> @@ -42,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
>  			     GElf_Shdr *shp, const char *name, size_t *idx);
>  #endif
>  
> +enum symbol_idle_kind {
> +	SYMBOL_IDLE__UNKNOWN = 0,
> +	SYMBOL_IDLE__NOT_IDLE = 1,
> +	SYMBOL_IDLE__IDLE = 2,
> +};
> +
>  /**
>   * A symtab entry. When allocated this may be preceded by an annotation (see
>   * symbol__annotation) and/or a browser_index (see symbol__browser_index).
> @@ -57,8 +64,8 @@ struct symbol {
>  	u8		type:4;
>  	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
>  	u8		binding:4;
> -	/** Set true for kernel symbols of idle routines. */
> -	u8		idle:1;
> +	/** Cache for symbol__is_idle. */
> +	enum symbol_idle_kind idle:2;
>  	/** Resolvable but tools ignore it (e.g. idle routines). */
>  	u8		ignore:1;
>  	/** Symbol for an inlined function. */
> @@ -202,8 +209,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
>  
>  char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
>  
> -void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
> -		       bool kernel);
> +void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
>  void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
>  void symbols__fixup_duplicate(struct rb_root_cached *symbols);
>  void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
> @@ -286,5 +292,6 @@ enum {
>  };
>  
>  int symbol__validate_sym_arguments(void);
> +bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env);
>  
>  #endif /* __PERF_SYMBOL */
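
For anyone following along, the lazy tri-state cache described in the
quoted commit message can be sketched standalone as below. This is only a
minimal illustration under assumptions, not the patch's code:
struct sketch_symbol, compute_idle(), release_before() and the boolean
architecture flags are hypothetical stand-ins for struct symbol, the
perf_env lookup and the e_machine checks, and the name table is
deliberately shortened.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for enum symbol_idle_kind; 0 means "not computed". */
enum idle_kind { IDLE_UNKNOWN = 0, IDLE_NOT_IDLE = 1, IDLE_IDLE = 2 };

/* Hypothetical stand-in for struct symbol: only the fields used here. */
struct sketch_symbol {
	const char *name;
	unsigned char idle:2;	/* two bits are enough for three states */
};

/* Shortened list of exact matches; must stay sorted for bsearch(). */
static const char *const exact_idle_names[] = {
	"arch_cpu_idle",
	"default_idle",
	"poll_idle",
};

static int name_cmp(const void *key, const void *elem)
{
	return strcmp(key, *(const char *const *)elem);
}

static bool starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * True when the "major.minor" at the start of release is older than the
 * given version. An unparsable string counts as "not older"; that is a
 * choice of this sketch, not something taken from the patch.
 */
static bool release_before(const char *release, int want_major, int want_minor)
{
	int major = 0, minor = 0;

	if (sscanf(release, "%d.%d", &major, &minor) != 2)
		return false;
	return major < want_major || (major == want_major && minor < want_minor);
}

/* The expensive part: exact names first, then per-architecture prefixes. */
static bool compute_idle(const char *name, bool is_x86, bool is_s390,
			 const char *release)
{
	if (bsearch(name, exact_idle_names,
		    sizeof(exact_idle_names) / sizeof(exact_idle_names[0]),
		    sizeof(exact_idle_names[0]), name_cmp))
		return true;
	if (is_x86 && (starts_with(name, "mwait_idle") ||
		       starts_with(name, "intel_idle")))
		return true;
	if (is_s390 && release_before(release, 6, 10) &&
	    starts_with(name, "psw_idle"))
		return true;
	return false;
}

/* Compute once per symbol, then answer from the cached two-bit field. */
static bool sketch_is_idle(struct sketch_symbol *sym, bool is_x86,
			   bool is_s390, const char *release)
{
	if (sym->idle == IDLE_UNKNOWN)
		sym->idle = compute_idle(sym->name, is_x86, is_s390, release)
			? IDLE_IDLE : IDLE_NOT_IDLE;
	return sym->idle == IDLE_IDLE;
}
```

The point of the third state is that a zero-initialised symbol needs no
work at insert time; the classification only happens for symbols that are
actually sampled.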



* Re: [PATCH v2] perf tests task-analyzer: Write test files to tmpdir
  2026-03-27  6:00           ` [PATCH v2] perf tests task-analyzer: Write test files to tmpdir Ian Rogers
@ 2026-03-31  7:22             ` Namhyung Kim
  2026-03-31 17:58               ` Ian Rogers
  0 siblings, 1 reply; 80+ messages in thread
From: Namhyung Kim @ 2026-03-31  7:22 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk, tmricht

I'm curious why this patch is in the idle symbol thread.


On Thu, Mar 26, 2026 at 11:00:33PM -0700, Ian Rogers wrote:
> Writing the test output files to the current working directory can
> fail in various contexts, such as continual testing. Other tests write to
> a mktemp-ed file; make the "perf script task-analyzer tests" follow
> this convention too. Currently this isn't possible for the perf.data
> file due to a lack of perf script support, so add a variable for when
> this support is available.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/tests/shell/test_task_analyzer.sh | 38 +++++++++++---------
>  1 file changed, 21 insertions(+), 17 deletions(-)
> 
> diff --git a/tools/perf/tests/shell/test_task_analyzer.sh b/tools/perf/tests/shell/test_task_analyzer.sh
> index e194fcf61df3..b1a6a7e017e4 100755
> --- a/tools/perf/tests/shell/test_task_analyzer.sh
> +++ b/tools/perf/tests/shell/test_task_analyzer.sh
> @@ -3,6 +3,11 @@
>  # SPDX-License-Identifier: GPL-2.0
>  
>  tmpdir=$(mktemp -d /tmp/perf-script-task-analyzer-XXXXX)
> +# TODO: perf script report only supports input from the CWD perf.data file, make
> +# it support input from any file.
> +perfdata="perf.data"
> +csv="$tmpdir/csv"
> +csvsummary="$tmpdir/csvsummary"
>  err=0
>  
>  # set PERF_EXEC_PATH to find scripts in the source directory
> @@ -15,11 +20,10 @@ fi
>  export ASAN_OPTIONS=detect_leaks=0
>  
>  cleanup() {
> -  rm -f perf.data
> -  rm -f perf.data.old
> -  rm -f csv
> -  rm -f csvsummary
> +  rm -f "${perfdata}"
> +  rm -f "${perfdata}".old
>    rm -rf "$tmpdir"
> +
>    trap - exit term int
>  }
>  
> @@ -61,7 +65,7 @@ skip_no_probe_record_support() {
>  
>  prepare_perf_data() {
>  	# 1s should be sufficient to catch at least some switches
> -	perf record -e sched:sched_switch -a -- sleep 1 > /dev/null 2>&1
> +	perf record -e sched:sched_switch -a -o "${perfdata}" -- sleep 1 > /dev/null 2>&1
>  	# check if perf data file got created in above step.
>  	if [ ! -e "perf.data" ]; then
>  		printf "FAIL: perf record failed to create \"perf.data\" \n"

Please update this part too.

Thanks,
Namhyung


> @@ -130,28 +134,28 @@ test_extended_times_summary_ns() {
>  }
>  
>  test_csv() {
> -	perf script report task-analyzer --csv csv > /dev/null
> -	check_exec_0 "perf script report task-analyzer --csv csv"
> -	find_str_or_fail "Comm;" csv "${FUNCNAME[0]}"
> +	perf script report task-analyzer --csv "${csv}" > /dev/null
> +	check_exec_0 "perf script report task-analyzer --csv ${csv}"
> +	find_str_or_fail "Comm;" "${csv}" "${FUNCNAME[0]}"
>  }
>  
>  test_csv_extended_times() {
> -	perf script report task-analyzer --csv csv --extended-times > /dev/null
> -	check_exec_0 "perf script report task-analyzer --csv csv --extended-times"
> -	find_str_or_fail "Out-Out;" csv "${FUNCNAME[0]}"
> +	perf script report task-analyzer --csv "${csv}" --extended-times > /dev/null
> +	check_exec_0 "perf script report task-analyzer --csv ${csv} --extended-times"
> +	find_str_or_fail "Out-Out;" "${csv}" "${FUNCNAME[0]}"
>  }
>  
>  test_csvsummary() {
> -	perf script report task-analyzer --csv-summary csvsummary > /dev/null
> -	check_exec_0 "perf script report task-analyzer --csv-summary csvsummary"
> -	find_str_or_fail "Comm;" csvsummary "${FUNCNAME[0]}"
> +	perf script report task-analyzer --csv-summary "${csvsummary}" > /dev/null
> +	check_exec_0 "perf script report task-analyzer --csv-summary ${csvsummary}"
> +	find_str_or_fail "Comm;" "${csvsummary}" "${FUNCNAME[0]}"
>  }
>  
>  test_csvsummary_extended() {
> -	perf script report task-analyzer --csv-summary csvsummary --summary-extended \
> +	perf script report task-analyzer --csv-summary "${csvsummary}" --summary-extended \
>  	>/dev/null
> -	check_exec_0 "perf script report task-analyzer --csv-summary csvsummary --summary-extended"
> -	find_str_or_fail "Out-Out;" csvsummary "${FUNCNAME[0]}"
> +	check_exec_0 "perf script report task-analyzer --csv-summary ${csvsummary} --summary-extended"
> +	find_str_or_fail "Out-Out;" "${csvsummary}" "${FUNCNAME[0]}"
>  }
>  
>  skip_no_probe_record_support
> -- 
> 2.53.0.1018.g2bb0e51243-goog
> 


* Re: [PATCH v2] perf tests task-analyzer: Write test files to tmpdir
  2026-03-31  7:22             ` Namhyung Kim
@ 2026-03-31 17:58               ` Ian Rogers
  2026-04-01  3:41                 ` Namhyung Kim
  0 siblings, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-03-31 17:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: acme, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk, tmricht

On Tue, Mar 31, 2026 at 12:22 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> I'm curious why this patch is in the idle symbol thread.

I'll separate it, I was gathering fixes. Same branch has the BPF
counters test fix in it:
https://lore.kernel.org/lkml/20260325171653.1091337-1-irogers@google.com/

> On Thu, Mar 26, 2026 at 11:00:33PM -0700, Ian Rogers wrote:
> > Writing the test output files to the current working directory can
> > fail in various contexts, such as continual testing. Other tests write to
> > a mktemp-ed file; make the "perf script task-analyzer tests" follow
> > this convention too. Currently this isn't possible for the perf.data
> > file due to a lack of perf script support, so add a variable for when
> > this support is available.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/tests/shell/test_task_analyzer.sh | 38 +++++++++++---------
> >  1 file changed, 21 insertions(+), 17 deletions(-)
> >
> > diff --git a/tools/perf/tests/shell/test_task_analyzer.sh b/tools/perf/tests/shell/test_task_analyzer.sh
> > index e194fcf61df3..b1a6a7e017e4 100755
> > --- a/tools/perf/tests/shell/test_task_analyzer.sh
> > +++ b/tools/perf/tests/shell/test_task_analyzer.sh
> > @@ -3,6 +3,11 @@
> >  # SPDX-License-Identifier: GPL-2.0
> >
> >  tmpdir=$(mktemp -d /tmp/perf-script-task-analyzer-XXXXX)
> > +# TODO: perf script report only supports input from the CWD perf.data file, make
> > +# it support input from any file.
> > +perfdata="perf.data"
> > +csv="$tmpdir/csv"
> > +csvsummary="$tmpdir/csvsummary"
> >  err=0
> >
> >  # set PERF_EXEC_PATH to find scripts in the source directory
> > @@ -15,11 +20,10 @@ fi
> >  export ASAN_OPTIONS=detect_leaks=0
> >
> >  cleanup() {
> > -  rm -f perf.data
> > -  rm -f perf.data.old
> > -  rm -f csv
> > -  rm -f csvsummary
> > +  rm -f "${perfdata}"
> > +  rm -f "${perfdata}".old
> >    rm -rf "$tmpdir"
> > +
> >    trap - exit term int
> >  }
> >
> > @@ -61,7 +65,7 @@ skip_no_probe_record_support() {
> >
> >  prepare_perf_data() {
> >       # 1s should be sufficient to catch at least some switches
> > -     perf record -e sched:sched_switch -a -- sleep 1 > /dev/null 2>&1
> > +     perf record -e sched:sched_switch -a -o "${perfdata}" -- sleep 1 > /dev/null 2>&1
> >       # check if perf data file got created in above step.
> >       if [ ! -e "perf.data" ]; then
> >               printf "FAIL: perf record failed to create \"perf.data\" \n"
>
> Please update this part too.

Done.

Thanks,
Ian

> Thanks,
> Namhyung
>
>
> > @@ -130,28 +134,28 @@ test_extended_times_summary_ns() {
> >  }
> >
> >  test_csv() {
> > -     perf script report task-analyzer --csv csv > /dev/null
> > -     check_exec_0 "perf script report task-analyzer --csv csv"
> > -     find_str_or_fail "Comm;" csv "${FUNCNAME[0]}"
> > +     perf script report task-analyzer --csv "${csv}" > /dev/null
> > +     check_exec_0 "perf script report task-analyzer --csv ${csv}"
> > +     find_str_or_fail "Comm;" "${csv}" "${FUNCNAME[0]}"
> >  }
> >
> >  test_csv_extended_times() {
> > -     perf script report task-analyzer --csv csv --extended-times > /dev/null
> > -     check_exec_0 "perf script report task-analyzer --csv csv --extended-times"
> > -     find_str_or_fail "Out-Out;" csv "${FUNCNAME[0]}"
> > +     perf script report task-analyzer --csv "${csv}" --extended-times > /dev/null
> > +     check_exec_0 "perf script report task-analyzer --csv ${csv} --extended-times"
> > +     find_str_or_fail "Out-Out;" "${csv}" "${FUNCNAME[0]}"
> >  }
> >
> >  test_csvsummary() {
> > -     perf script report task-analyzer --csv-summary csvsummary > /dev/null
> > -     check_exec_0 "perf script report task-analyzer --csv-summary csvsummary"
> > -     find_str_or_fail "Comm;" csvsummary "${FUNCNAME[0]}"
> > +     perf script report task-analyzer --csv-summary "${csvsummary}" > /dev/null
> > +     check_exec_0 "perf script report task-analyzer --csv-summary ${csvsummary}"
> > +     find_str_or_fail "Comm;" "${csvsummary}" "${FUNCNAME[0]}"
> >  }
> >
> >  test_csvsummary_extended() {
> > -     perf script report task-analyzer --csv-summary csvsummary --summary-extended \
> > +     perf script report task-analyzer --csv-summary "${csvsummary}" --summary-extended \
> >       >/dev/null
> > -     check_exec_0 "perf script report task-analyzer --csv-summary csvsummary --summary-extended"
> > -     find_str_or_fail "Out-Out;" csvsummary "${FUNCNAME[0]}"
> > +     check_exec_0 "perf script report task-analyzer --csv-summary ${csvsummary} --summary-extended"
> > +     find_str_or_fail "Out-Out;" "${csvsummary}" "${FUNCNAME[0]}"
> >  }
> >
> >  skip_no_probe_record_support
> > --
> > 2.53.0.1018.g2bb0e51243-goog
> >


* Re: [PATCH v2] perf tests task-analyzer: Write test files to tmpdir
  2026-03-31 17:58               ` Ian Rogers
@ 2026-04-01  3:41                 ` Namhyung Kim
  0 siblings, 0 replies; 80+ messages in thread
From: Namhyung Kim @ 2026-04-01  3:41 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk, tmricht

On Tue, Mar 31, 2026 at 10:58:55AM -0700, Ian Rogers wrote:
> On Tue, Mar 31, 2026 at 12:22 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > I'm curious why this patch is in the idle symbol thread.
> 
> I'll separate it, I was gathering fixes. Same branch has the BPF
> counters test fix in it:
> https://lore.kernel.org/lkml/20260325171653.1091337-1-irogers@google.com/

Ok, I'll test and process it.

Thanks,
Namhyung


* Re: [PATCH v4 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-03-27  4:50             ` [PATCH v4 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-04-06  5:05               ` Namhyung Kim
  2026-04-06 15:36                 ` Ian Rogers
  0 siblings, 1 reply; 80+ messages in thread
From: Namhyung Kim @ 2026-04-06  5:05 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, tmricht, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk

On Thu, Mar 26, 2026 at 09:50:24PM -0700, Ian Rogers wrote:
> Add a helper that lazily computes the e_machine and falls back to
> EM_HOST. Use the perf_env's arch to compute the e_machine if
> available. Use a binary search for some efficiency in this, but handle
> somewhat complex duplicate rules. Switch perf_env__arch to be derived
> from the e_machine for consistency. This switches arch from being uname
> derived to matching that of the perf binary (via EM_HOST). Update
> session to use the helper, which may mean using EM_HOST when no
> threads are available. This also updates the perf data file header
> that gets the e_machine/e_flags from the session.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/util/env.c     | 185 ++++++++++++++++++++++++++++++--------
>  tools/perf/util/env.h     |   1 +
>  tools/perf/util/session.c |  14 +--
>  3 files changed, 157 insertions(+), 43 deletions(-)
> 
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index 93d475a80f14..ae08178870d7 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -1,10 +1,12 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include "cpumap.h"
> +#include "dwarf-regs.h"
>  #include "debug.h"
>  #include "env.h"
>  #include "util/header.h"
>  #include "util/rwsem.h"
>  #include <linux/compiler.h>
> +#include <linux/kernel.h>
>  #include <linux/ctype.h>
>  #include <linux/rbtree.h>
>  #include <linux/string.h>
> @@ -588,51 +590,160 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
>  	zfree(&cache->size);
>  }
>  
> +struct arch_to_e_machine {
> +	const char *prefix;
> +	uint16_t e_machine;
> +};
> +
>  /*
> - * Return architecture name in a normalized form.
> - * The conversion logic comes from the Makefile.
> + * A mapping from an arch prefix string to an ELF machine that can be used in a
> + * bsearch. Some arch prefixes are shared and need additional processing as
> + * marked next to the architecture. The prefixes handle both perf's architecture
> + * naming and those from uname.
>   */
> -static const char *normalize_arch(char *arch)
> -{
> -	if (!strcmp(arch, "x86_64"))
> -		return "x86";
> -	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
> -		return "x86";
> -	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
> -		return "sparc";
> -	if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
> -		return "arm64";
> -	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
> -		return "arm";
> -	if (!strncmp(arch, "s390", 4))
> -		return "s390";
> -	if (!strncmp(arch, "parisc", 6))
> -		return "parisc";
> -	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
> -		return "powerpc";
> -	if (!strncmp(arch, "mips", 4))
> -		return "mips";
> -	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
> -		return "sh";
> -	if (!strncmp(arch, "loongarch", 9))
> -		return "loongarch";
> -
> -	return arch;
> +static const struct arch_to_e_machine prefix_to_e_machine[] = {
> +	{"aarch64", EM_AARCH64},
> +	{"alpha", EM_ALPHA},
> +	{"arc", EM_ARC},
> +	{"arm", EM_ARM}, /* Check also for EM_AARCH64. */
> +	{"avr", EM_AVR},  /* Check also for EM_AVR32. */
> +	{"bfin", EM_BLACKFIN},
> +	{"blackfin", EM_BLACKFIN},
> +	{"cris", EM_CRIS},
> +	{"csky", EM_CSKY},
> +	{"hppa", EM_PARISC},
> +	{"i386", EM_386},
> +	{"i486", EM_386},
> +	{"i586", EM_386},
> +	{"i686", EM_386},
> +	{"loongarch", EM_LOONGARCH},
> +	{"m32r", EM_M32R},
> +	{"m68k", EM_68K},
> +	{"microblaze", EM_MICROBLAZE},
> +	{"mips", EM_MIPS},
> +	{"msp430", EM_MSP430},
> +	{"parisc", EM_PARISC},
> +	{"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
> +	{"ppc", EM_PPC}, /* Check also for EM_PPC64. */
> +	{"riscv", EM_RISCV},
> +	{"s390", EM_S390},
> +	{"sa110", EM_ARM},
> +	{"sh", EM_SH},
> +	{"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
> +	{"sun4u", EM_SPARC},
> +	{"x86", EM_X86_64}, /* Check also for EM_386. */
> +	{"xtensa", EM_XTENSA},
> +};
> +
> +static int compare_prefix(const void *key, const void *element)
> +{
> +	const char *search_key = key;
> +	const struct arch_to_e_machine *map_element = element;
> +	size_t prefix_len = strlen(map_element->prefix);
> +
> +	return strncmp(search_key, map_element->prefix, prefix_len);
> +}
> +
> +static uint16_t perf_arch_to_e_machine(const char *perf_arch, bool is_64_bit)
> +{
> +	/* Binary search for a matching prefix. */
> +	const struct arch_to_e_machine *result;
> +
> +	if (!perf_arch)
> +		return EM_HOST;
> +
> +	result = bsearch(perf_arch,
> +			 prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
> +			 sizeof(prefix_to_e_machine[0]),
> +			 compare_prefix);
> +
> +	if (!result) {
> +		pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
> +		return EM_NONE;
> +	}
> +
> +	/* Handle conflicting prefixes. */
> +	switch (result->e_machine) {
> +	case EM_ARM:
> +		return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
> +	case EM_AVR:
> +		return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
> +	case EM_PPC:
> +		return is_64_bit || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;

I'm curious what name `uname -m` returns for PPC64.  Is
"powerpc64" possible?


> +	case EM_SPARC:
> +		return is_64_bit || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
> +	case EM_X86_64:
> +		return is_64_bit || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
> +	default:
> +		return result->e_machine;
> +	}
> +}
> +
> +static const char *e_machine_to_perf_arch(uint16_t e_machine)
> +{
> +	/*
> +	 * Table for if either the perf arch string differs from uname or there
> +	 * are >1 ELF machine with the prefix.
> +	 */
> +	static const struct arch_to_e_machine extras[] = {
> +		{"arm64", EM_AARCH64},
> +		{"avr32", EM_AVR32},
> +		{"powerpc", EM_PPC},
> +		{"powerpc", EM_PPC64},

Here it returns powerpc for both.
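
To make the PPC question above concrete, here is a minimal standalone
sketch of the prefix lookup as I read it from the quoted hunk. It is an
illustration under assumptions, not the patch itself: enum sketch_machine,
struct prefix_entry and arch_to_machine() are hypothetical names, the
table is cut down to three entries, and whether uname ever reports
"powerpc64" is exactly what is left open.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical machine ids; the patch uses EM_PPC / EM_PPC64 from <elf.h>. */
enum sketch_machine { SK_NONE, SK_PPC, SK_PPC64, SK_S390 };

struct prefix_entry {
	const char *prefix;
	enum sketch_machine machine;
};

/* Cut-down table, sorted by prefix as bsearch() requires. */
static const struct prefix_entry prefixes[] = {
	{"powerpc", SK_PPC},	/* may still be 64-bit, resolved below */
	{"ppc", SK_PPC},	/* may still be 64-bit, resolved below */
	{"s390", SK_S390},
};

/* Compare only the first strlen(prefix) characters of the key. */
static int prefix_cmp(const void *key, const void *elem)
{
	const struct prefix_entry *entry = elem;

	return strncmp(key, entry->prefix, strlen(entry->prefix));
}

static enum sketch_machine arch_to_machine(const char *arch, bool is_64_bit)
{
	const struct prefix_entry *hit;

	hit = bsearch(arch, prefixes, sizeof(prefixes) / sizeof(prefixes[0]),
		      sizeof(prefixes[0]), prefix_cmp);
	if (!hit)
		return SK_NONE;

	/* Same shape as the quoted EM_PPC case: flag or a "ppc64" prefix. */
	if (hit->machine == SK_PPC)
		return (is_64_bit || !strncmp(arch, "ppc64", 5))
			? SK_PPC64 : SK_PPC;

	return hit->machine;
}
```

In this sketch "ppc64le" resolves to the 64-bit machine on its name alone,
while "powerpc64" matches the "powerpc" prefix and only becomes 64-bit when
is_64_bit is set, since the name check looks for "ppc64" specifically.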


> +		{"sparc", EM_SPARCV9},
> +		{"x86", EM_386},
> +		{"x86", EM_X86_64},
> +		{"none", EM_NONE},
> +	};
> +
> +	for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
> +		if (extras[i].e_machine == e_machine)
> +			return extras[i].prefix;
> +	}
> +
> +	for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
> +		if (prefix_to_e_machine[i].e_machine == e_machine)
> +			return prefix_to_e_machine[i].prefix;
> +
> +	}
> +	return "unknown";
> +}
> +
> +uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
> +{
> +	if (!env) {
> +		if (e_flags)
> +			*e_flags = EF_HOST;
> +
> +		return EM_HOST;
> +	}
> +	if (env->e_machine == EM_NONE) {
> +		env->e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
> +
> +		if (env->e_machine == EM_HOST)
> +			env->e_flags = EF_HOST;
> +	}
> +	if (e_flags)
> +		*e_flags = env->e_flags;
> +
> +	return env->e_machine;
>  }
>  
>  const char *perf_env__arch(struct perf_env *env)
>  {
> -	char *arch_name;
> +	if (!env)
> +		return e_machine_to_perf_arch(EM_HOST);
>  
> -	if (!env || !env->arch) { /* Assume local operation */
> -		static struct utsname uts = { .machine[0] = '\0', };
> -		if (uts.machine[0] == '\0' && uname(&uts) < 0)
> -			return NULL;
> -		arch_name = uts.machine;
> -	} else
> -		arch_name = env->arch;
> +	if (!env->arch) {
> +		/*
> +		 * Lazily compute/allocate arch. The e_machine may have been
> +		 * read from a data file and so may not be EM_HOST.
> +		 */
> +		uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
>  
> -	return normalize_arch(arch_name);
> +		env->arch = strdup(e_machine_to_perf_arch(e_machine));
> +	}
> +	return env->arch;
>  }
>  
>  #if defined(HAVE_LIBTRACEEVENT)
> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index a4501cbca375..91ff252712f4 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -186,6 +186,7 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
>  
>  void cpu_cache_level__free(struct cpu_cache_level *cache);
>  
> +uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
>  const char *perf_env__arch(struct perf_env *env);
>  const char *perf_env__arch_strerrno(struct perf_env *env, int err);
>  const char *perf_env__cpuid(struct perf_env *env);
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 4b465abfa36c..dcc9bef303aa 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -2996,14 +2996,16 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
>  		return EM_HOST;
>  	}
>  
> +	/* Is the env caching an e_machine? */
>  	env = perf_session__env(session);
> -	if (env && env->e_machine != EM_NONE) {
> -		if (e_flags)
> -			*e_flags = env->e_flags;
> -
> -		return env->e_machine;
> -	}
> +	if (env && env->e_machine != EM_NONE)
> +		return perf_env__e_machine(env, e_flags);
>  
> +	/*
> +	 * Compute from threads, note this is more accurate than
> +	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
> +	 * mixed 32-bit and 64-bit threads.
> +	 */
>  	machines__for_each_thread(&session->machines,
>  				  perf_session__e_machine_cb,
>  				  &args);
> -- 
> 2.53.0.1018.g2bb0e51243-goog
> 


* Re: [PATCH v4 2/2] perf symbol: Lazily compute idle and use the perf_env
  2026-03-27  4:50             ` [PATCH v4 2/2] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
@ 2026-04-06  5:10               ` Namhyung Kim
  2026-04-06 16:11                 ` Ian Rogers
  0 siblings, 1 reply; 80+ messages in thread
From: Namhyung Kim @ 2026-04-06  5:10 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, tmricht, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk

On Thu, Mar 26, 2026 at 09:50:25PM -0700, Ian Rogers wrote:
> Move the idle boolean to a helper symbol__is_idle function. In the
> function lazily compute whether a symbol is an idle function taking
> into consideration the kernel version and architecture of the
> machine. As symbols__insert no longer needs to know if a symbol is for
> the kernel, remove the argument.
> 
> This change is inspired by mailing list discussion, particularly from
> Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
> <hca@linux.ibm.com>:
> https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/
> 
> The change switches x86 matches to use strstarts which means
> intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
> change suggested by Honglei Wang <jameshongleiwang@126.com> in:
> https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
[SNIP]
> +	if (e_machine == EM_S390 && strstarts(name, "psw_idle")) {
> +		int major = 0, minor = 0;
> +		const char *release = env && env->os_release
> +			? env->os_release : perf_version_string;

I think Sashiko's review is right.  You need to check the kernel version
instead of perf.

Thanks,
Namhyung

> +
> +		/* Before v6.10, s390 used psw_idle. */
> +		if (sscanf(release, "%d.%d", &major, &minor) != 2 ||
> +		    major < 6 || (major == 6 && minor < 10)) {
> +			sym->idle = SYMBOL_IDLE__IDLE;
> +			return true;
> +		}
> +	}
> +
> +	sym->idle = SYMBOL_IDLE__NOT_IDLE;
> +	return false;
>  }


* Re: [PATCH v4 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-04-06  5:05               ` Namhyung Kim
@ 2026-04-06 15:36                 ` Ian Rogers
  0 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-06 15:36 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: acme, tmricht, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk

On Sun, Apr 5, 2026 at 10:05 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, Mar 26, 2026 at 09:50:24PM -0700, Ian Rogers wrote:
> > > Add a helper that lazily computes the e_machine and falls back on
> > > EM_HOST. Use the perf_env's arch to compute the e_machine if
> > > available. Use a binary search for some efficiency in this, but handle
> > > somewhat complex duplicate rules. Switch perf_env__arch to be derived
> > > from the e_machine for consistency. This switches arch from being
> > > uname-derived to matching that of the perf binary (via EM_HOST). Update
> > > session to use the helper, which may mean using EM_HOST when no
> > > threads are available. This also updates the perf data file header
> > > that gets the e_machine/e_flags from the session.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/util/env.c     | 185 ++++++++++++++++++++++++++++++--------
> >  tools/perf/util/env.h     |   1 +
> >  tools/perf/util/session.c |  14 +--
> >  3 files changed, 157 insertions(+), 43 deletions(-)
> >
> > diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> > index 93d475a80f14..ae08178870d7 100644
> > --- a/tools/perf/util/env.c
> > +++ b/tools/perf/util/env.c
> > @@ -1,10 +1,12 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >  #include "cpumap.h"
> > +#include "dwarf-regs.h"
> >  #include "debug.h"
> >  #include "env.h"
> >  #include "util/header.h"
> >  #include "util/rwsem.h"
> >  #include <linux/compiler.h>
> > +#include <linux/kernel.h>
> >  #include <linux/ctype.h>
> >  #include <linux/rbtree.h>
> >  #include <linux/string.h>
> > @@ -588,51 +590,160 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
> >       zfree(&cache->size);
> >  }
> >
> > +struct arch_to_e_machine {
> > +     const char *prefix;
> > +     uint16_t e_machine;
> > +};
> > +
> >  /*
> > - * Return architecture name in a normalized form.
> > - * The conversion logic comes from the Makefile.
> > + * A mapping from an arch prefix string to an ELF machine that can be used in a
> > + * bsearch. Some arch prefixes are shared an need additional processing as
> > + * marked next to the architecture. The prefixes handle both perf's architecture
> > + * naming and those from uname.
> >   */
> > -static const char *normalize_arch(char *arch)
> > -{
> > -     if (!strcmp(arch, "x86_64"))
> > -             return "x86";
> > -     if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
> > -             return "x86";
> > -     if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
> > -             return "sparc";
> > -     if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
> > -             return "arm64";
> > -     if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
> > -             return "arm";
> > -     if (!strncmp(arch, "s390", 4))
> > -             return "s390";
> > -     if (!strncmp(arch, "parisc", 6))
> > -             return "parisc";
> > -     if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
> > -             return "powerpc";
> > -     if (!strncmp(arch, "mips", 4))
> > -             return "mips";
> > -     if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
> > -             return "sh";
> > -     if (!strncmp(arch, "loongarch", 9))
> > -             return "loongarch";
> > -
> > -     return arch;
> > +static const struct arch_to_e_machine prefix_to_e_machine[] = {
> > +     {"aarch64", EM_AARCH64},
> > +     {"alpha", EM_ALPHA},
> > +     {"arc", EM_ARC},
> > +     {"arm", EM_ARM}, /* Check also for EM_AARCH64. */
> > +     {"avr", EM_AVR},  /* Check also for EM_AVR32. */
> > +     {"bfin", EM_BLACKFIN},
> > +     {"blackfin", EM_BLACKFIN},
> > +     {"cris", EM_CRIS},
> > +     {"csky", EM_CSKY},
> > +     {"hppa", EM_PARISC},
> > +     {"i386", EM_386},
> > +     {"i486", EM_386},
> > +     {"i586", EM_386},
> > +     {"i686", EM_386},
> > +     {"loongarch", EM_LOONGARCH},
> > +     {"m32r", EM_M32R},
> > +     {"m68k", EM_68K},
> > +     {"microblaze", EM_MICROBLAZE},
> > +     {"mips", EM_MIPS},
> > +     {"msp430", EM_MSP430},
> > +     {"parisc", EM_PARISC},
> > +     {"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
> > +     {"ppc", EM_PPC}, /* Check also for EM_PPC64. */
> > +     {"riscv", EM_RISCV},
> > +     {"s390", EM_S390},
> > +     {"sa110", EM_ARM},
> > +     {"sh", EM_SH},
> > +     {"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
> > +     {"sun4u", EM_SPARC},
> > +     {"x86", EM_X86_64}, /* Check also for EM_386. */
> > +     {"xtensa", EM_XTENSA},
> > +};
> > +
> > +static int compare_prefix(const void *key, const void *element)
> > +{
> > +     const char *search_key = key;
> > +     const struct arch_to_e_machine *map_element = element;
> > +     size_t prefix_len = strlen(map_element->prefix);
> > +
> > +     return strncmp(search_key, map_element->prefix, prefix_len);
> > +}
> > +
> > +static uint16_t perf_arch_to_e_machine(const char *perf_arch, bool is_64_bit)
> > +{
> > +     /* Binary search for a matching prefix. */
> > +     const struct arch_to_e_machine *result;
> > +
> > +     if (!perf_arch)
> > +             return EM_HOST;
> > +
> > +     result = bsearch(perf_arch,
> > +                      prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
> > +                      sizeof(prefix_to_e_machine[0]),
> > +                      compare_prefix);
> > +
> > +     if (!result) {
> > +             pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
> > +             return EM_NONE;
> > +     }
> > +
> > +     /* Handle conflicting prefixes. */
> > +     switch (result->e_machine) {
> > +     case EM_ARM:
> > +             return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
> > +     case EM_AVR:
> > +             return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
> > +     case EM_PPC:
> > +             return is_64_bit || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;
>
> I'm curious what's the name `uname -m` returns for PPC64.  Is
> "powerpc64" possible?

It is.

> > +     case EM_SPARC:
> > +             return is_64_bit || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
> > +     case EM_X86_64:
> > +             return is_64_bit || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
> > +     default:
> > +             return result->e_machine;
> > +     }
> > +}
> > +
> > +static const char *e_machine_to_perf_arch(uint16_t e_machine)
> > +{
> > +     /*
> > +      * Table for if either the perf arch string differs from uname or there
> > +      * are >1 ELF machine with the prefix.
> > +      */
> > +     static const struct arch_to_e_machine extras[] = {
> > +             {"arm64", EM_AARCH64},
> > +             {"avr32", EM_AVR32},
> > +             {"powerpc", EM_PPC},
> > +             {"powerpc", EM_PPC64},
>
> Here it returns powerpc for both.

Yep. This is 100% intentional as the existing code does the same:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/env.c?h=perf-tools-next#n611
```
static const char *normalize_arch(char *arch)
...
if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
return "powerpc";
```
The strncmp is limited to just the prefix of the uname string,
ignoring the 64. So the arch "powerpc" can be 32-bit or 64-bit, just
as "x86" can be 32-bit or 64-bit. To determine which case applies, the
code should really check `struct perf_env`'s `kernel_is_64_bit`. I
think this is generally much more painful than just using the
e_machine - especially since you need to strcmp the name. For the
e_machine, the problem is that on x86 we have 32-bit, x32 and x86_64.
There is then also an ABI question regarding the use of SIMD registers
and the newer APX registers. If there are no samples and no DSOs in
play, making a choice of e_machine to set up variables with is
somewhat arbitrary. I think EM_HOST, the e_machine of the current perf
binary, is a good choice.
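
To make the prefix point concrete, here is a minimal standalone sketch.
It is not the perf code: `sketch_normalize_arch` is a hypothetical name
and only mirrors the powerpc and x86_64 branches of normalize_arch()
quoted above. It shows that the 32-bit/64-bit distinction is already
lost once the uname string has been normalized:
```c
#include <string.h>

/* Hypothetical stand-in for two branches of normalize_arch(). */
static const char *sketch_normalize_arch(const char *arch)
{
	/* Only the prefix is compared, so a trailing "64" or "64le" is ignored. */
	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
		return "powerpc";
	/* x86_64 is matched exactly and collapses to the shared "x86" name. */
	if (!strcmp(arch, "x86_64"))
		return "x86";
	/* Anything else is passed through unchanged in this sketch. */
	return arch;
}
```
With this, "ppc", "ppc64le" and "powerpc64" all come back as "powerpc",
so the string alone cannot tell you the bitness.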

Thanks,
Ian

> > +             {"sparc", EM_SPARCV9},
> > +             {"x86", EM_386},
> > +             {"x86", EM_X86_64},
> > +             {"none", EM_NONE},
> > +     };
> > +
> > +     for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
> > +             if (extras[i].e_machine == e_machine)
> > +                     return extras[i].prefix;
> > +     }
> > +
> > +     for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
> > +             if (prefix_to_e_machine[i].e_machine == e_machine)
> > +                     return prefix_to_e_machine[i].prefix;
> > +
> > +     }
> > +     return "unknown";
> > +}
> > +
> > +uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
> > +{
> > +     if (!env) {
> > +             if (e_flags)
> > +                     *e_flags = EF_HOST;
> > +
> > +             return EM_HOST;
> > +     }
> > +     if (env->e_machine == EM_NONE) {
> > +             env->e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
> > +
> > +             if (env->e_machine == EM_HOST)
> > +                     env->e_flags = EF_HOST;
> > +     }
> > +     if (e_flags)
> > +             *e_flags = env->e_flags;
> > +
> > +     return env->e_machine;
> >  }
> >
> >  const char *perf_env__arch(struct perf_env *env)
> >  {
> > -     char *arch_name;
> > +     if (!env)
> > +             return e_machine_to_perf_arch(EM_HOST);
> >
> > -     if (!env || !env->arch) { /* Assume local operation */
> > -             static struct utsname uts = { .machine[0] = '\0', };
> > -             if (uts.machine[0] == '\0' && uname(&uts) < 0)
> > -                     return NULL;
> > -             arch_name = uts.machine;
> > -     } else
> > -             arch_name = env->arch;
> > +     if (!env->arch) {
> > +             /*
> > +              * Lazily compute/allocate arch. The e_machine may have been
> > +              * read from a data file and so may not be EM_HOST.
> > +              */
> > +             uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
> >
> > -     return normalize_arch(arch_name);
> > +             env->arch = strdup(e_machine_to_perf_arch(e_machine));
> > +     }
> > +     return env->arch;
> >  }
> >
> >  #if defined(HAVE_LIBTRACEEVENT)
> > diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> > index a4501cbca375..91ff252712f4 100644
> > --- a/tools/perf/util/env.h
> > +++ b/tools/perf/util/env.h
> > @@ -186,6 +186,7 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
> >
> >  void cpu_cache_level__free(struct cpu_cache_level *cache);
> >
> > +uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
> >  const char *perf_env__arch(struct perf_env *env);
> >  const char *perf_env__arch_strerrno(struct perf_env *env, int err);
> >  const char *perf_env__cpuid(struct perf_env *env);
> > diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> > index 4b465abfa36c..dcc9bef303aa 100644
> > --- a/tools/perf/util/session.c
> > +++ b/tools/perf/util/session.c
> > @@ -2996,14 +2996,16 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
> >               return EM_HOST;
> >       }
> >
> > +     /* Is the env caching an e_machine? */
> >       env = perf_session__env(session);
> > -     if (env && env->e_machine != EM_NONE) {
> > -             if (e_flags)
> > -                     *e_flags = env->e_flags;
> > -
> > -             return env->e_machine;
> > -     }
> > +     if (env && env->e_machine != EM_NONE)
> > +             return perf_env__e_machine(env, e_flags);
> >
> > +     /*
> > +      * Compute from threads, note this is more accurate than
> > +      * perf_env__e_machine that falls back on EM_HOST and doesn't consider
> > +      * mixed 32-bit and 64-bit threads.
> > +      */
> >       machines__for_each_thread(&session->machines,
> >                                 perf_session__e_machine_cb,
> >                                 &args);
> > --
> > 2.53.0.1018.g2bb0e51243-goog
> >


* Re: [PATCH v4 2/2] perf symbol: Lazily compute idle and use the perf_env
  2026-04-06  5:10               ` Namhyung Kim
@ 2026-04-06 16:11                 ` Ian Rogers
  2026-04-06 17:09                   ` [PATCH v5 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  0 siblings, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-04-06 16:11 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: acme, tmricht, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk

On Sun, Apr 5, 2026 at 10:10 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, Mar 26, 2026 at 09:50:25PM -0700, Ian Rogers wrote:
> > Move the idle boolean to a helper symbol__is_idle function. In the
> > function lazily compute whether a symbol is an idle function taking
> > into consideration the kernel version and architecture of the
> > machine. As symbols__insert no longer needs to know if a symbol is for
> > the kernel, remove the argument.
> >
> > This change is inspired by mailing list discussion, particularly from
> > Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
> > <hca@linux.ibm.com>:
> > https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/
> >
> > The change switches x86 matches to use strstarts which means
> > intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
> > change suggested by Honglei Wang <jameshongleiwang@126.com> in:
> > https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> [SNIP]
> > +     if (e_machine == EM_S390 && strstarts(name, "psw_idle")) {
> > +             int major = 0, minor = 0;
> > +             const char *release = env && env->os_release
> > +                     ? env->os_release : perf_version_string;
>
> I think Sashiko's review is right.  You need to check the kernel version
> instead of perf.

Doing this can create more problems and complexity than it solves. If
we state that `os_release` can be NULL at this point, we recompute it
using `uname`:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/header.c?h=perf-tools-next#n378
then do we cache the value in env? What happens if a data/pipe file
assigns to the env later? Ad-hoc users of env->os_release
recomputing it shouldn't happen; instead, in 'live' mode, we should
assign os_release using uname either when the perf_env is created or
lazily with a helper function. I dislike that with a helper we could
potentially have multiple notions of os_release.

I'll add a patch to refactor the use of os_release, but can we be
mindful that this is clear feature creep with little benefit? We will
still fall back on `perf_version_string` if uname fails and for all
practical purposes, `perf_version_string` will differ little from
uname in this case. I'm only going to add the patch because checking
other uses of os_release suggests the change is benign.

Thanks,
Ian

> Thanks,
> Namhyung
>
> > +
> > +             /* Before v6.10, s390 used psw_idle. */
> > +             if (sscanf(release, "%d.%d", &major, &minor) != 2 ||
> > +                 major < 6 || (major == 6 && minor < 10)) {
> > +                     sym->idle = SYMBOL_IDLE__IDLE;
> > +                     return true;
> > +             }
> > +     }
> > +
> > +     sym->idle = SYMBOL_IDLE__NOT_IDLE;
> > +     return false;
> >  }


* [PATCH v5 0/3] perf symbol/env: ELF machine clean up and lazy idle computation
  2026-04-06 16:11                 ` Ian Rogers
@ 2026-04-06 17:09                   ` Ian Rogers
  2026-04-06 17:09                     ` [PATCH v5 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
                                       ` (3 more replies)
  0 siblings, 4 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-06 17:09 UTC (permalink / raw)
  To: acme, namhyung
  Cc: irogers, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk, tmricht

Add a helper to perf_env to compute the e_machine if it is
EM_NONE. Derive the value from the arch string if available. Similarly
derive the arch string from the ELF machine if available, for
consistency. This means perf's arch (machine type) is no longer
determined by uname but set to match that of the perf ELF executable.
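
For readers skimming the series, the arch string to ELF machine lookup
can be pictured with the following minimal sketch. The table contents,
the integer machine ids and all `sketch_` names are made up for
illustration; the real table and the 32-bit/64-bit disambiguation are in
patch 1:
```c
#include <stdlib.h>
#include <string.h>

struct sketch_arch_map {
	const char *prefix;
	int machine;	/* placeholder id, not a real EM_* value */
};

/* Must stay sorted by prefix for bsearch() to work. */
static const struct sketch_arch_map sketch_table[] = {
	{"aarch64", 1},
	{"arm", 2},
	{"ppc", 3},
	{"s390", 4},
	{"x86", 5},
};

/* Compare only as many characters as the table prefix has. */
static int sketch_compare_prefix(const void *key, const void *element)
{
	const char *search_key = key;
	const struct sketch_arch_map *map_element = element;

	return strncmp(search_key, map_element->prefix,
		       strlen(map_element->prefix));
}

/* Returns the placeholder id, or 0 when no prefix matches. */
static int sketch_lookup(const char *arch)
{
	const struct sketch_arch_map *hit;

	hit = bsearch(arch, sketch_table,
		      sizeof(sketch_table) / sizeof(sketch_table[0]),
		      sizeof(sketch_table[0]), sketch_compare_prefix);
	return hit ? hit->machine : 0;
}
```
Because the comparison stops at the end of the table prefix, uname
strings such as "s390x" or "armv7l" land on the "s390" and "arm" entries.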

Switch the idle computation to the point of use and lazily compute it,
rather than computing it for every symbol. Currently the only user is
`perf top`. At the point of use the perf_env is available and this can
be used to make sure the idle function computation is machine and
kernel version dependent.
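
The lazy computation can be sketched as below. This is an illustration
under assumptions, not the code in patch 3: the `sketch_` names and the
`is_s390` parameter are hypothetical, and a NULL release is treated like
an unparsable one. The result is cached in the symbol on first query, and
the s390 psw_idle check follows the "before v6.10" rule from the v4 hunk:
```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum sketch_idle {
	SKETCH_IDLE__UNKNOWN,
	SKETCH_IDLE__NOT_IDLE,
	SKETCH_IDLE__IDLE,
};

struct sketch_symbol {
	const char *name;
	unsigned char idle;	/* one of enum sketch_idle */
};

/* True if release parses as older than major.minor, or does not parse. */
static bool sketch_release_before(const char *release, int want_major,
				  int want_minor)
{
	int major = 0, minor = 0;

	if (!release || sscanf(release, "%d.%d", &major, &minor) != 2)
		return true;
	return major < want_major ||
	       (major == want_major && minor < want_minor);
}

static bool sketch_symbol__is_idle(struct sketch_symbol *sym, bool is_s390,
				   const char *release)
{
	/* Already computed: return the cached answer. */
	if (sym->idle != SKETCH_IDLE__UNKNOWN)
		return sym->idle == SKETCH_IDLE__IDLE;

	if (is_s390 && !strncmp(sym->name, "psw_idle", 8) &&
	    sketch_release_before(release, 6, 10)) {
		sym->idle = SKETCH_IDLE__IDLE;
		return true;
	}

	sym->idle = SKETCH_IDLE__NOT_IDLE;
	return false;
}
```
Symbols that are never queried never pay for the string comparisons,
which is the point of moving the check to the point of use.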

v5: Add perf_env os_release helper (Namhyung/Sashiko)

v4: Fix Sashiko issues where an array element wasn't sorted properly,
    the e_flags weren't returned properly, the idle type is changed to
    a u8 rather than an enum value and the s390 version check for
    psw_idle is slightly reordered and tweaked.
    https://lore.kernel.org/lkml/20260327045025.2276517-1-irogers@google.com/

v3: Properly set up the e_machine coming from the perf_env as reported
    by Honglei Wang.
    https://lore.kernel.org/lkml/20260326174521.1829203-1-irogers@google.com/

v2: Some minor white space clean up:
    https://lore.kernel.org/lkml/20260325161836.1029457-1-irogers@google.com/

v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/

Ian Rogers (3):
  perf env: Add perf_env__e_machine helper and use in perf_env__arch
  perf env: Add helper to lazily compute the os_release
  perf symbol: Lazily compute idle and use the perf_env

 tools/perf/builtin-top.c          |   6 +-
 tools/perf/util/data-convert-bt.c |   2 +-
 tools/perf/util/env.c             | 206 ++++++++++++++++++++++++------
 tools/perf/util/env.h             |   2 +
 tools/perf/util/session.c         |  14 +-
 tools/perf/util/symbol-elf.c      |   2 +-
 tools/perf/util/symbol.c          | 107 ++++++++++------
 tools/perf/util/symbol.h          |  15 ++-
 8 files changed, 264 insertions(+), 90 deletions(-)

-- 
2.53.0.1213.gd9a14994de-goog



* [PATCH v5 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-04-06 17:09                   ` [PATCH v5 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-04-06 17:09                     ` Ian Rogers
  2026-04-06 17:09                     ` [PATCH v5 2/3] perf env: Add helper to lazily compute the os_release Ian Rogers
                                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-06 17:09 UTC (permalink / raw)
  To: acme, namhyung
  Cc: irogers, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk, tmricht

Add a helper that lazily computes the e_machine and falls back on
EM_HOST. Use the perf_env's arch to compute the e_machine if
available. Use a binary search for some efficiency in this, but handle
somewhat complex duplicate rules. Switch perf_env__arch to be derived
from the e_machine for consistency. This switches arch from being
uname-derived to matching that of the perf binary (via EM_HOST). Update
session to use the helper, which may mean using EM_HOST when no
threads are available. This also updates the perf data file header
that gets the e_machine/e_flags from the session.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c     | 185 ++++++++++++++++++++++++++++++--------
 tools/perf/util/env.h     |   1 +
 tools/perf/util/session.c |  14 +--
 3 files changed, 157 insertions(+), 43 deletions(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 1e54e2c86360..339d62ca37bb 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,10 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "cpumap.h"
+#include "dwarf-regs.h"
 #include "debug.h"
 #include "env.h"
 #include "util/header.h"
 #include "util/rwsem.h"
 #include <linux/compiler.h>
+#include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/rbtree.h>
 #include <linux/string.h>
@@ -588,51 +590,160 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
 	zfree(&cache->size);
 }
 
+struct arch_to_e_machine {
+	const char *prefix;
+	uint16_t e_machine;
+};
+
 /*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
+ * A mapping from an arch prefix string to an ELF machine that can be used in a
+ * bsearch. Some arch prefixes are shared and need additional processing as
+ * marked next to the architecture. The prefixes handle both perf's architecture
+ * naming and those from uname.
  */
-static const char *normalize_arch(char *arch)
-{
-	if (!strcmp(arch, "x86_64"))
-		return "x86";
-	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-		return "x86";
-	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-		return "sparc";
-	if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
-		return "arm64";
-	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-		return "arm";
-	if (!strncmp(arch, "s390", 4))
-		return "s390";
-	if (!strncmp(arch, "parisc", 6))
-		return "parisc";
-	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-		return "powerpc";
-	if (!strncmp(arch, "mips", 4))
-		return "mips";
-	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-		return "sh";
-	if (!strncmp(arch, "loongarch", 9))
-		return "loongarch";
-
-	return arch;
+static const struct arch_to_e_machine prefix_to_e_machine[] = {
+	{"aarch64", EM_AARCH64},
+	{"alpha", EM_ALPHA},
+	{"arc", EM_ARC},
+	{"arm", EM_ARM}, /* Check also for EM_AARCH64. */
+	{"avr", EM_AVR},  /* Check also for EM_AVR32. */
+	{"bfin", EM_BLACKFIN},
+	{"blackfin", EM_BLACKFIN},
+	{"cris", EM_CRIS},
+	{"csky", EM_CSKY},
+	{"hppa", EM_PARISC},
+	{"i386", EM_386},
+	{"i486", EM_386},
+	{"i586", EM_386},
+	{"i686", EM_386},
+	{"loongarch", EM_LOONGARCH},
+	{"m32r", EM_M32R},
+	{"m68k", EM_68K},
+	{"microblaze", EM_MICROBLAZE},
+	{"mips", EM_MIPS},
+	{"msp430", EM_MSP430},
+	{"parisc", EM_PARISC},
+	{"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"ppc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"riscv", EM_RISCV},
+	{"s390", EM_S390},
+	{"sa110", EM_ARM},
+	{"sh", EM_SH},
+	{"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
+	{"sun4u", EM_SPARC},
+	{"x86", EM_X86_64}, /* Check also for EM_386. */
+	{"xtensa", EM_XTENSA},
+};
+
+static int compare_prefix(const void *key, const void *element)
+{
+	const char *search_key = key;
+	const struct arch_to_e_machine *map_element = element;
+	size_t prefix_len = strlen(map_element->prefix);
+
+	return strncmp(search_key, map_element->prefix, prefix_len);
+}
+
+static uint16_t perf_arch_to_e_machine(const char *perf_arch, bool is_64_bit)
+{
+	/* Binary search for a matching prefix. */
+	const struct arch_to_e_machine *result;
+
+	if (!perf_arch)
+		return EM_HOST;
+
+	result = bsearch(perf_arch,
+			 prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
+			 sizeof(prefix_to_e_machine[0]),
+			 compare_prefix);
+
+	if (!result) {
+		pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
+		return EM_NONE;
+	}
+
+	/* Handle conflicting prefixes. */
+	switch (result->e_machine) {
+	case EM_ARM:
+		return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
+	case EM_AVR:
+		return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
+	case EM_PPC:
+		return is_64_bit || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;
+	case EM_SPARC:
+		return is_64_bit || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
+	case EM_X86_64:
+		return is_64_bit || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
+	default:
+		return result->e_machine;
+	}
+}
+
+static const char *e_machine_to_perf_arch(uint16_t e_machine)
+{
+	/*
+	 * Table for if either the perf arch string differs from uname or there
+	 * are >1 ELF machine with the prefix.
+	 */
+	static const struct arch_to_e_machine extras[] = {
+		{"arm64", EM_AARCH64},
+		{"avr32", EM_AVR32},
+		{"powerpc", EM_PPC},
+		{"powerpc", EM_PPC64},
+		{"sparc", EM_SPARCV9},
+		{"x86", EM_386},
+		{"x86", EM_X86_64},
+		{"none", EM_NONE},
+	};
+
+	for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
+		if (extras[i].e_machine == e_machine)
+			return extras[i].prefix;
+	}
+
+	for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
+		if (prefix_to_e_machine[i].e_machine == e_machine)
+			return prefix_to_e_machine[i].prefix;
+
+	}
+	return "unknown";
+}
+
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
+{
+	if (!env) {
+		if (e_flags)
+			*e_flags = EF_HOST;
+
+		return EM_HOST;
+	}
+	if (env->e_machine == EM_NONE) {
+		env->e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
+
+		if (env->e_machine == EM_HOST)
+			env->e_flags = EF_HOST;
+	}
+	if (e_flags)
+		*e_flags = env->e_flags;
+
+	return env->e_machine;
 }
 
 const char *perf_env__arch(struct perf_env *env)
 {
-	char *arch_name;
+	if (!env)
+		return e_machine_to_perf_arch(EM_HOST);
 
-	if (!env || !env->arch) { /* Assume local operation */
-		static struct utsname uts = { .machine[0] = '\0', };
-		if (uts.machine[0] == '\0' && uname(&uts) < 0)
-			return NULL;
-		arch_name = uts.machine;
-	} else
-		arch_name = env->arch;
+	if (!env->arch) {
+		/*
+		 * Lazily compute/allocate arch. The e_machine may have been
+		 * read from a data file and so may not be EM_HOST.
+		 */
+		uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	return normalize_arch(arch_name);
+		env->arch = strdup(e_machine_to_perf_arch(e_machine));
+	}
+	return env->arch;
 }
 
 #if defined(HAVE_LIBTRACEEVENT)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index a4501cbca375..91ff252712f4 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -186,6 +186,7 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
 
 void cpu_cache_level__free(struct cpu_cache_level *cache);
 
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(struct perf_env *env, int err);
 const char *perf_env__cpuid(struct perf_env *env);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 3a911c70cd0e..070dd78772f2 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -3009,14 +3009,16 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 		return EM_HOST;
 	}
 
+	/* Is the env caching an e_machine? */
 	env = perf_session__env(session);
-	if (env && env->e_machine != EM_NONE) {
-		if (e_flags)
-			*e_flags = env->e_flags;
-
-		return env->e_machine;
-	}
+	if (env && env->e_machine != EM_NONE)
+		return perf_env__e_machine(env, e_flags);
 
+	/*
+	 * Compute from threads, note this is more accurate than
+	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
+	 * mixed 32-bit and 64-bit threads.
+	 */
 	machines__for_each_thread(&session->machines,
 				  perf_session__e_machine_cb,
 				  &args);
-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v5 2/3] perf env: Add helper to lazily compute the os_release
  2026-04-06 17:09                   ` [PATCH v5 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-04-06 17:09                     ` [PATCH v5 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-04-06 17:09                     ` Ian Rogers
  2026-04-06 17:09                     ` [PATCH v5 3/3] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
  2026-04-09 23:06                     ` [PATCH v6 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  3 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-06 17:09 UTC (permalink / raw)
  To: acme, namhyung
  Cc: irogers, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk, tmricht

In live mode the os_release isn't being initialized. Add a lazy
initialization helper that assumes live mode when the os_release
isn't initialized.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/data-convert-bt.c |  2 +-
 tools/perf/util/env.c             | 21 +++++++++++++++++++++
 tools/perf/util/env.h             |  1 +
 tools/perf/util/symbol.c          |  4 ++--
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index bece77cbc493..bc5805183100 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1414,7 +1414,7 @@ do {									\
 
 	ADD("host",    env->hostname);
 	ADD("sysname", "Linux");
-	ADD("release", env->os_release);
+	ADD("release", perf_env__os_release(env));
 	ADD("version", env->version);
 	ADD("machine", env->arch);
 	ADD("domain", "kernel");
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 339d62ca37bb..34b737950f73 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -330,6 +330,27 @@ int perf_env__kernel_is_64_bit(struct perf_env *env)
 	return env->kernel_is_64_bit;
 }
 
+const char *perf_env__os_release(struct perf_env *env)
+{
+	struct utsname uts;
+	int ret;
+
+	if (!env)
+		return perf_version_string;
+
+	if (env->os_release)
+		return env->os_release;
+
+	/*
+	 * The os_release is being accessed but wasn't initialized from a data
+	 * file, assume this is 'live' mode and use the release from uname. If
+	 * uname fails then use the current perf tool version.
+	 */
+	ret = uname(&uts);
+	env->os_release = strdup(ret < 0 ? perf_version_string : uts.release);
+	return env->os_release;
+}
+
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
 {
 	int i;
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 91ff252712f4..bf30a02dccf7 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -174,6 +174,7 @@ void free_cpu_domain_info(struct cpu_domain_map **cd_map, u32 schedstat_version,
 void perf_env__exit(struct perf_env *env);
 
 int perf_env__kernel_is_64_bit(struct perf_env *env);
+const char *perf_env__os_release(struct perf_env *env);
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
 
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b4b30675688d..ea7d2f2dbcb7 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2208,7 +2208,7 @@ static int vmlinux_path__init(struct perf_env *env)
 {
 	struct utsname uts;
 	char bf[PATH_MAX];
-	char *kernel_version;
+	const char *kernel_version;
 	unsigned int i;
 
 	vmlinux_path = malloc(sizeof(char *) * (ARRAY_SIZE(vmlinux_paths) +
@@ -2225,7 +2225,7 @@ static int vmlinux_path__init(struct perf_env *env)
 		return 0;
 
 	if (env) {
-		kernel_version = env->os_release;
+		kernel_version = perf_env__os_release(env);
 	} else {
 		if (uname(&uts) < 0)
 			goto out_fail;
-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v5 3/3] perf symbol: Lazily compute idle and use the perf_env
  2026-04-06 17:09                   ` [PATCH v5 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-04-06 17:09                     ` [PATCH v5 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
  2026-04-06 17:09                     ` [PATCH v5 2/3] perf env: Add helper to lazily compute the os_release Ian Rogers
@ 2026-04-06 17:09                     ` Ian Rogers
  2026-04-09 23:06                     ` [PATCH v6 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  3 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-06 17:09 UTC (permalink / raw)
  To: acme, namhyung
  Cc: irogers, agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk, tmricht

Move the idle boolean to a helper symbol__is_idle function. In the
function lazily compute whether a symbol is an idle function taking
into consideration the kernel version and architecture of the
machine. As symbols__insert no longer needs to know if a symbol is for
the kernel, remove the argument.
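
A rough sketch of the caching idea, assuming a 2-bit field whose zero
value means "not yet computed". The demo_* names and the single
hard-coded symbol name are illustrative only and are not part of the
patch.

```c
#include <stdbool.h>
#include <string.h>

enum demo_idle_kind {
	DEMO_IDLE__UNKNOWN = 0,	/* zero-initialized symbols start here */
	DEMO_IDLE__NOT_IDLE = 1,
	DEMO_IDLE__IDLE = 2,
};

struct demo_symbol {
	unsigned int idle:2;	/* holds enum demo_idle_kind values */
	const char *name;
};

/* Hypothetical counter showing that classification runs once per symbol. */
static int demo_nr_classified;

static bool demo_symbol__is_idle(struct demo_symbol *sym)
{
	/* Fast path: a previous call already decided. */
	if (sym->idle != DEMO_IDLE__UNKNOWN)
		return sym->idle == DEMO_IDLE__IDLE;

	/* Slow path: classify by name once and remember the answer. */
	demo_nr_classified++;
	sym->idle = strcmp(sym->name, "default_idle") == 0 ?
		    DEMO_IDLE__IDLE : DEMO_IDLE__NOT_IDLE;
	return sym->idle == DEMO_IDLE__IDLE;
}
```

Three states are needed because a plain boolean cannot distinguish
"not idle" from "not yet checked".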

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/

The change switches x86 matches to use strstarts which means
intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
change suggested by Honglei Wang <jameshongleiwang@126.com> in:
https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
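
To illustrate the prefix matching in isolation, here is a small
sketch. demo_strstarts() is a local stand-in assumed to behave like
the strstarts() helper used by the patch, and demo_is_x86_idle() is a
hypothetical wrapper.

```c
#include <stdbool.h>
#include <string.h>

/* Local stand-in for strstarts(): true if str begins with prefix. */
static bool demo_strstarts(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Hypothetical wrapper mirroring the x86 prefix checks. */
static bool demo_is_x86_idle(const char *name)
{
	return demo_strstarts(name, "mwait_idle") ||
	       demo_strstarts(name, "intel_idle");
}
```

With this, "intel_idle_irq", "intel_idle_ibrs" and
"mwait_idle_with_hints.constprop.0" all match without separate table
entries. The trade-off is that any other symbol sharing one of these
prefixes is also treated as idle.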

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-top.c     |   6 +-
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 103 ++++++++++++++++++++++-------------
 tools/perf/util/symbol.h     |  15 +++--
 4 files changed, 82 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 37950efb28ac..bdc1c761cd61 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -751,6 +751,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 {
 	struct perf_top *top = container_of(tool, struct perf_top, tool);
 	struct addr_location al;
+	struct dso *dso = NULL;
 
 	if (!machine && perf_guest) {
 		static struct intlist *seen;
@@ -830,7 +831,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		}
 	}
 
-	if (al.sym == NULL || !al.sym->idle) {
+	if (al.map)
+		dso = map__dso(al.map);
+
+	if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 7afa8a117139..e8f7fe3f19fc 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1727,7 +1727,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index ea7d2f2dbcb7..8c23802b39ad 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -25,6 +25,8 @@
 #include "demangle-ocaml.h"
 #include "demangle-rust-v0.h"
 #include "dso.h"
+#include "dwarf-regs.h"
+#include "env.h"
 #include "util.h" // lsdir()
 #include "event.h"
 #include "machine.h"
@@ -50,7 +52,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
 
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -357,8 +358,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -366,17 +366,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		sym->idle = symbol__is_idle(name);
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -393,7 +382,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -554,7 +543,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -716,47 +705,85 @@ int modules__parse(const char *filename, void *arg,
 	return err;
 }
 
+static int sym_name_cmp(const void *a, const void *b)
+{
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
 /*
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env)
 {
-	const char * const idle_symbols[] = {
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+
+	if (sym->idle)
+		return sym->idle == SYMBOL_IDLE__IDLE;
+
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		sym->idle = SYMBOL_IDLE__NOT_IDLE;
+		return false;
+	}
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
 
-	return strlist__has_entry(idle_symbols_list, name);
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
+
+	if (e_machine == EM_S390 && strstarts(name, "psw_idle")) {
+		int major = 0, minor = 0;
+		const char *release = perf_env__os_release(env);
+
+		/* Before v6.10, s390 used psw_idle. */
+		if (sscanf(release, "%d.%d", &major, &minor) != 2 ||
+		    major < 6 || (major == 6 && minor < 10)) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	sym->idle = SYMBOL_IDLE__NOT_IDLE;
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -785,7 +812,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index c67814d6d6d6..2f5f90f547aa 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -25,6 +25,7 @@ struct dso;
 struct map;
 struct maps;
 struct option;
+struct perf_env;
 struct build_id;
 
 /*
@@ -42,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -57,8 +64,8 @@ struct symbol {
 	u8		type:4;
 	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
 	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
+	/** Cache for symbol__is_idle holding enum symbol_idle_kind values. */
+	u8		idle:2;
 	/** Resolvable but tools ignore it (e.g. idle routines). */
 	u8		ignore:1;
 	/** Symbol for an inlined function. */
@@ -202,8 +209,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
@@ -286,5 +292,6 @@ enum {
 };
 
 int symbol__validate_sym_arguments(void);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env);
 
 #endif /* __PERF_SYMBOL */
-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v6 0/3] perf symbol/env: ELF machine clean up and lazy idle computation
  2026-04-06 17:09                   ` [PATCH v5 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                       ` (2 preceding siblings ...)
  2026-04-06 17:09                     ` [PATCH v5 3/3] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
@ 2026-04-09 23:06                     ` Ian Rogers
  2026-04-09 23:06                       ` [PATCH v6 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
                                         ` (2 more replies)
  3 siblings, 3 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-09 23:06 UTC (permalink / raw)
  To: namhyung
  Cc: irogers, acme, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk, tmricht

Add a helper to perf_env to compute the e_machine if it is
EM_NONE. Derive the value from the arch string if available. Similarly
derive the arch string from the ELF machine if available, for
consistency. This means perf's arch (machine type) is no longer
determined by uname but set to match that of the perf ELF executable.

Switch the idle computation to the point of use and lazily compute it,
rather than computing it for every symbol. The current only user is
`perf top`. At the point of use the perf_env is available and this can
be used to make sure the idle function computation is machine and
kernel version dependent.
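
As an illustration of the kernel version dependence, the following
sketch parses a release string of the form "major.minor..." and
compares it against a threshold. demo_release_before() is a
hypothetical helper, not a function from the series. It treats an
unparsable release as "older", which matches the conservative choice
in the s390 psw_idle check.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true if release is older than major.minor, or if it cannot
 * be parsed (conservatively assume an old kernel).
 */
static bool demo_release_before(const char *release, int major, int minor)
{
	int rel_major = 0, rel_minor = 0;

	if (!release ||
	    sscanf(release, "%d.%d", &rel_major, &rel_minor) != 2)
		return true;

	return rel_major < major ||
	       (rel_major == major && rel_minor < minor);
}
```

For example, demo_release_before("6.9.12", 6, 10) is true while
demo_release_before("6.10.0", 6, 10) is false.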

v6: Ensure arch is canonical by going to e_machine and back (Sashiko)

v5: Add perf_env os_release helper (Namhyung/Sashiko)
    https://lore.kernel.org/lkml/20260406170905.2614260-1-irogers@google.com/

v4: Fix Sashiko issues where an array element wasn't sorted properly,
    the e_flags weren't returned properly, the idle type is changed to
    a u8 rather than an enum value and the s390 version check for
    psw_idle is slightly reordered and tweaked.
    https://lore.kernel.org/lkml/20260327045025.2276517-1-irogers@google.com/

v3: Properly set up the e_machine coming from the perf_env as reported
    by Honglei Wang.
    https://lore.kernel.org/lkml/20260326174521.1829203-1-irogers@google.com/

v2: Some minor white space clean up:
    https://lore.kernel.org/lkml/20260325161836.1029457-1-irogers@google.com/

v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/

Ian Rogers (3):
  perf env: Add perf_env__e_machine helper and use in perf_env__arch
  perf env: Add helper to lazily compute the os_release
  perf symbol: Lazily compute idle and use the perf_env

 tools/perf/builtin-top.c          |   6 +-
 tools/perf/util/data-convert-bt.c |   2 +-
 tools/perf/util/env.c             | 206 ++++++++++++++++++++++++------
 tools/perf/util/env.h             |   2 +
 tools/perf/util/header.c          |  60 ++++++---
 tools/perf/util/session.c         |  14 +-
 tools/perf/util/symbol-elf.c      |   2 +-
 tools/perf/util/symbol.c          | 107 ++++++++++------
 tools/perf/util/symbol.h          |  15 ++-
 9 files changed, 309 insertions(+), 105 deletions(-)

-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v6 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-04-09 23:06                     ` [PATCH v6 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-04-09 23:06                       ` Ian Rogers
  2026-05-01 18:20                         ` [PATCH v7 0/4] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-04-09 23:06                       ` [PATCH v6 2/3] perf env: Add helper to lazily compute the os_release Ian Rogers
  2026-04-09 23:06                       ` [PATCH v6 3/3] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
  2 siblings, 2 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-09 23:06 UTC (permalink / raw)
  To: namhyung
  Cc: irogers, acme, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk, tmricht

Add a helper that lazily computes the e_machine and falls back to
EM_HOST. Use the perf_env's arch to compute the e_machine if
available. Use a binary search for some efficiency in this, but handle
the somewhat complex rules for prefixes shared by several machines.
Switch perf_env__arch to be derived from the e_machine for
consistency. This switches arch from being uname derived to matching
that of the perf binary (via EM_HOST). Update session to use the
helper, which may mean using EM_HOST when no threads are available.
This also updates the perf data file header that gets the
e_machine/e_flags from the session.
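
A reduced sketch of the prefix binary search with conflict handling
follows. The four-entry table, the DEMO_* ids and the demo_* functions
are assumptions made for illustration; the real table in the patch
maps to ELF EM_* values and has many more entries.

```c
#include <stdlib.h>
#include <string.h>

struct demo_prefix_map {
	const char *prefix;
	int id;
};

enum { DEMO_NONE, DEMO_ARM, DEMO_ARM64, DEMO_S390, DEMO_X86_32, DEMO_X86_64 };

/* Must stay sorted by prefix for bsearch() to work. */
static const struct demo_prefix_map demo_map[] = {
	{ "aarch64", DEMO_ARM64 },
	{ "arm", DEMO_ARM },	/* also matches "arm64" */
	{ "s390", DEMO_S390 },
	{ "x86", DEMO_X86_64 },	/* also used for 32-bit x86 */
};

/* Compare only as many characters as the table entry's prefix has. */
static int demo_compare_prefix(const void *key, const void *element)
{
	const struct demo_prefix_map *entry = element;

	return strncmp(key, entry->prefix, strlen(entry->prefix));
}

static int demo_arch_to_id(const char *arch, int is_64_bit)
{
	const struct demo_prefix_map *hit;

	hit = bsearch(arch, demo_map, sizeof(demo_map) / sizeof(demo_map[0]),
		      sizeof(demo_map[0]), demo_compare_prefix);
	if (!hit)
		return DEMO_NONE;

	/* Resolve prefixes that are shared by more than one machine. */
	switch (hit->id) {
	case DEMO_ARM:
		return strcmp(arch, "arm64") == 0 ? DEMO_ARM64 : DEMO_ARM;
	case DEMO_X86_64:
		return is_64_bit || strcmp(arch, "x86_64") == 0 ?
		       DEMO_X86_64 : DEMO_X86_32;
	default:
		return hit->id;
	}
}
```

Here "s390x" resolves through the "s390" prefix, while "arm64" first
hits the "arm" entry and is then corrected by the switch.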

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c     | 185 ++++++++++++++++++++++++++++++--------
 tools/perf/util/env.h     |   1 +
 tools/perf/util/header.c  |  44 ++++++---
 tools/perf/util/session.c |  14 +--
 4 files changed, 191 insertions(+), 53 deletions(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 1e54e2c86360..339d62ca37bb 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,10 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "cpumap.h"
+#include "dwarf-regs.h"
 #include "debug.h"
 #include "env.h"
 #include "util/header.h"
 #include "util/rwsem.h"
 #include <linux/compiler.h>
+#include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/rbtree.h>
 #include <linux/string.h>
@@ -588,51 +590,160 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
 	zfree(&cache->size);
 }
 
+struct arch_to_e_machine {
+	const char *prefix;
+	uint16_t e_machine;
+};
+
 /*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
+ * A mapping from an arch prefix string to an ELF machine that can be used in a
+ * bsearch. Some arch prefixes are shared and need additional processing as
+ * marked next to the architecture. The prefixes handle both perf's architecture
+ * naming and those from uname.
  */
-static const char *normalize_arch(char *arch)
-{
-	if (!strcmp(arch, "x86_64"))
-		return "x86";
-	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-		return "x86";
-	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-		return "sparc";
-	if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
-		return "arm64";
-	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-		return "arm";
-	if (!strncmp(arch, "s390", 4))
-		return "s390";
-	if (!strncmp(arch, "parisc", 6))
-		return "parisc";
-	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-		return "powerpc";
-	if (!strncmp(arch, "mips", 4))
-		return "mips";
-	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-		return "sh";
-	if (!strncmp(arch, "loongarch", 9))
-		return "loongarch";
-
-	return arch;
+static const struct arch_to_e_machine prefix_to_e_machine[] = {
+	{"aarch64", EM_AARCH64},
+	{"alpha", EM_ALPHA},
+	{"arc", EM_ARC},
+	{"arm", EM_ARM}, /* Check also for EM_AARCH64. */
+	{"avr", EM_AVR},  /* Check also for EM_AVR32. */
+	{"bfin", EM_BLACKFIN},
+	{"blackfin", EM_BLACKFIN},
+	{"cris", EM_CRIS},
+	{"csky", EM_CSKY},
+	{"hppa", EM_PARISC},
+	{"i386", EM_386},
+	{"i486", EM_386},
+	{"i586", EM_386},
+	{"i686", EM_386},
+	{"loongarch", EM_LOONGARCH},
+	{"m32r", EM_M32R},
+	{"m68k", EM_68K},
+	{"microblaze", EM_MICROBLAZE},
+	{"mips", EM_MIPS},
+	{"msp430", EM_MSP430},
+	{"parisc", EM_PARISC},
+	{"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"ppc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"riscv", EM_RISCV},
+	{"s390", EM_S390},
+	{"sa110", EM_ARM},
+	{"sh", EM_SH},
+	{"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
+	{"sun4u", EM_SPARC},
+	{"x86", EM_X86_64}, /* Check also for EM_386. */
+	{"xtensa", EM_XTENSA},
+};
+
+static int compare_prefix(const void *key, const void *element)
+{
+	const char *search_key = key;
+	const struct arch_to_e_machine *map_element = element;
+	size_t prefix_len = strlen(map_element->prefix);
+
+	return strncmp(search_key, map_element->prefix, prefix_len);
+}
+
+static uint16_t perf_arch_to_e_machine(const char *perf_arch, bool is_64_bit)
+{
+	/* Binary search for a matching prefix. */
+	const struct arch_to_e_machine *result;
+
+	if (!perf_arch)
+		return EM_HOST;
+
+	result = bsearch(perf_arch,
+			 prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
+			 sizeof(prefix_to_e_machine[0]),
+			 compare_prefix);
+
+	if (!result) {
+		pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
+		return EM_NONE;
+	}
+
+	/* Handle conflicting prefixes. */
+	switch (result->e_machine) {
+	case EM_ARM:
+		return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
+	case EM_AVR:
+		return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
+	case EM_PPC:
+		return is_64_bit || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;
+	case EM_SPARC:
+		return is_64_bit || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
+	case EM_X86_64:
+		return is_64_bit || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
+	default:
+		return result->e_machine;
+	}
+}
+
+static const char *e_machine_to_perf_arch(uint16_t e_machine)
+{
+	/*
+	 * Table for if either the perf arch string differs from uname or there
+	 * are >1 ELF machine with the prefix.
+	 */
+	static const struct arch_to_e_machine extras[] = {
+		{"arm64", EM_AARCH64},
+		{"avr32", EM_AVR32},
+		{"powerpc", EM_PPC},
+		{"powerpc", EM_PPC64},
+		{"sparc", EM_SPARCV9},
+		{"x86", EM_386},
+		{"x86", EM_X86_64},
+		{"none", EM_NONE},
+	};
+
+	for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
+		if (extras[i].e_machine == e_machine)
+			return extras[i].prefix;
+	}
+
+	for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
+		if (prefix_to_e_machine[i].e_machine == e_machine)
+			return prefix_to_e_machine[i].prefix;
+
+	}
+	return "unknown";
+}
+
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
+{
+	if (!env) {
+		if (e_flags)
+			*e_flags = EF_HOST;
+
+		return EM_HOST;
+	}
+	if (env->e_machine == EM_NONE) {
+		env->e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
+
+		if (env->e_machine == EM_HOST)
+			env->e_flags = EF_HOST;
+	}
+	if (e_flags)
+		*e_flags = env->e_flags;
+
+	return env->e_machine;
 }
 
 const char *perf_env__arch(struct perf_env *env)
 {
-	char *arch_name;
+	if (!env)
+		return e_machine_to_perf_arch(EM_HOST);
 
-	if (!env || !env->arch) { /* Assume local operation */
-		static struct utsname uts = { .machine[0] = '\0', };
-		if (uts.machine[0] == '\0' && uname(&uts) < 0)
-			return NULL;
-		arch_name = uts.machine;
-	} else
-		arch_name = env->arch;
+	if (!env->arch) {
+		/*
+		 * Lazily compute/allocate arch. The e_machine may have been
+		 * read from a data file and so may not be EM_HOST.
+		 */
+		uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	return normalize_arch(arch_name);
+		env->arch = strdup(e_machine_to_perf_arch(e_machine));
+	}
+	return env->arch;
 }
 
 #if defined(HAVE_LIBTRACEEVENT)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index c7052ac1f856..d36a0fb2cd04 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -187,6 +187,7 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
 
 void cpu_cache_level__free(struct cpu_cache_level *cache);
 
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(struct perf_env *env, int err);
 const char *perf_env__cpuid(struct perf_env *env);
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c6efddb70aee..9bb4a271b4f8 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -370,21 +370,25 @@ static int write_osrelease(struct feat_fd *ff,
 	return do_write_string(ff, uts.release);
 }
 
-static int write_arch(struct feat_fd *ff,
-		      struct evlist *evlist __maybe_unused)
+static int write_arch(struct feat_fd *ff, struct evlist *evlist)
 {
 	struct utsname uts;
-	int ret;
+	const char *arch = NULL;
 
-	ret = uname(&uts);
-	if (ret < 0)
-		return -1;
+	if (evlist->session)
+		arch = perf_env__arch(perf_session__env(evlist->session));
 
-	return do_write_string(ff, uts.machine);
+	if (!arch) {
+		int ret = uname(&uts);
+
+		if (ret < 0)
+			return -1;
+		arch = uts.machine;
+	}
+	return do_write_string(ff, arch);
 }
 
-static int write_e_machine(struct feat_fd *ff,
-			   struct evlist *evlist __maybe_unused)
+static int write_e_machine(struct feat_fd *ff, struct evlist *evlist)
 {
 	/* e_machine expanded from 16 to 32-bits for alignment. */
 	uint32_t e_flags;
@@ -2675,10 +2679,30 @@ static int process_##__feat(struct feat_fd *ff, void *data __maybe_unused) \
 FEAT_PROCESS_STR_FUN(hostname, hostname);
 FEAT_PROCESS_STR_FUN(osrelease, os_release);
 FEAT_PROCESS_STR_FUN(version, version);
-FEAT_PROCESS_STR_FUN(arch, arch);
 FEAT_PROCESS_STR_FUN(cpudesc, cpu_desc);
 FEAT_PROCESS_STR_FUN(cpuid, cpuid);
 
+static int process_arch(struct feat_fd *ff, void *data __maybe_unused)
+{
+	uint16_t saved_e_machine = ff->ph->env.e_machine;
+
+	free(ff->ph->env.arch);
+	ff->ph->env.arch = do_read_string(ff);
+	if (!ff->ph->env.arch)
+		return -ENOMEM;
+	/*
+	 * Make the arch string canonical by computing the e_machine from it,
+	 * then turning the e_machine back into an arch string.
+	 */
+	ff->ph->env.e_machine = EM_NONE;
+	if (perf_env__e_machine(&ff->ph->env, /*e_flags=*/NULL) != EM_NONE) {
+		zfree(&ff->ph->env.arch);
+		perf_env__arch(&ff->ph->env);
+	}
+	ff->ph->env.e_machine = saved_e_machine;
+	return 0;
+}
+
 static int process_e_machine(struct feat_fd *ff, void *data __maybe_unused)
 {
 	int ret;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index fe0de2a0277f..726568b88803 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -3023,14 +3023,16 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 		return EM_HOST;
 	}
 
+	/* Is the env caching an e_machine? */
 	env = perf_session__env(session);
-	if (env && env->e_machine != EM_NONE) {
-		if (e_flags)
-			*e_flags = env->e_flags;
-
-		return env->e_machine;
-	}
+	if (env && env->e_machine != EM_NONE)
+		return perf_env__e_machine(env, e_flags);
 
+	/*
+	 * Compute from threads, note this is more accurate than
+	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
+	 * mixed 32-bit and 64-bit threads.
+	 */
 	machines__for_each_thread(&session->machines,
 				  perf_session__e_machine_cb,
 				  &args);
-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v6 2/3] perf env: Add helper to lazily compute the os_release
  2026-04-09 23:06                     ` [PATCH v6 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-04-09 23:06                       ` [PATCH v6 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-04-09 23:06                       ` Ian Rogers
  2026-04-09 23:06                       ` [PATCH v6 3/3] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
  2 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-09 23:06 UTC (permalink / raw)
  To: namhyung
  Cc: irogers, acme, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk, tmricht

In live mode the os_release isn't being initialized. Add a lazy
initialization helper that assumes live mode when the os_release
isn't initialized.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/data-convert-bt.c |  2 +-
 tools/perf/util/env.c             | 21 +++++++++++++++++++++
 tools/perf/util/env.h             |  1 +
 tools/perf/util/header.c          | 16 +++++++++++-----
 tools/perf/util/symbol.c          |  4 ++--
 5 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 3b8f2df823a9..2c88420fe33e 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1414,7 +1414,7 @@ do {									\
 
 	ADD("host",    env->hostname);
 	ADD("sysname", "Linux");
-	ADD("release", env->os_release);
+	ADD("release", perf_env__os_release(env));
 	ADD("version", env->version);
 	ADD("machine", env->arch);
 	ADD("domain", "kernel");
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 339d62ca37bb..34b737950f73 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -330,6 +330,27 @@ int perf_env__kernel_is_64_bit(struct perf_env *env)
 	return env->kernel_is_64_bit;
 }
 
+const char *perf_env__os_release(struct perf_env *env)
+{
+	struct utsname uts;
+	int ret;
+
+	if (!env)
+		return perf_version_string;
+
+	if (env->os_release)
+		return env->os_release;
+
+	/*
+	 * The os_release is being accessed but wasn't initialized from a data
+	 * file, assume this is 'live' mode and use the release from uname. If
+	 * uname fails then use the current perf tool version.
+	 */
+	ret = uname(&uts);
+	env->os_release = strdup(ret < 0 ? perf_version_string : uts.release);
+	return env->os_release;
+}
+
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
 {
 	int i;
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index d36a0fb2cd04..56020f4381cd 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -175,6 +175,7 @@ void free_cpu_domain_info(struct cpu_domain_map **cd_map, u32 schedstat_version,
 void perf_env__exit(struct perf_env *env);
 
 int perf_env__kernel_is_64_bit(struct perf_env *env);
+const char *perf_env__os_release(struct perf_env *env);
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
 
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 9bb4a271b4f8..89115134f1d2 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -361,13 +361,19 @@ static int write_osrelease(struct feat_fd *ff,
 			   struct evlist *evlist __maybe_unused)
 {
 	struct utsname uts;
-	int ret;
+	const char *release = NULL;
 
-	ret = uname(&uts);
-	if (ret < 0)
-		return -1;
+	if (evlist->session)
+		release = perf_env__os_release(perf_session__env(evlist->session));
 
-	return do_write_string(ff, uts.release);
+	if (!release) {
+		int ret = uname(&uts);
+
+		if (ret < 0)
+			return -1;
+		release = uts.release;
+	}
+	return do_write_string(ff, release);
 }
 
 static int write_arch(struct feat_fd *ff, struct evlist *evlist)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index fcaeeddbbb6b..fd332db56157 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2209,7 +2209,7 @@ static int vmlinux_path__init(struct perf_env *env)
 {
 	struct utsname uts;
 	char bf[PATH_MAX];
-	char *kernel_version;
+	const char *kernel_version;
 	unsigned int i;
 
 	vmlinux_path = malloc(sizeof(char *) * (ARRAY_SIZE(vmlinux_paths) +
@@ -2226,7 +2226,7 @@ static int vmlinux_path__init(struct perf_env *env)
 		return 0;
 
 	if (env) {
-		kernel_version = env->os_release;
+		kernel_version = perf_env__os_release(env);
 	} else {
 		if (uname(&uts) < 0)
 			goto out_fail;
-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v6 3/3] perf symbol: Lazily compute idle and use the perf_env
  2026-04-09 23:06                     ` [PATCH v6 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-04-09 23:06                       ` [PATCH v6 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
  2026-04-09 23:06                       ` [PATCH v6 2/3] perf env: Add helper to lazily compute the os_release Ian Rogers
@ 2026-04-09 23:06                       ` Ian Rogers
  2 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-04-09 23:06 UTC (permalink / raw)
  To: namhyung
  Cc: irogers, acme, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk, tmricht

Move the idle boolean to a helper symbol__is_idle function. In the
function lazily compute whether a symbol is an idle function taking
into consideration the kernel version and architecture of the
machine. As symbols__insert no longer needs to know if a symbol is for
the kernel, remove the argument.

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/

The change switches x86 matches to use strstarts which means
intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
change suggested by Honglei Wang <jameshongleiwang@126.com> in:
https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-top.c     |   6 +-
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 103 ++++++++++++++++++++++-------------
 tools/perf/util/symbol.h     |  15 +++--
 4 files changed, 82 insertions(+), 44 deletions(-)
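
The lazy, cached classification described in the commit message can be
sketched on its own as follows. This is a simplified illustration
under stated assumptions, not the code in this patch: struct
sym_sketch, enum idle_kind, the shortened name table and the is_x86
flag are hypothetical stand-ins for struct symbol, enum
symbol_idle_kind, idle_symbols[] and the e_machine test.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum idle_kind { IDLE_UNKNOWN = 0, IDLE_NOT_IDLE = 1, IDLE_IDLE = 2 };

/* Hypothetical cut-down symbol; perf's struct symbol has many more fields. */
struct sym_sketch {
	const char *name;
	unsigned int idle : 2;	/* Caches an enum idle_kind value. */
};

/* Must stay sorted in strcmp order for bsearch to work. */
static const char *const idle_names[] = {
	"arch_cpu_idle",
	"default_idle",
	"native_safe_halt",
	"poll_idle",
};

static int name_cmp(const void *key, const void *elem)
{
	return strcmp(key, *(const char *const *)elem);
}

static bool starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

static bool sym_sketch__is_idle(struct sym_sketch *sym, bool is_x86)
{
	bool idle;

	/* Already classified: answer from the 2-bit cache. */
	if (sym->idle != IDLE_UNKNOWN)
		return sym->idle == IDLE_IDLE;

	idle = bsearch(sym->name, idle_names,
		       sizeof(idle_names) / sizeof(idle_names[0]),
		       sizeof(idle_names[0]), name_cmp) != NULL;

	/* Prefix match also covers e.g. intel_idle_irq and intel_idle_ibrs. */
	if (!idle && is_x86)
		idle = starts_with(sym->name, "intel_idle") ||
		       starts_with(sym->name, "mwait_idle");

	sym->idle = idle ? IDLE_IDLE : IDLE_NOT_IDLE;
	return idle;
}
```

Using 0 for "unknown" means a zero-initialized symbol needs no extra
setup, and only symbols that are actually queried are ever classified.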

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index f6eb543de537..95fa3a03e62d 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -751,6 +751,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 {
 	struct perf_top *top = container_of(tool, struct perf_top, tool);
 	struct addr_location al;
+	struct dso *dso = NULL;
 
 	if (!machine && perf_guest) {
 		static struct intlist *seen;
@@ -830,7 +831,10 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		}
 	}
 
-	if (al.sym == NULL || !al.sym->idle) {
+	if (al.map)
+		dso = map__dso(al.map);
+
+	if (al.sym == NULL || !symbol__is_idle(al.sym, dso, machine->env)) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 7afa8a117139..e8f7fe3f19fc 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1727,7 +1727,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index fd332db56157..482fd47bead2 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -25,6 +25,8 @@
 #include "demangle-ocaml.h"
 #include "demangle-rust-v0.h"
 #include "dso.h"
+#include "dwarf-regs.h"
+#include "env.h"
 #include "util.h" // lsdir()
 #include "event.h"
 #include "machine.h"
@@ -50,7 +52,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
 
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -358,8 +359,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -367,17 +367,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		sym->idle = symbol__is_idle(name);
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -394,7 +383,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -555,7 +544,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -717,47 +706,85 @@ int modules__parse(const char *filename, void *arg,
 	return err;
 }
 
+static int sym_name_cmp(const void *a, const void *b)
+{
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
 /*
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env)
 {
-	const char * const idle_symbols[] = {
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+
+	if (sym->idle)
+		return sym->idle == SYMBOL_IDLE__IDLE;
+
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		sym->idle = SYMBOL_IDLE__NOT_IDLE;
+		return false;
+	}
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
 
-	return strlist__has_entry(idle_symbols_list, name);
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		sym->idle = SYMBOL_IDLE__IDLE;
+		return true;
+	}
+
+	if (e_machine == EM_S390 && strstarts(name, "psw_idle")) {
+		int major = 0, minor = 0;
+		const char *release = perf_env__os_release(env);
+
+		/* Before v6.10, s390 used psw_idle. */
+		if (release && sscanf(release, "%d.%d", &major, &minor) == 2 &&
+		    (major < 6 || (major == 6 && minor < 10))) {
+			sym->idle = SYMBOL_IDLE__IDLE;
+			return true;
+		}
+	}
+
+	sym->idle = SYMBOL_IDLE__NOT_IDLE;
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -786,7 +813,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index bd6eb90c8668..7e0036f80185 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -26,6 +26,7 @@ struct dso;
 struct map;
 struct maps;
 struct option;
+struct perf_env;
 struct build_id;
 
 /*
@@ -43,6 +44,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -58,8 +65,8 @@ struct symbol {
 	u8		type:4;
 	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
 	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
+	/** Cache for symbol__is_idle holding enum symbol_idle_kind values. */
+	u8		idle:2;
 	/** Resolvable but tools ignore it (e.g. idle routines). */
 	u8		ignore:1;
 	/** Symbol for an inlined function. */
@@ -194,8 +201,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
@@ -278,5 +284,6 @@ enum {
 };
 
 int symbol__validate_sym_arguments(void);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env);
 
 #endif /* __PERF_SYMBOL */
-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v7 0/4] perf symbol/env: ELF machine clean up and lazy idle computation
  2026-04-09 23:06                       ` [PATCH v6 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-05-01 18:20                         ` Ian Rogers
  2026-05-01 18:20                           ` [PATCH v7 1/4] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
                                             ` (3 more replies)
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  1 sibling, 4 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-01 18:20 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add a helper to perf_env to compute the e_machine if it is
EM_NONE. Derive the value from the arch string if available. Similarly
derive the arch string from the ELF machine if available, for
consistency. This means perf's arch (machine type) is no longer
determined by uname but set to match that of the perf ELF executable.
  
Switch the idle computation to the point of use and lazily compute it,
rather than computing it for every symbol. The current only user is
perf top. At the point of use the perf_env is available and this can
be used to make sure the idle function computation is machine and
kernel version dependent.
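
As an illustration of what "kernel version dependent" can mean here,
the sketch below parses a release string such as "6.9.12-generic" and
tests whether it predates a given major.minor version, in the spirit
of the s390 psw_idle check (psw_idle only counts as idle before
v6.10). The helper name release_before is hypothetical and not part of
this series.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true if the release string parses as major.minor and is
 * strictly older than want_major.want_minor. Unparsable or missing
 * strings are treated as "not older".
 */
static bool release_before(const char *release, int want_major, int want_minor)
{
	int major = 0, minor = 0;

	if (!release || sscanf(release, "%d.%d", &major, &minor) != 2)
		return false;

	return major < want_major ||
	       (major == want_major && minor < want_minor);
}
```

With this shape, release_before(release, 6, 10) is true for "6.9.12"
and "5.15.0" but false for "6.10.0", so a release that cannot be
parsed never causes a symbol to be treated as idle.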
  
To avoid concurrent update issues with bitfields sharing a byte in
struct symbol due to the lazy computation, introduce a global lock for
updates to these fields and use setter functions. The reads remain
lockless.
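
The hazard is that storing to one bitfield is a read-modify-write of
the byte it shares with its neighbours, so two threads updating
different fields can lose one of the updates. A minimal sketch of the
"setters plus one global lock" idea follows. It is an assumption about
the general shape only: struct sym_flags, sym_flags_lock and the
setter names are hypothetical, and a pthread mutex is used here purely
for illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical flags that share a single byte, as in struct symbol. */
struct sym_flags {
	unsigned char idle : 2;
	unsigned char ignore : 1;
	unsigned char inlined : 1;
};

/* One global lock serializes every write to the shared byte. */
static pthread_mutex_t sym_flags_lock = PTHREAD_MUTEX_INITIALIZER;

static void sym_flags__set_idle(struct sym_flags *f, unsigned int kind)
{
	pthread_mutex_lock(&sym_flags_lock);
	f->idle = kind & 0x3;	/* Read-modify-write of the shared byte. */
	pthread_mutex_unlock(&sym_flags_lock);
}

static void sym_flags__set_ignore(struct sym_flags *f, bool ignore)
{
	pthread_mutex_lock(&sym_flags_lock);
	f->ignore = ignore;
	pthread_mutex_unlock(&sym_flags_lock);
}
```

A single global lock costs no space per symbol; the trade-off is that
all writers contend on it, which is acceptable when each field is
written rarely (for example once, when the lazy value is computed).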
  
v7:
 - Address better handling of strdup failures with arch in the header/env.
 - Address concurrent update issues in struct symbol bitfields by
   introducing a global lock for writes.
  
v6: Ensure arch is canonical by going to e_machine and back (Sashiko)
https://lore.kernel.org/linux-perf-users/20260409230620.4176210-1-irogers@google.com/

v5: Add perf_env os_release helper (Namhyung/Sashiko)
https://lore.kernel.org/lkml/20260406170905.2614260-1-irogers@google.com/
  
v4: Fix Sashiko issues where an array element wasn't sorted properly,
    the e_flags weren't returned properly, the idle type is changed to
    a u8 rather than an enum value and the s390 version check for
    psw_idle is slightly reordered and tweaked.
https://lore.kernel.org/lkml/20260327045025.2276517-1-irogers@google.com/
  
v3: Properly set up the e_machine coming from the perf_env as reported
    by Honglei Wang.
https://lore.kernel.org/lkml/20260326174521.1829203-1-irogers@google.com/
  
v2: Some minor white space clean up:
https://lore.kernel.org/lkml/20260325161836.1029457-1-irogers@google.com/
  
v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/

Ian Rogers (4):
  perf env: Add perf_env__e_machine helper and use in perf_env__arch
  perf env: Add helper to lazily compute the os_release
  perf symbol: Add setters for bitfields sharing a byte to avoid
    concurrent update issues
  perf symbol: Lazily compute idle and use a global lock for updates

 tools/perf/builtin-kwork.c        |   2 +-
 tools/perf/builtin-sched.c        |   2 +-
 tools/perf/util/annotate.c        |   2 +-
 tools/perf/util/data-convert-bt.c |   2 +-
 tools/perf/util/env.c             | 218 +++++++++++++++++++++++++-----
 tools/perf/util/env.h             |   2 +
 tools/perf/util/header.c          |  63 +++++++--
 tools/perf/util/session.c         |  25 ++--
 tools/perf/util/symbol-elf.c      |   2 +-
 tools/perf/util/symbol.c          | 134 ++++++++++++------
 tools/perf/util/symbol.h          |  17 ++-
 11 files changed, 357 insertions(+), 112 deletions(-)

-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v7 1/4] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-05-01 18:20                         ` [PATCH v7 0/4] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-05-01 18:20                           ` Ian Rogers
  2026-05-01 18:20                           ` [PATCH v7 2/4] perf env: Add helper to lazily compute the os_release Ian Rogers
                                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-01 18:20 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add a helper that lazily computes the e_machine and falls back to
EM_HOST. Use the perf_env's arch to compute the e_machine if
available. Use a binary search for some efficiency in this, but handle
somewhat complex duplicate rules. Switch perf_env__arch to be derived
from the e_machine for consistency. This switches arch from being
uname derived to matching that of the perf binary (via EM_HOST). Update
session to use the helper, which may mean using EM_HOST when no
threads are available. This also updates the perf data file header
that gets the e_machine/e_flags from the session.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c     | 197 +++++++++++++++++++++++++++++++-------
 tools/perf/util/env.h     |   1 +
 tools/perf/util/header.c  |  47 +++++++--
 tools/perf/util/session.c |  25 ++---
 4 files changed, 212 insertions(+), 58 deletions(-)
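
The prefix lookup with follow-up disambiguation described in the
commit message can be tried out with the reduced sketch below. It is
an illustration only: the table is abbreviated and the names struct
arch_prefix, prefix_cmp and arch_to_e_machine_sketch are hypothetical.
The EM_* constants come from <elf.h>.

```c
#include <elf.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct arch_prefix {
	const char *prefix;
	uint16_t e_machine;
};

/* Abbreviated table, sorted by prefix so that bsearch can be used. */
static const struct arch_prefix prefixes[] = {
	{ "aarch64", EM_AARCH64 },
	{ "arm",     EM_ARM },     /* Also matches "arm64". */
	{ "i386",    EM_386 },
	{ "i686",    EM_386 },
	{ "s390",    EM_S390 },
	{ "x86",     EM_X86_64 },  /* Also matches a 32-bit "x86". */
};

/* Compare only as many characters as the table entry's prefix has. */
static int prefix_cmp(const void *key, const void *elem)
{
	const struct arch_prefix *p = elem;

	return strncmp(key, p->prefix, strlen(p->prefix));
}

static uint16_t arch_to_e_machine_sketch(const char *arch, int is_64_bit)
{
	const struct arch_prefix *hit;

	hit = bsearch(arch, prefixes, sizeof(prefixes) / sizeof(prefixes[0]),
		      sizeof(prefixes[0]), prefix_cmp);
	if (!hit)
		return EM_NONE;

	/* Resolve prefixes shared by more than one ELF machine. */
	switch (hit->e_machine) {
	case EM_ARM:
		return strcmp(arch, "arm64") == 0 ? EM_AARCH64 : EM_ARM;
	case EM_X86_64:
		/* Unknown bitness (-1) is treated as 64-bit. */
		return (is_64_bit != 0 || strcmp(arch, "x86_64") == 0) ?
			EM_X86_64 : EM_386;
	default:
		return hit->e_machine;
	}
}
```

Because the comparator only looks at the entry's prefix, uname-style
strings such as "armv7l" or "s390x" and perf-style strings such as
"x86" resolve through the same table.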

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 1e54e2c86360..1671769d4441 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,10 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "cpumap.h"
+#include "dwarf-regs.h"
 #include "debug.h"
 #include "env.h"
 #include "util/header.h"
 #include "util/rwsem.h"
 #include <linux/compiler.h>
+#include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/rbtree.h>
 #include <linux/string.h>
@@ -588,51 +590,172 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
 	zfree(&cache->size);
 }
 
+struct arch_to_e_machine {
+	const char *prefix;
+	uint16_t e_machine;
+};
+
 /*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
+ * A mapping from an arch prefix string to an ELF machine that can be used in a
+ * bsearch. Some arch prefixes are shared and need additional processing as
+ * marked next to the architecture. The prefixes handle both perf's architecture
+ * naming and those from uname.
  */
-static const char *normalize_arch(char *arch)
-{
-	if (!strcmp(arch, "x86_64"))
-		return "x86";
-	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-		return "x86";
-	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-		return "sparc";
-	if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
-		return "arm64";
-	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-		return "arm";
-	if (!strncmp(arch, "s390", 4))
-		return "s390";
-	if (!strncmp(arch, "parisc", 6))
-		return "parisc";
-	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-		return "powerpc";
-	if (!strncmp(arch, "mips", 4))
-		return "mips";
-	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-		return "sh";
-	if (!strncmp(arch, "loongarch", 9))
-		return "loongarch";
-
-	return arch;
+static const struct arch_to_e_machine prefix_to_e_machine[] = {
+	{"aarch64", EM_AARCH64},
+	{"alpha", EM_ALPHA},
+	{"arc", EM_ARC},
+	{"arm", EM_ARM}, /* Check also for EM_AARCH64. */
+	{"avr", EM_AVR},  /* Check also for EM_AVR32. */
+	{"bfin", EM_BLACKFIN},
+	{"blackfin", EM_BLACKFIN},
+	{"cris", EM_CRIS},
+	{"csky", EM_CSKY},
+	{"hppa", EM_PARISC},
+	{"i386", EM_386},
+	{"i486", EM_386},
+	{"i586", EM_386},
+	{"i686", EM_386},
+	{"loongarch", EM_LOONGARCH},
+	{"m32r", EM_M32R},
+	{"m68k", EM_68K},
+	{"microblaze", EM_MICROBLAZE},
+	{"mips", EM_MIPS},
+	{"msp430", EM_MSP430},
+	{"parisc", EM_PARISC},
+	{"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"ppc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"riscv", EM_RISCV},
+	{"s390", EM_S390},
+	{"sa110", EM_ARM},
+	{"sh", EM_SH},
+	{"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
+	{"sun4u", EM_SPARC},
+	{"x86", EM_X86_64}, /* Check also for EM_386. */
+	{"xtensa", EM_XTENSA},
+};
+
+static int compare_prefix(const void *key, const void *element)
+{
+	const char *search_key = key;
+	const struct arch_to_e_machine *map_element = element;
+	size_t prefix_len = strlen(map_element->prefix);
+
+	return strncmp(search_key, map_element->prefix, prefix_len);
+}
+
+static uint16_t perf_arch_to_e_machine(const char *perf_arch, int is_64_bit)
+{
+	/* Binary search for a matching prefix. */
+	const struct arch_to_e_machine *result;
+
+	if (!perf_arch)
+		return EM_HOST;
+
+	result = bsearch(perf_arch,
+			 prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
+			 sizeof(prefix_to_e_machine[0]),
+			 compare_prefix);
+
+	if (!result) {
+		pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
+		return EM_NONE;
+	}
+
+	/*
+	 * Handle conflicting prefixes. If the is_64_bit is unknown (-1) then
+	 * assume 64-bit. We can't use perf_env__kernel_is_64_bit as that
+	 * depends on the arch string.
+	 */
+	switch (result->e_machine) {
+	case EM_ARM:
+		return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
+	case EM_AVR:
+		return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
+	case EM_PPC:
+		return (is_64_bit != 0) || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;
+	case EM_SPARC:
+		return (is_64_bit != 0) || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
+	case EM_X86_64:
+		return (is_64_bit != 0) || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
+	default:
+		return result->e_machine;
+	}
+}
+
+static const char *e_machine_to_perf_arch(uint16_t e_machine)
+{
+	/*
+	 * Table for if either the perf arch string differs from uname or there
+	 * are >1 ELF machine with the prefix.
+	 */
+	static const struct arch_to_e_machine extras[] = {
+		{"arm64", EM_AARCH64},
+		{"avr32", EM_AVR32},
+		{"powerpc", EM_PPC},
+		{"powerpc", EM_PPC64},
+		{"sparc", EM_SPARCV9},
+		{"x86", EM_386},
+		{"x86", EM_X86_64},
+		{"none", EM_NONE},
+	};
+
+	for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
+		if (extras[i].e_machine == e_machine)
+			return extras[i].prefix;
+	}
+
+	for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
+		if (prefix_to_e_machine[i].e_machine == e_machine)
+			return prefix_to_e_machine[i].prefix;
+
+	}
+	return "unknown";
+}
+
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
+{
+	if (!env) {
+		if (e_flags)
+			*e_flags = EF_HOST;
+
+		return EM_HOST;
+	}
+	if (env->e_machine == EM_NONE) {
+		env->e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
+
+		if (env->e_machine == EM_HOST)
+			env->e_flags = EF_HOST;
+	}
+	if (e_flags)
+		*e_flags = env->e_flags;
+
+	return env->e_machine;
 }
 
 const char *perf_env__arch(struct perf_env *env)
 {
-	char *arch_name;
+	uint16_t e_machine;
+	const char *arch;
 
-	if (!env || !env->arch) { /* Assume local operation */
-		static struct utsname uts = { .machine[0] = '\0', };
-		if (uts.machine[0] == '\0' && uname(&uts) < 0)
-			return NULL;
-		arch_name = uts.machine;
-	} else
-		arch_name = env->arch;
+	if (!env)
+		return e_machine_to_perf_arch(EM_HOST);
+
+	if (env->arch)
+		return env->arch;
 
-	return normalize_arch(arch_name);
+	/*
+	 * Lazily compute/allocate arch. The e_machine may have been
+	 * read from a data file and so may not be EM_HOST.
+	 */
+	e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+	arch = e_machine_to_perf_arch(e_machine);
+	env->arch = strdup(arch);
+	/*
+	 * Avoid potential crashes on the arch string if memory allocation in
+	 * strdup fails and NULL were to be returned.
+	 */
+	return env->arch ?: arch;
 }
 
 #if defined(HAVE_LIBTRACEEVENT)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index c7052ac1f856..d36a0fb2cd04 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -187,6 +187,7 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
 
 void cpu_cache_level__free(struct cpu_cache_level *cache);
 
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(struct perf_env *env, int err);
 const char *perf_env__cpuid(struct perf_env *env);
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f30e48eb3fc3..8d5152bde25d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -379,21 +379,28 @@ static int write_osrelease(struct feat_fd *ff,
 	return do_write_string(ff, uts.release);
 }
 
-static int write_arch(struct feat_fd *ff,
-		      struct evlist *evlist __maybe_unused)
+static int write_arch(struct feat_fd *ff, struct evlist *evlist)
 {
 	struct utsname uts;
-	int ret;
+	const char *arch = NULL;
 
-	ret = uname(&uts);
-	if (ret < 0)
-		return -1;
+	if (evlist->session) {
+		/* Force the computation in the perf_env of the e_machine of the threads. */
+		perf_session__e_machine(evlist->session, /*e_flags=*/NULL);
+		arch = perf_env__arch(perf_session__env(evlist->session));
+	}
+
+	if (!arch) {
+		int ret = uname(&uts);
 
-	return do_write_string(ff, uts.machine);
+		if (ret < 0)
+			return -1;
+		arch = uts.machine;
+	}
+	return do_write_string(ff, arch);
 }
 
-static int write_e_machine(struct feat_fd *ff,
-			   struct evlist *evlist __maybe_unused)
+static int write_e_machine(struct feat_fd *ff, struct evlist *evlist)
 {
 	/* e_machine expanded from 16 to 32-bits for alignment. */
 	uint32_t e_flags;
@@ -2684,10 +2691,30 @@ static int process_##__feat(struct feat_fd *ff, void *data __maybe_unused) \
 FEAT_PROCESS_STR_FUN(hostname, hostname);
 FEAT_PROCESS_STR_FUN(osrelease, os_release);
 FEAT_PROCESS_STR_FUN(version, version);
-FEAT_PROCESS_STR_FUN(arch, arch);
 FEAT_PROCESS_STR_FUN(cpudesc, cpu_desc);
 FEAT_PROCESS_STR_FUN(cpuid, cpuid);
 
+static int process_arch(struct feat_fd *ff, void *data __maybe_unused)
+{
+	uint16_t saved_e_machine = ff->ph->env.e_machine;
+
+	free(ff->ph->env.arch);
+	ff->ph->env.arch = do_read_string(ff);
+	if (!ff->ph->env.arch)
+		return -ENOMEM;
+	/*
+	 * Make the arch string canonical by computing the e_machine from it,
+	 * then turning the e_machine back into an arch string.
+	 */
+	ff->ph->env.e_machine = EM_NONE;
+	if (perf_env__e_machine(&ff->ph->env, /*e_flags=*/NULL) != EM_NONE) {
+		zfree(&ff->ph->env.arch);
+		perf_env__arch(&ff->ph->env);
+	}
+	ff->ph->env.e_machine = saved_e_machine;
+	return 0;
+}
+
 static int process_e_machine(struct feat_fd *ff, void *data __maybe_unused)
 {
 	int ret;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index fe0de2a0277f..bc7add02a2de 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -3023,14 +3023,19 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 		return EM_HOST;
 	}
 
+	/*
+	 * Is the env caching an e_machine? If not we want to compute from the
+	 * more accurate threads.
+	 */
 	env = perf_session__env(session);
-	if (env && env->e_machine != EM_NONE) {
-		if (e_flags)
-			*e_flags = env->e_flags;
-
-		return env->e_machine;
-	}
+	if (env && env->e_machine != EM_NONE)
+		return perf_env__e_machine(env, e_flags);
 
+	/*
+	 * Compute from threads, note this is more accurate than
+	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
+	 * mixed 32-bit and 64-bit threads.
+	 */
 	machines__for_each_thread(&session->machines,
 				  perf_session__e_machine_cb,
 				  &args);
@@ -3048,10 +3053,8 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 
 	/*
 	 * Couldn't determine from the perf_env or current set of
-	 * threads. Default to the host.
+	 * threads. Potentially use logic that uses the arch string otherwise
+	 * default to the host.
 	 */
-	if (e_flags)
-		*e_flags = EF_HOST;
-
-	return EM_HOST;
+	return perf_env__e_machine(env, e_flags);
 }
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v7 2/4] perf env: Add helper to lazily compute the os_release
  2026-05-01 18:20                         ` [PATCH v7 0/4] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-05-01 18:20                           ` [PATCH v7 1/4] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-05-01 18:20                           ` Ian Rogers
  2026-05-01 18:20                           ` [PATCH v7 3/4] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
  2026-05-01 18:20                           ` [PATCH v7 4/4] perf symbol: Lazily compute idle and use a global lock for updates Ian Rogers
  3 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-01 18:20 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

In live mode the os_release isn't initialized. Add a lazy
initialization helper that treats an uninitialized os_release as a
sign that this is live mode.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/data-convert-bt.c |  2 +-
 tools/perf/util/env.c             | 21 +++++++++++++++++++++
 tools/perf/util/env.h             |  1 +
 tools/perf/util/header.c          | 16 +++++++++++-----
 tools/perf/util/symbol.c          |  4 ++--
 5 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 3b8f2df823a9..2c88420fe33e 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1414,7 +1414,7 @@ do {									\
 
 	ADD("host",    env->hostname);
 	ADD("sysname", "Linux");
-	ADD("release", env->os_release);
+	ADD("release", perf_env__os_release(env));
 	ADD("version", env->version);
 	ADD("machine", env->arch);
 	ADD("domain", "kernel");
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 1671769d4441..c3e464c6de2f 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -330,6 +330,27 @@ int perf_env__kernel_is_64_bit(struct perf_env *env)
 	return env->kernel_is_64_bit;
 }
 
+const char *perf_env__os_release(struct perf_env *env)
+{
+	struct utsname uts;
+	int ret;
+
+	if (!env)
+		return perf_version_string;
+
+	if (env->os_release)
+		return env->os_release;
+
+	/*
+	 * The os_release is being accessed but wasn't initialized from a data
+	 * file, assume this is 'live' mode and use the release from uname. If
+	 * uname or strdup fails then use the current perf tool version.
+	 */
+	ret = uname(&uts);
+	env->os_release = strdup(ret < 0 ? perf_version_string : uts.release);
+	return env->os_release ?: perf_version_string;
+}
+
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
 {
 	int i;
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index d36a0fb2cd04..56020f4381cd 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -175,6 +175,7 @@ void free_cpu_domain_info(struct cpu_domain_map **cd_map, u32 schedstat_version,
 void perf_env__exit(struct perf_env *env);
 
 int perf_env__kernel_is_64_bit(struct perf_env *env);
+const char *perf_env__os_release(struct perf_env *env);
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
 
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 8d5152bde25d..cfafed3cc69f 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -370,13 +370,19 @@ static int write_osrelease(struct feat_fd *ff,
 			   struct evlist *evlist __maybe_unused)
 {
 	struct utsname uts;
-	int ret;
+	const char *release = NULL;
 
-	ret = uname(&uts);
-	if (ret < 0)
-		return -1;
+	if (evlist->session)
+		release = perf_env__os_release(perf_session__env(evlist->session));
 
-	return do_write_string(ff, uts.release);
+	if (!release) {
+		int ret = uname(&uts);
+
+		if (ret < 0)
+			return -1;
+		release = uts.release;
+	}
+	return do_write_string(ff, release);
 }
 
 static int write_arch(struct feat_fd *ff, struct evlist *evlist)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index fcaeeddbbb6b..fd332db56157 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2209,7 +2209,7 @@ static int vmlinux_path__init(struct perf_env *env)
 {
 	struct utsname uts;
 	char bf[PATH_MAX];
-	char *kernel_version;
+	const char *kernel_version;
 	unsigned int i;
 
 	vmlinux_path = malloc(sizeof(char *) * (ARRAY_SIZE(vmlinux_paths) +
@@ -2226,7 +2226,7 @@ static int vmlinux_path__init(struct perf_env *env)
 		return 0;
 
 	if (env) {
-		kernel_version = env->os_release;
+		kernel_version = perf_env__os_release(env);
 	} else {
 		if (uname(&uts) < 0)
 			goto out_fail;
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v7 3/4] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues
  2026-05-01 18:20                         ` [PATCH v7 0/4] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-05-01 18:20                           ` [PATCH v7 1/4] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
  2026-05-01 18:20                           ` [PATCH v7 2/4] perf env: Add helper to lazily compute the os_release Ian Rogers
@ 2026-05-01 18:20                           ` Ian Rogers
  2026-05-01 18:20                           ` [PATCH v7 4/4] perf symbol: Lazily compute idle and use a global lock for updates Ian Rogers
  3 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-01 18:20 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

A problem with putting bitfields into struct symbol is that other bits in
the symbol could be updated concurrently and only one update to the
underlying storage unit happens, leading to lost updates.

To avoid this, introduce a global lock `symbol_bits_lock` in `symbol.c`
and helper functions to update the bits sharing a byte:
`symbol__set_ignore` and `symbol__set_annotate2`.

`inlined` is not given a setter as it is only initialized in
`new_inline_sym` when the symbol is under construction and not shared.
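
To make the hazard concrete, here is a minimal standalone sketch. It uses
pthreads directly and a hypothetical `demo_symbol` type rather than perf's
`struct symbol` and mutex wrappers. A bitfield write is a read-modify-write
of the whole storage unit, so two unlocked writers to different flags can
overwrite each other's bit; routing every write through one lock avoids that.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical reduced symbol: three flags sharing one storage unit. */
struct demo_symbol {
	unsigned int ignore:1;
	unsigned int inlined:1;
	unsigned int annotate2:1;
};

/* One lock for all flag writes, so read-modify-write cycles never interleave. */
static pthread_mutex_t demo_bits_lock = PTHREAD_MUTEX_INITIALIZER;

static void demo_symbol__set_ignore(struct demo_symbol *sym, bool ignore)
{
	pthread_mutex_lock(&demo_bits_lock);
	sym->ignore = ignore;	/* load unit, change one bit, store unit */
	pthread_mutex_unlock(&demo_bits_lock);
}

static void demo_symbol__set_annotate2(struct demo_symbol *sym, bool annotate2)
{
	pthread_mutex_lock(&demo_bits_lock);
	sym->annotate2 = annotate2;
	pthread_mutex_unlock(&demo_bits_lock);
}
```

A single global lock is coarse, but flag writes are rare and it costs no
extra space per symbol.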

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-kwork.c |  2 +-
 tools/perf/builtin-sched.c |  2 +-
 tools/perf/util/annotate.c |  2 +-
 tools/perf/util/symbol.c   | 22 ++++++++++++++++++++++
 tools/perf/util/symbol.h   |  3 +++
 5 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 9d3a4c779a41..7337ee956dc9 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -725,7 +725,7 @@ static void timehist_save_callchain(struct perf_kwork *kwork,
 		if (sym) {
 			if (!strcmp(sym->name, "__softirqentry_text_start") ||
 			    !strcmp(sym->name, "__do_softirq"))
-				sym->ignore = 1;
+				symbol__set_ignore(sym, true);
 		}
 
 		callchain_cursor_advance(cursor);
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..655e95f660c2 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2371,7 +2371,7 @@ static void save_task_callchain(struct perf_sched *sched,
 			if (!strcmp(sym->name, "schedule") ||
 			    !strcmp(sym->name, "__schedule") ||
 			    !strcmp(sym->name, "preempt_schedule"))
-				sym->ignore = 1;
+				symbol__set_ignore(sym, true);
 		}
 
 		callchain_cursor_advance(cursor);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e745f3034a0e..d550a0061159 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2224,7 +2224,7 @@ int symbol__annotate2(struct map_symbol *ms, struct evsel *evsel,
 
 	annotation__init_column_widths(notes, sym);
 	annotation__update_column_widths(notes);
-	sym->annotate2 = 1;
+	symbol__set_annotate2(sym, true);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index fd332db56157..e6a1f23634ec 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -31,6 +31,7 @@
 #include "map.h"
 #include "symbol.h"
 #include "map_symbol.h"
+#include "mutex.h"
 #include "mem-events.h"
 #include "mem-info.h"
 #include "symsrc.h"
@@ -52,6 +53,8 @@ static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
 static bool symbol__is_idle(const char *name);
 
+static struct mutex symbol_bits_lock;
+
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
 
@@ -345,6 +348,20 @@ void symbol__delete(struct symbol *sym)
 	free(((void *)sym) - symbol_conf.priv_size);
 }
 
+void symbol__set_ignore(struct symbol *sym, bool ignore)
+{
+	mutex_lock(&symbol_bits_lock);
+	sym->ignore = ignore;
+	mutex_unlock(&symbol_bits_lock);
+}
+
+void symbol__set_annotate2(struct symbol *sym, bool annotate2)
+{
+	mutex_lock(&symbol_bits_lock);
+	sym->annotate2 = annotate2;
+	mutex_unlock(&symbol_bits_lock);
+}
+
 void symbols__delete(struct rb_root_cached *symbols)
 {
 	struct symbol *pos;
@@ -2398,6 +2415,8 @@ int symbol__init(struct perf_env *env)
 	if (symbol_conf.initialized)
 		return 0;
 
+	mutex_init(&symbol_bits_lock);
+
 	symbol_conf.priv_size = PERF_ALIGN(symbol_conf.priv_size, sizeof(u64));
 
 	symbol__elf_init();
@@ -2476,6 +2495,9 @@ void symbol__exit(void)
 {
 	if (!symbol_conf.initialized)
 		return;
+
+	mutex_destroy(&symbol_bits_lock);
+
 	strlist__delete(symbol_conf.bt_stop_list);
 	strlist__delete(symbol_conf.sym_list);
 	strlist__delete(symbol_conf.dso_list);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index bd6eb90c8668..5d98d7e84d57 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -77,6 +77,9 @@ struct symbol {
 void symbol__delete(struct symbol *sym);
 void symbols__delete(struct rb_root_cached *symbols);
 
+void symbol__set_ignore(struct symbol *sym, bool ignore);
+void symbol__set_annotate2(struct symbol *sym, bool annotate2);
+
 /* symbols__for_each_entry - iterate over symbols (rb_root)
  *
  * @symbols: the rb_root of symbols
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v7 4/4] perf symbol: Lazily compute idle and use a global lock for updates
  2026-05-01 18:20                         ` [PATCH v7 0/4] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (2 preceding siblings ...)
  2026-05-01 18:20                           ` [PATCH v7 3/4] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
@ 2026-05-01 18:20                           ` Ian Rogers
  3 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-01 18:20 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Move the idle boolean to a helper symbol__is_idle function. In the
function lazily compute whether a symbol is an idle function taking
into consideration the kernel version and architecture of the
machine. As symbols__insert no longer needs to know if a symbol is for
the kernel, remove the argument.

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/

The change switches x86 matches to use strstarts which means
intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
change suggested by Honglei Wang <jameshongleiwang@126.com> in:
https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/

To avoid concurrent update issues with other bitfields in `struct symbol`,
this change uses the global lock `symbol_bits_lock` (introduced in a
previous commit) for updates to the `idle` field. A static helper
`symbol__set_idle` taking a boolean is used to encapsulate the lock and
mapping to `enum symbol_idle_kind`.
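
The caching scheme can be sketched in isolation as below. This is a
simplified illustration, not the patch itself: `demo_symbol`, `demo_arch`
and the helper names are hypothetical, locking is omitted, and only two of
the name rules are shown. A 2-bit field distinguishes "not computed yet"
from the two computed answers.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum demo_idle_kind {
	DEMO_IDLE__UNKNOWN = 0,	/* not computed yet */
	DEMO_IDLE__NOT_IDLE = 1,
	DEMO_IDLE__IDLE = 2,
};

/* Hypothetical architecture selector, standing in for the ELF machine. */
enum demo_arch { DEMO_ARCH_X86, DEMO_ARCH_S390 };

struct demo_symbol {
	const char *name;
	unsigned int idle:2;	/* holds enum demo_idle_kind */
};

static bool demo_strstarts(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* True when release parses as a version older than major.minor. */
static bool demo_release_before(const char *release, int major, int minor)
{
	int maj = 0, min = 0;

	if (!release || sscanf(release, "%d.%d", &maj, &min) != 2)
		return false;
	return maj < major || (maj == major && min < minor);
}

static bool demo_symbol__is_idle(struct demo_symbol *sym, enum demo_arch arch,
				 const char *release)
{
	bool idle = false;

	/* Fast path: the answer was computed on an earlier call. */
	if (sym->idle != DEMO_IDLE__UNKNOWN)
		return sym->idle == DEMO_IDLE__IDLE;

	if (arch == DEMO_ARCH_X86 && demo_strstarts(sym->name, "intel_idle"))
		idle = true;
	else if (arch == DEMO_ARCH_S390 && demo_strstarts(sym->name, "psw_idle"))
		idle = demo_release_before(release, 6, 10);

	sym->idle = idle ? DEMO_IDLE__IDLE : DEMO_IDLE__NOT_IDLE;
	return idle;
}
```

Because the result is cached in the symbol, later calls ignore their
arguments; that is acceptable when a symbol only ever belongs to one
machine and kernel release.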

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 108 +++++++++++++++++++++++------------
 tools/perf/util/symbol.h     |  14 +++--
 3 files changed, 81 insertions(+), 43 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 7afa8a117139..e8f7fe3f19fc 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1727,7 +1727,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index e6a1f23634ec..8ec4b2836b44 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -51,7 +51,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
 
 static struct mutex symbol_bits_lock;
 
@@ -362,6 +361,13 @@ void symbol__set_annotate2(struct symbol *sym, bool annotate2)
 	mutex_unlock(&symbol_bits_lock);
 }
 
+static void symbol__set_idle(struct symbol *sym, bool idle)
+{
+	mutex_lock(&symbol_bits_lock);
+	sym->idle = idle ? SYMBOL_IDLE__IDLE : SYMBOL_IDLE__NOT_IDLE;
+	mutex_unlock(&symbol_bits_lock);
+}
+
 void symbols__delete(struct rb_root_cached *symbols)
 {
 	struct symbol *pos;
@@ -375,8 +381,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -384,17 +389,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		sym->idle = symbol__is_idle(name);
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -411,7 +405,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -572,7 +566,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -738,43 +732,81 @@ int modules__parse(const char *filename, void *arg,
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+static int sym_name_cmp(const void *a, const void *b)
 {
-	const char * const idle_symbols[] = {
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env)
+{
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	if (sym->idle)
+		return sym->idle == SYMBOL_IDLE__IDLE;
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		symbol__set_idle(sym, /*idle=*/false);
+		return false;
+	}
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
+
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		symbol__set_idle(sym, /*idle=*/true);
+		return true;
+	}
+
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			symbol__set_idle(sym, /*idle=*/true);
+			return true;
+		}
+	}
 
-	return strlist__has_entry(idle_symbols_list, name);
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		symbol__set_idle(sym, /*idle=*/true);
+		return true;
+	}
+
+	if (e_machine == EM_S390 && strstarts(name, "psw_idle")) {
+		int major = 0, minor = 0;
+		const char *release = perf_env__os_release(env);
+
+		/* Before v6.10, s390 used psw_idle. */
+		if (release && sscanf(release, "%d.%d", &major, &minor) == 2 &&
+		    (major < 6 || (major == 6 && minor < 10))) {
+			symbol__set_idle(sym, /*idle=*/true);
+			return true;
+		}
+	}
+
+	symbol__set_idle(sym, /*idle=*/false);
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -803,7 +835,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 5d98d7e84d57..717d2f876d58 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -43,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -58,8 +64,8 @@ struct symbol {
 	u8		type:4;
 	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
 	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
+	/** Cache for symbol__is_idle holding enum symbol_idle_kind values. */
+	u8		idle:2;
 	/** Resolvable but tools ignore it (e.g. idle routines). */
 	u8		ignore:1;
 	/** Symbol for an inlined function. */
@@ -197,8 +203,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
@@ -281,5 +286,6 @@ enum {
 };
 
 int symbol__validate_sym_arguments(void);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env);
 
 #endif /* __PERF_SYMBOL */
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation
  2026-04-09 23:06                       ` [PATCH v6 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
  2026-05-01 18:20                         ` [PATCH v7 0/4] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-05-02  6:59                         ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 01/17] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
                                             ` (17 more replies)
  1 sibling, 18 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add a helper to perf_env to compute the e_machine if it is EM_NONE.
Derive the value from the arch string if available. Similarly derive
the arch string from the ELF machine if available, for
consistency. This means perf's arch (machine type) is no longer
determined by uname but set to match that of the perf ELF executable.
  
Switch the idle computation to the point of use and lazily compute it,
rather than computing it for every symbol. Currently the only user is
`perf top`. At the point of use the perf_env is available and this can
be used to make sure the idle function computation is machine and
kernel version dependent.
  
To avoid concurrent update issues with bitfields sharing a byte in
`struct symbol` due to the lazy computation, introduce a global lock
for updates to these fields and use setter functions. The reads remain
lockless.
  
v8:
 - Address Sashiko AI review feedback for Patch 1:
   - Switch all code dependent on the arch string to use `e_machine`
     instead (e.g., in `perf c2c`, `perf lock-contention`, `perf
     header`, `perf arch common`, `tests/topology.c`,
     `perf_env__init_kernel_mode`).
   - Update `machine__is` and `machine__normalized_is` to take
     `e_machine` integers instead of strings.
   - Refactor `arch_syscalls__strerrno_function` (generated via
     `arch_errno_names.sh`) to take an `e_machine` instead of an arch
     string.
   - Avoid premature caching of the host architecture in
     `perf_session__e_machine` by using a non-caching helper when
     threads are not yet available.
  
v7:
 - Address better handling of strdup failures with arch in the
   header/env.
 - Address concurrent update issues in `struct symbol` bitfields by
   introducing a global lock for writes.
https://lore.kernel.org/linux-perf-users/20260501182021.3651851-1-irogers@google.com/

v6: Ensure arch is canonical by going to e_machine and back (Sashiko)
https://lore.kernel.org/linux-perf-users/20260409230620.4176210-1-irogers@google.com/
  
v5: Add perf_env os_release helper (Namhyung/Sashiko)
https://lore.kernel.org/lkml/20260406170905.2614260-1-irogers@google.com/
  
v4: Fix Sashiko issues where an array element wasn't sorted properly,
    the e_flags weren't returned properly, the idle type is changed to
    a u8 rather than an enum value and the s390 version check for
    psw_idle is slightly reordered and tweaked.
https://lore.kernel.org/lkml/20260327045025.2276517-1-irogers@google.com/
  
v3: Properly set up the e_machine coming from the perf_env as reported
    by Honglei Wang.
https://lore.kernel.org/lkml/20260326174521.1829203-1-irogers@google.com/
  
v2: Some minor white space clean up:
https://lore.kernel.org/lkml/20260325161836.1029457-1-irogers@google.com/
  
v1: https://lore.kernel.org/lkml/20260302234343.564937-1-irogers@google.com/

Ian Rogers (17):
  perf env: Add perf_env__e_machine helper and use in perf_env__arch
  perf tests topology: Switch env->arch use to env->e_machine
  perf capstone: Determine architecture from e_machine
  perf print_insn: Use e_machine for fallback IP length check
  perf machine: Use perf_env e_machine rather than arch
  perf sample-raw: Use perf_env e_machine rather than arch
  perf sort: Use perf_env e_machine rather than arch
  perf symbol: Avoid use of machine__is
  perf arch common: Use perf_env e_machine rather than arch
  perf header: In print_pmu_caps use perf_env e_machine
  perf c2c: Use perf_env e_machine rather than arch
  perf lock-contention: Use perf_env e_machine rather than arch
  perf env: Refactor perf_env__arch_strerrno
  perf env: Remove unused perf_env__raw_arch
  perf env: Add helper to lazily compute the os_release
  perf symbol: Add setters for bitfields sharing a byte to avoid
    concurrent update issues
  perf symbol: Lazily compute idle and use a global lock for updates

 tools/perf/arch/common.c                    |  55 ++--
 tools/perf/builtin-c2c.c                    |   2 +-
 tools/perf/builtin-kwork.c                  |   2 +-
 tools/perf/builtin-sched.c                  |   2 +-
 tools/perf/builtin-trace.c                  |   5 +-
 tools/perf/tests/topology.c                 |   8 +-
 tools/perf/trace/beauty/arch_errno_names.sh |  40 ++-
 tools/perf/util/annotate.c                  |   2 +-
 tools/perf/util/capstone.c                  | 115 +++++---
 tools/perf/util/data-convert-bt.c           |   2 +-
 tools/perf/util/env.c                       | 283 +++++++++++++++-----
 tools/perf/util/env.h                       |  11 +-
 tools/perf/util/header.c                    |  70 +++--
 tools/perf/util/lock-contention.c           |   6 +-
 tools/perf/util/machine.c                   |  25 +-
 tools/perf/util/machine.h                   |   2 -
 tools/perf/util/print_insn.c                |   8 +-
 tools/perf/util/sample-raw.c                |  18 +-
 tools/perf/util/session.c                   |  26 +-
 tools/perf/util/sort.c                      |  12 +-
 tools/perf/util/symbol-elf.c                |   2 +-
 tools/perf/util/symbol.c                    | 163 +++++++----
 tools/perf/util/symbol.h                    |  17 +-
 23 files changed, 612 insertions(+), 264 deletions(-)

-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v8 01/17] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 02/17] perf tests topology: Switch env->arch use to env->e_machine Ian Rogers
                                             ` (16 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add a helper that lazily computes the e_machine and falls back to
EM_HOST. Use the perf_env's arch to compute the e_machine if
available. Use a binary search for some efficiency in this, but handle
somewhat complex duplicate rules. Switch perf_env__arch to be derived
from the e_machine for consistency. This switches arch from being uname
derived to matching that of the perf binary (via EM_HOST). Update
session to use the helper, which may mean using EM_HOST when no
threads are available. This also updates the perf data file header
that gets the e_machine/e_flags from the session.
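
The prefix binary search with a duplicate fix-up can be sketched as
follows. This is a reduced illustration under stated assumptions: the
table is a small subset, and the numeric ids are hypothetical placeholders
rather than real EM_* values.

```c
#include <stdlib.h>
#include <string.h>

struct demo_prefix_map {
	const char *prefix;
	int id;
};

/* Must stay sorted by prefix for bsearch. The ids are hypothetical. */
static const struct demo_prefix_map demo_map[] = {
	{"aarch64", 1},
	{"arm", 2},	/* also matches "arm64", resolved after the search */
	{"mips", 3},
	{"s390", 4},
	{"x86", 5},
};

/* Compare only the first strlen(prefix) characters of the key. */
static int demo_compare_prefix(const void *key, const void *element)
{
	const char *arch = key;
	const struct demo_prefix_map *entry = element;

	return strncmp(arch, entry->prefix, strlen(entry->prefix));
}

/* Returns the mapped id, or 0 if no prefix matches. */
static int demo_arch_to_id(const char *arch)
{
	const struct demo_prefix_map *hit;

	hit = bsearch(arch, demo_map, sizeof(demo_map) / sizeof(demo_map[0]),
		      sizeof(demo_map[0]), demo_compare_prefix);
	if (!hit)
		return 0;
	/* "arm64" lands on the "arm" entry; fix it up to the 64-bit id. */
	if (hit->id == 2 && !strcmp(arch, "arm64"))
		return 1;
	return hit->id;
}
```

The prefix comparison keeps the table short (one "arm" entry covers
"armv7l", "armv8l" and so on), at the cost of the explicit fix-up step for
names that share a prefix but map to a different machine.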

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c     | 231 +++++++++++++++++++++++++++++++-------
 tools/perf/util/env.h     |   2 +
 tools/perf/util/header.c  |  47 ++++++--
 tools/perf/util/session.c |  26 +++--
 4 files changed, 243 insertions(+), 63 deletions(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 1e54e2c86360..4ff4caab3b32 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,10 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "cpumap.h"
+#include "dwarf-regs.h"
 #include "debug.h"
 #include "env.h"
 #include "util/header.h"
 #include "util/rwsem.h"
 #include <linux/compiler.h>
+#include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/rbtree.h>
 #include <linux/string.h>
@@ -309,12 +311,21 @@ void perf_env__init(struct perf_env *env)
 
 static void perf_env__init_kernel_mode(struct perf_env *env)
 {
-	const char *arch = perf_env__raw_arch(env);
+	uint16_t e_machine = env->e_machine;
 
-	if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) ||
-	    !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) ||
-	    !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
-	    !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7))
+	if (env->arch && (e_machine == EM_NONE || e_machine == EM_MIPS || e_machine == EM_RISCV)) {
+		if (str_ends_with(env->arch, "64") || !strncmp(env->arch, "s390x", 5))
+			env->kernel_is_64_bit = 1;
+		else
+			env->kernel_is_64_bit = 0;
+		return;
+	}
+	if (e_machine == EM_NONE)
+		e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+
+	if (e_machine == EM_X86_64 || e_machine == EM_AARCH64 ||
+	    e_machine == EM_PPC64 || e_machine == EM_SPARCV9 ||
+	    e_machine == EM_S390)
 		env->kernel_is_64_bit = 1;
 	else
 		env->kernel_is_64_bit = 0;
@@ -588,51 +599,187 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
 	zfree(&cache->size);
 }
 
+struct arch_to_e_machine {
+	const char *prefix;
+	uint16_t e_machine;
+};
+
 /*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
+ * A mapping from an arch prefix string to an ELF machine that can be used in a
+ * bsearch. Some arch prefixes are shared and need additional processing as
+ * marked next to the architecture. The prefixes handle both perf's architecture
+ * naming and those from uname.
  */
-static const char *normalize_arch(char *arch)
-{
-	if (!strcmp(arch, "x86_64"))
-		return "x86";
-	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-		return "x86";
-	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-		return "sparc";
-	if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
-		return "arm64";
-	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-		return "arm";
-	if (!strncmp(arch, "s390", 4))
-		return "s390";
-	if (!strncmp(arch, "parisc", 6))
-		return "parisc";
-	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-		return "powerpc";
-	if (!strncmp(arch, "mips", 4))
-		return "mips";
-	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-		return "sh";
-	if (!strncmp(arch, "loongarch", 9))
-		return "loongarch";
-
-	return arch;
+static const struct arch_to_e_machine prefix_to_e_machine[] = {
+	{"aarch64", EM_AARCH64},
+	{"alpha", EM_ALPHA},
+	{"arc", EM_ARC},
+	{"arm", EM_ARM}, /* Check also for EM_AARCH64. */
+	{"avr", EM_AVR},  /* Check also for EM_AVR32. */
+	{"bfin", EM_BLACKFIN},
+	{"blackfin", EM_BLACKFIN},
+	{"cris", EM_CRIS},
+	{"csky", EM_CSKY},
+	{"hppa", EM_PARISC},
+	{"i386", EM_386},
+	{"i486", EM_386},
+	{"i586", EM_386},
+	{"i686", EM_386},
+	{"loongarch", EM_LOONGARCH},
+	{"m32r", EM_M32R},
+	{"m68k", EM_68K},
+	{"microblaze", EM_MICROBLAZE},
+	{"mips", EM_MIPS},
+	{"msp430", EM_MSP430},
+	{"parisc", EM_PARISC},
+	{"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"ppc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"riscv", EM_RISCV},
+	{"s390", EM_S390},
+	{"sa110", EM_ARM},
+	{"sh", EM_SH},
+	{"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
+	{"sun4u", EM_SPARC},
+	{"x86", EM_X86_64}, /* Check also for EM_386. */
+	{"xtensa", EM_XTENSA},
+};
+
+static int compare_prefix(const void *key, const void *element)
+{
+	const char *search_key = key;
+	const struct arch_to_e_machine *map_element = element;
+	size_t prefix_len = strlen(map_element->prefix);
+
+	return strncmp(search_key, map_element->prefix, prefix_len);
+}
+
+static uint16_t perf_arch_to_e_machine(const char *perf_arch, int is_64_bit)
+{
+	/* Binary search for a matching prefix. */
+	const struct arch_to_e_machine *result;
+
+	if (!perf_arch)
+		return EM_HOST;
+
+	result = bsearch(perf_arch,
+			 prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
+			 sizeof(prefix_to_e_machine[0]),
+			 compare_prefix);
+
+	if (!result) {
+		pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
+		return EM_NONE;
+	}
+
+	/*
+	 * Handle conflicting prefixes. If the is_64_bit is unknown (-1) then
+	 * assume 64-bit. We can't use perf_env__kernel_is_64_bit as that
+	 * depends on the arch string.
+	 */
+	switch (result->e_machine) {
+	case EM_ARM:
+		return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
+	case EM_AVR:
+		return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
+	case EM_PPC:
+		return (is_64_bit != 0) || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;
+	case EM_SPARC:
+		return (is_64_bit != 0) || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
+	case EM_X86_64:
+		return (is_64_bit != 0) || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
+	default:
+		return result->e_machine;
+	}
+}
+
+static const char *e_machine_to_perf_arch(uint16_t e_machine)
+{
+	/*
+	 * Table for if either the perf arch string differs from uname or there
+	 * are >1 ELF machine with the prefix.
+	 */
+	static const struct arch_to_e_machine extras[] = {
+		{"arm64", EM_AARCH64},
+		{"avr32", EM_AVR32},
+		{"powerpc", EM_PPC},
+		{"powerpc", EM_PPC64},
+		{"sparc", EM_SPARCV9},
+		{"x86", EM_386},
+		{"x86", EM_X86_64},
+		{"none", EM_NONE},
+	};
+
+	for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
+		if (extras[i].e_machine == e_machine)
+			return extras[i].prefix;
+	}
+
+	for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
+		if (prefix_to_e_machine[i].e_machine == e_machine)
+			return prefix_to_e_machine[i].prefix;
+
+	}
+	return "unknown";
+}
+
+uint16_t perf_env__e_machine_nocache(struct perf_env *env, uint32_t *e_flags)
+{
+	uint16_t e_machine = EM_HOST;
+
+	if (env)
+		e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
+
+	if (e_flags && e_machine == EM_HOST)
+		*e_flags = EF_HOST;
+
+	return e_machine;
+}
+
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
+{
+	uint16_t e_machine;
+	uint32_t local_e_flags = 0;
+
+	if (env && env->e_machine != EM_NONE) {
+		if (e_flags)
+			*e_flags = env->e_flags;
+
+		return env->e_machine;
+	}
+	e_machine = perf_env__e_machine_nocache(env, &local_e_flags);
+	if (env) {
+		env->e_machine = e_machine;
+		env->e_flags = local_e_flags;
+	}
+	if (e_flags)
+		*e_flags = local_e_flags;
+
+	return e_machine;
 }
 
 const char *perf_env__arch(struct perf_env *env)
 {
-	char *arch_name;
+	uint16_t e_machine;
+	const char *arch;
 
-	if (!env || !env->arch) { /* Assume local operation */
-		static struct utsname uts = { .machine[0] = '\0', };
-		if (uts.machine[0] == '\0' && uname(&uts) < 0)
-			return NULL;
-		arch_name = uts.machine;
-	} else
-		arch_name = env->arch;
+	if (!env)
+		return e_machine_to_perf_arch(EM_HOST);
+
+	if (env->arch)
+		return env->arch;
 
-	return normalize_arch(arch_name);
+	/*
+	 * Lazily compute/allocate arch. The e_machine may have been
+	 * read from a data file and so may not be EM_HOST.
+	 */
+	e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+	arch = e_machine_to_perf_arch(e_machine);
+	env->arch = strdup(arch);
+	/*
+	 * Avoid potential crashes on the arch string if memory allocation in
+	 * strdup fails and NULL were to be returned.
+	 */
+	return env->arch ?: arch;
 }
 
 #if defined(HAVE_LIBTRACEEVENT)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index c7052ac1f856..7151a9138e3f 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -187,6 +187,8 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
 
 void cpu_cache_level__free(struct cpu_cache_level *cache);
 
+uint16_t perf_env__e_machine_nocache(struct perf_env *env, uint32_t *e_flags);
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(struct perf_env *env, int err);
 const char *perf_env__cpuid(struct perf_env *env);
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f30e48eb3fc3..8d5152bde25d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -379,21 +379,28 @@ static int write_osrelease(struct feat_fd *ff,
 	return do_write_string(ff, uts.release);
 }
 
-static int write_arch(struct feat_fd *ff,
-		      struct evlist *evlist __maybe_unused)
+static int write_arch(struct feat_fd *ff, struct evlist *evlist)
 {
 	struct utsname uts;
-	int ret;
+	const char *arch = NULL;
 
-	ret = uname(&uts);
-	if (ret < 0)
-		return -1;
+	if (evlist->session) {
+		/* Force the computation in the perf_env of the e_machine of the threads. */
+		perf_session__e_machine(evlist->session, /*e_flags=*/NULL);
+		arch = perf_env__arch(perf_session__env(evlist->session));
+	}
+
+	if (!arch) {
+		int ret = uname(&uts);
 
-	return do_write_string(ff, uts.machine);
+		if (ret < 0)
+			return -1;
+		arch = uts.machine;
+	}
+	return do_write_string(ff, arch);
 }
 
-static int write_e_machine(struct feat_fd *ff,
-			   struct evlist *evlist __maybe_unused)
+static int write_e_machine(struct feat_fd *ff, struct evlist *evlist)
 {
 	/* e_machine expanded from 16 to 32-bits for alignment. */
 	uint32_t e_flags;
@@ -2684,10 +2691,30 @@ static int process_##__feat(struct feat_fd *ff, void *data __maybe_unused) \
 FEAT_PROCESS_STR_FUN(hostname, hostname);
 FEAT_PROCESS_STR_FUN(osrelease, os_release);
 FEAT_PROCESS_STR_FUN(version, version);
-FEAT_PROCESS_STR_FUN(arch, arch);
 FEAT_PROCESS_STR_FUN(cpudesc, cpu_desc);
 FEAT_PROCESS_STR_FUN(cpuid, cpuid);
 
+static int process_arch(struct feat_fd *ff, void *data __maybe_unused)
+{
+	uint16_t saved_e_machine = ff->ph->env.e_machine;
+
+	free(ff->ph->env.arch);
+	ff->ph->env.arch = do_read_string(ff);
+	if (!ff->ph->env.arch)
+		return -ENOMEM;
+	/*
+	 * Make the arch string canonical by computing the e_machine from it,
+	 * then turning the e_machine back into an arch string.
+	 */
+	ff->ph->env.e_machine = EM_NONE;
+	if (perf_env__e_machine(&ff->ph->env, /*e_flags=*/NULL) != EM_NONE) {
+		zfree(&ff->ph->env.arch);
+		perf_env__arch(&ff->ph->env);
+	}
+	ff->ph->env.e_machine = saved_e_machine;
+	return 0;
+}
+
 static int process_e_machine(struct feat_fd *ff, void *data __maybe_unused)
 {
 	int ret;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index fe0de2a0277f..3e64db2d27c2 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -3023,14 +3023,19 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 		return EM_HOST;
 	}
 
+	/*
+	 * Is the env caching an e_machine? If not we want to compute from the
+	 * more accurate threads.
+	 */
 	env = perf_session__env(session);
-	if (env && env->e_machine != EM_NONE) {
-		if (e_flags)
-			*e_flags = env->e_flags;
-
-		return env->e_machine;
-	}
+	if (env && env->e_machine != EM_NONE)
+		return perf_env__e_machine(env, e_flags);
 
+	/*
+	 * Compute from threads, note this is more accurate than
+	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
+	 * mixed 32-bit and 64-bit threads.
+	 */
 	machines__for_each_thread(&session->machines,
 				  perf_session__e_machine_cb,
 				  &args);
@@ -3048,10 +3053,9 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 
 	/*
 	 * Couldn't determine from the perf_env or current set of
-	 * threads. Default to the host.
+	 * threads. Potentially use logic that uses the arch string otherwise
+	 * default to the host. Don't cache in the perf_env in case later
+	 * threads indicate a better ELF machine type.
 	 */
-	if (e_flags)
-		*e_flags = EF_HOST;
-
-	return EM_HOST;
+	return perf_env__e_machine_nocache(env, e_flags);
 }
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v8 02/17] perf tests topology: Switch env->arch use to env->e_machine
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 01/17] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 03/17] perf capstone: Determine architecture from e_machine Ian Rogers
                                             ` (15 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Some arch string comparisons weren't normalized. Avoid potential
issues with normalized names vs uname values by switching to using the
e_machine.
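
As an illustration of the mismatch (a minimal sketch, not perf code;
both helper names are hypothetical): uname reports "aarch64" on 64-bit
Arm while perf's normalized name is "arm64", so a prefix test written
against one spelling misses the other, whereas the ELF machine has a
single constant.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: prefix test in the style of the old topology check. */
static bool arch_str_is_aarch64(const char *arch)
{
	return !strncmp(arch, "aarch64", 7);
}

/* Hypothetical helper: the same question asked of an ELF machine constant. */
static bool e_machine_is_aarch64(uint16_t e_machine)
{
	return e_machine == EM_AARCH64;
}
```

The string helper accepts the uname spelling but rejects the normalized
"arm64"; the e_machine helper has no such ambiguity.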

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/topology.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index f54502ebef4b..d4c5c330c679 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -11,6 +11,7 @@
 #include "pmus.h"
 #include "target.h"
 #include <linux/err.h>
+#include <elf.h>
 
 #define TEMPL "/tmp/perf-test-XXXXXX"
 #define DATA_SIZE	10
@@ -74,6 +75,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	struct aggr_cpu_id id;
 	struct perf_cpu cpu;
 	struct perf_env *env;
+	uint16_t e_machine;
 
 	session = perf_session__new(&data, NULL);
 	TEST_ASSERT_VAL("can't get session", !IS_ERR(session));
@@ -101,7 +103,9 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	 *  condition is true (see do_core_id_test in header.c). So always
 	 *  run this test on those platforms.
 	 */
-	if (!env->cpu && strncmp(env->arch, "s390", 4) && strncmp(env->arch, "aarch64", 7))
+	e_machine = perf_env__e_machine(env, NULL);
+
+	if (!env->cpu && e_machine != EM_S390 && e_machine != EM_AARCH64)
 		return TEST_SKIP;
 
 	/*
@@ -110,7 +114,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	 * physical_package_id will be set to -1. Hence skip this
 	 * test if physical_package_id returns -1 for cpu from perf_cpu_map.
 	 */
-	if (!strncmp(env->arch, "ppc64le", 7)) {
+	if (e_machine == EM_PPC64) {
 		if (cpu__get_socket_id(perf_cpu_map__cpu(map, 0)) == -1)
 			return TEST_SKIP;
 	}
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v8 03/17] perf capstone: Determine architecture from e_machine
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 01/17] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 02/17] perf tests topology: Switch env->arch use to env->e_machine Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 04/17] perf print_insn: Use e_machine for fallback IP length check Ian Rogers
                                             ` (14 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Avoid the imprecise arch string and use the e_machine instead.
Translate more e_machine values to capstone architectures, adding
MIPS, PowerPC, SPARC and RISCV. Remove unnecessary maybe_unused
annotations.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/capstone.c | 115 +++++++++++++++++++++++++------------
 1 file changed, 79 insertions(+), 36 deletions(-)

diff --git a/tools/perf/util/capstone.c b/tools/perf/util/capstone.c
index 25cf6e15ec27..e6226b751c36 100644
--- a/tools/perf/util/capstone.c
+++ b/tools/perf/util/capstone.c
@@ -16,6 +16,7 @@
 #include <fcntl.h>
 #include <inttypes.h>
 #include <string.h>
+#include <elf.h>
 
 #include <capstone/capstone.h>
 
@@ -137,37 +138,74 @@ static enum cs_err perf_cs_close(csh *handle)
 #endif
 }
 
-static int capstone_init(struct machine *machine, csh *cs_handle, bool is64,
-			 bool disassembler_style)
+static bool e_machine_to_capstone(uint16_t e_machine, bool is64,
+				  enum cs_arch *arch, enum cs_mode *mode)
+{
+	switch (e_machine) {
+	case EM_X86_64:
+		*arch = CS_ARCH_X86;
+		*mode = CS_MODE_64;
+		return true;
+	case EM_386:
+		*arch = CS_ARCH_X86;
+		*mode = CS_MODE_32;
+		return true;
+	case EM_AARCH64:
+		*arch = CS_ARCH_ARM64;
+		*mode = CS_MODE_ARM;
+		return true;
+	case EM_ARM:
+		*arch = CS_ARCH_ARM;
+		*mode = CS_MODE_ARM | CS_MODE_V8;
+		return true;
+	case EM_S390:
+		*arch = CS_ARCH_SYSZ;
+		*mode = CS_MODE_BIG_ENDIAN;
+		return true;
+	case EM_MIPS:
+		*arch = CS_ARCH_MIPS;
+		*mode = is64 ? CS_MODE_MIPS64 : CS_MODE_MIPS32;
+		*mode |= CS_MODE_BIG_ENDIAN;
+		return true;
+	case EM_PPC:
+		*arch = CS_ARCH_PPC;
+		*mode = CS_MODE_BIG_ENDIAN | CS_MODE_32;
+		return true;
+	case EM_PPC64:
+		*arch = CS_ARCH_PPC;
+		*mode = CS_MODE_BIG_ENDIAN | CS_MODE_64;
+		return true;
+	case EM_SPARC:
+		*arch = CS_ARCH_SPARC;
+		*mode = CS_MODE_BIG_ENDIAN | CS_MODE_32;
+		return true;
+	case EM_SPARCV9:
+		*arch = CS_ARCH_SPARC;
+		*mode = CS_MODE_BIG_ENDIAN | CS_MODE_V9 | CS_MODE_64;
+		return true;
+	case EM_RISCV:
+		*arch = CS_ARCH_RISCV;
+		*mode = is64 ? CS_MODE_RISCV64 : CS_MODE_RISCV32;
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int capstone_init(uint16_t e_machine, csh *cs_handle, bool is64, bool disassembler_style)
 {
 	enum cs_arch arch;
 	enum cs_mode mode;
 
-	if (machine__is(machine, "x86_64") && is64) {
-		arch = CS_ARCH_X86;
-		mode = CS_MODE_64;
-	} else if (machine__normalized_is(machine, "x86")) {
-		arch = CS_ARCH_X86;
-		mode = CS_MODE_32;
-	} else if (machine__normalized_is(machine, "arm64")) {
-		arch = CS_ARCH_ARM64;
-		mode = CS_MODE_ARM;
-	} else if (machine__normalized_is(machine, "arm")) {
-		arch = CS_ARCH_ARM;
-		mode = CS_MODE_ARM + CS_MODE_V8;
-	} else if (machine__normalized_is(machine, "s390")) {
-		arch = CS_ARCH_SYSZ;
-		mode = CS_MODE_BIG_ENDIAN;
-	} else {
+	if (!e_machine_to_capstone(e_machine, is64, &arch, &mode))
 		return -1;
-	}
 
 	if (perf_cs_open(arch, mode, cs_handle) != CS_ERR_OK) {
 		pr_warning_once("cs_open failed\n");
 		return -1;
 	}
 
-	if (machine__normalized_is(machine, "x86")) {
+	if (arch == CS_ARCH_X86) {
 		/*
 		 * In case of using capstone_init while symbol__disassemble
 		 * setting CS_OPT_SYNTAX_ATT depends if disassembler_style opts
@@ -212,28 +250,31 @@ static size_t print_insn_x86(struct thread *thread, u8 cpumode, struct cs_insn *
 }
 
 
-ssize_t capstone__fprintf_insn_asm(struct machine *machine __maybe_unused,
-				   struct thread *thread __maybe_unused,
-				   u8 cpumode __maybe_unused, bool is64bit __maybe_unused,
-				   const uint8_t *code __maybe_unused,
-				   size_t code_size __maybe_unused,
-				   uint64_t ip __maybe_unused, int *lenp __maybe_unused,
-				   int print_opts __maybe_unused, FILE *fp __maybe_unused)
+ssize_t capstone__fprintf_insn_asm(struct machine *machine,
+				   struct thread *thread,
+				   u8 cpumode,
+				   bool is64bit,
+				   const uint8_t *code,
+				   size_t code_size,
+				   uint64_t ip, int *lenp,
+				   int print_opts,
+				   FILE *fp)
 {
 	size_t printed;
 	struct cs_insn *insn;
 	csh cs_handle;
 	size_t count;
+	uint16_t e_machine = thread__e_machine(thread, machine, /*e_flags=*/NULL);
 	int ret;
 
 	/* TODO: Try to initiate capstone only once but need a proper place. */
-	ret = capstone_init(machine, &cs_handle, is64bit, true);
+	ret = capstone_init(e_machine, &cs_handle, is64bit, /*disassembler_style=*/true);
 	if (ret < 0)
 		return ret;
 
 	count = perf_cs_disasm(cs_handle, code, code_size, ip, 1, &insn);
 	if (count > 0) {
-		if (machine__normalized_is(machine, "x86"))
+		if (e_machine == EM_X86_64 || e_machine == EM_386)
 			printed = print_insn_x86(thread, cpumode, &insn[0], print_opts, fp);
 		else
 			printed = fprintf(fp, "%s %s", insn[0].mnemonic, insn[0].op_str);
@@ -322,9 +363,9 @@ static int find_file_offset(u64 start, u64 len, u64 pgoff, void *arg)
 	return 0;
 }
 
-int symbol__disassemble_capstone(const char *filename __maybe_unused,
-				 struct symbol *sym __maybe_unused,
-				 struct annotate_args *args __maybe_unused)
+int symbol__disassemble_capstone(const char *filename,
+				 struct symbol *sym,
+				 struct annotate_args *args)
 {
 	struct annotation *notes = symbol__annotation(sym);
 	struct map *map = args->ms->map;
@@ -344,6 +385,7 @@ int symbol__disassemble_capstone(const char *filename __maybe_unused,
 	char disasm_buf[512];
 	struct disasm_line *dl;
 	bool disassembler_style = false;
+	uint16_t e_machine;
 
 	if (args->options->objdump_path)
 		return -1;
@@ -373,8 +415,8 @@ int symbol__disassemble_capstone(const char *filename __maybe_unused,
 	    !strcmp(args->options->disassembler_style, "att"))
 		disassembler_style = true;
 
-	if (capstone_init(maps__machine(thread__maps(args->ms->thread)), &handle, is_64bit,
-			  disassembler_style) < 0)
+	e_machine = thread__e_machine(args->ms->thread, /*machine=*/NULL, /*e_flags=*/NULL);
+	if (capstone_init(e_machine, &handle, is_64bit, disassembler_style) < 0)
 		goto err;
 
 	needs_cs_close = true;
@@ -466,6 +508,7 @@ int symbol__disassemble_capstone_powerpc(const char *filename __maybe_unused,
 	struct disasm_line *dl;
 	u32 *line;
 	bool disassembler_style = false;
+	uint16_t e_machine;
 
 	if (args->options->objdump_path)
 		return -1;
@@ -484,8 +527,8 @@ int symbol__disassemble_capstone_powerpc(const char *filename __maybe_unused,
 	    !strcmp(args->options->disassembler_style, "att"))
 		disassembler_style = true;
 
-	if (capstone_init(maps__machine(thread__maps(args->ms->thread)), &handle, is_64bit,
-			  disassembler_style) < 0)
+	e_machine = thread__e_machine(args->ms->thread, /*machine=*/NULL, /*e_flags=*/NULL);
+	if (capstone_init(e_machine, &handle, is_64bit, disassembler_style) < 0)
 		goto err;
 
 	needs_cs_close = true;
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v8 04/17] perf print_insn: Use e_machine for fallback IP length check
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (2 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 03/17] perf capstone: Determine architecture from e_machine Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 05/17] perf machine: Use perf_env e_machine rather than arch Ian Rogers
                                             ` (13 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Avoid string comparisons with the perf_env arch; switch to using the
more precise ELF machine.
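
For reference, the fallback decision can be written as a standalone
predicate. This is only a sketch mirroring the three machines the
fallback already treated as 64-bit; the function name is hypothetical
and not part of the patch.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the fallback branch of is64bitip(): used only
 * when no dso is available to answer the question directly.
 */
static bool e_machine_implies_64bit_ip(uint16_t e_machine)
{
	switch (e_machine) {
	case EM_X86_64:
	case EM_AARCH64:
	case EM_S390:
		return true;
	default:
		return false;
	}
}
```

Note that EM_386 and EM_ARM fall through to false, matching the old
string-based behaviour for 32-bit x86 and Arm.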

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/print_insn.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/print_insn.c b/tools/perf/util/print_insn.c
index 02e6fbb8ca04..5e36344174d6 100644
--- a/tools/perf/util/print_insn.c
+++ b/tools/perf/util/print_insn.c
@@ -9,6 +9,7 @@
 #include <stdbool.h>
 #include "capstone.h"
 #include "debug.h"
+#include "env.h"
 #include "sample.h"
 #include "symbol.h"
 #include "machine.h"
@@ -17,6 +18,7 @@
 #include "dump-insn.h"
 #include "map.h"
 #include "dso.h"
+#include <elf.h>
 
 size_t sample__fprintf_insn_raw(struct perf_sample *sample, FILE *fp)
 {
@@ -33,13 +35,13 @@ size_t sample__fprintf_insn_raw(struct perf_sample *sample, FILE *fp)
 static bool is64bitip(struct machine *machine, struct addr_location *al)
 {
 	const struct dso *dso = al->map ? map__dso(al->map) : NULL;
+	uint16_t e_machine;
 
 	if (dso)
 		return dso__is_64_bit(dso);
 
-	return machine__is(machine, "x86_64") ||
-		machine__normalized_is(machine, "arm64") ||
-		machine__normalized_is(machine, "s390");
+	e_machine = perf_env__e_machine(machine->env, /*e_flags=*/NULL);
+	return e_machine == EM_X86_64 || e_machine == EM_AARCH64 || e_machine == EM_S390;
 }
 
 ssize_t fprintf_insn_asm(struct machine *machine, struct thread *thread, u8 cpumode,
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v8 05/17] perf machine: Use perf_env e_machine rather than arch
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (3 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 04/17] perf print_insn: Use e_machine for fallback IP length check Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 06/17] perf sample-raw: " Ian Rogers
                                             ` (12 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

The arch string is derived from uname and may be normalized, which can
cause mismatches, so the ELF machine is more precise. Reduce the scope
of machine__is, as it is often better to take the e_machine from a
thread rather than from the machine. Switch from string comparisons to
ELF machine constant comparisons.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/machine.c | 25 ++++++++-----------------
 tools/perf/util/machine.h |  2 --
 2 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e76f8c86e62a..6d32d3cb5cb7 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1611,10 +1611,15 @@ static bool machine__uses_kcore(struct machine *machine)
 	return dsos__for_each_dso(&machine->dsos, machine__uses_kcore_cb, NULL) != 0 ? true : false;
 }
 
+static bool machine__is(struct machine *machine, uint16_t e_machine)
+{
+	return machine && perf_env__e_machine(machine->env, NULL) == e_machine;
+}
+
 static bool perf_event__is_extra_kernel_mmap(struct machine *machine,
 					     struct extra_kernel_map *xm)
 {
-	return machine__is(machine, "x86_64") &&
+	return machine__is(machine, EM_X86_64) &&
 	       is_entry_trampoline(xm->name);
 }
 
@@ -2770,7 +2775,7 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
 static u64 get_leaf_frame_caller(struct perf_sample *sample,
 		struct thread *thread, int usr_idx)
 {
-	if (machine__normalized_is(maps__machine(thread__maps(thread)), "arm64"))
+	if (thread__e_machine(thread, /*machine=*/NULL, /*e_flags=*/NULL) == EM_AARCH64)
 		return get_leaf_frame_caller_aarch64(sample, thread, usr_idx);
 	else
 		return 0;
@@ -3141,20 +3146,6 @@ int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
 	return 0;
 }
 
-/*
- * Compares the raw arch string. N.B. see instead perf_env__arch() or
- * machine__normalized_is() if a normalized arch is needed.
- */
-bool machine__is(struct machine *machine, const char *arch)
-{
-	return machine && !strcmp(perf_env__raw_arch(machine->env), arch);
-}
-
-bool machine__normalized_is(struct machine *machine, const char *arch)
-{
-	return machine && !strcmp(perf_env__arch(machine->env), arch);
-}
-
 int machine__nr_cpus_avail(struct machine *machine)
 {
 	return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
@@ -3181,7 +3172,7 @@ int machine__get_kernel_start(struct machine *machine)
 		 * start of kernel text, but still above 2^63. So leave
 		 * kernel_start = 1ULL << 63 for x86_64.
 		 */
-		if (!err && !machine__is(machine, "x86_64"))
+		if (!err && !machine__is(machine, EM_X86_64))
 			machine->kernel_start = map__start(map);
 	}
 	return err;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 22a42c5825fa..003c970b3e4b 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -227,8 +227,6 @@ static inline bool machine__is_host(struct machine *machine)
 }
 
 bool machine__is_lock_function(struct machine *machine, u64 addr);
-bool machine__is(struct machine *machine, const char *arch);
-bool machine__normalized_is(struct machine *machine, const char *arch);
 int machine__nr_cpus_avail(struct machine *machine);
 
 struct thread *machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v8 06/17] perf sample-raw: Use perf_env e_machine rather than arch
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (4 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 05/17] perf machine: Use perf_env e_machine rather than arch Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 07/17] perf sort: " Ian Rogers
                                             ` (11 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than the arch to determine S390 and x86 types.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/sample-raw.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/sample-raw.c b/tools/perf/util/sample-raw.c
index bcf442574d6e..b10056ac8057 100644
--- a/tools/perf/util/sample-raw.c
+++ b/tools/perf/util/sample-raw.c
@@ -1,6 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-
-#include <string.h>
+#include <elf.h>
 #include <linux/string.h>
 #include "evlist.h"
 #include "env.h"
@@ -14,14 +13,15 @@
  */
 void evlist__init_trace_event_sample_raw(struct evlist *evlist, struct perf_env *env)
 {
-	const char *arch_pf = perf_env__arch(env);
-	const char *cpuid = perf_env__cpuid(env);
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (arch_pf && !strcmp("s390", arch_pf))
+	if (e_machine == EM_S390) {
 		evlist->trace_event_sample_raw = evlist__s390_sample_raw;
-	else if (arch_pf && !strcmp("x86", arch_pf) &&
-		 cpuid && strstarts(cpuid, "AuthenticAMD") &&
-		 evlist__has_amd_ibs(evlist)) {
-		evlist->trace_event_sample_raw = evlist__amd_sample_raw;
+	} else if (e_machine == EM_X86_64 || e_machine == EM_386) {
+		const char *cpuid = perf_env__cpuid(env);
+
+		if (cpuid && strstarts(cpuid, "AuthenticAMD") &&
+		    evlist__has_amd_ibs(evlist))
+			evlist->trace_event_sample_raw = evlist__amd_sample_raw;
 	}
 }
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v8 07/17] perf sort: Use perf_env e_machine rather than arch
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (5 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 06/17] perf sample-raw: " Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 08/17] perf symbol: Avoid use of machine__is Ian Rogers
                                             ` (10 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than the arch to determine x86 or PPC types.
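
One normalized arch string such as "x86" or "powerpc" corresponds to
two ELF machines, so each string test becomes a pair of constant
comparisons. A sketch of that grouping, with hypothetical helper names
that are not part of this patch:

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: both ELF machines covered by the "x86" arch string. */
static bool e_machine_is_x86(uint16_t e_machine)
{
	return e_machine == EM_X86_64 || e_machine == EM_386;
}

/* Hypothetical helper: both ELF machines covered by "powerpc". */
static bool e_machine_is_powerpc(uint16_t e_machine)
{
	return e_machine == EM_PPC64 || e_machine == EM_PPC;
}
```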

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/sort.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 0020089cb13c..06a641cf49e3 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <ctype.h>
+#include <elf.h>
 #include <errno.h>
 #include <inttypes.h>
 #include <regex.h>
@@ -2673,9 +2674,10 @@ struct sort_dimension {
 
 static int arch_support_sort_key(const char *sort_key, struct perf_env *env)
 {
-	const char *arch = perf_env__arch(env);
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (!strcmp("x86", arch) || !strcmp("powerpc", arch)) {
+	if (e_machine == EM_X86_64 || e_machine == EM_386 ||
+	    e_machine == EM_PPC64 || e_machine == EM_PPC) {
 		if (!strcmp(sort_key, "p_stage_cyc"))
 			return 1;
 		if (!strcmp(sort_key, "local_p_stage_cyc"))
@@ -2686,14 +2688,14 @@ static int arch_support_sort_key(const char *sort_key, struct perf_env *env)
 
 static const char *arch_perf_header_entry(const char *se_header, struct perf_env *env)
 {
-	const char *arch = perf_env__arch(env);
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (!strcmp("x86", arch)) {
+	if (e_machine == EM_X86_64 || e_machine == EM_386) {
 		if (!strcmp(se_header, "Local Pipeline Stage Cycle"))
 			return "Local Retire Latency";
 		else if (!strcmp(se_header, "Pipeline Stage Cycle"))
 			return "Retire Latency";
-	} else if (!strcmp("powerpc", arch)) {
+	} else if (e_machine == EM_PPC64 || e_machine == EM_PPC) {
 		if (!strcmp(se_header, "Local INSTR Latency"))
 			return "Finish Cyc";
 		else if (!strcmp(se_header, "INSTR Latency"))
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v8 08/17] perf symbol: Avoid use of machine__is
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (6 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 07/17] perf sort: " Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 09/17] perf arch common: Use perf_env e_machine rather than arch Ian Rogers
                                             ` (9 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Switch to using the ELF machine from the dso or running machine rather
than the machine perf_env arch that may fall back on EM_HOST. This
also avoids potentially imprecise string comparisons.
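
The lookup order used here is: a value already cached in the perf_env,
then the dso, then the perf_env fallback. A simplified sketch of that
precedence, taking the three candidates as plain values (the function
name and parameters are hypothetical; the real helper queries the
machine, dso and perf_env itself):

```c
#include <elf.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return the first candidate that is known, treating
 * EM_NONE as "not known", and use the fallback only as a last resort.
 */
static uint16_t first_known_e_machine(uint16_t cached, uint16_t from_dso,
				      uint16_t fallback)
{
	if (cached != EM_NONE)
		return cached;
	if (from_dso != EM_NONE)
		return from_dso;
	return fallback;
}
```

With this ordering a host fallback is only reached when neither the
cache nor the dso could identify the machine.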

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/symbol.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index fcaeeddbbb6b..8aaaab0ad4b7 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -851,6 +851,24 @@ static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
 	return count;
 }
 
+static uint16_t machine_or_dso_e_machine(struct machine *machine, struct dso *dso)
+{
+	uint16_t e_machine = EM_NONE;
+
+	/* Check for a cached value first. */
+	if (machine && machine->env && machine->env->e_machine != EM_NONE)
+		return machine->env->e_machine;
+
+	/* DSO should be most accurate */
+	if (dso)
+		e_machine = dso__e_machine(dso, machine, /*e_flags=*/NULL);
+
+	if (e_machine != EM_NONE)
+		return e_machine;
+
+	return perf_env__e_machine(machine ? machine->env : NULL, /*e_flags=*/NULL);
+}
+
 /*
  * Split the symbols into maps, making sure there are no overlaps, i.e. the
  * kernel range is broken in several maps, named [kernel].N, as we don't have
@@ -866,14 +884,13 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 	struct rb_root_cached *root = dso__symbols(dso);
 	struct rb_node *next = rb_first_cached(root);
 	int kernel_range = 0;
-	bool x86_64;
+	uint16_t e_machine = EM_NONE;
 
 	if (!kmaps)
 		return -1;
 
 	machine = maps__machine(kmaps);
-
-	x86_64 = machine__is(machine, "x86_64");
+	e_machine = machine_or_dso_e_machine(machine, dso);
 
 	while (next) {
 		char *module;
@@ -925,7 +942,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 			 */
 			pos->start = map__map_ip(curr_map, pos->start);
 			pos->end   = map__map_ip(curr_map, pos->end);
-		} else if (x86_64 && is_entry_trampoline(pos->name)) {
+		} else if (e_machine == EM_X86_64 && is_entry_trampoline(pos->name)) {
 			/*
 			 * These symbols are not needed anymore since the
 			 * trampoline maps refer to the text section and it's
@@ -1428,7 +1445,7 @@ static int dso__load_kcore(struct dso *dso, struct map *map,
 		free(new_node);
 	}
 
-	if (machine__is(machine, "x86_64")) {
+	if (machine_or_dso_e_machine(machine, dso) == EM_X86_64) {
 		u64 addr;
 
 		/*
@@ -1716,7 +1733,7 @@ int dso__load(struct dso *dso, struct map *map)
 			ret = dso__load_guest_kernel_sym(dso, map);
 
 		machine = maps__machine(map__kmaps(map));
-		if (machine__is(machine, "x86_64"))
+		if (machine_or_dso_e_machine(machine, dso) == EM_X86_64)
 			machine__map_x86_64_entry_trampolines(machine, dso);
 		goto out;
 	}
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v8 09/17] perf arch common: Use perf_env e_machine rather than arch
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (7 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 08/17] perf symbol: Avoid use of machine__is Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 10/17] perf header: In print_pmu_caps use perf_env e_machine Ian Rogers
                                             ` (8 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than arch string matching.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/arch/common.c | 55 +++++++++++++++++++++++++---------------
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index 21836f70f231..e9b5b61feffe 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -1,12 +1,14 @@
 // SPDX-License-Identifier: GPL-2.0
+#include "common.h"
+
 #include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
-#include "common.h"
 #include "../util/env.h"
 #include "../util/debug.h"
+#include <dwarf-regs.h>
 #include <linux/zalloc.h>
 
 static const char *const arc_triplets[] = {
@@ -145,7 +147,8 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 					  const char *name, char **path)
 {
 	int idx;
-	const char *arch = perf_env__arch(env), *cross_env;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+	const char *cross_env;
 	const char *const *path_list;
 	char *buf = NULL;
 
@@ -153,7 +156,7 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 	 * We don't need to try to find objdump path for native system.
 	 * Just use default binutils path (e.g.: "objdump").
 	 */
-	if (!strcmp(perf_env__arch(NULL), arch))
+	if (e_machine == EM_HOST)
 		goto out;
 
 	cross_env = getenv("CROSS_COMPILE");
@@ -170,30 +173,42 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 		zfree(&buf);
 	}
 
-	if (!strcmp(arch, "arc"))
+	switch (e_machine) {
+	case EM_ARC:
 		path_list = arc_triplets;
-	else if (!strcmp(arch, "arm"))
+		break;
+	case EM_ARM:
 		path_list = arm_triplets;
-	else if (!strcmp(arch, "arm64"))
+		break;
+	case EM_AARCH64:
 		path_list = arm64_triplets;
-	else if (!strcmp(arch, "powerpc"))
+		break;
+	case EM_PPC:
+	case EM_PPC64:
 		path_list = powerpc_triplets;
-	else if (!strcmp(arch, "riscv32"))
-		path_list = riscv32_triplets;
-	else if (!strcmp(arch, "riscv64"))
-		path_list = riscv64_triplets;
-	else if (!strcmp(arch, "sh"))
+		break;
+	case EM_RISCV:
+		path_list = perf_env__kernel_is_64_bit(env) ? riscv64_triplets : riscv32_triplets;
+		break;
+	case EM_SH:
 		path_list = sh_triplets;
-	else if (!strcmp(arch, "s390"))
+		break;
+	case EM_S390:
 		path_list = s390_triplets;
-	else if (!strcmp(arch, "sparc"))
+		break;
+	case EM_SPARC:
+	case EM_SPARCV9:
 		path_list = sparc_triplets;
-	else if (!strcmp(arch, "x86"))
+		break;
+	case EM_X86_64:
+	case EM_386:
 		path_list = x86_triplets;
-	else if (!strcmp(arch, "mips"))
+		break;
+	case EM_MIPS:
 		path_list = mips_triplets;
-	else {
-		ui__error("binutils for %s not supported.\n", arch);
+		break;
+	default:
+		ui__error("binutils for %s not supported.\n", perf_env__arch(env));
 		goto out_error;
 	}
 
@@ -202,7 +217,7 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 		ui__error("Please install %s for %s.\n"
 			  "You can add it to PATH, set CROSS_COMPILE or "
 			  "override the default using --%s.\n",
-			  name, arch, name);
+			  name, perf_env__arch(env), name);
 		goto out_error;
 	}
 
@@ -237,5 +252,5 @@ int perf_env__lookup_objdump(struct perf_env *env, char **path)
  */
 bool perf_env__single_address_space(struct perf_env *env)
 {
-	return strcmp(perf_env__arch(env), "sparc");
+	return perf_env__e_machine(env, /*e_flags=*/NULL) != EM_SPARC;
 }
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 10/17] perf header: In print_pmu_caps use perf_env e_machine
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (8 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 09/17] perf arch common: Use perf_env e_machine rather than arch Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 11/17] perf c2c: Use perf_env e_machine rather than arch Ian Rogers
                                             ` (7 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Switch from arch to e_machine in print_pmu_caps.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/header.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 8d5152bde25d..c6436269df4b 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2348,15 +2348,16 @@ static void print_cpu_pmu_caps(struct feat_fd *ff, FILE *fp)
 static void print_pmu_caps(struct feat_fd *ff, FILE *fp)
 {
 	struct perf_env *env = &ff->ph->env;
-	struct pmu_caps *pmu_caps;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
 	for (int i = 0; i < env->nr_pmus_with_caps; i++) {
-		pmu_caps = &env->pmu_caps[i];
+		struct pmu_caps *pmu_caps = &env->pmu_caps[i];
+
 		__print_pmu_caps(fp, pmu_caps->nr_caps, pmu_caps->caps,
 				 pmu_caps->pmu_name);
 	}
 
-	if (strcmp(perf_env__arch(env), "x86") == 0 &&
+	if ((e_machine == EM_X86_64 || e_machine == EM_386) &&
 	    perf_env__has_pmu_mapping(env, "ibs_op")) {
 		char *max_precise = perf_env__find_pmu_cap(env, "cpu", "max_precise");
 
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 11/17] perf c2c: Use perf_env e_machine rather than arch
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (9 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 10/17] perf header: In print_pmu_caps use perf_env e_machine Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 12/17] perf lock-contention: " Ian Rogers
                                             ` (6 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than arch string matching for AARCH64.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-c2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 72a7802775ee..09c8352a922c 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -3202,7 +3202,7 @@ static int perf_c2c__report(int argc, const char **argv)
 	 * default display type.
 	 */
 	if (!display) {
-		if (!strcmp(perf_env__arch(env), "arm64"))
+		if (perf_env__e_machine(env, /*e_flags=*/NULL) == EM_AARCH64)
 			display = "peer";
 		else
 			display = "tot";
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 12/17] perf lock-contention: Use perf_env e_machine rather than arch
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (10 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 11/17] perf c2c: Use perf_env e_machine rather than arch Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 13/17] perf env: Refactor perf_env__arch_strerrno Ian Rogers
                                             ` (5 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than arch string matching for powerpc.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/lock-contention.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/lock-contention.c b/tools/perf/util/lock-contention.c
index 92e7b7b572a2..119a7206f3cd 100644
--- a/tools/perf/util/lock-contention.c
+++ b/tools/perf/util/lock-contention.c
@@ -104,7 +104,8 @@ bool match_callstack_filter(struct machine *machine, u64 *callstack, int max_sta
 	struct map *kmap;
 	struct symbol *sym;
 	u64 ip;
-	const char *arch = perf_env__arch(machine->env);
+	uint16_t e_machine = perf_env__e_machine(machine->env, /*e_flags=*/NULL);
+	bool is_powerpc = e_machine == EM_PPC64 || e_machine == EM_PPC;
 
 	if (list_empty(&callstack_filters))
 		return true;
@@ -125,8 +126,7 @@ bool match_callstack_filter(struct machine *machine, u64 *callstack, int max_sta
 		 * incase first or second callstack index entry has 0
 		 * address for powerpc.
 		 */
-		if (!callstack || (!callstack[i] && (strcmp(arch, "powerpc") ||
-						(i != 1 && i != 2))))
+		if (!callstack || (!callstack[i] && (!is_powerpc || (i != 1 && i != 2))))
 			break;
 
 		ip = callstack[i];
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 13/17] perf env: Refactor perf_env__arch_strerrno
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (11 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 12/17] perf lock-contention: " Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 14/17] perf env: Remove unused perf_env__raw_arch Ian Rogers
                                             ` (4 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

perf_env__arch_strerrno is only available with libtraceevent, so hide
the declaration when perf is built without libtraceevent.

The previous approach mapped an architecture string to a pointer to a
function that takes an int errno value and returns a string. The new
approach takes an e_machine and an errno value and returns a string
directly.

The only call sites are in builtin-trace.c, where the e_machine is
already available and is potentially more specific than the perf_env
arch string, which is a single global value.

The major complication in this approach is having the shell script
that generates the C code map a linux directory name to the matching
ELF machine constants.
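
For illustration only, the shape of the generated lookup could be
sketched as below. The function name sketch_strerrno and the two
abbreviated errno tables are assumptions made for this example, not the
generated code; the real tables come from each architecture's errno.h.
The sketch shows several ELF machine constants sharing one table and a
default case standing in for the fallback architecture.

```c
#include <elf.h>
#include <stdint.h>

/* Abbreviated, illustrative tables: errno 35 differs between the two. */
static const char *errno_to_name__x86(int err)
{
	switch (err) {
	case 11: return "EAGAIN";
	case 35: return "EDEADLK";
	default: return "(unknown)";
	}
}

static const char *errno_to_name__mips(int err)
{
	switch (err) {
	case 11: return "EAGAIN";
	case 35: return "ENOMSG";
	default: return "(unknown)";
	}
}

/* Hypothetical stand-in for the generated e_machine based dispatcher. */
static const char *sketch_strerrno(uint16_t e_machine, int err)
{
	switch (e_machine) {
	case EM_MIPS:
		return errno_to_name__mips(err);
	case EM_386:
	case EM_X86_64:
	default:
		/* Unlisted machines fall back to the default table. */
		return errno_to_name__x86(err);
	}
}
```

Because the caller passes the e_machine of the traced process, the same
errno number can be rendered differently for, say, a MIPS and an x86
sample.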

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-trace.c                  |  5 ++-
 tools/perf/trace/beauty/arch_errno_names.sh | 40 ++++++++++++++++++---
 tools/perf/util/env.c                       | 13 +++----
 tools/perf/util/env.h                       |  7 ++--
 4 files changed, 44 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index e58c49d047a2..d1f21b5e7c98 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3008,9 +3008,8 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
 	} else if (ret < 0) {
 errno_print: {
 		char bf[STRERR_BUFSIZE];
-		struct perf_env *env = evsel__env(evsel) ?: &trace->host_env;
 		const char *emsg = str_error_r(-ret, bf, sizeof(bf));
-		const char *e = perf_env__arch_strerrno(env, err);
+		const char *e = perf_env__arch_strerrno(e_machine, err);
 
 		fprintf(trace->output, "-1 %s (%s)", e, emsg);
 	}
@@ -4890,7 +4889,7 @@ static size_t syscall__dump_stats(struct trace *trace, int e_machine, FILE *fp,
 
 				for (e = 0; e < stats->max_errno; ++e) {
 					if (stats->errnos[e] != 0)
-						fprintf(fp, "\t\t\t\t%s: %d\n", perf_env__arch_strerrno(trace->host->env, e + 1), stats->errnos[e]);
+						fprintf(fp, "\t\t\t\t%s: %d\n", perf_env__arch_strerrno(e_machine, e + 1), stats->errnos[e]);
 				}
 			}
 			lines++;
diff --git a/tools/perf/trace/beauty/arch_errno_names.sh b/tools/perf/trace/beauty/arch_errno_names.sh
index b22890b8d272..89b742927168 100755
--- a/tools/perf/trace/beauty/arch_errno_names.sh
+++ b/tools/perf/trace/beauty/arch_errno_names.sh
@@ -52,21 +52,49 @@ process_arch()
 		|IFS=, create_errno_lookup_func "$arch"
 }
 
+arch_to_e_machine()
+{
+	case "$1" in
+	alpha)      printf '\tcase EM_ALPHA:\n' ;;
+	arc)        printf '\tcase EM_ARC:\n' ;;
+	arm)        printf '\tcase EM_ARM:\n' ;;
+	arm64)      printf '\tcase EM_AARCH64:\n' ;;
+	csky)       printf '\tcase EM_CSKY:\n' ;;
+	hexagon)    printf '\tcase EM_HEXAGON:\n' ;;
+	loongarch)  printf '\tcase EM_LOONGARCH:\n' ;;
+	microblaze) printf '\tcase EM_MICROBLAZE:\n' ;;
+	mips)       printf '\tcase EM_MIPS:\n' ;;
+	parisc)     printf '\tcase EM_PARISC:\n' ;;
+	powerpc)    printf '\tcase EM_PPC:\n\tcase EM_PPC64:\n' ;;
+	riscv)      printf '\tcase EM_RISCV:\n' ;;
+	s390)       printf '\tcase EM_S390:\n' ;;
+	sh)         printf '\tcase EM_SH:\n' ;;
+	sparc)      printf '\tcase EM_SPARC:\n\tcase EM_SPARCV9:\n' ;;
+	x86)        printf '\tcase EM_386:\n\tcase EM_X86_64:\n' ;;
+	xtensa)     printf '\tcase EM_XTENSA:\n' ;;
+	esac
+}
+
 create_arch_errno_table_func()
 {
 	archlist="$1"
 	default="$2"
 
-	printf 'static arch_syscalls__strerrno_t *\n'
-	printf 'arch_syscalls__strerrno_function(const char *arch)\n'
+	printf 'static const char *\n'
+	printf 'arch_syscalls__strerrno(uint16_t e_machine, int err)\n'
 	printf '{\n'
+	printf '\tswitch (e_machine) {\n'
 	for arch in $archlist; do
 		arch_str=$(arch_string "$arch")
-		printf '\tif (!strcmp(arch, "%s"))\n' "$arch_str"
-		printf '\t\treturn errno_to_name__%s;\n' "$arch_str"
+		ems=$(arch_to_e_machine "$arch_str")
+		if [ -n "$ems" ]; then
+			printf '%s\n' "$ems"
+			printf '\t\treturn errno_to_name__%s(err);\n' "$arch_str"
+		fi
 	done
 	arch_str=$(arch_string "$default")
-	printf '\treturn errno_to_name__%s;\n' "$arch_str"
+	printf '\tdefault:\n\t\treturn errno_to_name__%s(err);\n' "$arch_str"
+	printf '\t}\n'
 	printf '}\n'
 }
 
@@ -74,6 +102,8 @@ cat <<EoHEADER
 /* SPDX-License-Identifier: GPL-2.0 */
 
 #include <string.h>
+#include <stdint.h>
+#include <elf.h>
 
 EoHEADER
 
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 4ff4caab3b32..97f4aa1131a1 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -786,17 +786,12 @@ const char *perf_env__arch(struct perf_env *env)
 #include "trace/beauty/arch_errno_names.c"
 #endif
 
-const char *perf_env__arch_strerrno(struct perf_env *env __maybe_unused, int err __maybe_unused)
-{
 #if defined(HAVE_LIBTRACEEVENT)
-	if (env->arch_strerrno == NULL)
-		env->arch_strerrno = arch_syscalls__strerrno_function(perf_env__arch(env));
-
-	return env->arch_strerrno ? env->arch_strerrno(err) : "no arch specific strerrno function";
-#else
-	return "!HAVE_LIBTRACEEVENT";
-#endif
+const char *perf_env__arch_strerrno(uint16_t e_machine, int err)
+{
+	return arch_syscalls__strerrno(e_machine, err);
 }
+#endif
 
 const char *perf_env__cpuid(struct perf_env *env)
 {
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 7151a9138e3f..68dead1b36a6 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -67,8 +67,6 @@ struct cpu_domain_map {
 	struct domain_info	**domains;
 };
 
-typedef const char *(arch_syscalls__strerrno_t)(int err);
-
 struct perf_env {
 	char			*hostname;
 	char			*os_release;
@@ -158,7 +156,6 @@ struct perf_env {
 		 */
 		bool	enabled;
 	} clock;
-	arch_syscalls__strerrno_t *arch_strerrno;
 };
 
 enum perf_compress_type {
@@ -190,7 +187,9 @@ void cpu_cache_level__free(struct cpu_cache_level *cache);
 uint16_t perf_env__e_machine_nocache(struct perf_env *env, uint32_t *e_flags);
 uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
-const char *perf_env__arch_strerrno(struct perf_env *env, int err);
+#if defined(HAVE_LIBTRACEEVENT)
+const char *perf_env__arch_strerrno(uint16_t e_machine, int err);
+#endif
 const char *perf_env__cpuid(struct perf_env *env);
 const char *perf_env__raw_arch(struct perf_env *env);
 int perf_env__nr_cpus_avail(struct perf_env *env);
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 14/17] perf env: Remove unused perf_env__raw_arch
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (12 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 13/17] perf env: Refactor perf_env__arch_strerrno Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 15/17] perf env: Add helper to lazily compute the os_release Ian Rogers
                                             ` (3 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

The switch to using e_machine has made the perf_env__raw_arch function
unused so remove it.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c | 18 ------------------
 tools/perf/util/env.h |  1 -
 2 files changed, 19 deletions(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 97f4aa1131a1..5944acd28996 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -451,19 +451,6 @@ int perf_env__read_cpuid(struct perf_env *env)
 	return 0;
 }
 
-static int perf_env__read_arch(struct perf_env *env)
-{
-	struct utsname uts;
-
-	if (env->arch)
-		return 0;
-
-	if (!uname(&uts))
-		env->arch = strdup(uts.machine);
-
-	return env->arch ? 0 : -ENOMEM;
-}
-
 static int perf_env__read_nr_cpus_avail(struct perf_env *env)
 {
 	if (env->nr_cpus_avail == 0)
@@ -582,11 +569,6 @@ int perf_env__read_core_pmu_caps(struct perf_env *env)
 	return ret;
 }
 
-const char *perf_env__raw_arch(struct perf_env *env)
-{
-	return env && !perf_env__read_arch(env) ? env->arch : "unknown";
-}
-
 int perf_env__nr_cpus_avail(struct perf_env *env)
 {
 	return env && !perf_env__read_nr_cpus_avail(env) ? env->nr_cpus_avail : 0;
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 68dead1b36a6..a95fd7eb3524 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -191,7 +191,6 @@ const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(uint16_t e_machine, int err);
 #endif
 const char *perf_env__cpuid(struct perf_env *env);
-const char *perf_env__raw_arch(struct perf_env *env);
 int perf_env__nr_cpus_avail(struct perf_env *env);
 
 void perf_env__init(struct perf_env *env);
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 15/17] perf env: Add helper to lazily compute the os_release
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (13 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 14/17] perf env: Remove unused perf_env__raw_arch Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 16/17] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
                                             ` (2 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

In live mode the os_release isn't initialized. Add a lazy
initialization helper that assumes live mode when os_release is unset
and fills it in from uname.
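
A minimal standalone sketch of the lazy pattern, assuming a cut-down
struct and a fixed fallback string (struct sketch_env,
sketch_env__os_release and "unknown" are hypothetical names for this
example; the patch itself falls back to perf_version_string):

```c
#include <stdlib.h>
#include <string.h>
#include <sys/utsname.h>

/* Cut-down stand-in for struct perf_env. */
struct sketch_env {
	char *os_release;
};

static const char *const sketch_fallback_release = "unknown";

/* Return the cached release, computing it from uname() on first use. */
const char *sketch_env__os_release(struct sketch_env *env)
{
	struct utsname uts;
	const char *src;
	size_t len;

	if (!env)
		return sketch_fallback_release;

	if (env->os_release)
		return env->os_release;

	/* Not set from a data file: assume live mode and ask the kernel. */
	src = uname(&uts) < 0 ? sketch_fallback_release : uts.release;
	len = strlen(src) + 1;
	env->os_release = malloc(len);
	if (!env->os_release)
		return sketch_fallback_release;

	memcpy(env->os_release, src, len);
	return env->os_release;
}
```

The second and later calls return the same cached pointer, so callers
can treat the helper as a plain accessor.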

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/data-convert-bt.c |  2 +-
 tools/perf/util/env.c             | 21 +++++++++++++++++++++
 tools/perf/util/env.h             |  1 +
 tools/perf/util/header.c          | 16 +++++++++++-----
 tools/perf/util/symbol.c          |  4 ++--
 5 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 3b8f2df823a9..2c88420fe33e 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1414,7 +1414,7 @@ do {									\
 
 	ADD("host",    env->hostname);
 	ADD("sysname", "Linux");
-	ADD("release", env->os_release);
+	ADD("release", perf_env__os_release(env));
 	ADD("version", env->version);
 	ADD("machine", env->arch);
 	ADD("domain", "kernel");
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 5944acd28996..1090aaa2985f 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -339,6 +339,27 @@ int perf_env__kernel_is_64_bit(struct perf_env *env)
 	return env->kernel_is_64_bit;
 }
 
+const char *perf_env__os_release(struct perf_env *env)
+{
+	struct utsname uts;
+	int ret;
+
+	if (!env)
+		return perf_version_string;
+
+	if (env->os_release)
+		return env->os_release;
+
+	/*
+	 * The os_release is being accessed but wasn't initialized from a data
+	 * file, assume this is 'live' mode and use the release from uname. If
+	 * uname or strdup fails then use the current perf tool version.
+	 */
+	ret = uname(&uts);
+	env->os_release = strdup(ret < 0 ? perf_version_string : uts.release);
+	return env->os_release ?: perf_version_string;
+}
+
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
 {
 	int i;
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index a95fd7eb3524..989545a47798 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -172,6 +172,7 @@ void free_cpu_domain_info(struct cpu_domain_map **cd_map, u32 schedstat_version,
 void perf_env__exit(struct perf_env *env);
 
 int perf_env__kernel_is_64_bit(struct perf_env *env);
+const char *perf_env__os_release(struct perf_env *env);
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
 
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c6436269df4b..4867a932cb88 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -370,13 +370,19 @@ static int write_osrelease(struct feat_fd *ff,
 			   struct evlist *evlist __maybe_unused)
 {
 	struct utsname uts;
-	int ret;
+	const char *release = NULL;
 
-	ret = uname(&uts);
-	if (ret < 0)
-		return -1;
+	if (evlist->session)
+		release = perf_env__os_release(perf_session__env(evlist->session));
 
-	return do_write_string(ff, uts.release);
+	if (!release) {
+		int ret = uname(&uts);
+
+		if (ret < 0)
+			return -1;
+		release = uts.release;
+	}
+	return do_write_string(ff, release);
 }
 
 static int write_arch(struct feat_fd *ff, struct evlist *evlist)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 8aaaab0ad4b7..a70066d17729 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2226,7 +2226,7 @@ static int vmlinux_path__init(struct perf_env *env)
 {
 	struct utsname uts;
 	char bf[PATH_MAX];
-	char *kernel_version;
+	const char *kernel_version;
 	unsigned int i;
 
 	vmlinux_path = malloc(sizeof(char *) * (ARRAY_SIZE(vmlinux_paths) +
@@ -2243,7 +2243,7 @@ static int vmlinux_path__init(struct perf_env *env)
 		return 0;
 
 	if (env) {
-		kernel_version = env->os_release;
+		kernel_version = perf_env__os_release(env);
 	} else {
 		if (uname(&uts) < 0)
 			goto out_fail;
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 16/17] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (14 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 15/17] perf env: Add helper to lazily compute the os_release Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-02  6:59                           ` [PATCH v8 17/17] perf symbol: Lazily compute idle and use a global lock for updates Ian Rogers
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

A problem with putting bitfields into struct symbol is that other bits
in the same storage unit could be updated concurrently. Each update is
a read-modify-write of that unit, so one update can overwrite another,
leading to lost updates.

To avoid this, introduce a global lock `symbol_bits_lock` in `symbol.c`
and helper functions to update the bits sharing a byte:
`symbol__set_ignore` and `symbol__set_annotate2`.

`inlined` is not given a setter as it is only initialized in
`new_inline_sym` when the symbol is under construction and not shared.
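
As a rough standalone sketch of the setter pattern (not the patch
itself): the struct, the `sketch_` names and the use of a plain
`pthread_mutex_t` in place of perf's `struct mutex` wrapper are all
assumptions made for this example.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Three flags packed into one byte, so a write to one rewrites all. */
struct sketch_symbol {
	uint8_t ignore:1;
	uint8_t annotate2:1;
	uint8_t inlined:1;
};

/* One global lock serializes every read-modify-write of the byte. */
static pthread_mutex_t sketch_bits_lock = PTHREAD_MUTEX_INITIALIZER;

void sketch_symbol__set_ignore(struct sketch_symbol *sym, bool ignore)
{
	pthread_mutex_lock(&sketch_bits_lock);
	sym->ignore = ignore;
	pthread_mutex_unlock(&sketch_bits_lock);
}

void sketch_symbol__set_annotate2(struct sketch_symbol *sym, bool annotate2)
{
	pthread_mutex_lock(&sketch_bits_lock);
	sym->annotate2 = annotate2;
	pthread_mutex_unlock(&sketch_bits_lock);
}
```

A single global lock keeps `struct symbol` from growing, at the cost of
serializing updates across all symbols; that trade-off seems reasonable
if these bits are written rarely.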

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-kwork.c |  2 +-
 tools/perf/builtin-sched.c |  2 +-
 tools/perf/util/annotate.c |  2 +-
 tools/perf/util/symbol.c   | 22 ++++++++++++++++++++++
 tools/perf/util/symbol.h   |  3 +++
 5 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 9d3a4c779a41..7337ee956dc9 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -725,7 +725,7 @@ static void timehist_save_callchain(struct perf_kwork *kwork,
 		if (sym) {
 			if (!strcmp(sym->name, "__softirqentry_text_start") ||
 			    !strcmp(sym->name, "__do_softirq"))
-				sym->ignore = 1;
+				symbol__set_ignore(sym, true);
 		}
 
 		callchain_cursor_advance(cursor);
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..655e95f660c2 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2371,7 +2371,7 @@ static void save_task_callchain(struct perf_sched *sched,
 			if (!strcmp(sym->name, "schedule") ||
 			    !strcmp(sym->name, "__schedule") ||
 			    !strcmp(sym->name, "preempt_schedule"))
-				sym->ignore = 1;
+				symbol__set_ignore(sym, true);
 		}
 
 		callchain_cursor_advance(cursor);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e745f3034a0e..d550a0061159 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2224,7 +2224,7 @@ int symbol__annotate2(struct map_symbol *ms, struct evsel *evsel,
 
 	annotation__init_column_widths(notes, sym);
 	annotation__update_column_widths(notes);
-	sym->annotate2 = 1;
+	symbol__set_annotate2(sym, true);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a70066d17729..1238a0d6ce6e 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -31,6 +31,7 @@
 #include "map.h"
 #include "symbol.h"
 #include "map_symbol.h"
+#include "mutex.h"
 #include "mem-events.h"
 #include "mem-info.h"
 #include "symsrc.h"
@@ -52,6 +53,8 @@ static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
 static bool symbol__is_idle(const char *name);
 
+static struct mutex symbol_bits_lock;
+
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
 
@@ -345,6 +348,20 @@ void symbol__delete(struct symbol *sym)
 	free(((void *)sym) - symbol_conf.priv_size);
 }
 
+void symbol__set_ignore(struct symbol *sym, bool ignore)
+{
+	mutex_lock(&symbol_bits_lock);
+	sym->ignore = ignore;
+	mutex_unlock(&symbol_bits_lock);
+}
+
+void symbol__set_annotate2(struct symbol *sym, bool annotate2)
+{
+	mutex_lock(&symbol_bits_lock);
+	sym->annotate2 = annotate2;
+	mutex_unlock(&symbol_bits_lock);
+}
+
 void symbols__delete(struct rb_root_cached *symbols)
 {
 	struct symbol *pos;
@@ -2415,6 +2432,8 @@ int symbol__init(struct perf_env *env)
 	if (symbol_conf.initialized)
 		return 0;
 
+	mutex_init(&symbol_bits_lock);
+
 	symbol_conf.priv_size = PERF_ALIGN(symbol_conf.priv_size, sizeof(u64));
 
 	symbol__elf_init();
@@ -2493,6 +2512,9 @@ void symbol__exit(void)
 {
 	if (!symbol_conf.initialized)
 		return;
+
+	mutex_destroy(&symbol_bits_lock);
+
 	strlist__delete(symbol_conf.bt_stop_list);
 	strlist__delete(symbol_conf.sym_list);
 	strlist__delete(symbol_conf.dso_list);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index bd6eb90c8668..5d98d7e84d57 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -77,6 +77,9 @@ struct symbol {
 void symbol__delete(struct symbol *sym);
 void symbols__delete(struct rb_root_cached *symbols);
 
+void symbol__set_ignore(struct symbol *sym, bool ignore);
+void symbol__set_annotate2(struct symbol *sym, bool annotate2);
+
 /* symbols__for_each_entry - iterate over symbols (rb_root)
  *
  * @symbols: the rb_root of symbols
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v8 17/17] perf symbol: Lazily compute idle and use a global lock for updates
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (15 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 16/17] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
@ 2026-05-02  6:59                           ` Ian Rogers
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-02  6:59 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Move the idle boolean to a helper symbol__is_idle function. In the
function lazily compute whether a symbol is an idle function taking
into consideration the kernel version and architecture of the
machine. As symbols__insert no longer needs to know if a symbol is for
the kernel, remove the argument.

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/

The change switches x86 matches to use strstarts which means
intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
change suggested by Honglei Wang <jameshongleiwang@126.com> in:
https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
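
A minimal sketch of the prefix matching described above. starts_with()
is a local stand-in for the strstarts() helper and x86_name_is_idle()
is a hypothetical wrapper that only mirrors the x86 branch of this
patch:

```c
#include <stdbool.h>
#include <string.h>

/* Local stand-in for strstarts(): true if str begins with prefix. */
static bool starts_with(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Hypothetical wrapper mirroring the x86 prefix checks in the patch. */
static bool x86_name_is_idle(const char *name)
{
	return starts_with(name, "mwait_idle") ||
	       starts_with(name, "intel_idle");
}
```

With prefix matching, names such as intel_idle_irq, intel_idle_ibrs
and mwait_idle_with_hints.constprop.0 no longer need their own table
entries.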

To avoid concurrent update issues with other bitfields in `struct symbol`,
this change uses the global lock `symbol_bits_lock` (introduced in a
previous commit) for updates to the `idle` field. A static helper
`symbol__set_idle` taking a boolean is used to encapsulate the lock and
mapping to `enum symbol_idle_kind`.
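
The pattern can be sketched outside of perf as follows. All names here
(struct sym_bits, bits_lock, the enum values) are hypothetical and a
pthread mutex stands in for perf's mutex wrapper. The point is that
bitfields in one storage unit are written with a read-modify-write, so
every writer of any of those fields takes the same lock:

```c
#include <pthread.h>
#include <stdbool.h>

/* Tri-state cache: zero means "not computed yet". */
enum idle_kind { IDLE_UNKNOWN = 0, IDLE_NOT_IDLE = 1, IDLE_IDLE = 2 };

/* Hypothetical struct: bitfields that share one storage unit. */
struct sym_bits {
	unsigned int idle:2;
	unsigned int ignore:1;
};

/* One global lock serializes all writes to the shared storage unit. */
static pthread_mutex_t bits_lock = PTHREAD_MUTEX_INITIALIZER;

static void sym_bits__set_idle(struct sym_bits *s, bool idle)
{
	pthread_mutex_lock(&bits_lock);
	s->idle = idle ? IDLE_IDLE : IDLE_NOT_IDLE;
	pthread_mutex_unlock(&bits_lock);
}

static void sym_bits__set_ignore(struct sym_bits *s, bool ignore)
{
	pthread_mutex_lock(&bits_lock);
	s->ignore = ignore;
	pthread_mutex_unlock(&bits_lock);
}
```

A zero-initialized struct starts in the unknown state, which is what
lets the first caller detect that the value still has to be computed.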

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 108 +++++++++++++++++++++++------------
 tools/perf/util/symbol.h     |  14 +++--
 3 files changed, 81 insertions(+), 43 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 7afa8a117139..e8f7fe3f19fc 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1727,7 +1727,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 1238a0d6ce6e..6c642067c4ed 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -51,7 +51,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
 
 static struct mutex symbol_bits_lock;
 
@@ -362,6 +361,13 @@ void symbol__set_annotate2(struct symbol *sym, bool annotate2)
 	mutex_unlock(&symbol_bits_lock);
 }
 
+static void symbol__set_idle(struct symbol *sym, bool idle)
+{
+	mutex_lock(&symbol_bits_lock);
+	sym->idle = idle ? SYMBOL_IDLE__IDLE : SYMBOL_IDLE__NOT_IDLE;
+	mutex_unlock(&symbol_bits_lock);
+}
+
 void symbols__delete(struct rb_root_cached *symbols)
 {
 	struct symbol *pos;
@@ -375,8 +381,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -384,17 +389,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		sym->idle = symbol__is_idle(name);
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -411,7 +405,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -572,7 +566,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -738,43 +732,81 @@ int modules__parse(const char *filename, void *arg,
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+static int sym_name_cmp(const void *a, const void *b)
 {
-	const char * const idle_symbols[] = {
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env)
+{
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	if (sym->idle)
+		return sym->idle == SYMBOL_IDLE__IDLE;
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		symbol__set_idle(sym, /*idle=*/false);
+		return false;
+	}
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
+
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		symbol__set_idle(sym, /*idle=*/true);
+		return true;
+	}
+
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			symbol__set_idle(sym, /*idle=*/true);
+			return true;
+		}
+	}
 
-	return strlist__has_entry(idle_symbols_list, name);
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		symbol__set_idle(sym, /*idle=*/true);
+		return true;
+	}
+
+	if (e_machine == EM_S390 && strstarts(name, "psw_idle")) {
+		int major = 0, minor = 0;
+		const char *release = perf_env__os_release(env);
+
+		/* Before v6.10, s390 used psw_idle. */
+		if (release && sscanf(release, "%d.%d", &major, &minor) == 2 &&
+		    (major < 6 || (major == 6 && minor < 10))) {
+			symbol__set_idle(sym, /*idle=*/true);
+			return true;
+		}
+	}
+
+	symbol__set_idle(sym, /*idle=*/false);
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -803,7 +835,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 5d98d7e84d57..717d2f876d58 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -43,6 +43,12 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -58,8 +64,8 @@ struct symbol {
 	u8		type:4;
 	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
 	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
+	/** Cache for symbol__is_idle holding enum symbol_idle_kind values. */
+	u8		idle:2;
 	/** Resolvable but tools ignore it (e.g. idle routines). */
 	u8		ignore:1;
 	/** Symbol for an inlined function. */
@@ -197,8 +203,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
@@ -281,5 +286,6 @@ enum {
 };
 
 int symbol__validate_sym_arguments(void);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env);
 
 #endif /* __PERF_SYMBOL */
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation
  2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                             ` (16 preceding siblings ...)
  2026-05-02  6:59                           ` [PATCH v8 17/17] perf symbol: Lazily compute idle and use a global lock for updates Ian Rogers
@ 2026-05-03  0:22                           ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 01/18] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
                                               ` (17 more replies)
  17 siblings, 18 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add a helper to perf_env to compute the e_machine if it is EM_NONE.
Derive the value from the arch string if available. Similarly derive
the arch string from the ELF machine if available, for
consistency. This means perf's arch (machine type) is no longer
determined by uname but set to match that of the perf ELF executable.

Migrate code away from strcmp on env->arch to using the e_machine
comparisons that are more accurate and not prone to uname and other
naming differences. While cleaning this up, also clean up the capstone
initialization code to cover more architectures and to set the big
endian flag based on ELF header information.

Switch the idle computation to the point of use and lazily compute it,
rather than computing it for every symbol. Currently the only user is
`perf top`. At the point of use the perf_env is available, so the idle
function computation can depend on the machine and kernel version.

To avoid concurrent update issues with bitfields sharing a byte in
`struct symbol` switch to using C11 atomics.
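
A minimal sketch of the idea, assuming a 16-bit flags word. The struct,
bit layout and helper names below are hypothetical and do not reproduce
the layout used in the series:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout within a single atomic flags word. */
#define FLAG_IGNORE	(1u << 0)
#define FLAG_INLINED	(1u << 1)
#define IDLE_SHIFT	2
#define IDLE_MASK	(3u << IDLE_SHIFT)

struct sym_flags {
	_Atomic uint16_t flags;
};

/* Set or clear a single-bit flag without disturbing its neighbours. */
static void sym_flags__set_bit(struct sym_flags *s, uint16_t bit, bool on)
{
	if (on)
		atomic_fetch_or(&s->flags, bit);
	else
		atomic_fetch_and(&s->flags, (uint16_t)~bit);
}

/* Update the 2-bit idle field with a compare-and-swap loop. */
static void sym_flags__set_idle(struct sym_flags *s, unsigned int kind)
{
	uint16_t old = atomic_load(&s->flags);
	uint16_t desired;

	do {
		desired = (uint16_t)((old & ~IDLE_MASK) |
				     ((kind << IDLE_SHIFT) & IDLE_MASK));
	} while (!atomic_compare_exchange_weak(&s->flags, &old, desired));
}

static unsigned int sym_flags__idle(struct sym_flags *s)
{
	return (atomic_load(&s->flags) & IDLE_MASK) >> IDLE_SHIFT;
}
```

Single-bit flags map onto atomic OR/AND, while a multi-bit field needs
the compare-and-swap loop so that concurrent updates to other bits in
the same word are not lost.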

v9:
 - Key changes in v9:
   - **C11 Atomics for `struct symbol`**: Dropped the global
     `symbol_bits_lock` introduced in v7/v8. Replaced unsafe bitfields
     with a thread-safe `_Atomic uint16_t flags` and lockless atomic
     helpers (e.g., `symbol__type()`, `symbol__set_inlined()`).
   - **Bi-endianness Support**: Added `*_endian` variants for `dso` and
     `thread` helpers to ensure Capstone correctly disassembles cross-endian
     binaries.
   - **Architecture Hardening**:
     - Fixed inverted SPARC logic in `perf_env__single_address_space()`.
     - Prioritized DSO architecture over global environment in
       `machine_or_dso_e_machine()`.
     - Fixed an uninitialized memory leak in `perf_env__e_machine()`.
     - Removed lossy `normalize_arch()` canonicalization in `process_arch()`.

 - Review Feedback Status:
   - **Addressed**: C11 atomics migration, bi-endianness, SPARC logic,
     DSO prioritization, and uninitialized memory fixes.
   - **Not Addressed / Dropped**:
     - Patch 15 OS Release: The concern regarding the `uname()` fallback
       during offline analysis was determined to be incorrect for these
       uninitialized states; the original lazy assumption is retained.
     - Patch 04/11: The `EM_AARCH64` fallbacks were dropped as the
       definition should come from dwarf-regs.h when necessary.

v8:
 - Address Sashiko AI review feedback for Patch 1:
   - Switch all code dependent on the arch string to use `e_machine`
     instead.
   - Update `machine__is` and `machine__normalized_is` to take
     `e_machine` integers instead of strings.
   - Refactor `arch_syscalls__strerrno_function` to take an `e_machine`.
   - Avoid premature caching of the host architecture in
     `perf_session__e_machine`.

v7:
 - Address better handling of strdup failures with arch in the
   header/env.
 - Address concurrent update issues in `struct symbol` bitfields by
   introducing a global lock for writes.

v6: Ensure arch is canonical by going to e_machine and back (Sashiko)
v5: Add perf_env os_release helper (Namhyung/Sashiko)
v4: Fix Sashiko issues where an array element wasn't sorted properly,
    the e_flags weren't returned properly, the idle type is changed to
    a u8 rather than an enum value and the s390 version check for
    psw_idle is slightly reordered and tweaked.
v3: Properly set up the e_machine coming from the perf_env as reported
    by Honglei Wang.
v2: Some minor white space clean up.
v1: Initial release.

Ian Rogers (18):
  perf env: Add perf_env__e_machine helper and use in perf_env__arch
  perf tests topology: Switch env->arch use to env->e_machine
  perf env, dso, thread: Add _endian variants for e_machine helpers
  perf capstone: Determine architecture from e_machine
  perf print_insn: Use e_machine for fallback IP length check
  perf symbol: Avoid use of machine__is
  perf machine: Use perf_env e_machine rather than arch
  perf sample-raw: Use perf_env e_machine rather than arch
  perf sort: Use perf_env e_machine rather than arch
  perf arch common: Use perf_env e_machine rather than arch
  perf header: In print_pmu_caps use perf_env e_machine
  perf c2c: Use perf_env e_machine rather than arch
  perf lock-contention: Use perf_env e_machine rather than arch
  perf env: Refactor perf_env__arch_strerrno
  perf env: Remove unused perf_env__raw_arch
  perf env: Add helper to lazily compute the os_release
  perf symbol: Add setters for bitfields sharing a byte to avoid
    concurrent update issues
  perf symbol: Lazily compute idle

 tools/perf/arch/common.c                      |  62 ++--
 tools/perf/builtin-c2c.c                      |  40 +--
 tools/perf/builtin-inject.c                   |   6 +-
 tools/perf/builtin-kwork.c                    |   2 +-
 tools/perf/builtin-report.c                   |   2 +-
 tools/perf/builtin-sched.c                    |   4 +-
 tools/perf/builtin-top.c                      |   7 +-
 tools/perf/builtin-trace.c                    |   7 +-
 tools/perf/tests/symbols.c                    |   2 +-
 tools/perf/tests/topology.c                   |   8 +-
 tools/perf/tests/vmlinux-kallsyms.c           |   2 +-
 tools/perf/trace/beauty/arch_errno_names.sh   |  40 ++-
 tools/perf/ui/browsers/annotate.c             |   2 +-
 tools/perf/ui/browsers/map.c                  |   4 +-
 tools/perf/util/annotate.c                    |   5 +-
 tools/perf/util/auxtrace.c                    |   6 +-
 tools/perf/util/callchain.c                   |   4 +-
 tools/perf/util/capstone.c                    | 129 +++++---
 tools/perf/util/data-convert-bt.c             |   2 +-
 tools/perf/util/dlfilter.c                    |   2 +-
 tools/perf/util/dso.c                         |  19 +-
 tools/perf/util/dso.h                         |  14 +-
 tools/perf/util/env.c                         | 295 ++++++++++++++----
 tools/perf/util/env.h                         |  12 +-
 tools/perf/util/evsel_fprintf.c               |   6 +-
 tools/perf/util/header.c                      |  58 ++--
 tools/perf/util/intel-pt.c                    |   2 +-
 tools/perf/util/lock-contention.c             |   6 +-
 tools/perf/util/machine.c                     |  27 +-
 tools/perf/util/machine.h                     |   2 -
 tools/perf/util/print_insn.c                  |  23 +-
 tools/perf/util/print_insn.h                  |   3 +
 tools/perf/util/probe-event.c                 |   4 +-
 tools/perf/util/sample-raw.c                  |  21 +-
 tools/perf/util/sample-raw.h                  |   6 +-
 .../util/scripting-engines/trace-event-perl.c |   2 +-
 .../scripting-engines/trace-event-python.c    |   4 +-
 tools/perf/util/session.c                     |  26 +-
 tools/perf/util/sort.c                        |  66 ++--
 tools/perf/util/srcline.c                     |  10 +-
 tools/perf/util/symbol-elf.c                  |   5 +-
 tools/perf/util/symbol.c                      | 208 ++++++++----
 tools/perf/util/symbol.h                      |  74 ++++-
 tools/perf/util/symbol_fprintf.c              |   4 +-
 tools/perf/util/thread.c                      |  22 +-
 tools/perf/util/thread.h                      |   8 +-
 46 files changed, 860 insertions(+), 403 deletions(-)

-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v9 01/18] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-04  1:35                               ` Namhyung Kim
  2026-05-03  0:22                             ` [PATCH v9 02/18] perf tests topology: Switch env->arch use to env->e_machine Ian Rogers
                                               ` (16 subsequent siblings)
  17 siblings, 1 reply; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add a helper that lazily computes the e_machine and falls back to
EM_HOST. Use the perf_env's arch to compute the e_machine if
available. Use a binary search over arch prefixes for efficiency, with
extra handling for prefixes shared by more than one ELF machine. Switch
perf_env__arch to be derived from the e_machine for consistency. This
switches arch from being uname derived to matching that of the perf
binary (via EM_HOST). Update
session to use the helper, which may mean using EM_HOST when no
threads are available. This also updates the perf data file header
that gets the e_machine/e_flags from the session.
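
The prefix lookup can be sketched in isolation as below. The table,
the integer ids and lookup_arch() are hypothetical and much smaller
than the table in the patch. The approach assumes the table is sorted
with strcmp ordering and that no entry is a prefix of another entry:

```c
#include <stdlib.h>
#include <string.h>

struct prefix_entry {
	const char *prefix;
	int id;		/* Hypothetical id standing in for an EM_* value. */
};

/* Sorted by prefix; no entry is a prefix of another entry. */
static const struct prefix_entry table[] = {
	{"aarch64", 1},
	{"arm", 2},
	{"powerpc", 3},
	{"ppc", 3},
	{"s390", 4},
	{"x86", 5},
};

/* Compare only the first strlen(prefix) characters of the key. */
static int cmp_prefix(const void *key, const void *elem)
{
	const struct prefix_entry *e = elem;

	return strncmp(key, e->prefix, strlen(e->prefix));
}

/* Returns the id of the matching prefix or -1 if there is none. */
static int lookup_arch(const char *arch)
{
	const struct prefix_entry *e;

	e = bsearch(arch, table, sizeof(table) / sizeof(table[0]),
		    sizeof(table[0]), cmp_prefix);
	return e ? e->id : -1;
}
```

A prefix match alone cannot tell "arm64" from "armv7l" or "x86_64"
from a 32-bit "x86", which is why the patch follows the search with a
switch that resolves the conflicting prefixes.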

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c     | 231 +++++++++++++++++++++++++++++++-------
 tools/perf/util/env.h     |   2 +
 tools/perf/util/header.c  |  35 ++++--
 tools/perf/util/session.c |  26 +++--
 4 files changed, 231 insertions(+), 63 deletions(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 1e54e2c86360..0edc67a468ab 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,10 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "cpumap.h"
+#include "dwarf-regs.h"
 #include "debug.h"
 #include "env.h"
 #include "util/header.h"
 #include "util/rwsem.h"
 #include <linux/compiler.h>
+#include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/rbtree.h>
 #include <linux/string.h>
@@ -309,12 +311,21 @@ void perf_env__init(struct perf_env *env)
 
 static void perf_env__init_kernel_mode(struct perf_env *env)
 {
-	const char *arch = perf_env__raw_arch(env);
+	uint16_t e_machine = env->e_machine;
 
-	if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) ||
-	    !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) ||
-	    !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
-	    !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7))
+	if (env->arch && (e_machine == EM_NONE || e_machine == EM_MIPS || e_machine == EM_RISCV)) {
+		if (str_ends_with(env->arch, "64") || !strncmp(env->arch, "s390x", 5))
+			env->kernel_is_64_bit = 1;
+		else
+			env->kernel_is_64_bit = 0;
+		return;
+	}
+	if (e_machine == EM_NONE)
+		e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+
+	if (e_machine == EM_X86_64 || e_machine == EM_AARCH64 ||
+	    e_machine == EM_PPC64 || e_machine == EM_SPARCV9 ||
+	    e_machine == EM_S390)
 		env->kernel_is_64_bit = 1;
 	else
 		env->kernel_is_64_bit = 0;
@@ -588,51 +599,187 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
 	zfree(&cache->size);
 }
 
+struct arch_to_e_machine {
+	const char *prefix;
+	uint16_t e_machine;
+};
+
 /*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
+ * A mapping from an arch prefix string to an ELF machine that can be used in a
+ * bsearch. Some arch prefixes are shared and need additional processing as
+ * marked next to the architecture. The prefixes handle both perf's architecture
+ * naming and those from uname.
  */
-static const char *normalize_arch(char *arch)
-{
-	if (!strcmp(arch, "x86_64"))
-		return "x86";
-	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-		return "x86";
-	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-		return "sparc";
-	if (!strncmp(arch, "aarch64", 7) || !strncmp(arch, "arm64", 5))
-		return "arm64";
-	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-		return "arm";
-	if (!strncmp(arch, "s390", 4))
-		return "s390";
-	if (!strncmp(arch, "parisc", 6))
-		return "parisc";
-	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-		return "powerpc";
-	if (!strncmp(arch, "mips", 4))
-		return "mips";
-	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-		return "sh";
-	if (!strncmp(arch, "loongarch", 9))
-		return "loongarch";
-
-	return arch;
+static const struct arch_to_e_machine prefix_to_e_machine[] = {
+	{"aarch64", EM_AARCH64},
+	{"alpha", EM_ALPHA},
+	{"arc", EM_ARC},
+	{"arm", EM_ARM}, /* Check also for EM_AARCH64. */
+	{"avr", EM_AVR},  /* Check also for EM_AVR32. */
+	{"bfin", EM_BLACKFIN},
+	{"blackfin", EM_BLACKFIN},
+	{"cris", EM_CRIS},
+	{"csky", EM_CSKY},
+	{"hppa", EM_PARISC},
+	{"i386", EM_386},
+	{"i486", EM_386},
+	{"i586", EM_386},
+	{"i686", EM_386},
+	{"loongarch", EM_LOONGARCH},
+	{"m32r", EM_M32R},
+	{"m68k", EM_68K},
+	{"microblaze", EM_MICROBLAZE},
+	{"mips", EM_MIPS},
+	{"msp430", EM_MSP430},
+	{"parisc", EM_PARISC},
+	{"powerpc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"ppc", EM_PPC}, /* Check also for EM_PPC64. */
+	{"riscv", EM_RISCV},
+	{"s390", EM_S390},
+	{"sa110", EM_ARM},
+	{"sh", EM_SH},
+	{"sparc", EM_SPARC}, /* Check also for EM_SPARCV9. */
+	{"sun4u", EM_SPARC},
+	{"x86", EM_X86_64}, /* Check also for EM_386. */
+	{"xtensa", EM_XTENSA},
+};
+
+static int compare_prefix(const void *key, const void *element)
+{
+	const char *search_key = key;
+	const struct arch_to_e_machine *map_element = element;
+	size_t prefix_len = strlen(map_element->prefix);
+
+	return strncmp(search_key, map_element->prefix, prefix_len);
+}
+
+static uint16_t perf_arch_to_e_machine(const char *perf_arch, int is_64_bit)
+{
+	/* Binary search for a matching prefix. */
+	const struct arch_to_e_machine *result;
+
+	if (!perf_arch)
+		return EM_HOST;
+
+	result = bsearch(perf_arch,
+			 prefix_to_e_machine, ARRAY_SIZE(prefix_to_e_machine),
+			 sizeof(prefix_to_e_machine[0]),
+			 compare_prefix);
+
+	if (!result) {
+		pr_debug("Unknown perf arch for ELF machine mapping: %s\n", perf_arch);
+		return EM_NONE;
+	}
+
+	/*
+	 * Handle conflicting prefixes. If the is_64_bit is unknown (-1) then
+	 * assume 64-bit. We can't use perf_env__kernel_is_64_bit as that
+	 * depends on the arch string.
+	 */
+	switch (result->e_machine) {
+	case EM_ARM:
+		return !strcmp(perf_arch, "arm64") ? EM_AARCH64 : EM_ARM;
+	case EM_AVR:
+		return !strcmp(perf_arch, "avr32") ? EM_AVR32 : EM_AVR;
+	case EM_PPC:
+		return (is_64_bit != 0) || strstarts(perf_arch, "ppc64") ? EM_PPC64 : EM_PPC;
+	case EM_SPARC:
+		return (is_64_bit != 0) || !strcmp(perf_arch, "sparc64") ? EM_SPARCV9 : EM_SPARC;
+	case EM_X86_64:
+		return (is_64_bit != 0) || !strcmp(perf_arch, "x86_64") ? EM_X86_64 : EM_386;
+	default:
+		return result->e_machine;
+	}
+}
+
+static const char *e_machine_to_perf_arch(uint16_t e_machine)
+{
+	/*
+	 * Table for if either the perf arch string differs from uname or there
+	 * are >1 ELF machine with the prefix.
+	 */
+	static const struct arch_to_e_machine extras[] = {
+		{"arm64", EM_AARCH64},
+		{"avr32", EM_AVR32},
+		{"powerpc", EM_PPC},
+		{"powerpc", EM_PPC64},
+		{"sparc", EM_SPARCV9},
+		{"x86", EM_386},
+		{"x86", EM_X86_64},
+		{"none", EM_NONE},
+	};
+
+	for (size_t i = 0; i < ARRAY_SIZE(extras); i++) {
+		if (extras[i].e_machine == e_machine)
+			return extras[i].prefix;
+	}
+
+	for (size_t i = 0; i < ARRAY_SIZE(prefix_to_e_machine); i++) {
+		if (prefix_to_e_machine[i].e_machine == e_machine)
+			return prefix_to_e_machine[i].prefix;
+
+	}
+	return "unknown";
+}
+
+uint16_t perf_env__e_machine_nocache(struct perf_env *env, uint32_t *e_flags)
+{
+	uint16_t e_machine = EM_HOST;
+
+	if (env)
+		e_machine = perf_arch_to_e_machine(env->arch, env->kernel_is_64_bit);
+
+	if (e_flags)
+		*e_flags = (e_machine == EM_HOST) ? EF_HOST : 0;
+
+	return e_machine;
+}
+
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags)
+{
+	uint16_t e_machine;
+	uint32_t local_e_flags = 0;
+
+	if (env && env->e_machine != EM_NONE) {
+		if (e_flags)
+			*e_flags = env->e_flags;
+
+		return env->e_machine;
+	}
+	e_machine = perf_env__e_machine_nocache(env, &local_e_flags);
+	if (env) {
+		env->e_machine = e_machine;
+		env->e_flags = local_e_flags;
+	}
+	if (e_flags)
+		*e_flags = local_e_flags;
+
+	return e_machine;
 }
 
 const char *perf_env__arch(struct perf_env *env)
 {
-	char *arch_name;
+	uint16_t e_machine;
+	const char *arch;
 
-	if (!env || !env->arch) { /* Assume local operation */
-		static struct utsname uts = { .machine[0] = '\0', };
-		if (uts.machine[0] == '\0' && uname(&uts) < 0)
-			return NULL;
-		arch_name = uts.machine;
-	} else
-		arch_name = env->arch;
+	if (!env)
+		return e_machine_to_perf_arch(EM_HOST);
+
+	if (env->arch)
+		return env->arch;
 
-	return normalize_arch(arch_name);
+	/*
+	 * Lazily compute/allocate arch. The e_machine may have been
+	 * read from a data file and so may not be EM_HOST.
+	 */
+	e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+	arch = e_machine_to_perf_arch(e_machine);
+	env->arch = strdup(arch);
+	/*
+	 * Avoid potential crashes on the arch string if memory allocation in
+	 * strdup fails and NULL were to be returned.
+	 */
+	return env->arch ?: arch;
 }
 
 #if defined(HAVE_LIBTRACEEVENT)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index c7052ac1f856..7151a9138e3f 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -187,6 +187,8 @@ int perf_env__read_cpu_topology_map(struct perf_env *env);
 
 void cpu_cache_level__free(struct cpu_cache_level *cache);
 
+uint16_t perf_env__e_machine_nocache(struct perf_env *env, uint32_t *e_flags);
+uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(struct perf_env *env, int err);
 const char *perf_env__cpuid(struct perf_env *env);
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f30e48eb3fc3..f1ae61392cce 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -379,21 +379,28 @@ static int write_osrelease(struct feat_fd *ff,
 	return do_write_string(ff, uts.release);
 }
 
-static int write_arch(struct feat_fd *ff,
-		      struct evlist *evlist __maybe_unused)
+static int write_arch(struct feat_fd *ff, struct evlist *evlist)
 {
 	struct utsname uts;
-	int ret;
+	const char *arch = NULL;
 
-	ret = uname(&uts);
-	if (ret < 0)
-		return -1;
+	if (evlist->session) {
+		/* Force the computation in the perf_env of the e_machine of the threads. */
+		perf_session__e_machine(evlist->session, /*e_flags=*/NULL);
+		arch = perf_env__arch(perf_session__env(evlist->session));
+	}
 
-	return do_write_string(ff, uts.machine);
+	if (!arch) {
+		int ret = uname(&uts);
+
+		if (ret < 0)
+			return -1;
+		arch = uts.machine;
+	}
+	return do_write_string(ff, arch);
 }
 
-static int write_e_machine(struct feat_fd *ff,
-			   struct evlist *evlist __maybe_unused)
+static int write_e_machine(struct feat_fd *ff, struct evlist *evlist)
 {
 	/* e_machine expanded from 16 to 32-bits for alignment. */
 	uint32_t e_flags;
@@ -2684,10 +2691,18 @@ static int process_##__feat(struct feat_fd *ff, void *data __maybe_unused) \
 FEAT_PROCESS_STR_FUN(hostname, hostname);
 FEAT_PROCESS_STR_FUN(osrelease, os_release);
 FEAT_PROCESS_STR_FUN(version, version);
-FEAT_PROCESS_STR_FUN(arch, arch);
 FEAT_PROCESS_STR_FUN(cpudesc, cpu_desc);
 FEAT_PROCESS_STR_FUN(cpuid, cpuid);
 
+static int process_arch(struct feat_fd *ff, void *data __maybe_unused)
+{
+	free(ff->ph->env.arch);
+	ff->ph->env.arch = do_read_string(ff);
+	if (!ff->ph->env.arch)
+		return -ENOMEM;
+	return 0;
+}
+
 static int process_e_machine(struct feat_fd *ff, void *data __maybe_unused)
 {
 	int ret;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index fe0de2a0277f..3e64db2d27c2 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -3023,14 +3023,19 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 		return EM_HOST;
 	}
 
+	/*
+	 * Is the env caching an e_machine? If not we want to compute from the
+	 * more accurate threads.
+	 */
 	env = perf_session__env(session);
-	if (env && env->e_machine != EM_NONE) {
-		if (e_flags)
-			*e_flags = env->e_flags;
-
-		return env->e_machine;
-	}
+	if (env && env->e_machine != EM_NONE)
+		return perf_env__e_machine(env, e_flags);
 
+	/*
+	 * Compute from threads, note this is more accurate than
+	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
+	 * mixed 32-bit and 64-bit threads.
+	 */
 	machines__for_each_thread(&session->machines,
 				  perf_session__e_machine_cb,
 				  &args);
@@ -3048,10 +3053,9 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
 
 	/*
 	 * Couldn't determine from the perf_env or current set of
-	 * threads. Default to the host.
+	 * threads. Potentially use logic that uses the arch string otherwise
+	 * default to the host. Don't cache in the perf_env in case later
+	 * threads indicate a better ELF machine type.
 	 */
-	if (e_flags)
-		*e_flags = EF_HOST;
-
-	return EM_HOST;
+	return perf_env__e_machine_nocache(env, e_flags);
 }
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 02/18] perf tests topology: Switch env->arch use to env->e_machine
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 01/18] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 03/18] perf env, dso, thread: Add _endian variants for e_machine helpers Ian Rogers
                                               ` (15 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Some arch string comparisons weren't normalized. Avoid potential
issues with normalized names vs uname values by switching to using the
e_machine.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/topology.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index f54502ebef4b..d4c5c330c679 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -11,6 +11,7 @@
 #include "pmus.h"
 #include "target.h"
 #include <linux/err.h>
+#include <elf.h>
 
 #define TEMPL "/tmp/perf-test-XXXXXX"
 #define DATA_SIZE	10
@@ -74,6 +75,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	struct aggr_cpu_id id;
 	struct perf_cpu cpu;
 	struct perf_env *env;
+	uint16_t e_machine;
 
 	session = perf_session__new(&data, NULL);
 	TEST_ASSERT_VAL("can't get session", !IS_ERR(session));
@@ -101,7 +103,9 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	 *  condition is true (see do_core_id_test in header.c). So always
 	 *  run this test on those platforms.
 	 */
-	if (!env->cpu && strncmp(env->arch, "s390", 4) && strncmp(env->arch, "aarch64", 7))
+	e_machine = perf_env__e_machine(env, NULL);
+
+	if (!env->cpu && e_machine != EM_S390 && e_machine != EM_AARCH64)
 		return TEST_SKIP;
 
 	/*
@@ -110,7 +114,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	 * physical_package_id will be set to -1. Hence skip this
 	 * test if physical_package_id returns -1 for cpu from perf_cpu_map.
 	 */
-	if (!strncmp(env->arch, "ppc64le", 7)) {
+	if (e_machine == EM_PPC64) {
 		if (cpu__get_socket_id(perf_cpu_map__cpu(map, 0)) == -1)
 			return TEST_SKIP;
 	}
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v9 03/18] perf env, dso, thread: Add _endian variants for e_machine helpers
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 01/18] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 02/18] perf tests topology: Switch env->arch use to env->e_machine Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 04/18] perf capstone: Determine architecture from e_machine Ian Rogers
                                               ` (14 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Add perf_arch_is_big_endian(), dso__read_e_machine_endian(),
dso__e_machine_endian(), and thread__e_machine_endian() to support
bi-endianness and cross-architecture analysis without breaking the
existing API.

These helpers allow querying the absolute endianness of a DSO or
thread, which is required for tools like Capstone that need to set the
correct disassembly mode.
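
The API shape used here is an extended function with an optional
out-parameter plus a static inline wrapper that passes NULL, so
existing callers are unchanged. A minimal sketch of that pattern,
assuming only the e_ident layout from <elf.h>; ident_endian() and
ident_valid() are hypothetical names and not the perf helpers:

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch: validate the ELF data encoding byte and
 * optionally report the byte order. Returns false when EI_DATA holds
 * neither ELFDATA2LSB nor ELFDATA2MSB.
 */
static bool ident_endian(const unsigned char *e_ident, bool *is_big_endian)
{
	if (e_ident[EI_DATA] != ELFDATA2LSB && e_ident[EI_DATA] != ELFDATA2MSB)
		return false;

	/* The out-parameter is optional, as in the _endian variants. */
	if (is_big_endian)
		*is_big_endian = (e_ident[EI_DATA] == ELFDATA2MSB);
	return true;
}

/* Existing-style entry point: same result, endianness not requested. */
static inline bool ident_valid(const unsigned char *e_ident)
{
	return ident_endian(e_ident, NULL);
}
```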

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dso.c    | 19 +++++++++++++------
 tools/perf/util/dso.h    | 14 ++++++++++++--
 tools/perf/util/env.c    | 12 ++++++++++++
 tools/perf/util/env.h    |  1 +
 tools/perf/util/thread.c | 22 ++++++++++++++++------
 tools/perf/util/thread.h |  8 +++++++-
 6 files changed, 61 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index b791e1b6b2cf..6439b2a3c898 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1220,7 +1220,8 @@ static enum dso_swap_type dso_swap_type__from_elf_data(unsigned char eidata)
 }
 
 /* Reads e_machine from fd, optionally caching data in dso. */
-uint16_t dso__read_e_machine(struct dso *optional_dso, int fd, uint32_t *e_flags)
+uint16_t dso__read_e_machine_endian(struct dso *optional_dso, int fd, uint32_t *e_flags,
+				    bool *is_big_endian)
 {
 	uint16_t e_machine = EM_NONE;
 	unsigned char e_ident[EI_NIDENT];
@@ -1250,6 +1251,9 @@ uint16_t dso__read_e_machine(struct dso *optional_dso, int fd, uint32_t *e_flags
 	if (swap_type == DSO_SWAP__UNSET)
 		return EM_NONE; // Bad ELF data encoding.
 
+	if (is_big_endian)
+		*is_big_endian = (e_ident[EI_DATA] == ELFDATA2MSB);
+
 	/* Cache the need for swapping. */
 	if (optional_dso) {
 		assert(dso__needs_swap(optional_dso) == DSO_SWAP__UNSET ||
@@ -1288,7 +1292,8 @@ uint16_t dso__read_e_machine(struct dso *optional_dso, int fd, uint32_t *e_flags
 	return e_machine;
 }
 
-uint16_t dso__e_machine(struct dso *dso, struct machine *machine, uint32_t *e_flags)
+uint16_t dso__e_machine_endian(struct dso *dso, struct machine *machine, uint32_t *e_flags,
+			       bool *is_big_endian)
 {
 	uint16_t e_machine = EM_NONE;
 	int fd;
@@ -1308,9 +1313,11 @@ uint16_t dso__e_machine(struct dso *dso, struct machine *machine, uint32_t *e_fl
 	case DSO_BINARY_TYPE__BPF_IMAGE:
 	case DSO_BINARY_TYPE__OOL:
 	case DSO_BINARY_TYPE__JAVA_JIT:
-		if (e_flags)
-			*e_flags = EF_HOST;
-		return EM_HOST;
+		if (is_big_endian) {
+			*is_big_endian = perf_arch_is_big_endian(
+				machine && machine->env ? machine->env->arch : NULL);
+		}
+		return perf_env__e_machine(machine ? machine->env : NULL, e_flags);
 	case DSO_BINARY_TYPE__DEBUGLINK:
 	case DSO_BINARY_TYPE__BUILD_ID_CACHE:
 	case DSO_BINARY_TYPE__BUILD_ID_CACHE_DEBUGINFO:
@@ -1338,7 +1345,7 @@ uint16_t dso__e_machine(struct dso *dso, struct machine *machine, uint32_t *e_fl
 	try_to_open_dso(dso, machine);
 	fd = dso__data(dso)->fd;
 	if (fd >= 0)
-		e_machine = dso__read_e_machine(dso, fd, e_flags);
+		e_machine = dso__read_e_machine_endian(dso, fd, e_flags, is_big_endian);
 	else if (e_flags)
 		*e_flags = 0;
 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index ede691e9a249..2916b954a804 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -866,8 +866,18 @@ int dso__data_file_size(struct dso *dso, struct machine *machine);
 off_t dso__data_size(struct dso *dso, struct machine *machine);
 ssize_t dso__data_read_offset(struct dso *dso, struct machine *machine,
 			      u64 offset, u8 *data, ssize_t size);
-uint16_t dso__read_e_machine(struct dso *optional_dso, int fd, uint32_t *e_flags);
-uint16_t dso__e_machine(struct dso *dso, struct machine *machine, uint32_t *e_flags);
+uint16_t dso__read_e_machine_endian(struct dso *optional_dso, int fd, uint32_t *e_flags,
+				    bool *is_big_endian);
+static inline uint16_t dso__read_e_machine(struct dso *optional_dso, int fd, uint32_t *e_flags)
+{
+	return dso__read_e_machine_endian(optional_dso, fd, e_flags, NULL);
+}
+uint16_t dso__e_machine_endian(struct dso *dso, struct machine *machine, uint32_t *e_flags,
+			       bool *is_big_endian);
+static inline uint16_t dso__e_machine(struct dso *dso, struct machine *machine, uint32_t *e_flags)
+{
+	return dso__e_machine_endian(dso, machine, e_flags, NULL);
+}
 ssize_t dso__data_read_addr(struct dso *dso, struct map *map,
 			    struct machine *machine, u64 addr,
 			    u8 *data, ssize_t size);
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 0edc67a468ab..1a4db133262b 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -339,6 +339,18 @@ int perf_env__kernel_is_64_bit(struct perf_env *env)
 	return env->kernel_is_64_bit;
 }
 
+bool perf_arch_is_big_endian(const char *arch)
+{
+	if (!arch)
+		return __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__;
+
+	if (str_ends_with(arch, "_be") || !strcmp(arch, "sparc") || !strcmp(arch, "sparc64") ||
+	    !strcmp(arch, "s390") || !strcmp(arch, "s390x") || !strcmp(arch, "powerpc"))
+		return true;
+
+	return false;
+}
+
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
 {
 	int i;
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 7151a9138e3f..c355df2dba7b 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -175,6 +175,7 @@ void free_cpu_domain_info(struct cpu_domain_map **cd_map, u32 schedstat_version,
 void perf_env__exit(struct perf_env *env);
 
 int perf_env__kernel_is_64_bit(struct perf_env *env);
+bool perf_arch_is_big_endian(const char *arch);
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
 
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 22be77225bb0..8611293deca9 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -449,7 +449,7 @@ void thread__find_cpumode_addr_location(struct thread *thread, u64 addr,
 	}
 }
 
-static uint16_t read_proc_e_machine_for_pid(pid_t pid, uint32_t *e_flags)
+static uint16_t read_proc_e_machine_for_pid(pid_t pid, uint32_t *e_flags, bool *is_big_endian)
 {
 	char path[6 /* "/proc/" */ + 11 /* max length of pid */ + 5 /* "/exe\0" */];
 	int fd;
@@ -458,7 +458,8 @@ static uint16_t read_proc_e_machine_for_pid(pid_t pid, uint32_t *e_flags)
 	snprintf(path, sizeof(path), "/proc/%d/exe", pid);
 	fd = open(path, O_RDONLY);
 	if (fd >= 0) {
-		e_machine = dso__read_e_machine(/*optional_dso=*/NULL, fd, e_flags);
+		e_machine = dso__read_e_machine_endian(/*optional_dso=*/NULL, fd, e_flags,
+						       is_big_endian);
 		close(fd);
 	}
 	return e_machine;
@@ -468,6 +469,7 @@ struct thread__e_machine_callback_args {
 	struct machine *machine;
 	uint32_t e_flags;
 	uint16_t e_machine;
+	bool is_big_endian;
 };
 
 static int thread__e_machine_callback(struct map *map, void *_args)
@@ -478,11 +480,13 @@ static int thread__e_machine_callback(struct map *map, void *_args)
 	if (!dso)
 		return 0; // No dso, continue search.
 
-	args->e_machine = dso__e_machine(dso, args->machine, &args->e_flags);
+	args->e_machine =
+		dso__e_machine_endian(dso, args->machine, &args->e_flags, &args->is_big_endian);
 	return args->e_machine != EM_NONE ? 1 /* stop search */ : 0 /* continue search */;
 }
 
-uint16_t thread__e_machine(struct thread *thread, struct machine *machine, uint32_t *e_flags)
+uint16_t thread__e_machine_endian(struct thread *thread, struct machine *machine, uint32_t *e_flags,
+				  bool *is_big_endian)
 {
 	pid_t tid, pid;
 	uint16_t e_machine = RC_CHK_ACCESS(thread)->e_machine;
@@ -491,6 +495,7 @@ uint16_t thread__e_machine(struct thread *thread, struct machine *machine, uint3
 		.machine = machine,
 		.e_flags = 0,
 		.e_machine = EM_NONE,
+		.is_big_endian = false,
 	};
 
 	if (e_machine != EM_NONE) {
@@ -510,7 +515,8 @@ uint16_t thread__e_machine(struct thread *thread, struct machine *machine, uint3
 		struct thread *parent = machine__findnew_thread(machine, pid, pid);
 
 		if (parent) {
-			e_machine = thread__e_machine(parent, machine, &local_e_flags);
+			e_machine = thread__e_machine_endian(parent, machine, &local_e_flags,
+							     is_big_endian);
 			thread__put(parent);
 			goto out;
 		}
@@ -522,6 +528,8 @@ uint16_t thread__e_machine(struct thread *thread, struct machine *machine, uint3
 	if (args.e_machine != EM_NONE) {
 		e_machine = args.e_machine;
 		local_e_flags = args.e_flags;
+		if (is_big_endian)
+			*is_big_endian = args.is_big_endian;
 	} else {
 		/* Maps failed, perhaps we're live with map events disabled. */
 		bool is_live = machine->machines == NULL;
@@ -536,7 +544,7 @@ uint16_t thread__e_machine(struct thread *thread, struct machine *machine, uint3
 		}
 		/* Read from /proc/pid/exe if live. */
 		if (is_live)
-			e_machine = read_proc_e_machine_for_pid(pid, &local_e_flags);
+			e_machine = read_proc_e_machine_for_pid(pid, &local_e_flags, is_big_endian);
 	}
 out:
 	if (e_machine != EM_NONE) {
@@ -545,6 +553,8 @@ uint16_t thread__e_machine(struct thread *thread, struct machine *machine, uint3
 	} else {
 		e_machine = EM_HOST;
 		local_e_flags = EF_HOST;
+		if (is_big_endian)
+			*is_big_endian = (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__);
 	}
 	if (e_flags)
 		*e_flags = local_e_flags;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index f5792d3e8a16..8b58590c89de 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -311,7 +311,13 @@ static inline void thread__set_filter_entry_depth(struct thread *thread, int dep
 	RC_CHK_ACCESS(thread)->filter_entry_depth = depth;
 }
 
-uint16_t thread__e_machine(struct thread *thread, struct machine *machine, uint32_t *e_flags);
+uint16_t thread__e_machine_endian(struct thread *thread, struct machine *machine, uint32_t *e_flags,
+				  bool *is_big_endian);
+static inline uint16_t thread__e_machine(struct thread *thread, struct machine *machine,
+					 uint32_t *e_flags)
+{
+	return thread__e_machine_endian(thread, machine, e_flags, NULL);
+}
 
 static inline void thread__set_e_machine(struct thread *thread, uint16_t e_machine)
 {
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v9 04/18] perf capstone: Determine architecture from e_machine
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (2 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 03/18] perf env, dso, thread: Add _endian variants for e_machine helpers Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 05/18] perf print_insn: Use e_machine for fallback IP length check Ian Rogers
                                               ` (13 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Avoid using the arch string, which is imprecise, and use the e_machine
instead. Add more e_machine to capstone architecture translations,
including MIPS, PowerPC, SPARC and RISC-V. Remove unnecessary
maybe_unused annotations.
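
For reference, the architecture half of the translation can be
summarized as below. This is only a sketch: capstone_arch_name() is a
hypothetical helper that returns the capstone enumerator names as
strings so that it builds without the capstone headers, and it leaves
out the mode bits that the patch also sets.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of perf: name of the capstone
 * architecture selected for an ELF machine, or NULL when there is no
 * translation and capstone_init() would fail.
 */
static const char *capstone_arch_name(uint16_t e_machine)
{
	switch (e_machine) {
	case EM_X86_64:
	case EM_386:
		return "CS_ARCH_X86";
	case EM_AARCH64:
		return "CS_ARCH_ARM64";
	case EM_ARM:
		return "CS_ARCH_ARM";
	case EM_S390:
		return "CS_ARCH_SYSZ";
	case EM_MIPS:
		return "CS_ARCH_MIPS";
	case EM_PPC:
	case EM_PPC64:
		return "CS_ARCH_PPC";
	case EM_SPARC:
	case EM_SPARCV9:
		return "CS_ARCH_SPARC";
	case EM_RISCV:
		return "CS_ARCH_RISCV";
	default:
		return NULL;
	}
}
```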

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/capstone.c | 129 ++++++++++++++++++++++++-------------
 1 file changed, 85 insertions(+), 44 deletions(-)

diff --git a/tools/perf/util/capstone.c b/tools/perf/util/capstone.c
index 25cf6e15ec27..870394b46911 100644
--- a/tools/perf/util/capstone.c
+++ b/tools/perf/util/capstone.c
@@ -1,7 +1,19 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "capstone.h"
-#include "annotate.h"
+
+#include <errno.h>
+#include <inttypes.h>
+#include <string.h>
+
+#include <dlfcn.h>
+#include <elf.h>
+#include <fcntl.h>
+#include <linux/ctype.h>
+
+#include <capstone/capstone.h>
+
 #include "addr_location.h"
+#include "annotate.h"
 #include "debug.h"
 #include "disasm.h"
 #include "dso.h"
@@ -11,13 +23,6 @@
 #include "print_insn.h"
 #include "symbol.h"
 #include "thread.h"
-#include <dlfcn.h>
-#include <errno.h>
-#include <fcntl.h>
-#include <inttypes.h>
-#include <string.h>
-
-#include <capstone/capstone.h>
 
 #ifdef LIBCAPSTONE_DLOPEN
 static void *perf_cs_dll_handle(void)
@@ -137,37 +142,67 @@ static enum cs_err perf_cs_close(csh *handle)
 #endif
 }
 
-static int capstone_init(struct machine *machine, csh *cs_handle, bool is64,
+static bool e_machine_to_capstone(uint16_t e_machine, bool is64, bool is_big_endian,
+				  enum cs_arch *arch, enum cs_mode *mode)
+{
+	*mode = is_big_endian ? CS_MODE_BIG_ENDIAN : CS_MODE_LITTLE_ENDIAN;
+	*mode |= is64 ? CS_MODE_64 : CS_MODE_32;
+
+	switch (e_machine) {
+	case EM_X86_64:
+	case EM_386:
+		*arch = CS_ARCH_X86;
+		return true;
+	case EM_AARCH64:
+		*arch = CS_ARCH_ARM64;
+		*mode |= CS_MODE_ARM;
+		return true;
+	case EM_ARM:
+		*arch = CS_ARCH_ARM;
+		*mode |= CS_MODE_ARM | CS_MODE_V8;
+		return true;
+	case EM_S390:
+		*arch = CS_ARCH_SYSZ;
+		return true;
+	case EM_MIPS:
+		*arch = CS_ARCH_MIPS;
+		*mode |= is64 ? CS_MODE_MIPS64 : CS_MODE_MIPS32;
+		return true;
+	case EM_PPC:
+	case EM_PPC64:
+		*arch = CS_ARCH_PPC;
+		return true;
+	case EM_SPARC:
+		*arch = CS_ARCH_SPARC;
+		return true;
+	case EM_SPARCV9:
+		*arch = CS_ARCH_SPARC;
+		*mode |= CS_MODE_V9;
+		return true;
+	case EM_RISCV:
+		*arch = CS_ARCH_RISCV;
+		*mode |= is64 ? CS_MODE_RISCV64 : CS_MODE_RISCV32;
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int capstone_init(uint16_t e_machine, csh *cs_handle, bool is64, bool is_big_endian,
 			 bool disassembler_style)
 {
 	enum cs_arch arch;
 	enum cs_mode mode;
 
-	if (machine__is(machine, "x86_64") && is64) {
-		arch = CS_ARCH_X86;
-		mode = CS_MODE_64;
-	} else if (machine__normalized_is(machine, "x86")) {
-		arch = CS_ARCH_X86;
-		mode = CS_MODE_32;
-	} else if (machine__normalized_is(machine, "arm64")) {
-		arch = CS_ARCH_ARM64;
-		mode = CS_MODE_ARM;
-	} else if (machine__normalized_is(machine, "arm")) {
-		arch = CS_ARCH_ARM;
-		mode = CS_MODE_ARM + CS_MODE_V8;
-	} else if (machine__normalized_is(machine, "s390")) {
-		arch = CS_ARCH_SYSZ;
-		mode = CS_MODE_BIG_ENDIAN;
-	} else {
+	if (!e_machine_to_capstone(e_machine, is64, is_big_endian, &arch, &mode))
 		return -1;
-	}
 
 	if (perf_cs_open(arch, mode, cs_handle) != CS_ERR_OK) {
 		pr_warning_once("cs_open failed\n");
 		return -1;
 	}
 
-	if (machine__normalized_is(machine, "x86")) {
+	if (arch == CS_ARCH_X86) {
 		/*
 		 * In case of using capstone_init while symbol__disassemble
 		 * setting CS_OPT_SYNTAX_ATT depends if disassembler_style opts
@@ -211,29 +246,28 @@ static size_t print_insn_x86(struct thread *thread, u8 cpumode, struct cs_insn *
 	return printed;
 }
 
-
-ssize_t capstone__fprintf_insn_asm(struct machine *machine __maybe_unused,
-				   struct thread *thread __maybe_unused,
-				   u8 cpumode __maybe_unused, bool is64bit __maybe_unused,
-				   const uint8_t *code __maybe_unused,
-				   size_t code_size __maybe_unused,
-				   uint64_t ip __maybe_unused, int *lenp __maybe_unused,
-				   int print_opts __maybe_unused, FILE *fp __maybe_unused)
+ssize_t capstone__fprintf_insn_asm(struct machine *machine, struct thread *thread, u8 cpumode,
+				   bool is64bit, const uint8_t *code, size_t code_size, uint64_t ip,
+				   int *lenp, int print_opts, FILE *fp)
 {
 	size_t printed;
 	struct cs_insn *insn;
 	csh cs_handle;
 	size_t count;
+	bool is_big_endian = false;
+	uint16_t e_machine = thread__e_machine_endian(thread, machine,
+						      /*e_flags=*/NULL, &is_big_endian);
 	int ret;
 
 	/* TODO: Try to initiate capstone only once but need a proper place. */
-	ret = capstone_init(machine, &cs_handle, is64bit, true);
+	ret = capstone_init(e_machine, &cs_handle, is64bit, is_big_endian,
+			    /*disassembler_style=*/true);
 	if (ret < 0)
 		return ret;
 
 	count = perf_cs_disasm(cs_handle, code, code_size, ip, 1, &insn);
 	if (count > 0) {
-		if (machine__normalized_is(machine, "x86"))
+		if (e_machine == EM_X86_64 || e_machine == EM_386)
 			printed = print_insn_x86(thread, cpumode, &insn[0], print_opts, fp);
 		else
 			printed = fprintf(fp, "%s %s", insn[0].mnemonic, insn[0].op_str);
@@ -322,9 +356,8 @@ static int find_file_offset(u64 start, u64 len, u64 pgoff, void *arg)
 	return 0;
 }
 
-int symbol__disassemble_capstone(const char *filename __maybe_unused,
-				 struct symbol *sym __maybe_unused,
-				 struct annotate_args *args __maybe_unused)
+int symbol__disassemble_capstone(const char *filename, struct symbol *sym,
+				 struct annotate_args *args)
 {
 	struct annotation *notes = symbol__annotation(sym);
 	struct map *map = args->ms->map;
@@ -344,6 +377,8 @@ int symbol__disassemble_capstone(const char *filename __maybe_unused,
 	char disasm_buf[512];
 	struct disasm_line *dl;
 	bool disassembler_style = false;
+	uint16_t e_machine;
+	bool is_big_endian = false;
 
 	if (args->options->objdump_path)
 		return -1;
@@ -373,8 +408,10 @@ int symbol__disassemble_capstone(const char *filename __maybe_unused,
 	    !strcmp(args->options->disassembler_style, "att"))
 		disassembler_style = true;
 
-	if (capstone_init(maps__machine(thread__maps(args->ms->thread)), &handle, is_64bit,
-			  disassembler_style) < 0)
+	e_machine = thread__e_machine_endian(args->ms->thread,
+					     /*machine=*/NULL,
+					     /*e_flags=*/NULL, &is_big_endian);
+	if (capstone_init(e_machine, &handle, is_64bit, is_big_endian, disassembler_style) < 0)
 		goto err;
 
 	needs_cs_close = true;
@@ -466,6 +503,8 @@ int symbol__disassemble_capstone_powerpc(const char *filename __maybe_unused,
 	struct disasm_line *dl;
 	u32 *line;
 	bool disassembler_style = false;
+	uint16_t e_machine;
+	bool is_big_endian = false;
 
 	if (args->options->objdump_path)
 		return -1;
@@ -484,8 +523,10 @@ int symbol__disassemble_capstone_powerpc(const char *filename __maybe_unused,
 	    !strcmp(args->options->disassembler_style, "att"))
 		disassembler_style = true;
 
-	if (capstone_init(maps__machine(thread__maps(args->ms->thread)), &handle, is_64bit,
-			  disassembler_style) < 0)
+	e_machine = thread__e_machine_endian(args->ms->thread,
+					     /*machine=*/NULL,
+					     /*e_flags=*/NULL, &is_big_endian);
+	if (capstone_init(e_machine, &handle, is_64bit, is_big_endian, disassembler_style) < 0)
 		goto err;
 
 	needs_cs_close = true;
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v9 05/18] perf print_insn: Use e_machine for fallback IP length check
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (3 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 04/18] perf capstone: Determine architecture from e_machine Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 06/18] perf symbol: Avoid use of machine__is Ian Rogers
                                               ` (12 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Avoid string comparisons with perf_env arch, switch to using the more
precise ELF machine.
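
The decision order in is64bitip() can be sketched as follows: a known
DSO decides by its ELF class, and only without a DSO is the machine
type consulted. The function name ip_is_64bit() and the idea of
passing the ELF class directly are assumptions made for this sketch,
not the perf interface.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not part of perf. elf_class is ELFCLASS32 or
 * ELFCLASS64 when a DSO is known and ELFCLASSNONE otherwise.
 */
static bool ip_is_64bit(unsigned char elf_class, uint16_t env_e_machine)
{
	/* A DSO, when present, is authoritative. */
	if (elf_class != ELFCLASSNONE)
		return elf_class == ELFCLASS64;

	/* Fallback: machine types the patch treats as 64-bit. */
	return env_e_machine == EM_X86_64 ||
	       env_e_machine == EM_AARCH64 ||
	       env_e_machine == EM_S390;
}
```

EM_S390 is shared by 31-bit s390 and 64-bit s390x binaries, so the
fallback treats every s390 machine as 64-bit.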

Sort header files and fix missing definitions.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/print_insn.c | 23 ++++++++++++++---------
 tools/perf/util/print_insn.h |  3 +++
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/print_insn.c b/tools/perf/util/print_insn.c
index 02e6fbb8ca04..4068436f26ea 100644
--- a/tools/perf/util/print_insn.c
+++ b/tools/perf/util/print_insn.c
@@ -4,19 +4,24 @@
  *
  * Author(s): Changbin Du <changbin.du@huawei.com>
  */
+#include "print_insn.h"
+
 #include <inttypes.h>
-#include <string.h>
 #include <stdbool.h>
+#include <string.h>
+
+#include <dwarf-regs.h>
+
 #include "capstone.h"
 #include "debug.h"
+#include "dso.h"
+#include "dump-insn.h"
+#include "env.h"
+#include "machine.h"
+#include "map.h"
 #include "sample.h"
 #include "symbol.h"
-#include "machine.h"
 #include "thread.h"
-#include "print_insn.h"
-#include "dump-insn.h"
-#include "map.h"
-#include "dso.h"
 
 size_t sample__fprintf_insn_raw(struct perf_sample *sample, FILE *fp)
 {
@@ -33,13 +38,13 @@ size_t sample__fprintf_insn_raw(struct perf_sample *sample, FILE *fp)
 static bool is64bitip(struct machine *machine, struct addr_location *al)
 {
 	const struct dso *dso = al->map ? map__dso(al->map) : NULL;
+	uint16_t e_machine;
 
 	if (dso)
 		return dso__is_64_bit(dso);
 
-	return machine__is(machine, "x86_64") ||
-		machine__normalized_is(machine, "arm64") ||
-		machine__normalized_is(machine, "s390");
+	e_machine = perf_env__e_machine(machine->env, /*e_flags=*/NULL);
+	return e_machine == EM_X86_64 || e_machine == EM_AARCH64 || e_machine == EM_S390;
 }
 
 ssize_t fprintf_insn_asm(struct machine *machine, struct thread *thread, u8 cpumode,
diff --git a/tools/perf/util/print_insn.h b/tools/perf/util/print_insn.h
index 07d11af3fc1c..1f2c2f25f973 100644
--- a/tools/perf/util/print_insn.h
+++ b/tools/perf/util/print_insn.h
@@ -5,6 +5,9 @@
 #include <stddef.h>
 #include <stdio.h>
 
+#include <linux/types.h>
+
+struct addr_location;
 struct perf_sample;
 struct thread;
 struct machine;
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v9 06/18] perf symbol: Avoid use of machine__is
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (4 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 05/18] perf print_insn: Use e_machine for fallback IP length check Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 07/18] perf machine: Use perf_env e_machine rather than arch Ian Rogers
                                               ` (11 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Switch to using the ELF machine from the dso or running machine rather
than the machine perf_env arch that may fall back on EM_HOST. This
also avoids potentially imprecise string comparisons.
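
The lookup order (DSO first, then the value cached in the perf_env,
then the computed environment default) amounts to taking the first
candidate that is not EM_NONE. A minimal sketch of that idea, where
first_known_e_machine() and its array interface are hypothetical and
not the helper added by this patch:

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not part of perf: return the first candidate
 * that is a real ELF machine, or the fallback when every candidate is
 * EM_NONE. Candidates are ordered from most to least accurate.
 */
static uint16_t first_known_e_machine(const uint16_t *candidates, size_t n,
				      uint16_t fallback)
{
	for (size_t i = 0; i < n; i++) {
		if (candidates[i] != EM_NONE)
			return candidates[i];
	}
	return fallback;
}
```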

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/symbol.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index fcaeeddbbb6b..a4b1f837a5a5 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -851,6 +851,23 @@ static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
 	return count;
 }
 
+static uint16_t machine_or_dso_e_machine(struct machine *machine, struct dso *dso)
+{
+	uint16_t e_machine = EM_NONE;
+	/* DSO should be most accurate */
+	if (dso)
+		e_machine = dso__e_machine(dso, machine, /*e_flags=*/NULL);
+
+	if (e_machine != EM_NONE)
+		return e_machine;
+
+	/* Check the global environment next. */
+	if (machine && machine->env && machine->env->e_machine != EM_NONE)
+		return machine->env->e_machine;
+
+	return perf_env__e_machine(machine ? machine->env : NULL, /*e_flags=*/NULL);
+}
+
 /*
  * Split the symbols into maps, making sure there are no overlaps, i.e. the
  * kernel range is broken in several maps, named [kernel].N, as we don't have
@@ -866,14 +883,13 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 	struct rb_root_cached *root = dso__symbols(dso);
 	struct rb_node *next = rb_first_cached(root);
 	int kernel_range = 0;
-	bool x86_64;
+	uint16_t e_machine = EM_NONE;
 
 	if (!kmaps)
 		return -1;
 
 	machine = maps__machine(kmaps);
-
-	x86_64 = machine__is(machine, "x86_64");
+	e_machine = machine_or_dso_e_machine(machine, dso);
 
 	while (next) {
 		char *module;
@@ -925,7 +941,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 			 */
 			pos->start = map__map_ip(curr_map, pos->start);
 			pos->end   = map__map_ip(curr_map, pos->end);
-		} else if (x86_64 && is_entry_trampoline(pos->name)) {
+		} else if (e_machine == EM_X86_64 && is_entry_trampoline(pos->name)) {
 			/*
 			 * These symbols are not needed anymore since the
 			 * trampoline maps refer to the text section and it's
@@ -1428,7 +1444,7 @@ static int dso__load_kcore(struct dso *dso, struct map *map,
 		free(new_node);
 	}
 
-	if (machine__is(machine, "x86_64")) {
+	if (machine_or_dso_e_machine(machine, dso) == EM_X86_64) {
 		u64 addr;
 
 		/*
@@ -1716,7 +1732,7 @@ int dso__load(struct dso *dso, struct map *map)
 			ret = dso__load_guest_kernel_sym(dso, map);
 
 		machine = maps__machine(map__kmaps(map));
-		if (machine__is(machine, "x86_64"))
+		if (machine_or_dso_e_machine(machine, dso) == EM_X86_64)
 			machine__map_x86_64_entry_trampolines(machine, dso);
 		goto out;
 	}
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v9 07/18] perf machine: Use perf_env e_machine rather than arch
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (5 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 06/18] perf symbol: Avoid use of machine__is Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 08/18] perf sample-raw: " Ian Rogers
                                               ` (10 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

The arch string is derived from uname and may be normalized, which can
cause mismatches in string comparisons; the ELF machine is more
precise. Reduce the scope of machine__is, as it is often better to get
the e_machine from a thread rather than from the machine. Switch from
string comparisons to ELF machine constant comparisons.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/machine.c | 25 ++++++++-----------------
 tools/perf/util/machine.h |  2 --
 2 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e76f8c86e62a..6d32d3cb5cb7 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1611,10 +1611,15 @@ static bool machine__uses_kcore(struct machine *machine)
 	return dsos__for_each_dso(&machine->dsos, machine__uses_kcore_cb, NULL) != 0 ? true : false;
 }
 
+static bool machine__is(struct machine *machine, uint16_t e_machine)
+{
+	return machine && perf_env__e_machine(machine->env, NULL) == e_machine;
+}
+
 static bool perf_event__is_extra_kernel_mmap(struct machine *machine,
 					     struct extra_kernel_map *xm)
 {
-	return machine__is(machine, "x86_64") &&
+	return machine__is(machine, EM_X86_64) &&
 	       is_entry_trampoline(xm->name);
 }
 
@@ -2770,7 +2775,7 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
 static u64 get_leaf_frame_caller(struct perf_sample *sample,
 		struct thread *thread, int usr_idx)
 {
-	if (machine__normalized_is(maps__machine(thread__maps(thread)), "arm64"))
+	if (thread__e_machine(thread, /*machine=*/NULL, /*e_flags=*/NULL) == EM_AARCH64)
 		return get_leaf_frame_caller_aarch64(sample, thread, usr_idx);
 	else
 		return 0;
@@ -3141,20 +3146,6 @@ int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
 	return 0;
 }
 
-/*
- * Compares the raw arch string. N.B. see instead perf_env__arch() or
- * machine__normalized_is() if a normalized arch is needed.
- */
-bool machine__is(struct machine *machine, const char *arch)
-{
-	return machine && !strcmp(perf_env__raw_arch(machine->env), arch);
-}
-
-bool machine__normalized_is(struct machine *machine, const char *arch)
-{
-	return machine && !strcmp(perf_env__arch(machine->env), arch);
-}
-
 int machine__nr_cpus_avail(struct machine *machine)
 {
 	return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
@@ -3181,7 +3172,7 @@ int machine__get_kernel_start(struct machine *machine)
 		 * start of kernel text, but still above 2^63. So leave
 		 * kernel_start = 1ULL << 63 for x86_64.
 		 */
-		if (!err && !machine__is(machine, "x86_64"))
+		if (!err && !machine__is(machine, EM_X86_64))
 			machine->kernel_start = map__start(map);
 	}
 	return err;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 22a42c5825fa..003c970b3e4b 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -227,8 +227,6 @@ static inline bool machine__is_host(struct machine *machine)
 }
 
 bool machine__is_lock_function(struct machine *machine, u64 addr);
-bool machine__is(struct machine *machine, const char *arch);
-bool machine__normalized_is(struct machine *machine, const char *arch);
 int machine__nr_cpus_avail(struct machine *machine);
 
 struct thread *machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v9 08/18] perf sample-raw: Use perf_env e_machine rather than arch
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (6 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 07/18] perf machine: Use perf_env e_machine rather than arch Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 09/18] perf sort: " Ian Rogers
                                               ` (9 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than the arch to determine S390 and x86 types.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/sample-raw.c | 21 +++++++++++----------
 tools/perf/util/sample-raw.h |  6 +++++-
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/sample-raw.c b/tools/perf/util/sample-raw.c
index bcf442574d6e..be998c713a0d 100644
--- a/tools/perf/util/sample-raw.c
+++ b/tools/perf/util/sample-raw.c
@@ -1,11 +1,12 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+#include "sample-raw.h"
 
-#include <string.h>
+#include <elf.h>
 #include <linux/string.h>
-#include "evlist.h"
+
 #include "env.h"
+#include "evlist.h"
 #include "header.h"
-#include "sample-raw.h"
 #include "session.h"
 
 /*
@@ -14,14 +15,14 @@
  */
 void evlist__init_trace_event_sample_raw(struct evlist *evlist, struct perf_env *env)
 {
-	const char *arch_pf = perf_env__arch(env);
-	const char *cpuid = perf_env__cpuid(env);
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (arch_pf && !strcmp("s390", arch_pf))
+	if (e_machine == EM_S390) {
 		evlist->trace_event_sample_raw = evlist__s390_sample_raw;
-	else if (arch_pf && !strcmp("x86", arch_pf) &&
-		 cpuid && strstarts(cpuid, "AuthenticAMD") &&
-		 evlist__has_amd_ibs(evlist)) {
-		evlist->trace_event_sample_raw = evlist__amd_sample_raw;
+	} else if (e_machine == EM_X86_64 || e_machine == EM_386) {
+		const char *cpuid = perf_env__cpuid(env);
+
+		if (cpuid && strstarts(cpuid, "AuthenticAMD") && evlist__has_amd_ibs(evlist))
+			evlist->trace_event_sample_raw = evlist__amd_sample_raw;
 	}
 }
diff --git a/tools/perf/util/sample-raw.h b/tools/perf/util/sample-raw.h
index 896e9a87e373..c8d38c841c8c 100644
--- a/tools/perf/util/sample-raw.h
+++ b/tools/perf/util/sample-raw.h
@@ -2,7 +2,10 @@
 #ifndef __SAMPLE_RAW_H
 #define __SAMPLE_RAW_H 1
 
+#include <stdbool.h>
+
 struct evlist;
+struct perf_env;
 union perf_event;
 struct perf_sample;
 
@@ -12,4 +15,5 @@ bool evlist__has_amd_ibs(struct evlist *evlist);
 void evlist__amd_sample_raw(struct evlist *evlist, union perf_event *event,
 			    struct perf_sample *sample);
 void evlist__init_trace_event_sample_raw(struct evlist *evlist, struct perf_env *env);
-#endif /* __PERF_EVLIST_H */
+
+#endif /* __SAMPLE_RAW_H */
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 09/18] perf sort: Use perf_env e_machine rather than arch
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (7 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 08/18] perf sample-raw: " Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 10/18] perf arch common: " Ian Rogers
                                               ` (8 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than the arch to determine x86 or PPC types.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/sort.c | 58 +++++++++++++++++++++++-------------------
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 0020089cb13c..90bc4a31bb55 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1,40 +1,45 @@
 // SPDX-License-Identifier: GPL-2.0
+#include "sort.h"
+
 #include <ctype.h>
 #include <errno.h>
 #include <inttypes.h>
-#include <regex.h>
 #include <stdlib.h>
+
+#include <elf.h>
+#include <linux/kernel.h>
 #include <linux/mman.h>
+#include <linux/string.h>
 #include <linux/time64.h>
+
+#include <regex.h>
+
+#include "annotate-data.h"
+#include "annotate.h"
+#include "branch.h"
+#include "cacheline.h"
+#include "cgroup.h"
+#include "comm.h"
 #include "debug.h"
 #include "dso.h"
-#include "sort.h"
+#include "event.h"
+#include "evlist.h"
+#include "evsel.h"
 #include "hist.h"
-#include "cacheline.h"
-#include "comm.h"
+#include "machine.h"
 #include "map.h"
-#include "maps.h"
-#include "symbol.h"
 #include "map_symbol.h"
-#include "branch.h"
-#include "thread.h"
-#include "evsel.h"
-#include "evlist.h"
-#include "srcline.h"
-#include "strlist.h"
-#include "strbuf.h"
+#include "maps.h"
 #include "mem-events.h"
 #include "mem-info.h"
-#include "annotate.h"
-#include "annotate-data.h"
-#include "event.h"
-#include "time-utils.h"
-#include "cgroup.h"
-#include "machine.h"
 #include "session.h"
+#include "srcline.h"
+#include "strbuf.h"
+#include "strlist.h"
+#include "symbol.h"
+#include "thread.h"
+#include "time-utils.h"
 #include "trace-event.h"
-#include <linux/kernel.h>
-#include <linux/string.h>
 
 #ifdef HAVE_LIBTRACEEVENT
 #include <event-parse.h>
@@ -2673,9 +2678,10 @@ struct sort_dimension {
 
 static int arch_support_sort_key(const char *sort_key, struct perf_env *env)
 {
-	const char *arch = perf_env__arch(env);
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (!strcmp("x86", arch) || !strcmp("powerpc", arch)) {
+	if (e_machine == EM_X86_64 || e_machine == EM_386 || e_machine == EM_PPC64 ||
+	    e_machine == EM_PPC) {
 		if (!strcmp(sort_key, "p_stage_cyc"))
 			return 1;
 		if (!strcmp(sort_key, "local_p_stage_cyc"))
@@ -2686,14 +2692,14 @@ static int arch_support_sort_key(const char *sort_key, struct perf_env *env)
 
 static const char *arch_perf_header_entry(const char *se_header, struct perf_env *env)
 {
-	const char *arch = perf_env__arch(env);
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
-	if (!strcmp("x86", arch)) {
+	if (e_machine == EM_X86_64 || e_machine == EM_386) {
 		if (!strcmp(se_header, "Local Pipeline Stage Cycle"))
 			return "Local Retire Latency";
 		else if (!strcmp(se_header, "Pipeline Stage Cycle"))
 			return "Retire Latency";
-	} else if (!strcmp("powerpc", arch)) {
+	} else if (e_machine == EM_PPC64 || e_machine == EM_PPC) {
 		if (!strcmp(se_header, "Local INSTR Latency"))
 			return "Finish Cyc";
 		else if (!strcmp(se_header, "INSTR Latency"))
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 10/18] perf arch common: Use perf_env e_machine rather than arch
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (8 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 09/18] perf sort: " Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 11/18] perf header: In print_pmu_caps use perf_env e_machine Ian Rogers
                                               ` (7 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than arch string matching.
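
As a rough illustration of the idea (not the code in this patch), the
lookup can be pictured as a switch over ELF machine constants in place
of a chain of strcmp() calls. The function name sketch_cross_prefix()
and the prefix strings are hypothetical placeholders, and is_64_bit
stands in for whatever the environment reports about the kernel width:

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: map an ELF machine constant to one illustrative
 * cross-toolchain prefix. The prefixes are placeholders chosen for the
 * sketch, not the triplet tables used by perf.
 */
const char *sketch_cross_prefix(uint16_t e_machine, int is_64_bit)
{
	switch (e_machine) {
	case EM_AARCH64:
		return "aarch64-linux-gnu-";
	case EM_RISCV:
		/* One machine constant covers both widths, so width is a second input. */
		return is_64_bit ? "riscv64-linux-gnu-" : "riscv32-linux-gnu-";
	case EM_S390:
		return "s390x-linux-gnu-";
	case EM_386:
	case EM_X86_64:
		/* Two constants share one entry, as the normalized "x86" string did. */
		return "x86_64-linux-gnu-";
	default:
		/* Unknown machine: the caller decides how to report it. */
		return NULL;
	}
}
```

Grouping several case labels onto one entry plays the role that arch
string normalization played before, while the compiler can check the
constants.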

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/arch/common.c | 62 ++++++++++++++++++++++++++--------------
 1 file changed, 40 insertions(+), 22 deletions(-)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index 21836f70f231..1d8aff9b32d6 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -1,13 +1,18 @@
 // SPDX-License-Identifier: GPL-2.0
+#include "common.h"
+
 #include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+
+#include <linux/zalloc.h>
 #include <unistd.h>
-#include "common.h"
-#include "../util/env.h"
+
+#include <dwarf-regs.h>
+
 #include "../util/debug.h"
-#include <linux/zalloc.h>
+#include "../util/env.h"
 
 static const char *const arc_triplets[] = {
 	"arc-linux-",
@@ -145,7 +150,8 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 					  const char *name, char **path)
 {
 	int idx;
-	const char *arch = perf_env__arch(env), *cross_env;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+	const char *cross_env;
 	const char *const *path_list;
 	char *buf = NULL;
 
@@ -153,7 +159,7 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 	 * We don't need to try to find objdump path for native system.
 	 * Just use default binutils path (e.g.: "objdump").
 	 */
-	if (!strcmp(perf_env__arch(NULL), arch))
+	if (e_machine == EM_HOST)
 		goto out;
 
 	cross_env = getenv("CROSS_COMPILE");
@@ -170,30 +176,42 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 		zfree(&buf);
 	}
 
-	if (!strcmp(arch, "arc"))
+	switch (e_machine) {
+	case EM_ARC:
 		path_list = arc_triplets;
-	else if (!strcmp(arch, "arm"))
+		break;
+	case EM_ARM:
 		path_list = arm_triplets;
-	else if (!strcmp(arch, "arm64"))
+		break;
+	case EM_AARCH64:
 		path_list = arm64_triplets;
-	else if (!strcmp(arch, "powerpc"))
+		break;
+	case EM_PPC:
+	case EM_PPC64:
 		path_list = powerpc_triplets;
-	else if (!strcmp(arch, "riscv32"))
-		path_list = riscv32_triplets;
-	else if (!strcmp(arch, "riscv64"))
-		path_list = riscv64_triplets;
-	else if (!strcmp(arch, "sh"))
+		break;
+	case EM_RISCV:
+		path_list = perf_env__kernel_is_64_bit(env) ? riscv64_triplets : riscv32_triplets;
+		break;
+	case EM_SH:
 		path_list = sh_triplets;
-	else if (!strcmp(arch, "s390"))
+		break;
+	case EM_S390:
 		path_list = s390_triplets;
-	else if (!strcmp(arch, "sparc"))
+		break;
+	case EM_SPARC:
+	case EM_SPARCV9:
 		path_list = sparc_triplets;
-	else if (!strcmp(arch, "x86"))
+		break;
+	case EM_X86_64:
+	case EM_386:
 		path_list = x86_triplets;
-	else if (!strcmp(arch, "mips"))
+		break;
+	case EM_MIPS:
 		path_list = mips_triplets;
-	else {
-		ui__error("binutils for %s not supported.\n", arch);
+		break;
+	default:
+		ui__error("binutils for %s not supported.\n", perf_env__arch(env));
 		goto out_error;
 	}
 
@@ -202,7 +220,7 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 		ui__error("Please install %s for %s.\n"
 			  "You can add it to PATH, set CROSS_COMPILE or "
 			  "override the default using --%s.\n",
-			  name, arch, name);
+			  name, perf_env__arch(env), name);
 		goto out_error;
 	}
 
@@ -237,5 +255,5 @@ int perf_env__lookup_objdump(struct perf_env *env, char **path)
  */
 bool perf_env__single_address_space(struct perf_env *env)
 {
-	return strcmp(perf_env__arch(env), "sparc");
+	return perf_env__e_machine(env, /*e_flags=*/NULL) != EM_SPARCV9;
 }
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 11/18] perf header: In print_pmu_caps use perf_env e_machine
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (9 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 10/18] perf arch common: " Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 12/18] perf c2c: Use perf_env e_machine rather than arch Ian Rogers
                                               ` (6 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Switch from arch to e_machine in print_pmu_caps.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/header.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f1ae61392cce..bdf6c5d0fd5d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2348,15 +2348,16 @@ static void print_cpu_pmu_caps(struct feat_fd *ff, FILE *fp)
 static void print_pmu_caps(struct feat_fd *ff, FILE *fp)
 {
 	struct perf_env *env = &ff->ph->env;
-	struct pmu_caps *pmu_caps;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
 
 	for (int i = 0; i < env->nr_pmus_with_caps; i++) {
-		pmu_caps = &env->pmu_caps[i];
+		struct pmu_caps *pmu_caps = &env->pmu_caps[i];
+
 		__print_pmu_caps(fp, pmu_caps->nr_caps, pmu_caps->caps,
 				 pmu_caps->pmu_name);
 	}
 
-	if (strcmp(perf_env__arch(env), "x86") == 0 &&
+	if ((e_machine == EM_X86_64 || e_machine == EM_386) &&
 	    perf_env__has_pmu_mapping(env, "ibs_op")) {
 		char *max_precise = perf_env__find_pmu_cap(env, "cpu", "max_precise");
 
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 12/18] perf c2c: Use perf_env e_machine rather than arch
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (10 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 11/18] perf header: In print_pmu_caps use perf_env e_machine Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 13/18] perf lock-contention: " Ian Rogers
                                               ` (5 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than arch string matching for AARCH64.

Include dwarf-regs.h in case EM_AARCH64 isn't otherwise defined, and
sort the headers while adding the include.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-c2c.c | 40 ++++++++++++++++++++++------------------
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 72a7802775ee..c55cab53531b 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -12,41 +12,45 @@
  */
 #include <errno.h>
 #include <inttypes.h>
+
+#include <asm/bug.h>
 #include <linux/compiler.h>
 #include <linux/err.h>
 #include <linux/kernel.h>
 #include <linux/stringify.h>
 #include <linux/zalloc.h>
-#include <asm/bug.h>
 #include <sys/param.h>
-#include "debug.h"
-#include "builtin.h"
+
+#include <dwarf-regs.h>
 #include <perf/cpumap.h>
 #include <subcmd/pager.h>
 #include <subcmd/parse-options.h>
-#include "map_symbol.h"
-#include "mem-events.h"
-#include "session.h"
-#include "hist.h"
-#include "sort.h"
-#include "tool.h"
+
+#include "builtin.h"
 #include "cacheline.h"
 #include "data.h"
+#include "debug.h"
 #include "event.h"
 #include "evlist.h"
 #include "evsel.h"
-#include "ui/browsers/hists.h"
-#include "thread.h"
-#include "mem2node.h"
+#include "hist.h"
+#include "map_symbol.h"
+#include "mem-events.h"
 #include "mem-info.h"
-#include "symbol.h"
-#include "ui/ui.h"
-#include "ui/progress.h"
+#include "mem2node.h"
 #include "pmus.h"
+#include "session.h"
+#include "sort.h"
 #include "string2.h"
-#include "util/util.h"
-#include "util/symbol.h"
+#include "symbol.h"
+#include "thread.h"
+#include "tool.h"
+#include "ui/browsers/hists.h"
+#include "ui/progress.h"
+#include "ui/ui.h"
 #include "util/annotate.h"
+#include "util/symbol.h"
+#include "util/util.h"
 
 struct c2c_hists {
 	struct hists		hists;
@@ -3202,7 +3206,7 @@ static int perf_c2c__report(int argc, const char **argv)
 	 * default display type.
 	 */
 	if (!display) {
-		if (!strcmp(perf_env__arch(env), "arm64"))
+		if (perf_env__e_machine(env, /*e_flags=*/NULL) == EM_AARCH64)
 			display = "peer";
 		else
 			display = "tot";
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 13/18] perf lock-contention: Use perf_env e_machine rather than arch
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (11 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 12/18] perf c2c: Use perf_env e_machine rather than arch Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 14/18] perf env: Refactor perf_env__arch_strerrno Ian Rogers
                                               ` (4 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Use the e_machine rather than arch string matching for powerpc.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/lock-contention.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/lock-contention.c b/tools/perf/util/lock-contention.c
index 92e7b7b572a2..119a7206f3cd 100644
--- a/tools/perf/util/lock-contention.c
+++ b/tools/perf/util/lock-contention.c
@@ -104,7 +104,8 @@ bool match_callstack_filter(struct machine *machine, u64 *callstack, int max_sta
 	struct map *kmap;
 	struct symbol *sym;
 	u64 ip;
-	const char *arch = perf_env__arch(machine->env);
+	uint16_t e_machine = perf_env__e_machine(machine->env, /*e_flags=*/NULL);
+	bool is_powerpc = e_machine == EM_PPC64 || e_machine == EM_PPC;
 
 	if (list_empty(&callstack_filters))
 		return true;
@@ -125,8 +126,7 @@ bool match_callstack_filter(struct machine *machine, u64 *callstack, int max_sta
 		 * incase first or second callstack index entry has 0
 		 * address for powerpc.
 		 */
-		if (!callstack || (!callstack[i] && (strcmp(arch, "powerpc") ||
-						(i != 1 && i != 2))))
+		if (!callstack || (!callstack[i] && (!is_powerpc || (i != 1 && i != 2))))
 			break;
 
 		ip = callstack[i];
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 14/18] perf env: Refactor perf_env__arch_strerrno
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (12 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 13/18] perf lock-contention: " Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 15/18] perf env: Remove unused perf_env__raw_arch Ian Rogers
                                               ` (3 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

perf_env__arch_strerrno is only available with libtraceevent, so hide
the declaration when libtraceevent is not available.

The previous approach mapped an architecture string to a pointer to a
function that takes an int errno value and returns a string. The new
approach takes an e_machine and an errno value and returns a string.

The only call sites are in builtin-trace.c, where the e_machine is
already available and potentially more specific than the perf_env arch
string, which is a single global value.

The major complication in this approach is having the shell script
that generates the C code map a Linux arch directory name to the
matching ELF machine constants.
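
For readers following along, the shape of the generated lookup is
roughly as sketched below. This is a hand-written approximation, not
the script's output: sketch_arch_strerrno() is a hypothetical name, the
per-arch tables are tiny subsets, and the default case falls back to
the x86 table only for the sake of the example:

```c
#include <elf.h>
#include <stdint.h>

/* Hand-written subset of an x86 errno table (the real one is generated). */
static const char *errno_to_name__x86(int err)
{
	switch (err) {
	case 1:  return "EPERM";
	case 2:  return "ENOENT";
	case 38: return "ENOSYS";
	default: return "(unknown)";
	}
}

/* Hand-written subset of a MIPS errno table; ENOSYS has a different number. */
static const char *errno_to_name__mips(int err)
{
	switch (err) {
	case 1:  return "EPERM";
	case 2:  return "ENOENT";
	case 89: return "ENOSYS";
	default: return "(unknown)";
	}
}

/* Dispatch on the ELF machine, then translate the errno for that arch. */
const char *sketch_arch_strerrno(uint16_t e_machine, int err)
{
	switch (e_machine) {
	case EM_386:
	case EM_X86_64:
		return errno_to_name__x86(err);
	case EM_MIPS:
		return errno_to_name__mips(err);
	default:
		/* Assumed fallback for the sketch only. */
		return errno_to_name__x86(err);
	}
}
```

With this shape, sketch_arch_strerrno(EM_MIPS, 89) and
sketch_arch_strerrno(EM_X86_64, 38) both name ENOSYS, which is the
per-sample distinction a single global arch string cannot express.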

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-trace.c                  |  7 ++--
 tools/perf/trace/beauty/arch_errno_names.sh | 40 ++++++++++++++++++---
 tools/perf/util/env.c                       | 13 +++----
 tools/perf/util/env.h                       |  7 ++--
 4 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index e58c49d047a2..d278af18542f 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3008,9 +3008,8 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
 	} else if (ret < 0) {
 errno_print: {
 		char bf[STRERR_BUFSIZE];
-		struct perf_env *env = evsel__env(evsel) ?: &trace->host_env;
 		const char *emsg = str_error_r(-ret, bf, sizeof(bf));
-		const char *e = perf_env__arch_strerrno(env, err);
+		const char *e = perf_env__arch_strerrno(e_machine, err);
 
 		fprintf(trace->output, "-1 %s (%s)", e, emsg);
 	}
@@ -4890,7 +4889,9 @@ static size_t syscall__dump_stats(struct trace *trace, int e_machine, FILE *fp,
 
 				for (e = 0; e < stats->max_errno; ++e) {
 					if (stats->errnos[e] != 0)
-						fprintf(fp, "\t\t\t\t%s: %d\n", perf_env__arch_strerrno(trace->host->env, e + 1), stats->errnos[e]);
+						fprintf(fp, "\t\t\t\t%s: %d\n",
+							perf_env__arch_strerrno(e_machine, e + 1),
+							stats->errnos[e]);
 				}
 			}
 			lines++;
diff --git a/tools/perf/trace/beauty/arch_errno_names.sh b/tools/perf/trace/beauty/arch_errno_names.sh
index b22890b8d272..89b742927168 100755
--- a/tools/perf/trace/beauty/arch_errno_names.sh
+++ b/tools/perf/trace/beauty/arch_errno_names.sh
@@ -52,21 +52,49 @@ process_arch()
 		|IFS=, create_errno_lookup_func "$arch"
 }
 
+arch_to_e_machine()
+{
+	case "$1" in
+	alpha)      printf '\tcase EM_ALPHA:\n' ;;
+	arc)        printf '\tcase EM_ARC:\n' ;;
+	arm)        printf '\tcase EM_ARM:\n' ;;
+	arm64)      printf '\tcase EM_AARCH64:\n' ;;
+	csky)       printf '\tcase EM_CSKY:\n' ;;
+	hexagon)    printf '\tcase EM_HEXAGON:\n' ;;
+	loongarch)  printf '\tcase EM_LOONGARCH:\n' ;;
+	microblaze) printf '\tcase EM_MICROBLAZE:\n' ;;
+	mips)       printf '\tcase EM_MIPS:\n' ;;
+	parisc)     printf '\tcase EM_PARISC:\n' ;;
+	powerpc)    printf '\tcase EM_PPC:\n\tcase EM_PPC64:\n' ;;
+	riscv)      printf '\tcase EM_RISCV:\n' ;;
+	s390)       printf '\tcase EM_S390:\n' ;;
+	sh)         printf '\tcase EM_SH:\n' ;;
+	sparc)      printf '\tcase EM_SPARC:\n\tcase EM_SPARCV9:\n' ;;
+	x86)        printf '\tcase EM_386:\n\tcase EM_X86_64:\n' ;;
+	xtensa)     printf '\tcase EM_XTENSA:\n' ;;
+	esac
+}
+
 create_arch_errno_table_func()
 {
 	archlist="$1"
 	default="$2"
 
-	printf 'static arch_syscalls__strerrno_t *\n'
-	printf 'arch_syscalls__strerrno_function(const char *arch)\n'
+	printf 'static const char *\n'
+	printf 'arch_syscalls__strerrno(uint16_t e_machine, int err)\n'
 	printf '{\n'
+	printf '\tswitch (e_machine) {\n'
 	for arch in $archlist; do
 		arch_str=$(arch_string "$arch")
-		printf '\tif (!strcmp(arch, "%s"))\n' "$arch_str"
-		printf '\t\treturn errno_to_name__%s;\n' "$arch_str"
+		ems=$(arch_to_e_machine "$arch_str")
+		if [ -n "$ems" ]; then
+			printf '%s\n' "$ems"
+			printf '\t\treturn errno_to_name__%s(err);\n' "$arch_str"
+		fi
 	done
 	arch_str=$(arch_string "$default")
-	printf '\treturn errno_to_name__%s;\n' "$arch_str"
+	printf '\tdefault:\n\t\treturn errno_to_name__%s(err);\n' "$arch_str"
+	printf '\t}\n'
 	printf '}\n'
 }
 
@@ -74,6 +102,8 @@ cat <<EoHEADER
 /* SPDX-License-Identifier: GPL-2.0 */
 
 #include <string.h>
+#include <stdint.h>
+#include <elf.h>
 
 EoHEADER
 
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 1a4db133262b..8ac7aff0b27c 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -798,17 +798,12 @@ const char *perf_env__arch(struct perf_env *env)
 #include "trace/beauty/arch_errno_names.c"
 #endif
 
-const char *perf_env__arch_strerrno(struct perf_env *env __maybe_unused, int err __maybe_unused)
-{
 #if defined(HAVE_LIBTRACEEVENT)
-	if (env->arch_strerrno == NULL)
-		env->arch_strerrno = arch_syscalls__strerrno_function(perf_env__arch(env));
-
-	return env->arch_strerrno ? env->arch_strerrno(err) : "no arch specific strerrno function";
-#else
-	return "!HAVE_LIBTRACEEVENT";
-#endif
+const char *perf_env__arch_strerrno(uint16_t e_machine, int err)
+{
+	return arch_syscalls__strerrno(e_machine, err);
 }
+#endif
 
 const char *perf_env__cpuid(struct perf_env *env)
 {
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index c355df2dba7b..ba51b871c401 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -67,8 +67,6 @@ struct cpu_domain_map {
 	struct domain_info	**domains;
 };
 
-typedef const char *(arch_syscalls__strerrno_t)(int err);
-
 struct perf_env {
 	char			*hostname;
 	char			*os_release;
@@ -158,7 +156,6 @@ struct perf_env {
 		 */
 		bool	enabled;
 	} clock;
-	arch_syscalls__strerrno_t *arch_strerrno;
 };
 
 enum perf_compress_type {
@@ -191,7 +188,9 @@ void cpu_cache_level__free(struct cpu_cache_level *cache);
 uint16_t perf_env__e_machine_nocache(struct perf_env *env, uint32_t *e_flags);
 uint16_t perf_env__e_machine(struct perf_env *env, uint32_t *e_flags);
 const char *perf_env__arch(struct perf_env *env);
-const char *perf_env__arch_strerrno(struct perf_env *env, int err);
+#if defined(HAVE_LIBTRACEEVENT)
+const char *perf_env__arch_strerrno(uint16_t e_machine, int err);
+#endif
 const char *perf_env__cpuid(struct perf_env *env);
 const char *perf_env__raw_arch(struct perf_env *env);
 int perf_env__nr_cpus_avail(struct perf_env *env);
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 15/18] perf env: Remove unused perf_env__raw_arch
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (13 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 14/18] perf env: Refactor perf_env__arch_strerrno Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 16/18] perf env: Add helper to lazily compute the os_release Ian Rogers
                                               ` (2 subsequent siblings)
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

The switch to using e_machine has made the perf_env__raw_arch function
unused so remove it.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/env.c | 18 ------------------
 tools/perf/util/env.h |  1 -
 2 files changed, 19 deletions(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 8ac7aff0b27c..29d5fe37528b 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -463,19 +463,6 @@ int perf_env__read_cpuid(struct perf_env *env)
 	return 0;
 }
 
-static int perf_env__read_arch(struct perf_env *env)
-{
-	struct utsname uts;
-
-	if (env->arch)
-		return 0;
-
-	if (!uname(&uts))
-		env->arch = strdup(uts.machine);
-
-	return env->arch ? 0 : -ENOMEM;
-}
-
 static int perf_env__read_nr_cpus_avail(struct perf_env *env)
 {
 	if (env->nr_cpus_avail == 0)
@@ -594,11 +581,6 @@ int perf_env__read_core_pmu_caps(struct perf_env *env)
 	return ret;
 }
 
-const char *perf_env__raw_arch(struct perf_env *env)
-{
-	return env && !perf_env__read_arch(env) ? env->arch : "unknown";
-}
-
 int perf_env__nr_cpus_avail(struct perf_env *env)
 {
 	return env && !perf_env__read_nr_cpus_avail(env) ? env->nr_cpus_avail : 0;
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index ba51b871c401..bc4801d8399b 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -192,7 +192,6 @@ const char *perf_env__arch(struct perf_env *env);
 const char *perf_env__arch_strerrno(uint16_t e_machine, int err);
 #endif
 const char *perf_env__cpuid(struct perf_env *env);
-const char *perf_env__raw_arch(struct perf_env *env);
 int perf_env__nr_cpus_avail(struct perf_env *env);
 
 void perf_env__init(struct perf_env *env);
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v9 16/18] perf env: Add helper to lazily compute the os_release
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (14 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 15/18] perf env: Remove unused perf_env__raw_arch Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 17/18] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 18/18] perf symbol: Lazily compute idle Ian Rogers
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

In live mode the os_release isn't initialized. Add a lazy
initialization helper that assumes live mode when the os_release
hasn't been set.
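
The lazy pattern can be sketched in isolation as follows. The struct,
the function name and the "unknown" fallback string are hypothetical
stand-ins for this illustration; the patch itself uses struct perf_env
and falls back to the perf version string:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical stand-in for the environment structure. */
struct sketch_env {
	char *os_release;	/* NULL until first requested or read from a file */
};

static const char sketch_fallback[] = "unknown";

/* Return the cached release, computing it from uname() on first use. */
const char *sketch_env__os_release(struct sketch_env *env)
{
	struct utsname uts;
	const char *src;
	size_t len;

	if (!env)
		return sketch_fallback;
	if (env->os_release)
		return env->os_release;

	/* Not set from a data file: assume live mode and ask the kernel. */
	src = uname(&uts) < 0 ? sketch_fallback : uts.release;
	len = strlen(src) + 1;
	env->os_release = malloc(len);
	if (!env->os_release)
		return sketch_fallback;
	memcpy(env->os_release, src, len);
	return env->os_release;
}
```

The second and later calls return the cached pointer, so callers never
need to know whether the value came from a file or from uname().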

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/data-convert-bt.c |  2 +-
 tools/perf/util/env.c             | 21 +++++++++++++++++++++
 tools/perf/util/env.h             |  1 +
 tools/perf/util/header.c          | 16 +++++++++++-----
 tools/perf/util/symbol.c          |  4 ++--
 5 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 3b8f2df823a9..2c88420fe33e 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1414,7 +1414,7 @@ do {									\
 
 	ADD("host",    env->hostname);
 	ADD("sysname", "Linux");
-	ADD("release", env->os_release);
+	ADD("release", perf_env__os_release(env));
 	ADD("version", env->version);
 	ADD("machine", env->arch);
 	ADD("domain", "kernel");
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 29d5fe37528b..45dde40042b5 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -351,6 +351,27 @@ bool perf_arch_is_big_endian(const char *arch)
 	return false;
 }
 
+const char *perf_env__os_release(struct perf_env *env)
+{
+	struct utsname uts;
+	int ret;
+
+	if (!env)
+		return perf_version_string;
+
+	if (env->os_release)
+		return env->os_release;
+
+	/*
+	 * The os_release is being accessed but wasn't initialized from a data
+	 * file, assume this is 'live' mode and use the release from uname. If
+	 * uname or strdup fails then use the current perf tool version.
+	 */
+	ret = uname(&uts);
+	env->os_release = strdup(ret < 0 ? perf_version_string : uts.release);
+	return env->os_release ?: perf_version_string;
+}
+
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
 {
 	int i;
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index bc4801d8399b..bbf10446204c 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -173,6 +173,7 @@ void perf_env__exit(struct perf_env *env);
 
 int perf_env__kernel_is_64_bit(struct perf_env *env);
 bool perf_arch_is_big_endian(const char *arch);
+const char *perf_env__os_release(struct perf_env *env);
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
 
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index bdf6c5d0fd5d..ce0c392ead69 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -370,13 +370,19 @@ static int write_osrelease(struct feat_fd *ff,
 			   struct evlist *evlist __maybe_unused)
 {
 	struct utsname uts;
-	int ret;
+	const char *release = NULL;
 
-	ret = uname(&uts);
-	if (ret < 0)
-		return -1;
+	if (evlist->session)
+		release = perf_env__os_release(perf_session__env(evlist->session));
 
-	return do_write_string(ff, uts.release);
+	if (!release) {
+		int ret = uname(&uts);
+
+		if (ret < 0)
+			return -1;
+		release = uts.release;
+	}
+	return do_write_string(ff, release);
 }
 
 static int write_arch(struct feat_fd *ff, struct evlist *evlist)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a4b1f837a5a5..fabed5b0fa57 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2225,7 +2225,7 @@ static int vmlinux_path__init(struct perf_env *env)
 {
 	struct utsname uts;
 	char bf[PATH_MAX];
-	char *kernel_version;
+	const char *kernel_version;
 	unsigned int i;
 
 	vmlinux_path = malloc(sizeof(char *) * (ARRAY_SIZE(vmlinux_paths) +
@@ -2242,7 +2242,7 @@ static int vmlinux_path__init(struct perf_env *env)
 		return 0;
 
 	if (env) {
-		kernel_version = env->os_release;
+		kernel_version = perf_env__os_release(env);
 	} else {
 		if (uname(&uts) < 0)
 			goto out_fail;
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v9 17/18] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (15 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 16/18] perf env: Add helper to lazily compute the os_release Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  2026-05-03  0:22                             ` [PATCH v9 18/18] perf symbol: Lazily compute idle Ian Rogers
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

A problem with putting bitfields into struct symbol is that bitfields
sharing a storage unit are updated with a non-atomic read-modify-write.
If two of them are updated concurrently, only one write to the underlying
storage unit may survive, leading to lost updates.

To avoid this, replace the bitfields with a single 16-bit atomic flags
word in the symbol and use atomic operations to read or update parts of
it. Add accessors to simplify this.

The idle field has three states (unknown, not idle and idle) in
preparation for a later change that will lazily compute it.
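
As a standalone illustration of the approach, here is a minimal sketch
that is not taken from the patch. All demo_* and DEMO_* names are
hypothetical and the bit layout is simplified. A single-bit flag can be
updated with one atomic OR/AND, while a multi-bit field needs a
compare-and-exchange loop so that the other bits in the word are kept:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bit 0 is a boolean flag, bits 1-2 hold a small state. */
#define DEMO_FLAG_IGNORE	(1u << 0)
#define DEMO_STATE_SHIFT	1
#define DEMO_STATE_MASK		(0x3u << DEMO_STATE_SHIFT)

struct demo_sym {
	_Atomic uint16_t flags;
};

/* A single bit can be set or cleared with one atomic read-modify-write. */
static void demo_set_ignore(struct demo_sym *sym, bool ignore)
{
	assert(sym);
	if (ignore)
		atomic_fetch_or(&sym->flags, DEMO_FLAG_IGNORE);
	else
		atomic_fetch_and(&sym->flags, (uint16_t)~DEMO_FLAG_IGNORE);
}

/*
 * A multi-bit field is cleared and rewritten in one step. The loop retries
 * if another thread changed the word in between; on failure old_flags is
 * reloaded with the current value by the compare-exchange itself.
 */
static void demo_set_state(struct demo_sym *sym, uint16_t state)
{
	uint16_t old_flags = atomic_load(&sym->flags);
	uint16_t new_flags;

	do {
		new_flags = (uint16_t)((old_flags & ~DEMO_STATE_MASK) |
				       ((state << DEMO_STATE_SHIFT) & DEMO_STATE_MASK));
	} while (!atomic_compare_exchange_weak(&sym->flags, &old_flags, new_flags));
}

static bool demo_ignore(struct demo_sym *sym)
{
	return (atomic_load(&sym->flags) & DEMO_FLAG_IGNORE) != 0;
}

static uint16_t demo_state(struct demo_sym *sym)
{
	return (uint16_t)((atomic_load(&sym->flags) & DEMO_STATE_MASK) >> DEMO_STATE_SHIFT);
}
```

With a plain u8 bitfield the two setters above would each compile to a
load, a mask and a store of the shared byte, so an update made by one
thread between another thread's load and store could be overwritten.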

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c                   |  6 +-
 tools/perf/builtin-kwork.c                    |  2 +-
 tools/perf/builtin-report.c                   |  2 +-
 tools/perf/builtin-sched.c                    |  4 +-
 tools/perf/builtin-top.c                      |  6 +-
 tools/perf/tests/symbols.c                    |  2 +-
 tools/perf/tests/vmlinux-kallsyms.c           |  2 +-
 tools/perf/ui/browsers/annotate.c             |  2 +-
 tools/perf/ui/browsers/map.c                  |  4 +-
 tools/perf/util/annotate.c                    |  5 +-
 tools/perf/util/auxtrace.c                    |  6 +-
 tools/perf/util/callchain.c                   |  4 +-
 tools/perf/util/dlfilter.c                    |  2 +-
 tools/perf/util/evsel_fprintf.c               |  6 +-
 tools/perf/util/intel-pt.c                    |  2 +-
 tools/perf/util/machine.c                     |  2 +-
 tools/perf/util/probe-event.c                 |  4 +-
 .../util/scripting-engines/trace-event-perl.c |  2 +-
 .../scripting-engines/trace-event-python.c    |  4 +-
 tools/perf/util/sort.c                        |  8 +-
 tools/perf/util/srcline.c                     | 10 +--
 tools/perf/util/symbol-elf.c                  |  3 +-
 tools/perf/util/symbol.c                      | 84 +++++++++++++++----
 tools/perf/util/symbol.h                      | 70 ++++++++++++----
 tools/perf/util/symbol_fprintf.c              |  4 +-
 25 files changed, 171 insertions(+), 75 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index f174bc69cec4..390327c7f78d 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -439,9 +439,9 @@ static int perf_event__convert_sample_callchain(const struct perf_tool *tool,
 	node = cursor->first;
 	for (k = 0; k < cursor->nr && i < PERF_MAX_STACK_DEPTH; k++) {
 		if (machine__kernel_ip(machine, node->ip))
-			/* kernel IPs were added already */;
-		else if (node->ms.sym && node->ms.sym->inlined)
-			/* we can't handle inlined callchains */;
+			; /* kernel IPs were added already */
+		else if (node->ms.sym && symbol__inlined(node->ms.sym))
+			; /* we can't handle inlined callchains */
 		else
 			inject->raw_callchain->ips[i++] = node->ip;
 
diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 9d3a4c779a41..7337ee956dc9 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -725,7 +725,7 @@ static void timehist_save_callchain(struct perf_kwork *kwork,
 		if (sym) {
 			if (!strcmp(sym->name, "__softirqentry_text_start") ||
 			    !strcmp(sym->name, "__do_softirq"))
-				sym->ignore = 1;
+				symbol__set_ignore(sym, true);
 		}
 
 		callchain_cursor_advance(cursor);
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 95c0bdba6b11..3c9ada8539c3 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -753,7 +753,7 @@ static int hists__resort_cb(struct hist_entry *he, void *arg)
 	struct report *rep = arg;
 	struct symbol *sym = he->ms.sym;
 
-	if (rep->symbol_ipc && sym && !sym->annotate2) {
+	if (rep->symbol_ipc && sym && !symbol__is_annotate2(sym)) {
 		struct evsel *evsel = hists_to_evsel(he->hists);
 
 		symbol__annotate2(&he->ms, evsel, NULL);
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..7c874a258cb4 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2371,7 +2371,7 @@ static void save_task_callchain(struct perf_sched *sched,
 			if (!strcmp(sym->name, "schedule") ||
 			    !strcmp(sym->name, "__schedule") ||
 			    !strcmp(sym->name, "preempt_schedule"))
-				sym->ignore = 1;
+				symbol__set_ignore(sym, true);
 		}
 
 		callchain_cursor_advance(cursor);
@@ -3035,7 +3035,7 @@ static size_t callchain__fprintf_folded(FILE *fp, struct callchain_node *node)
 	list_for_each_entry(chain, &node->val, list) {
 		if (chain->ip >= PERF_CONTEXT_MAX)
 			continue;
-		if (chain->ms.sym && chain->ms.sym->ignore)
+		if (chain->ms.sym && symbol__ignore(chain->ms.sym))
 			continue;
 		ret += fprintf(fp, "%s%s", first ? "" : sep,
 			       callchain_list__sym_name(chain, bf, sizeof(bf),
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index f6eb543de537..9a0c388a7ec3 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -186,8 +186,8 @@ static void ui__warn_map_erange(struct map *map, struct symbol *sym, u64 ip)
 		    "Please report to linux-kernel@vger.kernel.org\n",
 		    ip, dso__long_name(dso), dso__symtab_origin(dso),
 		    map__start(map), map__end(map), sym->start, sym->end,
-		    sym->binding == STB_GLOBAL ? 'g' :
-		    sym->binding == STB_LOCAL  ? 'l' : 'w', sym->name,
+		    symbol__binding(sym) == STB_GLOBAL ? 'g' :
+		    symbol__binding(sym) == STB_LOCAL  ? 'l' : 'w', sym->name,
 		    err ? "[unknown]" : uts.machine,
 		    err ? "[unknown]" : uts.release, perf_version_string);
 	if (use_browser <= 0)
@@ -830,7 +830,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		}
 	}
 
-	if (al.sym == NULL || !al.sym->idle) {
+	if (al.sym == NULL || !symbol__is_idle(al.sym)) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
diff --git a/tools/perf/tests/symbols.c b/tools/perf/tests/symbols.c
index f4ffe5804f40..c09e04f36035 100644
--- a/tools/perf/tests/symbols.c
+++ b/tools/perf/tests/symbols.c
@@ -125,7 +125,7 @@ static int test_dso(struct dso *dso)
 	for (nd = rb_first_cached(dso__symbols(dso)); nd; nd = rb_next(nd)) {
 		struct symbol *sym = rb_entry(nd, struct symbol, rb_node);
 
-		if (sym->type != STT_FUNC && sym->type != STT_GNU_IFUNC)
+		if (symbol__type(sym) != STT_FUNC && symbol__type(sym) != STT_GNU_IFUNC)
 			continue;
 
 		/* Check for overlapping function symbols */
diff --git a/tools/perf/tests/vmlinux-kallsyms.c b/tools/perf/tests/vmlinux-kallsyms.c
index 524d46478364..7409abe4aa36 100644
--- a/tools/perf/tests/vmlinux-kallsyms.c
+++ b/tools/perf/tests/vmlinux-kallsyms.c
@@ -346,7 +346,7 @@ static int test__vmlinux_matches_kallsyms(struct test_suite *test __maybe_unused
 			 * such as __indirect_thunk_end.
 			 */
 			continue;
-		} else if (is_ignored_symbol(sym->name, sym->type)) {
+		} else if (is_ignored_symbol(sym->name, symbol__type(sym))) {
 			/*
 			 * Ignore hidden symbols, see scripts/kallsyms.c for the details
 			 */
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index ea17e6d29a7e..e220c4dfc881 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1185,7 +1185,7 @@ int __hist_entry__tui_annotate(struct hist_entry *he, struct map_symbol *ms,
 	if (dso__annotate_warned(dso))
 		return -1;
 
-	if (not_annotated || !sym->annotate2) {
+	if (not_annotated || !symbol__is_annotate2(sym)) {
 		err = symbol__annotate2(ms, evsel, &browser.arch);
 		if (err) {
 			annotate_browser__symbol_annotate_error(&browser, err);
diff --git a/tools/perf/ui/browsers/map.c b/tools/perf/ui/browsers/map.c
index c61ba3174a24..075a575cdc5d 100644
--- a/tools/perf/ui/browsers/map.c
+++ b/tools/perf/ui/browsers/map.c
@@ -32,8 +32,8 @@ static void map_browser__write(struct ui_browser *browser, void *nd, int row)
 	ui_browser__set_percent_color(browser, 0, current_entry);
 	ui_browser__printf(browser, "%*" PRIx64 " %*" PRIx64 " %c ",
 			   mb->addrlen, sym->start, mb->addrlen, sym->end,
-			   sym->binding == STB_GLOBAL ? 'g' :
-				sym->binding == STB_LOCAL  ? 'l' : 'w');
+			   symbol__binding(sym) == STB_GLOBAL ? 'g' :
+				symbol__binding(sym) == STB_LOCAL  ? 'l' : 'w');
 	width = browser->width - ((mb->addrlen * 2) + 4);
 	if (width > 0)
 		ui_browser__write_nstring(browser, sym->name, width);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e745f3034a0e..2ecb514888ba 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -235,7 +235,8 @@ static int __symbol__inc_addr_samples(struct map_symbol *ms,
 	h = annotated_source__histogram(src, evsel);
 	if (h == NULL) {
 		pr_debug("%s(%d): ENOMEM! sym->name=%s, start=%#" PRIx64 ", addr=%#" PRIx64 ", end=%#" PRIx64 ", func: %d\n",
-			 __func__, __LINE__, sym->name, sym->start, addr, sym->end, sym->type == STT_FUNC);
+			 __func__, __LINE__, sym->name, sym->start, addr, sym->end,
+			 symbol__type(sym) == STT_FUNC);
 		return -ENOMEM;
 	}
 
@@ -2224,7 +2225,7 @@ int symbol__annotate2(struct map_symbol *ms, struct evsel *evsel,
 
 	annotation__init_column_widths(notes, sym);
 	annotation__update_column_widths(notes);
-	sym->annotate2 = 1;
+	symbol__set_annotate2(sym, true);
 
 	return 0;
 }
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index a224687ffbc1..afcdefe95fee 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -2663,7 +2663,7 @@ static bool dso_sym_match(struct symbol *sym, const char *name, int *cnt,
 {
 	/* Same name, and global or the n'th found or any */
 	return !arch__compare_symbol_names(name, sym->name) &&
-	       ((!idx && sym->binding == STB_GLOBAL) ||
+	       ((!idx && symbol__binding(sym) == STB_GLOBAL) ||
 		(idx > 0 && ++*cnt == idx) ||
 		idx < 0);
 }
@@ -2681,8 +2681,8 @@ static void print_duplicate_syms(struct dso *dso, const char *sym_name)
 		if (dso_sym_match(sym, sym_name, &cnt, -1)) {
 			pr_err("#%d\t0x%"PRIx64"\t%c\t%s\n",
 			       ++cnt, sym->start,
-			       sym->binding == STB_GLOBAL ? 'g' :
-			       sym->binding == STB_LOCAL  ? 'l' : 'w',
+			       symbol__binding(sym) == STB_GLOBAL ? 'g' :
+			       symbol__binding(sym) == STB_LOCAL  ? 'l' : 'w',
 			       sym->name);
 			near = true;
 		} else if (near) {
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index f031cbbeeba8..9a107f42acdd 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -801,7 +801,7 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
 			 * symbol start. Otherwise do a faster comparison based
 			 * on the symbol start address.
 			 */
-			if (cnode->ms.sym->inlined || node->ms.sym->inlined) {
+			if (symbol__inlined(cnode->ms.sym) || symbol__inlined(node->ms.sym)) {
 				match = match_chain_strings(cnode->ms.sym->name,
 							    node->ms.sym->name);
 				if (match != MATCH_ERROR)
@@ -1245,7 +1245,7 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 	int printed;
 
 	if (cl->ms.sym) {
-		const char *inlined = cl->ms.sym->inlined ? " (inlined)" : "";
+		const char *inlined = symbol__inlined(cl->ms.sym) ? " (inlined)" : "";
 
 		if (show_srcline && cl->srcline)
 			printed = scnprintf(bf, bfsize, "%s %s%s",
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index dc31b5e7149e..e11e144af62b 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -56,7 +56,7 @@ static void al_to_d_al(struct addr_location *al, struct perf_dlfilter_al *d_al)
 			d_al->symoff = al->addr - map__start(al->map) - sym->start;
 		else
 			d_al->symoff = 0;
-		d_al->sym_binding = sym->binding;
+		d_al->sym_binding = symbol__binding(sym);
 	} else {
 		d_al->sym = NULL;
 		d_al->sym_start = 0;
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index 5521d00bff2c..0f7a25500a44 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -146,7 +146,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 			sym = node->ms.sym;
 			map = node->ms.map;
 
-			if (sym && sym->ignore && print_skip_ignored)
+			if (sym && symbol__ignore(sym) && print_skip_ignored)
 				goto next;
 
 			printed += fprintf(fp, "%-*.*s", left_alignment, left_alignment, " ");
@@ -182,7 +182,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 				addr_location__exit(&node_al);
 			}
 
-			if (print_dso && (!sym || !sym->inlined))
+			if (print_dso && (!sym || !symbol__inlined(sym)))
 				printed += map__fprintf_dsoname_dsoff(map, print_dsoff, addr, fp);
 
 			if (print_srcline) {
@@ -192,7 +192,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 					printed += map__fprintf_srcline(map, addr, "\n  ", fp);
 			}
 
-			if (sym && sym->inlined)
+			if (sym && symbol__inlined(sym))
 				printed += fprintf(fp, " (inlined)");
 
 			if (!print_oneline)
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index fc9eec8b54b8..6a405a9d829c 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2979,7 +2979,7 @@ static u64 intel_pt_switch_ip(struct intel_pt *pt, u64 *ptss_ip)
 	start = dso__first_symbol(map__dso(map));
 
 	for (sym = start; sym; sym = dso__next_symbol(sym)) {
-		if (sym->binding == STB_GLOBAL &&
+		if (symbol__binding(sym) == STB_GLOBAL &&
 		    !strcmp(sym->name, "__switch_to")) {
 			ip = map__unmap_ip(map, sym->start);
 			if (ip >= map__start(map) && ip < map__end(map)) {
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 6d32d3cb5cb7..7e38dde160b7 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1078,7 +1078,7 @@ static u64 find_entry_trampoline(struct dso *dso)
 	unsigned int i;
 
 	for (; sym; sym = dso__next_symbol(sym)) {
-		if (sym->binding != STB_GLOBAL)
+		if (symbol__binding(sym) != STB_GLOBAL)
 			continue;
 		for (i = 0; i < ARRAY_SIZE(syms); i++) {
 			if (!strcmp(sym->name, syms[i]))
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 34b4badd2c14..11ae4a09412c 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -416,7 +416,7 @@ static int find_alternative_probe_point(struct debuginfo *dinfo,
 	map__for_each_symbol_by_name(map, pp->function, sym, idx) {
 		if (uprobes) {
 			address = sym->start;
-			if (sym->type == STT_GNU_IFUNC)
+			if (symbol__type(sym) == STT_GNU_IFUNC)
 				pr_warning("Warning: The probe function (%s) is a GNU indirect function.\n"
 					   "Consider identifying the final function used at run time and set the probe directly on that.\n",
 					   pp->function);
@@ -3189,7 +3189,7 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
 	for (j = 0; j < num_matched_functions; j++) {
 		sym = syms[j];
 
-		if (sym->type != STT_FUNC)
+		if (symbol__type(sym) != STT_FUNC)
 			continue;
 
 		/* There can be duplicated symbols in the map */
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index e261a57b87d4..3e0490030ddd 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -304,7 +304,7 @@ static SV *perl_process_callchain(struct perf_sample *sample,
 			}
 			if (!hv_stores(sym, "start",   newSVuv(node->ms.sym->start)) ||
 			    !hv_stores(sym, "end",     newSVuv(node->ms.sym->end)) ||
-			    !hv_stores(sym, "binding", newSVuv(node->ms.sym->binding)) ||
+			    !hv_stores(sym, "binding", newSVuv(symbol__binding(node->ms.sym))) ||
 			    !hv_stores(sym, "name",    newSVpvn(node->ms.sym->name,
 								node->ms.sym->namelen)) ||
 			    !hv_stores(elem, "sym",    newRV_noinc((SV*)sym))) {
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 5a30caaec73e..9d62a0921aee 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -437,7 +437,7 @@ static PyObject *python_process_callchain(struct perf_sample *sample,
 			pydict_set_item_string_decref(pysym, "end",
 					PyLong_FromUnsignedLongLong(node->ms.sym->end));
 			pydict_set_item_string_decref(pysym, "binding",
-					_PyLong_FromLong(node->ms.sym->binding));
+					_PyLong_FromLong(symbol__binding(node->ms.sym)));
 			pydict_set_item_string_decref(pysym, "name",
 					_PyUnicode_FromStringAndSize(node->ms.sym->name,
 							node->ms.sym->namelen));
@@ -1275,7 +1275,7 @@ static int python_export_symbol(struct db_export *dbe, struct symbol *sym,
 	tuple_set_d64(t, 1, dso__db_id(dso));
 	tuple_set_d64(t, 2, sym->start);
 	tuple_set_d64(t, 3, sym->end);
-	tuple_set_s32(t, 4, sym->binding);
+	tuple_set_s32(t, 4, symbol__binding(sym));
 	tuple_set_string(t, 5, sym->name);
 
 	call_object(tables->symbol_handler, t, "symbol_table");
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 90bc4a31bb55..005e7d85dc4a 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -469,7 +469,7 @@ int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r)
 	if (sym_l == sym_r)
 		return 0;
 
-	if (sym_l->inlined || sym_r->inlined) {
+	if (symbol__inlined(sym_l) || symbol__inlined(sym_r)) {
 		int ret = strcmp(sym_l->name, sym_r->name);
 
 		if (ret)
@@ -536,7 +536,7 @@ static int _hist_entry__sym_snprintf(struct map_symbol *ms,
 
 	ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
 	if (sym && map) {
-		if (sym->type == STT_OBJECT) {
+		if (symbol__type(sym) == STT_OBJECT) {
 			ret += repsep_snprintf(bf + ret, size - ret, "%s", sym->name);
 			ret += repsep_snprintf(bf + ret, size - ret, "+0x%llx",
 					ip - map__unmap_ip(map, sym->start));
@@ -544,7 +544,7 @@ static int _hist_entry__sym_snprintf(struct map_symbol *ms,
 			ret += repsep_snprintf(bf + ret, size - ret, "%.*s",
 					       width - ret,
 					       sym->name);
-			if (sym->inlined)
+			if (symbol__inlined(sym))
 				ret += repsep_snprintf(bf + ret, size - ret,
 						       " (inlined)");
 		}
@@ -1483,7 +1483,7 @@ static int _hist_entry__addr_snprintf(struct map_symbol *ms,
 
 	ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
 	if (sym && map) {
-		if (sym->type == STT_OBJECT) {
+		if (symbol__type(sym) == STT_OBJECT) {
 			ret += repsep_snprintf(bf + ret, size - ret, "%s", sym->name);
 			ret += repsep_snprintf(bf + ret, size - ret, "+0x%llx",
 					ip - map__unmap_ip(map, sym->start));
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index db164d258163..877d4889cd0d 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -113,16 +113,16 @@ struct symbol *new_inline_sym(struct dso *dso,
 		/* ensure that we don't alias an inlined symbol, which could
 		 * lead to double frees in inline_node__delete
 		 */
-		assert(!base_sym->inlined);
+		assert(!symbol__inlined(base_sym));
 	} else {
 		/* create a fake symbol for the inline frame */
 		inline_sym = symbol__new(base_sym ? base_sym->start : 0,
 					 base_sym ? (base_sym->end - base_sym->start) : 0,
-					 base_sym ? base_sym->binding : 0,
-					 base_sym ? base_sym->type : 0,
+					 base_sym ? symbol__binding(base_sym) : 0,
+					 base_sym ? symbol__type(base_sym) : 0,
 					 funcname);
 		if (inline_sym)
-			inline_sym->inlined = 1;
+			symbol__set_inlined(inline_sym, true);
 	}
 
 	free(demangled);
@@ -437,7 +437,7 @@ void inline_node__delete(struct inline_node *node)
 		list_del_init(&ilist->list);
 		zfree_srcline(&ilist->srcline);
 		/* only the inlined symbols are owned by the list */
-		if (ilist->symbol && ilist->symbol->inlined)
+		if (ilist->symbol && symbol__inlined(ilist->symbol))
 			symbol__delete(ilist->symbol);
 		free(ilist);
 	}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 7afa8a117139..a9045d6fcb95 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -350,7 +350,8 @@ static bool get_ifunc_name(Elf *elf, struct dso *dso, GElf_Ehdr *ehdr,
 	sym = dso__find_symbol_nocache(dso, addr);
 
 	/* Expecting the address to be an IFUNC or IFUNC alias */
-	if (!sym || sym->start != addr || (sym->type != STT_GNU_IFUNC && !sym->ifunc_alias))
+	if (!sym || sym->start != addr ||
+	    (symbol__type(sym) != STT_GNU_IFUNC && !symbol__ifunc_alias(sym)))
 		return false;
 
 	snprintf(buf, buf_sz, "%s@plt", sym->name);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index fabed5b0fa57..4702b8989354 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -50,7 +50,7 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__is_idle(const char *name);
+static bool symbol__compute_is_idle(const char *name);
 
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -163,24 +163,24 @@ static int choose_best_symbol(struct symbol *syma, struct symbol *symb)
 	else if ((a == 0) && (b > 0))
 		return SYMBOL_B;
 
-	if (syma->type != symb->type) {
-		if (syma->type == STT_NOTYPE)
+	if (symbol__type(syma) != symbol__type(symb)) {
+		if (symbol__type(syma) == STT_NOTYPE)
 			return SYMBOL_B;
-		if (symb->type == STT_NOTYPE)
+		if (symbol__type(symb) == STT_NOTYPE)
 			return SYMBOL_A;
 	}
 
 	/* Prefer a non weak symbol over a weak one */
-	a = syma->binding == STB_WEAK;
-	b = symb->binding == STB_WEAK;
+	a = symbol__binding(syma) == STB_WEAK;
+	b = symbol__binding(symb) == STB_WEAK;
 	if (b && !a)
 		return SYMBOL_A;
 	if (a && !b)
 		return SYMBOL_B;
 
 	/* Prefer a global symbol over a non global one */
-	a = syma->binding == STB_GLOBAL;
-	b = symb->binding == STB_GLOBAL;
+	a = symbol__binding(syma) == STB_GLOBAL;
+	b = symbol__binding(symb) == STB_GLOBAL;
 	if (a && !b)
 		return SYMBOL_A;
 	if (b && !a)
@@ -227,14 +227,14 @@ void symbols__fixup_duplicate(struct rb_root_cached *symbols)
 			continue;
 
 		if (choose_best_symbol(curr, next) == SYMBOL_A) {
-			if (next->type == STT_GNU_IFUNC)
-				curr->ifunc_alias = true;
+			if (symbol__type(next) == STT_GNU_IFUNC)
+				symbol__set_ifunc_alias(curr, true);
 			rb_erase_cached(&next->rb_node, symbols);
 			symbol__delete(next);
 			goto again;
 		} else {
-			if (curr->type == STT_GNU_IFUNC)
-				next->ifunc_alias = true;
+			if (symbol__type(curr) == STT_GNU_IFUNC)
+				symbol__set_ifunc_alias(next, true);
 			nd = rb_next(&curr->rb_node);
 			rb_erase_cached(&curr->rb_node, symbols);
 			symbol__delete(curr);
@@ -322,8 +322,8 @@ struct symbol *symbol__new(u64 start, u64 len, u8 binding, u8 type, const char *
 
 	sym->start   = start;
 	sym->end     = len ? start + len : start;
-	sym->type    = type;
-	sym->binding = binding;
+	atomic_init(&sym->flags, (type << SYMBOL_FLAG_TYPE_SHIFT) |
+				 (binding << SYMBOL_FLAG_BINDING_SHIFT));
 	sym->namelen = namelen - 1;
 
 	pr_debug4("%s: %s %#" PRIx64 "-%#" PRIx64 "\n",
@@ -345,6 +345,49 @@ void symbol__delete(struct symbol *sym)
 	free(((void *)sym) - symbol_conf.priv_size);
 }
 
+void symbol__set_ignore(struct symbol *sym, bool ignore)
+{
+	if (ignore)
+		atomic_fetch_or(&sym->flags, SYMBOL_FLAG_IGNORE);
+	else
+		atomic_fetch_and(&sym->flags, ~SYMBOL_FLAG_IGNORE);
+}
+
+void symbol__set_annotate2(struct symbol *sym, bool annotate2)
+{
+	if (annotate2)
+		atomic_fetch_or(&sym->flags, SYMBOL_FLAG_ANNOTATE2);
+	else
+		atomic_fetch_and(&sym->flags, ~SYMBOL_FLAG_ANNOTATE2);
+}
+
+void symbol__set_inlined(struct symbol *sym, bool inlined)
+{
+	if (inlined)
+		atomic_fetch_or(&sym->flags, SYMBOL_FLAG_INLINED);
+	else
+		atomic_fetch_and(&sym->flags, ~SYMBOL_FLAG_INLINED);
+}
+
+void symbol__set_ifunc_alias(struct symbol *sym, bool ifunc_alias)
+{
+	if (ifunc_alias)
+		atomic_fetch_or(&sym->flags, SYMBOL_FLAG_IFUNC_ALIAS);
+	else
+		atomic_fetch_and(&sym->flags, ~SYMBOL_FLAG_IFUNC_ALIAS);
+}
+
+static void symbol__set_idle(struct symbol *sym, bool idle)
+{
+	uint16_t old_flags = atomic_load(&sym->flags);
+	uint16_t new_flags;
+	uint16_t idle_val = idle ? SYMBOL_IDLE__IDLE : SYMBOL_IDLE__NOT_IDLE;
+
+	do {
+		new_flags = old_flags & ~SYMBOL_FLAG_IDLE_MASK;
+		new_flags |= (idle_val << SYMBOL_FLAG_IDLE_SHIFT);
+	} while (!atomic_compare_exchange_weak(&sym->flags, &old_flags, new_flags));
+}
 void symbols__delete(struct rb_root_cached *symbols)
 {
 	struct symbol *pos;
@@ -375,7 +418,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 		 */
 		if (name[0] == '.')
 			name++;
-		sym->idle = symbol__is_idle(name);
+		symbol__set_idle(sym, symbol__compute_is_idle(name));
 	}
 
 	while (*p != NULL) {
@@ -717,11 +760,19 @@ int modules__parse(const char *filename, void *arg,
 	return err;
 }
 
+bool symbol__is_idle(const struct symbol *sym)
+{
+	uint16_t flags = atomic_load(&sym->flags);
+	uint16_t idle_val = (flags & SYMBOL_FLAG_IDLE_MASK) >> SYMBOL_FLAG_IDLE_SHIFT;
+
+	return idle_val == SYMBOL_IDLE__IDLE;
+}
+
 /*
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__is_idle(const char *name)
+static bool symbol__compute_is_idle(const char *name)
 {
 	const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
@@ -2492,6 +2543,7 @@ void symbol__exit(void)
 {
 	if (!symbol_conf.initialized)
 		return;
+
 	strlist__delete(symbol_conf.bt_stop_list);
 	strlist__delete(symbol_conf.sym_list);
 	strlist__delete(symbol_conf.dso_list);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index bd6eb90c8668..a199646f21f7 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -6,6 +6,7 @@
 #include <linux/refcount.h>
 #include <stdbool.h>
 #include <stdint.h>
+#include <stdatomic.h>
 #include <linux/list.h>
 #include <linux/rbtree.h>
 #include <stdio.h>
@@ -43,6 +44,23 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
 			     GElf_Shdr *shp, const char *name, size_t *idx);
 #endif
 
+enum symbol_idle_kind {
+	SYMBOL_IDLE__UNKNOWN = 0,
+	SYMBOL_IDLE__NOT_IDLE = 1,
+	SYMBOL_IDLE__IDLE = 2,
+};
+
+#define SYMBOL_FLAG_TYPE_SHIFT      0
+#define SYMBOL_FLAG_TYPE_MASK       (0xF << SYMBOL_FLAG_TYPE_SHIFT)
+#define SYMBOL_FLAG_BINDING_SHIFT   4
+#define SYMBOL_FLAG_BINDING_MASK    (0xF << SYMBOL_FLAG_BINDING_SHIFT)
+#define SYMBOL_FLAG_IDLE_SHIFT      8
+#define SYMBOL_FLAG_IDLE_MASK       (0x3 << SYMBOL_FLAG_IDLE_SHIFT)
+#define SYMBOL_FLAG_IGNORE          (1 << 10)
+#define SYMBOL_FLAG_INLINED         (1 << 11)
+#define SYMBOL_FLAG_ANNOTATE2       (1 << 12)
+#define SYMBOL_FLAG_IFUNC_ALIAS     (1 << 13)
+
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -54,20 +72,7 @@ struct symbol {
 	u64		end;
 	/** Length of the string name. */
 	u16		namelen;
-	/** ELF symbol type as defined for st_info. E.g STT_OBJECT or STT_FUNC. */
-	u8		type:4;
-	/** ELF binding type as defined for st_info. E.g. STB_WEAK or STB_GLOBAL. */
-	u8		binding:4;
-	/** Set true for kernel symbols of idle routines. */
-	u8		idle:1;
-	/** Resolvable but tools ignore it (e.g. idle routines). */
-	u8		ignore:1;
-	/** Symbol for an inlined function. */
-	u8		inlined:1;
-	/** Has symbol__annotate2 been performed. */
-	u8		annotate2:1;
-	/** Symbol is an alias of an STT_GNU_IFUNC */
-	u8		ifunc_alias:1;
+	_Atomic uint16_t flags;
 	/** Architecture specific. Unused except on PPC where it holds st_other. */
 	u8		arch_sym;
 	/** The name of length namelen associated with the symbol. */
@@ -77,6 +82,43 @@ struct symbol {
 void symbol__delete(struct symbol *sym);
 void symbols__delete(struct rb_root_cached *symbols);
 
+static inline u8 symbol__type(const struct symbol *sym)
+{
+	return (atomic_load(&sym->flags) & SYMBOL_FLAG_TYPE_MASK) >> SYMBOL_FLAG_TYPE_SHIFT;
+}
+
+static inline u8 symbol__binding(const struct symbol *sym)
+{
+	return (atomic_load(&sym->flags) & SYMBOL_FLAG_BINDING_MASK) >> SYMBOL_FLAG_BINDING_SHIFT;
+}
+
+static inline bool symbol__ignore(const struct symbol *sym)
+{
+	return (atomic_load(&sym->flags) & SYMBOL_FLAG_IGNORE) != 0;
+}
+
+static inline bool symbol__inlined(const struct symbol *sym)
+{
+	return (atomic_load(&sym->flags) & SYMBOL_FLAG_INLINED) != 0;
+}
+
+static inline bool symbol__is_annotate2(const struct symbol *sym)
+{
+	return (atomic_load(&sym->flags) & SYMBOL_FLAG_ANNOTATE2) != 0;
+}
+
+static inline bool symbol__ifunc_alias(const struct symbol *sym)
+{
+	return (atomic_load(&sym->flags) & SYMBOL_FLAG_IFUNC_ALIAS) != 0;
+}
+
+bool symbol__is_idle(const struct symbol *sym);
+
+void symbol__set_ignore(struct symbol *sym, bool ignore);
+void symbol__set_annotate2(struct symbol *sym, bool annotate2);
+void symbol__set_inlined(struct symbol *sym, bool inlined);
+void symbol__set_ifunc_alias(struct symbol *sym, bool ifunc_alias);
+
 /* symbols__for_each_entry - iterate over symbols (rb_root)
  *
  * @symbols: the rb_root of symbols
diff --git a/tools/perf/util/symbol_fprintf.c b/tools/perf/util/symbol_fprintf.c
index 53e1af4ed9ac..4dc8d5761f52 100644
--- a/tools/perf/util/symbol_fprintf.c
+++ b/tools/perf/util/symbol_fprintf.c
@@ -11,8 +11,8 @@ size_t symbol__fprintf(struct symbol *sym, FILE *fp)
 {
 	return fprintf(fp, " %" PRIx64 "-%" PRIx64 " %c %s\n",
 		       sym->start, sym->end,
-		       sym->binding == STB_GLOBAL ? 'g' :
-		       sym->binding == STB_LOCAL  ? 'l' : 'w',
+		       symbol__binding(sym) == STB_GLOBAL ? 'g' :
+		       symbol__binding(sym) == STB_LOCAL  ? 'l' : 'w',
 		       sym->name);
 }
 
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v9 18/18] perf symbol: Lazily compute idle
  2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
                                               ` (16 preceding siblings ...)
  2026-05-03  0:22                             ` [PATCH v9 17/18] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
@ 2026-05-03  0:22                             ` Ian Rogers
  17 siblings, 0 replies; 80+ messages in thread
From: Ian Rogers @ 2026-05-03  0:22 UTC (permalink / raw)
  To: irogers, acme, namhyung, tmricht
  Cc: agordeev, gor, hca, jameshongleiwang, japo, linux-kernel,
	linux-perf-users, linux-s390, sumanthk

Switch from an idle boolean to a helper symbol__is_idle function. In
the function lazily compute whether a symbol is an idle function
taking into consideration the kernel version and architecture of the
machine. As symbols__insert no longer needs to know if a symbol is for
the kernel, remove the argument.
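
As an illustration only, the lazy scheme described above can be sketched
outside of perf as a tri-state value cached in an atomic flags word. All
names below (demo_symbol, demo_is_idle, IDLE_MASK, ...) are hypothetical
stand-ins and not the identifiers used in the patch, and the classifier
is a placeholder for the real name, architecture and version checks:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tri-state: 0 means "not computed yet". */
enum idle_kind { IDLE_UNKNOWN = 0, IDLE_NO = 1, IDLE_YES = 2 };
#define IDLE_SHIFT 0
#define IDLE_MASK  (3u << IDLE_SHIFT)

struct demo_symbol {
	_Atomic uint16_t flags;
	const char *name;
};

/* Counts classifier runs so the caching effect is observable. */
static int demo_classify_calls;

/* Placeholder classifier; the real one also consults arch and release. */
static bool demo_classify(const char *name)
{
	demo_classify_calls++;
	return strcmp(name, "default_idle") == 0;
}

/* Update only the idle bits, leaving other flag bits untouched. */
static void demo_set_idle(struct demo_symbol *sym, bool idle)
{
	uint16_t old = atomic_load(&sym->flags);
	uint16_t next;

	do {
		next = (uint16_t)((old & ~IDLE_MASK) |
				  ((idle ? IDLE_YES : IDLE_NO) << IDLE_SHIFT));
	} while (!atomic_compare_exchange_weak(&sym->flags, &old, next));
}

/* First call computes and caches; later calls only read the flags. */
static bool demo_is_idle(struct demo_symbol *sym)
{
	uint16_t kind = (atomic_load(&sym->flags) & IDLE_MASK) >> IDLE_SHIFT;
	bool idle;

	if (kind != IDLE_UNKNOWN)
		return kind == IDLE_YES;

	idle = demo_classify(sym->name);
	demo_set_idle(sym, idle);
	return idle;
}
```

Two threads racing on the first call may both run the classifier, but
they store the same answer, so the cached value stays consistent.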

This change is inspired by mailing list discussion, particularly from
Thomas Richter <tmricht@linux.ibm.com> and Heiko Carstens
<hca@linux.ibm.com>:
https://lore.kernel.org/lkml/20260219113850.354271-1-tmricht@linux.ibm.com/
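
For the s390 case from that discussion, the kernel-version gate can be
sketched as a small standalone helper. This is a sketch under
assumptions: demo_release_before() is a hypothetical name, and it takes
the release string as a plain argument where the patch reads it from the
perf_env:

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true if a uname-style release string such as "6.9.12-foo" is
 * older than want_major.want_minor. Unparsable or missing strings are
 * treated as "not older", so the caller falls through to non-idle.
 */
static bool demo_release_before(const char *release, int want_major,
				int want_minor)
{
	int major = 0, minor = 0;

	if (!release || sscanf(release, "%d.%d", &major, &minor) != 2)
		return false;

	return major < want_major ||
	       (major == want_major && minor < want_minor);
}
```

With a cutoff of 6.10, as in the comment in the diff below, "6.9.12"
and "5.15.0-91" count as old kernels while "6.10.0" does not.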

The change switches x86 matches to use strstarts which means
intel_idle_irq is matched as part of strstarts(name, "intel_idle"), a
change suggested by Honglei Wang <jameshongleiwang@126.com> in:
https://lore.kernel.org/lkml/20260323085255.98173-1-jameshongleiwang@126.com/
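
A minimal sketch of the prefix matching, using a local stand-in for
strstarts() so that it builds outside the tools/ tree (demo_strstarts
and demo_x86_idle_name are hypothetical names):

```c
#include <stdbool.h>
#include <string.h>

/* Local stand-in for strstarts(): true if str begins with prefix. */
static bool demo_strstarts(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* x86-only check: one prefix covers all the former exact-match entries. */
static bool demo_x86_idle_name(const char *name)
{
	return demo_strstarts(name, "mwait_idle") ||
	       demo_strstarts(name, "intel_idle");
}
```

Here "intel_idle_irq", "intel_idle_ibrs" and
"mwait_idle_with_hints.constprop.0" all match without needing their own
table entries. The trade-off is that any other kernel symbol sharing one
of these prefixes is also treated as idle.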

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-top.c     |   3 +-
 tools/perf/util/symbol-elf.c |   2 +-
 tools/perf/util/symbol.c     | 114 +++++++++++++++++++++--------------
 tools/perf/util/symbol.h     |   8 +--
 4 files changed, 74 insertions(+), 53 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9a0c388a7ec3..efb4b1172190 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -830,7 +830,8 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		}
 	}
 
-	if (al.sym == NULL || !symbol__is_idle(al.sym)) {
+	if (al.sym == NULL ||
+	    !symbol__is_idle(al.sym, al.map ? map__dso(al.map) : NULL, machine->env)) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index a9045d6fcb95..69484abc07b6 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1728,7 +1728,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
+		__symbols__insert(dso__symbols(curr_dso), f);
 		nr++;
 	}
 	dso__put(curr_dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 4702b8989354..2caa6b8b8609 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -50,7 +50,6 @@
 
 static int dso__load_kernel_sym(struct dso *dso, struct map *map);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
-static bool symbol__compute_is_idle(const char *name);
 
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -401,8 +400,7 @@ void symbols__delete(struct rb_root_cached *symbols)
 	}
 }
 
-void __symbols__insert(struct rb_root_cached *symbols,
-		       struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
@@ -410,17 +408,6 @@ void __symbols__insert(struct rb_root_cached *symbols,
 	struct symbol *s;
 	bool leftmost = true;
 
-	if (kernel) {
-		const char *name = sym->name;
-		/*
-		 * ppc64 uses function descriptors and appends a '.' to the
-		 * start of every instruction address. Remove it.
-		 */
-		if (name[0] == '.')
-			name++;
-		symbol__set_idle(sym, symbol__compute_is_idle(name));
-	}
-
 	while (*p != NULL) {
 		parent = *p;
 		s = rb_entry(parent, struct symbol, rb_node);
@@ -437,7 +424,7 @@ void __symbols__insert(struct rb_root_cached *symbols,
 
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	__symbols__insert(symbols, sym, false);
+	__symbols__insert(symbols, sym);
 }
 
 static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
@@ -598,7 +585,7 @@ void dso__reset_find_symbol_cache(struct dso *dso)
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
+	__symbols__insert(dso__symbols(dso), sym);
 
 	/* update the symbol cache if necessary */
 	if (dso__last_find_result_addr(dso) >= sym->start &&
@@ -760,55 +747,90 @@ int modules__parse(const char *filename, void *arg,
 	return err;
 }
 
-bool symbol__is_idle(const struct symbol *sym)
-{
-	uint16_t flags = atomic_load(&sym->flags);
-	uint16_t idle_val = (flags & SYMBOL_FLAG_IDLE_MASK) >> SYMBOL_FLAG_IDLE_SHIFT;
-
-	return idle_val == SYMBOL_IDLE__IDLE;
-}
-
 /*
  * These are symbols in the kernel image, so make sure that
  * sym is from a kernel DSO.
  */
-static bool symbol__compute_is_idle(const char *name)
+static int sym_name_cmp(const void *a, const void *b)
+{
+	const char *name = a;
+	const char *const *sym = b;
+
+	return strcmp(name, *sym);
+}
+
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env)
 {
-	const char * const idle_symbols[] = {
+	static const char * const idle_symbols[] = {
 		"acpi_idle_do_entry",
 		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
-		"idle_cpu",
-		"intel_idle",
-		"intel_idle_ibrs",
 		"default_idle",
-		"native_safe_halt",
 		"enter_idle",
 		"exit_idle",
-		"mwait_idle",
-		"mwait_idle_with_hints",
-		"mwait_idle_with_hints.constprop.0",
+		"idle_cpu",
+		"native_safe_halt",
 		"poll_idle",
-		"ppc64_runlatch_off",
 		"pseries_dedicated_idle_sleep",
-		"psw_idle",
-		"psw_idle_exit",
-		NULL
 	};
-	int i;
-	static struct strlist *idle_symbols_list;
+	const char *name = sym->name;
+	uint16_t e_machine = perf_env__e_machine(env, /*e_flags=*/NULL);
+
+	{
+		uint16_t flags = atomic_load(&sym->flags);
+		uint16_t idle_val = (flags & SYMBOL_FLAG_IDLE_MASK) >> SYMBOL_FLAG_IDLE_SHIFT;
+
+		if (idle_val != SYMBOL_IDLE__UNKNOWN)
+			return idle_val == SYMBOL_IDLE__IDLE;
+	}
+
+	if (!dso || dso__kernel(dso) == DSO_SPACE__USER) {
+		symbol__set_idle(sym, /*idle=*/false);
+		return false;
+	}
+
+	/*
+	 * ppc64 uses function descriptors and appends a '.' to the
+	 * start of every instruction address. Remove it.
+	 */
+	if (name[0] == '.')
+		name++;
+
+	if (bsearch(name, idle_symbols, ARRAY_SIZE(idle_symbols),
+		    sizeof(idle_symbols[0]), sym_name_cmp)) {
+		symbol__set_idle(sym, /*idle=*/true);
+		return true;
+	}
 
-	if (idle_symbols_list)
-		return strlist__has_entry(idle_symbols_list, name);
+	if (e_machine == EM_386 || e_machine == EM_X86_64) {
+		if (strstarts(name, "mwait_idle") ||
+		    strstarts(name, "intel_idle")) {
+			symbol__set_idle(sym, /*idle=*/true);
+			return true;
+		}
+	}
 
-	idle_symbols_list = strlist__new(NULL, NULL);
+	if (e_machine == EM_PPC64 && !strcmp(name, "ppc64_runlatch_off")) {
+		symbol__set_idle(sym, /*idle=*/true);
+		return true;
+	}
 
-	for (i = 0; idle_symbols[i]; i++)
-		strlist__add(idle_symbols_list, idle_symbols[i]);
+	if (e_machine == EM_S390 && strstarts(name, "psw_idle")) {
+		int major = 0, minor = 0;
+		const char *release = perf_env__os_release(env);
 
-	return strlist__has_entry(idle_symbols_list, name);
+		/* Before v6.10, s390 used psw_idle. */
+		if (release && sscanf(release, "%d.%d", &major, &minor) == 2 &&
+		    (major < 6 || (major == 6 && minor < 10))) {
+			symbol__set_idle(sym, /*idle=*/true);
+			return true;
+		}
+	}
+
+	symbol__set_idle(sym, /*idle=*/false);
+	return false;
 }
 
 static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -837,7 +859,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 	 * We will pass the symbols to the filter later, in
 	 * map__split_kallsyms, when we have split the maps per module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym);
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index a199646f21f7..422e98a4ea2f 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -28,6 +28,7 @@ struct map;
 struct maps;
 struct option;
 struct build_id;
+struct perf_env;
 
 /*
  * libelf 0.8.x and earlier do not support ELF_C_READ_MMAP;
@@ -60,7 +61,6 @@ enum symbol_idle_kind {
 #define SYMBOL_FLAG_INLINED         (1 << 11)
 #define SYMBOL_FLAG_ANNOTATE2       (1 << 12)
 #define SYMBOL_FLAG_IFUNC_ALIAS     (1 << 13)
-
 /**
  * A symtab entry. When allocated this may be preceded by an annotation (see
  * symbol__annotation) and/or a browser_index (see symbol__browser_index).
@@ -112,7 +112,7 @@ static inline bool symbol__ifunc_alias(const struct symbol *sym)
 	return (atomic_load(&sym->flags) & SYMBOL_FLAG_IFUNC_ALIAS) != 0;
 }
 
-bool symbol__is_idle(const struct symbol *sym);
+bool symbol__is_idle(struct symbol *sym, const struct dso *dso, struct perf_env *env);
 
 void symbol__set_ignore(struct symbol *sym, bool ignore);
 void symbol__set_annotate2(struct symbol *sym, bool annotate2);
@@ -196,7 +196,6 @@ int filename__read_debuglink(const char *filename, char *debuglink,
 			     size_t size);
 bool filename__has_section(const char *filename, const char *sec);
 
-struct perf_env;
 int symbol__init(struct perf_env *env);
 void symbol__exit(void);
 void symbol__elf_init(void);
@@ -236,8 +235,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
-		       bool kernel);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
 void symbols__fixup_duplicate(struct rb_root_cached *symbols);
 void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms);
-- 
2.54.0.545.g6539524ca2-goog



* Re: [PATCH v9 01/18] perf env: Add perf_env__e_machine helper and use in perf_env__arch
  2026-05-03  0:22                             ` [PATCH v9 01/18] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
@ 2026-05-04  1:35                               ` Namhyung Kim
  0 siblings, 0 replies; 80+ messages in thread
From: Namhyung Kim @ 2026-05-04  1:35 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, tmricht, agordeev, gor, hca, jameshongleiwang, japo,
	linux-kernel, linux-perf-users, linux-s390, sumanthk

On Sat, May 02, 2026 at 05:22:31PM -0700, Ian Rogers wrote:
> Add a helper that lazily computes the e_machine and falls back to
> EM_HOST. Use the perf_env's arch to compute the e_machine if
> available. Use a binary search for some efficiency in this, but handle
> somewhat complex duplicate rules. Switch perf_env__arch to be derived
> the e_machine for consistency. This switches arch from being uname
> derived to matching that of the perf binary (via EM_HOST). Update
> session to use the helper, which may mean using EM_HOST when no
> threads are available. This also updates the perf data file header
> that gets the e_machine/e_flags from the session.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
[SNIP]
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index f30e48eb3fc3..f1ae61392cce 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -379,21 +379,28 @@ static int write_osrelease(struct feat_fd *ff,
>  	return do_write_string(ff, uts.release);
>  }
>  
> -static int write_arch(struct feat_fd *ff,
> -		      struct evlist *evlist __maybe_unused)
> +static int write_arch(struct feat_fd *ff, struct evlist *evlist)
>  {
>  	struct utsname uts;
> -	int ret;
> +	const char *arch = NULL;
>  
> -	ret = uname(&uts);
> -	if (ret < 0)
> -		return -1;
> +	if (evlist->session) {
> +		/* Force the computation in the perf_env of the e_machine of the threads. */
> +		perf_session__e_machine(evlist->session, /*e_flags=*/NULL);
> +		arch = perf_env__arch(perf_session__env(evlist->session));
> +	}
>  
> -	return do_write_string(ff, uts.machine);
> +	if (!arch) {
> +		int ret = uname(&uts);
> +
> +		if (ret < 0)
> +			return -1;
> +		arch = uts.machine;
> +	}
> +	return do_write_string(ff, arch);
>  }
>  
> -static int write_e_machine(struct feat_fd *ff,
> -			   struct evlist *evlist __maybe_unused)
> +static int write_e_machine(struct feat_fd *ff, struct evlist *evlist)
>  {
>  	/* e_machine expanded from 16 to 32-bits for alignment. */
>  	uint32_t e_flags;
> @@ -2684,10 +2691,18 @@ static int process_##__feat(struct feat_fd *ff, void *data __maybe_unused) \
>  FEAT_PROCESS_STR_FUN(hostname, hostname);
>  FEAT_PROCESS_STR_FUN(osrelease, os_release);
>  FEAT_PROCESS_STR_FUN(version, version);
> -FEAT_PROCESS_STR_FUN(arch, arch);
>  FEAT_PROCESS_STR_FUN(cpudesc, cpu_desc);
>  FEAT_PROCESS_STR_FUN(cpuid, cpuid);
>  
> +static int process_arch(struct feat_fd *ff, void *data __maybe_unused)
> +{
> +	free(ff->ph->env.arch);
> +	ff->ph->env.arch = do_read_string(ff);
> +	if (!ff->ph->env.arch)
> +		return -ENOMEM;
> +	return 0;
> +}

Isn't it the same as FEAT_PROCESS_STR_FUN()?


> +
>  static int process_e_machine(struct feat_fd *ff, void *data __maybe_unused)
>  {
>  	int ret;
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index fe0de2a0277f..3e64db2d27c2 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -3023,14 +3023,19 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
>  		return EM_HOST;
>  	}
>  
> +	/*
> +	 * Is the env caching an e_machine? If not we want to compute from the
> +	 * more accurate threads.
> +	 */
>  	env = perf_session__env(session);
> -	if (env && env->e_machine != EM_NONE) {
> -		if (e_flags)
> -			*e_flags = env->e_flags;
> -
> -		return env->e_machine;
> -	}
> +	if (env && env->e_machine != EM_NONE)
> +		return perf_env__e_machine(env, e_flags);
>  
> +	/*
> +	 * Compute from threads, note this is more accurate than
> +	 * perf_env__e_machine that falls back on EM_HOST and doesn't consider
> +	 * mixed 32-bit and 64-bit threads.
> +	 */

I'm curious whether it's always better.  If EM_HOST is 64-bit and the
first thread in a session happens to be 32-bit, then the resulting
e_machine would be 32-bit, right?  Is that what we want?

Thanks,
Namhyung


>  	machines__for_each_thread(&session->machines,
>  				  perf_session__e_machine_cb,
>  				  &args);
> @@ -3048,10 +3053,9 @@ uint16_t perf_session__e_machine(struct perf_session *session, uint32_t *e_flags
>  
>  	/*
>  	 * Couldn't determine from the perf_env or current set of
> -	 * threads. Default to the host.
> +	 * threads. Potentially use logic that uses the arch string otherwise
> +	 * default to the host. Don't cache in the perf_env in case later
> +	 * threads indicate a better ELF machine type.
>  	 */
> -	if (e_flags)
> -		*e_flags = EF_HOST;
> -
> -	return EM_HOST;
> +	return perf_env__e_machine_nocache(env, e_flags);
>  }
> -- 
> 2.54.0.545.g6539524ca2-goog
> 


Thread overview: 80+ messages
2026-02-19 11:38 [PATCH v2] perf symbol: Remove psw_idle() from list of idle symbols Thomas Richter
2026-02-19 11:55 ` Jan Polensky
2026-02-23 21:46 ` Namhyung Kim
2026-02-23 23:14   ` Arnaldo Melo
2026-03-02 18:43   ` Arnaldo Carvalho de Melo
2026-03-02 19:44     ` Ian Rogers
2026-03-04 14:34       ` Arnaldo Carvalho de Melo
2026-03-02 23:43 ` [PATCH v1] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
2026-03-24 17:14   ` Ian Rogers
2026-03-25  6:58     ` Namhyung Kim
2026-03-25 15:58       ` Ian Rogers
2026-03-25 16:18   ` [PATCH v2] " Ian Rogers
2026-03-26  7:20     ` Honglei Wang
2026-03-26 15:11       ` Ian Rogers
2026-03-26 17:45         ` [PATCH v3 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
2026-03-26 17:45           ` [PATCH v3 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
2026-03-26 17:45           ` [PATCH v3 2/2] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
2026-03-27  6:56             ` Honglei Wang
2026-03-27  4:50           ` [PATCH v4 0/2] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
2026-03-27  4:50             ` [PATCH v4 1/2] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
2026-04-06  5:05               ` Namhyung Kim
2026-04-06 15:36                 ` Ian Rogers
2026-03-27  4:50             ` [PATCH v4 2/2] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
2026-04-06  5:10               ` Namhyung Kim
2026-04-06 16:11                 ` Ian Rogers
2026-04-06 17:09                   ` [PATCH v5 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
2026-04-06 17:09                     ` [PATCH v5 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
2026-04-06 17:09                     ` [PATCH v5 2/3] perf env: Add helper to lazily compute the os_release Ian Rogers
2026-04-06 17:09                     ` [PATCH v5 3/3] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
2026-04-09 23:06                     ` [PATCH v6 0/3] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
2026-04-09 23:06                       ` [PATCH v6 1/3] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
2026-05-01 18:20                         ` [PATCH v7 0/4] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
2026-05-01 18:20                           ` [PATCH v7 1/4] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
2026-05-01 18:20                           ` [PATCH v7 2/4] perf env: Add helper to lazily compute the os_release Ian Rogers
2026-05-01 18:20                           ` [PATCH v7 3/4] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
2026-05-01 18:20                           ` [PATCH v7 4/4] perf symbol: Lazily compute idle and use a global lock for updates Ian Rogers
2026-05-02  6:59                         ` [PATCH v8 00/17] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 01/17] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 02/17] perf tests topology: Switch env->arch use to env->e_machine Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 03/17] perf capstone: Determine architecture from e_machine Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 04/17] perf print_insn: Use e_machine for fallback IP length check Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 05/17] perf machine: Use perf_env e_machine rather than arch Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 06/17] perf sample-raw: " Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 07/17] perf sort: " Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 08/17] perf symbol: Avoid use of machine__is Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 09/17] perf arch common: Use perf_env e_machine rather than arch Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 10/17] perf header: In print_pmu_caps use perf_env e_machine Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 11/17] perf c2c: Use perf_env e_machine rather than arch Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 12/17] perf lock-contention: " Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 13/17] perf env: Refactor perf_env__arch_strerrno Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 14/17] perf env: Remove unused perf_env__raw_arch Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 15/17] perf env: Add helper to lazily compute the os_release Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 16/17] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
2026-05-02  6:59                           ` [PATCH v8 17/17] perf symbol: Lazily compute idle and use a global lock for updates Ian Rogers
2026-05-03  0:22                           ` [PATCH v9 00/18] perf symbol/env: ELF machine clean up and lazy idle computation Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 01/18] perf env: Add perf_env__e_machine helper and use in perf_env__arch Ian Rogers
2026-05-04  1:35                               ` Namhyung Kim
2026-05-03  0:22                             ` [PATCH v9 02/18] perf tests topology: Switch env->arch use to env->e_machine Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 03/18] perf env, dso, thread: Add _endian variants for e_machine helpers Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 04/18] perf capstone: Determine architecture from e_machine Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 05/18] perf print_insn: Use e_machine for fallback IP length check Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 06/18] perf symbol: Avoid use of machine__is Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 07/18] perf machine: Use perf_env e_machine rather than arch Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 08/18] perf sample-raw: " Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 09/18] perf sort: " Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 10/18] perf arch common: " Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 11/18] perf header: In print_pmu_caps use perf_env e_machine Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 12/18] perf c2c: Use perf_env e_machine rather than arch Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 13/18] perf lock-contention: " Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 14/18] perf env: Refactor perf_env__arch_strerrno Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 15/18] perf env: Remove unused perf_env__raw_arch Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 16/18] perf env: Add helper to lazily compute the os_release Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 17/18] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues Ian Rogers
2026-05-03  0:22                             ` [PATCH v9 18/18] perf symbol: Lazily compute idle Ian Rogers
2026-04-09 23:06                       ` [PATCH v6 2/3] perf env: Add helper to lazily compute the os_release Ian Rogers
2026-04-09 23:06                       ` [PATCH v6 3/3] perf symbol: Lazily compute idle and use the perf_env Ian Rogers
2026-03-27  6:00           ` [PATCH v2] perf tests task-analyzer: Write test files to tmpdir Ian Rogers
2026-03-31  7:22             ` Namhyung Kim
2026-03-31 17:58               ` Ian Rogers
2026-04-01  3:41                 ` Namhyung Kim
