public inbox for linux-arm-kernel@lists.infradead.org
* [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms
@ 2026-01-26 12:35 Yushan Wang
  2026-01-26 12:35 ` [RFT PATCH 1/7] perf stat: Check color's length instead of the pointer Yushan Wang
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Yushan Wang @ 2026-01-26 12:35 UTC (permalink / raw)
  To: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, Jonathan.Cameron, shiju.jose, will,
	linux-perf-users, linux-arm-kernel
  Cc: linuxarm, liuyonglong, prime.zeng, fanghao11, wangzhou1,
	wangyushan12

Currently, platform-specific iostat code for PMUs is implemented behind a
common iostat callback interface, and the implementation is selected by
what is being built. This approach limits iostat support across
different types of PMUs.

Support for HiSilicon PCIe PMU iostat was raised at [1], which uses a
similar approach.

To extend iostat support across platforms, turn the common iostat
interface into a framework that lets perf probe PMU capabilities at
runtime and route iostat requests to the correct PMU-specific functions.
HiSilicon PCIe PMU iostat is then supported on top of the new framework.

Request For Test:
The x86 iostat code has been refactored to fit the iostat framework. The
probe function, which checks whether any PMU name starts with 'x86-iio',
may not work properly, so testing on x86 would be appreciated.

[1] https://lore.kernel.org/all/4688a613-c94a-49b0-9d0f-09173c64082d@arm.com/

Shiju Jose (2):
  perf-iostat: Extend iostat interface to support different iostat PMUs
  perf-iostat: Make x86 iostat compatible with new iostat framework

Yicong Yang (1):
  perf-iostat: Enable iostat mode for HiSilicon PCIe PMU

Yushan Wang (4):
  perf stat: Check color's length instead of the pointer
  perf stat: Save unnecessary print_metric() call
  perf-x86: iostat: Change iostat_prefix() to static
  perf-iostat: Support wilder wildcard-match for pmus

 tools/perf/arch/arm64/util/Build         |   1 +
 tools/perf/arch/arm64/util/hisi-iostat.c | 479 +++++++++++++++++++++++
 tools/perf/arch/x86/util/iostat.c        | 105 +++--
 tools/perf/builtin-script.c              |   2 +-
 tools/perf/util/iostat.c                 |  79 ++--
 tools/perf/util/iostat.h                 |  21 +-
 tools/perf/util/pmus.c                   |  12 +-
 tools/perf/util/pmus.h                   |   3 +
 tools/perf/util/stat-display.c           |   4 +-
 tools/perf/util/stat-shadow.c            |   4 +-
 10 files changed, 638 insertions(+), 72 deletions(-)
 create mode 100644 tools/perf/arch/arm64/util/hisi-iostat.c

-- 
2.33.0




* [RFT PATCH 1/7] perf stat: Check color's length instead of the pointer
  2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
@ 2026-01-26 12:35 ` Yushan Wang
  2026-01-27 15:58   ` Jonathan Cameron
  2026-01-26 12:35 ` [RFT PATCH 2/7] perf stat: Save unnecessary print_metric() call Yushan Wang
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Yushan Wang @ 2026-01-26 12:35 UTC (permalink / raw)
  To: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, Jonathan.Cameron, shiju.jose, will,
	linux-perf-users, linux-arm-kernel
  Cc: linuxarm, liuyonglong, prime.zeng, fanghao11, wangzhou1,
	wangyushan12

The color string returned by metric_threshold_classify__color() is never
NULL, so checking the color pointer always evaluates to true.

Fix this by checking the length of the color string instead.

Fixes: 37b77ae95416 ("perf stat: Change color to threshold in print_metric")

Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
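As a small illustration of why the pointer test cannot distinguish the
two cases, here is a minimal sketch. It assumes a classifier that returns
an empty string for "no color"; demo_classify_color() and the two helper
names are hypothetical and are not part of perf.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical classifier: like the one described above, it never returns NULL. */
const char *demo_classify_color(bool over_threshold)
{
	return over_threshold ? "\033[31m" : "";
}

/* Old check: true for every string the classifier can return. */
bool has_color_by_pointer(const char *color)
{
	return color != NULL;
}

/* New check: true only when there is an escape sequence to emit. */
bool has_color_by_length(const char *color)
{
	return strlen(color) > 0;
}

/* Self-check showing where the two tests disagree. */
void demo_color_check(void)
{
	const char *plain = demo_classify_color(false);

	assert(has_color_by_pointer(plain));	/* pointer test says "colored" */
	assert(!has_color_by_length(plain));	/* length test says "plain" */
}
```

Both checks agree for a non-empty color string; they differ only for the
empty one, which is the case this patch is about.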
 tools/perf/builtin-script.c    | 2 +-
 tools/perf/util/stat-display.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 62e43d3c5ad7..9fe90f564c69 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2100,7 +2100,7 @@ static void script_print_metric(struct perf_stat_config *config __maybe_unused,
 	perf_sample__fprintf_start(NULL, mctx->sample, mctx->thread, mctx->evsel,
 				   PERF_RECORD_SAMPLE, mctx->fp);
 	fputs("\tmetric: ", mctx->fp);
-	if (color)
+	if (strlen(color))
 		color_fprintf(mctx->fp, color, fmt, val);
 	else
 		printf(fmt, val);
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 6d02f84c5691..91c0c1020f4e 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -474,7 +474,7 @@ static void print_metric_std(struct perf_stat_config *config,
 		do_new_line_std(config, os);
 
 	n = fprintf(out, " # ");
-	if (color)
+	if (strlen(color))
 		n += color_fprintf(out, color, fmt, val);
 	else
 		n += fprintf(out, fmt, val);
@@ -607,7 +607,7 @@ static void print_metric_only(struct perf_stat_config *config,
 	if (mlen < strlen(unit))
 		mlen = strlen(unit) + 1;
 
-	if (color)
+	if (strlen(color))
 		mlen += strlen(color) + sizeof(PERF_COLOR_RESET) - 1;
 
 	color_snprintf(str, sizeof(str), color ?: "", fmt ?: "", val);
-- 
2.33.0




* [RFT PATCH 2/7] perf stat: Save unnecessary print_metric() call
  2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
  2026-01-26 12:35 ` [RFT PATCH 1/7] perf stat: Check color's length instead of the pointer Yushan Wang
@ 2026-01-26 12:35 ` Yushan Wang
  2026-01-27 16:01   ` Jonathan Cameron
  2026-01-26 12:35 ` [RFT PATCH 3/7] perf-x86: iostat: Change iostat_prefix() to static Yushan Wang
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Yushan Wang @ 2026-01-26 12:35 UTC (permalink / raw)
  To: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, Jonathan.Cameron, shiju.jose, will,
	linux-perf-users, linux-arm-kernel
  Cc: linuxarm, liuyonglong, prime.zeng, fanghao11, wangzhou1,
	wangyushan12

Patch [1] removed the second branch of the iostat_run check and changed
num to 0, since that is the default behavior. However, when iostat_run is
set, num must be 1 to avoid a later print_metric() call.

Set num to 1 to avoid the redundant print_metric() call, which prints a
misaligned blank field.

Fixes: b71f46a6a708 ("perf stat: Remove hard coded shadow metrics")

[1]: https://lore.kernel.org/all/20251111212206.631711-8-irogers@google.com/

Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
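The control flow can be modelled roughly as below. This is only a sketch
under the assumption that the caller falls back to printing an empty
placeholder metric when num is still 0 after the metric-group step; all
names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when the caller would fall back to printing an empty
 * placeholder metric. 'count_iostat' selects the behavior with (true)
 * or without (false) the fix; 'generic_metrics' is the number of
 * metrics printed by the generic metric-group step.
 */
bool demo_needs_placeholder(bool iostat_run, int generic_metrics,
			    bool count_iostat)
{
	int num = 0;

	if (iostat_run && count_iostat)
		num = 1;	/* the iostat metric already filled the column */

	num += generic_metrics;

	return num == 0;
}

void demo_placeholder_check(void)
{
	assert(demo_needs_placeholder(true, 0, false));	/* before: extra blank */
	assert(!demo_needs_placeholder(true, 0, true));	/* after: suppressed */
	assert(demo_needs_placeholder(false, 0, true));	/* non-iostat unchanged */
}
```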
 tools/perf/util/stat-shadow.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 9c83f7d96caa..9439baf8002f 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -319,8 +319,10 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 	void *ctxp = out->ctx;
 	int num = 0;
 
-	if (config->iostat_run)
+	if (config->iostat_run) {
 		iostat_print_metric(config, evsel, out);
+		num = 1;
+	}
 
 	perf_stat__print_shadow_stats_metricgroup(config, evsel, aggr_idx,
 						  &num, NULL, out);
-- 
2.33.0




* [RFT PATCH 3/7] perf-x86: iostat: Change iostat_prefix() to static
  2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
  2026-01-26 12:35 ` [RFT PATCH 1/7] perf stat: Check color's length instead of the pointer Yushan Wang
  2026-01-26 12:35 ` [RFT PATCH 2/7] perf stat: Save unnecessary print_metric() call Yushan Wang
@ 2026-01-26 12:35 ` Yushan Wang
  2026-01-26 12:35 ` [RFT PATCH 4/7] perf-iostat: Extend iostat interface to support different iostat PMUs Yushan Wang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Yushan Wang @ 2026-01-26 12:35 UTC (permalink / raw)
  To: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, Jonathan.Cameron, shiju.jose, will,
	linux-perf-users, linux-arm-kernel
  Cc: linuxarm, liuyonglong, prime.zeng, fanghao11, wangzhou1,
	wangyushan12

Make iostat_prefix() a static function since it is not used outside
this file.

Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 tools/perf/arch/x86/util/iostat.c | 44 +++++++++++++++----------------
 tools/perf/util/iostat.c          |  7 -----
 tools/perf/util/iostat.h          |  2 --
 3 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/tools/perf/arch/x86/util/iostat.c b/tools/perf/arch/x86/util/iostat.c
index 7442a2cd87ed..83be505955c8 100644
--- a/tools/perf/arch/x86/util/iostat.c
+++ b/tools/perf/arch/x86/util/iostat.c
@@ -332,6 +332,28 @@ static int iostat_event_group(struct evlist *evl,
 	return ret;
 }
 
+static void iostat_prefix(struct evlist *evlist,
+		   struct perf_stat_config *config,
+		   char *prefix, struct timespec *ts)
+{
+	struct iio_root_port *rp = evlist->selected->priv;
+
+	if (rp) {
+		/*
+		 * TODO: This is the incorrect format in JSON mode.
+		 *       See prepare_timestamp()
+		 */
+		if (ts)
+			sprintf(prefix, "%6lu.%09lu%s%04x:%02x%s",
+				ts->tv_sec, ts->tv_nsec,
+				config->csv_sep, rp->domain, rp->bus,
+				config->csv_sep);
+		else
+			sprintf(prefix, "%04x:%02x%s", rp->domain, rp->bus,
+				config->csv_sep);
+	}
+}
+
 int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config)
 {
 	if (evlist->core.nr_entries > 0) {
@@ -396,28 +418,6 @@ void iostat_release(struct evlist *evlist)
 	}
 }
 
-void iostat_prefix(struct evlist *evlist,
-		   struct perf_stat_config *config,
-		   char *prefix, struct timespec *ts)
-{
-	struct iio_root_port *rp = evlist->selected->priv;
-
-	if (rp) {
-		/*
-		 * TODO: This is the incorrect format in JSON mode.
-		 *       See prepare_timestamp()
-		 */
-		if (ts)
-			sprintf(prefix, "%6lu.%09lu%s%04x:%02x%s",
-				ts->tv_sec, ts->tv_nsec,
-				config->csv_sep, rp->domain, rp->bus,
-				config->csv_sep);
-		else
-			sprintf(prefix, "%04x:%02x%s", rp->domain, rp->bus,
-				config->csv_sep);
-	}
-}
-
 void iostat_print_header_prefix(struct perf_stat_config *config)
 {
 	if (config->csv_output)
diff --git a/tools/perf/util/iostat.c b/tools/perf/util/iostat.c
index b770bd473af7..a68ab100780d 100644
--- a/tools/perf/util/iostat.c
+++ b/tools/perf/util/iostat.c
@@ -37,13 +37,6 @@ __weak void iostat_print_metric(struct perf_stat_config *config __maybe_unused,
 {
 }
 
-__weak void iostat_prefix(struct evlist *evlist __maybe_unused,
-			  struct perf_stat_config *config __maybe_unused,
-			  char *prefix __maybe_unused,
-			  struct timespec *ts __maybe_unused)
-{
-}
-
 __weak void iostat_print_counters(struct evlist *evlist __maybe_unused,
 				  struct perf_stat_config *config __maybe_unused,
 				  struct timespec *ts __maybe_unused,
diff --git a/tools/perf/util/iostat.h b/tools/perf/util/iostat.h
index a4e7299c5c2f..820930a096d9 100644
--- a/tools/perf/util/iostat.h
+++ b/tools/perf/util/iostat.h
@@ -35,8 +35,6 @@ int iostat_parse(const struct option *opt, const char *str,
 		 int unset __maybe_unused);
 void iostat_list(struct evlist *evlist, struct perf_stat_config *config);
 void iostat_release(struct evlist *evlist);
-void iostat_prefix(struct evlist *evlist, struct perf_stat_config *config,
-		   char *prefix, struct timespec *ts);
 void iostat_print_header_prefix(struct perf_stat_config *config);
 void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
 			 struct perf_stat_output_ctx *out);
-- 
2.33.0




* [RFT PATCH 4/7] perf-iostat: Extend iostat interface to support different iostat PMUs
  2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
                   ` (2 preceding siblings ...)
  2026-01-26 12:35 ` [RFT PATCH 3/7] perf-x86: iostat: Change iostat_prefix() to static Yushan Wang
@ 2026-01-26 12:35 ` Yushan Wang
  2026-01-26 12:35 ` [RFT PATCH 5/7] perf-iostat: Support wilder wildcard-match for pmus Yushan Wang
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Yushan Wang @ 2026-01-26 12:35 UTC (permalink / raw)
  To: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, Jonathan.Cameron, shiju.jose, will,
	linux-perf-users, linux-arm-kernel
  Cc: linuxarm, liuyonglong, prime.zeng, fanghao11, wangzhou1,
	wangyushan12

From: Shiju Jose <shiju.jose@huawei.com>

Currently, platform-specific iostat code for PMUs is implemented as a
common iostat callback interface and linked during build. This approach
limits iostat support across different PMU implementations on the same
architecture.

To address this, extend common iostat interface to provide support for
different PMUs by allowing each PMU to register itself and receive
callbacks to its PMU-specific functions through the unified iostat
framework.

Signed-off-by: Shiju Jose  <shiju.jose@huawei.com>
Signed-off-by: Yushan Wang <wangyushan@huawei.com>
---
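For readers new to the scheme, the register/probe/dispatch pattern
described above can be sketched in isolation as follows. This is a
simplified model, not the code in this patch: the demo_* names are
hypothetical, and only one callback is shown. As in
register_iostat_pmu(), a probe callback returns 0 when the PMU is
present, and the last backend whose probe succeeds is the one selected.

```c
#include <assert.h>
#include <stddef.h>

struct demo_backend {
	const char *pmu_name;
	int (*probe)(struct demo_backend *self);	/* 0 when the PMU is present */
	int (*prepare)(void);
};

static struct demo_backend *demo_active;

/* Select the backend if its probe succeeds; an absent PMU is not an error. */
int demo_register(struct demo_backend *b)
{
	if (!b || !b->probe)
		return -1;
	if (b->probe(b))
		return 0;	/* PMU absent: skip quietly */
	demo_active = b;
	return 0;
}

/* Route the generic call to the selected backend, if any. */
int demo_prepare(void)
{
	if (!demo_active)
		return -1;
	return demo_active->prepare();
}

static int demo_probe_absent(struct demo_backend *self) { (void)self; return 1; }
static int demo_probe_present(struct demo_backend *self) { (void)self; return 0; }
static int demo_prepare_ok(void) { return 0; }

struct demo_backend demo_absent = {
	.pmu_name = "absent-pmu",
	.probe = demo_probe_absent,
	.prepare = demo_prepare_ok,
};

struct demo_backend demo_present = {
	.pmu_name = "present-pmu",
	.probe = demo_probe_present,
	.prepare = demo_prepare_ok,
};

void demo_register_check(void)
{
	assert(demo_register(NULL) == -1);
	assert(demo_register(&demo_absent) == 0);
	assert(demo_prepare() == -1);	/* nothing selected yet */
	assert(demo_register(&demo_present) == 0);
	assert(demo_prepare() == 0);	/* routed to the selected backend */
}
```

The sketch guards the dispatch with a NULL check, as iostat_prepare() and
iostat_parse() do in this patch.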
 tools/perf/util/iostat.c | 78 ++++++++++++++++++++++++++++------------
 tools/perf/util/iostat.h | 19 ++++++++--
 2 files changed, 73 insertions(+), 24 deletions(-)

diff --git a/tools/perf/util/iostat.c b/tools/perf/util/iostat.c
index a68ab100780d..84ab92d8f0b3 100644
--- a/tools/perf/util/iostat.c
+++ b/tools/perf/util/iostat.c
@@ -1,47 +1,81 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "util/iostat.h"
-#include "util/debug.h"
+
+static struct iostat_pmu_list *iostat_pmu;
 
 enum iostat_mode_t iostat_mode = IOSTAT_NONE;
 
-__weak int iostat_prepare(struct evlist *evlist __maybe_unused,
-			  struct perf_stat_config *config __maybe_unused)
+__weak int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config)
+{
+	if (!iostat_pmu)
+		return -1;
+
+	return iostat_pmu->prepare(evlist, config);
+}
+
+__weak int iostat_parse(const struct option *opt, const char *str, int unset)
+{
+	if (!iostat_pmu)
+		return -1;
+
+	return iostat_pmu->parse(opt, str, unset);
+}
+
+__weak void iostat_list(struct evlist *evlist, struct perf_stat_config *config)
+{
+	iostat_pmu->list(evlist, config);
+}
+
+__weak void iostat_release(struct evlist *evlist)
 {
-	return -1;
+	iostat_pmu->release(evlist);
 }
 
-__weak int iostat_parse(const struct option *opt __maybe_unused,
-			 const char *str __maybe_unused,
-			 int unset __maybe_unused)
+__weak void iostat_print_header_prefix(struct perf_stat_config *config)
 {
-	pr_err("iostat mode is not supported on current platform\n");
-	return -1;
+	iostat_pmu->print_header_prefix(config);
 }
 
-__weak void iostat_list(struct evlist *evlist __maybe_unused,
-		       struct perf_stat_config *config __maybe_unused)
+__weak void iostat_print_metric(struct perf_stat_config *config,
+				struct evsel *evsel,
+				struct perf_stat_output_ctx *out)
 {
+	iostat_pmu->print_metric(config, evsel, out);
 }
 
-__weak void iostat_release(struct evlist *evlist __maybe_unused)
+__weak void iostat_print_counters(struct evlist *evlist,
+				  struct perf_stat_config *config,
+				  struct timespec *ts, char *prefix,
+				  iostat_print_counter_t print_cnt_cb,
+				  void *arg)
 {
+	iostat_pmu->print_counters(evlist, config, ts, prefix,
+				   print_cnt_cb, arg);
 }
 
-__weak void iostat_print_header_prefix(struct perf_stat_config *config __maybe_unused)
+int register_iostat_pmu(struct iostat_pmu_list *pmu)
 {
+	if (!pmu || !pmu->probe)
+		return -1;
+
+	if (pmu->probe(pmu))
+		return 0;
+
+	iostat_pmu = pmu;
+
+	return 0;
 }
 
-__weak void iostat_print_metric(struct perf_stat_config *config __maybe_unused,
-				struct evsel *evsel __maybe_unused,
-				struct perf_stat_output_ctx *out __maybe_unused)
+static void unregister_iostat_pmu(void)
 {
+	if (!iostat_pmu)
+		return;
+
+	iostat_pmu = NULL;
 }
 
-__weak void iostat_print_counters(struct evlist *evlist __maybe_unused,
-				  struct perf_stat_config *config __maybe_unused,
-				  struct timespec *ts __maybe_unused,
-				  char *prefix __maybe_unused,
-				  iostat_print_counter_t print_cnt_cb __maybe_unused,
-				  void *arg __maybe_unused)
+__attribute__((destructor))
+static void iostat_exit(void)
 {
+	unregister_iostat_pmu();
 }
diff --git a/tools/perf/util/iostat.h b/tools/perf/util/iostat.h
index 820930a096d9..58225542e49d 100644
--- a/tools/perf/util/iostat.h
+++ b/tools/perf/util/iostat.h
@@ -31,8 +31,7 @@ extern enum iostat_mode_t iostat_mode;
 typedef void (*iostat_print_counter_t)(struct perf_stat_config *, struct evsel *, void *);
 
 int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config);
-int iostat_parse(const struct option *opt, const char *str,
-		 int unset __maybe_unused);
+int iostat_parse(const struct option *opt, const char *str, int unset);
 void iostat_list(struct evlist *evlist, struct perf_stat_config *config);
 void iostat_release(struct evlist *evlist);
 void iostat_print_header_prefix(struct perf_stat_config *config);
@@ -42,4 +41,20 @@ void iostat_print_counters(struct evlist *evlist,
 			   struct perf_stat_config *config, struct timespec *ts,
 			   char *prefix, iostat_print_counter_t print_cnt_cb, void *arg);
 
+struct iostat_pmu_list {
+	const char *pmu_name;
+	int (*probe)(struct iostat_pmu_list *iostat_pmu);
+	int (*prepare)(struct evlist *evlist, struct perf_stat_config *config);
+	int (*parse)(const struct option *opt, const char *str, int unset);
+	void (*list)(struct evlist *evlist, struct perf_stat_config *config);
+	void (*print_header_prefix)(struct perf_stat_config *config);
+	void (*print_metric)(struct perf_stat_config *config, struct evsel *evsel,
+			     struct perf_stat_output_ctx *out);
+	void (*print_counters)(struct evlist *evlist,
+			       struct perf_stat_config *config, struct timespec *ts,
+			       char *prefix, iostat_print_counter_t print_cnt_cb, void *arg);
+	void (*release)(struct evlist *evlist __maybe_unused);
+};
+
+int register_iostat_pmu(struct iostat_pmu_list *iostat_pmu);
 #endif /* _IOSTAT_H */
-- 
2.33.0




* [RFT PATCH 5/7] perf-iostat: Support wilder wildcard-match for pmus
  2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
                   ` (3 preceding siblings ...)
  2026-01-26 12:35 ` [RFT PATCH 4/7] perf-iostat: Extend iostat interface to support different iostat PMUs Yushan Wang
@ 2026-01-26 12:35 ` Yushan Wang
  2026-01-26 16:44   ` Ian Rogers
  2026-01-26 12:35 ` [RFT PATCH 6/7] perf-iostat: Make x86 iostat compatible with new iostat framework Yushan Wang
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Yushan Wang @ 2026-01-26 12:35 UTC (permalink / raw)
  To: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, Jonathan.Cameron, shiju.jose, will,
	linux-perf-users, linux-arm-kernel
  Cc: linuxarm, liuyonglong, prime.zeng, fanghao11, wangzhou1,
	wangyushan12

Current wildcard matching of PMU names only supports the form
"<pmu_name>%d", which may not be sufficient for PMUs with other name
forms (e.g. the HiSilicon PCIe PMU is named "hisi_pcie%d_pmu%d").

To address that, add a new scan function that takes the matching
function as a callback, and implement the existing wildcard scan on top
of it, so that more flexible PMU names can be supported.

Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
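The idea of scanning with a caller-supplied matcher can be sketched over
a plain array of names. This is an illustration only: the demo_* names
are hypothetical, the array stands in for perf's PMU lists, and the
two-index matcher is just one possible way to accept names of the
"<prefix>%d_pmu%d" form.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef bool (*demo_match_t)(const char *name, const char *pattern);

/* Accept any name that starts with 'pattern'. */
bool demo_match_prefix(const char *name, const char *pattern)
{
	return strncmp(name, pattern, strlen(pattern)) == 0;
}

/* Skip one or more digits; return NULL if there is no digit at 's'. */
static const char *demo_skip_digits(const char *s)
{
	const char *p = s;

	while (isdigit((unsigned char)*p))
		p++;
	return p > s ? p : NULL;
}

/* Accept names of the form "<pattern><digits>_pmu<digits>". */
bool demo_match_two_index(const char *name, const char *pattern)
{
	size_t plen = strlen(pattern);
	const char *p;

	if (strncmp(name, pattern, plen) != 0)
		return false;
	p = demo_skip_digits(name + plen);
	if (!p || strncmp(p, "_pmu", 4) != 0)
		return false;
	p = demo_skip_digits(p + 4);
	return p && *p == '\0';
}

/* Return the first name accepted by 'match', or NULL if none is. */
const char *demo_scan(const char *const *names, size_t n,
		      const char *pattern, demo_match_t match)
{
	for (size_t i = 0; i < n; i++) {
		if (match(names[i], pattern))
			return names[i];
	}
	return NULL;
}

void demo_scan_check(void)
{
	static const char *const names[] = {
		"cpu", "hisi_pcie_misc", "hisi_pcie0_pmu2",
	};

	assert(demo_scan(names, 3, "hisi_pcie", demo_match_prefix) == names[1]);
	assert(demo_scan(names, 3, "hisi_pcie", demo_match_two_index) == names[2]);
	assert(demo_scan(names, 3, "x86-iio", demo_match_prefix) == NULL);
}
```

Keeping the scan loop separate from the matcher means a new name format
only needs a new matcher function.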
 tools/perf/util/pmus.c | 12 +++++++++---
 tools/perf/util/pmus.h |  3 +++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 98be2eb8f1f0..35184d477d07 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -402,7 +402,8 @@ struct perf_pmu *perf_pmus__scan_for_event(struct perf_pmu *pmu, const char *eve
 	return NULL;
 }
 
-struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard)
+struct perf_pmu *perf_pmus__scan_matching(struct perf_pmu *pmu, const char *wildcard,
+					  perf_pmus_match_t match)
 {
 	bool use_core_pmus = !pmu || pmu->is_core;
 
@@ -436,19 +437,24 @@ struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const c
 	}
 	if (use_core_pmus) {
 		list_for_each_entry_continue(pmu, &core_pmus, list) {
-			if (perf_pmu__wildcard_match(pmu, wildcard))
+			if (match(pmu, wildcard))
 				return pmu;
 		}
 		pmu = NULL;
 		pmu = list_prepare_entry(pmu, &other_pmus, list);
 	}
 	list_for_each_entry_continue(pmu, &other_pmus, list) {
-		if (perf_pmu__wildcard_match(pmu, wildcard))
+		if (match(pmu, wildcard))
 			return pmu;
 	}
 	return NULL;
 }
 
+struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard)
+{
+	return perf_pmus__scan_matching(pmu, wildcard, perf_pmu__wildcard_match);
+}
+
 static struct perf_pmu *perf_pmus__scan_skip_duplicates(struct perf_pmu *pmu)
 {
 	bool use_core_pmus = !pmu || pmu->is_core;
diff --git a/tools/perf/util/pmus.h b/tools/perf/util/pmus.h
index 7cb36863711a..9308afb5a7b8 100644
--- a/tools/perf/util/pmus.h
+++ b/tools/perf/util/pmus.h
@@ -9,6 +9,8 @@ struct perf_event_attr;
 struct perf_pmu;
 struct print_callbacks;
 
+typedef _Bool (*perf_pmus_match_t)(const struct perf_pmu *, const char *);
+
 size_t pmu_name_len_no_suffix(const char *str);
 /* Exposed for testing only. */
 int pmu_name_cmp(const char *lhs_pmu_name, const char *rhs_pmu_name);
@@ -22,6 +24,7 @@ struct perf_pmu *perf_pmus__find_by_attr(const struct perf_event_attr *attr);
 struct perf_pmu *perf_pmus__scan(struct perf_pmu *pmu);
 struct perf_pmu *perf_pmus__scan_core(struct perf_pmu *pmu);
 struct perf_pmu *perf_pmus__scan_for_event(struct perf_pmu *pmu, const char *event);
+struct perf_pmu *perf_pmus__scan_matching(struct perf_pmu *pmu, const char *wildcard, perf_pmus_match_t match);
 struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard);
 
 const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str);
-- 
2.33.0




* [RFT PATCH 6/7] perf-iostat: Make x86 iostat compatible with new iostat framework
  2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
                   ` (4 preceding siblings ...)
  2026-01-26 12:35 ` [RFT PATCH 5/7] perf-iostat: Support wilder wildcard-match for pmus Yushan Wang
@ 2026-01-26 12:35 ` Yushan Wang
  2026-01-26 12:35 ` [RFT PATCH 7/7] perf-iostat: Enable iostat mode for HiSilicon PCIe PMU Yushan Wang
  2026-01-26 17:01 ` [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Ian Rogers
  7 siblings, 0 replies; 16+ messages in thread
From: Yushan Wang @ 2026-01-26 12:35 UTC (permalink / raw)
  To: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, Jonathan.Cameron, shiju.jose, will,
	linux-perf-users, linux-arm-kernel
  Cc: linuxarm, liuyonglong, prime.zeng, fanghao11, wangzhou1,
	wangyushan12

From: Shiju Jose <shiju.jose@huawei.com>

Convert the original x86 IIO iostat support to the new iostat framework.

The PMU name matching function for x86 IIO may not be correct.

Signed-off-by: Shiju Jose  <shiju.jose@huawei.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
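The self-registration used here relies on the GCC/Clang constructor
attribute, which runs a function before main(). A minimal stand-alone
sketch of that mechanism, with hypothetical demo_* names and a fixed-size
registry that is not part of perf, might look like this. Note that the
relative order of constructors in different files is not specified unless
priorities are given.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_backend {
	const char *name;
};

static const struct demo_backend *demo_registry[8];
static size_t demo_nr_registered;

/* Append a backend to the registry; silently drop it if the table is full. */
static void demo_register(const struct demo_backend *b)
{
	if (demo_nr_registered < DEMO_ARRAY_SIZE(demo_registry))
		demo_registry[demo_nr_registered++] = b;
}

static const struct demo_backend demo_backends[] = {
	{ .name = "demo-a" },
	{ .name = "demo-b" },
};

/* Runs before main() when built with GCC or Clang. */
__attribute__((constructor))
static void demo_backends_init(void)
{
	for (size_t i = 0; i < DEMO_ARRAY_SIZE(demo_backends); i++)
		demo_register(&demo_backends[i]);
}

size_t demo_registered_count(void)
{
	return demo_nr_registered;
}

void demo_constructor_check(void)
{
	/* By the time main() runs, both table entries are registered. */
	assert(demo_registered_count() == 2);
	assert(demo_registry[0] == &demo_backends[0]);
}
```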
 tools/perf/arch/x86/util/iostat.c | 63 ++++++++++++++++++++++++-------
 tools/perf/util/iostat.c          | 26 ++++++-------
 2 files changed, 62 insertions(+), 27 deletions(-)

diff --git a/tools/perf/arch/x86/util/iostat.c b/tools/perf/arch/x86/util/iostat.c
index 83be505955c8..1585a76d69e1 100644
--- a/tools/perf/arch/x86/util/iostat.c
+++ b/tools/perf/arch/x86/util/iostat.c
@@ -332,7 +332,7 @@ static int iostat_event_group(struct evlist *evl,
 	return ret;
 }
 
-static void iostat_prefix(struct evlist *evlist,
+static void iio_iostat_prefix(struct evlist *evlist,
 		   struct perf_stat_config *config,
 		   char *prefix, struct timespec *ts)
 {
@@ -354,7 +354,7 @@ static void iostat_prefix(struct evlist *evlist,
 	}
 }
 
-int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config)
+static int iio_iostat_prepare(struct evlist *evlist, struct perf_stat_config *config)
 {
 	if (evlist->core.nr_entries > 0) {
 		pr_warning("The -e and -M options are not supported."
@@ -371,8 +371,8 @@ int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config)
 	return iostat_event_group(evlist, root_ports);
 }
 
-int iostat_parse(const struct option *opt, const char *str,
-		 int unset __maybe_unused)
+static int iio_iostat_parse(const struct option *opt, const char *str,
+			    int unset __maybe_unused)
 {
 	int ret;
 	struct perf_stat_config *config = (struct perf_stat_config *)opt->data;
@@ -392,7 +392,7 @@ int iostat_parse(const struct option *opt, const char *str,
 	return ret;
 }
 
-void iostat_list(struct evlist *evlist, struct perf_stat_config *config)
+static void iio_iostat_list(struct evlist *evlist, struct perf_stat_config *config)
 {
 	struct evsel *evsel;
 	struct iio_root_port *rp = NULL;
@@ -405,7 +405,7 @@ void iostat_list(struct evlist *evlist, struct perf_stat_config *config)
 	}
 }
 
-void iostat_release(struct evlist *evlist)
+static void iio_iostat_release(struct evlist *evlist)
 {
 	struct evsel *evsel;
 	struct iio_root_port *rp = NULL;
@@ -418,7 +418,7 @@ void iostat_release(struct evlist *evlist)
 	}
 }
 
-void iostat_print_header_prefix(struct perf_stat_config *config)
+static void iio_iostat_print_header_prefix(struct perf_stat_config *config)
 {
 	if (config->csv_output)
 		fputs("port,", config->output);
@@ -428,8 +428,8 @@ void iostat_print_header_prefix(struct perf_stat_config *config)
 		fprintf(config->output, "   port         ");
 }
 
-void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
-			 struct perf_stat_output_ctx *out)
+static void iio_iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
+				    struct perf_stat_output_ctx *out)
 {
 	double iostat_value = 0;
 	u64 prev_count_val = 0;
@@ -452,24 +452,59 @@ void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
 			  iostat_value / (256 * 1024));
 }
 
-void iostat_print_counters(struct evlist *evlist,
-			   struct perf_stat_config *config, struct timespec *ts,
-			   char *prefix, iostat_print_counter_t print_cnt_cb, void *arg)
+static void iio_iostat_print_counters(struct evlist *evlist,
+				      struct perf_stat_config *config, struct timespec *ts,
+				      char *prefix, iostat_print_counter_t print_cnt_cb, void *arg)
 {
 	void *perf_device = NULL;
 	struct evsel *counter = evlist__first(evlist);
 
 	evlist__set_selected(evlist, counter);
-	iostat_prefix(evlist, config, prefix, ts);
+	iio_iostat_prefix(evlist, config, prefix, ts);
 	fprintf(config->output, "%s", prefix);
 	evlist__for_each_entry(evlist, counter) {
 		perf_device = evlist->selected->priv;
 		if (perf_device && perf_device != counter->priv) {
 			evlist__set_selected(evlist, counter);
-			iostat_prefix(evlist, config, prefix, ts);
+			iio_iostat_prefix(evlist, config, prefix, ts);
 			fprintf(config->output, "\n%s", prefix);
 		}
 		print_cnt_cb(config, counter, arg);
 	}
 	fputc('\n', config->output);
 }
+
+static bool iio_iostat_pmu_match(const struct perf_pmu *pmu, const char *wildcard)
+{
+	return !strncmp(pmu->name, wildcard, strlen(wildcard));
+}
+
+/*
+ * FIXME: pmu name prefix match might not work for x86 iio.
+ */
+static int iio_iostat_probe(struct iostat_pmu_list *iostat_pmu)
+{
+	return !perf_pmus__scan_matching(NULL, iostat_pmu->pmu_name, iio_iostat_pmu_match);
+}
+
+static struct iostat_pmu_list x86_iio_iostat_pmu_list[]  = {
+	{
+		.pmu_name = "x86-iio",
+		.probe = iio_iostat_probe,
+		.prepare = iio_iostat_prepare,
+		.parse = iio_iostat_parse,
+		.list = iio_iostat_list,
+		.print_header_prefix = iio_iostat_print_header_prefix,
+		.print_metric = iio_iostat_print_metric,
+		.print_counters = iio_iostat_print_counters,
+		.release = iio_iostat_release,
+	},
+};
+
+static void __attribute__((constructor)) x86_iio_iostat_pmu_init(void)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(x86_iio_iostat_pmu_list); i++)
+		register_iostat_pmu(&x86_iio_iostat_pmu_list[i]);
+}
diff --git a/tools/perf/util/iostat.c b/tools/perf/util/iostat.c
index 84ab92d8f0b3..7f1f6b1e56a8 100644
--- a/tools/perf/util/iostat.c
+++ b/tools/perf/util/iostat.c
@@ -5,7 +5,7 @@ static struct iostat_pmu_list *iostat_pmu;
 
 enum iostat_mode_t iostat_mode = IOSTAT_NONE;
 
-__weak int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config)
+int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config)
 {
 	if (!iostat_pmu)
 		return -1;
@@ -13,7 +13,7 @@ __weak int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config
 	return iostat_pmu->prepare(evlist, config);
 }
 
-__weak int iostat_parse(const struct option *opt, const char *str, int unset)
+int iostat_parse(const struct option *opt, const char *str, int unset)
 {
 	if (!iostat_pmu)
 		return -1;
@@ -21,33 +21,33 @@ __weak int iostat_parse(const struct option *opt, const char *str, int unset)
 	return iostat_pmu->parse(opt, str, unset);
 }
 
-__weak void iostat_list(struct evlist *evlist, struct perf_stat_config *config)
+void iostat_list(struct evlist *evlist, struct perf_stat_config *config)
 {
 	iostat_pmu->list(evlist, config);
 }
 
-__weak void iostat_release(struct evlist *evlist)
+void iostat_release(struct evlist *evlist)
 {
 	iostat_pmu->release(evlist);
 }
 
-__weak void iostat_print_header_prefix(struct perf_stat_config *config)
+void iostat_print_header_prefix(struct perf_stat_config *config)
 {
 	iostat_pmu->print_header_prefix(config);
 }
 
-__weak void iostat_print_metric(struct perf_stat_config *config,
-				struct evsel *evsel,
-				struct perf_stat_output_ctx *out)
+void iostat_print_metric(struct perf_stat_config *config,
+			 struct evsel *evsel,
+			 struct perf_stat_output_ctx *out)
 {
 	iostat_pmu->print_metric(config, evsel, out);
 }
 
-__weak void iostat_print_counters(struct evlist *evlist,
-				  struct perf_stat_config *config,
-				  struct timespec *ts, char *prefix,
-				  iostat_print_counter_t print_cnt_cb,
-				  void *arg)
+void iostat_print_counters(struct evlist *evlist,
+			   struct perf_stat_config *config,
+			   struct timespec *ts, char *prefix,
+			   iostat_print_counter_t print_cnt_cb,
+			   void *arg)
 {
 	iostat_pmu->print_counters(evlist, config, ts, prefix,
 				   print_cnt_cb, arg);
-- 
2.33.0




* [RFT PATCH 7/7] perf-iostat: Enable iostat mode for HiSilicon PCIe PMU
  2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
                   ` (5 preceding siblings ...)
  2026-01-26 12:35 ` [RFT PATCH 6/7] perf-iostat: Make x86 iostat compatible with new iostat framework Yushan Wang
@ 2026-01-26 12:35 ` Yushan Wang
  2026-01-26 17:01 ` [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Ian Rogers
  7 siblings, 0 replies; 16+ messages in thread
From: Yushan Wang @ 2026-01-26 12:35 UTC (permalink / raw)
  To: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, Jonathan.Cameron, shiju.jose, will,
	linux-perf-users, linux-arm-kernel
  Cc: linuxarm, liuyonglong, prime.zeng, fanghao11, wangzhou1,
	wangyushan12

From: Yicong Yang <yangyicong@hisilicon.com>

Some HiSilicon platforms provide PCIe PMU devices for monitoring the
throughput and latency of PCIe traffic. With the support of PCIe PMU
we can enable the perf iostat mode.

The HiSilicon PCIe PMU can measure the throughput of specific TLP types
on a specific root port. Six metrics are provided, in units of MB:

- Inbound MWR: Memory write TLPs from downstream devices to root port
- Inbound MRD: Memory read TLPs from downstream devices to root port
- Inbound CPL: Completion TLPs from downstream devices to root port
- Outbound MWR: Memory write TLPs from CPU to downstream devices
- Outbound MRD: Memory read TLPs from CPU to downstream devices
- Outbound CPL: Completion TLPs from CPU to downstream devices

Since the PMU measures the throughput in DWords, we need to calculate
the throughput in MB like:
  Count * 4B / 1024 / 1024
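
As a quick sanity check of the conversion above, here is a minimal sketch.
The helper names are hypothetical and not part of this patch; the folded
form, count / (256 * 1024), is the division hisi_iostat_print_metric()
performs in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a DWord count to MB, as in the formula above. */
static double dwords_to_mb(uint64_t dwords)
{
	/* 1 DWord = 4 bytes; 1 MB = 1024 * 1024 bytes. */
	return (double)dwords * 4 / 1024 / 1024;
}

/* Same conversion with the constants folded: 1024 * 1024 / 4 = 256 * 1024. */
static double dwords_to_mb_folded(uint64_t dwords)
{
	return (double)dwords / (256 * 1024);
}

/* Both forms agree; 524288 DWords is exactly 2 MB. */
static void check_conversion(void)
{
	assert(dwords_to_mb(524288) == 2.0);
	assert(dwords_to_mb(524288) == dwords_to_mb_folded(524288));
}
```

Calling check_conversion() from a test harness exercises both forms on a
value that is exactly representable, so the comparison needs no tolerance.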

Example output of `perf iostat`:
[root@localhost tmp]# ./perf iostat list
hisi_pcie0_core2<0000:40:00.0>
hisi_pcie2_core2<0000:5f:00.0>
hisi_pcie0_core1<0000:16:00.0>
hisi_pcie0_core1<0000:16:04.0>
[root@localhost tmp]# ./perf iostat --timeout 10000

 Performance counter stats for 'system wide':

    port              Inbound MWR(MB)      Inbound MRD(MB)      Inbound CPL(MB)     Outbound MWR(MB)     Outbound MRD(MB)     Outbound CPL(MB)
0000:40:00.0                    0                    0                    0                    0                    0                    0
0000:5f:00.0                    0                    0                    0                    0                    0                    0
0000:16:00.0             16272.99               366.58                    0                15.09                    0             16156.85
0000:16:04.0                    0                    0                    0                    0                    0                    0

      10.008227512 seconds time elapsed

[root@localhost tmp]# ./perf iostat 0000:16:00.0 -- fio -name=rw -numjobs=30 -filename=/dev/nvme0n1 -rw=rw -iodepth=128 -direct=1 -sync=0 -norandommap -group_reporting -runtime=10 -time_based -bs=64k 2>&1 > /dev/null

 Performance counter stats for 'system wide':

    port              Inbound MWR(MB)      Inbound MRD(MB)      Inbound CPL(MB)     Outbound MWR(MB)     Outbound MRD(MB)     Outbound CPL(MB)
0000:16:00.0                16614                  379                    0                   16                    0                16721

      10.180349717 seconds time elapsed

       0.558810000 seconds user
       2.495016000 seconds sys

More information about the HiSilicon PCIe PMU can be found in
Documentation/admin-guide/perf/hisi-pcie-pmu.rst.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Shiju Jose  <shiju.jose@huawei.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 tools/perf/arch/arm64/util/Build         |   1 +
 tools/perf/arch/arm64/util/hisi-iostat.c | 479 +++++++++++++++++++++++
 2 files changed, 480 insertions(+)
 create mode 100644 tools/perf/arch/arm64/util/hisi-iostat.c

diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index d63881081d2e..0137d1d0e790 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -6,6 +6,7 @@ perf-util-y += ../../arm/util/cs-etm.o
 perf-util-y += ../../arm/util/pmu.o
 perf-util-y += arm-spe.o
 perf-util-y += header.o
+perf-util-y += hisi-iostat.o
 perf-util-y += hisi-ptt.o
 perf-util-y += machine.o
 perf-util-y += mem-events.o
diff --git a/tools/perf/arch/arm64/util/hisi-iostat.c b/tools/perf/arch/arm64/util/hisi-iostat.c
new file mode 100644
index 000000000000..efabd0baddc3
--- /dev/null
+++ b/tools/perf/arch/arm64/util/hisi-iostat.c
@@ -0,0 +1,479 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * perf iostat support for HiSilicon PCIe PMU.
+ * Partly derived from tools/perf/arch/x86/util/iostat.c.
+ *
+ * Copyright (c) 2024 HiSilicon Technologies Co., Ltd.
+ * Author: Yicong Yang <yangyicong@hisilicon.com>
+ */
+
+#include <linux/err.h>
+#include <linux/limits.h>
+#include <linux/zalloc.h>
+
+#include <api/fs/fs.h>
+#include <dirent.h>
+#include <errno.h>
+#include <stdio.h>
+
+#include "util/counts.h"
+#include "util/debug.h"
+#include "util/iostat.h"
+#include "util/pmu.h"
+
+/* From include/uapi/linux/pci.h */
+#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
+#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
+
+#define PCI_DEVICE_NAME_PATTERN		"%04x:%02hhx:%02hhx.%hhu"
+#define PCI_ROOT_BUS_DEVICES_PATH	"bus/pci/devices"
+
+static const char * const hisi_iostat_metrics[] = {
+	"Inbound MWR(MB)",
+	"Inbound MRD(MB)",
+	"Inbound CPL(MB)",
+	"Outbound MWR(MB)",
+	"Outbound MRD(MB)",
+	"Outbound CPL(MB)",
+};
+
+static const char * const hisi_iostat_cmd_template[] = {
+	/* Inbound Memory Write */
+	"hisi_pcie%hu_core%hu/event=0x0104,port=0x%hx/",
+	/* Inbound Memory Read */
+	"hisi_pcie%hu_core%hu/event=0x0804,port=0x%hx/",
+	/* Inbound Memory Completion */
+	"hisi_pcie%hu_core%hu/event=0x2004,port=0x%hx/",
+	/* Outbound Memory Write */
+	"hisi_pcie%hu_core%hu/event=0x0105,port=0x%hx/",
+	/* Outbound Memory Read */
+	"hisi_pcie%hu_core%hu/event=0x0405,port=0x%hx/",
+	/* Outbound Memory Completion */
+	"hisi_pcie%hu_core%hu/event=0x1005,port=0x%hx/",
+};
+
+struct hisi_pcie_root_port {
+	struct list_head list;
+	/* Is this Root Port selected for monitoring */
+	bool selected;
+	/* IDs to locate the PMU */
+	u16 sicl_id;
+	u16 core_id;
+	/* Filter mask for this Root Port */
+	u16 mask;
+	/* PCIe Root Port's <domain>:<bus>:<device>.<function> */
+	u32 domain;
+	u8 bus;
+	u8 dev;
+	u8 fn;
+};
+
+static LIST_HEAD(hisi_pcie_root_ports_list);
+
+/*
+ * Select a specific Root Port to monitor. Return 0 if the Root Port is
+ * found, otherwise -EINVAL.
+ */
+static int hisi_pcie_root_ports_select_one(u32 domain, u8 bus, u8 dev, u8 fn)
+{
+	struct hisi_pcie_root_port *rp;
+
+	list_for_each_entry(rp, &hisi_pcie_root_ports_list, list)
+		if (domain == rp->domain && bus == rp->bus &&
+		    dev == rp->dev && fn == rp->fn) {
+			rp->selected = true;
+			return 0;
+		}
+
+	return -EINVAL;
+}
+
+static void hisi_pcie_root_ports_select_all(void)
+{
+	struct hisi_pcie_root_port *rp;
+
+	list_for_each_entry(rp, &hisi_pcie_root_ports_list, list)
+		rp->selected = true;
+}
+
+static void hisi_pcie_root_ports_add(u16 sicl_id, u16 core_id, u8 target_bus,
+				     u16 bdf_min, u16 bdf_max)
+{
+	const char *sysfs = sysfs__mountpoint();
+	struct hisi_pcie_root_port *rp;
+	unsigned long path_len;
+	struct dirent *dent;
+	char path[PATH_MAX];
+	u8 bus, dev, fn;
+	u32 domain;
+	DIR *dir;
+	u16 bdf;
+	int ret;
+
+	path_len = snprintf(path, PATH_MAX, "%s/%s", sysfs, PCI_ROOT_BUS_DEVICES_PATH);
+	if (path_len >= PATH_MAX)
+		return;
+
+	dir = opendir(path);
+	if (!dir)
+		return;
+
+	/* Scan the PCI root bus to find the matching root ports on @target_bus */
+	while ((dent = readdir(dir))) {
+		ret = sscanf(dent->d_name, PCI_DEVICE_NAME_PATTERN,
+			     &domain, &bus, &dev, &fn);
+		if (ret != 4 || bus != target_bus)
+			continue;
+
+		bdf = (bus << 8) | PCI_DEVFN(dev, fn);
+		if (bdf < bdf_min || bdf > bdf_max)
+			continue;
+
+		rp = zalloc(sizeof(*rp));
+		if (!rp)
+			continue;
+
+		rp->selected = false;
+		rp->sicl_id = sicl_id;
+		rp->core_id = core_id;
+		rp->domain = domain;
+		rp->bus = bus;
+		rp->dev = dev;
+		rp->fn = fn;
+
+		rp->mask = BIT((rp->dev - PCI_SLOT(bdf_min)) << 1);
+
+		list_add(&rp->list, &hisi_pcie_root_ports_list);
+
+		pr_debug3("Found root port %s\n", dent->d_name);
+	}
+
+	closedir(dir);
+}
+
+/* Scan the PMUs and build the mapping of the Root Ports to the PMU */
+static int hisi_pcie_root_ports_init(void)
+{
+	char event_source[PATH_MAX], bus_path[PATH_MAX];
+	unsigned long long bus, bdf_max, bdf_min;
+	u16 sicl_id, core_id;
+	struct dirent *dent;
+	DIR *dir;
+
+	perf_pmu__event_source_devices_scnprintf(event_source, sizeof(event_source));
+	dir = opendir(event_source);
+	if (!dir)
+		return -ENOENT;
+
+	while ((dent = readdir(dir))) {
+		/*
+		 * This HiSilicon PCIe PMU will be named as:
+		 *   hisi_pcie<sicl_id>_core<core_id>
+		 */
+		if (sscanf(dent->d_name, "hisi_pcie%hu_core%hu", &sicl_id, &core_id) != 2)
+			continue;
+
+		/*
+		 * Driver will export the root port it can monitor through
+		 * the "bus" sysfs attribute.
+		 */
+		scnprintf(bus_path, sizeof(bus_path), "%s/hisi_pcie%hu_core%hu/bus",
+			  event_source, sicl_id, core_id);
+
+		/*
+		 * Per the PCIe spec the bus number is 8 bits wide; use unsigned
+		 * long long for the convenience of the library function.
+		 */
+		if (filename__read_ull(bus_path, &bus))
+			continue;
+
+		scnprintf(bus_path, sizeof(bus_path), "%s/hisi_pcie%hu_core%hu/bdf_max",
+			  event_source, sicl_id, core_id);
+		if (filename__read_xll(bus_path, &bdf_max))
+			bdf_max = -1;
+
+		scnprintf(bus_path, sizeof(bus_path), "%s/hisi_pcie%hu_core%hu/bdf_min",
+			  event_source, sicl_id, core_id);
+		if (filename__read_xll(bus_path, &bdf_min))
+			bdf_min = 0;
+
+		pr_debug3("Found pmu %s bus 0x%llx\n", dent->d_name, bus);
+
+		hisi_pcie_root_ports_add(sicl_id, core_id, (u8)bus, (u16)bdf_min, (u16)bdf_max);
+	}
+
+	closedir(dir);
+	return !list_empty(&hisi_pcie_root_ports_list) ? 0 : -ENOENT;
+}
+
+static void hisi_pcie_root_ports_free(void)
+{
+	struct hisi_pcie_root_port *rp, *tmp;
+
+	if (list_empty(&hisi_pcie_root_ports_list))
+		return;
+
+	list_for_each_entry_safe(rp, tmp, &hisi_pcie_root_ports_list, list) {
+		list_del(&rp->list);
+		zfree(&rp);
+	}
+}
+
+static int hisi_iostat_add_events(struct evlist *evl)
+{
+	struct hisi_pcie_root_port *rp;
+	struct evsel *evsel;
+	unsigned int i, j;
+	char *iostat_cmd;
+	int pos = 0;
+	int ret = -ENOENT; /* returned if no Root Port is selected */
+
+	if (list_empty(&hisi_pcie_root_ports_list))
+		return -ENOENT;
+
+	iostat_cmd = zalloc(PATH_MAX);
+	if (!iostat_cmd)
+		return -ENOMEM;
+
+	list_for_each_entry(rp, &hisi_pcie_root_ports_list, list) {
+		if (!rp->selected)
+			continue;
+
+		iostat_cmd[pos++] = '{';
+		for (j = 0; j < ARRAY_SIZE(hisi_iostat_cmd_template); j++) {
+			pos += snprintf(iostat_cmd + pos, PATH_MAX - pos - 1,
+					hisi_iostat_cmd_template[j],
+					rp->sicl_id, rp->core_id, rp->mask);
+
+			if (j == ARRAY_SIZE(hisi_iostat_cmd_template) - 1)
+				iostat_cmd[pos++] = '}';
+			else
+				iostat_cmd[pos++] = ',';
+		}
+
+		ret = parse_event(evl, iostat_cmd);
+		if (ret)
+			break;
+
+		i = 0;
+		evlist__for_each_entry_reverse(evl, evsel) {
+			if (i == ARRAY_SIZE(hisi_iostat_cmd_template))
+				break;
+
+			evsel->priv = rp;
+			i++;
+		}
+
+		memset(iostat_cmd, 0, PATH_MAX);
+		pos = 0;
+	}
+
+	zfree(&iostat_cmd);
+	return ret;
+}
+
+static int hisi_iostat_prepare(struct evlist *evlist,
+			       struct perf_stat_config *config)
+{
+	if (evlist->core.nr_entries > 0) {
+		pr_warning("The -e and -M options are not supported. "
+			   "All chosen events/metrics will be dropped\n");
+		evlist__delete(evlist);
+		evlist = evlist__new();
+		if (!evlist)
+			return -ENOMEM;
+	}
+
+	config->metric_only = true;
+	config->aggr_mode = AGGR_GLOBAL;
+
+	return hisi_iostat_add_events(evlist);
+}
+
+static int hisi_pcie_root_ports_list_filter(const char *str)
+{
+	char *tok, *tmp, *copy = NULL;
+	u8 bus, dev, fn;
+	u32 domain;
+	int ret = -EINVAL; /* an empty filter selects nothing */
+
+	copy = strdup(str);
+	if (!copy)
+		return -ENOMEM;
+
+	for (tok = strtok_r(copy, ",", &tmp); tok; tok = strtok_r(NULL, ",", &tmp)) {
+		ret = sscanf(tok, PCI_DEVICE_NAME_PATTERN, &domain, &bus, &dev, &fn);
+		if (ret != 4) {
+			ret = -EINVAL;
+			break;
+		}
+
+		ret = hisi_pcie_root_ports_select_one(domain, bus, dev, fn);
+		if (ret)
+			break;
+	}
+
+	zfree(&copy);
+	return ret;
+}
+
+static int hisi_iostat_parse(const struct option *opt, const char *str, int unset __maybe_unused)
+{
+	struct perf_stat_config *config = (struct perf_stat_config *)opt->data;
+	int ret;
+
+	ret = hisi_pcie_root_ports_init();
+	if (ret)
+		return ret;
+
+	config->iostat_run = true;
+
+	if (!str) {
+		iostat_mode = IOSTAT_RUN;
+		hisi_pcie_root_ports_select_all();
+	} else if (!strcmp(str, "list")) {
+		iostat_mode = IOSTAT_LIST;
+		hisi_pcie_root_ports_select_all();
+	} else {
+		iostat_mode = IOSTAT_RUN;
+		ret = hisi_pcie_root_ports_list_filter(str);
+	}
+
+	return ret;
+}
+
+static void hisi_pcie_root_port_show(FILE *output,
+				     const struct hisi_pcie_root_port * const rp)
+{
+	if (output && rp)
+		fprintf(output, "hisi_pcie%hu_core%hu<" PCI_DEVICE_NAME_PATTERN ">\n",
+			rp->sicl_id, rp->core_id, rp->domain, rp->bus, rp->dev, rp->fn);
+}
+
+static void hisi_iostat_list(struct evlist *evlist, struct perf_stat_config *config)
+{
+	struct hisi_pcie_root_port *rp = NULL;
+	struct evsel *evsel;
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (rp != evsel->priv) {
+			hisi_pcie_root_port_show(config->output, evsel->priv);
+			rp = evsel->priv;
+		}
+	}
+}
+
+static void hisi_iostat_release(struct evlist *evlist)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(evlist, evsel)
+		evsel->priv = NULL;
+
+	hisi_pcie_root_ports_free();
+}
+
+static void hisi_iostat_print_header_prefix(struct perf_stat_config *config)
+{
+	if (config->csv_output)
+		fputs("port,", config->output);
+	else if (config->interval)
+		fprintf(config->output, "#          time    port         ");
+	else
+		fprintf(config->output, "   port         ");
+}
+
+static void hisi_iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
+				     struct perf_stat_output_ctx *out)
+{
+	const char *iostat_metric = hisi_iostat_metrics[evsel->core.idx % ARRAY_SIZE(hisi_iostat_metrics)];
+	struct perf_counts_values *count;
+	double iostat_value;
+
+	/* We're using AGGR_GLOBAL, so there is only one aggregate: aggr[0]. */
+	count = &evsel->stats->aggr[0].counts;
+
+	/* The counts have been scaled, so we can use them directly. */
+	iostat_value = (double)count->val;
+
+	/*
+	 * Display two digits after decimal point for better accuracy if the
+	 * value is non-zero.
+	 */
+	out->print_metric(config, out->ctx, METRIC_THRESHOLD_UNKNOWN,
+			  iostat_value > 0 ? "%8.2f" : "%8.0f",
+			  iostat_metric, iostat_value / (256 * 1024));
+}
+
+static void hisi_iostat_prefix(struct evlist *evlist, struct perf_stat_config *config,
+			       char *prefix, struct timespec *ts)
+{
+	struct hisi_pcie_root_port *rp = evlist->selected->priv;
+
+	if (!rp)
+		return;
+
+	if (ts)
+		sprintf(prefix, "%6lu.%09lu%s" PCI_DEVICE_NAME_PATTERN "%s",
+			ts->tv_sec, ts->tv_nsec, config->csv_sep,
+			rp->domain, rp->bus, rp->dev, rp->fn,
+			config->csv_sep);
+	else
+		sprintf(prefix, PCI_DEVICE_NAME_PATTERN "%s",
+			rp->domain, rp->bus, rp->dev, rp->fn,
+			config->csv_sep);
+}
+
+static void hisi_iostat_print_counters(struct evlist *evlist, struct perf_stat_config *config,
+				       struct timespec *ts, char *prefix,
+				       iostat_print_counter_t print_cnt_cb, void *arg)
+{
+	struct evsel *counter = evlist__first(evlist);
+	void *perf_device;
+
+	evlist__set_selected(evlist, counter);
+	hisi_iostat_prefix(evlist, config, prefix, ts);
+	fprintf(config->output, "%s", prefix);
+	evlist__for_each_entry(evlist, counter) {
+		perf_device = evlist->selected->priv;
+		if (perf_device && perf_device != counter->priv) {
+			evlist__set_selected(evlist, counter);
+			hisi_iostat_prefix(evlist, config, prefix, ts);
+			fprintf(config->output, "\n%s", prefix);
+		}
+		print_cnt_cb(config, counter, arg);
+	}
+	fputc('\n', config->output);
+}
+
+static bool hisi_iostat_pmu_match(const struct perf_pmu *pmu, const char *wildcard)
+{
+	return !strncmp(pmu->name, wildcard, strlen(wildcard));
+}
+
+static int hisi_iostat_probe(struct iostat_pmu_list *iostat_pmu)
+{
+	return !perf_pmus__scan_matching(NULL, iostat_pmu->pmu_name, hisi_iostat_pmu_match);
+}
+
+static struct iostat_pmu_list hisi_iostat_pmu_list[] = {
+	{
+		.pmu_name = "hisi_pcie",
+		.probe = hisi_iostat_probe,
+		.prepare = hisi_iostat_prepare,
+		.parse = hisi_iostat_parse,
+		.list = hisi_iostat_list,
+		.print_header_prefix = hisi_iostat_print_header_prefix,
+		.print_metric = hisi_iostat_print_metric,
+		.print_counters = hisi_iostat_print_counters,
+		.release = hisi_iostat_release,
+	},
+};
+
+static void __attribute__((constructor)) hisi_iostat_pmu_init(void)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(hisi_iostat_pmu_list); i++)
+		register_iostat_pmu(&hisi_iostat_pmu_list[i]);
+}
-- 
2.33.0




* Re: [RFT PATCH 5/7] perf-iostat: Support wilder wildcard-match for pmus
  2026-01-26 12:35 ` [RFT PATCH 5/7] perf-iostat: Support wilder wildcard-match for pmus Yushan Wang
@ 2026-01-26 16:44   ` Ian Rogers
  2026-01-29 15:14     ` wangyushan
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2026-01-26 16:44 UTC (permalink / raw)
  To: Yushan Wang
  Cc: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, peterz, john.g.garry,
	Jonathan.Cameron, shiju.jose, will, linux-perf-users,
	linux-arm-kernel, linuxarm, liuyonglong, prime.zeng, fanghao11,
	wangzhou1

On Mon, Jan 26, 2026 at 4:35 AM Yushan Wang <wangyushan12@huawei.com> wrote:
>
> Current wildcard matching of pmu names only support the form of
> "<pmu_name>%d", which may not be sufficient for pmus with other forms of
> name (e.g. HiSilicon PCIe PMU has the name of "hisi_pcie%d_pmu%d").
>
> To address that, change the wildcard matching function into a callback,
> and add a new version of wildcard-matching function using the callback
> to support more flexible pmu names.

There's some discussion around PMU name matching here (and the replies):
https://lore.kernel.org/lkml/20251231224233.113839-12-zide.chen@intel.com/
that I think is relevant.

> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
> ---
>  tools/perf/util/pmus.c | 12 +++++++++---
>  tools/perf/util/pmus.h |  3 +++
>  2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index 98be2eb8f1f0..35184d477d07 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -402,7 +402,8 @@ struct perf_pmu *perf_pmus__scan_for_event(struct perf_pmu *pmu, const char *eve
>         return NULL;
>  }
>
> -struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard)
> +struct perf_pmu *perf_pmus__scan_matching(struct perf_pmu *pmu, const char *wildcard,
> +                                         perf_pmus_match_t match)
>  {
>         bool use_core_pmus = !pmu || pmu->is_core;
>
> @@ -436,19 +437,24 @@ struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const c
>         }
>         if (use_core_pmus) {
>                 list_for_each_entry_continue(pmu, &core_pmus, list) {
> -                       if (perf_pmu__wildcard_match(pmu, wildcard))
> +                       if (match(pmu, wildcard))
>                                 return pmu;
>                 }
>                 pmu = NULL;
>                 pmu = list_prepare_entry(pmu, &other_pmus, list);
>         }
>         list_for_each_entry_continue(pmu, &other_pmus, list) {
> -               if (perf_pmu__wildcard_match(pmu, wildcard))
> +               if (match(pmu, wildcard))
>                         return pmu;
>         }
>         return NULL;
>  }
>
> +struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard)
> +{
> +       return perf_pmus__scan_matching(pmu, wildcard, perf_pmu__wildcard_match);
> +}
> +
>  static struct perf_pmu *perf_pmus__scan_skip_duplicates(struct perf_pmu *pmu)
>  {
>         bool use_core_pmus = !pmu || pmu->is_core;
> diff --git a/tools/perf/util/pmus.h b/tools/perf/util/pmus.h
> index 7cb36863711a..9308afb5a7b8 100644
> --- a/tools/perf/util/pmus.h
> +++ b/tools/perf/util/pmus.h
> @@ -9,6 +9,8 @@ struct perf_event_attr;
>  struct perf_pmu;
>  struct print_callbacks;
>
> +typedef _Bool (*perf_pmus_match_t)(const struct perf_pmu *, const char *);

nit: can we use stdbool.h and make this return a bool?
nit: in general Linux style is to avoid typedefs:
https://www.kernel.org/doc/html/v4.10/process/coding-style.html#typedefs
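
To illustrate both nits together, here is a rough sketch rather than a
drop-in replacement. It assumes a stand-in struct perf_pmu with only a
name, and walks a plain array instead of the core_pmus/other_pmus lists;
scan_matching() and prefix_match() are hypothetical names. The callback
type is spelled out in the parameter list and returns bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for perf's struct perf_pmu; only the name is needed here. */
struct perf_pmu {
	const char *name;
};

/* Prefix match, in the spirit of hisi_iostat_pmu_match() from patch 7/7. */
static bool prefix_match(const struct perf_pmu *pmu, const char *wildcard)
{
	return !strncmp(pmu->name, wildcard, strlen(wildcard));
}

/*
 * Hypothetical scan helper: the callback is declared inline, without a
 * typedef, and returns bool from <stdbool.h>.
 */
static const struct perf_pmu *
scan_matching(const struct perf_pmu *pmus, size_t nr, const char *wildcard,
	      bool (*match)(const struct perf_pmu *pmu, const char *wildcard))
{
	assert(match != NULL);

	for (size_t i = 0; i < nr; i++) {
		if (match(&pmus[i], wildcard))
			return &pmus[i];
	}
	return NULL;
}
```

The inline declaration is longer at each use, which is the usual trade-off
against a typedef when the callback appears in more than one prototype.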

Thanks,
Ian

> +
>  size_t pmu_name_len_no_suffix(const char *str);
>  /* Exposed for testing only. */
>  int pmu_name_cmp(const char *lhs_pmu_name, const char *rhs_pmu_name);
> @@ -22,6 +24,7 @@ struct perf_pmu *perf_pmus__find_by_attr(const struct perf_event_attr *attr);
>  struct perf_pmu *perf_pmus__scan(struct perf_pmu *pmu);
>  struct perf_pmu *perf_pmus__scan_core(struct perf_pmu *pmu);
>  struct perf_pmu *perf_pmus__scan_for_event(struct perf_pmu *pmu, const char *event);
> +struct perf_pmu *perf_pmus__scan_matching(struct perf_pmu *pmu, const char *wildcard, perf_pmus_match_t match);
>  struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard);
>
>  const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str);
> --
> 2.33.0
>



* Re: [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms
  2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
                   ` (6 preceding siblings ...)
  2026-01-26 12:35 ` [RFT PATCH 7/7] perf-iostat: Enable iostat mode for HiSilicon PCIe PMU Yushan Wang
@ 2026-01-26 17:01 ` Ian Rogers
  2026-01-29 15:14   ` wangyushan
  7 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2026-01-26 17:01 UTC (permalink / raw)
  To: Yushan Wang
  Cc: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, peterz, john.g.garry,
	Jonathan.Cameron, shiju.jose, will, linux-perf-users,
	linux-arm-kernel, linuxarm, liuyonglong, prime.zeng, fanghao11,
	wangzhou1

On Mon, Jan 26, 2026 at 4:35 AM Yushan Wang <wangyushan12@huawei.com> wrote:
>
> Currently, platform-specific iostat code for PMUs is implemented as a
> common iostat callback interface and invoked based on what is being
> built. This approach limits support for iostat across different types of
> PMUs.
>
> Support of HiSilicon PCIe PMU iostat was raised at [1], which uses the
> similar approach.
>
> To extend support of iostat across platforms, change common iostat
> interface to framework to allow perf to probe PMU capabilities during
> runtime and route iostat request to the correct PMU-specific functions.
> Then HiSilicon PCIe PMU iostat is supported with the new framework.
>
> Request For Test:
> Refactors has been made to x86 iostat to adapt the iostat framework, the
> probe function that checks if there's any PMU's name contains 'x86-iio'
> may not work properly, tests of that would be appreciated.
>
> [1] https://lore.kernel.org/all/4688a613-c94a-49b0-9d0f-09173c64082d@arm.com/
>
> Shiju Jose (2):
>   perf-iostat: Extend iostat interface to support different iostat PMUs
>   perf-iostat: Make x86 iostat compatible with new iostat framework
>
> Yicong Yang (1):
>   perf-iostat: Enable iostat mode for HiSilicon PCIe PMU
>
> Yushan Wang (4):
>   perf stat: Check color's length instead of the pointer
>   perf stat: Save unnecessary print_metric() call
>   perf-x86: iostat: Change iostat_prefix() to static
>   perf-iostat: Support wilder wildcard-match for pmus

Hi,

Thanks for the changes! Given the iostat code is display code, it'd be
great if we could avoid the arch directory usage. For example, I may
record data on a machine with a HiSilicon PCIe PMU but then want to
look at the data on an x86 laptop, as the code is under arch/arm64 it
won't get built. Given the PMU name is unique, is it possible to put
this code into tools/perf/util and determine whether to use it or not
by seeing if the PMU is present by its name? I know that means
refactoring the x86 code that hasn't done this.
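
Roughly what I have in mind, as a sketch only: a table of backends keyed by
PMU name prefix, built on every architecture, with the backend picked from
whatever PMU names are available (sysfs when recording, the perf.data
header when reporting). Everything here is hypothetical, including
struct iostat_backend and the "uncore_iio" prefix used for the x86 entry;
only the "hisi_pcie" prefix comes from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend descriptor; a real one would carry the callbacks. */
struct iostat_backend {
	const char *pmu_prefix;	/* PMU name prefix that selects this backend */
	const char *label;	/* stands in for the callback table */
};

static const struct iostat_backend backends[] = {
	{ "hisi_pcie", "HiSilicon PCIe PMU" },
	{ "uncore_iio", "x86 IIO PMU (prefix is illustrative)" },
};

/* Return the first backend whose prefix matches any PMU name, or NULL. */
static const struct iostat_backend *
iostat_select_backend(const char * const *pmu_names, size_t nr_names)
{
	assert(pmu_names != NULL || nr_names == 0);

	for (size_t b = 0; b < sizeof(backends) / sizeof(backends[0]); b++) {
		size_t len = strlen(backends[b].pmu_prefix);

		for (size_t n = 0; n < nr_names; n++) {
			if (!strncmp(pmu_names[n], backends[b].pmu_prefix, len))
				return &backends[b];
		}
	}
	return NULL;
}
```

With the selection driven purely by names, nothing in the display path
depends on the architecture perf was built for.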

A thought that is away from the iostat code and may simplify the code
base is that we could introduce system PMU, rather than CPU, dependent
metrics. Perhaps the iostat code could be replaced by a particular
metric group like Default or TopdownL1, so `perf iostat` becomes
something of a synonym for `perf stat -M iostat`. An example of what
this may look like is (unmerged):
https://lore.kernel.org/lkml/20260108191105.695131-34-irogers@google.com/
where I introduce a memory bandwidth saturation metric dependent on
uncore PMUs (either cbox or cha) on Intel. There is a metric
"have_event" function that can be used to make a metric conditional.
If you `perf stat record` a metric today and then look at the
perf.data with `perf script --fields metric` (IIRC) then every metric
that has the recorded events within it will be computed, although the
code needs better testing, etc. I suspect the biggest downside to this
approach is in how the output looks, but perhaps that can be tweaked.
For example, the `ilist.py` command:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/python/ilist.py?h=perf-tools-next
supports regular metrics.

That said, I'm not against the changes, the arch directory usage, not
using json metrics, etc. and this change is doing clean up, following
existing patterns, etc. Sometimes the codebase has evolved but certain
commands haven't kept up. I think `perf iostat` is a command like
that.

Thanks,
Ian

>  tools/perf/arch/arm64/util/Build         |   1 +
>  tools/perf/arch/arm64/util/hisi-iostat.c | 479 +++++++++++++++++++++++
>  tools/perf/arch/x86/util/iostat.c        | 105 +++--
>  tools/perf/builtin-script.c              |   2 +-
>  tools/perf/util/iostat.c                 |  79 ++--
>  tools/perf/util/iostat.h                 |  21 +-
>  tools/perf/util/pmus.c                   |  12 +-
>  tools/perf/util/pmus.h                   |   3 +
>  tools/perf/util/stat-display.c           |   4 +-
>  tools/perf/util/stat-shadow.c            |   4 +-
>  10 files changed, 638 insertions(+), 72 deletions(-)
>  create mode 100644 tools/perf/arch/arm64/util/hisi-iostat.c
>
> --
> 2.33.0
>



* Re: [RFT PATCH 1/7] perf stat: Check color's length instead of the pointer
  2026-01-26 12:35 ` [RFT PATCH 1/7] perf stat: Check color's length instead of the pointer Yushan Wang
@ 2026-01-27 15:58   ` Jonathan Cameron
  2026-01-29 15:19     ` wangyushan
  0 siblings, 1 reply; 16+ messages in thread
From: Jonathan Cameron @ 2026-01-27 15:58 UTC (permalink / raw)
  To: Yushan Wang
  Cc: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, shiju.jose, will, linux-perf-users,
	linux-arm-kernel, linuxarm, liuyonglong, prime.zeng, fanghao11,
	wangzhou1

On Mon, 26 Jan 2026 20:35:08 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> Color string returned by metric_threshold_classify__color() is never
> NULL, check the presence of *color will always return true.
> 
> Fix this by change the checks against length of *color.
> 
> Fixes: 37b77ae95416 ("perf stat: Change color to threshold in print_metric")
> 
No blank line between tags.

Fixes: 37b77ae95416 ("perf stat: Change color to threshold in print_metric")
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>

Some of the scripting run against the kernel uses that lack of blank lines
as a heuristic to identify what is a tag vs other similar looking text.

Thanks,

Jonathan

> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
> ---
>  tools/perf/builtin-script.c    | 2 +-
>  tools/perf/util/stat-display.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 62e43d3c5ad7..9fe90f564c69 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2100,7 +2100,7 @@ static void script_print_metric(struct perf_stat_config *config __maybe_unused,
>  	perf_sample__fprintf_start(NULL, mctx->sample, mctx->thread, mctx->evsel,
>  				   PERF_RECORD_SAMPLE, mctx->fp);
>  	fputs("\tmetric: ", mctx->fp);
> -	if (color)
> +	if (strlen(color))
>  		color_fprintf(mctx->fp, color, fmt, val);
>  	else
>  		printf(fmt, val);
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 6d02f84c5691..91c0c1020f4e 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -474,7 +474,7 @@ static void print_metric_std(struct perf_stat_config *config,
>  		do_new_line_std(config, os);
>  
>  	n = fprintf(out, " # ");
> -	if (color)
> +	if (strlen(color))
>  		n += color_fprintf(out, color, fmt, val);
>  	else
>  		n += fprintf(out, fmt, val);
> @@ -607,7 +607,7 @@ static void print_metric_only(struct perf_stat_config *config,
>  	if (mlen < strlen(unit))
>  		mlen = strlen(unit) + 1;
>  
> -	if (color)
> +	if (strlen(color))
>  		mlen += strlen(color) + sizeof(PERF_COLOR_RESET) - 1;
>  
>  	color_snprintf(str, sizeof(str), color ?: "", fmt ?: "", val);




* Re: [RFT PATCH 2/7] perf stat: Save unnecessary print_metric() call
  2026-01-26 12:35 ` [RFT PATCH 2/7] perf stat: Save unnecessary print_metric() call Yushan Wang
@ 2026-01-27 16:01   ` Jonathan Cameron
  2026-01-29 15:17     ` wangyushan
  0 siblings, 1 reply; 16+ messages in thread
From: Jonathan Cameron @ 2026-01-27 16:01 UTC (permalink / raw)
  To: Yushan Wang
  Cc: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, shiju.jose, will, linux-perf-users,
	linux-arm-kernel, linuxarm, liuyonglong, prime.zeng, fanghao11,
	wangzhou1

On Mon, 26 Jan 2026 20:35:09 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> Patch [1] removed the second branch of iostat_run, and changed num to 0
> since it is the default behavior. But during iostat_run, default value 1
> of num is required to avoid print_metric() call later.
> 
> Set num as 1 to avoid redundant print_metric() call that causes
> unaligned blank printed.
> 
> Fixes: b71f46a6a708 ("perf stat: Remove hard coded shadow metrics")
> 
> [1]: https://lore.kernel.org/all/20251111212206.631711-8-irogers@google.com/

Use a Link tag for links, but don't add them when they just point to the patch
you already have as a fixes tag.  Just refer to it as "The patch listed under
Fixes".
> 
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
> ---
>  tools/perf/util/stat-shadow.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> index 9c83f7d96caa..9439baf8002f 100644
> --- a/tools/perf/util/stat-shadow.c
> +++ b/tools/perf/util/stat-shadow.c
> @@ -319,8 +319,10 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  	void *ctxp = out->ctx;
>  	int num = 0;
>  
> -	if (config->iostat_run)
> +	if (config->iostat_run) {
>  		iostat_print_metric(config, evsel, out);
> +		num = 1;
> +	}
>  
>  	perf_stat__print_shadow_stats_metricgroup(config, evsel, aggr_idx,
>  						  &num, NULL, out);




* Re: [RFT PATCH 5/7] perf-iostat: Support wilder wildcard-match for pmus
  2026-01-26 16:44   ` Ian Rogers
@ 2026-01-29 15:14     ` wangyushan
  0 siblings, 0 replies; 16+ messages in thread
From: wangyushan @ 2026-01-29 15:14 UTC (permalink / raw)
  To: Ian Rogers
  Cc: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, peterz, john.g.garry,
	Jonathan.Cameron, shiju.jose, will, linux-perf-users,
	linux-arm-kernel, linuxarm, liuyonglong, prime.zeng, fanghao11,
	wangzhou1, Yushan Wang



On 1/27/2026 12:44 AM, Ian Rogers wrote:
> On Mon, Jan 26, 2026 at 4:35 AM Yushan Wang <wangyushan12@huawei.com> wrote:
>> Current wildcard matching of pmu names only support the form of
>> "<pmu_name>%d", which may not be sufficient for pmus with other forms of
>> name (e.g. HiSilicon PCIe PMU has the name of "hisi_pcie%d_pmu%d").
>>
>> To address that, change the wildcard matching function into a callback,
>> and add a new version of wildcard-matching function using the callback
>> to support more flexible pmu names.
> There's some discussion around PMU name matching here (and the replies):
> https://lore.kernel.org/lkml/20251231224233.113839-12-zide.chen@intel.com/
> that I think is relevant.
The complexity of name matching is unfortunate. Do you think adding a
callback for this specific use is a good idea? Or should the wildcard
matching be split more thoroughly, so that the match logic for core and
uncore PMUs lives in separate functions?
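To make the naming problem concrete, here is a minimal sketch of a matcher
for names shaped like "hisi_pcie%d_pmu%d". The helper names
(skip_digits, name_matches_two_index) and the prefix/infix split are
assumptions for illustration only; they are not taken from the patch or
from pmus.c.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Skip one or more decimal digits; return NULL if none are present. */
static const char *skip_digits(const char *s)
{
	if (!isdigit((unsigned char)*s))
		return NULL;
	while (isdigit((unsigned char)*s))
		s++;
	return s;
}

/*
 * Hypothetical matcher: true if name is "<prefix><N>_<infix><M>",
 * e.g. prefix "hisi_pcie" and infix "pmu" match "hisi_pcie0_pmu1".
 */
static bool name_matches_two_index(const char *name, const char *prefix,
				   const char *infix)
{
	size_t plen = strlen(prefix), ilen = strlen(infix);

	if (strncmp(name, prefix, plen))
		return false;
	name = skip_digits(name + plen);
	if (!name || *name != '_')
		return false;
	name++;
	if (strncmp(name, infix, ilen))
		return false;
	name = skip_digits(name + ilen);
	return name && *name == '\0';
}
```

A name with a single trailing index, or with a missing index, is rejected,
which is the case the "<pmu_name>%d" form cannot express.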
>> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
>> ---
>>  tools/perf/util/pmus.c | 12 +++++++++---
>>  tools/perf/util/pmus.h |  3 +++
>>  2 files changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
>> index 98be2eb8f1f0..35184d477d07 100644
>> --- a/tools/perf/util/pmus.c
>> +++ b/tools/perf/util/pmus.c
>> @@ -402,7 +402,8 @@ struct perf_pmu *perf_pmus__scan_for_event(struct perf_pmu *pmu, const char *eve
>>         return NULL;
>>  }
>>
>> -struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard)
>> +struct perf_pmu *perf_pmus__scan_matching(struct perf_pmu *pmu, const char *wildcard,
>> +                                         perf_pmus_match_t match)
>>  {
>>         bool use_core_pmus = !pmu || pmu->is_core;
>>
>> @@ -436,19 +437,24 @@ struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const c
>>         }
>>         if (use_core_pmus) {
>>                 list_for_each_entry_continue(pmu, &core_pmus, list) {
>> -                       if (perf_pmu__wildcard_match(pmu, wildcard))
>> +                       if (match(pmu, wildcard))
>>                                 return pmu;
>>                 }
>>                 pmu = NULL;
>>                 pmu = list_prepare_entry(pmu, &other_pmus, list);
>>         }
>>         list_for_each_entry_continue(pmu, &other_pmus, list) {
>> -               if (perf_pmu__wildcard_match(pmu, wildcard))
>> +               if (match(pmu, wildcard))
>>                         return pmu;
>>         }
>>         return NULL;
>>  }
>>
>> +struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard)
>> +{
>> +       return perf_pmus__scan_matching(pmu, wildcard, perf_pmu__wildcard_match);
>> +}
>> +
>>  static struct perf_pmu *perf_pmus__scan_skip_duplicates(struct perf_pmu *pmu)
>>  {
>>         bool use_core_pmus = !pmu || pmu->is_core;
>> diff --git a/tools/perf/util/pmus.h b/tools/perf/util/pmus.h
>> index 7cb36863711a..9308afb5a7b8 100644
>> --- a/tools/perf/util/pmus.h
>> +++ b/tools/perf/util/pmus.h
>> @@ -9,6 +9,8 @@ struct perf_event_attr;
>>  struct perf_pmu;
>>  struct print_callbacks;
>>
>> +typedef _Bool (*perf_pmus_match_t)(const struct perf_pmu *, const char *);
> nit: can we use stdbool.h and make this return a bool?
> nit: in general Linux style is to avoid typedefs:
> https://www.kernel.org/doc/html/v4.10/process/coding-style.html#typedefs

Yes, thanks for pointing these out. I will fix them.
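For reference, a rough sketch of how the callback could look with both nits
applied: the function pointer type is written out in the parameter list
instead of a typedef, and it returns bool from stdbool.h. The struct and
the array-based scan_matching() below are simplified stand-ins, not the
real perf code, which walks the core_pmus and other_pmus lists.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for perf's struct perf_pmu; only the name is modelled here. */
struct perf_pmu {
	const char *name;
};

/* Example matcher: exact name comparison. */
static bool match_exact(const struct perf_pmu *pmu, const char *pattern)
{
	return strcmp(pmu->name, pattern) == 0;
}

/*
 * Hypothetical scan over a plain array (perf itself walks linked lists).
 * The callback type is spelled out in the parameter list, so no typedef
 * is needed, and it returns bool from <stdbool.h> rather than _Bool.
 */
static const struct perf_pmu *
scan_matching(const struct perf_pmu *pmus, size_t nr, const char *pattern,
	      bool (*match)(const struct perf_pmu *pmu, const char *pattern))
{
	for (size_t i = 0; i < nr; i++) {
		if (match(&pmus[i], pattern))
			return &pmus[i];
	}
	return NULL;
}
```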

>
> Thanks,
> Ian

Thanks, 
Yushan
>
>> +
>>  size_t pmu_name_len_no_suffix(const char *str);
>>  /* Exposed for testing only. */
>>  int pmu_name_cmp(const char *lhs_pmu_name, const char *rhs_pmu_name);
>> @@ -22,6 +24,7 @@ struct perf_pmu *perf_pmus__find_by_attr(const struct perf_event_attr *attr);
>>  struct perf_pmu *perf_pmus__scan(struct perf_pmu *pmu);
>>  struct perf_pmu *perf_pmus__scan_core(struct perf_pmu *pmu);
>>  struct perf_pmu *perf_pmus__scan_for_event(struct perf_pmu *pmu, const char *event);
>> +struct perf_pmu *perf_pmus__scan_matching(struct perf_pmu *pmu, const char *wildcard, perf_pmus_match_t match);
>>  struct perf_pmu *perf_pmus__scan_matching_wildcard(struct perf_pmu *pmu, const char *wildcard);
>>
>>  const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str);
>> --
>> 2.33.0
>>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms
  2026-01-26 17:01 ` [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Ian Rogers
@ 2026-01-29 15:14   ` wangyushan
  0 siblings, 0 replies; 16+ messages in thread
From: wangyushan @ 2026-01-29 15:14 UTC (permalink / raw)
  To: Ian Rogers
  Cc: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, peterz, john.g.garry,
	Jonathan.Cameron, shiju.jose, will, linux-perf-users,
	linux-arm-kernel, linuxarm, liuyonglong, prime.zeng, fanghao11,
	wangzhou1, Yushan Wang



On 1/27/2026 1:01 AM, Ian Rogers wrote:
> On Mon, Jan 26, 2026 at 4:35 AM Yushan Wang <wangyushan12@huawei.com> wrote:
>> Currently, platform-specific iostat code for PMUs is implemented as a
>> common iostat callback interface and invoked based on what is being
>> built. This approach limits support for iostat across different types of
>> PMUs.
>>
>> Support of HiSilicon PCIe PMU iostat was raised at [1], which uses the
>> similar approach.
>>
>> To extend support of iostat across platforms, change common iostat
>> interface to framework to allow perf to probe PMU capabilities during
>> runtime and route iostat request to the correct PMU-specific functions.
>> Then HiSilicon PCIe PMU iostat is supported with the new framework.
>>
>> Request For Test:
>> Refactors has been made to x86 iostat to adapt the iostat framework, the
>> probe function that checks if there's any PMU's name contains 'x86-iio'
>> may not work properly, tests of that would be appreciated.
>>
>> [1] https://lore.kernel.org/all/4688a613-c94a-49b0-9d0f-09173c64082d@arm.com/
>>
>> Shiju Jose (2):
>>   perf-iostat: Extend iostat interface to support different iostat PMUs
>>   perf-iostat: Make x86 iostat compatible with new iostat framework
>>
>> Yicong Yang (1):
>>   perf-iostat: Enable iostat mode for HiSilicon PCIe PMU
>>
>> Yushan Wang (4):
>>   perf stat: Check color's length instead of the pointer
>>   perf stat: Save unnecessary print_metric() call
>>   perf-x86: iostat: Change iostat_prefix() to static
>>   perf-iostat: Support wilder wildcard-match for pmus
> Hi,
>
> Thanks for the changes! Given the iostat code is display code, it'd be
> great if we could avoid the arch directory usage. For example, I may
> record data on a machine with a HiSilicon PCIe PMU but then want to
> look at the data on an x86 laptop, as the code is under arch/arm64 it
> won't get built. Given the PMU name is unique, is it possible to put
> this code into tools/perf/util and determine whether to use it or not
> by seeing if the PMU is present by its name? I know that means
> refactoring the x86 code that hasn't done this.

Yes, iostat itself does little more than rearrange the display format.
I can try to move that code into the util directory, though x86 iostat
may then need even more testing :)
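As a rough sketch of what selecting the iostat code by PMU name at runtime
could look like, assuming a table of backends keyed by a name prefix. The
struct, the table and iostat_select_backend() are hypothetical, and the
"uncore_iio" prefix for x86 is an assumption; only "hisi_pcie" comes from
this series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical backend descriptor: which PMU name prefix it handles. */
struct iostat_backend {
	const char *label;
	const char *pmu_prefix;
};

static const struct iostat_backend backends[] = {
	{ "x86",  "uncore_iio" },	/* assumed prefix */
	{ "hisi", "hisi_pcie" },
};

/* Return the first backend whose prefix matches any PMU name, or NULL. */
static const struct iostat_backend *
iostat_select_backend(const char *const *pmu_names, size_t nr)
{
	size_t nr_backends = sizeof(backends) / sizeof(backends[0]);

	for (size_t b = 0; b < nr_backends; b++) {
		size_t plen = strlen(backends[b].pmu_prefix);

		for (size_t i = 0; i < nr; i++) {
			if (!strncmp(pmu_names[i], backends[b].pmu_prefix, plen))
				return &backends[b];
		}
	}
	return NULL;
}
```

With something like this, the choice depends only on which PMU names are
seen, not on the architecture perf was built for.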

> A thought that is away from the iostat code and may simplify the code
> base is that we could introduce system PMU, rather than CPU, dependent
> metrics. Perhaps the iostat code could be replaced by a particular
> metric group like Default or TopdownL1, so `perf iostat` becomes
> something of a synonym for `perf stat -M iostat`. An example of what
> this may look like is (unmerged):
> https://lore.kernel.org/lkml/20260108191105.695131-34-irogers@google.com/
> where I introduce a memory bandwidth saturation metric dependent on
> uncore PMUs (either cbox or cha) on Intel. There is a metric
> "have_event" function that can be used to make a metric conditional.
> If you `perf stat record` a metric today and then look at the
> perf.data with `perf script --fields metric` (IIRC) then every metric
> that has the recorded events within it will be computed, although the
> code needs better testing, etc. I suspect the biggest downside to this
> approach is in how the output looks, but perhaps that can be tweaked.
> For example, the `ilist.py` command:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/python/ilist.py?h=perf-tools-next
> supports regular metrics.
Since iostat is a small sub-command that involves only a few events,
it would be great to have some of them, such as PCIe bandwidth,
extracted as system metrics.

However, the variety of BDF filtering and display options may be hard to
fit into the way metrics work. How about leaving iostat as is for now
and perhaps merging it into the system metric infrastructure later?

> That said, I'm not against the changes, the arch directory usage, not
> using json metrics, etc. and this change is doing clean up, following
> existing patterns, etc. Sometimes the codebase has evolved but certain
> commands haven't kept up. I think `perf iostat` is a command like
> that.

And we are trying to help it catch up!

> Thanks,
> Ian

Thanks for the feedback!
Yushan
>>  tools/perf/arch/arm64/util/Build         |   1 +
>>  tools/perf/arch/arm64/util/hisi-iostat.c | 479 +++++++++++++++++++++++
>>  tools/perf/arch/x86/util/iostat.c        | 105 +++--
>>  tools/perf/builtin-script.c              |   2 +-
>>  tools/perf/util/iostat.c                 |  79 ++--
>>  tools/perf/util/iostat.h                 |  21 +-
>>  tools/perf/util/pmus.c                   |  12 +-
>>  tools/perf/util/pmus.h                   |   3 +
>>  tools/perf/util/stat-display.c           |   4 +-
>>  tools/perf/util/stat-shadow.c            |   4 +-
>>  10 files changed, 638 insertions(+), 72 deletions(-)
>>  create mode 100644 tools/perf/arch/arm64/util/hisi-iostat.c
>>
>> --
>> 2.33.0
>>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFT PATCH 2/7] perf stat: Save unnecessary print_metric() call
  2026-01-27 16:01   ` Jonathan Cameron
@ 2026-01-29 15:17     ` wangyushan
  0 siblings, 0 replies; 16+ messages in thread
From: wangyushan @ 2026-01-29 15:17 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, shiju.jose, will, linux-perf-users,
	linux-arm-kernel, linuxarm, liuyonglong, prime.zeng, fanghao11,
	wangzhou1, Yushan Wang



On 1/28/2026 12:01 AM, Jonathan Cameron wrote:
> On Mon, 26 Jan 2026 20:35:09 +0800
> Yushan Wang <wangyushan12@huawei.com> wrote:
>
>> Patch [1] removed the second branch of iostat_run, and changed num to 0
>> since it is the default behavior. But during iostat_run, default value 1
>> of num is required to avoid print_metric() call later.
>>
>> Set num as 1 to avoid redundant print_metric() call that causes
>> unaligned blank printed.
>>
>> Fixes: b71f46a6a708 ("perf stat: Remove hard coded shadow metrics")
>>
>> [1]: https://lore.kernel.org/all/20251111212206.631711-8-irogers@google.com/
> Use a Link tag for links, but don't add them when they just point to the patch
> you already have as a fixes tag.  Just refer to it as "The patch listed under
> Fixes".

Yes, I will fix that in the next version. Thanks for pointing it out.

Yushan

>> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
>> ---
>>  tools/perf/util/stat-shadow.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
>> index 9c83f7d96caa..9439baf8002f 100644
>> --- a/tools/perf/util/stat-shadow.c
>> +++ b/tools/perf/util/stat-shadow.c
>> @@ -319,8 +319,10 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>>  	void *ctxp = out->ctx;
>>  	int num = 0;
>>  
>> -	if (config->iostat_run)
>> +	if (config->iostat_run) {
>>  		iostat_print_metric(config, evsel, out);
>> +		num = 1;
>> +	}
>>  
>>  	perf_stat__print_shadow_stats_metricgroup(config, evsel, aggr_idx,
>>  						  &num, NULL, out);
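A simplified model of why num matters here, going only by the commit
message above: a blank placeholder metric is printed later when num is
still 0. All names below are stand-ins for illustration, not the real
stat-shadow.c functions.

```c
#include <stdbool.h>

static int blank_metrics_printed;

/* Stand-in for the fallback that emits an empty metric column. */
static void print_blank_metric(void)
{
	blank_metrics_printed++;
}

/* Stand-in for iostat_print_metric(): pretend it prints one metric. */
static void iostat_print_metric_stub(void)
{
}

/*
 * Simplified model of the flow in the patch: num counts metrics already
 * printed, and a blank placeholder is emitted only when num is still 0.
 */
static void print_shadow_stats_model(bool iostat_run)
{
	int num = 0;

	if (iostat_run) {
		iostat_print_metric_stub();
		num = 1;	/* the fix: record that a metric was printed */
	}

	if (num == 0)
		print_blank_metric();
}
```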



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFT PATCH 1/7] perf stat: Check color's length instead of the pointer
  2026-01-27 15:58   ` Jonathan Cameron
@ 2026-01-29 15:19     ` wangyushan
  0 siblings, 0 replies; 16+ messages in thread
From: wangyushan @ 2026-01-29 15:19 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: mike.leach, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, peterz,
	john.g.garry, shiju.jose, will, linux-perf-users,
	linux-arm-kernel, linuxarm, liuyonglong, prime.zeng, fanghao11,
	wangzhou1, Yushan Wang



On 1/27/2026 11:58 PM, Jonathan Cameron wrote:
> On Mon, 26 Jan 2026 20:35:08 +0800
> Yushan Wang <wangyushan12@huawei.com> wrote:
>
>> Color string returned by metric_threshold_classify__color() is never
>> NULL, check the presence of *color will always return true.
>>
>> Fix this by change the checks against length of *color.
>>
>> Fixes: 37b77ae95416 ("perf stat: Change color to threshold in print_metric")
>>
> No blank line between tags.
>
> Fixes: 37b77ae95416 ("perf stat: Change color to threshold in print_metric")
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
>
> Some of the scripting run against the kernel uses that lack of blank lines
> as a heuristic to identify what is a tag vs other similar looking text.

Thanks for the reminder, I will fix it in the next version.
> Thanks,
>
> Jonathan
>
>> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
>> ---
>>  tools/perf/builtin-script.c    | 2 +-
>>  tools/perf/util/stat-display.c | 4 ++--
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>>
[...]

Thanks!
Yushan
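For readers skimming the thread, a small sketch of the difference between
the two checks discussed in this patch. classify_color() is a hypothetical
stand-in for a classifier that never returns NULL and uses an empty string
for "no color"; the escape sequences are ordinary ANSI codes picked for the
example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classifier: never returns NULL; "" means "no color". */
static const char *classify_color(int threshold)
{
	switch (threshold) {
	case 1:
		return "\033[31m";	/* red */
	case 2:
		return "\033[32m";	/* green */
	default:
		return "";
	}
}

/* Pointer test: always true for this classifier, so it cannot gate coloring. */
static bool has_color_by_pointer(const char *color)
{
	return color != NULL;
}

/* Length test: true only when the string is non-empty. */
static bool has_color_by_length(const char *color)
{
	return color[0] != '\0';
}
```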


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2026-01-29 15:20 UTC | newest]

Thread overview: 16+ messages
2026-01-26 12:35 [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Yushan Wang
2026-01-26 12:35 ` [RFT PATCH 1/7] perf stat: Check color's length instead of the pointer Yushan Wang
2026-01-27 15:58   ` Jonathan Cameron
2026-01-29 15:19     ` wangyushan
2026-01-26 12:35 ` [RFT PATCH 2/7] perf stat: Save unnecessary print_metric() call Yushan Wang
2026-01-27 16:01   ` Jonathan Cameron
2026-01-29 15:17     ` wangyushan
2026-01-26 12:35 ` [RFT PATCH 3/7] perf-x86: iostat: Change iostat_prefix() to static Yushan Wang
2026-01-26 12:35 ` [RFT PATCH 4/7] perf-iostat: Extend iostat interface to support different iostat PMUs Yushan Wang
2026-01-26 12:35 ` [RFT PATCH 5/7] perf-iostat: Support wilder wildcard-match for pmus Yushan Wang
2026-01-26 16:44   ` Ian Rogers
2026-01-29 15:14     ` wangyushan
2026-01-26 12:35 ` [RFT PATCH 6/7] perf-iostat: Make x86 iostat compatible with new iostat framework Yushan Wang
2026-01-26 12:35 ` [RFT PATCH 7/7] perf-iostat: Enable iostat mode for HiSilicon PCIe PMU Yushan Wang
2026-01-26 17:01 ` [RFT PATCH 0/7] perf tool: Support iostat for multiple platforms Ian Rogers
2026-01-29 15:14   ` wangyushan
