linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 1/4] Create source symlink in perf object dir
@ 2024-08-07 23:18 Andi Kleen
  2024-08-07 23:18 ` [PATCH v9 2/4] perf test: Support external tests for separate objdir Andi Kleen
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Andi Kleen @ 2024-08-07 23:18 UTC (permalink / raw)
  To: namhyung; +Cc: linux-perf-users, Andi Kleen

Create a source symlink to the original source in the objdir.
This is similar to what the main kernel build script does.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Makefile.perf | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 175e4c7898f0..d46892d8223b 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -163,6 +163,8 @@ ifneq ($(OUTPUT),)
 # for flex/bison parsers.
 VPATH += $(OUTPUT)
 export VPATH
+# create symlink to the original source
+SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source)
 endif
 
 ifeq ($(V),1)
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v9 2/4] perf test: Support external tests for separate objdir
  2024-08-07 23:18 [PATCH v9 1/4] Create source symlink in perf object dir Andi Kleen
@ 2024-08-07 23:18 ` Andi Kleen
  2024-08-07 23:18 ` [PATCH v9 3/4] perf script: Fix perf script -F +metric Andi Kleen
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Andi Kleen @ 2024-08-07 23:18 UTC (permalink / raw)
  To: namhyung; +Cc: linux-perf-users, Andi Kleen

Extend the searching for the test files so that it works
when running perf from a separate objdir, and also when
the perf executable is symlinked.

Signed-off-by: Andi Kleen <ak@linux.intel.com>

----

v2: Fix string test (Namhyung)
Handle both in source and out of source builds.
v3: Remove duplicated string check (Namhyung)
---
 tools/perf/tests/tests-scripts.c | 35 +++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/tools/perf/tests/tests-scripts.c b/tools/perf/tests/tests-scripts.c
index e2042b368269..8dc1e398288c 100644
--- a/tools/perf/tests/tests-scripts.c
+++ b/tools/perf/tests/tests-scripts.c
@@ -29,16 +29,45 @@
 
 static int shell_tests__dir_fd(void)
 {
-	char path[PATH_MAX], *exec_path;
-	static const char * const devel_dirs[] = { "./tools/perf/tests/shell", "./tests/shell", };
+	struct stat st;
+	char path[PATH_MAX], path2[PATH_MAX], *exec_path;
+	static const char * const devel_dirs[] = {
+		"./tools/perf/tests/shell",
+		"./tests/shell",
+		"./source/tests/shell"
+	};
+	int fd;
+	char *p;
 
 	for (size_t i = 0; i < ARRAY_SIZE(devel_dirs); ++i) {
-		int fd = open(devel_dirs[i], O_PATH);
+		fd = open(devel_dirs[i], O_PATH);
 
 		if (fd >= 0)
 			return fd;
 	}
 
+	/* Use directory of executable */
+	if (readlink("/proc/self/exe", path2, sizeof path2) < 0)
+		return -1;
+	/* Follow another level of symlink if there */
+	if (lstat(path2, &st) == 0 && (st.st_mode & S_IFMT) == S_IFLNK) {
+		scnprintf(path, sizeof(path), path2);
+		if (readlink(path, path2, sizeof path2) < 0)
+			return -1;
+	}
+	/* Get directory */
+	p = strrchr(path2, '/');
+	if (p)
+		*p = 0;
+	scnprintf(path, sizeof(path), "%s/tests/shell", path2);
+	fd = open(path, O_PATH);
+	if (fd >= 0)
+		return fd;
+	scnprintf(path, sizeof(path), "%s/source/tests/shell", path2);
+	fd = open(path, O_PATH);
+	if (fd >= 0)
+		return fd;
+
 	/* Then installed path. */
 	exec_path = get_argv_exec_path();
 	scnprintf(path, sizeof(path), "%s/tests/shell", exec_path);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v9 3/4] perf script: Fix perf script -F +metric
  2024-08-07 23:18 [PATCH v9 1/4] Create source symlink in perf object dir Andi Kleen
  2024-08-07 23:18 ` [PATCH v9 2/4] perf test: Support external tests for separate objdir Andi Kleen
@ 2024-08-07 23:18 ` Andi Kleen
  2024-08-08  5:18   ` Namhyung Kim
  2024-08-07 23:18 ` [PATCH v9 4/4] Add a test case for " Andi Kleen
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2024-08-07 23:18 UTC (permalink / raw)
  To: namhyung; +Cc: linux-perf-users, Andi Kleen

This fixes a regression with perf script -F +metric originally caused by :

commit 37cc8ad77cf81f3ffd226856c367b0e15333a738
Author: Ian Rogers <irogers@google.com>
Date:   Sun Feb 19 01:28:46 2023 -0800

    perf metric: Directly use counts rather than saved_value

In the perf script environment the evsel wouldn't allocate an aggr
values array, which led to a -1 reference because the metric
evaluation would try to reference NULL - 1 (for aggr_idx)

Give the perf script evsels a single CPU aggr setup. That's
enough because the groups are always contiguous, so no need
to store more than one CPU's worth of values.

Before

% perf record -e '{cycles,instructions}:S' perf bench  mem memcpy
% perf script -F +metric
Segmentation fault (core dumped)

After:

% perf record -e '{cycles,instructions}:S' perf bench  mem memcpy
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB perf.data (90 samples) ]
% perf script -F +metric
       perf-exec 1847557 264658.180789:       3009       cycles:  ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
       perf-exec 1847557 264658.180789:        382 instructions:  ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
       perf-exec 1847557 264658.180789:         metric:    0.13  insn per cycle
...

Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather ...")
Signed-off-by: Andi Kleen <ak@linux.intel.com>

----

v2: Reformat code
v3: Work around bogus warning
v4: Set up aggr map only for metrics case to keep perf stat record
working
v5: Broken version
v6: Only set up limited aggregation mode with -F +metric. Add conflict
checks with perf stat record files.
v7: Remove some unnecessary conflict checks. Fix buffer overflow. Minor cleanups.
v8: Add check for leader sampling. Update some comments.
v9: Correct check for leader sampling (Namhyung)
v10: Check for PERF_FORMAT_GROUP and PERF_SAMPLE_READ (Namhyung)
---
 tools/perf/builtin-script.c | 51 +++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c16224b1fef3..cb63507af839 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -335,7 +335,6 @@ struct evsel_script {
        FILE *fp;
        u64  samples;
        /* For metric output */
-       u64  val;
        int  gnum;
 };
 
@@ -2127,18 +2126,31 @@ static void perf_sample__fprint_metric(struct perf_script *script,
 	};
 	struct evsel *ev2;
 	u64 val;
+	static int printed;
 
 	if (!evsel->stats)
 		evlist__alloc_stats(&stat_config, script->session->evlist, /*alloc_raw=*/false);
 	if (evsel_script(leader)->gnum++ == 0)
 		perf_stat__reset_shadow_stats();
-	val = sample->period * evsel->scale;
-	evsel_script(evsel)->val = val;
+	val = sample->period;
+	if (!(leader->core.attr.read_format & PERF_FORMAT_GROUP) ||
+	    !(leader->core.attr.sample_type & PERF_SAMPLE_READ)) {
+		if (!printed)
+			fprintf(stderr, "perf script: -F metric requires {}:S for groups leader sampling\n");
+		printed = 1;
+		return;
+	}
+	/*
+	 * Always use the first entry as storage because the leader sampling
+	 * groups are contiguous and there's no need to handle multiple indexes
+	 * for anything.
+	 */
+	evsel->stats->aggr[0].counts.val = val;
 	if (evsel_script(leader)->gnum == leader->core.nr_members) {
 		for_each_group_member (ev2, leader) {
 			perf_stat__print_shadow_stats(&stat_config, ev2,
-						      evsel_script(ev2)->val,
-						      sample->cpu,
+						      evsel->stats->aggr[0].counts.val,
+						      0,
 						      &ctx,
 						      NULL);
 		}
@@ -2325,6 +2337,20 @@ static void process_event(struct perf_script *script,
 		fflush(fp);
 }
 
+static void check_metric_conflict(void)
+{
+	int i;
+	/*
+	 * Avoid conflict with the aggregation mode used for the metric printing.
+	 */
+	for (i = 0; i < OUTPUT_TYPE_MAX; i++) {
+		if (output[i].fields & PERF_OUTPUT_METRIC) {
+			fprintf(stderr, "perf stat record files are not supported with -F metric\n");
+			exit(1);
+		}
+	}
+}
+
 static struct scripting_ops	*scripting_ops;
 
 static void __process_stat(struct evsel *counter, u64 tstamp)
@@ -2334,6 +2360,8 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 	struct perf_cpu cpu;
 	static int header_printed;
 
+	check_metric_conflict();
+
 	if (!header_printed) {
 		printf("%3s %8s %15s %15s %15s %15s %s\n",
 		       "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
@@ -3725,6 +3753,8 @@ static int process_stat_config_event(struct perf_session *session __maybe_unused
 {
 	perf_event__read_stat_config(&stat_config, &event->stat_config);
 
+	check_metric_conflict();
+
 	/*
 	 * Aggregation modes are not used since post-processing scripts are
 	 * supposed to take care of such requirements
@@ -4088,6 +4118,17 @@ int cmd_script(int argc, const char **argv)
 
 	argc = parse_options_subcommand(argc, argv, options, script_subcommands, script_usage,
 			     PARSE_OPT_STOP_AT_NON_OPTION);
+	for (i = 0; i < OUTPUT_TYPE_MAX; i++) {
+		if (output[i].fields & PERF_OUTPUT_METRIC) {
+			stat_config.aggr_map = cpu_aggr_map__empty_new(1);
+			err = -ENOMEM;
+			if (!stat_config.aggr_map)
+				goto out;
+			err = 0;
+			stat_config.aggr_map->nr = 1;
+			break;
+		}
+	}
 
 	if (symbol_conf.guestmount ||
 	    symbol_conf.default_guest_vmlinux_name ||
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v9 4/4] Add a test case for perf script -F +metric
  2024-08-07 23:18 [PATCH v9 1/4] Create source symlink in perf object dir Andi Kleen
  2024-08-07 23:18 ` [PATCH v9 2/4] perf test: Support external tests for separate objdir Andi Kleen
  2024-08-07 23:18 ` [PATCH v9 3/4] perf script: Fix perf script -F +metric Andi Kleen
@ 2024-08-07 23:18 ` Andi Kleen
  2024-11-25 17:25 ` [PATCH v9 1/4] Create source symlink in perf object dir Luca Ceresoli
  2025-04-09 17:43 ` Florian Fainelli
  4 siblings, 0 replies; 13+ messages in thread
From: Andi Kleen @ 2024-08-07 23:18 UTC (permalink / raw)
  To: namhyung; +Cc: linux-perf-users, Andi Kleen

Sample a noploop workload with cycles and instructions and check if
there is at least one metric being output by perf script.

The output is

% perf test -v 98
98: perf script tests
...
script metric test
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.429 MB /tmp/perf-test-script.nkvvuzMOla/perf.data (8174 samples) ]
       perf-exec  853636 3967843.162782:        metric:    0.15  insn per cycle
            perf  853636 3967843.163210:        metric:    0.87  insn per cycle
            perf  853636 3967843.163226:        metric:    0.84  insn per cycle
            perf  853636 3967843.163240:        metric:    0.88  insn per cycle
            perf  853636 3967843.163276:        metric:    1.06  insn per cycle
            perf  853636 3967843.163387:        metric:    1.09  insn per cycle
            perf  853636 3967843.163578:        metric:    1.17  insn per cycle
            perf  853636 3967843.164412:        metric:    4.53  insn per cycle
            perf  853636 3967843.164623:        metric:    1.18  insn per cycle
            perf  853636 3967843.164817:        metric:    1.25  insn per cycle
script metric test [Success]
...
%

Signed-off-by: Andi Kleen <ak@linux.intel.com>

----

v2: Avoid bashisms. Use noploop
v3: Avoid false positive in shellcheck
v4: Use :S in the test (Ian, Namhyung)
v5: Update commit message.
---
 tools/perf/tests/shell/script.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/perf/tests/shell/script.sh b/tools/perf/tests/shell/script.sh
index c1a603653662..e69c3548583a 100755
--- a/tools/perf/tests/shell/script.sh
+++ b/tools/perf/tests/shell/script.sh
@@ -7,6 +7,7 @@ set -e
 temp_dir=$(mktemp -d /tmp/perf-test-script.XXXXXXXXXX)
 
 perfdatafile="${temp_dir}/perf.data"
+scriptoutput="${temp_dir}/script"
 db_test="${temp_dir}/db_test.py"
 
 err=0
@@ -88,8 +89,21 @@ test_parallel_perf()
 	echo "parallel-perf test [Success]"
 }
 
+test_metric()
+{
+	echo "script metric test"
+	if ! perf list | grep -q cycles ; then return ; fi
+	if ! perf list | grep -q instructions ; then return ; fi
+	perf record -e '{cycles,instructions}:S' -o "${perfdatafile}" perf test -w noploop
+	perf script -i "${perfdatafile}" -F +metric  > $scriptoutput
+	test "`grep -c metric $scriptoutput`" -gt 5
+	grep metric $scriptoutput | head
+	echo "script metric test [Success]"
+}
+
 test_db
 test_parallel_perf
+test_metric
 
 cleanup
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 3/4] perf script: Fix perf script -F +metric
  2024-08-07 23:18 ` [PATCH v9 3/4] perf script: Fix perf script -F +metric Andi Kleen
@ 2024-08-08  5:18   ` Namhyung Kim
  2024-08-08 18:09     ` Ian Rogers
  0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2024-08-08  5:18 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-perf-users

Hi Andi,

On Wed, Aug 7, 2024 at 4:18 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> This fixes a regression with perf script -F +metric originally caused by :
>
> commit 37cc8ad77cf81f3ffd226856c367b0e15333a738
> Author: Ian Rogers <irogers@google.com>
> Date:   Sun Feb 19 01:28:46 2023 -0800
>
>     perf metric: Directly use counts rather than saved_value
>
> In the perf script environment the evsel wouldn't allocate an aggr
> values array, which led to a -1 reference because the metric
> evaluation would try to reference NULL - 1 (for aggr_idx)
>
> Give the perf script evsels a single CPU aggr setup. That's
> enough because the groups are always contiguous, so no need
> to store more than one CPU's worth of values.
>
> Before
>
> % perf record -e '{cycles,instructions}:S' perf bench  mem memcpy
> % perf script -F +metric
> Segmentation fault (core dumped)
>
> After:
>
> % perf record -e '{cycles,instructions}:S' perf bench  mem memcpy
> ...
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.028 MB perf.data (90 samples) ]
> % perf script -F +metric
>        perf-exec 1847557 264658.180789:       3009       cycles:  ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
>        perf-exec 1847557 264658.180789:        382 instructions:  ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
>        perf-exec 1847557 264658.180789:         metric:    0.13  insn per cycle
> ...
>
> Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather ...")
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Thanks for your work, for the series:

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung

>
> ----
>
> v2: Reformat code
> v3: Work around bogus warning
> v4: Set up aggr map only for metrics case to keep perf stat record
> working
> v5: Broken version
> v6: Only set up limited aggregation mode with -F +metric. Add conflict
> checks with perf stat record files.
> v7: Remove some unnecessary conflict checks. Fix buffer overflow. Minor cleanups.
> v8: Add check for leader sampling. Update some comments.
> v9: Correct check for leader sampling (Namhyung)
> v10: Check for PERF_FORMAT_GROUP and PERF_SAMPLE_READ (Namhyung)
> ---
>  tools/perf/builtin-script.c | 51 +++++++++++++++++++++++++++++++++----
>  1 file changed, 46 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index c16224b1fef3..cb63507af839 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -335,7 +335,6 @@ struct evsel_script {
>         FILE *fp;
>         u64  samples;
>         /* For metric output */
> -       u64  val;
>         int  gnum;
>  };
>
> @@ -2127,18 +2126,31 @@ static void perf_sample__fprint_metric(struct perf_script *script,
>         };
>         struct evsel *ev2;
>         u64 val;
> +       static int printed;
>
>         if (!evsel->stats)
>                 evlist__alloc_stats(&stat_config, script->session->evlist, /*alloc_raw=*/false);
>         if (evsel_script(leader)->gnum++ == 0)
>                 perf_stat__reset_shadow_stats();
> -       val = sample->period * evsel->scale;
> -       evsel_script(evsel)->val = val;
> +       val = sample->period;
> +       if (!(leader->core.attr.read_format & PERF_FORMAT_GROUP) ||
> +           !(leader->core.attr.sample_type & PERF_SAMPLE_READ)) {
> +               if (!printed)
> +                       fprintf(stderr, "perf script: -F metric requires {}:S for groups leader sampling\n");
> +               printed = 1;
> +               return;
> +       }
> +       /*
> +        * Always use the first entry as storage because the leader sampling
> +        * groups are contiguous and there's no need to handle multiple indexes
> +        * for anything.
> +        */
> +       evsel->stats->aggr[0].counts.val = val;
>         if (evsel_script(leader)->gnum == leader->core.nr_members) {
>                 for_each_group_member (ev2, leader) {
>                         perf_stat__print_shadow_stats(&stat_config, ev2,
> -                                                     evsel_script(ev2)->val,
> -                                                     sample->cpu,
> +                                                     evsel->stats->aggr[0].counts.val,
> +                                                     0,
>                                                       &ctx,
>                                                       NULL);
>                 }
> @@ -2325,6 +2337,20 @@ static void process_event(struct perf_script *script,
>                 fflush(fp);
>  }
>
> +static void check_metric_conflict(void)
> +{
> +       int i;
> +       /*
> +        * Avoid conflict with the aggregation mode used for the metric printing.
> +        */
> +       for (i = 0; i < OUTPUT_TYPE_MAX; i++) {
> +               if (output[i].fields & PERF_OUTPUT_METRIC) {
> +                       fprintf(stderr, "perf stat record files are not supported with -F metric\n");
> +                       exit(1);
> +               }
> +       }
> +}
> +
>  static struct scripting_ops    *scripting_ops;
>
>  static void __process_stat(struct evsel *counter, u64 tstamp)
> @@ -2334,6 +2360,8 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
>         struct perf_cpu cpu;
>         static int header_printed;
>
> +       check_metric_conflict();
> +
>         if (!header_printed) {
>                 printf("%3s %8s %15s %15s %15s %15s %s\n",
>                        "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
> @@ -3725,6 +3753,8 @@ static int process_stat_config_event(struct perf_session *session __maybe_unused
>  {
>         perf_event__read_stat_config(&stat_config, &event->stat_config);
>
> +       check_metric_conflict();
> +
>         /*
>          * Aggregation modes are not used since post-processing scripts are
>          * supposed to take care of such requirements
> @@ -4088,6 +4118,17 @@ int cmd_script(int argc, const char **argv)
>
>         argc = parse_options_subcommand(argc, argv, options, script_subcommands, script_usage,
>                              PARSE_OPT_STOP_AT_NON_OPTION);
> +       for (i = 0; i < OUTPUT_TYPE_MAX; i++) {
> +               if (output[i].fields & PERF_OUTPUT_METRIC) {
> +                       stat_config.aggr_map = cpu_aggr_map__empty_new(1);
> +                       err = -ENOMEM;
> +                       if (!stat_config.aggr_map)
> +                               goto out;
> +                       err = 0;
> +                       stat_config.aggr_map->nr = 1;
> +                       break;
> +               }
> +       }
>
>         if (symbol_conf.guestmount ||
>             symbol_conf.default_guest_vmlinux_name ||
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 3/4] perf script: Fix perf script -F +metric
  2024-08-08  5:18   ` Namhyung Kim
@ 2024-08-08 18:09     ` Ian Rogers
  2024-08-08 20:18       ` Andi Kleen
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Rogers @ 2024-08-08 18:09 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Andi Kleen, linux-perf-users

On Wed, Aug 7, 2024 at 10:18 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Andi,
>
> On Wed, Aug 7, 2024 at 4:18 PM Andi Kleen <ak@linux.intel.com> wrote:
> >
> > This fixes a regression with perf script -F +metric originally caused by :
> >
> > commit 37cc8ad77cf81f3ffd226856c367b0e15333a738
> > Author: Ian Rogers <irogers@google.com>
> > Date:   Sun Feb 19 01:28:46 2023 -0800
> >
> >     perf metric: Directly use counts rather than saved_value
> >
> > In the perf script environment the evsel wouldn't allocate an aggr
> > values array, which led to a -1 reference because the metric
> > evaluation would try to reference NULL - 1 (for aggr_idx)
> >
> > Give the perf script evsels a single CPU aggr setup. That's
> > enough because the groups are always contiguous, so no need
> > to store more than one CPU's worth of values.
> >
> > Before
> >
> > % perf record -e '{cycles,instructions}:S' perf bench  mem memcpy
> > % perf script -F +metric
> > Segmentation fault (core dumped)
> >
> > After:
> >
> > % perf record -e '{cycles,instructions}:S' perf bench  mem memcpy
> > ...
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.028 MB perf.data (90 samples) ]
> > % perf script -F +metric
> >        perf-exec 1847557 264658.180789:       3009       cycles:  ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
> >        perf-exec 1847557 264658.180789:        382 instructions:  ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
> >        perf-exec 1847557 264658.180789:         metric:    0.13  insn per cycle
> > ...
> >
> > Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather ...")
> > Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> Thanks for your work, for the series:
>
> Acked-by: Namhyung Kim <namhyung@kernel.org>

Fwiw, not acked-by me. Repeating my comments below:

> >
> > v2: Reformat code
> > v3: Work around bogus warning
> > v4: Set up aggr map only for metrics case to keep perf stat record
> > working
> > v5: Broken version
> > v6: Only set up limited aggregation mode with -F +metric. Add conflict
> > checks with perf stat record files.
> > v7: Remove some unnecessary conflict checks. Fix buffer overflow. Minor cleanups.
> > v8: Add check for leader sampling. Update some comments.
> > v9: Correct check for leader sampling (Namhyung)
> > v10: Check for PERF_FORMAT_GROUP and PERF_SAMPLE_READ (Namhyung)
> > ---
> >  tools/perf/builtin-script.c | 51 +++++++++++++++++++++++++++++++++----
> >  1 file changed, 46 insertions(+), 5 deletions(-)
> >
> > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > index c16224b1fef3..cb63507af839 100644
> > --- a/tools/perf/builtin-script.c
> > +++ b/tools/perf/builtin-script.c
> > @@ -335,7 +335,6 @@ struct evsel_script {
> >         FILE *fp;
> >         u64  samples;
> >         /* For metric output */
> > -       u64  val;
> >         int  gnum;

We're globally assuming properties of events assuming leader sampling.
gnum shouldn't exist. The assumptions will be invalid when leader
sampling isn't enabled.

> >  };
> >
> > @@ -2127,18 +2126,31 @@ static void perf_sample__fprint_metric(struct perf_script *script,
> >         };
> >         struct evsel *ev2;
> >         u64 val;
> > +       static int printed;
> >
> >         if (!evsel->stats)
> >                 evlist__alloc_stats(&stat_config, script->session->evlist, /*alloc_raw=*/false);
> >         if (evsel_script(leader)->gnum++ == 0)
> >                 perf_stat__reset_shadow_stats();
> > -       val = sample->period * evsel->scale;
> > -       evsel_script(evsel)->val = val;
> > +       val = sample->period;
> > +       if (!(leader->core.attr.read_format & PERF_FORMAT_GROUP) ||
> > +           !(leader->core.attr.sample_type & PERF_SAMPLE_READ)) {
> > +               if (!printed)
> > +                       fprintf(stderr, "perf script: -F metric requires {}:S for groups leader sampling\n");
> > +               printed = 1;
> > +               return;
> > +       }
> > +       /*
> > +        * Always use the first entry as storage because the leader sampling
> > +        * groups are contiguous and there's no need to handle multiple indexes
> > +        * for anything.
> > +        */
> > +       evsel->stats->aggr[0].counts.val = val;

Updating counts needs perf_stat_process_counter to update the
aggregation. How does sample->cpu given below relate to aggr[0] here?
This and the comment above are a hack liable to break in anything but
straightforward circumstances. See:
https://lore.kernel.org/lkml/20240720074552.1915993-2-irogers@google.com/
on how to handle this.

> >         if (evsel_script(leader)->gnum == leader->core.nr_members) {
> >                 for_each_group_member (ev2, leader) {
> >                         perf_stat__print_shadow_stats(&stat_config, ev2,
> > -                                                     evsel_script(ev2)->val,
> > -                                                     sample->cpu,
> > +                                                     evsel->stats->aggr[0].counts.val,
> > +                                                     0,

Similarly this hard coded and unexplained constant.

> >                                                       &ctx,
> >                                                       NULL);
> >                 }
> > @@ -2325,6 +2337,20 @@ static void process_event(struct perf_script *script,
> >                 fflush(fp);
> >  }
> >
> > +static void check_metric_conflict(void)
> > +{
> > +       int i;
> > +       /*
> > +        * Avoid conflict with the aggregation mode used for the metric printing.
> > +        */

Why is it a conflict, why not support aggregation modes?

Thanks,
Ian

> > +       for (i = 0; i < OUTPUT_TYPE_MAX; i++) {
> > +               if (output[i].fields & PERF_OUTPUT_METRIC) {
> > +                       fprintf(stderr, "perf stat record files are not supported with -F metric\n");
> > +                       exit(1);
> > +               }
> > +       }
> > +}
> > +
> >  static struct scripting_ops    *scripting_ops;
> >
> >  static void __process_stat(struct evsel *counter, u64 tstamp)
> > @@ -2334,6 +2360,8 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
> >         struct perf_cpu cpu;
> >         static int header_printed;
> >
> > +       check_metric_conflict();
> > +
> >         if (!header_printed) {
> >                 printf("%3s %8s %15s %15s %15s %15s %s\n",
> >                        "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
> > @@ -3725,6 +3753,8 @@ static int process_stat_config_event(struct perf_session *session __maybe_unused
> >  {
> >         perf_event__read_stat_config(&stat_config, &event->stat_config);
> >
> > +       check_metric_conflict();
> > +
> >         /*
> >          * Aggregation modes are not used since post-processing scripts are
> >          * supposed to take care of such requirements
> > @@ -4088,6 +4118,17 @@ int cmd_script(int argc, const char **argv)
> >
> >         argc = parse_options_subcommand(argc, argv, options, script_subcommands, script_usage,
> >                              PARSE_OPT_STOP_AT_NON_OPTION);
> > +       for (i = 0; i < OUTPUT_TYPE_MAX; i++) {
> > +               if (output[i].fields & PERF_OUTPUT_METRIC) {
> > +                       stat_config.aggr_map = cpu_aggr_map__empty_new(1);
> > +                       err = -ENOMEM;
> > +                       if (!stat_config.aggr_map)
> > +                               goto out;
> > +                       err = 0;
> > +                       stat_config.aggr_map->nr = 1;
> > +                       break;
> > +               }
> > +       }
> >
> >         if (symbol_conf.guestmount ||
> >             symbol_conf.default_guest_vmlinux_name ||
> > --
> > 2.45.2
> >
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 3/4] perf script: Fix perf script -F +metric
  2024-08-08 18:09     ` Ian Rogers
@ 2024-08-08 20:18       ` Andi Kleen
  2024-08-09 15:00         ` Ian Rogers
  0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2024-08-08 20:18 UTC (permalink / raw)
  To: Ian Rogers; +Cc: Namhyung Kim, linux-perf-users

> 
> Fwiw, not acked-by me. Repeating my comments below:

Let me just point out that you broke it in the first place.

A little more enthusiasm in handling regressions would be appreciated.

> > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > > index c16224b1fef3..cb63507af839 100644
> > > --- a/tools/perf/builtin-script.c
> > > +++ b/tools/perf/builtin-script.c
> > > @@ -335,7 +335,6 @@ struct evsel_script {
> > >         FILE *fp;
> > >         u64  samples;
> > >         /* For metric output */
> > > -       u64  val;
> > >         int  gnum;
> 
> We're globally assuming properties of events assuming leader sampling.
> gnum shouldn't exist. The assumptions will be invalid when leader
> sampling isn't enabled.

This is checked for now, see the earlier discussion.

> > > +       if (!(leader->core.attr.read_format & PERF_FORMAT_GROUP) ||
> > > +           !(leader->core.attr.sample_type & PERF_SAMPLE_READ)) {
> > > +               if (!printed)
> > > +                       fprintf(stderr, "perf script: -F metric requires {}:S for groups leader sampling\n");
> > > +               printed = 1;
> > > +               return;
> > > +       }
> > > +       /*
> > > +        * Always use the first entry as storage because the leader sampling
> > > +        * groups are contiguous and there's no need to handle multiple indexes
> > > +        * for anything.
> > > +        */
> > > +       evsel->stats->aggr[0].counts.val = val;
> 
> Updating counts needs perf_stat_process_counter to update the
> aggregation. How does sample->cpu given below relate to aggr[0] here?
> This and the comment above are a hack liable to break in anything but
> straightforward circumstances. See:
> https://lore.kernel.org/lkml/20240720074552.1915993-2-irogers@google.com/
> on how to handle this.

Your patch is based on a fundamental misunderstanding on the perf script
metrics use model, see below.

> 
> > >         if (evsel_script(leader)->gnum == leader->core.nr_members) {
> > >                 for_each_group_member (ev2, leader) {
> > >                         perf_stat__print_shadow_stats(&stat_config, ev2,
> > > -                                                     evsel_script(ev2)->val,
> > > -                                                     sample->cpu,
> > > +                                                     evsel->stats->aggr[0].counts.val,
> > > +                                                     0,
> 
> Similarly this hard coded and unexplained constant.

There's a comment just on top of it explaining it:

        /*
         * Always use the first entry as storage because the leader sampling
         * groups are contiguous and there's no need to handle multiple indexes
         * for anything.
         */


> 
> > >                                                       &ctx,
> > >                                                       NULL);
> > >                 }
> > > @@ -2325,6 +2337,20 @@ static void process_event(struct perf_script *script,
> > >                 fflush(fp);
> > >  }
> > >
> > > +static void check_metric_conflict(void)
> > > +{
> > > +       int i;
> > > +       /*
> > > +        * Avoid conflict with the aggregation mode used for the metric printing.
> > > +        */
> 
> Why is it a conflict, why not support aggregation modes?

I thought I had explained that before.

perf script has two touch point with perf stat output code: one is this one
(for metrics on sampling group) and the other is for perf stat record / script

None of them actually support aggregation because perf script as a
general rule doesn't aggregate, it only handles single events.

But the later one forces AGGR_NONE which doesn't work with the mode the group
output uses.

This feature is only to report the metric state for a single sampling period
and group as explained in the original commit [1]

If you want to aggregate multiple sample periods you need a lot the
machinery from perf report, which doesn't make sense to duplicate
in perf script.

[1] https://www.uwsg.indiana.edu/hypermail/linux/kernel/1712.0/04705.html

-Andi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 3/4] perf script: Fix perf script -F +metric
  2024-08-08 20:18       ` Andi Kleen
@ 2024-08-09 15:00         ` Ian Rogers
  0 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2024-08-09 15:00 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Namhyung Kim, linux-perf-users

On Thu, Aug 8, 2024 at 1:18 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> >
> > Fwiw, not acked-by me. Repeating my comments below:
>
> Let me just point out that you broke it in the first place.
>
> A little more enthusiasm in handling regressions would be appreciated.

You want more than a patch series that reimplements things correctly?
https://lore.kernel.org/lkml/20240720074552.1915993-1-irogers@google.com/
You have not listened to feedback from the earlier patch series you
sent and keep sending something that I have explained I disagree with.

> > > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > > > index c16224b1fef3..cb63507af839 100644
> > > > --- a/tools/perf/builtin-script.c
> > > > +++ b/tools/perf/builtin-script.c
> > > > @@ -335,7 +335,6 @@ struct evsel_script {
> > > >         FILE *fp;
> > > >         u64  samples;
> > > >         /* For metric output */
> > > > -       u64  val;
> > > >         int  gnum;
> >
> > We're globally assuming properties of events assuming leader sampling.
> > gnum shouldn't exist. The assumptions will be invalid when leader
> > sampling isn't enabled.
>
> This is checked for now, see the earlier discussion.

So you can then remove the gnum as the samples are all in the event.

> > > > +       if (!(leader->core.attr.read_format & PERF_FORMAT_GROUP) ||
> > > > +           !(leader->core.attr.sample_type & PERF_SAMPLE_READ)) {
> > > > +               if (!printed)
> > > > +                       fprintf(stderr, "perf script: -F metric requires {}:S for groups leader sampling\n");
> > > > +               printed = 1;
> > > > +               return;
> > > > +       }
> > > > +       /*
> > > > +        * Always use the first entry as storage because the leader sampling
> > > > +        * groups are contiguous and there's no need to handle multiple indexes
> > > > +        * for anything.
> > > > +        */
> > > > +       evsel->stats->aggr[0].counts.val = val;
> >
> > Updating counts needs perf_stat_process_counter to update the
> > aggregation. How does sample->cpu given below relate to aggr[0] here?
> > This and the comment above are a hack liable to break in anything but
> > straightforward circumstances. See:
> > https://lore.kernel.org/lkml/20240720074552.1915993-2-irogers@google.com/
> > on how to handle this.
>
> Your patch is based on a fundamental misunderstanding on the perf script
> metrics use model, see below.

Great, that's explained like where?

> >
> > > >         if (evsel_script(leader)->gnum == leader->core.nr_members) {
> > > >                 for_each_group_member (ev2, leader) {
> > > >                         perf_stat__print_shadow_stats(&stat_config, ev2,
> > > > -                                                     evsel_script(ev2)->val,
> > > > -                                                     sample->cpu,
> > > > +                                                     evsel->stats->aggr[0].counts.val,
> > > > +                                                     0,
> >
> > Similarly this hard coded and unexplained constant.
>
> There's a comment just on top of it explaining it:

No, I mean if you are going to use 0 as your special index do something like:

const int andis_special_aggr_index = 0;

then pass that around rather than 0 otherwise people have to figure
out the numbers are connected. At least in the case of arbitrary
constants you can name them like:

/*aggr_idx=*/0,

/*

>         /*
>          * Always use the first entry as storage because the leader sampling
>          * groups are contiguous and there's no need to handle multiple indexes
>          * for anything.
>          */
>
>
> >
> > > >                                                       &ctx,
> > > >                                                       NULL);
> > > >                 }
> > > > @@ -2325,6 +2337,20 @@ static void process_event(struct perf_script *script,
> > > >                 fflush(fp);
> > > >  }
> > > >
> > > > +static void check_metric_conflict(void)
> > > > +{
> > > > +       int i;
> > > > +       /*
> > > > +        * Avoid conflict with the aggregation mode used for the metric printing.
> > > > +        */
> >
> > Why is it a conflict, why not support aggregation modes?
>
> I thought I had explained that before.
>
> perf script has two touch point with perf stat output code: one is this one
> (for metrics on sampling group) and the other is for perf stat record / script
>
> None of them actually support aggregation because perf script as a
> general rule doesn't aggregate, it only handles single events.

evsels have counts, counts are per thread, per cpu, etc. we get >1
event and so we have aggregation. You can put counts into aggregation
and then clear the aggregation immediately afterward. Metrics are
determined from aggregated counts as that's the most generic way of
doing things.

> But the later one forces AGGR_NONE which doesn't work with the mode the group
> output uses.
>
> This feature is only to report the metric state for a single sampling period
> and group as explained in the original commit [1]
>
> If you want to aggregate multiple sample periods you need a lot the
> machinery from perf report, which doesn't make sense to duplicate
> in perf script.
>
> [1] https://www.uwsg.indiana.edu/hypermail/linux/kernel/1712.0/04705.html
>
> -Andi

You are arguing you are special and don't need to use aggregation
machinery then calling APIs that use the aggregation machinery - as
the person maintaining the other end I'm asking to not do that.
Randomly combining samples with gnum and claiming their periods are
related is just weird, and also fundamentally broken for perf inject:
https://lore.kernel.org/lkml/20240729220620.2957754-1-irogers@google.com/
Making assumptions about aggregation indices is brittle, and prone to
break on the other end.

Tbh, nobody noticed in years that there was a segv here. The hard
coded metrics have been disappearing and should disappear even more
so. I don't consider it bad that this is broken, we can then just
delete it knowing nobody has cared about it in years.

Thanks,
Ian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 1/4] Create source symlink in perf object dir
  2024-08-07 23:18 [PATCH v9 1/4] Create source symlink in perf object dir Andi Kleen
                   ` (2 preceding siblings ...)
  2024-08-07 23:18 ` [PATCH v9 4/4] Add a test case for " Andi Kleen
@ 2024-11-25 17:25 ` Luca Ceresoli
  2024-11-25 19:12   ` Andi Kleen
  2025-04-09 17:43 ` Florian Fainelli
  4 siblings, 1 reply; 13+ messages in thread
From: Luca Ceresoli @ 2024-11-25 17:25 UTC (permalink / raw)
  To: Andi Kleen
  Cc: namhyung, linux-perf-users, Yann E. MORIN,
	buildroot@buildroot.org, Thomas Petazzoni

Hello Andi,

+cc Yann and the Buildroot mailing list (see the end of this report).

On Wed,  7 Aug 2024 16:18:20 -0700
Andi Kleen <ak@linux.intel.com> wrote:

> Create a source symlink to the original source in the objdir.
> This is similar to what the main kernel build script does.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/Makefile.perf | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 175e4c7898f0..d46892d8223b 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -163,6 +163,8 @@ ifneq ($(OUTPUT),)
>  # for flex/bison parsers.
>  VPATH += $(OUTPUT)
>  export VPATH
> +# create symlink to the original source
> +SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source)

I found that since this patch got applied in 890a1961c812 (v6.12-rc1)
building perf fails silently in some cases.

The cases I have tested:
 - make <FLAGS> -C tools/perf                       -> OK
 - make <FLAGS> -C tools/perf O=/my/out/of/tree/dir -> OK
 - make <FLAGS> -C tools/perf O=tools/perf          -> fails

The failure in the third case is silent: make exits with a return value
of 0, but there is no perf executable created.

Even 'make <FLAGS> -C tools/perf install' exits returning 0, but no
executable is installed. However the following lines are output during
'make install', which are a sign of this issue happening:

  install: omitting directory '/home/murray/w/bootlin/linux/tools/perf/perf'
  ln: failed to access '/tmp/aaa/usr/bin/perf': No such file or directory

I realize 'O=tools/perf' is "useless" in the sense that the same result
would be obtained by not setting 'O' at all. However I also believe it
is expected that O=tools/perf should work equally to not setting 'O',
and as such there is a regression.

Also, right now the Buildroot embedded Linux build system builds perf
by always passing O=$(LINUX_DIR)/tools/perf/, thus it is currently
broken for kernels 6.12-rc1 and later.

Below is the test script I wrote to automate my testing. It is a bit
more complex than expected because it needs to hide a different bug,
which would otherwise mask the one being reported here. A slightly
different version of this script was used to automate 'git bisect'.

-------------------------8<-------------------------

#!/bin/sh

# Usage:
#   build-perf-report          Build with O=tools/perf
#   build-perf-report <DIR>    Build with O=<DIR>

set -eu

# Default to O=tools/perf
OUT_DIR=${1:-tools/perf}

echo "Output dir: ${OUT_DIR}"

# This is the toolchain I initially used to reproduce the issue:
# https://toolchains.bootlin.com/downloads/releases/toolchains/aarch64/tarballs/aarch64--glibc--stable-2024.02-1.tar.bz2
#
# However commenting the two lines for a native build is equally
# reproducing the problem on my x86_64 Ubuntu 22.04 host
export ARCH=arm64
export CROSS_COMPILE="ccache ${HOME}/x-tools/aarch64--glibc--stable-2024.02-1/bin/aarch64-linux-"

LINUX_MAKE_FLAGS="\
    GIT_DIR=. \
    KCFLAGS=-Wno-attribute-alias \
    WERROR=0 \
    REGENERATE_PARSERS=1 \
    DESTDIR=/tmp/aaa \
    prefix=/usr \
    NO_GTK2=1 \
    NO_LIBPERL=1 \
    NO_LIBPYTHON=1 \
    NO_LIBBIONIC=1 \
    NO_LIBTRACEEVENT=1 \
    NO_NEWT=1 \
    NO_SLANG=1 \
    NO_LIBAUDIT=1 \
    NO_LIBUNWIND=1 \
    NO_LIBNUMA=1 \
    NO_LIBELF=1 \
    NO_DWARF=1 \
    NO_DEMANGLE=1 \
    NO_LIBCRYPTO=1 \
    NO_ZLIB=1 \
    NO_LZMA=1 \
    -C tools/perf O=${OUT_DIR}
"

echo "Kernel version: $(git describe)"

# Revert (without committing) 13d675aea6cac5bb4619ef9e3fdd90129fe6d139
# which introduces a different build failure (fixed by a later commit)
# masking the issue being examined
git checkout -- .
git show 13d675aea6cac5bb4619ef9e3fdd90129fe6d139 | git apply || :

# Clean the DESTDIR
rm -fr /tmp/aaa/

git clean -xdfq
make defconfig

# If building out of tree, create an empty build dir
if [ ${OUT_DIR} != "tools/perf" ]; then
    rm -fr ${OUT_DIR}
    mkdir -p ${OUT_DIR}
fi

make ${LINUX_MAKE_FLAGS}
make ${LINUX_MAKE_FLAGS} install

# remove the fix added above
git checkout -- .

echo "-------------------------------------------------------"
echo "perf files in O (${OUT_DIR}):"
find ${OUT_DIR} -name perf -not -type d  | xargs file
echo "-------------------------------------------------------"
echo "perf executable in DESTDIR:"
find /tmp/aaa/ -name perf -not -type d | xargs file | grep 'executable'
echo "-------------------------------------------------------"

-------------------------8<-------------------------

Best regards,
Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 1/4] Create source symlink in perf object dir
  2024-11-25 17:25 ` [PATCH v9 1/4] Create source symlink in perf object dir Luca Ceresoli
@ 2024-11-25 19:12   ` Andi Kleen
  2024-11-26  7:34     ` Luca Ceresoli
  0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2024-11-25 19:12 UTC (permalink / raw)
  To: Luca Ceresoli
  Cc: namhyung, linux-perf-users, Yann E. MORIN,
	buildroot@buildroot.org, Thomas Petazzoni

On Mon, Nov 25, 2024 at 06:25:06PM +0100, Luca Ceresoli wrote:
> Hello Andi,
> 
> +cc Yann and the Buildroot mailing list (see the end of this report).
> 
> On Wed,  7 Aug 2024 16:18:20 -0700
> Andi Kleen <ak@linux.intel.com> wrote:
> 
> > Create a source symlink to the original source in the objdir.
> > This is similar to what the main kernel build script does.
> > 
> > Signed-off-by: Andi Kleen <ak@linux.intel.com>
> > ---
> >  tools/perf/Makefile.perf | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> > index 175e4c7898f0..d46892d8223b 100644
> > --- a/tools/perf/Makefile.perf
> > +++ b/tools/perf/Makefile.perf
> > @@ -163,6 +163,8 @@ ifneq ($(OUTPUT),)
> >  # for flex/bison parsers.
> >  VPATH += $(OUTPUT)
> >  export VPATH
> > +# create symlink to the original source
> > +SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source)
> 
> I found that since this patch got applied in 890a1961c812 (v6.12-rc1)
> building perf fails silently in some cases.
> 
> The cases I have tested:
>  - make <FLAGS> -C tools/perf                       -> OK
>  - make <FLAGS> -C tools/perf O=/my/out/of/tree/dir -> OK
>  - make <FLAGS> -C tools/perf O=tools/perf          -> fails
> 
> The failure in the third case is silent: make exits with a return value
> of 0, but there is no perf executable created.

The failing case is incorrect arguments because the O= is relative to -C 
and there is no tools/perf/tools/perf

But yes a error message would be good.

Could just ignore the linking error, so that you hopefully get a better
error message later.

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 9dd2e8d3f3c9..18f639ec853b 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -164,7 +164,7 @@ ifneq ($(OUTPUT),)
 VPATH += $(OUTPUT)
 export VPATH
 # create symlink to the original source
-SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source)
+SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source || true)
 endif
 
 ifeq ($(V),1)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 1/4] Create source symlink in perf object dir
  2024-11-25 19:12   ` Andi Kleen
@ 2024-11-26  7:34     ` Luca Ceresoli
  0 siblings, 0 replies; 13+ messages in thread
From: Luca Ceresoli @ 2024-11-26  7:34 UTC (permalink / raw)
  To: Andi Kleen
  Cc: namhyung, linux-perf-users, Yann E. MORIN,
	buildroot@buildroot.org, Thomas Petazzoni

Hello Andi,

thanks for the feedback.

On Mon, 25 Nov 2024 11:12:50 -0800
Andi Kleen <ak@linux.intel.com> wrote:

> On Mon, Nov 25, 2024 at 06:25:06PM +0100, Luca Ceresoli wrote:
> > Hello Andi,
> > 
> > +cc Yann and the Buildroot mailing list (see the end of this report).
> > 
> > On Wed,  7 Aug 2024 16:18:20 -0700
> > Andi Kleen <ak@linux.intel.com> wrote:
> >   
> > > Create a source symlink to the original source in the objdir.
> > > This is similar to what the main kernel build script does.
> > > 
> > > Signed-off-by: Andi Kleen <ak@linux.intel.com>
> > > ---
> > >  tools/perf/Makefile.perf | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> > > index 175e4c7898f0..d46892d8223b 100644
> > > --- a/tools/perf/Makefile.perf
> > > +++ b/tools/perf/Makefile.perf
> > > @@ -163,6 +163,8 @@ ifneq ($(OUTPUT),)
> > >  # for flex/bison parsers.
> > >  VPATH += $(OUTPUT)
> > >  export VPATH
> > > +# create symlink to the original source
> > > +SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source)  
> > 
> > I found that since this patch got applied in 890a1961c812 (v6.12-rc1)
> > building perf fails silently in some cases.
> > 
> > The cases I have tested:
> >  - make <FLAGS> -C tools/perf                       -> OK
> >  - make <FLAGS> -C tools/perf O=/my/out/of/tree/dir -> OK
> >  - make <FLAGS> -C tools/perf O=tools/perf          -> fails
> > 
> > The failure in the third case is silent: make exits with a return value
> > of 0, but there is no perf executable created.  
> 
> The failing case is incorrect arguments because the O= is relative to -C 
> and there is no tools/perf/tools/perf

OK. However I get the same result even with O=$(pwd)/tools/perf, which
is precisely what Buildroot does:
https://git.buildroot.net/buildroot/tree/package/linux-tools/linux-tool-perf.mk.in#n170

> But yes a error message would be good.

Sure, as well as a non-zero return value.

> Could just ignore the linking error, so that you hopefully get a better
> error message later.
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 9dd2e8d3f3c9..18f639ec853b 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -164,7 +164,7 @@ ifneq ($(OUTPUT),)
>  VPATH += $(OUTPUT)
>  export VPATH
>  # create symlink to the original source
> -SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source)
> +SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source || true)

I tried this but I don't see any difference. What is the change you
would have expected in the build output?

I'm afraid I don't understand what the SOURCE variable is supposed to
do so I can hardly follow the logic.

Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 1/4] Create source symlink in perf object dir
  2024-08-07 23:18 [PATCH v9 1/4] Create source symlink in perf object dir Andi Kleen
                   ` (3 preceding siblings ...)
  2024-11-25 17:25 ` [PATCH v9 1/4] Create source symlink in perf object dir Luca Ceresoli
@ 2025-04-09 17:43 ` Florian Fainelli
  2025-04-09 17:59   ` Florian Fainelli
  4 siblings, 1 reply; 13+ messages in thread
From: Florian Fainelli @ 2025-04-09 17:43 UTC (permalink / raw)
  To: Andi Kleen, namhyung, Arnaldo Carvalho de Melo, Markus Mayer
  Cc: linux-perf-users

Howdy!

On 8/7/24 16:18, Andi Kleen wrote:
> Create a source symlink to the original source in the objdir.
> This is similar to what the main kernel build script does.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Sorry, I am late to the party here, we were just debugging why the 
"perf" binary stopped being installed in our root filesystems when we 
moved to Linux 6.12 and beyond. Turns out that this change is 
responsible and causes the following directory structure to be created:

lrwxrwxrwx  1 ff944844 ff944844      74 Apr  9 09:43 perf -> 
/local/users/fainelli/buildroot/output/arm64/build/linux-custom/tools/perf
-rwxr-xr-x  1 ff944844 ff944844    3688 Apr  8 16:42 perf-archive
-rwxr-xr-x  1 ff944844 ff944844    3688 Apr  8 15:17 perf-archive.sh
-rw-r--r--  1 ff944844 ff944844  208760 Apr  8 16:41 perf-bench-in.o
-rw-r--r--  1 ff944844 ff944844    7070 Apr  8 15:17 perf-completion.sh
-rw-r--r--  1 ff944844 ff944844  995736 Apr  8 16:42 perf-in.o
-rwxr-xr-x  1 ff944844 ff944844     290 Apr  8 16:42 perf-iostat
-rw-r--r--  1 ff944844 ff944844     290 Apr  8 13:55 perf-iostat.sh
-rw-r--r--  1 ff944844 ff944844     547 Oct 21 14:16 perf-read-vdso.c
-rw-r--r--  1 ff944844 ff944844     452 Apr  8 13:31 perf-sys.h
-rw-r--r--  1 ff944844 ff944844  876656 Apr  8 16:41 perf-test-in.o
-rw-r--r--  1 ff944844 ff944844  307968 Apr  8 16:41 perf-ui-in.o
-rw-r--r--  1 ff944844 ff944844 3333632 Apr  8 16:42 perf-util-in.o
-rw-r--r--  1 ff944844 ff944844   13959 Apr  8 16:19 perf.c
-rw-r--r--  1 ff944844 ff944844     253 Apr  8 16:19 perf.h
-rw-r--r--  1 ff944844 ff944844   23944 Apr  8 16:42 perf.o
drwxr-xr-x  4 ff944844 ff944844    4096 Apr  8 16:19 pmu-events
drwxr-xr-x  2 ff944844 ff944844    4096 Apr  8 14:42 python
drwxr-xr-x  4 ff944844 ff944844    4096 Apr  8 16:19 scripts
lrwxrwxrwx  1 ff944844 ff944844      74 Apr  8 16:40 source -> 
/local/users/fainelli/buildroot/output/arm64/build/linux-custom/tools/perf
drwxr-xr-x  6 ff944844 ff944844   12288 Apr  8 16:19 tests
drwxr-xr-x  4 ff944844 ff944844    4096 Apr 30  2024 trace
drwxr-xr-x  6 ff944844 ff944844    4096 Apr  8 16:19 ui
drwxr-xr-x 11 ff944844 ff944844   28672 Apr  8 16:19 util

While the "source" symbolic link makes sense the symbolic link from 
"perf" back to itself does not.

and later the "install" target will hit the following:

   INSTALL libapi_headers
   INSTALL libperf_headers
   INSTALL libsubcmd_headers
   INSTALL libsymbol_headers
   INSTALL libbpf_headers
DESTDIR_SQ=/local/users/fainelli/buildroot/output/arm64/target
bindir_SQ=/usr/bin
OUTPUT=/local/users/fainelli/buildroot/output/arm64/build/linux-custom/tools/perf/
   INSTALL binaries
install: omitting directory 
'/local/users/fainelli/buildroot/output/arm64/build/linux-custom/tools/perf/perf'
ln: failed to access 
'/local/users/fainelli/buildroot/output/arm64/target/usr/bin/perf': No 
such file or directory
   INSTALL libexec
   INSTALL perf-archive
   INSTALL perf-iostat
   INSTALL dlfilters
   INSTALL perf_completion-script
   INSTALL perf-tip
   INSTALL tests

I added instrumentation to print DESTDIR_SQ, bindir_SQ and OUTPUT to 
confirm they were as expected.

We build perf using buildroot's makefile which can be seen here:

https://git.buildroot.net/buildroot/tree/package/linux-tools/linux-tool-perf.mk.in

Any clue where that first symbolic link from "perf" is coming from? Thanks!

> ---
>   tools/perf/Makefile.perf | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 175e4c7898f0..d46892d8223b 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -163,6 +163,8 @@ ifneq ($(OUTPUT),)
>   # for flex/bison parsers.
>   VPATH += $(OUTPUT)
>   export VPATH
> +# create symlink to the original source
> +SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source)
>   endif
>   
>   ifeq ($(V),1)


-- 
Florian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 1/4] Create source symlink in perf object dir
  2025-04-09 17:43 ` Florian Fainelli
@ 2025-04-09 17:59   ` Florian Fainelli
  0 siblings, 0 replies; 13+ messages in thread
From: Florian Fainelli @ 2025-04-09 17:59 UTC (permalink / raw)
  To: Andi Kleen, namhyung, Arnaldo Carvalho de Melo, Markus Mayer
  Cc: linux-perf-users

On 4/9/25 10:43, Florian Fainelli wrote:
> Howdy!
> 
> On 8/7/24 16:18, Andi Kleen wrote:
>> Create a source symlink to the original source in the objdir.
>> This is similar to what the main kernel build script does.
>>
>> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> 
> Sorry, I am late to the party here, we were just debugging why the 
> "perf" binary stopped being installed in our root filesystems when we 
> moved to Linux 6.12 and beyond. Turns out that this change is 
> responsible and causes the following directory structure to be created:
> 
> lrwxrwxrwx  1 ff944844 ff944844      74 Apr  9 09:43 perf -> /local/ 
> users/fainelli/buildroot/output/arm64/build/linux-custom/tools/perf
> -rwxr-xr-x  1 ff944844 ff944844    3688 Apr  8 16:42 perf-archive
> -rwxr-xr-x  1 ff944844 ff944844    3688 Apr  8 15:17 perf-archive.sh
> -rw-r--r--  1 ff944844 ff944844  208760 Apr  8 16:41 perf-bench-in.o
> -rw-r--r--  1 ff944844 ff944844    7070 Apr  8 15:17 perf-completion.sh
> -rw-r--r--  1 ff944844 ff944844  995736 Apr  8 16:42 perf-in.o
> -rwxr-xr-x  1 ff944844 ff944844     290 Apr  8 16:42 perf-iostat
> -rw-r--r--  1 ff944844 ff944844     290 Apr  8 13:55 perf-iostat.sh
> -rw-r--r--  1 ff944844 ff944844     547 Oct 21 14:16 perf-read-vdso.c
> -rw-r--r--  1 ff944844 ff944844     452 Apr  8 13:31 perf-sys.h
> -rw-r--r--  1 ff944844 ff944844  876656 Apr  8 16:41 perf-test-in.o
> -rw-r--r--  1 ff944844 ff944844  307968 Apr  8 16:41 perf-ui-in.o
> -rw-r--r--  1 ff944844 ff944844 3333632 Apr  8 16:42 perf-util-in.o
> -rw-r--r--  1 ff944844 ff944844   13959 Apr  8 16:19 perf.c
> -rw-r--r--  1 ff944844 ff944844     253 Apr  8 16:19 perf.h
> -rw-r--r--  1 ff944844 ff944844   23944 Apr  8 16:42 perf.o
> drwxr-xr-x  4 ff944844 ff944844    4096 Apr  8 16:19 pmu-events
> drwxr-xr-x  2 ff944844 ff944844    4096 Apr  8 14:42 python
> drwxr-xr-x  4 ff944844 ff944844    4096 Apr  8 16:19 scripts
> lrwxrwxrwx  1 ff944844 ff944844      74 Apr  8 16:40 source -> /local/ 
> users/fainelli/buildroot/output/arm64/build/linux-custom/tools/perf
> drwxr-xr-x  6 ff944844 ff944844   12288 Apr  8 16:19 tests
> drwxr-xr-x  4 ff944844 ff944844    4096 Apr 30  2024 trace
> drwxr-xr-x  6 ff944844 ff944844    4096 Apr  8 16:19 ui
> drwxr-xr-x 11 ff944844 ff944844   28672 Apr  8 16:19 util
> 
> While the "source" symbolic link makes sense the symbolic link from 
> "perf" back to itself does not.
> 
> and later the "install" target will hit the following:
> 
>    INSTALL libapi_headers
>    INSTALL libperf_headers
>    INSTALL libsubcmd_headers
>    INSTALL libsymbol_headers
>    INSTALL libbpf_headers
> DESTDIR_SQ=/local/users/fainelli/buildroot/output/arm64/target
> bindir_SQ=/usr/bin
> OUTPUT=/local/users/fainelli/buildroot/output/arm64/build/linux-custom/ 
> tools/perf/
>    INSTALL binaries
> install: omitting directory '/local/users/fainelli/buildroot/output/ 
> arm64/build/linux-custom/tools/perf/perf'
> ln: failed to access '/local/users/fainelli/buildroot/output/arm64/ 
> target/usr/bin/perf': No such file or directory
>    INSTALL libexec
>    INSTALL perf-archive
>    INSTALL perf-iostat
>    INSTALL dlfilters
>    INSTALL perf_completion-script
>    INSTALL perf-tip
>    INSTALL tests
> 
> I added instrumentation to print DESTDIR_SQ, bindir_SQ and OUTPUT to 
> confirm they were as expected.
> 
> We build perf using buildroot's makefile which can be seen here:
> 
> https://git.buildroot.net/buildroot/tree/package/linux-tools/linux-tool- 
> perf.mk.in
> 
> Any clue where that first symbolic link from "perf" is coming from? Thanks!

Another thing is that the "trace" binary is supposed to be a symbolic 
link to the "perf" binary, but in my root filesystem they end up being 
the same executable copied twice over.
-- 
Florian

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-04-09 17:59 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-07 23:18 [PATCH v9 1/4] Create source symlink in perf object dir Andi Kleen
2024-08-07 23:18 ` [PATCH v9 2/4] perf test: Support external tests for separate objdir Andi Kleen
2024-08-07 23:18 ` [PATCH v9 3/4] perf script: Fix perf script -F +metric Andi Kleen
2024-08-08  5:18   ` Namhyung Kim
2024-08-08 18:09     ` Ian Rogers
2024-08-08 20:18       ` Andi Kleen
2024-08-09 15:00         ` Ian Rogers
2024-08-07 23:18 ` [PATCH v9 4/4] Add a test case for " Andi Kleen
2024-11-25 17:25 ` [PATCH v9 1/4] Create source symlink in perf object dir Luca Ceresoli
2024-11-25 19:12   ` Andi Kleen
2024-11-26  7:34     ` Luca Ceresoli
2025-04-09 17:43 ` Florian Fainelli
2025-04-09 17:59   ` Florian Fainelli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).