* [PATCH 1/2] perf stat: refactor __run_perf_stat common code @ 2022-07-29 16:12 Adrián Herrera Arcila 2022-07-29 16:12 ` [PATCH 2/2] perf stat: fix unexpected delay behaviour Adrián Herrera Arcila 2022-08-01 11:27 ` [PATCH 1/2] perf stat: refactor __run_perf_stat common code Leo Yan 0 siblings, 2 replies; 12+ messages in thread From: Adrián Herrera Arcila @ 2022-07-29 16:12 UTC (permalink / raw) To: linux-kernel, linux-perf-users, acme Cc: peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung, leo.yan, songliubraving, james.clark, Adrián Herrera Arcila This extracts common code from the branches of the forks if-then-else. enable_counters(), which was at the beginning of both branches of the conditional, is now unconditional; evlist__start_workload is extracted to a different if, which enables making the common clocking code unconditional. Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> --- tools/perf/builtin-stat.c | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index d2ecd4d29624..318ffd119dad 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -966,18 +966,18 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) return err; } - /* - * Enable counters and exec the command: - */ - if (forks) { - err = enable_counters(); - if (err) - return -1; + err = enable_counters(); + if (err) + return -1; + + /* Exec the command, if any */ + if (forks) evlist__start_workload(evsel_list); - t0 = rdclock(); - clock_gettime(CLOCK_MONOTONIC, &ref_time); + t0 = rdclock(); + clock_gettime(CLOCK_MONOTONIC, &ref_time); + if (forks) { if (interval || timeout || evlist__ctlfd_initialized(evsel_list)) status = dispatch_events(forks, timeout, interval, ×); if (child_pid != -1) { @@ -995,13 +995,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) if (WIFSIGNALED(status)) psignal(WTERMSIG(status), argv[0]); 
} else { - err = enable_counters(); - if (err) - return -1; - - t0 = rdclock(); - clock_gettime(CLOCK_MONOTONIC, &ref_time); - status = dispatch_events(forks, timeout, interval, ×); } -- 2.36.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-07-29 16:12 [PATCH 1/2] perf stat: refactor __run_perf_stat common code Adrián Herrera Arcila @ 2022-07-29 16:12 ` Adrián Herrera Arcila 2022-08-01 8:20 ` James Clark 2022-08-01 11:45 ` Leo Yan 2022-08-01 11:27 ` [PATCH 1/2] perf stat: refactor __run_perf_stat common code Leo Yan 1 sibling, 2 replies; 12+ messages in thread From: Adrián Herrera Arcila @ 2022-07-29 16:12 UTC (permalink / raw) To: linux-kernel, linux-perf-users, acme Cc: peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung, leo.yan, songliubraving, james.clark, Adrián Herrera Arcila The described --delay behaviour is to delay the enablement of events, but not the execution of the command, if one is passed, which is incorrectly the current behaviour. This patch decouples the enablement from the delay, and enables events before or after launching the workload dependent on the options passed by the user. This code structure is inspired by that in perf-record, and tries to be consistent with it. 
Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> --- tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 24 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 318ffd119dad..f98c823b16dd 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -559,7 +559,7 @@ static bool handle_interval(unsigned int interval, int *times) return false; } -static int enable_counters(void) +static int enable_bpf_counters(void) { struct evsel *evsel; int err; @@ -572,28 +572,6 @@ static int enable_counters(void) if (err) return err; } - - if (stat_config.initial_delay < 0) { - pr_info(EVLIST_DISABLED_MSG); - return 0; - } - - if (stat_config.initial_delay > 0) { - pr_info(EVLIST_DISABLED_MSG); - usleep(stat_config.initial_delay * USEC_PER_MSEC); - } - - /* - * We need to enable counters only if: - * - we don't have tracee (attaching to task or cpu) - * - we have initial delay configured - */ - if (!target__none(&target) || stat_config.initial_delay) { - if (!all_counters_use_bpf) - evlist__enable(evsel_list); - if (stat_config.initial_delay > 0) - pr_info(EVLIST_ENABLED_MSG); - } return 0; } @@ -966,10 +944,24 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) return err; } - err = enable_counters(); + err = enable_bpf_counters(); if (err) return -1; + /* + * Enable events manually here if perf-stat is run: + * 1. with a target (any of --all-cpus, --cpu, --pid or --tid) + * 2. without measurement delay (no --delay) + * 3. 
without all events associated to BPF + * + * This is because if run with a target, events are not enabled + * on exec if a workload is passed, and because there is no delay + * we ensure to enable them before the workload starts + */ + if (!target__none(&target) && !stat_config.initial_delay && + !all_counters_use_bpf) + evlist__enable(evsel_list); + /* Exec the command, if any */ if (forks) evlist__start_workload(evsel_list); @@ -977,6 +969,22 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) t0 = rdclock(); clock_gettime(CLOCK_MONOTONIC, &ref_time); + /* + * If a measurement delay was specified, start it, and if positive, + * enable events manually after. We respect the delay even if all + * events are associated to BPF + */ + if (stat_config.initial_delay) { + /* At this point, events are guaranteed disabled */ + pr_info(EVLIST_DISABLED_MSG); + if (stat_config.initial_delay > 0) { + usleep(stat_config.initial_delay * USEC_PER_MSEC); + if (!all_counters_use_bpf) + evlist__enable(evsel_list); + pr_info(EVLIST_ENABLED_MSG); + } + } + if (forks) { if (interval || timeout || evlist__ctlfd_initialized(evsel_list)) status = dispatch_events(forks, timeout, interval, ×); -- 2.36.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-07-29 16:12 ` [PATCH 2/2] perf stat: fix unexpected delay behaviour Adrián Herrera Arcila @ 2022-08-01 8:20 ` James Clark 2022-12-13 14:44 ` Arnaldo Carvalho de Melo 2022-08-01 11:45 ` Leo Yan 1 sibling, 1 reply; 12+ messages in thread From: James Clark @ 2022-08-01 8:20 UTC (permalink / raw) To: Adrián Herrera Arcila, linux-kernel, linux-perf-users, acme, leo.yan, songliubraving Cc: peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung On 29/07/2022 17:12, Adrián Herrera Arcila wrote: > The described --delay behaviour is to delay the enablement of events, but > not the execution of the command, if one is passed, which is incorrectly > the current behaviour. > > This patch decouples the enablement from the delay, and enables events > before or after launching the workload dependent on the options passed > by the user. This code structure is inspired by that in perf-record, and > tries to be consistent with it. > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> > --- > tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++----------------- > 1 file changed, 32 insertions(+), 24 deletions(-) Looks good to me. Fixes the counter delay issue and the code is pretty similar to perf record now. Although I would wait for Leo's or Song's comment as well because they were involved. 
Reviewed-by: James Clark <james.clark@arm.com> > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > index 318ffd119dad..f98c823b16dd 100644 > --- a/tools/perf/builtin-stat.c > +++ b/tools/perf/builtin-stat.c > @@ -559,7 +559,7 @@ static bool handle_interval(unsigned int interval, int *times) > return false; > } > > -static int enable_counters(void) > +static int enable_bpf_counters(void) > { > struct evsel *evsel; > int err; > @@ -572,28 +572,6 @@ static int enable_counters(void) > if (err) > return err; > } > - > - if (stat_config.initial_delay < 0) { > - pr_info(EVLIST_DISABLED_MSG); > - return 0; > - } > - > - if (stat_config.initial_delay > 0) { > - pr_info(EVLIST_DISABLED_MSG); > - usleep(stat_config.initial_delay * USEC_PER_MSEC); > - } > - > - /* > - * We need to enable counters only if: > - * - we don't have tracee (attaching to task or cpu) > - * - we have initial delay configured > - */ > - if (!target__none(&target) || stat_config.initial_delay) { > - if (!all_counters_use_bpf) > - evlist__enable(evsel_list); > - if (stat_config.initial_delay > 0) > - pr_info(EVLIST_ENABLED_MSG); > - } > return 0; > } > > @@ -966,10 +944,24 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > return err; > } > > - err = enable_counters(); > + err = enable_bpf_counters(); > if (err) > return -1; > > + /* > + * Enable events manually here if perf-stat is run: > + * 1. with a target (any of --all-cpus, --cpu, --pid or --tid) > + * 2. without measurement delay (no --delay) > + * 3. 
without all events associated to BPF > + * > + * This is because if run with a target, events are not enabled > + * on exec if a workload is passed, and because there is no delay > + * we ensure to enable them before the workload starts > + */ > + if (!target__none(&target) && !stat_config.initial_delay && > + !all_counters_use_bpf) > + evlist__enable(evsel_list); > + > /* Exec the command, if any */ > if (forks) > evlist__start_workload(evsel_list); > @@ -977,6 +969,22 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > t0 = rdclock(); > clock_gettime(CLOCK_MONOTONIC, &ref_time); > > + /* > + * If a measurement delay was specified, start it, and if positive, > + * enable events manually after. We respect the delay even if all > + * events are associated to BPF > + */ > + if (stat_config.initial_delay) { > + /* At this point, events are guaranteed disabled */ > + pr_info(EVLIST_DISABLED_MSG); > + if (stat_config.initial_delay > 0) { > + usleep(stat_config.initial_delay * USEC_PER_MSEC); > + if (!all_counters_use_bpf) > + evlist__enable(evsel_list); > + pr_info(EVLIST_ENABLED_MSG); > + } > + } > + > if (forks) { > if (interval || timeout || evlist__ctlfd_initialized(evsel_list)) > status = dispatch_events(forks, timeout, interval, ×); ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-08-01 8:20 ` James Clark @ 2022-12-13 14:44 ` Arnaldo Carvalho de Melo 2022-12-13 16:40 ` Namhyung Kim 0 siblings, 1 reply; 12+ messages in thread From: Arnaldo Carvalho de Melo @ 2022-12-13 14:44 UTC (permalink / raw) To: James Clark Cc: Adrián Herrera Arcila, linux-kernel, linux-perf-users, leo.yan, songliubraving, peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu: > > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote: > > The described --delay behaviour is to delay the enablement of events, but > > not the execution of the command, if one is passed, which is incorrectly > > the current behaviour. > > > > This patch decouples the enablement from the delay, and enables events > > before or after launching the workload dependent on the options passed > > by the user. This code structure is inspired by that in perf-record, and > > tries to be consistent with it. > > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> > > --- > > tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++----------------- > > 1 file changed, 32 insertions(+), 24 deletions(-) > > Looks good to me. Fixes the counter delay issue and the code is pretty > similar to perf record now. Although I would wait for Leo's or Song's > comment as well because they were involved. I think I didn't notice Leo's ack, it still applies, so I'm doing it now. 
- Arnaldo > Reviewed-by: James Clark <james.clark@arm.com> > > > > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > > index 318ffd119dad..f98c823b16dd 100644 > > --- a/tools/perf/builtin-stat.c > > +++ b/tools/perf/builtin-stat.c > > @@ -559,7 +559,7 @@ static bool handle_interval(unsigned int interval, int *times) > > return false; > > } > > > > -static int enable_counters(void) > > +static int enable_bpf_counters(void) > > { > > struct evsel *evsel; > > int err; > > @@ -572,28 +572,6 @@ static int enable_counters(void) > > if (err) > > return err; > > } > > - > > - if (stat_config.initial_delay < 0) { > > - pr_info(EVLIST_DISABLED_MSG); > > - return 0; > > - } > > - > > - if (stat_config.initial_delay > 0) { > > - pr_info(EVLIST_DISABLED_MSG); > > - usleep(stat_config.initial_delay * USEC_PER_MSEC); > > - } > > - > > - /* > > - * We need to enable counters only if: > > - * - we don't have tracee (attaching to task or cpu) > > - * - we have initial delay configured > > - */ > > - if (!target__none(&target) || stat_config.initial_delay) { > > - if (!all_counters_use_bpf) > > - evlist__enable(evsel_list); > > - if (stat_config.initial_delay > 0) > > - pr_info(EVLIST_ENABLED_MSG); > > - } > > return 0; > > } > > > > @@ -966,10 +944,24 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > > return err; > > } > > > > - err = enable_counters(); > > + err = enable_bpf_counters(); > > if (err) > > return -1; > > > > + /* > > + * Enable events manually here if perf-stat is run: > > + * 1. with a target (any of --all-cpus, --cpu, --pid or --tid) > > + * 2. without measurement delay (no --delay) > > + * 3. 
without all events associated to BPF > > + * > > + * This is because if run with a target, events are not enabled > > + * on exec if a workload is passed, and because there is no delay > > + * we ensure to enable them before the workload starts > > + */ > > + if (!target__none(&target) && !stat_config.initial_delay && > > + !all_counters_use_bpf) > > + evlist__enable(evsel_list); > > + > > /* Exec the command, if any */ > > if (forks) > > evlist__start_workload(evsel_list); > > @@ -977,6 +969,22 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > > t0 = rdclock(); > > clock_gettime(CLOCK_MONOTONIC, &ref_time); > > > > + /* > > + * If a measurement delay was specified, start it, and if positive, > > + * enable events manually after. We respect the delay even if all > > + * events are associated to BPF > > + */ > > + if (stat_config.initial_delay) { > > + /* At this point, events are guaranteed disabled */ > > + pr_info(EVLIST_DISABLED_MSG); > > + if (stat_config.initial_delay > 0) { > > + usleep(stat_config.initial_delay * USEC_PER_MSEC); > > + if (!all_counters_use_bpf) > > + evlist__enable(evsel_list); > > + pr_info(EVLIST_ENABLED_MSG); > > + } > > + } > > + > > if (forks) { > > if (interval || timeout || evlist__ctlfd_initialized(evsel_list)) > > status = dispatch_events(forks, timeout, interval, ×); -- - Arnaldo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-12-13 14:44 ` Arnaldo Carvalho de Melo @ 2022-12-13 16:40 ` Namhyung Kim 2022-12-14 14:26 ` James Clark 2022-12-14 14:41 ` Arnaldo Carvalho de Melo 0 siblings, 2 replies; 12+ messages in thread From: Namhyung Kim @ 2022-12-13 16:40 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: James Clark, Adrián Herrera Arcila, linux-kernel, linux-perf-users, leo.yan, songliubraving, peterz, mingo, mark.rutland, alexander.shishkin, jolsa Hi, On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu: > > > > > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote: > > > The described --delay behaviour is to delay the enablement of events, but > > > not the execution of the command, if one is passed, which is incorrectly > > > the current behaviour. > > > > > > This patch decouples the enablement from the delay, and enables events > > > before or after launching the workload dependent on the options passed > > > by the user. This code structure is inspired by that in perf-record, and > > > tries to be consistent with it. > > > > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com > > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") > > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> > > > --- > > > tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++----------------- > > > 1 file changed, 32 insertions(+), 24 deletions(-) > > > > Looks good to me. Fixes the counter delay issue and the code is pretty > > similar to perf record now. Although I would wait for Leo's or Song's > > comment as well because they were involved. > > I think I didn't notice Leo's ack, it still applies, so I'm doing it > now. I think the BPF counters should be enabled/disabled together. 
Thanks, Namhyung
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-12-13 16:40 ` Namhyung Kim @ 2022-12-14 14:26 ` James Clark 2022-12-14 14:41 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 12+ messages in thread From: James Clark @ 2022-12-14 14:26 UTC (permalink / raw) To: Namhyung Kim, Arnaldo Carvalho de Melo Cc: Adrián Herrera Arcila, linux-kernel, linux-perf-users, leo.yan, songliubraving, peterz, mingo, mark.rutland, alexander.shishkin, jolsa On 13/12/2022 16:40, Namhyung Kim wrote: > Hi, > > On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo > <acme@kernel.org> wrote: >> >> Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu: >>> >>> >>> On 29/07/2022 17:12, Adrián Herrera Arcila wrote: >>>> The described --delay behaviour is to delay the enablement of events, but >>>> not the execution of the command, if one is passed, which is incorrectly >>>> the current behaviour. >>>> >>>> This patch decouples the enablement from the delay, and enables events >>>> before or after launching the workload dependent on the options passed >>>> by the user. This code structure is inspired by that in perf-record, and >>>> tries to be consistent with it. >>>> >>>> Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com >>>> Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") >>>> Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> >>>> --- >>>> tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++----------------- >>>> 1 file changed, 32 insertions(+), 24 deletions(-) >>> >>> Looks good to me. Fixes the counter delay issue and the code is pretty >>> similar to perf record now. Although I would wait for Leo's or Song's >>> comment as well because they were involved. >> >> I think I didn't notice Leo's ack, it still applies, so I'm doing it >> now. > > I think the BPF counters should be enabled/disabled together. 
I did notice that difference between the two, but I wasn't sure of the exact reason that it was done that way on Adrián's version. It seems like it's not separated in perf record so maybe you are right. > > Thanks, > Namhyung ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-12-13 16:40 ` Namhyung Kim 2022-12-14 14:26 ` James Clark @ 2022-12-14 14:41 ` Arnaldo Carvalho de Melo 2022-12-14 15:57 ` Leo Yan 1 sibling, 1 reply; 12+ messages in thread From: Arnaldo Carvalho de Melo @ 2022-12-14 14:41 UTC (permalink / raw) To: Namhyung Kim Cc: James Clark, Adrián Herrera Arcila, linux-kernel, linux-perf-users, leo.yan, songliubraving, peterz, mingo, mark.rutland, alexander.shishkin, jolsa Em Tue, Dec 13, 2022 at 08:40:31AM -0800, Namhyung Kim escreveu: > Hi, > > On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo > <acme@kernel.org> wrote: > > > > Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu: > > > > > > > > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote: > > > > The described --delay behaviour is to delay the enablement of events, but > > > > not the execution of the command, if one is passed, which is incorrectly > > > > the current behaviour. > > > > > > > > This patch decouples the enablement from the delay, and enables events > > > > before or after launching the workload dependent on the options passed > > > > by the user. This code structure is inspired by that in perf-record, and > > > > tries to be consistent with it. > > > > > > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com > > > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") > > > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> > > > > --- > > > > tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++----------------- > > > > 1 file changed, 32 insertions(+), 24 deletions(-) > > > > > > Looks good to me. Fixes the counter delay issue and the code is pretty > > > similar to perf record now. Although I would wait for Leo's or Song's > > > comment as well because they were involved. > > > > I think I didn't notice Leo's ack, it still applies, so I'm doing it > > now. 
> > I think the BPF counters should be enabled/disabled together. Ok, so I removed this one and applied Namhyung's. - Arnaldo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-12-14 14:41 ` Arnaldo Carvalho de Melo @ 2022-12-14 15:57 ` Leo Yan 2022-12-14 17:35 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 12+ messages in thread From: Leo Yan @ 2022-12-14 15:57 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Namhyung Kim, James Clark, Adrián Herrera Arcila, linux-kernel, linux-perf-users, songliubraving, peterz, mingo, mark.rutland, alexander.shishkin, jolsa On Wed, Dec 14, 2022 at 11:41:55AM -0300, Arnaldo Carvalho de Melo wrote: > Em Tue, Dec 13, 2022 at 08:40:31AM -0800, Namhyung Kim escreveu: > > Hi, > > > > On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo > > <acme@kernel.org> wrote: > > > > > > Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu: > > > > > > > > > > > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote: > > > > > The described --delay behaviour is to delay the enablement of events, but > > > > > not the execution of the command, if one is passed, which is incorrectly > > > > > the current behaviour. > > > > > > > > > > This patch decouples the enablement from the delay, and enables events > > > > > before or after launching the workload dependent on the options passed > > > > > by the user. This code structure is inspired by that in perf-record, and > > > > > tries to be consistent with it. > > > > > > > > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com > > > > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") > > > > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> > > > > > --- > > > > > tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++----------------- > > > > > 1 file changed, 32 insertions(+), 24 deletions(-) > > > > > > > > Looks good to me. Fixes the counter delay issue and the code is pretty > > > > similar to perf record now. Although I would wait for Leo's or Song's > > > > comment as well because they were involved. 
> > > > > > I think I didn't notice Leo's ack, it still applies, so I'm doing it > > > now. > > > > I think the BPF counters should be enabled/disabled together. > > Ok, so I removed this one and applied Namhyung's. I can guess why Adrián doesn't enable/disable BPF counters together :) Since 'perf stat' doesn't enable BPF counters with other normal PMU events in the first place, I believe this is deliberately by Song's patch fa853c4b839e ("perf stat: Enable counting events for BPF programs"), it says: "'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF programs (monitor-progs) to the target BPF program (target-prog). The monitor-progs read perf_event before and after the target-prog, and aggregate the difference in a BPF map. Then the user space reads data from these maps". IIUC, when loading eBPF (counter) program, perf tool needs to handle eBPF program map specially (so that perf tool can know the latest eBPF program's map in kernel). I don't know anything for eBPF counter, so this is why I am still a bit puzzle which way is right to do (bind vs separate eBPF counters). But I personally prefer to let eBPF counter to respect delay, so it's fine for me to apply Namhyung's patch. Thanks, Leo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-12-14 15:57 ` Leo Yan @ 2022-12-14 17:35 ` Arnaldo Carvalho de Melo 2022-12-15 1:45 ` Leo Yan 0 siblings, 1 reply; 12+ messages in thread From: Arnaldo Carvalho de Melo @ 2022-12-14 17:35 UTC (permalink / raw) To: Leo Yan Cc: Namhyung Kim, James Clark, Adrián Herrera Arcila, linux-kernel, linux-perf-users, songliubraving, peterz, mingo, mark.rutland, alexander.shishkin, jolsa Em Wed, Dec 14, 2022 at 11:57:21PM +0800, Leo Yan escreveu: > On Wed, Dec 14, 2022 at 11:41:55AM -0300, Arnaldo Carvalho de Melo wrote: > > Em Tue, Dec 13, 2022 at 08:40:31AM -0800, Namhyung Kim escreveu: > > > Hi, > > > > > > On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo > > > <acme@kernel.org> wrote: > > > > > > > > Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu: > > > > > > > > > > > > > > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote: > > > > > > The described --delay behaviour is to delay the enablement of events, but > > > > > > not the execution of the command, if one is passed, which is incorrectly > > > > > > the current behaviour. > > > > > > > > > > > > This patch decouples the enablement from the delay, and enables events > > > > > > before or after launching the workload dependent on the options passed > > > > > > by the user. This code structure is inspired by that in perf-record, and > > > > > > tries to be consistent with it. > > > > > > > > > > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com > > > > > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") > > > > > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> > > > > > > --- > > > > > > tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++----------------- > > > > > > 1 file changed, 32 insertions(+), 24 deletions(-) > > > > > > > > > > Looks good to me. 
Fixes the counter delay issue and the code is pretty > > > > > similar to perf record now. Although I would wait for Leo's or Song's > > > > > comment as well because they were involved. > > > > > > > > I think I didn't notice Leo's ack, it still applies, so I'm doing it > > > > now. > > > > > > I think the BPF counters should be enabled/disabled together. > > > > Ok, so I removed this one and applied Namhyung's. > > I can guess why Adrián doesn't enable/disable BPF counters together :) > > Since 'perf stat' doesn't enable BPF counters with other normal PMU > events in the first place, I believe this is deliberately by Song's > patch fa853c4b839e ("perf stat: Enable counting events for BPF > programs"), it says: > > "'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF > programs (monitor-progs) to the target BPF program (target-prog). The > monitor-progs read perf_event before and after the target-prog, and > aggregate the difference in a BPF map. Then the user space reads data > from these maps". > > IIUC, when loading eBPF (counter) program, perf tool needs to handle > eBPF program map specially (so that perf tool can know the latest eBPF > program's map in kernel). > > I don't know anything for eBPF counter, so this is why I am still a bit > puzzle which way is right to do (bind vs separate eBPF counters). But > I personally prefer to let eBPF counter to respect delay, so it's fine > for me to apply Namhyung's patch. "I'm fine" can be read as an Acked-by, right? :-) - Arnaldo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-12-14 17:35 ` Arnaldo Carvalho de Melo @ 2022-12-15 1:45 ` Leo Yan 0 siblings, 0 replies; 12+ messages in thread From: Leo Yan @ 2022-12-15 1:45 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Namhyung Kim, James Clark, Adrián Herrera Arcila, linux-kernel, linux-perf-users, songliubraving, peterz, mingo, mark.rutland, alexander.shishkin, jolsa On Wed, Dec 14, 2022 at 02:35:01PM -0300, Arnaldo Carvalho de Melo wrote: [...] > > I don't know anything for eBPF counter, so this is why I am still a bit > > puzzle which way is right to do (bind vs separate eBPF counters). But > > I personally prefer to let eBPF counter to respect delay, so it's fine > > for me to apply Namhyung's patch. > > "I'm fine" can be read as an Acked-by, right? :-) Yes, have sent my Reviewed tag on Namhyung's patch. Thanks, Leo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour 2022-07-29 16:12 ` [PATCH 2/2] perf stat: fix unexpected delay behaviour Adrián Herrera Arcila 2022-08-01 8:20 ` James Clark @ 2022-08-01 11:45 ` Leo Yan 1 sibling, 0 replies; 12+ messages in thread From: Leo Yan @ 2022-08-01 11:45 UTC (permalink / raw) To: Adrián Herrera Arcila Cc: linux-kernel, linux-perf-users, acme, peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung, songliubraving, james.clark On Fri, Jul 29, 2022 at 04:12:44PM +0000, Adrián Herrera Arcila wrote: > The described --delay behaviour is to delay the enablement of events, but > not the execution of the command, if one is passed, which is incorrectly > the current behaviour. > > This patch decouples the enablement from the delay, and enables events > before or after launching the workload dependent on the options passed > by the user. This code structure is inspired by that in perf-record, and > tries to be consistent with it. > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters") > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> bpf_counter() is not enabled for delay in this patch, but I think this is purposed to keep the same behaviour with before. I would leave it to Song for a call. The patch LGTM and I tested with commands: $ time ./perf stat --delay 2000 --quiet sleep 2 Events disabled Events enabled real 0m2.039s user 0m0.007s sys 0m0.016s $ ./perf stat --delay 2000 --quiet echo "marking" Events disabled marking Events enabled Reviewed-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> P.s. I took a bit time to get clear how 'perf stat' set enable_on_exec flag, which is set in the function create_perf_stat_counter(), so this can enable PMU event for the case when delay is zero, and this can avoid losing PMU tracing for workload. 
* Re: [PATCH 1/2] perf stat: refactor __run_perf_stat common code 2022-07-29 16:12 [PATCH 1/2] perf stat: refactor __run_perf_stat common code Adrián Herrera Arcila 2022-07-29 16:12 ` [PATCH 2/2] perf stat: fix unexpected delay behaviour Adrián Herrera Arcila @ 2022-08-01 11:27 ` Leo Yan 1 sibling, 0 replies; 12+ messages in thread From: Leo Yan @ 2022-08-01 11:27 UTC (permalink / raw) To: Adrián Herrera Arcila Cc: linux-kernel, linux-perf-users, acme, peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung, songliubraving, james.clark On Fri, Jul 29, 2022 at 04:12:43PM +0000, Adrián Herrera Arcila wrote: > This extracts common code from the branches of the forks if-then-else. > enable_counters(), which was at the beginning of both branches of the > conditional, is now unconditional; evlist__start_workload is extracted > to a different if, which enables making the common clocking code > unconditional. > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> ^ permalink raw reply [flat|nested] 12+ messages in thread