* [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload
@ 2025-09-02 16:40 Thomas Falcon
  2025-09-02 16:40 ` [RESEND][PATCH v2 1/2] perf record: Add ratio-to-prev term Thomas Falcon
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Thomas Falcon @ 2025-09-02 16:40 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
      Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
      Adrian Hunter, Kan Liang
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Thomas Falcon

The Auto Counter Reload (ACR)[1] feature is used to track the relative
rates of two or more perf events, only sampling when a given threshold
is exceeded. This helps reduce overhead and unnecessary samples. However,
enabling this feature currently requires setting two parameters:

-- Event sampling period ("period")
-- acr_mask, which determines which events get reloaded when the sample
   period is reached.

For example, in the following command:

perf record -e "{cpu_atom/branch-misses,period=200000,\
acr_mask=0x2/ppu,cpu_atom/branch-instructions,period=1000000,\
acr_mask=0x3/u}" -- ./mispredict

The goal is to limit event sampling to cases when the branch miss
rate exceeds 20%. If the branch instructions sample period is exceeded
first, both events are reloaded. If branch misses exceed their threshold
first, only the second counter is reloaded, and a sample is taken.

To simplify this, provide a new “ratio-to-prev” event term
that works alongside the period event option or -c option.
This would allow users to specify the desired relative rate between
events as a ratio, making configuration more intuitive.
With this enhancement, the equivalent command would be:

perf record -e "{cpu_atom/branch-misses/ppu,\
cpu_atom/branch-instructions,period=1000000,ratio-to-prev=5/u}" \
-- ./mispredict

or

perf record -e "{cpu_atom/branch-misses/ppu,\
cpu_atom/branch-instructions,ratio-to-prev=5/u}" -c 1000000 \
-- ./mispredict

[1] https://lore.kernel.org/lkml/20250327195217.2683619-1-kan.liang@linux.intel.com/

Changes in v2 (mostly suggested by Ian Rogers):

-- Add documentation explaining acr_mask bitmask used by ACR
-- Move ACR specific implementation to arch/x86/
-- Provide test cases for event parsing and perf record tests

Thomas Falcon (2):
  perf record: Add ratio-to-prev term
  perf record: add auto counter reload parse and regression tests

 tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++
 tools/perf/Documentation/perf-list.txt |  2 +
 tools/perf/arch/x86/util/evsel.c       | 53 ++++++++++++++++++
 tools/perf/tests/parse-events.c        | 54 ++++++++++++++++++
 tools/perf/tests/shell/record.sh       | 40 ++++++++++++++
 tools/perf/util/evsel.c                | 76 ++++++++++++++++++++++++++
 tools/perf/util/evsel.h                |  1 +
 tools/perf/util/evsel_config.h         |  1 +
 tools/perf/util/parse-events.c         | 22 ++++++++
 tools/perf/util/parse-events.h         |  3 +-
 tools/perf/util/parse-events.l         |  1 +
 tools/perf/util/pmu.c                  |  3 +-
 12 files changed, 307 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/Documentation/intel-acr.txt

--
2.50.1

^ permalink raw reply [flat|nested] 11+ messages in thread
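The arithmetic behind the cover letter's example can be sketched in C as follows. This is only an illustration under stated assumptions: the helper names implied_prev_period() and threshold_percent() are hypothetical and do not exist in perf. The sketch shows how a period of 1000000 on branch-instructions with a ratio of 5 corresponds to the period of 200000 on branch-misses and the 20% threshold quoted above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of perf): given the period set on the
 * event that carries ratio-to-prev and the ratio itself, return the
 * period implied for the previous event in the group.
 */
uint64_t implied_prev_period(uint64_t period, double ratio)
{
	assert(ratio > 0.0);	/* zero or negative ratios are rejected */
	return (uint64_t)((double)period / ratio);
}

/*
 * Hypothetical helper: the rate of the previous event, as a percentage
 * of the rate of the event carrying the term, at which sampling starts.
 */
double threshold_percent(double ratio)
{
	assert(ratio > 0.0);
	return 100.0 / ratio;
}
```

Under these assumptions, implied_prev_period(1000000, 5.0) yields 200000, the branch-misses period in the explicit acr_mask command, and threshold_percent(5.0) yields 20.0.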
* [RESEND][PATCH v2 1/2] perf record: Add ratio-to-prev term 2025-09-02 16:40 [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload Thomas Falcon @ 2025-09-02 16:40 ` Thomas Falcon 2025-09-24 21:34 ` Ian Rogers 2025-09-02 16:40 ` [RESEND][PATCH v2 2/2] perf record: Add auto counter reload parse and regression tests Thomas Falcon ` (3 subsequent siblings) 4 siblings, 1 reply; 11+ messages in thread From: Thomas Falcon @ 2025-09-02 16:40 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Kan Liang Cc: linux-kernel, linux-perf-users, Andi Kleen, Thomas Falcon Provide ratio-to-prev term which allows the user to set the event sample period of two events corresponding to a desired ratio. If using on an Intel x86 platform with Auto Counter Reload support, also set corresponding event's config2 attribute with a bitmask which counters to reset and which counters to sample if the desired ratio is met or exceeded. On other platforms, only the sample period is affected by the ratio-to-prev term. 
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> --- tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++ tools/perf/Documentation/perf-list.txt | 2 + tools/perf/arch/x86/util/evsel.c | 53 ++++++++++++++++++ tools/perf/util/evsel.c | 76 ++++++++++++++++++++++++++ tools/perf/util/evsel.h | 1 + tools/perf/util/evsel_config.h | 1 + tools/perf/util/parse-events.c | 22 ++++++++ tools/perf/util/parse-events.h | 3 +- tools/perf/util/parse-events.l | 1 + tools/perf/util/pmu.c | 3 +- 10 files changed, 213 insertions(+), 2 deletions(-) create mode 100644 tools/perf/Documentation/intel-acr.txt diff --git a/tools/perf/Documentation/intel-acr.txt b/tools/perf/Documentation/intel-acr.txt new file mode 100644 index 000000000000..72654fdd9a52 --- /dev/null +++ b/tools/perf/Documentation/intel-acr.txt @@ -0,0 +1,53 @@ +Intel Auto Counter Reload Support +--------------------------------- +Support for Intel Auto Counter Reload in perf tools + +Auto counter reload provides a means for software to specify to hardware +that certain counters, if supported, should be automatically reloaded +upon overflow of chosen counters. By taking a sample only if the rate of +one event exceeds some threshold relative to the rate of another event, +this feature enables software to sample based on the relative rate of +two or more events. To enable this, the user must provide a sample period +term and a bitmask ("acr_mask") for each relevant event specifying the +counters in an event group to reload if the event's specified sample +period is exceeded. + +For example, if the user desires to measure a scenario when IPC > 2, +the event group might look like the one below: + + perf record -e {cpu_atom/instructions,period=200000,acr_mask=0x2/, \ + cpu_atom/cycles,period=100000,acr_mask=0x3/} -- true + +In this case, if the "instructions" counter exceeds the sample period of +200000, the second counter, "cycles", will be reset and a sample will be +taken. 
If "cycles" is exceeded first, both counters in the group will be +reset. In this way, samples will only be taken for cases where IPC > 2. + +The acr_mask term is a hexadecimal value representing a bitmask of the +events in the group to be reset when the period is exceeded. In the +example above, "instructions" is assigned an acr_mask of 0x2, meaning +only the second event in the group is reloaded and a sample is taken +for the first event. "cycles" is assigned an acr_mask of 0x3, meaning +that both event counters will be reset if the sample period is exceeded +first. + +ratio-to-prev Event Term +------------------------ +To simplify this, an event term "ratio-to-prev" is provided which is used +alongside the sample period term n or the -c/--count option. This would +allow users to specify the desired relative rate between events as a +ratio. Note: Both events compared must belong to the same PMU. + +The command above would then become + + perf record -e {cpu_atom/instructions/, \ + cpu_atom/cycles,period=100000,ratio-to-prev=0.5/} -- true + +ratio-to-prev is the ratio of the event using the term relative +to the previous event in the group, which will always be 1, +for a 1:0.5 or 2:1 ratio. + +To sample for IPC < 2 for example, the events need to be reordered: + + perf record -e {cpu_atom/cycles/, \ + cpu_atom/instructions,period=200000,ratio-to-prev=2.0/} -- true diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt index 28215306a78a..10bc66d39202 100644 --- a/tools/perf/Documentation/perf-list.txt +++ b/tools/perf/Documentation/perf-list.txt @@ -392,6 +392,8 @@ Support raw format: . '--raw-dump [hw|sw|cache|tracepoint|pmu|event_glob]', shows the raw-dump of a certain kind of events. 
+include::intel-acr.txt[] + SEE ALSO -------- linkperf:perf-stat[1], linkperf:perf-top[1], diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c index 9bc80fff3aa0..84858e4c397d 100644 --- a/tools/perf/arch/x86/util/evsel.c +++ b/tools/perf/arch/x86/util/evsel.c @@ -1,7 +1,9 @@ // SPDX-License-Identifier: GPL-2.0 #include <stdio.h> #include <stdlib.h> +#include "util/evlist.h" #include "util/evsel.h" +#include "util/evsel_config.h" #include "util/env.h" #include "util/pmu.h" #include "util/pmus.h" @@ -67,6 +69,57 @@ int arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size) event_name); } +void arch_evsel__apply_ratio_to_prev(struct evsel *evsel, + struct perf_event_attr *attr) +{ + struct perf_event_attr *prev_attr = NULL; + struct evsel *evsel_prev = NULL; + const char *name = "acr_mask"; + int evsel_idx = 0; + __u64 ev_mask, pr_ev_mask; + + if (!perf_pmu__has_format(evsel->pmu, name)) { + pr_err("'%s' does not have acr_mask format support\n", evsel->pmu->name); + return; + } + if (perf_pmu__format_type(evsel->pmu, name) != + PERF_PMU_FORMAT_VALUE_CONFIG2) { + pr_err("'%s' does not have config2 format support\n", evsel->pmu->name); + return; + } + + evsel_prev = evsel__prev(evsel); + if (!evsel_prev) { + pr_err("Previous event does not exist.\n"); + return; + } + + prev_attr = &evsel_prev->core.attr; + + if (prev_attr->config2) { + pr_err("'%s' has set config2 (acr_mask?) already, configuration not supported\n", evsel_prev->name); + return; + } + + /* + * acr_mask (config2) is calculated using the event's index in + * the event group. The first event will use the index of the + * second event as its mask (e.g., 0x2), indicating that the + * second event counter will be reset and a sample taken for + * the first event if its counter overflows. The second event + * will use the mask consisting of the first and second bits + * (e.g., 0x3), meaning both counters will be reset if the + * second event counter overflows. 
+ */ + + evsel_idx = evsel__group_idx(evsel); + ev_mask = 1ull << evsel_idx; + pr_ev_mask = 1ull << (evsel_idx - 1); + + prev_attr->config2 = ev_mask; + attr->config2 = ev_mask | pr_ev_mask; +} + static void ibs_l3miss_warn(void) { pr_warning( diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index d264c143b592..f6f93920c0aa 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -1091,6 +1091,71 @@ static void evsel__reset_callgraph(struct evsel *evsel, struct callchain_param * } } +static void evsel__apply_ratio_to_prev(struct evsel *evsel, + struct perf_event_attr *attr, + struct record_opts *opts, + const char *buf) +{ + struct perf_event_attr *prev_attr = NULL; + struct evsel *evsel_prev = NULL; + u64 type = evsel->core.attr.sample_type; + u64 prev_type = 0; + double rtp; + + rtp = strtod(buf, NULL); + if (rtp <= 0) { + pr_err("Invalid ratio-to-prev value %lf\n", rtp); + return; + } + if (evsel == evsel__leader(evsel)) { + pr_err("Invalid use of ratio-to-prev term without preceding element in group\n"); + return; + } + if (!evsel->pmu->is_core) { + pr_err("Event using ratio-to-prev term must have a core PMU\n"); + return; + } + + evsel_prev = evsel__prev(evsel); + if (!evsel_prev) { + pr_err("Previous event does not exist.\n"); + return; + } + + if (evsel_prev->pmu->type != evsel->pmu->type) { + pr_err("Compared events (\"%s\", \"%s\") must have same PMU\n", + evsel->name, evsel_prev->name); + return; + } + + prev_attr = &evsel_prev->core.attr; + prev_type = evsel_prev->core.attr.sample_type; + + if (!(prev_type & PERF_SAMPLE_PERIOD)) { + attr->sample_period = prev_attr->sample_period * rtp; + attr->freq = 0; + evsel__reset_sample_bit(evsel, PERIOD); + } else if (!(type & PERF_SAMPLE_PERIOD)) { + prev_attr->sample_period = attr->sample_period / rtp; + prev_attr->freq = 0; + evsel__reset_sample_bit(evsel_prev, PERIOD); + } else { + if (opts->user_interval != ULLONG_MAX) { + prev_attr->sample_period = opts->user_interval; + 
attr->sample_period = prev_attr->sample_period * rtp; + prev_attr->freq = 0; + attr->freq = 0; + evsel__reset_sample_bit(evsel_prev, PERIOD); + evsel__reset_sample_bit(evsel, PERIOD); + } else { + pr_err("Event period term or count (-c) must be set when using ratio-to-prev term.\n"); + return; + } + } + + arch_evsel__apply_ratio_to_prev(evsel, attr); +} + static void evsel__apply_config_terms(struct evsel *evsel, struct record_opts *opts, bool track) { @@ -1104,6 +1169,7 @@ static void evsel__apply_config_terms(struct evsel *evsel, u32 dump_size = 0; int max_stack = 0; const char *callgraph_buf = NULL; + const char *rtp_buf = NULL; list_for_each_entry(term, config_terms, list) { switch (term->type) { @@ -1174,6 +1240,9 @@ static void evsel__apply_config_terms(struct evsel *evsel, break; case EVSEL__CONFIG_TERM_CFG_CHG: break; + case EVSEL__CONFIG_TERM_RATIO_TO_PREV: + rtp_buf = term->val.str; + break; default: break; } @@ -1225,6 +1294,8 @@ static void evsel__apply_config_terms(struct evsel *evsel, evsel__config_callchain(evsel, opts, &param); } } + if (rtp_buf) + evsel__apply_ratio_to_prev(evsel, attr, opts, rtp_buf); } struct evsel_config_term *__evsel__get_config_term(struct evsel *evsel, enum evsel_term_type type) @@ -1249,6 +1320,11 @@ void __weak arch__post_evsel_config(struct evsel *evsel __maybe_unused, { } +void __weak arch_evsel__apply_ratio_to_prev(struct evsel *evsel __maybe_unused, + struct perf_event_attr *attr __maybe_unused) +{ +} + static void evsel__set_default_freq_period(struct record_opts *opts, struct perf_event_attr *attr) { diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index 5797a02e5d6a..5002c795e818 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -342,6 +342,7 @@ void evsel__set_sample_id(struct evsel *evsel, bool use_sample_identifier); void arch_evsel__set_sample_weight(struct evsel *evsel); void arch__post_evsel_config(struct evsel *evsel, struct perf_event_attr *attr); int 
arch_evsel__open_strerror(struct evsel *evsel, char *msg, size_t size); +void arch_evsel__apply_ratio_to_prev(struct evsel *evsel, struct perf_event_attr *attr); int evsel__set_filter(struct evsel *evsel, const char *filter); int evsel__append_tp_filter(struct evsel *evsel, const char *filter); diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h index 94a1e9cf73d6..bcd3a978f0c4 100644 --- a/tools/perf/util/evsel_config.h +++ b/tools/perf/util/evsel_config.h @@ -28,6 +28,7 @@ enum evsel_term_type { EVSEL__CONFIG_TERM_AUX_ACTION, EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE, EVSEL__CONFIG_TERM_CFG_CHG, + EVSEL__CONFIG_TERM_RATIO_TO_PREV, }; struct evsel_config_term { diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 8282ddf68b98..850de3a51f47 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -834,6 +834,7 @@ const char *parse_events__term_type_str(enum parse_events__term_type term_type) [PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache", [PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware", [PARSE_EVENTS__TERM_TYPE_CPU] = "cpu", + [PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV] = "ratio-to-prev", }; if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR) return "unknown term"; @@ -884,6 +885,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er case PARSE_EVENTS__TERM_TYPE_RAW: case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: case PARSE_EVENTS__TERM_TYPE_HARDWARE: + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: default: if (!err) return false; @@ -1037,6 +1039,21 @@ do { \ perf_cpu_map__put(map); break; } + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: + CHECK_TYPE_VAL(STR); + if (strtod(term->val.str, NULL) <= 0) { + parse_events_error__handle(err, term->err_val, + strdup("zero or negative"), + NULL); + return -EINVAL; + } + if (errno == ERANGE) { + parse_events_error__handle(err, term->err_val, + strdup("too big"), + NULL); + return -EINVAL; + } + break; case 
PARSE_EVENTS__TERM_TYPE_DRV_CFG: case PARSE_EVENTS__TERM_TYPE_USER: case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: @@ -1165,6 +1182,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr, case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: case PARSE_EVENTS__TERM_TYPE_HARDWARE: case PARSE_EVENTS__TERM_TYPE_CPU: + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: default: if (err) { parse_events_error__handle(err, term->err_term, @@ -1289,6 +1307,9 @@ do { \ ADD_CONFIG_TERM_VAL(AUX_SAMPLE_SIZE, aux_sample_size, term->val.num, term->weak); break; + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: + ADD_CONFIG_TERM_STR(RATIO_TO_PREV, term->val.str, term->weak); + break; case PARSE_EVENTS__TERM_TYPE_USER: case PARSE_EVENTS__TERM_TYPE_CONFIG: case PARSE_EVENTS__TERM_TYPE_CONFIG1: @@ -1355,6 +1376,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: case PARSE_EVENTS__TERM_TYPE_HARDWARE: case PARSE_EVENTS__TERM_TYPE_CPU: + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: default: break; } diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index 62dc7202e3ba..b2dcc52e3814 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -83,7 +83,8 @@ enum parse_events__term_type { PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE, PARSE_EVENTS__TERM_TYPE_HARDWARE, PARSE_EVENTS__TERM_TYPE_CPU, -#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_CPU + 1) + PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV, +#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV + 1) }; struct parse_events_term { diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l index 2034590eb789..25206de68007 100644 --- a/tools/perf/util/parse-events.l +++ b/tools/perf/util/parse-events.l @@ -336,6 +336,7 @@ aux-action { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_ACTION); } aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); } 
metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); } cpu { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CPU); } +ratio-to-prev { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV); } cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); } stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); } stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); } diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index 5a291f1380ed..3d1f975e8db9 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -1541,7 +1541,7 @@ static int pmu_config_term(const struct perf_pmu *pmu, break; case PARSE_EVENTS__TERM_TYPE_USER: /* Not hardcoded. */ return -EINVAL; - case PARSE_EVENTS__TERM_TYPE_NAME ... PARSE_EVENTS__TERM_TYPE_CPU: + case PARSE_EVENTS__TERM_TYPE_NAME ... PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: /* Skip non-config terms. */ break; default: @@ -1930,6 +1930,7 @@ int perf_pmu__for_each_format(struct perf_pmu *pmu, void *state, pmu_format_call "aux-action=(pause|resume|start-paused)", "aux-sample-size=number", "cpu=number", + "ratio-to-prev=string", }; struct perf_pmu_format *format; int ret; -- 2.50.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
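The acr_mask arithmetic described in the comment inside arch_evsel__apply_ratio_to_prev() can be tried out in isolation with the sketch below. It is a standalone illustration, not perf code: struct acr_masks and acr_masks_for() are hypothetical names, and group_idx stands in for the value the patch obtains from evsel__group_idx().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two config2 (acr_mask) values. */
struct acr_masks {
	uint64_t prev;	/* mask for the previous event in the group */
	uint64_t cur;	/* mask for the event carrying ratio-to-prev */
};

/*
 * Same shifts as in the patch: group_idx is the index, within its
 * group, of the event carrying ratio-to-prev, so it must be >= 1.
 */
struct acr_masks acr_masks_for(int group_idx)
{
	struct acr_masks m;
	uint64_t ev_mask, pr_ev_mask;

	assert(group_idx >= 1 && group_idx < 64);
	ev_mask = 1ull << group_idx;
	pr_ev_mask = 1ull << (group_idx - 1);

	m.prev = ev_mask;		/* previous event overflows: reload current */
	m.cur = ev_mask | pr_ev_mask;	/* current overflows: reload both */
	return m;
}
```

For a two-event group the term sits on the event at index 1, so acr_masks_for(1) gives 0x2 for the previous event and 0x3 for the current one, matching the acr_mask values in the documentation example.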
* Re: [RESEND][PATCH v2 1/2] perf record: Add ratio-to-prev term 2025-09-02 16:40 ` [RESEND][PATCH v2 1/2] perf record: Add ratio-to-prev term Thomas Falcon @ 2025-09-24 21:34 ` Ian Rogers 0 siblings, 0 replies; 11+ messages in thread From: Ian Rogers @ 2025-09-24 21:34 UTC (permalink / raw) To: Thomas Falcon Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang, linux-kernel, linux-perf-users, Andi Kleen On Tue, Sep 2, 2025 at 9:41 AM Thomas Falcon <thomas.falcon@intel.com> wrote: > > Provide ratio-to-prev term which allows the user to > set the event sample period of two events corresponding > to a desired ratio. If using on an Intel x86 platform with > Auto Counter Reload support, also set corresponding event's > config2 attribute with a bitmask which counters to reset and > which counters to sample if the desired ratio is met or exceeded. > On other platforms, only the sample period is affected by the > ratio-to-prev term. 
> > Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> > --- > tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++ > tools/perf/Documentation/perf-list.txt | 2 + > tools/perf/arch/x86/util/evsel.c | 53 ++++++++++++++++++ > tools/perf/util/evsel.c | 76 ++++++++++++++++++++++++++ > tools/perf/util/evsel.h | 1 + > tools/perf/util/evsel_config.h | 1 + > tools/perf/util/parse-events.c | 22 ++++++++ > tools/perf/util/parse-events.h | 3 +- > tools/perf/util/parse-events.l | 1 + > tools/perf/util/pmu.c | 3 +- > 10 files changed, 213 insertions(+), 2 deletions(-) > create mode 100644 tools/perf/Documentation/intel-acr.txt > > diff --git a/tools/perf/Documentation/intel-acr.txt b/tools/perf/Documentation/intel-acr.txt > new file mode 100644 > index 000000000000..72654fdd9a52 > --- /dev/null > +++ b/tools/perf/Documentation/intel-acr.txt > @@ -0,0 +1,53 @@ > +Intel Auto Counter Reload Support > +--------------------------------- > +Support for Intel Auto Counter Reload in perf tools > + > +Auto counter reload provides a means for software to specify to hardware > +that certain counters, if supported, should be automatically reloaded > +upon overflow of chosen counters. By taking a sample only if the rate of > +one event exceeds some threshold relative to the rate of another event, > +this feature enables software to sample based on the relative rate of > +two or more events. To enable this, the user must provide a sample period > +term and a bitmask ("acr_mask") for each relevant event specifying the > +counters in an event group to reload if the event's specified sample > +period is exceeded. 
> + > +For example, if the user desires to measure a scenario when IPC > 2, > +the event group might look like the one below: > + > + perf record -e {cpu_atom/instructions,period=200000,acr_mask=0x2/, \ > + cpu_atom/cycles,period=100000,acr_mask=0x3/} -- true > + > +In this case, if the "instructions" counter exceeds the sample period of > +200000, the second counter, "cycles", will be reset and a sample will be > +taken. If "cycles" is exceeded first, both counters in the group will be > +reset. In this way, samples will only be taken for cases where IPC > 2. > + > +The acr_mask term is a hexadecimal value representing a bitmask of the > +events in the group to be reset when the period is exceeded. In the > +example above, "instructions" is assigned an acr_mask of 0x2, meaning > +only the second event in the group is reloaded and a sample is taken > +for the first event. "cycles" is assigned an acr_mask of 0x3, meaning > +that both event counters will be reset if the sample period is exceeded > +first. This is great! Thank you for adding it. > + > +ratio-to-prev Event Term > +------------------------ > +To simplify this, an event term "ratio-to-prev" is provided which is used > +alongside the sample period term n or the -c/--count option. This would > +allow users to specify the desired relative rate between events as a > +ratio. Note: Both events compared must belong to the same PMU. > + > +The command above would then become > + > + perf record -e {cpu_atom/instructions/, \ > + cpu_atom/cycles,period=100000,ratio-to-prev=0.5/} -- true > + > +ratio-to-prev is the ratio of the event using the term relative > +to the previous event in the group, which will always be 1, > +for a 1:0.5 or 2:1 ratio. 
> + > +To sample for IPC < 2 for example, the events need to be reordered: > + > + perf record -e {cpu_atom/cycles/, \ > + cpu_atom/instructions,period=200000,ratio-to-prev=2.0/} -- true > diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt > index 28215306a78a..10bc66d39202 100644 > --- a/tools/perf/Documentation/perf-list.txt > +++ b/tools/perf/Documentation/perf-list.txt > @@ -392,6 +392,8 @@ Support raw format: > . '--raw-dump [hw|sw|cache|tracepoint|pmu|event_glob]', shows the raw-dump of > a certain kind of events. > > +include::intel-acr.txt[] > + > SEE ALSO > -------- > linkperf:perf-stat[1], linkperf:perf-top[1], > diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c > index 9bc80fff3aa0..84858e4c397d 100644 > --- a/tools/perf/arch/x86/util/evsel.c > +++ b/tools/perf/arch/x86/util/evsel.c > @@ -1,7 +1,9 @@ > // SPDX-License-Identifier: GPL-2.0 > #include <stdio.h> > #include <stdlib.h> > +#include "util/evlist.h" > #include "util/evsel.h" > +#include "util/evsel_config.h" > #include "util/env.h" > #include "util/pmu.h" > #include "util/pmus.h" > @@ -67,6 +69,57 @@ int arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size) > event_name); > } > > +void arch_evsel__apply_ratio_to_prev(struct evsel *evsel, > + struct perf_event_attr *attr) > +{ > + struct perf_event_attr *prev_attr = NULL; > + struct evsel *evsel_prev = NULL; > + const char *name = "acr_mask"; > + int evsel_idx = 0; > + __u64 ev_mask, pr_ev_mask; > + > + if (!perf_pmu__has_format(evsel->pmu, name)) { > + pr_err("'%s' does not have acr_mask format support\n", evsel->pmu->name); > + return; > + } > + if (perf_pmu__format_type(evsel->pmu, name) != > + PERF_PMU_FORMAT_VALUE_CONFIG2) { > + pr_err("'%s' does not have config2 format support\n", evsel->pmu->name); > + return; > + } > + > + evsel_prev = evsel__prev(evsel); > + if (!evsel_prev) { > + pr_err("Previous event does not exist.\n"); > + return; > + } Nit: you should 
probably check that the leader of both events is the same, this means the events are in the same group. > + > + prev_attr = &evsel_prev->core.attr; > + > + if (prev_attr->config2) { > + pr_err("'%s' has set config2 (acr_mask?) already, configuration not supported\n", evsel_prev->name); > + return; > + } > + > + /* > + * acr_mask (config2) is calculated using the event's index in > + * the event group. The first event will use the index of the > + * second event as its mask (e.g., 0x2), indicating that the > + * second event counter will be reset and a sample taken for > + * the first event if its counter overflows. The second event > + * will use the mask consisting of the first and second bits > + * (e.g., 0x3), meaning both counters will be reset if the > + * second event counter overflows. > + */ > + > + evsel_idx = evsel__group_idx(evsel); > + ev_mask = 1ull << evsel_idx; > + pr_ev_mask = 1ull << (evsel_idx - 1); > + > + prev_attr->config2 = ev_mask; > + attr->config2 = ev_mask | pr_ev_mask; > +} > + > static void ibs_l3miss_warn(void) > { > pr_warning( > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c > index d264c143b592..f6f93920c0aa 100644 > --- a/tools/perf/util/evsel.c > +++ b/tools/perf/util/evsel.c > @@ -1091,6 +1091,71 @@ static void evsel__reset_callgraph(struct evsel *evsel, struct callchain_param * > } > } > > +static void evsel__apply_ratio_to_prev(struct evsel *evsel, > + struct perf_event_attr *attr, > + struct record_opts *opts, > + const char *buf) > +{ > + struct perf_event_attr *prev_attr = NULL; > + struct evsel *evsel_prev = NULL; > + u64 type = evsel->core.attr.sample_type; > + u64 prev_type = 0; > + double rtp; > + > + rtp = strtod(buf, NULL); > + if (rtp <= 0) { > + pr_err("Invalid ratio-to-prev value %lf\n", rtp); > + return; > + } > + if (evsel == evsel__leader(evsel)) { > + pr_err("Invalid use of ratio-to-prev term without preceding element in group\n"); > + return; > + } Ah, it is done here. 
> + if (!evsel->pmu->is_core) { > + pr_err("Event using ratio-to-prev term must have a core PMU\n"); > + return; > + } > + > + evsel_prev = evsel__prev(evsel); > + if (!evsel_prev) { > + pr_err("Previous event does not exist.\n"); > + return; > + } > + > + if (evsel_prev->pmu->type != evsel->pmu->type) { > + pr_err("Compared events (\"%s\", \"%s\") must have same PMU\n", > + evsel->name, evsel_prev->name); > + return; > + } > + > + prev_attr = &evsel_prev->core.attr; > + prev_type = evsel_prev->core.attr.sample_type; > + > + if (!(prev_type & PERF_SAMPLE_PERIOD)) { > + attr->sample_period = prev_attr->sample_period * rtp; > + attr->freq = 0; > + evsel__reset_sample_bit(evsel, PERIOD); > + } else if (!(type & PERF_SAMPLE_PERIOD)) { > + prev_attr->sample_period = attr->sample_period / rtp; > + prev_attr->freq = 0; > + evsel__reset_sample_bit(evsel_prev, PERIOD); > + } else { > + if (opts->user_interval != ULLONG_MAX) { > + prev_attr->sample_period = opts->user_interval; > + attr->sample_period = prev_attr->sample_period * rtp; > + prev_attr->freq = 0; > + attr->freq = 0; > + evsel__reset_sample_bit(evsel_prev, PERIOD); > + evsel__reset_sample_bit(evsel, PERIOD); > + } else { > + pr_err("Event period term or count (-c) must be set when using ratio-to-prev term.\n"); > + return; > + } > + } > + > + arch_evsel__apply_ratio_to_prev(evsel, attr); > +} > + > static void evsel__apply_config_terms(struct evsel *evsel, > struct record_opts *opts, bool track) > { > @@ -1104,6 +1169,7 @@ static void evsel__apply_config_terms(struct evsel *evsel, > u32 dump_size = 0; > int max_stack = 0; > const char *callgraph_buf = NULL; > + const char *rtp_buf = NULL; > > list_for_each_entry(term, config_terms, list) { > switch (term->type) { > @@ -1174,6 +1240,9 @@ static void evsel__apply_config_terms(struct evsel *evsel, > break; > case EVSEL__CONFIG_TERM_CFG_CHG: > break; > + case EVSEL__CONFIG_TERM_RATIO_TO_PREV: > + rtp_buf = term->val.str; > + break; > default: > break; > } > @@ 
-1225,6 +1294,8 @@ static void evsel__apply_config_terms(struct evsel *evsel, > evsel__config_callchain(evsel, opts, &param); > } > } > + if (rtp_buf) > + evsel__apply_ratio_to_prev(evsel, attr, opts, rtp_buf); > } > > struct evsel_config_term *__evsel__get_config_term(struct evsel *evsel, enum evsel_term_type type) > @@ -1249,6 +1320,11 @@ void __weak arch__post_evsel_config(struct evsel *evsel __maybe_unused, > { > } > > +void __weak arch_evsel__apply_ratio_to_prev(struct evsel *evsel __maybe_unused, > + struct perf_event_attr *attr __maybe_unused) > +{ > +} > + I'm not a fan of weak functions as they introduce subtle bugs, but I can see why you have this organization. Reviewed-by: Ian Rogers <irogers@google.com> Thanks, Ian > static void evsel__set_default_freq_period(struct record_opts *opts, > struct perf_event_attr *attr) > { > diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h > index 5797a02e5d6a..5002c795e818 100644 > --- a/tools/perf/util/evsel.h > +++ b/tools/perf/util/evsel.h > @@ -342,6 +342,7 @@ void evsel__set_sample_id(struct evsel *evsel, bool use_sample_identifier); > void arch_evsel__set_sample_weight(struct evsel *evsel); > void arch__post_evsel_config(struct evsel *evsel, struct perf_event_attr *attr); > int arch_evsel__open_strerror(struct evsel *evsel, char *msg, size_t size); > +void arch_evsel__apply_ratio_to_prev(struct evsel *evsel, struct perf_event_attr *attr); > > int evsel__set_filter(struct evsel *evsel, const char *filter); > int evsel__append_tp_filter(struct evsel *evsel, const char *filter); > diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h > index 94a1e9cf73d6..bcd3a978f0c4 100644 > --- a/tools/perf/util/evsel_config.h > +++ b/tools/perf/util/evsel_config.h > @@ -28,6 +28,7 @@ enum evsel_term_type { > EVSEL__CONFIG_TERM_AUX_ACTION, > EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE, > EVSEL__CONFIG_TERM_CFG_CHG, > + EVSEL__CONFIG_TERM_RATIO_TO_PREV, > }; > > struct evsel_config_term { > diff --git 
a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > index 8282ddf68b98..850de3a51f47 100644 > --- a/tools/perf/util/parse-events.c > +++ b/tools/perf/util/parse-events.c > @@ -834,6 +834,7 @@ const char *parse_events__term_type_str(enum parse_events__term_type term_type) > [PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache", > [PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware", > [PARSE_EVENTS__TERM_TYPE_CPU] = "cpu", > + [PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV] = "ratio-to-prev", > }; > if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR) > return "unknown term"; > @@ -884,6 +885,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er > case PARSE_EVENTS__TERM_TYPE_RAW: > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: > default: > if (!err) > return false; > @@ -1037,6 +1039,21 @@ do { \ > perf_cpu_map__put(map); > break; > } > + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: > + CHECK_TYPE_VAL(STR); > + if (strtod(term->val.str, NULL) <= 0) { > + parse_events_error__handle(err, term->err_val, > + strdup("zero or negative"), > + NULL); > + return -EINVAL; > + } > + if (errno == ERANGE) { > + parse_events_error__handle(err, term->err_val, > + strdup("too big"), > + NULL); > + return -EINVAL; > + } > + break; > case PARSE_EVENTS__TERM_TYPE_DRV_CFG: > case PARSE_EVENTS__TERM_TYPE_USER: > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > @@ -1165,6 +1182,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr, > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > case PARSE_EVENTS__TERM_TYPE_CPU: > + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: > default: > if (err) { > parse_events_error__handle(err, term->err_term, > @@ -1289,6 +1307,9 @@ do { \ > ADD_CONFIG_TERM_VAL(AUX_SAMPLE_SIZE, aux_sample_size, > term->val.num, term->weak); > break; > + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: > + 
ADD_CONFIG_TERM_STR(RATIO_TO_PREV, term->val.str, term->weak); > + break; > case PARSE_EVENTS__TERM_TYPE_USER: > case PARSE_EVENTS__TERM_TYPE_CONFIG: > case PARSE_EVENTS__TERM_TYPE_CONFIG1: > @@ -1355,6 +1376,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > case PARSE_EVENTS__TERM_TYPE_CPU: > + case PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: > default: > break; > } > diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h > index 62dc7202e3ba..b2dcc52e3814 100644 > --- a/tools/perf/util/parse-events.h > +++ b/tools/perf/util/parse-events.h > @@ -83,7 +83,8 @@ enum parse_events__term_type { > PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE, > PARSE_EVENTS__TERM_TYPE_HARDWARE, > PARSE_EVENTS__TERM_TYPE_CPU, > -#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_CPU + 1) > + PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV, > +#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV + 1) > }; > > struct parse_events_term { > diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l > index 2034590eb789..25206de68007 100644 > --- a/tools/perf/util/parse-events.l > +++ b/tools/perf/util/parse-events.l > @@ -336,6 +336,7 @@ aux-action { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_ACTION); } > aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); } > metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); } > cpu { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CPU); } > +ratio-to-prev { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV); } > cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); } > stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); } > stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); } > 
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c > index 5a291f1380ed..3d1f975e8db9 100644 > --- a/tools/perf/util/pmu.c > +++ b/tools/perf/util/pmu.c > @@ -1541,7 +1541,7 @@ static int pmu_config_term(const struct perf_pmu *pmu, > break; > case PARSE_EVENTS__TERM_TYPE_USER: /* Not hardcoded. */ > return -EINVAL; > - case PARSE_EVENTS__TERM_TYPE_NAME ... PARSE_EVENTS__TERM_TYPE_CPU: > + case PARSE_EVENTS__TERM_TYPE_NAME ... PARSE_EVENTS__TERM_TYPE_RATIO_TO_PREV: > /* Skip non-config terms. */ > break; > default: > @@ -1930,6 +1930,7 @@ int perf_pmu__for_each_format(struct perf_pmu *pmu, void *state, pmu_format_call > "aux-action=(pause|resume|start-paused)", > "aux-sample-size=number", > "cpu=number", > + "ratio-to-prev=string", > }; > struct perf_pmu_format *format; > int ret; > -- > 2.50.1 > ^ permalink raw reply [flat|nested] 11+ messages in thread
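A note on the ratio-to-prev value check in the parse-events.c hunk quoted above: it tests errno after strtod() without clearing it first, and it does not look at where parsing stopped. The following is a minimal standalone sketch of a stricter check, not code from the patch. The helper name parse_ratio() and its return convention are assumptions made for illustration only; nothing by that name exists in perf.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the patch or of perf): parse a
 * ratio-to-prev value. Returns 0 and stores the ratio on success,
 * -EINVAL for malformed, zero, negative or non-finite input, and
 * -ERANGE when strtod() reports overflow or underflow.
 */
static int parse_ratio(const char *str, double *ratio)
{
	char *end;
	double val;

	assert(str && ratio);

	errno = 0;			/* strtod() sets errno but never clears it */
	val = strtod(str, &end);

	if (end == str || *end != '\0')	/* nothing parsed, or trailing junk */
		return -EINVAL;
	if (errno == ERANGE)		/* value out of range for a double */
		return -ERANGE;
	if (!isfinite(val) || val <= 0)	/* rejects "nan", "inf", 0 and negatives */
		return -EINVAL;

	*ratio = val;
	return 0;
}
```

With a check of this shape, a stale ERANGE left in errno by an earlier call can no longer cause a valid ratio to be reported as "too big", and input such as "5x" is rejected instead of being read as 5.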
* [RESEND][PATCH v2 2/2] perf record: Add auto counter reload parse and regression tests 2025-09-02 16:40 [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload Thomas Falcon 2025-09-02 16:40 ` [RESEND][PATCH v2 1/2] perf record: Add ratio-to-prev term Thomas Falcon @ 2025-09-02 16:40 ` Thomas Falcon 2025-09-24 21:37 ` Ian Rogers 2025-09-24 19:09 ` [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload Falcon, Thomas ` (2 subsequent siblings) 4 siblings, 1 reply; 11+ messages in thread From: Thomas Falcon @ 2025-09-02 16:40 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Kan Liang Cc: linux-kernel, linux-perf-users, Andi Kleen, Thomas Falcon Include event parsing and regression tests for auto counter reload and ratio-to-prev event term. Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> --- tools/perf/tests/parse-events.c | 54 ++++++++++++++++++++++++++++++++ tools/perf/tests/shell/record.sh | 40 +++++++++++++++++++++++ 2 files changed, 94 insertions(+) diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c index bb8004397650..67550cc60555 100644 --- a/tools/perf/tests/parse-events.c +++ b/tools/perf/tests/parse-events.c @@ -1736,6 +1736,53 @@ static int test__intel_pt(struct evlist *evlist) return TEST_OK; } +static bool test__acr_valid(void) +{ + struct perf_pmu *pmu = NULL; + + while ((pmu = perf_pmus__scan_core(pmu)) != NULL) { + if (perf_pmu__has_format(pmu, "acr_mask")) + return true; + } + + return false; +} + +static int test__ratio_to_prev(struct evlist *evlist) +{ + struct evsel *evsel; + int ret; + + TEST_ASSERT_VAL("wrong number of entries", 2 * perf_pmus__num_core_pmus() == evlist->core.nr_entries); + + evlist__for_each_entry(evlist, evsel) { + if (!perf_pmu__has_format(evsel->pmu, "acr_mask")) + return TEST_OK; + + if (evsel == 
evlist__first(evlist)) { + TEST_ASSERT_VAL("wrong config2", 0 == evsel->core.attr.config2); + TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel)); + TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2); + TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + } else { + TEST_ASSERT_VAL("wrong config2", 0 == evsel->core.attr.config2); + TEST_ASSERT_VAL("wrong leader", !evsel__is_group_leader(evsel)); + TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 0); + TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); + } + if (ret) + return ret; + /* + * The period value gets configured within evlist__config, + * while this test executes only parse events method. + */ + TEST_ASSERT_VAL("wrong period", 0 == evsel->core.attr.sample_period); + } + return TEST_OK; +} + static int test__checkevent_complex_name(struct evlist *evlist) { struct evsel *evsel = evlist__first(evlist); @@ -2249,6 +2296,13 @@ static const struct evlist_test test__events[] = { .check = test__checkevent_tracepoint, /* 4 */ }, + { + .name = "{cycles,instructions/period=200000,ratio-to-prev=2.0/}", + .valid = test__acr_valid, + .check = test__ratio_to_prev, + /* 5 */ + }, + }; static const struct evlist_test test__events_pmu[] = { diff --git a/tools/perf/tests/shell/record.sh b/tools/perf/tests/shell/record.sh index b1ad24fb3b33..0f5841c479e7 100755 --- a/tools/perf/tests/shell/record.sh +++ b/tools/perf/tests/shell/record.sh @@ -388,6 +388,45 @@ test_callgraph() { echo "Callgraph test [Success]" } +test_ratio_to_prev() { + echo "ratio-to-prev test" + if ! perf record -o /dev/null -e "{instructions, cycles/period=100000,ratio-to-prev=0.5/}" \ + true 2> /dev/null + then + echo "ratio-to-prev [Skipped not supported]" + return + fi + if ! 
perf record -o /dev/null -e "instructions, cycles/period=100000,ratio-to-prev=0.5/" \ + true |& grep -q 'Invalid use of ratio-to-prev term without preceding element in group' + then + echo "ratio-to-prev test [Failed elements must be in same group]" + err=1 + return + fi + if ! perf record -o /dev/null -e "{instructions,dummy,cycles/period=100000,ratio-to-prev=0.5/}" \ + true |& grep -q 'must have same PMU' + then + echo "ratio-to-prev test [Failed elements must have same PMU]" + err=1 + return + fi + if ! perf record -o /dev/null -e "{instructions,cycles/ratio-to-prev=0.5/}" \ + true |& grep -q 'Event period term or count (-c) must be set when using ratio-to-prev term.' + then + echo "ratio-to-prev test [Failed period must be set]" + err=1 + return + fi + if ! perf record -o /dev/null -e "{cycles/ratio-to-prev=0.5/}" \ + true |& grep -q 'Invalid use of ratio-to-prev term without preceding element in group' + then + echo "ratio-to-prev test [Failed need 2+ events]" + err=1 + return + fi + echo "Basic ratio-to-prev record test [Success]" +} + # raise the limit of file descriptors to minimum if [[ $default_fd_limit -lt $min_fd_limit ]]; then ulimit -Sn $min_fd_limit @@ -404,6 +443,7 @@ test_leader_sampling test_topdown_leader_sampling test_precise_max test_callgraph +test_ratio_to_prev # restore the default value ulimit -Sn $default_fd_limit -- 2.50.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RESEND][PATCH v2 2/2] perf record: Add auto counter reload parse and regression tests 2025-09-02 16:40 ` [RESEND][PATCH v2 2/2] perf record: Add auto counter reload parse and regression tests Thomas Falcon @ 2025-09-24 21:37 ` Ian Rogers 0 siblings, 0 replies; 11+ messages in thread From: Ian Rogers @ 2025-09-24 21:37 UTC (permalink / raw) To: Thomas Falcon Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang, linux-kernel, linux-perf-users, Andi Kleen On Tue, Sep 2, 2025 at 9:41 AM Thomas Falcon <thomas.falcon@intel.com> wrote: > > Include event parsing and regression tests for auto counter reload > and ratio-to-prev event term. > > Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> There will be conflicts with: https://lore.kernel.org/linux-perf-users/20250923223312.238185-1-irogers@google.com/ as I refactored parts of the parse-events test. I wouldn't be holding my breath waiting for those patches to land, so I guess I'll deal with the issues when they pop up. 
Thanks, Ian > --- > tools/perf/tests/parse-events.c | 54 ++++++++++++++++++++++++++++++++ > tools/perf/tests/shell/record.sh | 40 +++++++++++++++++++++++ > 2 files changed, 94 insertions(+) > > diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c > index bb8004397650..67550cc60555 100644 > --- a/tools/perf/tests/parse-events.c > +++ b/tools/perf/tests/parse-events.c > @@ -1736,6 +1736,53 @@ static int test__intel_pt(struct evlist *evlist) > return TEST_OK; > } > > +static bool test__acr_valid(void) > +{ > + struct perf_pmu *pmu = NULL; > + > + while ((pmu = perf_pmus__scan_core(pmu)) != NULL) { > + if (perf_pmu__has_format(pmu, "acr_mask")) > + return true; > + } > + > + return false; > +} > + > +static int test__ratio_to_prev(struct evlist *evlist) > +{ > + struct evsel *evsel; > + int ret; > + > + TEST_ASSERT_VAL("wrong number of entries", 2 * perf_pmus__num_core_pmus() == evlist->core.nr_entries); > + > + evlist__for_each_entry(evlist, evsel) { > + if (!perf_pmu__has_format(evsel->pmu, "acr_mask")) > + return TEST_OK; > + > + if (evsel == evlist__first(evlist)) { > + TEST_ASSERT_VAL("wrong config2", 0 == evsel->core.attr.config2); > + TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel)); > + TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2); > + TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0); > + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); > + } else { > + TEST_ASSERT_VAL("wrong config2", 0 == evsel->core.attr.config2); > + TEST_ASSERT_VAL("wrong leader", !evsel__is_group_leader(evsel)); > + TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 0); > + TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1); > + ret = assert_hw(&evsel->core, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); > + } > + if (ret) > + return ret; > + /* > + * The period value gets configured within evlist__config, > + * while this test executes only parse events method. 
> + */ > + TEST_ASSERT_VAL("wrong period", 0 == evsel->core.attr.sample_period); > + } > + return TEST_OK; > +} > + > static int test__checkevent_complex_name(struct evlist *evlist) > { > struct evsel *evsel = evlist__first(evlist); > @@ -2249,6 +2296,13 @@ static const struct evlist_test test__events[] = { > .check = test__checkevent_tracepoint, > /* 4 */ > }, > + { > + .name = "{cycles,instructions/period=200000,ratio-to-prev=2.0/}", > + .valid = test__acr_valid, > + .check = test__ratio_to_prev, > + /* 5 */ > + }, > + > }; > > static const struct evlist_test test__events_pmu[] = { > diff --git a/tools/perf/tests/shell/record.sh b/tools/perf/tests/shell/record.sh > index b1ad24fb3b33..0f5841c479e7 100755 > --- a/tools/perf/tests/shell/record.sh > +++ b/tools/perf/tests/shell/record.sh > @@ -388,6 +388,45 @@ test_callgraph() { > echo "Callgraph test [Success]" > } > > +test_ratio_to_prev() { > + echo "ratio-to-prev test" > + if ! perf record -o /dev/null -e "{instructions, cycles/period=100000,ratio-to-prev=0.5/}" \ > + true 2> /dev/null > + then > + echo "ratio-to-prev [Skipped not supported]" > + return > + fi > + if ! perf record -o /dev/null -e "instructions, cycles/period=100000,ratio-to-prev=0.5/" \ > + true |& grep -q 'Invalid use of ratio-to-prev term without preceding element in group' > + then > + echo "ratio-to-prev test [Failed elements must be in same group]" > + err=1 > + return > + fi > + if ! perf record -o /dev/null -e "{instructions,dummy,cycles/period=100000,ratio-to-prev=0.5/}" \ > + true |& grep -q 'must have same PMU' > + then > + echo "ratio-to-prev test [Failed elements must have same PMU]" > + err=1 > + return > + fi > + if ! perf record -o /dev/null -e "{instructions,cycles/ratio-to-prev=0.5/}" \ > + true |& grep -q 'Event period term or count (-c) must be set when using ratio-to-prev term.' > + then > + echo "ratio-to-prev test [Failed period must be set]" > + err=1 > + return > + fi > + if ! 
perf record -o /dev/null -e "{cycles/ratio-to-prev=0.5/}" \ > + true |& grep -q 'Invalid use of ratio-to-prev term without preceding element in group' > + then > + echo "ratio-to-prev test [Failed need 2+ events]" > + err=1 > + return > + fi > + echo "Basic ratio-to-prev record test [Success]" > +} > + > # raise the limit of file descriptors to minimum > if [[ $default_fd_limit -lt $min_fd_limit ]]; then > ulimit -Sn $min_fd_limit > @@ -404,6 +443,7 @@ test_leader_sampling > test_topdown_leader_sampling > test_precise_max > test_callgraph > +test_ratio_to_prev > > # restore the default value > ulimit -Sn $default_fd_limit > -- > 2.50.1 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload 2025-09-02 16:40 [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload Thomas Falcon 2025-09-02 16:40 ` [RESEND][PATCH v2 1/2] perf record: Add ratio-to-prev term Thomas Falcon 2025-09-02 16:40 ` [RESEND][PATCH v2 2/2] perf record: Add auto counter reload parse and regression tests Thomas Falcon @ 2025-09-24 19:09 ` Falcon, Thomas 2025-09-30 7:28 ` Mi, Dapeng 2025-10-02 19:35 ` Arnaldo Carvalho de Melo 4 siblings, 0 replies; 11+ messages in thread From: Falcon, Thomas @ 2025-09-24 19:09 UTC (permalink / raw) To: alexander.shishkin@linux.intel.com, peterz@infradead.org, acme@kernel.org, mingo@redhat.com, mark.rutland@arm.com, Hunter, Adrian, namhyung@kernel.org, irogers@google.com, jolsa@kernel.org, kan.liang@linux.intel.com Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, ak@linux.intel.com On Tue, 2025-09-02 at 11:40 -0500, Thomas Falcon wrote: > The Auto Counter Reload (ACR)[1] feature is used to track the > relative rates of two or more perf events, only sampling > when a given threshold is exceeded. This helps reduce overhead > and unnecessary samples. However, enabling this feature > currently requires setting two parameters: Ping. Thanks, Tom > > -- Event sampling period ("period") > -- acr_mask, which determines which events get reloaded > when the sample period is reached. > > For example, in the following command: > > perf record -e "{cpu_atom/branch-misses,period=200000,\ > acr_mask=0x2/ppu,cpu_atom/branch-instructions,period=1000000,\ > acr_mask=0x3/u}" -- ./mispredict > > The goal is to limit event sampling to cases when the > branch miss rate exceeds 20%. If the branch instructions > sample period is exceeded first, both events are reloaded. > If branch misses exceed their threshold first, only the > second counter is reloaded, and a sample is taken. 
> > To simplify this, provide a new “ratio-to-prev” event term > that works alongside the period event option or -c option. > This would allow users to specify the desired relative rate > between events as a ratio, making configuration more intuitive. > > With this enhancement, the equivalent command would be: > > perf record -e "{cpu_atom/branch-misses/ppu,\ > cpu_atom/branch-instructions,period=1000000,ratio_to_prev=5/u}" \ > -- ./mispredict > > or > > perf record -e "{cpu_atom/branch-misses/ppu,\ > cpu_atom/branch-instructions,ratio-to-prev=5/u}" -c 1000000 \ > -- ./mispredict > > [1] > https://lore.kernel.org/lkml/20250327195217.2683619-1-kan.liang@linux.intel.com/ > > Changes in v2 (mostly suggested by Ian Rogers): > > -- Add documentation explaining acr_mask bitmask used by ACR > -- Move ACR specific implementation to arch/x86/ > -- Provide test cases for event parsing and perf record tests > > Thomas Falcon (2): > perf record: Add ratio-to-prev term > perf record: add auto counter reload parse and regression tests > > tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++ > tools/perf/Documentation/perf-list.txt | 2 + > tools/perf/arch/x86/util/evsel.c | 53 ++++++++++++++++++ > tools/perf/tests/parse-events.c | 54 ++++++++++++++++++ > tools/perf/tests/shell/record.sh | 40 ++++++++++++++ > tools/perf/util/evsel.c | 76 > ++++++++++++++++++++++++++ > tools/perf/util/evsel.h | 1 + > tools/perf/util/evsel_config.h | 1 + > tools/perf/util/parse-events.c | 22 ++++++++ > tools/perf/util/parse-events.h | 3 +- > tools/perf/util/parse-events.l | 1 + > tools/perf/util/pmu.c | 3 +- > 12 files changed, 307 insertions(+), 2 deletions(-) > create mode 100644 tools/perf/Documentation/intel-acr.txt > ^ permalink raw reply [flat|nested] 11+ messages in thread
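For readers following the quoted cover letter: the two command forms are described as equivalent, so a period of 1000000 with a ratio of 5 on the second event corresponds to period=200000,acr_mask=0x2 on the first event and acr_mask=0x3 on the second. The sketch below only reproduces that arithmetic. The arch/x86 implementation is not shown in this part of the thread, so the struct, the function name and the truncating division are all assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for per-event settings; not a perf structure. */
struct acr_event {
	uint64_t period;	/* sampling period */
	uint64_t acr_mask;	/* bit N set: reload group member N on overflow */
};

/*
 * Derive the settings of a two-event ACR group from the second event's
 * period and its ratio to the previous (first) event, following the
 * example values given in the cover letter.
 */
static void acr_pair_from_ratio(uint64_t period, double ratio,
				struct acr_event *prev, struct acr_event *cur)
{
	assert(ratio > 0 && prev && cur);

	/* Truncating division is an assumption; the patch may round differently. */
	prev->period = (uint64_t)((double)period / ratio);
	prev->acr_mask = 0x2;	/* first event overflows: reload only the second */

	cur->period = period;
	cur->acr_mask = 0x3;	/* second event overflows: reload both */
}
```

Calling acr_pair_from_ratio(1000000, 5.0, ...) yields the 200000/0x2 and 1000000/0x3 pairs from the explicit acr_mask command line.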
* Re: [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload 2025-09-02 16:40 [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload Thomas Falcon ` (2 preceding siblings ...) 2025-09-24 19:09 ` [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload Falcon, Thomas @ 2025-09-30 7:28 ` Mi, Dapeng 2025-10-02 15:38 ` Falcon, Thomas 2025-10-02 19:35 ` Arnaldo Carvalho de Melo 4 siblings, 1 reply; 11+ messages in thread From: Mi, Dapeng @ 2025-09-30 7:28 UTC (permalink / raw) To: Thomas Falcon, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Kan Liang Cc: linux-kernel, linux-perf-users, Andi Kleen On 9/3/2025 12:40 AM, Thomas Falcon wrote: > The Auto Counter Reload (ACR)[1] feature is used to track the > relative rates of two or more perf events, only sampling > when a given threshold is exceeded. This helps reduce overhead > and unnecessary samples. However, enabling this feature > currently requires setting two parameters: > > -- Event sampling period ("period") > -- acr_mask, which determines which events get reloaded > when the sample period is reached. > > For example, in the following command: > > perf record -e "{cpu_atom/branch-misses,period=200000,\ > acr_mask=0x2/ppu,cpu_atom/branch-instructions,period=1000000,\ > acr_mask=0x3/u}" -- ./mispredict > > The goal is to limit event sampling to cases when the > branch miss rate exceeds 20%. If the branch instructions > sample period is exceeded first, both events are reloaded. > If branch misses exceed their threshold first, only the > second counter is reloaded, and a sample is taken. > > To simplify this, provide a new “ratio-to-prev” event term > that works alongside the period event option or -c option. > This would allow users to specify the desired relative rate > between events as a ratio, making configuration more intuitive. 
>
> With this enhancement, the equivalent command would be:
>
> perf record -e "{cpu_atom/branch-misses/ppu,\
> cpu_atom/branch-instructions,period=1000000,ratio_to_prev=5/u}" \
> -- ./mispredict

Hi Tom,

Does this "ratio-to-prev" option support three or more events in an ACR group?

If not, should we consider supporting groups with three or more events? (If I remember correctly, the PMU driver should support it.) For example:

perf record -e "{cpu_atom/branch-misses,period=200000,acr_mask=0x6/p,cpu_atom/branches,period=1000000,acr_mask=0x7/,cpu_atom/branches,period=1000000,acr_mask=0x7/}" -- sleep 1

Of course, this is just an example showing that such cases are supported; it doesn't mean the command is meaningful. But we can't rule out that users have real requirements like this.

If we want to support three or more events in an ACR group (if that isn't supported already), we'd better rename the "ratio-to-prev" option to "ratio-to-head" and allow only the group leader to set its sampling period explicitly with the "period" option. The sampling period of every other group member would then be calculated from the group leader's sampling period and "ratio-to-head", maybe like this:

perf record -e "{cpu_atom/branch-misses,period=200000/p,cpu_atom/branches,ratio-to-head=5/,cpu_atom/branches,ratio-to-head=5/}" -- sleep 1

Thanks.

> > or > > perf record -e "{cpu_atom/branch-misses/ppu,\ > cpu_atom/branch-instructions,ratio-to-prev=5/u}" -c 1000000 \ > -- ./mispredict > > [1] https://lore.kernel.org/lkml/20250327195217.2683619-1-kan.liang@linux.intel.com/ > > Changes in v2 (mostly suggested by Ian Rogers): > > -- Add documentation explaining acr_mask bitmask used by ACR > -- Move ACR specific implementation to arch/x86/ > -- Provide test cases for event parsing and perf record tests > > Thomas Falcon (2): > perf record: Add ratio-to-prev term > perf record: add auto counter reload parse and regression tests > > tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++ > tools/perf/Documentation/perf-list.txt | 2 + > tools/perf/arch/x86/util/evsel.c | 53 ++++++++++++++++++ > tools/perf/tests/parse-events.c | 54 ++++++++++++++++++ > tools/perf/tests/shell/record.sh | 40 ++++++++++++++ > tools/perf/util/evsel.c | 76 ++++++++++++++++++++++++++ > tools/perf/util/evsel.h | 1 + > tools/perf/util/evsel_config.h | 1 + > tools/perf/util/parse-events.c | 22 ++++++++ > tools/perf/util/parse-events.h | 3 +- > tools/perf/util/parse-events.l | 1 + > tools/perf/util/pmu.c | 3 +- > 12 files changed, 307 insertions(+), 2 deletions(-) > create mode 100644 tools/perf/Documentation/intel-acr.txt > ^ permalink raw reply [flat|nested] 11+ messages in thread
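To make the "ratio-to-head" idea in the message above concrete, here is a rough sketch of how the per-event values in its three-event example (0x6 for the leader, 0x7 for the other members, 200000 and 1000000 as periods) could be derived. "ratio-to-head" is only a proposal in this thread and is not implemented by the series. The type, the function name, the multiplication by the ratio and the generalisation of the mask pattern beyond three events are assumptions drawn from that single example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-event settings; not a perf structure. */
struct acr_member {
	uint64_t period;
	uint64_t acr_mask;
};

/*
 * Fill in an ACR group of nr events from the leader's period and one
 * ratio per member. ratios[0] is ignored because the leader's period is
 * given explicitly.
 */
static void acr_group_from_head(uint64_t head_period, const double *ratios,
				struct acr_member *group, size_t nr)
{
	uint64_t all;
	size_t i;

	assert(ratios && group && nr >= 2 && nr < 64);

	all = (UINT64_C(1) << nr) - 1;		/* one bit per group member */

	group[0].period = head_period;
	group[0].acr_mask = all & ~UINT64_C(1);	/* leader reloads everyone but itself */

	for (i = 1; i < nr; i++) {
		/* Member period scaled from the leader; truncation is an assumption. */
		group[i].period = (uint64_t)((double)head_period * ratios[i]);
		group[i].acr_mask = all;	/* member reloads the whole group */
	}
}
```

For nr = 3, a leader period of 200000 and ratios of 5 this gives the same periods and masks as the explicit acr_mask command in the message.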
* Re: [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload 2025-09-30 7:28 ` Mi, Dapeng @ 2025-10-02 15:38 ` Falcon, Thomas 2025-10-09 2:31 ` Mi, Dapeng 0 siblings, 1 reply; 11+ messages in thread From: Falcon, Thomas @ 2025-10-02 15:38 UTC (permalink / raw) To: alexander.shishkin@linux.intel.com, peterz@infradead.org, acme@kernel.org, dapeng1.mi@linux.intel.com, mingo@redhat.com, Hunter, Adrian, namhyung@kernel.org, jolsa@kernel.org, kan.liang@linux.intel.com, irogers@google.com, mark.rutland@arm.com Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, ak@linux.intel.com On Tue, 2025-09-30 at 15:28 +0800, Mi, Dapeng wrote: > > On 9/3/2025 12:40 AM, Thomas Falcon wrote: > > The Auto Counter Reload (ACR)[1] feature is used to track the > > relative rates of two or more perf events, only sampling > > when a given threshold is exceeded. This helps reduce overhead > > and unnecessary samples. However, enabling this feature > > currently requires setting two parameters: > > > > -- Event sampling period ("period") > > -- acr_mask, which determines which events get reloaded > > when the sample period is reached. > > > > For example, in the following command: > > > > perf record -e "{cpu_atom/branch-misses,period=200000,\ > > acr_mask=0x2/ppu,cpu_atom/branch-instructions,period=1000000,\ > > acr_mask=0x3/u}" -- ./mispredict > > > > The goal is to limit event sampling to cases when the > > branch miss rate exceeds 20%. If the branch instructions > > sample period is exceeded first, both events are reloaded. > > If branch misses exceed their threshold first, only the > > second counter is reloaded, and a sample is taken. > > > > To simplify this, provide a new “ratio-to-prev” event term > > that works alongside the period event option or -c option. > > This would allow users to specify the desired relative rate > > between events as a ratio, making configuration more intuitive. 
> > > > With this enhancement, the equivalent command would be: > > > > perf record -e "{cpu_atom/branch-misses/ppu,\ > > cpu_atom/branch-instructions,period=1000000,ratio_to_prev=5/u}" \ > > -- ./mispredict > > Hi Tom, > > Does this "ratio-to-prev" option support 3 and more events in ACR > group? > Hi Dapeng, The 'ratio-to-prev' option only supports groups with two events at this time. For larger event groups, the "acr_mask" term is available. > If not, should we consider to support the cases there are 3 and more > events > in the ACR group? (If I remember correct, the PMU driver should > support it). > Correct. > e.g., > > perf record -e > "{cpu_atom/branch- > misses,period=200000,acr_mask=0x6/p,cpu_atom/branches,period=1000000, > acr_mask=0x7/,cpu_atom/branches,period=1000000,acr_mask=0x7/}" > -- sleep 1 > > Of course, this is just an example that indicates the cases are > supported, > it doesn't mean the command is meaningful. But we can't exclude that > users > have such real requirements. > > If we want to support 3 and more events in ACR group (if not > already), we'd > better rename the "ratio-to-prev" option to "ratio-to-head" and only > allow > the group leader can be set the sampling period explicitly with > "period" > option and the sampling period of all other group members can only be > calculated base on the sampling period of group leader and > the "ratio-to-head", maybe like this. > > perf record -e > "{cpu_atom/branch-misses,period=200000/p,cpu_atom/branches,ratio-to- > head=5/,cpu_atom/branches,ratio-to-head=5/}" > -- sleep 1 > > Thanks. > > Thanks, those are good suggestions, but the goal of the feature was to provide users a way to utilize ACR to make simple comparisons without needing to use the "acr_mask" field. For tests comparing larger event groups, the acr_mask field may be used instead. 
Thanks, Tom > > > > or > > > > perf record -e "{cpu_atom/branch-misses/ppu,\ > > cpu_atom/branch-instructions,ratio-to-prev=5/u}" -c 1000000 \ > > -- ./mispredict > > > > [1] > > https://lore.kernel.org/lkml/20250327195217.2683619-1-kan.liang@linux.intel.com/ > > > > Changes in v2 (mostly suggested by Ian Rogers): > > > > -- Add documentation explaining acr_mask bitmask used by ACR > > -- Move ACR specific implementation to arch/x86/ > > -- Provide test cases for event parsing and perf record tests > > > > Thomas Falcon (2): > > perf record: Add ratio-to-prev term > > perf record: add auto counter reload parse and regression tests > > > > tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++ > > tools/perf/Documentation/perf-list.txt | 2 + > > tools/perf/arch/x86/util/evsel.c | 53 ++++++++++++++++++ > > tools/perf/tests/parse-events.c | 54 ++++++++++++++++++ > > tools/perf/tests/shell/record.sh | 40 ++++++++++++++ > > tools/perf/util/evsel.c | 76 > > ++++++++++++++++++++++++++ > > tools/perf/util/evsel.h | 1 + > > tools/perf/util/evsel_config.h | 1 + > > tools/perf/util/parse-events.c | 22 ++++++++ > > tools/perf/util/parse-events.h | 3 +- > > tools/perf/util/parse-events.l | 1 + > > tools/perf/util/pmu.c | 3 +- > > 12 files changed, 307 insertions(+), 2 deletions(-) > > create mode 100644 tools/perf/Documentation/intel-acr.txt > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload 2025-10-02 15:38 ` Falcon, Thomas @ 2025-10-09 2:31 ` Mi, Dapeng 0 siblings, 0 replies; 11+ messages in thread From: Mi, Dapeng @ 2025-10-09 2:31 UTC (permalink / raw) To: Falcon, Thomas, alexander.shishkin@linux.intel.com, peterz@infradead.org, acme@kernel.org, mingo@redhat.com, Hunter, Adrian, namhyung@kernel.org, jolsa@kernel.org, kan.liang@linux.intel.com, irogers@google.com, mark.rutland@arm.com Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, ak@linux.intel.com On 10/2/2025 11:38 PM, Falcon, Thomas wrote: > On Tue, 2025-09-30 at 15:28 +0800, Mi, Dapeng wrote: >> On 9/3/2025 12:40 AM, Thomas Falcon wrote: >>> The Auto Counter Reload (ACR)[1] feature is used to track the >>> relative rates of two or more perf events, only sampling >>> when a given threshold is exceeded. This helps reduce overhead >>> and unnecessary samples. However, enabling this feature >>> currently requires setting two parameters: >>> >>> -- Event sampling period ("period") >>> -- acr_mask, which determines which events get reloaded >>> when the sample period is reached. >>> >>> For example, in the following command: >>> >>> perf record -e "{cpu_atom/branch-misses,period=200000,\ >>> acr_mask=0x2/ppu,cpu_atom/branch-instructions,period=1000000,\ >>> acr_mask=0x3/u}" -- ./mispredict >>> >>> The goal is to limit event sampling to cases when the >>> branch miss rate exceeds 20%. If the branch instructions >>> sample period is exceeded first, both events are reloaded. >>> If branch misses exceed their threshold first, only the >>> second counter is reloaded, and a sample is taken. >>> >>> To simplify this, provide a new “ratio-to-prev” event term >>> that works alongside the period event option or -c option. >>> This would allow users to specify the desired relative rate >>> between events as a ratio, making configuration more intuitive. 
>>> >>> With this enhancement, the equivalent command would be: >>> >>> perf record -e "{cpu_atom/branch-misses/ppu,\ >>> cpu_atom/branch-instructions,period=1000000,ratio_to_prev=5/u}" \ >>> -- ./mispredict >> Hi Tom, >> >> Does this "ratio-to-prev" option support 3 and more events in ACR >> group? >> > Hi Dapeng, > > The 'ratio-to-prev' option only supports groups with two events at this > time. For larger event groups, the "acr_mask" term is available. > >> If not, should we consider to support the cases there are 3 and more >> events >> in the ACR group? (If I remember correct, the PMU driver should >> support it). >> > Correct. > >> e.g., >> >> perf record -e >> "{cpu_atom/branch- >> misses,period=200000,acr_mask=0x6/p,cpu_atom/branches,period=1000000, >> acr_mask=0x7/,cpu_atom/branches,period=1000000,acr_mask=0x7/}" >> -- sleep 1 >> >> Of course, this is just an example that indicates the cases are >> supported, >> it doesn't mean the command is meaningful. But we can't exclude that >> users >> have such real requirements. >> >> If we want to support 3 and more events in ACR group (if not >> already), we'd >> better rename the "ratio-to-prev" option to "ratio-to-head" and only >> allow >> the group leader can be set the sampling period explicitly with >> "period" >> option and the sampling period of all other group members can only be >> calculated base on the sampling period of group leader and >> the "ratio-to-head", maybe like this. >> >> perf record -e >> "{cpu_atom/branch-misses,period=200000/p,cpu_atom/branches,ratio-to- >> head=5/,cpu_atom/branches,ratio-to-head=5/}" >> -- sleep 1 >> >> Thanks. >> >> > Thanks, those are good suggestions, but the goal of the feature was to > provide users a way to utilize ACR to make simple comparisons without > needing to use the "acr_mask" field. For tests comparing larger event > groups, the acr_mask field may be used instead. Yeah, I understand the intent is to get a new simple option for using the ACR feature. 
But since we already support two events in an ACR group, why not make it more generic so that it can support more events?

>
> Thanks,
> Tom
>
>>> or
>>>
>>> perf record -e "{cpu_atom/branch-misses/ppu,\
>>> cpu_atom/branch-instructions,ratio-to-prev=5/u}" -c 1000000 \
>>> -- ./mispredict
>>>
>>> [1]
>>> https://lore.kernel.org/lkml/20250327195217.2683619-1-kan.liang@linux.intel.com/
>>>
>>> Changes in v2 (mostly suggested by Ian Rogers):
>>>
>>> -- Add documentation explaining acr_mask bitmask used by ACR
>>> -- Move ACR specific implementation to arch/x86/
>>> -- Provide test cases for event parsing and perf record tests
>>>
>>> Thomas Falcon (2):
>>> perf record: Add ratio-to-prev term
>>> perf record: add auto counter reload parse and regression tests
>>>
>>> tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++
>>> tools/perf/Documentation/perf-list.txt | 2 +
>>> tools/perf/arch/x86/util/evsel.c | 53 ++++++++++++++++++
>>> tools/perf/tests/parse-events.c | 54 ++++++++++++++++++
>>> tools/perf/tests/shell/record.sh | 40 ++++++++++++++
>>> tools/perf/util/evsel.c | 76
>>> ++++++++++++++++++++++++++
>>> tools/perf/util/evsel.h | 1 +
>>> tools/perf/util/evsel_config.h | 1 +
>>> tools/perf/util/parse-events.c | 22 ++++++++
>>> tools/perf/util/parse-events.h | 3 +-
>>> tools/perf/util/parse-events.l | 1 +
>>> tools/perf/util/pmu.c | 3 +-
>>> 12 files changed, 307 insertions(+), 2 deletions(-)
>>> create mode 100644 tools/perf/Documentation/intel-acr.txt
>>>

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload
  2025-09-02 16:40 [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload Thomas Falcon
    ` (3 preceding siblings ...)
  2025-09-30  7:28 ` Mi, Dapeng
@ 2025-10-02 19:35 ` Arnaldo Carvalho de Melo
  2025-10-02 21:57   ` Falcon, Thomas
  4 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-10-02 19:35 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Kan Liang, linux-kernel, linux-perf-users, Andi Kleen

On Tue, Sep 02, 2025 at 11:40:44AM -0500, Thomas Falcon wrote:
> The Auto Counter Reload (ACR)[1] feature is used to track the
> relative rates of two or more perf events, only sampling
> when a given threshold is exceeded. This helps reduce overhead
> and unnecessary samples. However, enabling this feature
> currently requires setting two parameters:

Can you please try to rebase to what is in tmp.perf-tools-next now?

- Arnaldo

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for auto counter reload
  2025-10-02 19:35 ` Arnaldo Carvalho de Melo
@ 2025-10-02 21:57   ` Falcon, Thomas
  0 siblings, 0 replies; 11+ messages in thread
From: Falcon, Thomas @ 2025-10-02 21:57 UTC (permalink / raw)
  To: acme@kernel.org
  Cc: alexander.shishkin@linux.intel.com, ak@linux.intel.com,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	peterz@infradead.org, mark.rutland@arm.com, mingo@redhat.com,
	Hunter, Adrian, namhyung@kernel.org, jolsa@kernel.org,
	kan.liang@linux.intel.com, irogers@google.com

On Thu, 2025-10-02 at 16:35 -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Sep 02, 2025 at 11:40:44AM -0500, Thomas Falcon wrote:
> > The Auto Counter Reload (ACR)[1] feature is used to track the
> > relative rates of two or more perf events, only sampling
> > when a given threshold is exceeded. This helps reduce overhead
> > and unnecessary samples. However, enabling this feature
> > currently requires setting two parameters:
>
> Can you please try to rebase to what is in tmp.perf-tools-next now?

Ok, I will send it ASAP.

thanks,
Tom

>
> - Arnaldo

^ permalink raw reply [flat|nested] 11+ messages in thread