* [RFC PATCH v2 1/4] perf: Allow periodic events to alternate between two sample periods
2024-04-22 10:49 [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics Ben Gainey
@ 2024-04-22 10:49 ` Ben Gainey
2024-04-22 10:49 ` [RFC PATCH v2 2/4] perf: Allow adding fixed random jitter to the alternate sampling period Ben Gainey
` (3 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Ben Gainey @ 2024-04-22 10:49 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung
Cc: ak, will, james.clark, mark.rutland, alexander.shishkin, jolsa,
irogers, adrian.hunter, linux-perf-users, linux-kernel,
linux-arm-kernel, Ben Gainey
This change modifies perf_event_attr to add a second, alternative
sample period field, and modifies the core perf overflow handling
such that when specified an event will alternate between two sample
periods.
Currently, perf does not provide a mechanism for decoupling the period
over which counters are counted from the period between samples. This is
problematic for building a tool to measure per-function metrics derived
from a sampled counter group. Ideally such a tool wants a very small
sample window in order to correctly attribute the metrics to a given
function, but prefers a larger sample period that provides representative
coverage without excessive probe effect, triggering throttling, or
generating excessive amounts of data.
By alternating between a long and short sample_period and subsequently
discarding the long samples, tools may decouple the period between
samples that the tool cares about from the window of time over which
interesting counts are collected.
It is expected that typically tools would use this feature with the
cycles or instructions events as an approximation for time, but no
restrictions are applied to which events this can be applied to.
Signed-off-by: Ben Gainey <ben.gainey@arm.com>
---
include/linux/perf_event.h | 5 +++++
include/uapi/linux/perf_event.h | 3 +++
kernel/events/core.c | 39 +++++++++++++++++++++++++++++++++
3 files changed, 47 insertions(+)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d2a15c0c6f8a..212bd302e548 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -267,6 +267,11 @@ struct hw_perf_event {
*/
u64 freq_time_stamp;
u64 freq_count_stamp;
+
+ /*
+ * Indicates that the alternative_sample_period is used
+ */
+ bool using_alternative_sample_period;
#endif
};
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 3a64499b0f5d..5c1701d091cf 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -379,6 +379,7 @@ enum perf_event_read_format {
#define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */
#define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */
#define PERF_ATTR_SIZE_VER8 136 /* add: config3 */
+#define PERF_ATTR_SIZE_VER9 144 /* add: alternative_sample_period */
/*
* Hardware event_id to monitor via a performance monitoring event:
@@ -522,6 +523,8 @@ struct perf_event_attr {
__u64 sig_data;
__u64 config3; /* extension of config2 */
+
+ __u64 alternative_sample_period;
};
/*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 724e6d7e128f..07f1f931e18e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4100,6 +4100,8 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count, bo
s64 period, sample_period;
s64 delta;
+ WARN_ON_ONCE(hwc->using_alternative_sample_period);
+
period = perf_calculate_period(event, nsec, count);
delta = (s64)(period - hwc->sample_period);
@@ -9552,6 +9554,7 @@ static int __perf_event_overflow(struct perf_event *event,
int throttle, struct perf_sample_data *data,
struct pt_regs *regs)
{
+ struct hw_perf_event *hwc = &event->hw;
int events = atomic_read(&event->event_limit);
int ret = 0;
@@ -9564,6 +9567,26 @@ static int __perf_event_overflow(struct perf_event *event,
ret = __perf_event_account_interrupt(event, throttle);
+ /*
+ * Swap the sample period to the alternative period
+ */
+ if (event->attr.alternative_sample_period) {
+ bool using_alt = hwc->using_alternative_sample_period;
+ u64 sample_period = (using_alt ? event->attr.sample_period
+ : event->attr.alternative_sample_period);
+
+ hwc->sample_period = sample_period;
+ hwc->using_alternative_sample_period = !using_alt;
+
+ if (local64_read(&hwc->period_left) > 0) {
+ event->pmu->stop(event, PERF_EF_UPDATE);
+
+ local64_set(&hwc->period_left, 0);
+
+ event->pmu->start(event, PERF_EF_RELOAD);
+ }
+ }
+
/*
* XXX event_limit might not quite work as expected on inherited
* events
@@ -12005,6 +12028,12 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
local64_set(&hwc->period_left, hwc->sample_period);
+ /*
+ * alternative_sample_period cannot be used with freq
+ */
+ if (attr->freq && attr->alternative_sample_period)
+ goto err_ns;
+
/*
* We currently do not support PERF_SAMPLE_READ on inherited events.
* See perf_output_read().
@@ -12459,9 +12488,19 @@ SYSCALL_DEFINE5(perf_event_open,
if (attr.freq) {
if (attr.sample_freq > sysctl_perf_event_sample_rate)
return -EINVAL;
+ if (attr.alternative_sample_period)
+ return -EINVAL;
} else {
if (attr.sample_period & (1ULL << 63))
return -EINVAL;
+ if (attr.alternative_sample_period) {
+ if (!attr.sample_period)
+ return -EINVAL;
+ if (attr.alternative_sample_period & (1ULL << 63))
+ return -EINVAL;
+ if (attr.alternative_sample_period == attr.sample_period)
+ attr.alternative_sample_period = 0;
+ }
}
/* Only privileged users can get physical addresses */
--
2.44.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 10+ messages in thread* [RFC PATCH v2 2/4] perf: Allow adding fixed random jitter to the alternate sampling period
2024-04-22 10:49 [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics Ben Gainey
2024-04-22 10:49 ` [RFC PATCH v2 1/4] perf: Allow periodic events to alternate between two sample periods Ben Gainey
@ 2024-04-22 10:49 ` Ben Gainey
2024-04-22 13:08 ` James Clark
2024-04-22 10:49 ` [RFC PATCH v2 3/4] tools/perf: Modify event parser to support alt-period term Ben Gainey
` (2 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Ben Gainey @ 2024-04-22 10:49 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung
Cc: ak, will, james.clark, mark.rutland, alexander.shishkin, jolsa,
irogers, adrian.hunter, linux-perf-users, linux-kernel,
linux-arm-kernel, Ben Gainey
This change modifies the core perf overflow handler, adding some small
random jitter to each sample period whenever an event switches between the
two alternate sample periods. A new flag is added to perf_event_attr to
opt into this behaviour.
This change follows the discussion in [1], where it is recognized that it
may be possible for certain patterns of execution to end up with biased
results.
[1] https://lore.kernel.org/linux-perf-users/Zc24eLqZycmIg3d2@tassilo/
Signed-off-by: Ben Gainey <ben.gainey@arm.com>
---
include/uapi/linux/perf_event.h | 3 ++-
kernel/events/core.c | 11 ++++++++++-
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 5c1701d091cf..dd3697a4b300 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -461,7 +461,8 @@ struct perf_event_attr {
inherit_thread : 1, /* children only inherit if cloned with CLONE_THREAD */
remove_on_exec : 1, /* event is removed from task on exec */
sigtrap : 1, /* send synchronous SIGTRAP on event */
- __reserved_1 : 26;
+ jitter_alternate_period : 1, /* add a limited amount of jitter on each alternate period */
+ __reserved_1 : 25;
union {
__u32 wakeup_events; /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 07f1f931e18e..079ae520e836 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -15,6 +15,7 @@
#include <linux/idr.h>
#include <linux/file.h>
#include <linux/poll.h>
+#include <linux/random.h>
#include <linux/slab.h>
#include <linux/hash.h>
#include <linux/tick.h>
@@ -9546,6 +9547,8 @@ static inline bool sample_is_allowed(struct perf_event *event, struct pt_regs *r
return true;
}
+# define MAX_ALT_SAMPLE_PERIOD_JITTER 16
+
/*
* Generic event overflow handling, sampling.
*/
@@ -9573,7 +9576,10 @@ static int __perf_event_overflow(struct perf_event *event,
if (event->attr.alternative_sample_period) {
bool using_alt = hwc->using_alternative_sample_period;
u64 sample_period = (using_alt ? event->attr.sample_period
- : event->attr.alternative_sample_period);
+ : event->attr.alternative_sample_period)
+ + (event->attr.jitter_alternate_period
+ ? get_random_u32_below(MAX_ALT_SAMPLE_PERIOD_JITTER)
+ : 0);
hwc->sample_period = sample_period;
hwc->using_alternative_sample_period = !using_alt;
@@ -12503,6 +12509,9 @@ SYSCALL_DEFINE5(perf_event_open,
}
}
+ if (attr.jitter_alternate_period && !attr.alternative_sample_period)
+ return -EINVAL;
+
/* Only privileged users can get physical addresses */
if ((attr.sample_type & PERF_SAMPLE_PHYS_ADDR)) {
err = perf_allow_kernel(&attr);
--
2.44.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [RFC PATCH v2 2/4] perf: Allow adding fixed random jitter to the alternate sampling period
2024-04-22 10:49 ` [RFC PATCH v2 2/4] perf: Allow adding fixed random jitter to the alternate sampling period Ben Gainey
@ 2024-04-22 13:08 ` James Clark
2024-04-22 14:40 ` Ben Gainey
0 siblings, 1 reply; 10+ messages in thread
From: James Clark @ 2024-04-22 13:08 UTC (permalink / raw)
To: Ben Gainey
Cc: ak, will, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, linux-perf-users, linux-kernel, linux-arm-kernel,
peterz, mingo, acme, namhyung
On 22/04/2024 11:49, Ben Gainey wrote:
> This change modifies the core perf overflow handler, adding some small
> random jitter to each sample period whenever an event switches between the
> two alternate sample periods. A new flag is added to perf_event_attr to
> opt into this behaviour.
>
> This change follows the discussion in [1], where it is recognized that it
> may be possible for certain patterns of execution to end up with biased
> results.
>
> [1] https://lore.kernel.org/linux-perf-users/Zc24eLqZycmIg3d2@tassilo/
>
> Signed-off-by: Ben Gainey <ben.gainey@arm.com>
> ---
> include/uapi/linux/perf_event.h | 3 ++-
> kernel/events/core.c | 11 ++++++++++-
> 2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 5c1701d091cf..dd3697a4b300 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -461,7 +461,8 @@ struct perf_event_attr {
> inherit_thread : 1, /* children only inherit if cloned with CLONE_THREAD */
> remove_on_exec : 1, /* event is removed from task on exec */
> sigtrap : 1, /* send synchronous SIGTRAP on event */
> - __reserved_1 : 26;
> + jitter_alternate_period : 1, /* add a limited amount of jitter on each alternate period */
> + __reserved_1 : 25;
>
> union {
> __u32 wakeup_events; /* wakeup every n events */
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 07f1f931e18e..079ae520e836 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -15,6 +15,7 @@
> #include <linux/idr.h>
> #include <linux/file.h>
> #include <linux/poll.h>
> +#include <linux/random.h>
> #include <linux/slab.h>
> #include <linux/hash.h>
> #include <linux/tick.h>
> @@ -9546,6 +9547,8 @@ static inline bool sample_is_allowed(struct perf_event *event, struct pt_regs *r
> return true;
> }
>
> +# define MAX_ALT_SAMPLE_PERIOD_JITTER 16
> +
Is 16 enough to make a difference with very large alternate periods? I'm
wondering if it's worth making it customisable and instead of adding the
boolean option add a 16 bit jitter field. Or the option could still be a
boolean but the jitter value is some ratio of the alt sample period, like:
get_random_u32_below(max(16, alternative_sample_period >> 4))
> /*
> * Generic event overflow handling, sampling.
> */
> @@ -9573,7 +9576,10 @@ static int __perf_event_overflow(struct perf_event *event,
> if (event->attr.alternative_sample_period) {
> bool using_alt = hwc->using_alternative_sample_period;
> u64 sample_period = (using_alt ? event->attr.sample_period
> - : event->attr.alternative_sample_period);
> + : event->attr.alternative_sample_period)
> + + (event->attr.jitter_alternate_period
> + ? get_random_u32_below(MAX_ALT_SAMPLE_PERIOD_JITTER)
> + : 0);
>
> hwc->sample_period = sample_period;
> hwc->using_alternative_sample_period = !using_alt;
> @@ -12503,6 +12509,9 @@ SYSCALL_DEFINE5(perf_event_open,
> }
> }
>
> + if (attr.jitter_alternate_period && !attr.alternative_sample_period)
> + return -EINVAL;
> +
> /* Only privileged users can get physical addresses */
> if ((attr.sample_type & PERF_SAMPLE_PHYS_ADDR)) {
> err = perf_allow_kernel(&attr);
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [RFC PATCH v2 2/4] perf: Allow adding fixed random jitter to the alternate sampling period
2024-04-22 13:08 ` James Clark
@ 2024-04-22 14:40 ` Ben Gainey
2024-04-23 9:55 ` James Clark
0 siblings, 1 reply; 10+ messages in thread
From: Ben Gainey @ 2024-04-22 14:40 UTC (permalink / raw)
To: James Clark
Cc: acme@kernel.org, alexander.shishkin@linux.intel.com,
ak@linux.intel.com, mingo@redhat.com, namhyung@kernel.org,
Mark Rutland, peterz@infradead.org,
linux-perf-users@vger.kernel.org, adrian.hunter@intel.com,
will@kernel.org, irogers@google.com, jolsa@kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
On Mon, 2024-04-22 at 14:08 +0100, James Clark wrote:
>
>
> On 22/04/2024 11:49, Ben Gainey wrote:
> > This change modifies the core perf overflow handler, adding some
> > small
> > random jitter to each sample period whenever an event switches
> > between the
> > two alternate sample periods. A new flag is added to
> > perf_event_attr to
> > opt into this behaviour.
> >
> > This change follows the discussion in [1], where it is recognized
> > that it
> > may be possible for certain patterns of execution to end up with
> > biased
> > results.
> >
> > [1] https://lore.kernel.org/linux-perf-
> > users/Zc24eLqZycmIg3d2@tassilo/
> >
> > Signed-off-by: Ben Gainey <ben.gainey@arm.com>
> > ---
> > include/uapi/linux/perf_event.h | 3 ++-
> > kernel/events/core.c | 11 ++++++++++-
> > 2 files changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/perf_event.h
> > b/include/uapi/linux/perf_event.h
> > index 5c1701d091cf..dd3697a4b300 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -461,7 +461,8 @@ struct perf_event_attr {
> > inherit_thread : 1, /* children only inherit if cloned with
> > CLONE_THREAD */
> > remove_on_exec : 1, /* event is removed from task on exec */
> > sigtrap : 1, /* send synchronous SIGTRAP on event */
> > - __reserved_1 : 26;
> > + jitter_alternate_period : 1, /* add a limited amount of jitter on
> > each alternate period */
> > + __reserved_1 : 25;
> >
> > union {
> > __u32 wakeup_events; /* wakeup every n events */
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 07f1f931e18e..079ae520e836 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -15,6 +15,7 @@
> > #include <linux/idr.h>
> > #include <linux/file.h>
> > #include <linux/poll.h>
> > +#include <linux/random.h>
> > #include <linux/slab.h>
> > #include <linux/hash.h>
> > #include <linux/tick.h>
> > @@ -9546,6 +9547,8 @@ static inline bool sample_is_allowed(struct
> > perf_event *event, struct pt_regs *r
> > return true;
> > }
> >
> > +# define MAX_ALT_SAMPLE_PERIOD_JITTER 16
> > +
>
> Is 16 enough to make a difference with very large alternate periods?
> I'm
> wondering if it's worth making it customisable and instead of adding
> the
> boolean option add a 16 bit jitter field. Or the option could still
> be a
> boolean but the jitter value is some ratio of the alt sample period,
> like:
>
> get_random_u32_below(max(16, alternative_sample_period >> 4))
>
I don't really have a strong opinion; in all my time I've never seen an
Arm PMU produce a precise and constant period anyway, so this may be
more useful in the case the architecture is able to support precise
sampling. In any case it's is likely to be specific to a particular
workload / configuration anyway.
The main downside I can see for making it configurable is that the
compiler cannot then optimise the get_random call as well as for a
constant, which may be undesirable on this code path.
> > /*
> > * Generic event overflow handling, sampling.
> > */
> > @@ -9573,7 +9576,10 @@ static int __perf_event_overflow(struct
> > perf_event *event,
> > if (event->attr.alternative_sample_period) {
> > bool using_alt = hwc->using_alternative_sample_period;
> > u64 sample_period = (using_alt ? event->attr.sample_period
> > - : event->attr.alternative_sample_period);
> > + : event->attr.alternative_sample_period)
> > + + (event->attr.jitter_alternate_period
> > + ? get_random_u32_below(MAX_ALT_SAMPLE_PERIOD_JITTER)
> > + : 0);
> >
> > hwc->sample_period = sample_period;
> > hwc->using_alternative_sample_period = !using_alt;
> > @@ -12503,6 +12509,9 @@ SYSCALL_DEFINE5(perf_event_open,
> > }
> > }
> >
> > + if (attr.jitter_alternate_period &&
> > !attr.alternative_sample_period)
> > + return -EINVAL;
> > +
> > /* Only privileged users can get physical addresses */
> > if ((attr.sample_type & PERF_SAMPLE_PHYS_ADDR)) {
> > err = perf_allow_kernel(&attr);
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [RFC PATCH v2 2/4] perf: Allow adding fixed random jitter to the alternate sampling period
2024-04-22 14:40 ` Ben Gainey
@ 2024-04-23 9:55 ` James Clark
0 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2024-04-23 9:55 UTC (permalink / raw)
To: Ben Gainey
Cc: acme@kernel.org, alexander.shishkin@linux.intel.com,
ak@linux.intel.com, mingo@redhat.com, namhyung@kernel.org,
Mark Rutland, peterz@infradead.org,
linux-perf-users@vger.kernel.org, adrian.hunter@intel.com,
will@kernel.org, irogers@google.com, jolsa@kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
On 22/04/2024 15:40, Ben Gainey wrote:
> On Mon, 2024-04-22 at 14:08 +0100, James Clark wrote:
>>
>>
>> On 22/04/2024 11:49, Ben Gainey wrote:
>>> This change modifies the core perf overflow handler, adding some
>>> small
>>> random jitter to each sample period whenever an event switches
>>> between the
>>> two alternate sample periods. A new flag is added to
>>> perf_event_attr to
>>> opt into this behaviour.
>>>
>>> This change follows the discussion in [1], where it is recognized
>>> that it
>>> may be possible for certain patterns of execution to end up with
>>> biased
>>> results.
>>>
>>> [1] https://lore.kernel.org/linux-perf-
>>> users/Zc24eLqZycmIg3d2@tassilo/
>>>
>>> Signed-off-by: Ben Gainey <ben.gainey@arm.com>
>>> ---
>>> include/uapi/linux/perf_event.h | 3 ++-
>>> kernel/events/core.c | 11 ++++++++++-
>>> 2 files changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/perf_event.h
>>> b/include/uapi/linux/perf_event.h
>>> index 5c1701d091cf..dd3697a4b300 100644
>>> --- a/include/uapi/linux/perf_event.h
>>> +++ b/include/uapi/linux/perf_event.h
>>> @@ -461,7 +461,8 @@ struct perf_event_attr {
>>> inherit_thread : 1, /* children only inherit if cloned with
>>> CLONE_THREAD */
>>> remove_on_exec : 1, /* event is removed from task on exec */
>>> sigtrap : 1, /* send synchronous SIGTRAP on event */
>>> - __reserved_1 : 26;
>>> + jitter_alternate_period : 1, /* add a limited amount of jitter on
>>> each alternate period */
>>> + __reserved_1 : 25;
>>>
>>> union {
>>> __u32 wakeup_events; /* wakeup every n events */
>>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>>> index 07f1f931e18e..079ae520e836 100644
>>> --- a/kernel/events/core.c
>>> +++ b/kernel/events/core.c
>>> @@ -15,6 +15,7 @@
>>> #include <linux/idr.h>
>>> #include <linux/file.h>
>>> #include <linux/poll.h>
>>> +#include <linux/random.h>
>>> #include <linux/slab.h>
>>> #include <linux/hash.h>
>>> #include <linux/tick.h>
>>> @@ -9546,6 +9547,8 @@ static inline bool sample_is_allowed(struct
>>> perf_event *event, struct pt_regs *r
>>> return true;
>>> }
>>>
>>> +# define MAX_ALT_SAMPLE_PERIOD_JITTER 16
>>> +
>>
>> Is 16 enough to make a difference with very large alternate periods?
>> I'm
>> wondering if it's worth making it customisable and instead of adding
>> the
>> boolean option add a 16 bit jitter field. Or the option could still
>> be a
>> boolean but the jitter value is some ratio of the alt sample period,
>> like:
>>
>> get_random_u32_below(max(16, alternative_sample_period >> 4))
>>
>
> I don't really have a strong opinion; in all my time I've never seen an
> Arm PMU produce a precise and constant period anyway, so this may be
> more useful in the case the architecture is able to support precise
> sampling. In any case it's is likely to be specific to a particular
> workload / configuration anyway.
>
> The main downside I can see for making it configurable is that the
> compiler cannot then optimise the get_random call as well as for a
> constant, which may be undesirable on this code path.
>
>
Hmmm I see, I didn't expect get_random_u32_below() to have such
different implementations depending on whether it was a constant or not.
You don't have to remove the constant though, it could be:
get_random_u32() & (jitter_power_of_2_max - 1)
If you're really worried about optimising this path, then generating the
jitter with some rotate/xor/mask operation is probably much faster. I
don't think the requirements are for it to actually be "random", but
just something to perturb it, even a badly distributed random number
would be fine.
But also yes I don't have a particularly strong opinion either. Just
that if someone does have a justification for a configurable value in
the future, depending on how that's implemented it could make the new
jitter boolean redundant which would be annoying.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC PATCH v2 3/4] tools/perf: Modify event parser to support alt-period term
2024-04-22 10:49 [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics Ben Gainey
2024-04-22 10:49 ` [RFC PATCH v2 1/4] perf: Allow periodic events to alternate between two sample periods Ben Gainey
2024-04-22 10:49 ` [RFC PATCH v2 2/4] perf: Allow adding fixed random jitter to the alternate sampling period Ben Gainey
@ 2024-04-22 10:49 ` Ben Gainey
2024-04-22 10:49 ` [RFC PATCH v2 4/4] tools/perf: Modify event parser to support alt-period-jitter term Ben Gainey
2024-04-23 15:42 ` [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics Andi Kleen
4 siblings, 0 replies; 10+ messages in thread
From: Ben Gainey @ 2024-04-22 10:49 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung
Cc: ak, will, james.clark, mark.rutland, alexander.shishkin, jolsa,
irogers, adrian.hunter, linux-perf-users, linux-kernel,
linux-arm-kernel, Ben Gainey
parse-events is modified, adding the "alt-period" term which can be used
to specify the alternative sampling period.
Signed-off-by: Ben Gainey <ben.gainey@arm.com>
---
tools/include/uapi/linux/perf_event.h | 3 +++
tools/perf/tests/attr.c | 1 +
tools/perf/tests/attr.py | 1 +
tools/perf/tests/attr/base-record | 3 ++-
tools/perf/tests/attr/base-record-spe | 1 +
tools/perf/tests/attr/base-stat | 3 ++-
tools/perf/tests/attr/system-wide-dummy | 3 ++-
tools/perf/tests/attr/test-record-alt-period-term | 11 +++++++++++
tools/perf/tests/attr/test-record-dummy-C0 | 3 ++-
tools/perf/util/parse-events.c | 15 +++++++++++++++
tools/perf/util/parse-events.h | 3 ++-
tools/perf/util/parse-events.l | 1 +
tools/perf/util/perf_event_attr_fprintf.c | 1 +
13 files changed, 44 insertions(+), 5 deletions(-)
create mode 100644 tools/perf/tests/attr/test-record-alt-period-term
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 3a64499b0f5d..5c1701d091cf 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -379,6 +379,7 @@ enum perf_event_read_format {
#define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */
#define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */
#define PERF_ATTR_SIZE_VER8 136 /* add: config3 */
+#define PERF_ATTR_SIZE_VER9 144 /* add: alternative_sample_period */
/*
* Hardware event_id to monitor via a performance monitoring event:
@@ -522,6 +523,8 @@ struct perf_event_attr {
__u64 sig_data;
__u64 config3; /* extension of config2 */
+
+ __u64 alternative_sample_period;
};
/*
diff --git a/tools/perf/tests/attr.c b/tools/perf/tests/attr.c
index 97e1bdd6ec0e..956b58c7ba8f 100644
--- a/tools/perf/tests/attr.c
+++ b/tools/perf/tests/attr.c
@@ -139,6 +139,7 @@ static int store_event(struct perf_event_attr *attr, pid_t pid, struct perf_cpu
WRITE_ASS(branch_sample_type, "llu");
WRITE_ASS(sample_regs_user, "llu");
WRITE_ASS(sample_stack_user, PRIu32);
+ WRITE_ASS(alternative_sample_period, "llu");
fclose(file);
return 0;
diff --git a/tools/perf/tests/attr.py b/tools/perf/tests/attr.py
index e890c261ad26..75c4527393f9 100644
--- a/tools/perf/tests/attr.py
+++ b/tools/perf/tests/attr.py
@@ -91,6 +91,7 @@ class Event(dict):
'branch_sample_type',
'sample_regs_user',
'sample_stack_user',
+ 'alternative_sample_period',
]
def add(self, data):
diff --git a/tools/perf/tests/attr/base-record b/tools/perf/tests/attr/base-record
index b44e4e6e4443..403de2e2c891 100644
--- a/tools/perf/tests/attr/base-record
+++ b/tools/perf/tests/attr/base-record
@@ -5,7 +5,7 @@ group_fd=-1
flags=0|8
cpu=*
type=0|1
-size=136
+size=144
config=0|1
sample_period=*
sample_type=263
@@ -39,3 +39,4 @@ config2=0
branch_sample_type=0
sample_regs_user=0
sample_stack_user=0
+alternative_sample_period=0
\ No newline at end of file
diff --git a/tools/perf/tests/attr/base-record-spe b/tools/perf/tests/attr/base-record-spe
index 08fa96b59240..db528d7d8b73 100644
--- a/tools/perf/tests/attr/base-record-spe
+++ b/tools/perf/tests/attr/base-record-spe
@@ -38,3 +38,4 @@ config2=*
branch_sample_type=*
sample_regs_user=*
sample_stack_user=*
+alternative_sample_period=0
\ No newline at end of file
diff --git a/tools/perf/tests/attr/base-stat b/tools/perf/tests/attr/base-stat
index fccd8ec4d1b0..27ef0fa1386f 100644
--- a/tools/perf/tests/attr/base-stat
+++ b/tools/perf/tests/attr/base-stat
@@ -5,7 +5,7 @@ group_fd=-1
flags=0|8
cpu=*
type=0
-size=136
+size=144
config=0
sample_period=0
sample_type=65536
@@ -39,3 +39,4 @@ config2=0
branch_sample_type=0
sample_regs_user=0
sample_stack_user=0
+alternative_sample_period=0
diff --git a/tools/perf/tests/attr/system-wide-dummy b/tools/perf/tests/attr/system-wide-dummy
index a1e1d6a263bf..5c4d2a60931d 100644
--- a/tools/perf/tests/attr/system-wide-dummy
+++ b/tools/perf/tests/attr/system-wide-dummy
@@ -7,7 +7,7 @@ cpu=*
pid=-1
flags=8
type=1
-size=136
+size=144
config=9
sample_period=1
# PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
@@ -50,3 +50,4 @@ config2=0
branch_sample_type=0
sample_regs_user=0
sample_stack_user=0
+alternative_sample_period=0
diff --git a/tools/perf/tests/attr/test-record-alt-period-term b/tools/perf/tests/attr/test-record-alt-period-term
new file mode 100644
index 000000000000..ccbebfcbd194
--- /dev/null
+++ b/tools/perf/tests/attr/test-record-alt-period-term
@@ -0,0 +1,11 @@
+[config]
+command = record
+args = --no-bpf-event -e cycles/period=3,alt-period=2/ -- kill >/dev/null 2>&1
+ret = 1
+
+[event-10:base-record]
+sample_period=3
+alternative_sample_period=2
+
+freq=0
+sample_type=7
\ No newline at end of file
diff --git a/tools/perf/tests/attr/test-record-dummy-C0 b/tools/perf/tests/attr/test-record-dummy-C0
index 576ec48b3aaf..d4f0546e02b6 100644
--- a/tools/perf/tests/attr/test-record-dummy-C0
+++ b/tools/perf/tests/attr/test-record-dummy-C0
@@ -10,7 +10,7 @@ cpu=0
pid=-1
flags=8
type=1
-size=136
+size=144
config=9
sample_period=4000
# PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
@@ -53,3 +53,4 @@ config2=0
branch_sample_type=0
sample_regs_user=0
sample_stack_user=0
+alternative_sample_period=0
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6f8b0fa17689..d516d84fa1ee 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -773,6 +773,7 @@ static const char *config_term_name(enum parse_events__term_type term_type)
[PARSE_EVENTS__TERM_TYPE_RAW] = "raw",
[PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache",
[PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware",
+ [PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD] = "alt-period",
};
if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR)
return "unknown term";
@@ -801,6 +802,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er
case PARSE_EVENTS__TERM_TYPE_NAME:
case PARSE_EVENTS__TERM_TYPE_METRIC_ID:
case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
+ case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
case PARSE_EVENTS__TERM_TYPE_PERCORE:
return true;
case PARSE_EVENTS__TERM_TYPE_USER:
@@ -945,6 +947,16 @@ do { \
return -EINVAL;
}
break;
+ case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
+ CHECK_TYPE_VAL(NUM);
+ if (term->val.num == 0) {
+ parse_events_error__handle(err, term->err_val,
+ strdup("expected a non-zero value"),
+ NULL);
+ return -EINVAL;
+ }
+ attr->alternative_sample_period = term->val.num;
+ break;
case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
case PARSE_EVENTS__TERM_TYPE_USER:
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
@@ -1071,6 +1083,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
case PARSE_EVENTS__TERM_TYPE_RAW:
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
case PARSE_EVENTS__TERM_TYPE_HARDWARE:
+ case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
default:
if (err) {
parse_events_error__handle(err, term->err_term,
@@ -1201,6 +1214,7 @@ do { \
case PARSE_EVENTS__TERM_TYPE_RAW:
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
case PARSE_EVENTS__TERM_TYPE_HARDWARE:
+ case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
default:
break;
}
@@ -1254,6 +1268,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head
case PARSE_EVENTS__TERM_TYPE_RAW:
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
case PARSE_EVENTS__TERM_TYPE_HARDWARE:
+ case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
default:
break;
}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 809359e8544e..06e7ce8a29ef 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -79,7 +79,8 @@ enum parse_events__term_type {
PARSE_EVENTS__TERM_TYPE_RAW,
PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE,
PARSE_EVENTS__TERM_TYPE_HARDWARE,
-#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_HARDWARE + 1)
+ PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD,
+#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD + 1)
};
struct parse_events_term {
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index e86c45675e1d..312a4f1837b9 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -246,6 +246,7 @@ percore { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); }
aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); }
aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); }
metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); }
+alt-period { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD); }
cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); }
stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index 8f04d3b7f3ec..61acd99e08f5 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -323,6 +323,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
PRINT_ATTRf(sample_max_stack, p_unsigned);
PRINT_ATTRf(aux_sample_size, p_unsigned);
PRINT_ATTRf(sig_data, p_unsigned);
+ PRINT_ATTRf(alternative_sample_period, p_unsigned);
return ret;
}
--
2.44.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 10+ messages in thread* [RFC PATCH v2 4/4] tools/perf: Modify event parser to support alt-period-jitter term
2024-04-22 10:49 [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics Ben Gainey
` (2 preceding siblings ...)
2024-04-22 10:49 ` [RFC PATCH v2 3/4] tools/perf: Modify event parser to support alt-period term Ben Gainey
@ 2024-04-22 10:49 ` Ben Gainey
2024-04-23 15:42 ` [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics Andi Kleen
4 siblings, 0 replies; 10+ messages in thread
From: Ben Gainey @ 2024-04-22 10:49 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung
Cc: ak, will, james.clark, mark.rutland, alexander.shishkin, jolsa,
irogers, adrian.hunter, linux-perf-users, linux-kernel,
linux-arm-kernel, Ben Gainey
parse-events is modified, adding the "alt-period-jitter" term which
can be used to enable random jitter of the alternative sample
period.
Signed-off-by: Ben Gainey <ben.gainey@arm.com>
---
tools/include/uapi/linux/perf_event.h | 3 ++-
tools/perf/tests/attr.c | 1 +
tools/perf/tests/attr.py | 1 +
tools/perf/tests/attr/base-record | 3 ++-
tools/perf/tests/attr/base-record-spe | 3 ++-
tools/perf/tests/attr/base-stat | 1 +
tools/perf/tests/attr/system-wide-dummy | 1 +
.../tests/attr/test-record-alt-period-jitter-term | 12 ++++++++++++
tools/perf/tests/attr/test-record-dummy-C0 | 1 +
tools/perf/util/parse-events.c | 15 +++++++++++++++
tools/perf/util/parse-events.h | 3 ++-
tools/perf/util/parse-events.l | 1 +
12 files changed, 41 insertions(+), 4 deletions(-)
create mode 100644 tools/perf/tests/attr/test-record-alt-period-jitter-term
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 5c1701d091cf..dd3697a4b300 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -461,7 +461,8 @@ struct perf_event_attr {
inherit_thread : 1, /* children only inherit if cloned with CLONE_THREAD */
remove_on_exec : 1, /* event is removed from task on exec */
sigtrap : 1, /* send synchronous SIGTRAP on event */
- __reserved_1 : 26;
+ jitter_alternate_period : 1, /* add a limited amount of jitter on each alternate period */
+ __reserved_1 : 25;
union {
__u32 wakeup_events; /* wakeup every n events */
diff --git a/tools/perf/tests/attr.c b/tools/perf/tests/attr.c
index 956b58c7ba8f..7fb5d1d0b0ab 100644
--- a/tools/perf/tests/attr.c
+++ b/tools/perf/tests/attr.c
@@ -140,6 +140,7 @@ static int store_event(struct perf_event_attr *attr, pid_t pid, struct perf_cpu
WRITE_ASS(sample_regs_user, "llu");
WRITE_ASS(sample_stack_user, PRIu32);
WRITE_ASS(alternative_sample_period, "llu");
+ WRITE_ASS(jitter_alternate_period, "d");
fclose(file);
return 0;
diff --git a/tools/perf/tests/attr.py b/tools/perf/tests/attr.py
index 75c4527393f9..ecab8e69418f 100644
--- a/tools/perf/tests/attr.py
+++ b/tools/perf/tests/attr.py
@@ -92,6 +92,7 @@ class Event(dict):
'sample_regs_user',
'sample_stack_user',
'alternative_sample_period',
+ 'jitter_alternate_period',
]
def add(self, data):
diff --git a/tools/perf/tests/attr/base-record b/tools/perf/tests/attr/base-record
index 403de2e2c891..39a7228612c2 100644
--- a/tools/perf/tests/attr/base-record
+++ b/tools/perf/tests/attr/base-record
@@ -39,4 +39,5 @@ config2=0
branch_sample_type=0
sample_regs_user=0
sample_stack_user=0
-alternative_sample_period=0
\ No newline at end of file
+alternative_sample_period=0
+jitter_alternate_period=0
diff --git a/tools/perf/tests/attr/base-record-spe b/tools/perf/tests/attr/base-record-spe
index db528d7d8b73..b228cd98cfa1 100644
--- a/tools/perf/tests/attr/base-record-spe
+++ b/tools/perf/tests/attr/base-record-spe
@@ -38,4 +38,5 @@ config2=*
branch_sample_type=*
sample_regs_user=*
sample_stack_user=*
-alternative_sample_period=0
\ No newline at end of file
+alternative_sample_period=0
+jitter_alternate_period=0
diff --git a/tools/perf/tests/attr/base-stat b/tools/perf/tests/attr/base-stat
index 27ef0fa1386f..d9057d780262 100644
--- a/tools/perf/tests/attr/base-stat
+++ b/tools/perf/tests/attr/base-stat
@@ -40,3 +40,4 @@ branch_sample_type=0
sample_regs_user=0
sample_stack_user=0
alternative_sample_period=0
+jitter_alternate_period=0
diff --git a/tools/perf/tests/attr/system-wide-dummy b/tools/perf/tests/attr/system-wide-dummy
index 5c4d2a60931d..4d80542c3a68 100644
--- a/tools/perf/tests/attr/system-wide-dummy
+++ b/tools/perf/tests/attr/system-wide-dummy
@@ -51,3 +51,4 @@ branch_sample_type=0
sample_regs_user=0
sample_stack_user=0
alternative_sample_period=0
+jitter_alternate_period=0
diff --git a/tools/perf/tests/attr/test-record-alt-period-jitter-term b/tools/perf/tests/attr/test-record-alt-period-jitter-term
new file mode 100644
index 000000000000..65f7c06c26e9
--- /dev/null
+++ b/tools/perf/tests/attr/test-record-alt-period-jitter-term
@@ -0,0 +1,12 @@
+[config]
+command = record
+args = --no-bpf-event -e cycles/period=3,alt-period=2,jitter_alternate_period/ -- kill >/dev/null 2>&1
+ret = 1
+
+[event-10:base-record]
+sample_period=3
+alternative_sample_period=2
+jitter_alternate_period=1
+
+freq=0
+sample_type=7
\ No newline at end of file
diff --git a/tools/perf/tests/attr/test-record-dummy-C0 b/tools/perf/tests/attr/test-record-dummy-C0
index d4f0546e02b6..0f3360c35a5e 100644
--- a/tools/perf/tests/attr/test-record-dummy-C0
+++ b/tools/perf/tests/attr/test-record-dummy-C0
@@ -54,3 +54,4 @@ branch_sample_type=0
sample_regs_user=0
sample_stack_user=0
alternative_sample_period=0
+jitter_alternate_period=0
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index d516d84fa1ee..9821e3cd26a4 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -774,6 +774,7 @@ static const char *config_term_name(enum parse_events__term_type term_type)
[PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache",
[PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware",
[PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD] = "alt-period",
+ [PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER] = "alt-period-jitter",
};
if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR)
return "unknown term";
@@ -803,6 +804,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er
case PARSE_EVENTS__TERM_TYPE_METRIC_ID:
case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
+ case PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER:
case PARSE_EVENTS__TERM_TYPE_PERCORE:
return true;
case PARSE_EVENTS__TERM_TYPE_USER:
@@ -957,6 +959,16 @@ do { \
}
attr->alternative_sample_period = term->val.num;
break;
+ case PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER:
+ CHECK_TYPE_VAL(NUM);
+ if ((unsigned int)term->val.num > 1) {
+ parse_events_error__handle(err, term->err_val,
+ strdup("expected 0 or 1"),
+ NULL);
+ return -EINVAL;
+ }
+ attr->jitter_alternate_period = (term->val.num != 0);
+ break;
case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
case PARSE_EVENTS__TERM_TYPE_USER:
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
@@ -1084,6 +1096,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
case PARSE_EVENTS__TERM_TYPE_HARDWARE:
case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
+ case PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER:
default:
if (err) {
parse_events_error__handle(err, term->err_term,
@@ -1215,6 +1228,7 @@ do { \
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
case PARSE_EVENTS__TERM_TYPE_HARDWARE:
case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
+ case PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER:
default:
break;
}
@@ -1269,6 +1283,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
case PARSE_EVENTS__TERM_TYPE_HARDWARE:
case PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD:
+ case PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER:
default:
break;
}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 06e7ce8a29ef..2b02c5a6a5f3 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -80,7 +80,8 @@ enum parse_events__term_type {
PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE,
PARSE_EVENTS__TERM_TYPE_HARDWARE,
PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD,
-#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD + 1)
+ PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER,
+#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER + 1)
};
struct parse_events_term {
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 312a4f1837b9..f5587dff596c 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -247,6 +247,7 @@ aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); }
aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); }
metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); }
alt-period { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_ALT_SAMPLE_PERIOD); }
+alt-period-jitter { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_ALT_PERIOD_JITTER); }
cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); }
stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
--
2.44.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics
2024-04-22 10:49 [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics Ben Gainey
` (3 preceding siblings ...)
2024-04-22 10:49 ` [RFC PATCH v2 4/4] tools/perf: Modify event parser to support alt-period-jitter term Ben Gainey
@ 2024-04-23 15:42 ` Andi Kleen
2024-04-26 11:11 ` Ben Gainey
4 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2024-04-23 15:42 UTC (permalink / raw)
To: Ben Gainey
Cc: peterz, mingo, acme, namhyung, will, james.clark, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter,
linux-perf-users, linux-kernel, linux-arm-kernel
> Cursory testing on a Xeon(R) W-2145 with a 300 *instruction* sample
> window (with and without the patch) suggests this approach would work
> for some counters. Calculating branch miss rates for example appears to
> be correct when used with the instruction counter as the sampling event,
> or at least this approach correctly identifies which functions in the
> test benchmark are prone to poor predictability. On the other hand the
> combination cycle and instructions counter does not appear to sample
> correctly as a pair. With something like
>
> perf record -e '{cycles/period=999700,alt-period=300/,instructions}:uS' ... benchmark
>
> I often see very large CPI, the same affect is observed without the
> patch enabled. No idea whats going on there, so any insight welcome...
My guess would be that the PMI handler cleared L1 and there are stalls
reloading the working set. You can check L1 miss events to confirm.
Unfortunately with the period change it cannot use multi-record
PEBS which would avoid the need for a PMI.
-Andi
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics
2024-04-23 15:42 ` [RFC PATCH v2 0/4] A mechanism for efficient support for per-function metrics Andi Kleen
@ 2024-04-26 11:11 ` Ben Gainey
0 siblings, 0 replies; 10+ messages in thread
From: Ben Gainey @ 2024-04-26 11:11 UTC (permalink / raw)
To: ak@linux.intel.com
Cc: alexander.shishkin@linux.intel.com,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
peterz@infradead.org, Mark Rutland, mingo@redhat.com, James Clark,
acme@kernel.org, namhyung@kernel.org, jolsa@kernel.org,
will@kernel.org, irogers@google.com, adrian.hunter@intel.com,
linux-arm-kernel@lists.infradead.org
On Tue, 2024-04-23 at 08:42 -0700, Andi Kleen wrote:
> > Cursory testing on a Xeon(R) W-2145 with a 300 *instruction* sample
> > window (with and without the patch) suggests this approach would
> > work
> > for some counters. Calculating branch miss rates for example
> > appears to
> > be correct when used with the instruction counter as the sampling
> > event,
> > or at least this approach correctly identifies which functions in
> > the
> > test benchmark are prone to poor predictability. On the other hand
> > the
> > combination cycle and instructions counter does not appear to
> > sample
> > correctly as a pair. With something like
> >
> > perf record -e '{cycles/period=999700,alt-
> > period=300/,instructions}:uS' ... benchmark
> >
> > I often see very large CPI, the same affect is observed without the
> > patch enabled. No idea whats going on there, so any insight
> > welcome...
>
> My guess would be that the PMI handler cleared L1 and there are
> stalls
> reloading the working set. You can check L1 miss events to confirm.
> Unfortunately with the period change it cannot use multi-record
> PEBS which would avoid the need for a PMI.
>
> -Andi
Hi Andi,
Spent a bit of time looking at this.
Comparing the L1 counters against the values from 'perf stat' doesn't
appear to show some obvious cause.
I think this is just a quirk specific to using the cycle counter as the
sampling event, and is not related to the alt-period, as the affect is
present even on an unpatched kernel.
There appears to be some non-linear increase in CPI (over the sample
data as a whole) for the smallest values of period, e.g. for
period=100, CPI=~450; perf stat says it should be ~2.5. Manual
inspection of the raw data with:
perf script -F event,period -i perf.data.100
Shows repeating pattern along the lines of:
cycles=450
instructions=1
...
The affect quickly decreases as the period increases, with period=750,
the CPI is <2x (vs perf stat).
When the events are swapped so that the sampling event is
`instructions` rather than `cycles`, the affect is very much
diminished/gone; at P=100 is see about 3.5x overhead (vs perf stat),
and at P=500 the overhead is about 1.5x.
When alt-period is used such that "period=$((1000000-$P)),alt-
period=$P", the affect is unchanged.
Regards
Ben
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 10+ messages in thread