* [RESEND PATCH 0/2] perf/x86: Package residency counter improvements
@ 2023-09-12 12:44 Tero Kristo
2023-09-12 12:44 ` [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU Tero Kristo
2023-09-12 12:44 ` [RESEND PATCH " Tero Kristo
0 siblings, 2 replies; 7+ messages in thread
From: Tero Kristo @ 2023-09-12 12:44 UTC (permalink / raw)
To: x86, tglx, bp, dave.hansen
Cc: irogers, mark.rutland, linux-perf-users, hpa, mingo, bpf,
linux-kernel, acme, peterz, alexander.shishkin, adrian.hunter,
namhyung, jolsa
Hello,
No changes on this, just resending as 6.6-rc1 is out, and the first post
got no feedback.
Original cover letter with more details here [1].
-Tero
[1] https://lore.kernel.org/bpf/20230823055653.2964237-1-tero.kristo@linux.intel.com/T/#mdaea2388b76abc33ebccac3d0a6ccc0ac0def20b
^ permalink raw reply [flat|nested] 7+ messages in thread
* [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU
2023-09-12 12:44 [RESEND PATCH 0/2] perf/x86: Package residency counter improvements Tero Kristo
@ 2023-09-12 12:44 ` Tero Kristo
2023-09-13 12:59 ` [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local Tero Kristo
2023-09-12 12:44 ` [RESEND PATCH " Tero Kristo
1 sibling, 1 reply; 7+ messages in thread
From: Tero Kristo @ 2023-09-12 12:44 UTC (permalink / raw)
To: x86, tglx, bp, dave.hansen
Cc: irogers, mark.rutland, linux-perf-users, hpa, mingo, bpf,
linux-kernel, acme, peterz, alexander.shishkin, adrian.hunter,
namhyung, jolsa, Kan Liang
The MSR registers for reading the package residency counters are
available on every CPU of the package. To avoid doing unnecessary SMP
calls to read the values for these from the various CPUs inside a
package, allow reading them from any CPU of the package.
Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
Suggested-by: Kan Liang <kan.liang@intel.com>
---
arch/x86/events/intel/cstate.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 96fffb2d521d..cbeb6d2bf5b4 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -336,6 +336,9 @@ static int cstate_pmu_event_init(struct perf_event *event)
cfg = array_index_nospec((unsigned long)cfg, PERF_CSTATE_PKG_EVENT_MAX);
if (!(pkg_msr_mask & (1 << cfg)))
return -EINVAL;
+
+ event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
+
event->hw.event_base = pkg_msr[cfg].msr;
cpu = cpumask_any_and(&cstate_pkg_cpu_mask,
topology_die_cpumask(event->cpu));
--
2.40.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [RESEND PATCH 2/2] perf/core: Allow reading package events from perf_event_read_local
2023-09-12 12:44 [RESEND PATCH 0/2] perf/x86: Package residency counter improvements Tero Kristo
2023-09-12 12:44 ` [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU Tero Kristo
@ 2023-09-12 12:44 ` Tero Kristo
2023-09-12 14:04 ` Peter Zijlstra
1 sibling, 1 reply; 7+ messages in thread
From: Tero Kristo @ 2023-09-12 12:44 UTC (permalink / raw)
To: x86, tglx, bp, dave.hansen
Cc: irogers, mark.rutland, linux-perf-users, hpa, mingo, bpf,
linux-kernel, acme, peterz, alexander.shishkin, adrian.hunter,
namhyung, jolsa
Per-package perf events are typically registered with a single CPU only,
however they can be read across all the CPUs within the package.
Currently perf_event_read maps the event CPU according to the topology
information to avoid an unnecessary SMP call, however
perf_event_read_local deals with hard values and rejects a read with a
failure if the CPU is not the one exactly registered. Allow similar
mapping within the perf_event_read_local if the perf event in question
can support this.
This allows users like BPF code to read the package perf events properly
across different CPUs within a package.
Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
---
kernel/events/core.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4c72a41f11af..780dde646e8a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4528,6 +4528,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
{
unsigned long flags;
int ret = 0;
+ int event_cpu;
/*
* Disabling interrupts avoids all counter scheduling (context
@@ -4551,15 +4552,18 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
goto out;
}
+ event_cpu = READ_ONCE(event->oncpu);
+ event_cpu = __perf_event_read_cpu(event, event_cpu);
+
/* If this is a per-CPU event, it must be for this CPU */
if (!(event->attach_state & PERF_ATTACH_TASK) &&
- event->cpu != smp_processor_id()) {
+ event_cpu != smp_processor_id()) {
ret = -EINVAL;
goto out;
}
/* If this is a pinned event it must be running on this CPU */
- if (event->attr.pinned && event->oncpu != smp_processor_id()) {
+ if (event->attr.pinned && event_cpu != smp_processor_id()) {
ret = -EBUSY;
goto out;
}
@@ -4569,7 +4573,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
* or local to this CPU. Furthermore it means its ACTIVE (otherwise
* oncpu == -1).
*/
- if (event->oncpu == smp_processor_id())
+ if (event_cpu == smp_processor_id())
event->pmu->read(event);
*value = local64_read(&event->count);
--
2.40.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RESEND PATCH 2/2] perf/core: Allow reading package events from perf_event_read_local
2023-09-12 12:44 ` [RESEND PATCH " Tero Kristo
@ 2023-09-12 14:04 ` Peter Zijlstra
2023-09-12 14:54 ` Tero Kristo
0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2023-09-12 14:04 UTC (permalink / raw)
To: Tero Kristo
Cc: x86, tglx, bp, dave.hansen, irogers, mark.rutland,
linux-perf-users, hpa, mingo, bpf, linux-kernel, acme,
alexander.shishkin, adrian.hunter, namhyung, jolsa
On Tue, Sep 12, 2023 at 03:44:32PM +0300, Tero Kristo wrote:
> Per-package perf events are typically registered with a single CPU only,
> however they can be read across all the CPUs within the package.
> Currently perf_event_read maps the event CPU according to the topology
> information to avoid an unnecessary SMP call, however
> perf_event_read_local deals with hard values and rejects a read with a
> failure if the CPU is not the one exactly registered. Allow similar
> mapping within the perf_event_read_local if the perf event in question
> can support this.
>
> This allows users like BPF code to read the package perf events properly
> across different CPUs within a package.
>
> Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> ---
> kernel/events/core.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4c72a41f11af..780dde646e8a 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4528,6 +4528,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
> {
> unsigned long flags;
> int ret = 0;
> + int event_cpu;
>
> /*
> * Disabling interrupts avoids all counter scheduling (context
> @@ -4551,15 +4552,18 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
> goto out;
> }
>
> + event_cpu = READ_ONCE(event->oncpu);
> + event_cpu = __perf_event_read_cpu(event, event_cpu);
What happens with __perf_event_read_cpu() when event_cpu == -1 ?
> +
> /* If this is a per-CPU event, it must be for this CPU */
> if (!(event->attach_state & PERF_ATTACH_TASK) &&
> - event->cpu != smp_processor_id()) {
> + event_cpu != smp_processor_id()) {
> ret = -EINVAL;
> goto out;
> }
>
> /* If this is a pinned event it must be running on this CPU */
> - if (event->attr.pinned && event->oncpu != smp_processor_id()) {
> + if (event->attr.pinned && event_cpu != smp_processor_id()) {
> ret = -EBUSY;
> goto out;
> }
> @@ -4569,7 +4573,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
> * or local to this CPU. Furthermore it means its ACTIVE (otherwise
> * oncpu == -1).
> */
> - if (event->oncpu == smp_processor_id())
> + if (event_cpu == smp_processor_id())
> event->pmu->read(event);
>
> *value = local64_read(&event->count);
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RESEND PATCH 2/2] perf/core: Allow reading package events from perf_event_read_local
2023-09-12 14:04 ` Peter Zijlstra
@ 2023-09-12 14:54 ` Tero Kristo
0 siblings, 0 replies; 7+ messages in thread
From: Tero Kristo @ 2023-09-12 14:54 UTC (permalink / raw)
To: Peter Zijlstra
Cc: x86, tglx, bp, dave.hansen, irogers, mark.rutland,
linux-perf-users, hpa, mingo, bpf, linux-kernel, acme,
alexander.shishkin, adrian.hunter, namhyung, jolsa
On 12/09/2023 17:04, Peter Zijlstra wrote:
> On Tue, Sep 12, 2023 at 03:44:32PM +0300, Tero Kristo wrote:
>> Per-package perf events are typically registered with a single CPU only,
>> however they can be read across all the CPUs within the package.
>> Currently perf_event_read maps the event CPU according to the topology
>> information to avoid an unnecessary SMP call, however
>> perf_event_read_local deals with hard values and rejects a read with a
>> failure if the CPU is not the one exactly registered. Allow similar
>> mapping within the perf_event_read_local if the perf event in question
>> can support this.
>>
>> This allows users like BPF code to read the package perf events properly
>> across different CPUs within a package.
>>
>> Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
>> ---
>> kernel/events/core.c | 10 +++++++---
>> 1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 4c72a41f11af..780dde646e8a 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -4528,6 +4528,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>> {
>> unsigned long flags;
>> int ret = 0;
>> + int event_cpu;
>>
>> /*
>> * Disabling interrupts avoids all counter scheduling (context
>> @@ -4551,15 +4552,18 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>> goto out;
>> }
>>
>> + event_cpu = READ_ONCE(event->oncpu);
>> + event_cpu = __perf_event_read_cpu(event, event_cpu);
> What happens with __perf_event_read_cpu() when event_cpu == -1 ?
Good question. It looks like I need to add a check against that. Will
update and send v2 out.
-Tero
>
>> +
>> /* If this is a per-CPU event, it must be for this CPU */
>> if (!(event->attach_state & PERF_ATTACH_TASK) &&
>> - event->cpu != smp_processor_id()) {
>> + event_cpu != smp_processor_id()) {
>> ret = -EINVAL;
>> goto out;
>> }
>>
>> /* If this is a pinned event it must be running on this CPU */
>> - if (event->attr.pinned && event->oncpu != smp_processor_id()) {
>> + if (event->attr.pinned && event_cpu != smp_processor_id()) {
>> ret = -EBUSY;
>> goto out;
>> }
>> @@ -4569,7 +4573,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>> * or local to this CPU. Furthermore it means its ACTIVE (otherwise
>> * oncpu == -1).
>> */
>> - if (event->oncpu == smp_processor_id())
>> + if (event_cpu == smp_processor_id())
>> event->pmu->read(event);
>>
>> *value = local64_read(&event->count);
>> --
>> 2.40.1
>>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local
2023-09-12 12:44 ` [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU Tero Kristo
@ 2023-09-13 12:59 ` Tero Kristo
2023-10-03 11:00 ` Peter Zijlstra
0 siblings, 1 reply; 7+ messages in thread
From: Tero Kristo @ 2023-09-13 12:59 UTC (permalink / raw)
To: x86, bp, dave.hansen, tglx
Cc: hpa, irogers, jolsa, namhyung, adrian.hunter, acme, mingo, bpf,
linux-kernel, alexander.shishkin, linux-perf-users, peterz,
mark.rutland
Per-package perf events are typically registered with a single CPU only,
however they can be read across all the CPUs within the package.
Currently perf_event_read maps the event CPU according to the topology
information to avoid an unnecessary SMP call, however
perf_event_read_local deals with hard values and rejects a read with a
failure if the CPU is not the one exactly registered. Allow similar
mapping within the perf_event_read_local if the perf event in question
can support this.
This allows users like BPF code to read the package perf events properly
across different CPUs within a package.
Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
---
v2:
* prevent illegal array access in case event->oncpu == -1
* split the event->cpu / event->oncpu handling to their own variables
kernel/events/core.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4c72a41f11af..6b343bac0a71 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4425,6 +4425,9 @@ static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
{
u16 local_pkg, event_pkg;
+ if (event_cpu < 0 || event_cpu >= nr_cpu_ids)
+ return event_cpu;
+
if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
int local_cpu = smp_processor_id();
@@ -4528,6 +4531,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
{
unsigned long flags;
int ret = 0;
+ int event_cpu;
+ int event_oncpu;
/*
* Disabling interrupts avoids all counter scheduling (context
@@ -4551,15 +4556,22 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
goto out;
}
+ /*
+ * Get the event CPU numbers, and adjust them to local if the event is
+ * a per-package event that can be read locally
+ */
+ event_oncpu = __perf_event_read_cpu(event, event->oncpu);
+ event_cpu = __perf_event_read_cpu(event, event->cpu);
+
/* If this is a per-CPU event, it must be for this CPU */
if (!(event->attach_state & PERF_ATTACH_TASK) &&
- event->cpu != smp_processor_id()) {
+ event_cpu != smp_processor_id()) {
ret = -EINVAL;
goto out;
}
/* If this is a pinned event it must be running on this CPU */
- if (event->attr.pinned && event->oncpu != smp_processor_id()) {
+ if (event->attr.pinned && event_oncpu != smp_processor_id()) {
ret = -EBUSY;
goto out;
}
@@ -4569,7 +4581,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
* or local to this CPU. Furthermore it means its ACTIVE (otherwise
* oncpu == -1).
*/
- if (event->oncpu == smp_processor_id())
+ if (event_oncpu == smp_processor_id())
event->pmu->read(event);
*value = local64_read(&event->count);
--
2.40.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local
2023-09-13 12:59 ` [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local Tero Kristo
@ 2023-10-03 11:00 ` Peter Zijlstra
0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2023-10-03 11:00 UTC (permalink / raw)
To: Tero Kristo
Cc: x86, bp, dave.hansen, tglx, hpa, irogers, jolsa, namhyung,
adrian.hunter, acme, mingo, bpf, linux-kernel, alexander.shishkin,
linux-perf-users, mark.rutland
On Wed, Sep 13, 2023 at 03:59:56PM +0300, Tero Kristo wrote:
> Per-package perf events are typically registered with a single CPU only,
> however they can be read across all the CPUs within the package.
> Currently perf_event_read maps the event CPU according to the topology
> information to avoid an unnecessary SMP call, however
> perf_event_read_local deals with hard values and rejects a read with a
> failure if the CPU is not the one exactly registered. Allow similar
> mapping within the perf_event_read_local if the perf event in question
> can support this.
>
> This allows users like BPF code to read the package perf events properly
> across different CPUs within a package.
>
> Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> ---
> v2:
> * prevent illegal array access in case event->oncpu == -1
> * split the event->cpu / event->oncpu handling to their own variables
>
> kernel/events/core.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4c72a41f11af..6b343bac0a71 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4425,6 +4425,9 @@ static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
> {
> u16 local_pkg, event_pkg;
>
> + if (event_cpu < 0 || event_cpu >= nr_cpu_ids)
> + return event_cpu;
if ((unsigned)event_cpu >= nr_cpu_ids)
return event_cpu;
As you could also find at the current __perf_event_read_cpu() callsite.
> +
> if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
> int local_cpu = smp_processor_id();
>
> @@ -4528,6 +4531,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
> {
> unsigned long flags;
> int ret = 0;
> + int event_cpu;
> + int event_oncpu;
You wrecked the x-mas tree :-)
I'll fix both up.
Thanks!
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-10-03 11:00 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-12 12:44 [RESEND PATCH 0/2] perf/x86: Package residency counter improvements Tero Kristo
2023-09-12 12:44 ` [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU Tero Kristo
2023-09-13 12:59 ` [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local Tero Kristo
2023-10-03 11:00 ` Peter Zijlstra
2023-09-12 12:44 ` [RESEND PATCH " Tero Kristo
2023-09-12 14:04 ` Peter Zijlstra
2023-09-12 14:54 ` Tero Kristo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).