* [RESEND PATCH 0/2] perf/x86: Package residency counter improvements @ 2023-09-12 12:44 Tero Kristo 2023-09-12 12:44 ` [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU Tero Kristo 2023-09-12 12:44 ` [RESEND PATCH " Tero Kristo 0 siblings, 2 replies; 7+ messages in thread From: Tero Kristo @ 2023-09-12 12:44 UTC (permalink / raw) To: x86, tglx, bp, dave.hansen Cc: irogers, mark.rutland, linux-perf-users, hpa, mingo, bpf, linux-kernel, acme, peterz, alexander.shishkin, adrian.hunter, namhyung, jolsa Hello, No changes on this, just resending as 6.6-rc1 is out, and the first post got no feedback. Original cover letter with more details here [1]. -Tero [1] https://lore.kernel.org/bpf/20230823055653.2964237-1-tero.kristo@linux.intel.com/T/#mdaea2388b76abc33ebccac3d0a6ccc0ac0def20b ^ permalink raw reply [flat|nested] 7+ messages in thread
* [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU 2023-09-12 12:44 [RESEND PATCH 0/2] perf/x86: Package residency counter improvements Tero Kristo @ 2023-09-12 12:44 ` Tero Kristo 2023-09-13 12:59 ` [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local Tero Kristo 2023-09-12 12:44 ` [RESEND PATCH " Tero Kristo 1 sibling, 1 reply; 7+ messages in thread From: Tero Kristo @ 2023-09-12 12:44 UTC (permalink / raw) To: x86, tglx, bp, dave.hansen Cc: irogers, mark.rutland, linux-perf-users, hpa, mingo, bpf, linux-kernel, acme, peterz, alexander.shishkin, adrian.hunter, namhyung, jolsa, Kan Liang The MSR registers for reading the package residency counters are available on every CPU of the package. To avoid doing unnecessary SMP calls to read the values for these from the various CPUs inside a package, allow reading them from any CPU of the package. Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> Suggested-by: Kan Liang <kan.liang@intel.com> --- arch/x86/events/intel/cstate.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c index 96fffb2d521d..cbeb6d2bf5b4 100644 --- a/arch/x86/events/intel/cstate.c +++ b/arch/x86/events/intel/cstate.c @@ -336,6 +336,9 @@ static int cstate_pmu_event_init(struct perf_event *event) cfg = array_index_nospec((unsigned long)cfg, PERF_CSTATE_PKG_EVENT_MAX); if (!(pkg_msr_mask & (1 << cfg))) return -EINVAL; + + event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG; + event->hw.event_base = pkg_msr[cfg].msr; cpu = cpumask_any_and(&cstate_pkg_cpu_mask, topology_die_cpumask(event->cpu)); -- 2.40.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local 2023-09-12 12:44 ` [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU Tero Kristo @ 2023-09-13 12:59 ` Tero Kristo 2023-10-03 11:00 ` Peter Zijlstra 0 siblings, 1 reply; 7+ messages in thread From: Tero Kristo @ 2023-09-13 12:59 UTC (permalink / raw) To: x86, bp, dave.hansen, tglx Cc: hpa, irogers, jolsa, namhyung, adrian.hunter, acme, mingo, bpf, linux-kernel, alexander.shishkin, linux-perf-users, peterz, mark.rutland Per-package perf events are typically registered with a single CPU only, however they can be read across all the CPUs within the package. Currently perf_event_read maps the event CPU according to the topology information to avoid an unnecessary SMP call, however perf_event_read_local deals with hard values and rejects a read with a failure if the CPU is not the one exactly registered. Allow similar mapping within the perf_event_read_local if the perf event in question can support this. This allows users like BPF code to read the package perf events properly across different CPUs within a package. Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> --- v2: * prevent illegal array access in case event->oncpu == -1 * split the event->cpu / event->oncpu handling to their own variables kernel/events/core.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 4c72a41f11af..6b343bac0a71 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4425,6 +4425,9 @@ static int __perf_event_read_cpu(struct perf_event *event, int event_cpu) { u16 local_pkg, event_pkg; + if (event_cpu < 0 || event_cpu >= nr_cpu_ids) + return event_cpu; + if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) { int local_cpu = smp_processor_id(); @@ -4528,6 +4531,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value, { unsigned long flags; int ret = 0; + int event_cpu; + int event_oncpu; /* * Disabling interrupts avoids all counter scheduling (context @@ -4551,15 +4556,22 @@ int perf_event_read_local(struct perf_event *event, u64 *value, goto out; } + /* + * Get the event CPU numbers, and adjust them to local if the event is + * a per-package event that can be read locally + */ + event_oncpu = __perf_event_read_cpu(event, event->oncpu); + event_cpu = __perf_event_read_cpu(event, event->cpu); + /* If this is a per-CPU event, it must be for this CPU */ if (!(event->attach_state & PERF_ATTACH_TASK) && - event->cpu != smp_processor_id()) { + event_cpu != smp_processor_id()) { ret = -EINVAL; goto out; } /* If this is a pinned event it must be running on this CPU */ - if (event->attr.pinned && event->oncpu != smp_processor_id()) { + if (event->attr.pinned && event_oncpu != smp_processor_id()) { ret = -EBUSY; goto out; } @@ -4569,7 +4581,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value, * or local to this CPU. Furthermore it means its ACTIVE (otherwise * oncpu == -1). */ - if (event->oncpu == smp_processor_id()) + if (event_oncpu == smp_processor_id()) event->pmu->read(event); *value = local64_read(&event->count); -- 2.40.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local 2023-09-13 12:59 ` [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local Tero Kristo @ 2023-10-03 11:00 ` Peter Zijlstra 0 siblings, 0 replies; 7+ messages in thread From: Peter Zijlstra @ 2023-10-03 11:00 UTC (permalink / raw) To: Tero Kristo Cc: x86, bp, dave.hansen, tglx, hpa, irogers, jolsa, namhyung, adrian.hunter, acme, mingo, bpf, linux-kernel, alexander.shishkin, linux-perf-users, mark.rutland On Wed, Sep 13, 2023 at 03:59:56PM +0300, Tero Kristo wrote: > Per-package perf events are typically registered with a single CPU only, > however they can be read across all the CPUs within the package. > Currently perf_event_read maps the event CPU according to the topology > information to avoid an unnecessary SMP call, however > perf_event_read_local deals with hard values and rejects a read with a > failure if the CPU is not the one exactly registered. Allow similar > mapping within the perf_event_read_local if the perf event in question > can support this. > > This allows users like BPF code to read the package perf events properly > across different CPUs within a package. > > Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> > --- > v2: > * prevent illegal array access in case event->oncpu == -1 > * split the event->cpu / event->oncpu handling to their own variables > > kernel/events/core.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 4c72a41f11af..6b343bac0a71 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -4425,6 +4425,9 @@ static int __perf_event_read_cpu(struct perf_event *event, int event_cpu) > { > u16 local_pkg, event_pkg; > > + if (event_cpu < 0 || event_cpu >= nr_cpu_ids) > + return event_cpu; if ((unsigned)event_cpu >= nr_cpu_ids) return event_cpu; As you could also find at the current __perf_event_read_cpu() callsite. > + > if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) { > int local_cpu = smp_processor_id(); > > @@ -4528,6 +4531,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value, > { > unsigned long flags; > int ret = 0; > + int event_cpu; > + int event_oncpu; You wrecked the x-mas tree :-) I'll fix both up. Thanks! ^ permalink raw reply [flat|nested] 7+ messages in thread
* [RESEND PATCH 2/2] perf/core: Allow reading package events from perf_event_read_local 2023-09-12 12:44 [RESEND PATCH 0/2] perf/x86: Package residency counter improvements Tero Kristo 2023-09-12 12:44 ` [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU Tero Kristo @ 2023-09-12 12:44 ` Tero Kristo 2023-09-12 14:04 ` Peter Zijlstra 1 sibling, 1 reply; 7+ messages in thread From: Tero Kristo @ 2023-09-12 12:44 UTC (permalink / raw) To: x86, tglx, bp, dave.hansen Cc: irogers, mark.rutland, linux-perf-users, hpa, mingo, bpf, linux-kernel, acme, peterz, alexander.shishkin, adrian.hunter, namhyung, jolsa Per-package perf events are typically registered with a single CPU only, however they can be read across all the CPUs within the package. Currently perf_event_read maps the event CPU according to the topology information to avoid an unnecessary SMP call, however perf_event_read_local deals with hard values and rejects a read with a failure if the CPU is not the one exactly registered. Allow similar mapping within the perf_event_read_local if the perf event in question can support this. This allows users like BPF code to read the package perf events properly across different CPUs within a package. Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> --- kernel/events/core.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 4c72a41f11af..780dde646e8a 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4528,6 +4528,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value, { unsigned long flags; int ret = 0; + int event_cpu; /* * Disabling interrupts avoids all counter scheduling (context @@ -4551,15 +4552,18 @@ int perf_event_read_local(struct perf_event *event, u64 *value, goto out; } + event_cpu = READ_ONCE(event->oncpu); + event_cpu = __perf_event_read_cpu(event, event_cpu); + /* If this is a per-CPU event, it must be for this CPU */ if (!(event->attach_state & PERF_ATTACH_TASK) && - event->cpu != smp_processor_id()) { + event_cpu != smp_processor_id()) { ret = -EINVAL; goto out; } /* If this is a pinned event it must be running on this CPU */ - if (event->attr.pinned && event->oncpu != smp_processor_id()) { + if (event->attr.pinned && event_cpu != smp_processor_id()) { ret = -EBUSY; goto out; } @@ -4569,7 +4573,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value, * or local to this CPU. Furthermore it means its ACTIVE (otherwise * oncpu == -1). */ - if (event->oncpu == smp_processor_id()) + if (event_cpu == smp_processor_id()) event->pmu->read(event); *value = local64_read(&event->count); -- 2.40.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RESEND PATCH 2/2] perf/core: Allow reading package events from perf_event_read_local 2023-09-12 12:44 ` [RESEND PATCH " Tero Kristo @ 2023-09-12 14:04 ` Peter Zijlstra 2023-09-12 14:54 ` Tero Kristo 0 siblings, 1 reply; 7+ messages in thread From: Peter Zijlstra @ 2023-09-12 14:04 UTC (permalink / raw) To: Tero Kristo Cc: x86, tglx, bp, dave.hansen, irogers, mark.rutland, linux-perf-users, hpa, mingo, bpf, linux-kernel, acme, alexander.shishkin, adrian.hunter, namhyung, jolsa On Tue, Sep 12, 2023 at 03:44:32PM +0300, Tero Kristo wrote: > Per-package perf events are typically registered with a single CPU only, > however they can be read across all the CPUs within the package. > Currently perf_event_read maps the event CPU according to the topology > information to avoid an unnecessary SMP call, however > perf_event_read_local deals with hard values and rejects a read with a > failure if the CPU is not the one exactly registered. Allow similar > mapping within the perf_event_read_local if the perf event in question > can support this. > > This allows users like BPF code to read the package perf events properly > across different CPUs within a package. > > Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> > --- > kernel/events/core.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 4c72a41f11af..780dde646e8a 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -4528,6 +4528,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value, > { > unsigned long flags; > int ret = 0; > + int event_cpu; > > /* > * Disabling interrupts avoids all counter scheduling (context > @@ -4551,15 +4552,18 @@ int perf_event_read_local(struct perf_event *event, u64 *value, > goto out; > } > > + event_cpu = READ_ONCE(event->oncpu); > + event_cpu = __perf_event_read_cpu(event, event_cpu); What happens with __perf_event_read_cpu() when event_cpu == -1 ? > + > /* If this is a per-CPU event, it must be for this CPU */ > if (!(event->attach_state & PERF_ATTACH_TASK) && > - event->cpu != smp_processor_id()) { > + event_cpu != smp_processor_id()) { > ret = -EINVAL; > goto out; > } > > /* If this is a pinned event it must be running on this CPU */ > - if (event->attr.pinned && event->oncpu != smp_processor_id()) { > + if (event->attr.pinned && event_cpu != smp_processor_id()) { > ret = -EBUSY; > goto out; > } > @@ -4569,7 +4573,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value, > * or local to this CPU. Furthermore it means its ACTIVE (otherwise > * oncpu == -1). > */ > - if (event->oncpu == smp_processor_id()) > + if (event_cpu == smp_processor_id()) > event->pmu->read(event); > > *value = local64_read(&event->count); > -- > 2.40.1 > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RESEND PATCH 2/2] perf/core: Allow reading package events from perf_event_read_local 2023-09-12 14:04 ` Peter Zijlstra @ 2023-09-12 14:54 ` Tero Kristo 0 siblings, 0 replies; 7+ messages in thread From: Tero Kristo @ 2023-09-12 14:54 UTC (permalink / raw) To: Peter Zijlstra Cc: x86, tglx, bp, dave.hansen, irogers, mark.rutland, linux-perf-users, hpa, mingo, bpf, linux-kernel, acme, alexander.shishkin, adrian.hunter, namhyung, jolsa On 12/09/2023 17:04, Peter Zijlstra wrote: > On Tue, Sep 12, 2023 at 03:44:32PM +0300, Tero Kristo wrote: >> Per-package perf events are typically registered with a single CPU only, >> however they can be read across all the CPUs within the package. >> Currently perf_event_read maps the event CPU according to the topology >> information to avoid an unnecessary SMP call, however >> perf_event_read_local deals with hard values and rejects a read with a >> failure if the CPU is not the one exactly registered. Allow similar >> mapping within the perf_event_read_local if the perf event in question >> can support this. >> >> This allows users like BPF code to read the package perf events properly >> across different CPUs within a package. >> >> Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> >> --- >> kernel/events/core.c | 10 +++++++--- >> 1 file changed, 7 insertions(+), 3 deletions(-) >> >> diff --git a/kernel/events/core.c b/kernel/events/core.c >> index 4c72a41f11af..780dde646e8a 100644 >> --- a/kernel/events/core.c >> +++ b/kernel/events/core.c >> @@ -4528,6 +4528,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value, >> { >> unsigned long flags; >> int ret = 0; >> + int event_cpu; >> >> /* >> * Disabling interrupts avoids all counter scheduling (context >> @@ -4551,15 +4552,18 @@ int perf_event_read_local(struct perf_event *event, u64 *value, >> goto out; >> } >> >> + event_cpu = READ_ONCE(event->oncpu); >> + event_cpu = __perf_event_read_cpu(event, event_cpu); > What happens with __perf_event_read_cpu() when event_cpu == -1 ? Good question. It looks like I need to add a check against that. Will update and send v2 out. -Tero > >> + >> /* If this is a per-CPU event, it must be for this CPU */ >> if (!(event->attach_state & PERF_ATTACH_TASK) && >> - event->cpu != smp_processor_id()) { >> + event_cpu != smp_processor_id()) { >> ret = -EINVAL; >> goto out; >> } >> >> /* If this is a pinned event it must be running on this CPU */ >> - if (event->attr.pinned && event->oncpu != smp_processor_id()) { >> + if (event->attr.pinned && event_cpu != smp_processor_id()) { >> ret = -EBUSY; >> goto out; >> } >> @@ -4569,7 +4573,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value, >> * or local to this CPU. Furthermore it means its ACTIVE (otherwise >> * oncpu == -1). >> */ >> - if (event->oncpu == smp_processor_id()) >> + if (event_cpu == smp_processor_id()) >> event->pmu->read(event); >> >> *value = local64_read(&event->count); >> -- >> 2.40.1 >> ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-10-03 11:00 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-09-12 12:44 [RESEND PATCH 0/2] perf/x86: Package residency counter improvements Tero Kristo 2023-09-12 12:44 ` [RESEND PATCH 1/2] perf/x86/cstate: Allow reading the package statistics from local CPU Tero Kristo 2023-09-13 12:59 ` [PATCHv2 2/2] perf/core: Allow reading package events from perf_event_read_local Tero Kristo 2023-10-03 11:00 ` Peter Zijlstra 2023-09-12 12:44 ` [RESEND PATCH " Tero Kristo 2023-09-12 14:04 ` Peter Zijlstra 2023-09-12 14:54 ` Tero Kristo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).