* [PATCH v2 0/5] perf/x86/uncore: Overflow handling enhancements
From: Sandipan Das @ 2025-04-18  3:42 UTC
  To: linux-perf-users, linux-kernel
  Cc: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
	x86, hpa, eranian, songliubraving, ravi.bangoria, ananth.narayan,
	sandipan.das

Uncore counters on AMD processors either roll over or saturate on
overflow, and the amd-uncore driver has no way of knowing whether
multiple overflows have occurred between two successive pmu->read()
requests for an event. This makes the user-visible counter values
inaccurate. Solve this by periodically initiating pmu->read() to keep
prev_count up-to-date, following the precedent set by the Intel uncore
driver.

Previous versions can be found at:
v1: https://lore.kernel.org/all/cover.1744184837.git.sandipan.das@amd.com/

Changes in v2:
 - Add a Fixes tag for the first patch.
 - Change the hrtimer mode for both the Intel and the AMD uncore drivers
   based on Peter's suggestion.
 - Fix an issue in the UMC workaround where prev_count is not zeroed out
   after a counter is reset.

Sandipan Das (5):
  perf/x86/amd/uncore: Remove unused member from amd_uncore_ctx
  perf/x86/intel/uncore: Use HRTIMER_MODE_HARD for detecting overflows
  perf/x86/amd/uncore: Use hrtimer for handling overflows
  perf/x86/amd/uncore: Add parameter to configure hrtimer
  perf/x86/amd/uncore: Prevent UMC counters from saturating

 arch/x86/events/amd/uncore.c   | 103 ++++++++++++++++++++++++++++++++-
 arch/x86/events/intel/uncore.c |  12 +---
 2 files changed, 103 insertions(+), 12 deletions(-)

-- 
2.43.0



* [PATCH v2 1/5] perf/x86/amd/uncore: Remove unused member from amd_uncore_ctx
From: Sandipan Das @ 2025-04-18  3:42 UTC
  To: linux-perf-users, linux-kernel
  Cc: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
	x86, hpa, eranian, songliubraving, ravi.bangoria, ananth.narayan,
	sandipan.das

Remove unused member "node" from struct amd_uncore_ctx.

Fixes: d6389d3ccc13 ("perf/x86/amd/uncore: Refactor uncore management")
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
 arch/x86/events/amd/uncore.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 49c26ce2b115..010024f09f2c 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -38,7 +38,6 @@ struct amd_uncore_ctx {
 	int refcnt;
 	int cpu;
 	struct perf_event **events;
-	struct hlist_node node;
 };
 
 struct amd_uncore_pmu {
-- 
2.43.0



* [PATCH v2 2/5] perf/x86/intel/uncore: Use HRTIMER_MODE_HARD for detecting overflows
From: Sandipan Das @ 2025-04-18  3:43 UTC
  To: linux-perf-users, linux-kernel
  Cc: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
	x86, hpa, eranian, songliubraving, ravi.bangoria, ananth.narayan,
	sandipan.das

Hrtimer handlers can be deferred to softirq context, which affects the
timely detection of counter overflows. Hence switch to
HRTIMER_MODE_HARD.

Disabling and re-enabling IRQs in the hrtimer handler is no longer
required since the handler now runs in hard interrupt context, where
pmu->start() and pmu->stop() cannot intervene while
event->hw.prev_count is being updated.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
 arch/x86/events/intel/uncore.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index a34e50fc4a8f..5811e172f721 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -305,17 +305,11 @@ static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
 {
 	struct intel_uncore_box *box;
 	struct perf_event *event;
-	unsigned long flags;
 	int bit;
 
 	box = container_of(hrtimer, struct intel_uncore_box, hrtimer);
 	if (!box->n_active || box->cpu != smp_processor_id())
 		return HRTIMER_NORESTART;
-	/*
-	 * disable local interrupt to prevent uncore_pmu_event_start/stop
-	 * to interrupt the update process
-	 */
-	local_irq_save(flags);
 
 	/*
 	 * handle boxes with an active event list as opposed to active
@@ -328,8 +322,6 @@ static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
 	for_each_set_bit(bit, box->active_mask, UNCORE_PMC_IDX_MAX)
 		uncore_perf_event_update(box, box->events[bit]);
 
-	local_irq_restore(flags);
-
 	hrtimer_forward_now(hrtimer, ns_to_ktime(box->hrtimer_duration));
 	return HRTIMER_RESTART;
 }
@@ -337,7 +329,7 @@ static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
 void uncore_pmu_start_hrtimer(struct intel_uncore_box *box)
 {
 	hrtimer_start(&box->hrtimer, ns_to_ktime(box->hrtimer_duration),
-		      HRTIMER_MODE_REL_PINNED);
+		      HRTIMER_MODE_REL_PINNED_HARD);
 }
 
 void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box)
@@ -347,7 +339,7 @@ void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box)
 
 static void uncore_pmu_init_hrtimer(struct intel_uncore_box *box)
 {
-	hrtimer_setup(&box->hrtimer, uncore_pmu_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer_setup(&box->hrtimer, uncore_pmu_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
 }
 
 static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
-- 
2.43.0



* [PATCH v2 3/5] perf/x86/amd/uncore: Use hrtimer for handling overflows
From: Sandipan Das @ 2025-04-18  3:43 UTC
  To: linux-perf-users, linux-kernel
  Cc: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
	x86, hpa, eranian, songliubraving, ravi.bangoria, ananth.narayan,
	sandipan.das

Uncore counters do not provide mechanisms such as interrupts to report
overflows, so if a counter overflows more than once between two
successive read requests for the same event, prev_count goes stale and
the accumulated user-visible count is incorrect.

To avoid this, start an hrtimer to periodically initiate a pmu->read()
of the active counters and keep prev_count up-to-date. Note that for
this approach to be effective, the hrtimer duration must be shorter
than the shortest time it takes for a counter to overflow.
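For a sense of scale, with figures assumed purely for illustration: a
48-bit counter incrementing at 4 GHz wraps after 2^48 / 4e9 ~= 70000
seconds, i.e. roughly 19.5 hours, so the 60-second default set by this
patch stays well below the shortest overflow time even for events that
count considerably faster.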

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
 arch/x86/events/amd/uncore.c | 63 ++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 010024f09f2c..e09bfbb4a4cd 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -21,6 +21,7 @@
 #define NUM_COUNTERS_NB		4
 #define NUM_COUNTERS_L2		4
 #define NUM_COUNTERS_L3		6
+#define NUM_COUNTERS_MAX	64
 
 #define RDPMC_BASE_NB		6
 #define RDPMC_BASE_LLC		10
@@ -38,6 +39,10 @@ struct amd_uncore_ctx {
 	int refcnt;
 	int cpu;
 	struct perf_event **events;
+	unsigned long active_mask[BITS_TO_LONGS(NUM_COUNTERS_MAX)];
+	int nr_active;
+	struct hrtimer hrtimer;
+	u64 hrtimer_duration;
 };
 
 struct amd_uncore_pmu {
@@ -87,6 +92,42 @@ static struct amd_uncore_pmu *event_to_amd_uncore_pmu(struct perf_event *event)
 	return container_of(event->pmu, struct amd_uncore_pmu, pmu);
 }
 
+static enum hrtimer_restart amd_uncore_hrtimer(struct hrtimer *hrtimer)
+{
+	struct amd_uncore_ctx *ctx;
+	struct perf_event *event;
+	int bit;
+
+	ctx = container_of(hrtimer, struct amd_uncore_ctx, hrtimer);
+
+	if (!ctx->nr_active || ctx->cpu != smp_processor_id())
+		return HRTIMER_NORESTART;
+
+	for_each_set_bit(bit, ctx->active_mask, NUM_COUNTERS_MAX) {
+		event = ctx->events[bit];
+		event->pmu->read(event);
+	}
+
+	hrtimer_forward_now(hrtimer, ns_to_ktime(ctx->hrtimer_duration));
+	return HRTIMER_RESTART;
+}
+
+static void amd_uncore_start_hrtimer(struct amd_uncore_ctx *ctx)
+{
+	hrtimer_start(&ctx->hrtimer, ns_to_ktime(ctx->hrtimer_duration),
+		      HRTIMER_MODE_REL_PINNED_HARD);
+}
+
+static void amd_uncore_cancel_hrtimer(struct amd_uncore_ctx *ctx)
+{
+	hrtimer_cancel(&ctx->hrtimer);
+}
+
+static void amd_uncore_init_hrtimer(struct amd_uncore_ctx *ctx)
+{
+	hrtimer_setup(&ctx->hrtimer, amd_uncore_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
+}
+
 static void amd_uncore_read(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
@@ -117,18 +158,26 @@ static void amd_uncore_read(struct perf_event *event)
 
 static void amd_uncore_start(struct perf_event *event, int flags)
 {
+	struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
+	struct amd_uncore_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
 	struct hw_perf_event *hwc = &event->hw;
 
+	if (!ctx->nr_active++)
+		amd_uncore_start_hrtimer(ctx);
+
 	if (flags & PERF_EF_RELOAD)
 		wrmsrl(hwc->event_base, (u64)local64_read(&hwc->prev_count));
 
 	hwc->state = 0;
+	__set_bit(hwc->idx, ctx->active_mask);
 	wrmsrl(hwc->config_base, (hwc->config | ARCH_PERFMON_EVENTSEL_ENABLE));
 	perf_event_update_userpage(event);
 }
 
 static void amd_uncore_stop(struct perf_event *event, int flags)
 {
+	struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
+	struct amd_uncore_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
 	struct hw_perf_event *hwc = &event->hw;
 
 	wrmsrl(hwc->config_base, hwc->config);
@@ -138,6 +187,11 @@ static void amd_uncore_stop(struct perf_event *event, int flags)
 		event->pmu->read(event);
 		hwc->state |= PERF_HES_UPTODATE;
 	}
+
+	if (!--ctx->nr_active)
+		amd_uncore_cancel_hrtimer(ctx);
+
+	__clear_bit(hwc->idx, ctx->active_mask);
 }
 
 static int amd_uncore_add(struct perf_event *event, int flags)
@@ -490,6 +544,9 @@ static int amd_uncore_ctx_init(struct amd_uncore *uncore, unsigned int cpu)
 				goto fail;
 			}
 
+			amd_uncore_init_hrtimer(curr);
+			curr->hrtimer_duration = 60LL * NSEC_PER_SEC;
+
 			cpumask_set_cpu(cpu, &pmu->active_mask);
 		}
 
@@ -879,12 +936,18 @@ static int amd_uncore_umc_event_init(struct perf_event *event)
 
 static void amd_uncore_umc_start(struct perf_event *event, int flags)
 {
+	struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
+	struct amd_uncore_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
 	struct hw_perf_event *hwc = &event->hw;
 
+	if (!ctx->nr_active++)
+		amd_uncore_start_hrtimer(ctx);
+
 	if (flags & PERF_EF_RELOAD)
 		wrmsrl(hwc->event_base, (u64)local64_read(&hwc->prev_count));
 
 	hwc->state = 0;
+	__set_bit(hwc->idx, ctx->active_mask);
 	wrmsrl(hwc->config_base, (hwc->config | AMD64_PERFMON_V2_ENABLE_UMC));
 	perf_event_update_userpage(event);
 }
-- 
2.43.0



* [PATCH v2 4/5] perf/x86/amd/uncore: Add parameter to configure hrtimer
From: Sandipan Das @ 2025-04-18  3:43 UTC
  To: linux-perf-users, linux-kernel
  Cc: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
	x86, hpa, eranian, songliubraving, ravi.bangoria, ananth.narayan,
	sandipan.das

Introduce a module parameter for configuring the hrtimer duration in
milliseconds. The default duration is 60000 milliseconds (60 seconds)
and the intent is to let users tune it to their jitter tolerance. Note
that a longer duration reduces the interruptions, and hence the jitter,
caused by the timer, but hurts accuracy if the programmed events cause
the counters to overflow more than once within a single interval.
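For example, assuming the usual amd_uncore module naming, a one-second
interval could be requested with
"modprobe amd_uncore update_interval=1000", or with
"amd_uncore.update_interval=1000" on the kernel command line when the
driver is built in. Since the parameter is 0444, it cannot be changed
at runtime and must be set at load or boot time.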

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
 arch/x86/events/amd/uncore.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index e09bfbb4a4cd..70e0af36c378 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -87,6 +87,10 @@ struct amd_uncore {
 
 static struct amd_uncore uncores[UNCORE_TYPE_MAX];
 
+/* Interval for hrtimer, defaults to 60000 milliseconds */
+static unsigned int update_interval = 60 * MSEC_PER_SEC;
+module_param(update_interval, uint, 0444);
+
 static struct amd_uncore_pmu *event_to_amd_uncore_pmu(struct perf_event *event)
 {
 	return container_of(event->pmu, struct amd_uncore_pmu, pmu);
@@ -545,7 +549,7 @@ static int amd_uncore_ctx_init(struct amd_uncore *uncore, unsigned int cpu)
 			}
 
 			amd_uncore_init_hrtimer(curr);
-			curr->hrtimer_duration = 60LL * NSEC_PER_SEC;
+			curr->hrtimer_duration = (u64)update_interval * NSEC_PER_MSEC;
 
 			cpumask_set_cpu(cpu, &pmu->active_mask);
 		}
-- 
2.43.0



* [PATCH v2 5/5] perf/x86/amd/uncore: Prevent UMC counters from saturating
From: Sandipan Das @ 2025-04-18  3:43 UTC
  To: linux-perf-users, linux-kernel
  Cc: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
	x86, hpa, eranian, songliubraving, ravi.bangoria, ananth.narayan,
	sandipan.das

Unlike L3 and DF counters, UMC counters (PERF_CTRs) set the Overflow
bit (bit 48) and saturate on overflow. A subsequent pmu->read() of the
event then reports an incorrect accumulated count because there is no
difference between the previous and current values of the counter.

To avoid this, inspect the current counter value and proactively reset
the corresponding PERF_CTR register on every pmu->read(). Combined with
the periodic reads initiated by the hrtimer, the counters never get a
chance to saturate, although the effective resolution is reduced to 47
bits.
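As a standalone illustration of the delta arithmetic used below
(userspace types, with values invented for this sketch): shifting both
values left by COUNTER_SHIFT + 1 bits drops bit 47 and above, including
the Overflow bit, and the arithmetic right shift sign-extends the
remaining 47-bit difference:

	#include <stdint.h>
	#include <stdio.h>

	/* Same trick as amd_uncore_umc_read(): keep the low 47 bits. */
	static int64_t umc_delta(uint64_t prev, uint64_t cur)
	{
		const int shift = 17;	/* COUNTER_SHIFT (16) + 1 */

		return ((int64_t)(cur << shift) -
			(int64_t)(prev << shift)) >> shift;
	}

	int main(void)
	{
		uint64_t prev = (1ULL << 47) - 100;	/* just below the reset point */
		uint64_t cur  = (1ULL << 47) + 50;	/* bit 47 now set */

		/* Prints 150: the count survives crossing bit 47. */
		printf("%lld\n", (long long)umc_delta(prev, cur));
		return 0;
	}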

Fixes: 25e56847821f ("perf/x86/amd/uncore: Add memory controller support")
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Reviewed-by: Song Liu <song@kernel.org>
---
 arch/x86/events/amd/uncore.c | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 70e0af36c378..d328de166481 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -956,6 +956,39 @@ static void amd_uncore_umc_start(struct perf_event *event, int flags)
 	perf_event_update_userpage(event);
 }
 
+static void amd_uncore_umc_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new, shift;
+	s64 delta;
+
+	shift = COUNTER_SHIFT + 1;
+	prev = local64_read(&hwc->prev_count);
+
+	/*
+	 * UMC counters do not have RDPMC assignments. Read counts directly
+	 * from the corresponding PERF_CTR.
+	 */
+	rdmsrl(hwc->event_base, new);
+
+	/*
+	 * Unlike the other uncore counters, UMC counters saturate and set the
+	 * Overflow bit (bit 48) on overflow. Since they do not roll over,
+	 * proactively reset the corresponding PERF_CTR when bit 47 is set so
+	 * that the counter never gets a chance to saturate.
+	 */
+	if (new & BIT_ULL(63 - COUNTER_SHIFT)) {
+		wrmsrl(hwc->event_base, 0);
+		local64_set(&hwc->prev_count, 0);
+	} else {
+		local64_set(&hwc->prev_count, new);
+	}
+
+	delta = (new << shift) - (prev << shift);
+	delta >>= shift;
+	local64_add(delta, &event->count);
+}
+
 static
 void amd_uncore_umc_ctx_scan(struct amd_uncore *uncore, unsigned int cpu)
 {
@@ -1034,7 +1067,7 @@ int amd_uncore_umc_ctx_init(struct amd_uncore *uncore, unsigned int cpu)
 				.del		= amd_uncore_del,
 				.start		= amd_uncore_umc_start,
 				.stop		= amd_uncore_stop,
-				.read		= amd_uncore_read,
+				.read		= amd_uncore_umc_read,
 				.capabilities	= PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
 				.module		= THIS_MODULE,
 			};
-- 
2.43.0



* Re: [PATCH v2 3/5] perf/x86/amd/uncore: Use hrtimer for handling overflows
From: Stephane Eranian @ 2025-04-18  4:38 UTC
  To: Sandipan Das
  Cc: linux-perf-users, linux-kernel, peterz, mingo, acme, namhyung,
	mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
	kan.liang, tglx, bp, dave.hansen, x86, hpa, songliubraving,
	ravi.bangoria, ananth.narayan

On Thu, Apr 17, 2025 at 8:44 PM Sandipan Das <sandipan.das@amd.com> wrote:
>
> Uncore counters do not provide mechanisms such as interrupts to report
> overflows, so if a counter overflows more than once between two
> successive read requests for the same event, prev_count goes stale and
> the accumulated user-visible count is incorrect.
>
> To avoid this, start an hrtimer to periodically initiate a pmu->read()
> of the active counters and keep prev_count up-to-date. Note that for
> this approach to be effective, the hrtimer duration must be shorter
> than the shortest time it takes for a counter to overflow.
>
The problem I see is that the number of uncore PMUs varies a lot with
the CPU model, in particular because of the L3 PMU.
Is a timer armed per CCX, or is there only a global one that generates
IPIs to all other CPUs?

[...]


* Re: [PATCH v2 3/5] perf/x86/amd/uncore: Use hrtimer for handling overflows
From: Sandipan Das @ 2025-04-18  4:49 UTC
  To: Stephane Eranian
  Cc: linux-perf-users, linux-kernel, peterz, mingo, acme, namhyung,
	mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
	kan.liang, tglx, bp, dave.hansen, x86, hpa, songliubraving,
	ravi.bangoria, ananth.narayan

On 4/18/2025 10:08 AM, Stephane Eranian wrote:
> On Thu, Apr 17, 2025 at 8:44 PM Sandipan Das <sandipan.das@amd.com> wrote:
>>
>> Uncore counters do not provide mechanisms such as interrupts to report
>> overflows, so if a counter overflows more than once between two
>> successive read requests for the same event, prev_count goes stale and
>> the accumulated user-visible count is incorrect.
>>
>> To avoid this, start an hrtimer to periodically initiate a pmu->read()
>> of the active counters and keep prev_count up-to-date. Note that for
>> this approach to be effective, the hrtimer duration must be shorter
>> than the shortest time it takes for a counter to overflow.
>>
> The problem I see is that the number of uncore PMUs varies a lot with
> the CPU model, in particular because of the L3 PMU.
> Is a timer armed per CCX, or is there only a global one that generates
> IPIs to all other CPUs?
> 

For the L3 PMU, it's on a per-CCX basis: the hrtimer is part of each
uncore context rather than being a single global timer.

[...]



* Re: [PATCH v2 0/5] perf/x86/uncore: Overflow handling enhancements
From: Ingo Molnar @ 2025-04-18  8:32 UTC
  To: Sandipan Das
  Cc: linux-perf-users, linux-kernel, peterz, mingo, acme, namhyung,
	mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
	kan.liang, tglx, bp, dave.hansen, x86, hpa, eranian,
	songliubraving, ravi.bangoria, ananth.narayan


* Sandipan Das <sandipan.das@amd.com> wrote:

> Sandipan Das (5):
>   perf/x86/amd/uncore: Remove unused member from amd_uncore_ctx
>   perf/x86/intel/uncore: Use HRTIMER_MODE_HARD for detecting overflows
>   perf/x86/amd/uncore: Use hrtimer for handling overflows
>   perf/x86/amd/uncore: Add parameter to configure hrtimer
>   perf/x86/amd/uncore: Prevent UMC counters from saturating

Could you please fix your mailer to not mutilate Cc: lines?

Cc: linux-perf-users@vger.kernel.org
Cc: peterz@infradead.org
Cc: mingo@redhat.com
Cc: acme@kernel.org
Cc: namhyung@kernel.org
Cc: mark.rutland@arm.com
Cc: alexander.shishkin@linux.intel.com
Cc: jolsa@kernel.org
Cc: irogers@google.com
Cc: adrian.hunter@intel.com
Cc: kan.liang@linux.intel.com
Cc: tglx@linutronix.de
Cc: bp@alien8.de
Cc: dave.hansen@linux.intel.com
Cc: x86@kernel.org
Cc: hpa@zytor.com
Cc: eranian@google.com
Cc: songliubraving@meta.com
Cc: ravi.bangoria@amd.com
Cc: ananth.narayan@amd.com

All these email addresses have real names, I suppose they weren't just
written in such a fashion?

Thanks,

	Ingo


* Re: [PATCH v2 0/5] perf/x86/uncore: Overflow handling enhancements
From: Sandipan Das @ 2025-04-18  9:27 UTC
  To: Ingo Molnar
  Cc: linux-perf-users, linux-kernel, peterz, mingo, acme, namhyung,
	mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
	kan.liang, tglx, bp, dave.hansen, x86, hpa, eranian,
	songliubraving, ravi.bangoria, ananth.narayan



On 4/18/2025 2:02 PM, Ingo Molnar wrote:
> 
> * Sandipan Das <sandipan.das@amd.com> wrote:
> 
>> Sandipan Das (5):
>>   perf/x86/amd/uncore: Remove unused member from amd_uncore_ctx
>>   perf/x86/intel/uncore: Use HRTIMER_MODE_HARD for detecting overflows
>>   perf/x86/amd/uncore: Use hrtimer for handling overflows
>>   perf/x86/amd/uncore: Add parameter to configure hrtimer
>>   perf/x86/amd/uncore: Prevent UMC counters from saturating
> 
> Could you please fix your mailer to not mutilate Cc: lines?
> 

My bad. Will fix this.

> Cc: linux-perf-users@vger.kernel.org
> Cc: peterz@infradead.org
> Cc: mingo@redhat.com
> Cc: acme@kernel.org
> Cc: namhyung@kernel.org
> Cc: mark.rutland@arm.com
> Cc: alexander.shishkin@linux.intel.com
> Cc: jolsa@kernel.org
> Cc: irogers@google.com
> Cc: adrian.hunter@intel.com
> Cc: kan.liang@linux.intel.com
> Cc: tglx@linutronix.de
> Cc: bp@alien8.de
> Cc: dave.hansen@linux.intel.com
> Cc: x86@kernel.org
> Cc: hpa@zytor.com
> Cc: eranian@google.com
> Cc: songliubraving@meta.com
> Cc: ravi.bangoria@amd.com
> Cc: ananth.narayan@amd.com
> 
> All these email addresses have real names, I suppose they weren't just
> written in such a fashion?
> 

And this too.

