linux-kernel.vger.kernel.org archive mirror
* [PATCH V3 00/16] perf: Fix the throttle logic for group
@ 2025-05-16 18:28 kan.liang
  2025-05-16 18:28 ` [PATCH V3 01/16] perf: Clean up event in freq mode check kan.liang
                   ` (15 more replies)
  0 siblings, 16 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Changes since V2:
- Add a cleanup patch to check if an event is in freq mode
- Rename the parameter of the perf_event_unthrottle_group()
- Add Tested-by from Leo and Thomas
- Add Acked-by from Guo Ren

Changes since V1:
- Apply the suggested throttle/unthrottle functions from Peter.
  The MAX_INTERRUPTS and throttle logs are applied to all events.
- Update the description and comments accordingly
- Add Reviewed-by from Ravi and Max

The sampling read doesn't work well with a group.
The issue was originally found when the 'Basic leader sampling test'
case failed on s390.
https://lore.kernel.org/all/20250228062241.303309-1-tmricht@linux.ibm.com/

Stephane debugged it and found it was caused by the throttling logic.
https://lore.kernel.org/all/CABPqkBQzCMNS_PfLZBWVuX9o8Z55PovwJvpVWMWzyeExFJ5R4Q@mail.gmail.com/

The throttle logic is generic and shared by all architectures, so it
also impacts other architectures, e.g., x86.

On an Intel GNR machine,
$ perf record -e "{cycles,cycles}:S" ...

$ perf report -D | grep THROTTLE | tail -2
            THROTTLE events:        426  ( 9.0%)
          UNTHROTTLE events:        425  ( 9.0%)

$ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
... sample_read:
.... group nr 2
..... id 0000000000000327, value 000000000cbb993a, lost 0
..... id 0000000000000328, value 00000002211c26df, lost 0

The patch set provides a generic fix for group throttle support, so the
buggy driver-specific implementations can be removed.

The patch set is verified on newer Intel platforms (Kan), ARM (Leo Yan),
and s390 (Thomas Richter).

Kan Liang (16):
  perf: Clean up event in freq mode check
  perf: Fix the throttle logic for a group
  perf/x86/intel: Remove driver-specific throttle support
  perf/x86/amd: Remove driver-specific throttle support
  perf/x86/zhaoxin: Remove driver-specific throttle support
  powerpc/perf: Remove driver-specific throttle support
  s390/perf: Remove driver-specific throttle support
  perf/arm: Remove driver-specific throttle support
  perf/apple_m1: Remove driver-specific throttle support
  alpha/perf: Remove driver-specific throttle support
  arc/perf: Remove driver-specific throttle support
  csky/perf: Remove driver-specific throttle support
  loongarch/perf: Remove driver-specific throttle support
  sparc/perf: Remove driver-specific throttle support
  xtensa/perf: Remove driver-specific throttle support
  mips/perf: Remove driver-specific throttle support

 arch/alpha/kernel/perf_event.c       | 11 ++---
 arch/arc/kernel/perf_event.c         |  6 +--
 arch/csky/kernel/perf_event.c        |  3 +-
 arch/loongarch/kernel/perf_event.c   |  3 +-
 arch/mips/kernel/perf_event_mipsxx.c |  3 +-
 arch/powerpc/perf/core-book3s.c      |  6 +--
 arch/powerpc/perf/core-fsl-emb.c     |  3 +-
 arch/s390/kernel/perf_cpum_cf.c      |  2 -
 arch/s390/kernel/perf_cpum_sf.c      |  5 +-
 arch/sparc/kernel/perf_event.c       |  3 +-
 arch/x86/events/amd/core.c           |  3 +-
 arch/x86/events/amd/ibs.c            |  4 +-
 arch/x86/events/core.c               |  3 +-
 arch/x86/events/intel/core.c         |  6 +--
 arch/x86/events/intel/ds.c           |  7 ++-
 arch/x86/events/intel/knc.c          |  3 +-
 arch/x86/events/intel/p4.c           |  3 +-
 arch/x86/events/zhaoxin/core.c       |  3 +-
 arch/xtensa/kernel/perf_event.c      |  3 +-
 drivers/perf/apple_m1_cpu_pmu.c      |  3 +-
 drivers/perf/arm_pmuv3.c             |  3 +-
 drivers/perf/arm_v6_pmu.c            |  3 +-
 drivers/perf/arm_v7_pmu.c            |  3 +-
 drivers/perf/arm_xscale_pmu.c        |  6 +--
 kernel/events/core.c                 | 73 ++++++++++++++++++++--------
 25 files changed, 84 insertions(+), 87 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V3 01/16] perf: Clean up event in freq mode check
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-17 12:59   ` [tip: perf/core] perf/core: Add the is_event_in_freq_mode() helper to simplify the code tip-bot2 for Kan Liang
  2025-05-16 18:28 ` [PATCH V3 02/16] perf: Fix the throttle logic for a group kan.liang
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Add a helper to check whether an event is in freq mode, to improve
readability.

No functional change.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 kernel/events/core.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a84abc2b7f20..af78ec118e8f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2442,6 +2442,11 @@ event_filter_match(struct perf_event *event)
 	       perf_cgroup_match(event);
 }
 
+static inline bool is_event_in_freq_mode(struct perf_event *event)
+{
+	return event->attr.freq && event->attr.sample_freq;
+}
+
 static void
 event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
 {
@@ -2479,7 +2484,7 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
 
 	if (!is_software_event(event))
 		cpc->active_oncpu--;
-	if (event->attr.freq && event->attr.sample_freq) {
+	if (is_event_in_freq_mode(event)) {
 		ctx->nr_freq--;
 		epc->nr_freq--;
 	}
@@ -2780,7 +2785,7 @@ event_sched_in(struct perf_event *event, struct perf_event_context *ctx)
 
 	if (!is_software_event(event))
 		cpc->active_oncpu++;
-	if (event->attr.freq && event->attr.sample_freq) {
+	if (is_event_in_freq_mode(event)) {
 		ctx->nr_freq++;
 		epc->nr_freq++;
 	}
@@ -4391,11 +4396,11 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
 		if (hwc->interrupts == MAX_INTERRUPTS) {
 			hwc->interrupts = 0;
 			perf_log_throttle(event, 1);
-			if (!event->attr.freq || !event->attr.sample_freq)
+			if (!is_event_in_freq_mode(event))
 				event->pmu->start(event, 0);
 		}
 
-		if (!event->attr.freq || !event->attr.sample_freq)
+		if (!is_event_in_freq_mode(event))
 			continue;
 
 		/*
@@ -13129,7 +13134,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 
 	hwc = &event->hw;
 	hwc->sample_period = attr->sample_period;
-	if (attr->freq && attr->sample_freq)
+	if (is_event_in_freq_mode(event))
 		hwc->sample_period = 1;
 	hwc->last_period = hwc->sample_period;
 
-- 
2.38.1



* [PATCH V3 02/16] perf: Fix the throttle logic for a group
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
  2025-05-16 18:28 ` [PATCH V3 01/16] perf: Clean up event in freq mode check kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-17  8:22   ` Ingo Molnar
  2025-05-18 19:18   ` Namhyung Kim
  2025-05-16 18:28 ` [PATCH V3 03/16] perf/x86/intel: Remove driver-specific throttle support kan.liang
                   ` (13 subsequent siblings)
  15 siblings, 2 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The current throttle logic doesn't work well with a group, e.g., the
following sampling-read case.

$ perf record -e "{cycles,cycles}:S" ...

$ perf report -D | grep THROTTLE | tail -2
            THROTTLE events:        426  ( 9.0%)
          UNTHROTTLE events:        425  ( 9.0%)

$ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
... sample_read:
.... group nr 2
..... id 0000000000000327, value 000000000cbb993a, lost 0
..... id 0000000000000328, value 00000002211c26df, lost 0

The second cycles event has a much larger value than the first cycles
event in the same group.

The current throttle logic in the generic code only logs the THROTTLE
event; it relies on the driver-specific implementation to actually
disable the event. The implementation is similar across all
architectures: only the overflowing event is disabled, rather than its
whole group.

The logic to disable the group should be generic for all ARCHs. Add the
logic in the generic code. The following patch will remove the buggy
driver-specific implementation.

Throttling only happens when an event overflows. Stop the entire group
when any event in the group triggers the throttle.
MAX_INTERRUPTS is set for all throttled events.

Unthrottling can happen in three places:
- Event/group sched-in. All events in the group are scheduled in one by
  one, so all of them will be unthrottled eventually. Nothing needs to
  be changed.
- perf_adjust_freq_unthr_events() on each tick. It needs to restart the
  group altogether.
- __perf_event_period(). The whole group needs to be restarted
  altogether as well.

With the fix,
$ sudo perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
0 3573470770332 0x12f5f8 [0x70]: PERF_RECORD_SAMPLE(IP, 0x2):
... sample_read:
.... group nr 2
..... id 0000000000000a28, value 00000004fd3dfd8f, lost 0
..... id 0000000000000a29, value 00000004fd3dfd8f, lost 0

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 kernel/events/core.c | 60 ++++++++++++++++++++++++++++++++------------
 1 file changed, 44 insertions(+), 16 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index af78ec118e8f..52490c2ce45b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2739,6 +2739,39 @@ void perf_event_disable_inatomic(struct perf_event *event)
 static void perf_log_throttle(struct perf_event *event, int enable);
 static void perf_log_itrace_start(struct perf_event *event);
 
+static void perf_event_unthrottle(struct perf_event *event, bool start)
+{
+	event->hw.interrupts = 0;
+	if (start)
+		event->pmu->start(event, 0);
+	perf_log_throttle(event, 1);
+}
+
+static void perf_event_throttle(struct perf_event *event)
+{
+	event->pmu->stop(event, 0);
+	event->hw.interrupts = MAX_INTERRUPTS;
+	perf_log_throttle(event, 0);
+}
+
+static void perf_event_unthrottle_group(struct perf_event *event, bool skip_start_event)
+{
+	struct perf_event *sibling, *leader = event->group_leader;
+
+	perf_event_unthrottle(leader, skip_start_event ? leader != event : true);
+	for_each_sibling_event(sibling, leader)
+		perf_event_unthrottle(sibling, skip_start_event ? sibling != event : true);
+}
+
+static void perf_event_throttle_group(struct perf_event *event)
+{
+	struct perf_event *sibling, *leader = event->group_leader;
+
+	perf_event_throttle(leader);
+	for_each_sibling_event(sibling, leader)
+		perf_event_throttle(sibling);
+}
+
 static int
 event_sched_in(struct perf_event *event, struct perf_event_context *ctx)
 {
@@ -4393,12 +4426,8 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
 
 		hwc = &event->hw;
 
-		if (hwc->interrupts == MAX_INTERRUPTS) {
-			hwc->interrupts = 0;
-			perf_log_throttle(event, 1);
-			if (!is_event_in_freq_mode(event))
-				event->pmu->start(event, 0);
-		}
+		if (hwc->interrupts == MAX_INTERRUPTS)
+			perf_event_unthrottle_group(event, is_event_in_freq_mode(event));
 
 		if (!is_event_in_freq_mode(event))
 			continue;
@@ -6426,14 +6455,6 @@ static void __perf_event_period(struct perf_event *event,
 	active = (event->state == PERF_EVENT_STATE_ACTIVE);
 	if (active) {
 		perf_pmu_disable(event->pmu);
-		/*
-		 * We could be throttled; unthrottle now to avoid the tick
-		 * trying to unthrottle while we already re-started the event.
-		 */
-		if (event->hw.interrupts == MAX_INTERRUPTS) {
-			event->hw.interrupts = 0;
-			perf_log_throttle(event, 1);
-		}
 		event->pmu->stop(event, PERF_EF_UPDATE);
 	}
 
@@ -6441,6 +6462,14 @@ static void __perf_event_period(struct perf_event *event,
 
 	if (active) {
 		event->pmu->start(event, PERF_EF_RELOAD);
+		/*
+		 * Once the period is force-reset, the event starts immediately.
+		 * But the event/group could be throttled. Unthrottle the
+		 * event/group now to avoid the next tick trying to unthrottle
+		 * while we already re-started the event/group.
+		 */
+		if (event->hw.interrupts == MAX_INTERRUPTS)
+			perf_event_unthrottle_group(event, true);
 		perf_pmu_enable(event->pmu);
 	}
 }
@@ -10331,8 +10360,7 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
 	if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
 		__this_cpu_inc(perf_throttled_count);
 		tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
-		hwc->interrupts = MAX_INTERRUPTS;
-		perf_log_throttle(event, 0);
+		perf_event_throttle_group(event);
 		ret = 1;
 	}
 
-- 
2.38.1



* [PATCH V3 03/16] perf/x86/intel: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
  2025-05-16 18:28 ` [PATCH V3 01/16] perf: Clean up event in freq mode check kan.liang
  2025-05-16 18:28 ` [PATCH V3 02/16] perf: Fix the throttle logic for a group kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 04/16] perf/x86/amd: " kan.liang
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may also return true because
of event_limit. In that case an inatomic event disable has already been
done, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/core.c       | 3 +--
 arch/x86/events/intel/core.c | 6 ++----
 arch/x86/events/intel/ds.c   | 7 +++----
 arch/x86/events/intel/knc.c  | 3 +--
 arch/x86/events/intel/p4.c   | 3 +--
 5 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 186e31cd0c14..8a2f73333a50 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1730,8 +1730,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
 
 		perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	if (handled)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index b7562d66c6ea..a8309a67693e 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3138,8 +3138,7 @@ static void x86_pmu_handle_guest_pebs(struct pt_regs *regs,
 			continue;
 
 		perf_sample_data_init(data, 0, event->hw.last_period);
-		if (perf_event_overflow(event, data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, data, regs);
 
 		/* Inject one fake event is enough. */
 		break;
@@ -3282,8 +3281,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		if (has_branch_stack(event))
 			intel_pmu_lbr_save_brstack(&data, cpuc, event);
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	return handled;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 58c054fa56b5..f8610f7196f0 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2368,8 +2368,7 @@ __intel_pmu_pebs_last_event(struct perf_event *event,
 		 * All but the last records are processed.
 		 * The last one is left to be able to call the overflow handler.
 		 */
-		if (perf_event_overflow(event, data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, data, regs);
 	}
 
 	if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
@@ -2597,8 +2596,8 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 		if (error[bit]) {
 			perf_log_lost_samples(event, error[bit]);
 
-			if (iregs && perf_event_account_interrupt(event))
-				x86_pmu_stop(event, 0);
+			if (iregs)
+				perf_event_account_interrupt(event);
 		}
 
 		if (counts[bit]) {
diff --git a/arch/x86/events/intel/knc.c b/arch/x86/events/intel/knc.c
index 3e8ec049b46d..384589168c1a 100644
--- a/arch/x86/events/intel/knc.c
+++ b/arch/x86/events/intel/knc.c
@@ -254,8 +254,7 @@ static int knc_pmu_handle_irq(struct pt_regs *regs)
 
 		perf_sample_data_init(&data, 0, last_period);
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index c85a9fc44355..126d5ae264cb 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -1072,8 +1072,7 @@ static int p4_pmu_handle_irq(struct pt_regs *regs)
 			continue;
 
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	if (handled)
-- 
2.38.1



* [PATCH V3 04/16] perf/x86/amd: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (2 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 03/16] perf/x86/intel: Remove driver-specific throttle support kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 05/16] perf/x86/zhaoxin: " kan.liang
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Ravi Bangoria,
	Sandipan Das

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may also return true because
of event_limit. In that case an inatomic event disable has already been
done, so the pmu->stop is not required either.

Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
---
 arch/x86/events/amd/core.c | 3 +--
 arch/x86/events/amd/ibs.c  | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 30d6ceb4c8ad..5e64283b9bf2 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1003,8 +1003,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
 
 		perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 0252b7ea8bca..4bbbca02aeb1 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1373,9 +1373,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
 		hwc->sample_period = perf_ibs->min_period;
 
 out:
-	if (throttle) {
-		perf_ibs_stop(event, 0);
-	} else {
+	if (!throttle) {
 		if (perf_ibs == &perf_ibs_op) {
 			if (ibs_caps & IBS_CAPS_OPCNTEXT) {
 				new_config = period & IBS_OP_MAX_CNT_EXT_MASK;
-- 
2.38.1



* [PATCH V3 05/16] perf/x86/zhaoxin: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (3 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 04/16] perf/x86/amd: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 06/16] powerpc/perf: " kan.liang
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, silviazhao,
	CodyYao-oc

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may also return true because
of event_limit. In that case an inatomic event disable has already been
done, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: silviazhao <silviazhao-oc@zhaoxin.com>
Cc: CodyYao-oc <CodyYao-oc@zhaoxin.com>
---
 arch/x86/events/zhaoxin/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
index 2fd9b0cf9a5e..49a5944fac63 100644
--- a/arch/x86/events/zhaoxin/core.c
+++ b/arch/x86/events/zhaoxin/core.c
@@ -397,8 +397,7 @@ static int zhaoxin_pmu_handle_irq(struct pt_regs *regs)
 		if (!x86_perf_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
-- 
2.38.1



* [PATCH V3 06/16] powerpc/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (4 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 05/16] perf/x86/zhaoxin: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 07/16] s390/perf: " kan.liang
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Athira Rajeev,
	Madhavan Srinivasan, linuxppc-dev

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may also return true because
of event_limit. In that case an inatomic event disable has already been
done, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/perf/core-book3s.c  | 6 ++----
 arch/powerpc/perf/core-fsl-emb.c | 3 +--
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 42ff4d167acc..8b0081441f85 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2344,12 +2344,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 			ppmu->get_mem_weight(&data.weight.full, event->attr.sample_type);
 			data.sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
 		}
-		if (perf_event_overflow(event, &data, regs))
-			power_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	} else if (period) {
 		/* Account for interrupt in case of invalid SIAR */
-		if (perf_event_account_interrupt(event))
-			power_pmu_stop(event, 0);
+		perf_event_account_interrupt(event);
 	}
 }
 
diff --git a/arch/powerpc/perf/core-fsl-emb.c b/arch/powerpc/perf/core-fsl-emb.c
index d2ffcc7021c5..7120ab20cbfe 100644
--- a/arch/powerpc/perf/core-fsl-emb.c
+++ b/arch/powerpc/perf/core-fsl-emb.c
@@ -635,8 +635,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 
 		perf_sample_data_init(&data, 0, last_period);
 
-		if (perf_event_overflow(event, &data, regs))
-			fsl_emb_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 }
 
-- 
2.38.1



* [PATCH V3 07/16] s390/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (5 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 06/16] powerpc/perf: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 08/16] perf/arm: " kan.liang
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, linux-s390

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may also return true because
of event_limit. In that case an inatomic event disable has already been
done, so the pmu->stop is not required either.

Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
---
 arch/s390/kernel/perf_cpum_cf.c | 2 --
 arch/s390/kernel/perf_cpum_sf.c | 5 +----
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index e657fad7e376..6a262e198e35 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -980,8 +980,6 @@ static int cfdiag_push_sample(struct perf_event *event,
 	}
 
 	overflow = perf_event_overflow(event, &data, &regs);
-	if (overflow)
-		event->pmu->stop(event, 0);
 
 	perf_event_update_userpage(event);
 	return overflow;
diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index ad22799d8a7d..91469401f2c9 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -1072,10 +1072,7 @@ static int perf_push_sample(struct perf_event *event,
 	overflow = 0;
 	if (perf_event_exclude(event, &regs, sde_regs))
 		goto out;
-	if (perf_event_overflow(event, &data, &regs)) {
-		overflow = 1;
-		event->pmu->stop(event, 0);
-	}
+	overflow = perf_event_overflow(event, &data, &regs);
 	perf_event_update_userpage(event);
 out:
 	return overflow;
-- 
2.38.1



* [PATCH V3 08/16] perf/arm: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (6 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 07/16] s390/perf: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 09/16] perf/apple_m1: " kan.liang
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Rob Herring,
	Vincenzo Frascino, Will Deacon

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may also return true because
of event_limit. In that case an inatomic event disable has already been
done, so the pmu->stop is not required either.

Tested-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Rob Herring (Arm) <robh@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
---
 drivers/perf/arm_pmuv3.c      | 3 +--
 drivers/perf/arm_v6_pmu.c     | 3 +--
 drivers/perf/arm_v7_pmu.c     | 3 +--
 drivers/perf/arm_xscale_pmu.c | 6 ++----
 4 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index e506d59654e7..3db9f4ed17e8 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -887,8 +887,7 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		 * an irq_work which will be taken care of in the handling of
 		 * IPI_IRQ_WORK.
 		 */
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 	armv8pmu_start(cpu_pmu);
 
diff --git a/drivers/perf/arm_v6_pmu.c b/drivers/perf/arm_v6_pmu.c
index b09615bb2bb2..7cb12c8e06c7 100644
--- a/drivers/perf/arm_v6_pmu.c
+++ b/drivers/perf/arm_v6_pmu.c
@@ -276,8 +276,7 @@ armv6pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
diff --git a/drivers/perf/arm_v7_pmu.c b/drivers/perf/arm_v7_pmu.c
index 17831e1920bd..a1e438101114 100644
--- a/drivers/perf/arm_v7_pmu.c
+++ b/drivers/perf/arm_v7_pmu.c
@@ -930,8 +930,7 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
diff --git a/drivers/perf/arm_xscale_pmu.c b/drivers/perf/arm_xscale_pmu.c
index 638fea9b1263..c2ac41dd9e19 100644
--- a/drivers/perf/arm_xscale_pmu.c
+++ b/drivers/perf/arm_xscale_pmu.c
@@ -186,8 +186,7 @@ xscale1pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	irq_work_run();
@@ -519,8 +518,7 @@ xscale2pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	irq_work_run();
-- 
2.38.1



* [PATCH V3 09/16] perf/apple_m1: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (7 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 08/16] perf/arm: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 10/16] alpha/perf: " kan.liang
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Oliver Upton

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may also return true because
of event_limit. In that case an inatomic event disable has already been
done, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
---
 drivers/perf/apple_m1_cpu_pmu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
index df9a28ba69dc..81b6f1a62349 100644
--- a/drivers/perf/apple_m1_cpu_pmu.c
+++ b/drivers/perf/apple_m1_cpu_pmu.c
@@ -474,8 +474,7 @@ static irqreturn_t m1_pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			m1_pmu_disable_event(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	cpu_pmu->start(cpu_pmu);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 10/16] alpha/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (8 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 09/16] perf/apple_m1: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 11/16] arc/perf: " kan.liang
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, linux-alpha

From: Kan Liang <kan.liang@linux.intel.com>

Throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may return true because of
the event_limit. In that case it already does an inatomic event
disable, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-alpha@vger.kernel.org
---
 arch/alpha/kernel/perf_event.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/alpha/kernel/perf_event.c b/arch/alpha/kernel/perf_event.c
index 1f0eb4f25c0f..a3eaab094ece 100644
--- a/arch/alpha/kernel/perf_event.c
+++ b/arch/alpha/kernel/perf_event.c
@@ -852,14 +852,9 @@ static void alpha_perf_event_irq_handler(unsigned long la_ptr,
 	alpha_perf_event_update(event, hwc, idx, alpha_pmu->pmc_max_period[idx]+1);
 	perf_sample_data_init(&data, 0, hwc->last_period);
 
-	if (alpha_perf_event_set_period(event, hwc, idx)) {
-		if (perf_event_overflow(event, &data, regs)) {
-			/* Interrupts coming too quickly; "throttle" the
-			 * counter, i.e., disable it for a little while.
-			 */
-			alpha_pmu_stop(event, 0);
-		}
-	}
+	if (alpha_perf_event_set_period(event, hwc, idx))
+		perf_event_overflow(event, &data, regs);
+
 	wrperfmon(PERFMON_CMD_ENABLE, cpuc->idx_mask);
 
 	return;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 11/16] arc/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (9 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 10/16] alpha/perf: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-18 21:58   ` Vineet Gupta
  2025-05-16 18:28 ` [PATCH V3 12/16] csky/perf: " kan.liang
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Vineet Gupta,
	linux-snps-arc

From: Kan Liang <kan.liang@linux.intel.com>

Throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may return true because of
the event_limit. In that case it already does an inatomic event
disable, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Vineet Gupta <vgupta@ikernel.org>
Cc: linux-snps-arc@lists.infradead.org
---
 arch/arc/kernel/perf_event.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 6e5a651cd75c..ed6d4f0cd621 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -599,10 +599,8 @@ static irqreturn_t arc_pmu_intr(int irq, void *dev)
 
 		arc_perf_event_update(event, &event->hw, event->hw.idx);
 		perf_sample_data_init(&data, 0, hwc->last_period);
-		if (arc_pmu_event_set_period(event)) {
-			if (perf_event_overflow(event, &data, regs))
-				arc_pmu_stop(event, 0);
-		}
+		if (arc_pmu_event_set_period(event))
+			perf_event_overflow(event, &data, regs);
 
 		active_ints &= ~BIT(idx);
 	} while (active_ints);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 12/16] csky/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (10 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 11/16] arc/perf: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 13/16] loongarch/perf: " kan.liang
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Guo Ren, Mao Han,
	Guo Ren, linux-csky

From: Kan Liang <kan.liang@linux.intel.com>

Throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may return true because of
the event_limit. In that case it already does an inatomic event
disable, so the pmu->stop is not required either.

Acked-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Mao Han <han_mao@c-sky.com>
Cc: Guo Ren <ren_guo@c-sky.com>
Cc: linux-csky@vger.kernel.org
---
 arch/csky/kernel/perf_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/csky/kernel/perf_event.c b/arch/csky/kernel/perf_event.c
index e5f18420ce64..e0a36acd265b 100644
--- a/arch/csky/kernel/perf_event.c
+++ b/arch/csky/kernel/perf_event.c
@@ -1139,8 +1139,7 @@ static irqreturn_t csky_pmu_handle_irq(int irq_num, void *dev)
 		perf_sample_data_init(&data, 0, hwc->last_period);
 		csky_pmu_event_set_period(event);
 
-		if (perf_event_overflow(event, &data, regs))
-			csky_pmu_stop_event(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	csky_pmu_enable(&csky_pmu.pmu);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 13/16] loongarch/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (11 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 12/16] csky/perf: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 14/16] sparc/perf: " kan.liang
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Bibo Mao,
	Huacai Chen, loongarch

From: Kan Liang <kan.liang@linux.intel.com>

Throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may return true because of
the event_limit. In that case it already does an inatomic event
disable, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Cc: loongarch@lists.linux.dev
---
 arch/loongarch/kernel/perf_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/loongarch/kernel/perf_event.c b/arch/loongarch/kernel/perf_event.c
index f86a4b838dd7..8ad098703488 100644
--- a/arch/loongarch/kernel/perf_event.c
+++ b/arch/loongarch/kernel/perf_event.c
@@ -479,8 +479,7 @@ static void handle_associated_event(struct cpu_hw_events *cpuc, int idx,
 	if (!loongarch_pmu_event_set_period(event, hwc, idx))
 		return;
 
-	if (perf_event_overflow(event, data, regs))
-		loongarch_pmu_disable_event(idx);
+	perf_event_overflow(event, data, regs);
 }
 
 static irqreturn_t pmu_handle_irq(int irq, void *dev)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 14/16] sparc/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (12 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 13/16] loongarch/perf: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 15/16] xtensa/perf: " kan.liang
  2025-05-16 18:28 ` [PATCH V3 16/16] mips/perf: " kan.liang
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, David S . Miller,
	sparclinux

From: Kan Liang <kan.liang@linux.intel.com>

Throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may return true because of
the event_limit. In that case it already does an inatomic event
disable, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
---
 arch/sparc/kernel/perf_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index f02a283a8e8f..cae4d33002a5 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1668,8 +1668,7 @@ static int __kprobes perf_event_nmi_handler(struct notifier_block *self,
 		if (!sparc_perf_event_set_period(event, hwc, idx))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			sparc_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	finish_clock = sched_clock();
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 15/16] xtensa/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (13 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 14/16] sparc/perf: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  2025-05-16 18:28 ` [PATCH V3 16/16] mips/perf: " kan.liang
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Max Filippov

From: Kan Liang <kan.liang@linux.intel.com>

Throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may return true because of
the event_limit. In that case it already does an inatomic event
disable, so the pmu->stop is not required either.

Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
---
 arch/xtensa/kernel/perf_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/xtensa/kernel/perf_event.c b/arch/xtensa/kernel/perf_event.c
index 183618090d05..223f1d452310 100644
--- a/arch/xtensa/kernel/perf_event.c
+++ b/arch/xtensa/kernel/perf_event.c
@@ -388,8 +388,7 @@ irqreturn_t xtensa_pmu_irq_handler(int irq, void *dev_id)
 			struct pt_regs *regs = get_irq_regs();
 
 			perf_sample_data_init(&data, 0, last_period);
-			if (perf_event_overflow(event, &data, regs))
-				xtensa_pmu_stop(event, 0);
+			perf_event_overflow(event, &data, regs);
 		}
 
 		rc = IRQ_HANDLED;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 16/16] mips/perf: Remove driver-specific throttle support
  2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
                   ` (14 preceding siblings ...)
  2025-05-16 18:28 ` [PATCH V3 15/16] xtensa/perf: " kan.liang
@ 2025-05-16 18:28 ` kan.liang
  15 siblings, 0 replies; 24+ messages in thread
From: kan.liang @ 2025-05-16 18:28 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, Kan Liang, Thomas Bogendoerfer,
	linux-mips

From: Kan Liang <kan.liang@linux.intel.com>

Throttle support has been added to the generic code. Remove the
driver-specific throttle support.

Besides the throttle, perf_event_overflow() may return true because of
the event_limit. In that case it already does an inatomic event
disable, so the pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
---
 arch/mips/kernel/perf_event_mipsxx.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index c4d6b09136b1..196a070349b0 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -791,8 +791,7 @@ static void handle_associated_event(struct cpu_hw_events *cpuc,
 	if (!mipspmu_event_set_period(event, hwc, idx))
 		return;
 
-	if (perf_event_overflow(event, data, regs))
-		mipsxx_pmu_disable_event(idx);
+	perf_event_overflow(event, data, regs);
 }
 
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 02/16] perf: Fix the throttle logic for a group
  2025-05-16 18:28 ` [PATCH V3 02/16] perf: Fix the throttle logic for a group kan.liang
@ 2025-05-17  8:22   ` Ingo Molnar
  2025-05-20 14:16     ` Liang, Kan
  2025-05-18 19:18   ` Namhyung Kim
  1 sibling, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2025-05-17  8:22 UTC (permalink / raw)
  To: kan.liang
  Cc: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht, leo.yan


* kan.liang@linux.intel.com <kan.liang@linux.intel.com> wrote:

> The throttle only happens when an event is overflowed. Stop the entire
> group when any event in the group triggers the throttle.
> The MAX_INTERRUPTS is set to all throttle events.

Since this is a relatively long series with a healthy dose of 
breakage-risk, I'm wondering about bisectability:

 - patch #2 auto-throttles groups, ie. stops the PMU

 - patches #3-#16 removes explicit PMU-stop calls.

In the interim commits, will the double PMU-stop in drivers not updated 
yet do anything noticeable, such as generate warnings, etc?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [tip: perf/core] perf/core: Add the is_event_in_freq_mode() helper to simplify the code
  2025-05-16 18:28 ` [PATCH V3 01/16] perf: Clean up event in freq mode check kan.liang
@ 2025-05-17 12:59   ` tip-bot2 for Kan Liang
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot2 for Kan Liang @ 2025-05-17 12:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kan Liang, Ingo Molnar, Peter Zijlstra, x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     ca559503b89c30bc49178d0e4a1e0b23f991fb9f
Gitweb:        https://git.kernel.org/tip/ca559503b89c30bc49178d0e4a1e0b23f991fb9f
Author:        Kan Liang <kan.liang@linux.intel.com>
AuthorDate:    Fri, 16 May 2025 11:28:38 -07:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 17 May 2025 10:02:27 +02:00

perf/core: Add the is_event_in_freq_mode() helper to simplify the code

Add a helper to check if an event is in freq mode to improve readability.

No functional changes.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250516182853.2610284-2-kan.liang@linux.intel.com
---
 kernel/events/core.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index b846107..952340f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2351,6 +2351,11 @@ event_filter_match(struct perf_event *event)
 	       perf_cgroup_match(event);
 }
 
+static inline bool is_event_in_freq_mode(struct perf_event *event)
+{
+	return event->attr.freq && event->attr.sample_freq;
+}
+
 static void
 event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
 {
@@ -2388,7 +2393,7 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
 
 	if (!is_software_event(event))
 		cpc->active_oncpu--;
-	if (event->attr.freq && event->attr.sample_freq) {
+	if (is_event_in_freq_mode(event)) {
 		ctx->nr_freq--;
 		epc->nr_freq--;
 	}
@@ -2686,7 +2691,7 @@ event_sched_in(struct perf_event *event, struct perf_event_context *ctx)
 
 	if (!is_software_event(event))
 		cpc->active_oncpu++;
-	if (event->attr.freq && event->attr.sample_freq) {
+	if (is_event_in_freq_mode(event)) {
 		ctx->nr_freq++;
 		epc->nr_freq++;
 	}
@@ -4252,11 +4257,11 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
 		if (hwc->interrupts == MAX_INTERRUPTS) {
 			hwc->interrupts = 0;
 			perf_log_throttle(event, 1);
-			if (!event->attr.freq || !event->attr.sample_freq)
+			if (!is_event_in_freq_mode(event))
 				event->pmu->start(event, 0);
 		}
 
-		if (!event->attr.freq || !event->attr.sample_freq)
+		if (!is_event_in_freq_mode(event))
 			continue;
 
 		/*
@@ -12848,7 +12853,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 
 	hwc = &event->hw;
 	hwc->sample_period = attr->sample_period;
-	if (attr->freq && attr->sample_freq)
+	if (is_event_in_freq_mode(event))
 		hwc->sample_period = 1;
 	hwc->last_period = hwc->sample_period;
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 02/16] perf: Fix the throttle logic for a group
  2025-05-16 18:28 ` [PATCH V3 02/16] perf: Fix the throttle logic for a group kan.liang
  2025-05-17  8:22   ` Ingo Molnar
@ 2025-05-18 19:18   ` Namhyung Kim
  2025-05-20 14:47     ` Liang, Kan
  1 sibling, 1 reply; 24+ messages in thread
From: Namhyung Kim @ 2025-05-18 19:18 UTC (permalink / raw)
  To: kan.liang
  Cc: peterz, mingo, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht, leo.yan

Hi Kan,

On Fri, May 16, 2025 at 11:28:39AM -0700, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
> 
> The current throttle logic doesn't work well with a group, e.g., the
> following sampling-read case.
> 
> $ perf record -e "{cycles,cycles}:S" ...
> 
> $ perf report -D | grep THROTTLE | tail -2
>             THROTTLE events:        426  ( 9.0%)
>           UNTHROTTLE events:        425  ( 9.0%)
> 
> $ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
> 0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
> ... sample_read:
> .... group nr 2
> ..... id 0000000000000327, value 000000000cbb993a, lost 0
> ..... id 0000000000000328, value 00000002211c26df, lost 0
> 
> The second cycles event has a much larger value than the first cycles
> event in the same group.
> 
> The current throttle logic in the generic code only logs the THROTTLE
> event. It relies on the specific driver implementation to disable
> events. For all ARCHs, the implementation is similar. Only the event is
> disabled, rather than the group.
> 
> The logic to disable the group should be generic for all ARCHs. Add the
> logic in the generic code. The following patch will remove the buggy
> driver-specific implementation.
> 
> The throttle only happens when an event is overflowed. Stop the entire
> group when any event in the group triggers the throttle.
> The MAX_INTERRUPTS is set to all throttle events.
> 
> The unthrottled could happen in 3 places.
> - event/group sched. All events in the group are scheduled one by one.
>   All of them will be unthrottled eventually. Nothing needs to be
>   changed.
> - The perf_adjust_freq_unthr_events for each tick. Needs to restart the
>   group altogether.
> - The __perf_event_period(). The whole group needs to be restarted
>   altogether as well.
> 
> With the fix,
> $ sudo perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
> 0 3573470770332 0x12f5f8 [0x70]: PERF_RECORD_SAMPLE(IP, 0x2):
> ... sample_read:
> .... group nr 2
> ..... id 0000000000000a28, value 00000004fd3dfd8f, lost 0
> ..... id 0000000000000a29, value 00000004fd3dfd8f, lost 0

Thanks for working on this!

> 
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> ---
>  kernel/events/core.c | 60 ++++++++++++++++++++++++++++++++------------
>  1 file changed, 44 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index af78ec118e8f..52490c2ce45b 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2739,6 +2739,39 @@ void perf_event_disable_inatomic(struct perf_event *event)
>  static void perf_log_throttle(struct perf_event *event, int enable);
>  static void perf_log_itrace_start(struct perf_event *event);
>  
> +static void perf_event_unthrottle(struct perf_event *event, bool start)
> +{
> +	event->hw.interrupts = 0;
> +	if (start)
> +		event->pmu->start(event, 0);
> +	perf_log_throttle(event, 1);
> +}
> +
> +static void perf_event_throttle(struct perf_event *event)
> +{
> +	event->pmu->stop(event, 0);
> +	event->hw.interrupts = MAX_INTERRUPTS;
> +	perf_log_throttle(event, 0);
> +}
> +
> +static void perf_event_unthrottle_group(struct perf_event *event, bool skip_start_event)
> +{
> +	struct perf_event *sibling, *leader = event->group_leader;
> +
> +	perf_event_unthrottle(leader, skip_start_event ? leader != event : true);
> +	for_each_sibling_event(sibling, leader)
> +		perf_event_unthrottle(sibling, skip_start_event ? sibling != event : true);

This will add more PERF_RECORD_THROTTLE records for sibling events.
Maybe we can generate it for the actual target event only?

Also, the condition for skip_start_event is whether it's a freq event.
I think we can skip pmu->start() if the sibling is also a freq event.
I remember the KVM folks were concerned about the number of PMU
accesses, as they can cause VM exits.

Thanks,
Namhyung

> +}
> +
> +static void perf_event_throttle_group(struct perf_event *event)
> +{
> +	struct perf_event *sibling, *leader = event->group_leader;
> +
> +	perf_event_throttle(leader);
> +	for_each_sibling_event(sibling, leader)
> +		perf_event_throttle(sibling);
> +}
> +
>  static int
>  event_sched_in(struct perf_event *event, struct perf_event_context *ctx)
>  {
> @@ -4393,12 +4426,8 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>  
>  		hwc = &event->hw;
>  
> -		if (hwc->interrupts == MAX_INTERRUPTS) {
> -			hwc->interrupts = 0;
> -			perf_log_throttle(event, 1);
> -			if (!is_event_in_freq_mode(event))
> -				event->pmu->start(event, 0);
> -		}
> +		if (hwc->interrupts == MAX_INTERRUPTS)
> +			perf_event_unthrottle_group(event, is_event_in_freq_mode(event));
>  
>  		if (!is_event_in_freq_mode(event))
>  			continue;
> @@ -6426,14 +6455,6 @@ static void __perf_event_period(struct perf_event *event,
>  	active = (event->state == PERF_EVENT_STATE_ACTIVE);
>  	if (active) {
>  		perf_pmu_disable(event->pmu);
> -		/*
> -		 * We could be throttled; unthrottle now to avoid the tick
> -		 * trying to unthrottle while we already re-started the event.
> -		 */
> -		if (event->hw.interrupts == MAX_INTERRUPTS) {
> -			event->hw.interrupts = 0;
> -			perf_log_throttle(event, 1);
> -		}
>  		event->pmu->stop(event, PERF_EF_UPDATE);
>  	}
>  
> @@ -6441,6 +6462,14 @@ static void __perf_event_period(struct perf_event *event,
>  
>  	if (active) {
>  		event->pmu->start(event, PERF_EF_RELOAD);
> +		/*
> +		 * Once the period is force-reset, the event starts immediately.
> +		 * But the event/group could be throttled. Unthrottle the
> +		 * event/group now to avoid the next tick trying to unthrottle
> +		 * while we already re-started the event/group.
> +		 */
> +		if (event->hw.interrupts == MAX_INTERRUPTS)
> +			perf_event_unthrottle_group(event, true);
>  		perf_pmu_enable(event->pmu);
>  	}
>  }
> @@ -10331,8 +10360,7 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
>  	if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
>  		__this_cpu_inc(perf_throttled_count);
>  		tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
> -		hwc->interrupts = MAX_INTERRUPTS;
> -		perf_log_throttle(event, 0);
> +		perf_event_throttle_group(event);
>  		ret = 1;
>  	}
>  
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 11/16] arc/perf: Remove driver-specific throttle support
  2025-05-16 18:28 ` [PATCH V3 11/16] arc/perf: " kan.liang
@ 2025-05-18 21:58   ` Vineet Gupta
  0 siblings, 0 replies; 24+ messages in thread
From: Vineet Gupta @ 2025-05-18 21:58 UTC (permalink / raw)
  To: kan.liang, peterz, mingo, namhyung, irogers, mark.rutland,
	linux-kernel, linux-perf-users
  Cc: eranian, ctshao, tmricht, leo.yan, linux-snps-arc



On 5/16/25 11:28, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
>
> Throttle support has been added to the generic code. Remove the
> driver-specific throttle support.
>
> Besides the throttle, perf_event_overflow() may return true because of
> the event_limit. In that case it already does an inatomic event
> disable, so the pmu->stop is not required either.
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Cc: Vineet Gupta <vgupta@ikernel.org>
> Cc: linux-snps-arc@lists.infradead.org
Acked-by: Vineet Gupta <vgupta@kernel.org>

Please let me know if you want to take it via the ARC tree.

Thx,
-Vineet

> ---
>  arch/arc/kernel/perf_event.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
> index 6e5a651cd75c..ed6d4f0cd621 100644
> --- a/arch/arc/kernel/perf_event.c
> +++ b/arch/arc/kernel/perf_event.c
> @@ -599,10 +599,8 @@ static irqreturn_t arc_pmu_intr(int irq, void *dev)
>  
>  		arc_perf_event_update(event, &event->hw, event->hw.idx);
>  		perf_sample_data_init(&data, 0, hwc->last_period);
> -		if (arc_pmu_event_set_period(event)) {
> -			if (perf_event_overflow(event, &data, regs))
> -				arc_pmu_stop(event, 0);
> -		}
> +		if (arc_pmu_event_set_period(event))
> +			perf_event_overflow(event, &data, regs);
>  
>  		active_ints &= ~BIT(idx);
>  	} while (active_ints);


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 02/16] perf: Fix the throttle logic for a group
  2025-05-17  8:22   ` Ingo Molnar
@ 2025-05-20 14:16     ` Liang, Kan
  0 siblings, 0 replies; 24+ messages in thread
From: Liang, Kan @ 2025-05-20 14:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht, leo.yan



On 2025-05-17 4:22 a.m., Ingo Molnar wrote:
> 
> * kan.liang@linux.intel.com <kan.liang@linux.intel.com> wrote:
> 
>> The throttle only happens when an event is overflowed. Stop the entire
>> group when any event in the group triggers the throttle.
>> The MAX_INTERRUPTS is set to all throttle events.
> 
> Since this is a relatively long series with a healthy dose of 
> breakage-risk, I'm wondering about bisectability:
> 
>  - patch #2 auto-throttles groups, ie. stops the PMU
> 
>  - patches #3-#16 removes explicit PMU-stop calls.
> 
> In the interim commits, will the double PMU-stop in drivers not updated 
> yet do anything noticeable, such as generate warnings, etc?
> 

The short answer is no.

Here are the details for different ARCHs.

There is an active_mask to track the active counters/events on X86. The
current implementation checks the corresponding bit first. If it is
already cleared, it does nothing, which avoids the double PMU-stop.
I've tested it on my machine.
AMD and Zhaoxin share the same x86_pmu_stop() as Intel. They are OK as
well.

powerpc, S390, ARC, sparc and xtensa utilize the PERF_HES_STOPPED flag
instead. If the flag has been set, they do nothing, which also avoids
the double PMU-stop.

ARM, Apple M1, csky, loongarch and mips invoke disable_event rather
than a PMU stop. The disable_event unconditionally disables the counter
register; it doesn't check whether the register is already disabled.
But I don't think writing a register twice can trigger any issue.

Alpha utilizes the PERF_HES_STOPPED flag, but it seems to still write
the counter register even when it's already disabled, because
cpuc->enabled is used to decide whether to write to the register, and
it is not updated in alpha_pmu_stop(). But again, I don't think writing
a register twice can trigger any issue.

Thanks,
Kan


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 02/16] perf: Fix the throttle logic for a group
  2025-05-18 19:18   ` Namhyung Kim
@ 2025-05-20 14:47     ` Liang, Kan
  2025-05-20 20:02       ` Namhyung Kim
  0 siblings, 1 reply; 24+ messages in thread
From: Liang, Kan @ 2025-05-20 14:47 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: peterz, mingo, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht, leo.yan



On 2025-05-18 3:18 p.m., Namhyung Kim wrote:
> Hi Kan,
> 
> On Fri, May 16, 2025 at 11:28:39AM -0700, kan.liang@linux.intel.com wrote:
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> The current throttle logic doesn't work well with a group, e.g., the
>> following sampling-read case.
>>
>> $ perf record -e "{cycles,cycles}:S" ...
>>
>> $ perf report -D | grep THROTTLE | tail -2
>>             THROTTLE events:        426  ( 9.0%)
>>           UNTHROTTLE events:        425  ( 9.0%)
>>
>> $ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
>> 0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
>> ... sample_read:
>> .... group nr 2
>> ..... id 0000000000000327, value 000000000cbb993a, lost 0
>> ..... id 0000000000000328, value 00000002211c26df, lost 0
>>
>> The second cycles event has a much larger value than the first cycles
>> event in the same group.
>>
>> The current throttle logic in the generic code only logs the THROTTLE
>> event. It relies on the specific driver implementation to disable
>> events. For all ARCHs, the implementation is similar. Only the event is
>> disabled, rather than the group.
>>
>> The logic to disable the group should be generic for all ARCHs. Add the
>> logic in the generic code. The following patch will remove the buggy
>> driver-specific implementation.
>>
>> The throttle only happens when an event is overflowed. Stop the entire
>> group when any event in the group triggers the throttle.
>> The MAX_INTERRUPTS is set to all throttle events.
>>
>> The unthrottled could happen in 3 places.
>> - event/group sched. All events in the group are scheduled one by one.
>>   All of them will be unthrottled eventually. Nothing needs to be
>>   changed.
>> - The perf_adjust_freq_unthr_events for each tick. Needs to restart the
>>   group altogether.
>> - The __perf_event_period(). The whole group needs to be restarted
>>   altogether as well.
>>
>> With the fix,
>> $ sudo perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
>> 0 3573470770332 0x12f5f8 [0x70]: PERF_RECORD_SAMPLE(IP, 0x2):
>> ... sample_read:
>> .... group nr 2
>> ..... id 0000000000000a28, value 00000004fd3dfd8f, lost 0
>> ..... id 0000000000000a29, value 00000004fd3dfd8f, lost 0
> 
> Thanks for working on this!
> 
>>
>> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>>  kernel/events/core.c | 60 ++++++++++++++++++++++++++++++++------------
>>  1 file changed, 44 insertions(+), 16 deletions(-)
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index af78ec118e8f..52490c2ce45b 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -2739,6 +2739,39 @@ void perf_event_disable_inatomic(struct perf_event *event)
>>  static void perf_log_throttle(struct perf_event *event, int enable);
>>  static void perf_log_itrace_start(struct perf_event *event);
>>  
>> +static void perf_event_unthrottle(struct perf_event *event, bool start)
>> +{
>> +	event->hw.interrupts = 0;
>> +	if (start)
>> +		event->pmu->start(event, 0);
>> +	perf_log_throttle(event, 1);
>> +}
>> +
>> +static void perf_event_throttle(struct perf_event *event)
>> +{
>> +	event->pmu->stop(event, 0);
>> +	event->hw.interrupts = MAX_INTERRUPTS;
>> +	perf_log_throttle(event, 0);
>> +}
>> +
>> +static void perf_event_unthrottle_group(struct perf_event *event, bool skip_start_event)
>> +{
>> +	struct perf_event *sibling, *leader = event->group_leader;
>> +
>> +	perf_event_unthrottle(leader, skip_start_event ? leader != event : true);
>> +	for_each_sibling_event(sibling, leader)
>> +		perf_event_unthrottle(sibling, skip_start_event ? sibling != event : true);
> 
> This will add more PERF_RECORD_THROTTLE records for sibling events.

Yes

> Maybe we can generate it for the actual target event only?

The current code cannot track the actual target event for unthrottle,
because MAX_INTERRUPTS is set for all events in perf_event_throttle().

But I think we can emit a PERF_RECORD_THROTTLE record only for the
leader event, which reduces the number of THROTTLE records.

The sample right after the THROTTLE record must be generated by the
actual target event. I think it should be good enough for the perf tool
to locate the event.

I will add the below patch as a separate improvement in V4.

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 52490c2ce45b..cd559501cfbd 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2744,14 +2744,16 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
 	event->hw.interrupts = 0;
 	if (start)
 		event->pmu->start(event, 0);
-	perf_log_throttle(event, 1);
+	if (event == event->group_leader)
+		perf_log_throttle(event, 1);
 }
 
 static void perf_event_throttle(struct perf_event *event)
 {
 	event->pmu->stop(event, 0);
 	event->hw.interrupts = MAX_INTERRUPTS;
-	perf_log_throttle(event, 0);
+	if (event == event->group_leader)
+		perf_log_throttle(event, 0);
 }


> 
> Also the condition for skip_start_event is if it's a freq event.
> I think we can skip pmu->start() if the sibling is also a freq event.

The skip_start_event indicates whether the event will be started later
separately. It is intended to avoid a double start.

In perf_adjust_freq_unthr_events(), only the leader event is adjusted
and started, not the whole group. If we skip pmu->start() for a freq
sibling event, it will not be started until the next context switch.

Thanks,
Kan

> I remember KVM folks concern about the number of PMU accesses as it
> can cause VM exits.
> 
> Thanks,
> Namhyung
> 
>> +}
>> +
>> +static void perf_event_throttle_group(struct perf_event *event)
>> +{
>> +	struct perf_event *sibling, *leader = event->group_leader;
>> +
>> +	perf_event_throttle(leader);
>> +	for_each_sibling_event(sibling, leader)
>> +		perf_event_throttle(sibling);
>> +}
>> +
>>  static int
>>  event_sched_in(struct perf_event *event, struct perf_event_context *ctx)
>>  {
>> @@ -4393,12 +4426,8 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>>  
>>  		hwc = &event->hw;
>>  
>> -		if (hwc->interrupts == MAX_INTERRUPTS) {
>> -			hwc->interrupts = 0;
>> -			perf_log_throttle(event, 1);
>> -			if (!is_event_in_freq_mode(event))
>> -				event->pmu->start(event, 0);
>> -		}
>> +		if (hwc->interrupts == MAX_INTERRUPTS)
>> +			perf_event_unthrottle_group(event, is_event_in_freq_mode(event));
>>  
>>  		if (!is_event_in_freq_mode(event))
>>  			continue;
>> @@ -6426,14 +6455,6 @@ static void __perf_event_period(struct perf_event *event,
>>  	active = (event->state == PERF_EVENT_STATE_ACTIVE);
>>  	if (active) {
>>  		perf_pmu_disable(event->pmu);
>> -		/*
>> -		 * We could be throttled; unthrottle now to avoid the tick
>> -		 * trying to unthrottle while we already re-started the event.
>> -		 */
>> -		if (event->hw.interrupts == MAX_INTERRUPTS) {
>> -			event->hw.interrupts = 0;
>> -			perf_log_throttle(event, 1);
>> -		}
>>  		event->pmu->stop(event, PERF_EF_UPDATE);
>>  	}
>>  
>> @@ -6441,6 +6462,14 @@ static void __perf_event_period(struct perf_event *event,
>>  
>>  	if (active) {
>>  		event->pmu->start(event, PERF_EF_RELOAD);
>> +		/*
>> +		 * Once the period is force-reset, the event starts immediately.
>> +		 * But the event/group could be throttled. Unthrottle the
>> +		 * event/group now to avoid the next tick trying to unthrottle
>> +		 * while we already re-started the event/group.
>> +		 */
>> +		if (event->hw.interrupts == MAX_INTERRUPTS)
>> +			perf_event_unthrottle_group(event, true);
>>  		perf_pmu_enable(event->pmu);
>>  	}
>>  }
>> @@ -10331,8 +10360,7 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
>>  	if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
>>  		__this_cpu_inc(perf_throttled_count);
>>  		tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
>> -		hwc->interrupts = MAX_INTERRUPTS;
>> -		perf_log_throttle(event, 0);
>> +		perf_event_throttle_group(event);
>>  		ret = 1;
>>  	}
>>  
>> -- 
>> 2.38.1
>>
> 


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 02/16] perf: Fix the throttle logic for a group
  2025-05-20 14:47     ` Liang, Kan
@ 2025-05-20 20:02       ` Namhyung Kim
  0 siblings, 0 replies; 24+ messages in thread
From: Namhyung Kim @ 2025-05-20 20:02 UTC (permalink / raw)
  To: Liang, Kan
  Cc: peterz, mingo, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht, leo.yan

On Tue, May 20, 2025 at 10:47:21AM -0400, Liang, Kan wrote:
> 
> 
> On 2025-05-18 3:18 p.m., Namhyung Kim wrote:
> > Hi Kan,
> > 
> > On Fri, May 16, 2025 at 11:28:39AM -0700, kan.liang@linux.intel.com wrote:
> >> From: Kan Liang <kan.liang@linux.intel.com>
> >>
> >> The current throttle logic doesn't work well with a group, e.g., the
> >> following sampling-read case.
> >>
> >> $ perf record -e "{cycles,cycles}:S" ...
> >>
> >> $ perf report -D | grep THROTTLE | tail -2
> >>             THROTTLE events:        426  ( 9.0%)
> >>           UNTHROTTLE events:        425  ( 9.0%)
> >>
> >> $ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
> >> 0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
> >> ... sample_read:
> >> .... group nr 2
> >> ..... id 0000000000000327, value 000000000cbb993a, lost 0
> >> ..... id 0000000000000328, value 00000002211c26df, lost 0
> >>
> >> The second cycles event has a much larger value than the first cycles
> >> event in the same group.
> >>
> >> The current throttle logic in the generic code only logs the THROTTLE
> >> event. It relies on the specific driver implementation to disable
> >> events. For all ARCHs, the implementation is similar. Only the event is
> >> disabled, rather than the group.
> >>
> >> The logic to disable the group should be generic for all ARCHs. Add the
> >> logic in the generic code. The following patch will remove the buggy
> >> driver-specific implementation.
> >>
> >> Throttling only happens when an event overflows. Stop the entire
> >> group when any event in the group triggers the throttle.
> >> MAX_INTERRUPTS is set for all throttled events.
> >>
> >> The unthrottle could happen in 3 places.
> >> - event/group sched. All events in the group are scheduled one by one.
> >>   All of them will be unthrottled eventually. Nothing needs to be
> >>   changed.
> >> - perf_adjust_freq_unthr_events() at each tick. The group needs to be
> >>   restarted altogether.
> >> - The __perf_event_period(). The whole group needs to be restarted
> >>   altogether as well.
> >>
> >> With the fix,
> >> $ sudo perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
> >> 0 3573470770332 0x12f5f8 [0x70]: PERF_RECORD_SAMPLE(IP, 0x2):
> >> ... sample_read:
> >> .... group nr 2
> >> ..... id 0000000000000a28, value 00000004fd3dfd8f, lost 0
> >> ..... id 0000000000000a29, value 00000004fd3dfd8f, lost 0
> > 
> > Thanks for working on this!
> > 
> >>
> >> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> >> ---
> >>  kernel/events/core.c | 60 ++++++++++++++++++++++++++++++++------------
> >>  1 file changed, 44 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/kernel/events/core.c b/kernel/events/core.c
> >> index af78ec118e8f..52490c2ce45b 100644
> >> --- a/kernel/events/core.c
> >> +++ b/kernel/events/core.c
> >> @@ -2739,6 +2739,39 @@ void perf_event_disable_inatomic(struct perf_event *event)
> >>  static void perf_log_throttle(struct perf_event *event, int enable);
> >>  static void perf_log_itrace_start(struct perf_event *event);
> >>  
> >> +static void perf_event_unthrottle(struct perf_event *event, bool start)
> >> +{
> >> +	event->hw.interrupts = 0;
> >> +	if (start)
> >> +		event->pmu->start(event, 0);
> >> +	perf_log_throttle(event, 1);
> >> +}
> >> +
> >> +static void perf_event_throttle(struct perf_event *event)
> >> +{
> >> +	event->pmu->stop(event, 0);
> >> +	event->hw.interrupts = MAX_INTERRUPTS;
> >> +	perf_log_throttle(event, 0);
> >> +}
> >> +
> >> +static void perf_event_unthrottle_group(struct perf_event *event, bool skip_start_event)
> >> +{
> >> +	struct perf_event *sibling, *leader = event->group_leader;
> >> +
> >> +	perf_event_unthrottle(leader, skip_start_event ? leader != event : true);
> >> +	for_each_sibling_event(sibling, leader)
> >> +		perf_event_unthrottle(sibling, skip_start_event ? sibling != event : true);
> > 
> > This will add more PERF_RECORD_THROTTLE records for sibling events.
> 
> Yes
> 
> > Maybe we can generate it for the actual target event only?
> 
> The current code cannot track the actual target event for unthrottle,
> because MAX_INTERRUPTS is set for all events in perf_event_throttle().

Right.

> 
> But I think we can emit a PERF_RECORD_THROTTLE record only for the
> leader event, which reduces the number of THROTTLE records.

Sounds good.

> 
> The sample right after the THROTTLE record must be generated by the
> actual target event. I think it should be good enough for the perf tool
> to locate the event.

IIRC perf tool doesn't track which event is throttled, but yeah, it'd be
possible to use the next sample to locate it.

> 
> I will add the below patch as a separate improvement in V4.
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 52490c2ce45b..cd559501cfbd 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2744,14 +2744,16 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
>  	event->hw.interrupts = 0;
>  	if (start)
>  		event->pmu->start(event, 0);
> -	perf_log_throttle(event, 1);
> +	if (event == event->group_leader)
> +		perf_log_throttle(event, 1);
>  }
> 
>  static void perf_event_throttle(struct perf_event *event)
>  {
>  	event->pmu->stop(event, 0);
>  	event->hw.interrupts = MAX_INTERRUPTS;
> -	perf_log_throttle(event, 0);
> +	if (event == event->group_leader)
> +		perf_log_throttle(event, 0);
>  }

Looks good.

> 
> 
> > 
> > Also the condition for skip_start_event is if it's a freq event.
> > I think we can skip pmu->start() if the sibling is also a freq event.
> 
> The skip_start_event indicates whether the event will be started later
> separately. It is intended to avoid a double start.
> 
> In perf_adjust_freq_unthr_events(), only the leader event is adjusted
> and started, not the whole group. If we skip pmu->start() for a freq
> sibling event, it will not be started until the next context switch.

Oh, I missed that it only has leaders in the active list.

Thanks,
Namhyung



end of thread, other threads:[~2025-05-20 20:02 UTC | newest]

Thread overview: 24+ messages
2025-05-16 18:28 [PATCH V3 00/16] perf: Fix the throttle logic for group kan.liang
2025-05-16 18:28 ` [PATCH V3 01/16] perf: Clean up event in freq mode check kan.liang
2025-05-17 12:59   ` [tip: perf/core] perf/core: Add the is_event_in_freq_mode() helper to simplify the code tip-bot2 for Kan Liang
2025-05-16 18:28 ` [PATCH V3 02/16] perf: Fix the throttle logic for a group kan.liang
2025-05-17  8:22   ` Ingo Molnar
2025-05-20 14:16     ` Liang, Kan
2025-05-18 19:18   ` Namhyung Kim
2025-05-20 14:47     ` Liang, Kan
2025-05-20 20:02       ` Namhyung Kim
2025-05-16 18:28 ` [PATCH V3 03/16] perf/x86/intel: Remove driver-specific throttle support kan.liang
2025-05-16 18:28 ` [PATCH V3 04/16] perf/x86/amd: " kan.liang
2025-05-16 18:28 ` [PATCH V3 05/16] perf/x86/zhaoxin: " kan.liang
2025-05-16 18:28 ` [PATCH V3 06/16] powerpc/perf: " kan.liang
2025-05-16 18:28 ` [PATCH V3 07/16] s390/perf: " kan.liang
2025-05-16 18:28 ` [PATCH V3 08/16] perf/arm: " kan.liang
2025-05-16 18:28 ` [PATCH V3 09/16] perf/apple_m1: " kan.liang
2025-05-16 18:28 ` [PATCH V3 10/16] alpha/perf: " kan.liang
2025-05-16 18:28 ` [PATCH V3 11/16] arc/perf: " kan.liang
2025-05-18 21:58   ` Vineet Gupta
2025-05-16 18:28 ` [PATCH V3 12/16] csky/perf: " kan.liang
2025-05-16 18:28 ` [PATCH V3 13/16] loongarch/perf: " kan.liang
2025-05-16 18:28 ` [PATCH V3 14/16] sparc/perf: " kan.liang
2025-05-16 18:28 ` [PATCH V3 15/16] xtensa/perf: " kan.liang
2025-05-16 18:28 ` [PATCH V3 16/16] mips/perf: " kan.liang
