[RFC PATCH 00/15] perf: Fix the throttle logic for group

linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC PATCH 00/15] perf: Fix the throttle logic for group
@ 2025-05-06 16:47 kan.liang
  2025-05-06 16:47 ` [RFC PATCH 01/15] perf: Fix the throttle logic for a group kan.liang
                   ` (14 more replies)
  0 siblings, 15 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The sampling read doesn't work well with a group.
The issue was originally found by the 'Basic leader sampling test' case
failed on s390.
https://lore.kernel.org/all/20250228062241.303309-1-tmricht@linux.ibm.com/

Stephane debugged it and found it was caused by the throttling logic.
https://lore.kernel.org/all/CABPqkBQzCMNS_PfLZBWVuX9o8Z55PovwJvpVWMWzyeExFJ5R4Q@mail.gmail.com/

The throttle logic is generic and shared by all ARCHs.
It also impacts other ARCHs, e.g., X86.

On an Intel GNR machine,
$ perf record -e "{cycles,cycles}:S" ...

$ perf report -D | grep THROTTLE | tail -2
            THROTTLE events:        426  ( 9.0%)
          UNTHROTTLE events:        425  ( 9.0%)

$ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
... sample_read:
.... group nr 2
..... id 0000000000000327, value 000000000cbb993a, lost 0
..... id 0000000000000328, value 00000002211c26df, lost 0

The patch set tries to provide a generic fix for the group throttle
support. So the buggy driver-specific implementation can be removed.

The patch set is only verified on newer Intel platforms.

Kan Liang (15):
  perf: Fix the throttle logic for a group
  perf/x86/intel: Remove driver-specific throttle support
  perf/x86/amd: Remove driver-specific throttle support
  perf/x86/zhaoxin: Remove driver-specific throttle support
  powerpc/perf: Remove driver-specific throttle support
  s390/perf: Remove driver-specific throttle support
  perf/arm: Remove driver-specific throttle support
  perf/apple_m1: Remove driver-specific throttle support
  alpha/perf: Remove driver-specific throttle support
  arc/perf: Remove driver-specific throttle support
  csky/perf: Remove driver-specific throttle support
  loongarch/perf: Remove driver-specific throttle support
  sparc/perf: Remove driver-specific throttle support
  xtensa/perf: Remove driver-specific throttle support
  mips/perf: Remove driver-specific throttle support

 arch/alpha/kernel/perf_event.c       | 11 ++----
 arch/arc/kernel/perf_event.c         |  6 +--
 arch/csky/kernel/perf_event.c        |  3 +-
 arch/loongarch/kernel/perf_event.c   |  3 +-
 arch/mips/kernel/perf_event_mipsxx.c |  3 +-
 arch/powerpc/perf/core-book3s.c      |  6 +--
 arch/powerpc/perf/core-fsl-emb.c     |  3 +-
 arch/s390/kernel/perf_cpum_cf.c      |  2 -
 arch/s390/kernel/perf_cpum_sf.c      |  5 +--
 arch/sparc/kernel/perf_event.c       |  3 +-
 arch/x86/events/amd/core.c           |  3 +-
 arch/x86/events/amd/ibs.c            |  4 +-
 arch/x86/events/core.c               |  3 +-
 arch/x86/events/intel/core.c         |  6 +--
 arch/x86/events/intel/ds.c           |  7 ++--
 arch/x86/events/intel/knc.c          |  3 +-
 arch/x86/events/intel/p4.c           |  3 +-
 arch/x86/events/zhaoxin/core.c       |  3 +-
 arch/xtensa/kernel/perf_event.c      |  3 +-
 drivers/perf/apple_m1_cpu_pmu.c      |  3 +-
 drivers/perf/arm_pmuv3.c             |  3 +-
 drivers/perf/arm_v6_pmu.c            |  3 +-
 drivers/perf/arm_v7_pmu.c            |  3 +-
 drivers/perf/arm_xscale_pmu.c        |  6 +--
 kernel/events/core.c                 | 55 +++++++++++++++++++++-------
 25 files changed, 72 insertions(+), 81 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 01/15] perf: Fix the throttle logic for a group
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-07 16:52   ` Ian Rogers
  2025-05-13  9:41   ` Peter Zijlstra
  2025-05-06 16:47 ` [RFC PATCH 02/15] perf/x86/intel: Remove driver-specific throttle support kan.liang
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The current throttle logic doesn't work well with a group, e.g., the
following sampling-read case.

$ perf record -e "{cycles,cycles}:S" ...

$ perf report -D | grep THROTTLE | tail -2
            THROTTLE events:        426  ( 9.0%)
          UNTHROTTLE events:        425  ( 9.0%)

$ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
... sample_read:
.... group nr 2
..... id 0000000000000327, value 000000000cbb993a, lost 0
..... id 0000000000000328, value 00000002211c26df, lost 0

The second cycles event has a much larger value than the first cycles
event in the same group.

The current throttle logic in the generic code only logs the THROTTLE
event. It relies on the specific driver implementation to disable
events. However, for all ARCHs, the implementation is similar. It only
disable the event, rather than the group.

The logic to disable the group should be generic for all ARCHs. Add the
logic in the generic code. The following patch will remove the buggy
driver-specific implementation.

The throttle only happens when an event is overflowed. Stop the entire
group when any event in the group triggers the throttle. Set the
MAX_INTERRUPTS to the leader event to indicate the group is throttled.

The unthrottled could happen in 3 places.
- event/group sched. All events in the group are scheduled one by one.
  All of them will be unthrottled eventually. Nothing needs to be
  changed.
- The perf_adjust_freq_unthr_events for each tick. Needs to restart the
  group altogether.
- The __perf_event_period(). The whole group needs to be restarted
  altogether as well.

With the fix,
$ sudo perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
0 3573470770332 0x12f5f8 [0x70]: PERF_RECORD_SAMPLE(IP, 0x2):
... sample_read:
.... group nr 2
..... id 0000000000000a28, value 00000004fd3dfd8f, lost 0
..... id 0000000000000a29, value 00000004fd3dfd8f, lost 0

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 kernel/events/core.c | 55 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 41 insertions(+), 14 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a84abc2b7f20..eb0dc871f4f1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2734,6 +2734,38 @@ void perf_event_disable_inatomic(struct perf_event *event)
 static void perf_log_throttle(struct perf_event *event, int enable);
 static void perf_log_itrace_start(struct perf_event *event);
 
+static void perf_event_group_unthrottle(struct perf_event *event, bool start_event)
+{
+	struct perf_event *leader = event->group_leader;
+	struct perf_event *sibling;
+
+	if (leader != event || start_event)
+		leader->pmu->start(leader, 0);
+	leader->hw.interrupts = 0;
+
+	for_each_sibling_event(sibling, leader) {
+		if (sibling != event || start_event)
+			sibling->pmu->start(sibling, 0);
+		sibling->hw.interrupts = 0;
+	}
+
+	perf_log_throttle(leader, 1);
+}
+
+static void perf_event_group_throttle(struct perf_event *event)
+{
+	struct perf_event *leader = event->group_leader;
+	struct perf_event *sibling;
+
+	leader->hw.interrupts = MAX_INTERRUPTS;
+	leader->pmu->stop(leader, 0);
+
+	for_each_sibling_event(sibling, leader)
+		sibling->pmu->stop(sibling, 0);
+
+	perf_log_throttle(leader, 0);
+}
+
 static int
 event_sched_in(struct perf_event *event, struct perf_event_context *ctx)
 {
@@ -4389,10 +4421,8 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
 		hwc = &event->hw;
 
 		if (hwc->interrupts == MAX_INTERRUPTS) {
-			hwc->interrupts = 0;
-			perf_log_throttle(event, 1);
-			if (!event->attr.freq || !event->attr.sample_freq)
-				event->pmu->start(event, 0);
+			perf_event_group_unthrottle(event,
+				!event->attr.freq || !event->attr.sample_freq);
 		}
 
 		if (!event->attr.freq || !event->attr.sample_freq)
@@ -6421,14 +6451,6 @@ static void __perf_event_period(struct perf_event *event,
 	active = (event->state == PERF_EVENT_STATE_ACTIVE);
 	if (active) {
 		perf_pmu_disable(event->pmu);
-		/*
-		 * We could be throttled; unthrottle now to avoid the tick
-		 * trying to unthrottle while we already re-started the event.
-		 */
-		if (event->hw.interrupts == MAX_INTERRUPTS) {
-			event->hw.interrupts = 0;
-			perf_log_throttle(event, 1);
-		}
 		event->pmu->stop(event, PERF_EF_UPDATE);
 	}
 
@@ -6436,6 +6458,12 @@ static void __perf_event_period(struct perf_event *event,
 
 	if (active) {
 		event->pmu->start(event, PERF_EF_RELOAD);
+		/*
+		 * We could be throttled; unthrottle now to avoid the tick
+		 * trying to unthrottle while we already re-started the event.
+		 */
+		if (event->group_leader->hw.interrupts == MAX_INTERRUPTS)
+			perf_event_group_unthrottle(event, false);
 		perf_pmu_enable(event->pmu);
 	}
 }
@@ -10326,8 +10354,7 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
 	if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
 		__this_cpu_inc(perf_throttled_count);
 		tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
-		hwc->interrupts = MAX_INTERRUPTS;
-		perf_log_throttle(event, 0);
+		perf_event_group_throttle(event);
 		ret = 1;
 	}
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 02/15] perf/x86/intel: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
  2025-05-06 16:47 ` [RFC PATCH 01/15] perf: Fix the throttle logic for a group kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 03/15] perf/x86/amd: " kan.liang
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/core.c       | 3 +--
 arch/x86/events/intel/core.c | 6 ++----
 arch/x86/events/intel/ds.c   | 7 +++----
 arch/x86/events/intel/knc.c  | 3 +--
 arch/x86/events/intel/p4.c   | 3 +--
 5 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 186e31cd0c14..8a2f73333a50 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1730,8 +1730,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
 
 		perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	if (handled)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cd6329207311..3a319cf6d364 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3138,8 +3138,7 @@ static void x86_pmu_handle_guest_pebs(struct pt_regs *regs,
 			continue;
 
 		perf_sample_data_init(data, 0, event->hw.last_period);
-		if (perf_event_overflow(event, data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, data, regs);
 
 		/* Inject one fake event is enough. */
 		break;
@@ -3282,8 +3281,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		if (has_branch_stack(event))
 			intel_pmu_lbr_save_brstack(&data, cpuc, event);
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	return handled;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 61ee698deaab..f33451dc7bc7 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2362,8 +2362,7 @@ __intel_pmu_pebs_last_event(struct perf_event *event,
 		 * All but the last records are processed.
 		 * The last one is left to be able to call the overflow handler.
 		 */
-		if (perf_event_overflow(event, data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, data, regs);
 	}
 
 	if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
@@ -2591,8 +2590,8 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 		if (error[bit]) {
 			perf_log_lost_samples(event, error[bit]);
 
-			if (iregs && perf_event_account_interrupt(event))
-				x86_pmu_stop(event, 0);
+			if (iregs)
+				perf_event_account_interrupt(event);
 		}
 
 		if (counts[bit]) {
diff --git a/arch/x86/events/intel/knc.c b/arch/x86/events/intel/knc.c
index 3e8ec049b46d..384589168c1a 100644
--- a/arch/x86/events/intel/knc.c
+++ b/arch/x86/events/intel/knc.c
@@ -254,8 +254,7 @@ static int knc_pmu_handle_irq(struct pt_regs *regs)
 
 		perf_sample_data_init(&data, 0, last_period);
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index c85a9fc44355..126d5ae264cb 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -1072,8 +1072,7 @@ static int p4_pmu_handle_irq(struct pt_regs *regs)
 			continue;
 
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	if (handled)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 03/15] perf/x86/amd: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
  2025-05-06 16:47 ` [RFC PATCH 01/15] perf: Fix the throttle logic for a group kan.liang
  2025-05-06 16:47 ` [RFC PATCH 02/15] perf/x86/intel: Remove driver-specific throttle support kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-13  8:44   ` Ravi Bangoria
  2025-05-06 16:47 ` [RFC PATCH 04/15] perf/x86/zhaoxin: " kan.liang
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Sandipan Das, Ravi Bangoria

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
---
 arch/x86/events/amd/core.c | 3 +--
 arch/x86/events/amd/ibs.c  | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 30d6ceb4c8ad..5e64283b9bf2 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1003,8 +1003,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
 
 		perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 0252b7ea8bca..4bbbca02aeb1 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1373,9 +1373,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
 		hwc->sample_period = perf_ibs->min_period;
 
 out:
-	if (throttle) {
-		perf_ibs_stop(event, 0);
-	} else {
+	if (!throttle) {
 		if (perf_ibs == &perf_ibs_op) {
 			if (ibs_caps & IBS_CAPS_OPCNTEXT) {
 				new_config = period & IBS_OP_MAX_CNT_EXT_MASK;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 04/15] perf/x86/zhaoxin: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (2 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 03/15] perf/x86/amd: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 05/15] powerpc/perf: " kan.liang
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, silviazhao, CodyYao-oc

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: silviazhao <silviazhao-oc@zhaoxin.com>
Cc: CodyYao-oc <CodyYao-oc@zhaoxin.com>
---
 arch/x86/events/zhaoxin/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
index 2fd9b0cf9a5e..49a5944fac63 100644
--- a/arch/x86/events/zhaoxin/core.c
+++ b/arch/x86/events/zhaoxin/core.c
@@ -397,8 +397,7 @@ static int zhaoxin_pmu_handle_irq(struct pt_regs *regs)
 		if (!x86_perf_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			x86_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 05/15] powerpc/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (3 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 04/15] perf/x86/zhaoxin: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 06/15] s390/perf: " kan.liang
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Athira Rajeev,
	Madhavan Srinivasan

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
---
 arch/powerpc/perf/core-book3s.c  | 6 ++----
 arch/powerpc/perf/core-fsl-emb.c | 3 +--
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 42ff4d167acc..8b0081441f85 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2344,12 +2344,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 			ppmu->get_mem_weight(&data.weight.full, event->attr.sample_type);
 			data.sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
 		}
-		if (perf_event_overflow(event, &data, regs))
-			power_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	} else if (period) {
 		/* Account for interrupt in case of invalid SIAR */
-		if (perf_event_account_interrupt(event))
-			power_pmu_stop(event, 0);
+		perf_event_account_interrupt(event);
 	}
 }
 
diff --git a/arch/powerpc/perf/core-fsl-emb.c b/arch/powerpc/perf/core-fsl-emb.c
index d2ffcc7021c5..7120ab20cbfe 100644
--- a/arch/powerpc/perf/core-fsl-emb.c
+++ b/arch/powerpc/perf/core-fsl-emb.c
@@ -635,8 +635,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 
 		perf_sample_data_init(&data, 0, last_period);
 
-		if (perf_event_overflow(event, &data, regs))
-			fsl_emb_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 }
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 06/15] s390/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (4 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 05/15] powerpc/perf: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 07/15] perf/arm: " kan.liang
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
---
 arch/s390/kernel/perf_cpum_cf.c | 2 --
 arch/s390/kernel/perf_cpum_sf.c | 5 +----
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index e657fad7e376..6a262e198e35 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -980,8 +980,6 @@ static int cfdiag_push_sample(struct perf_event *event,
 	}
 
 	overflow = perf_event_overflow(event, &data, &regs);
-	if (overflow)
-		event->pmu->stop(event, 0);
 
 	perf_event_update_userpage(event);
 	return overflow;
diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index ad22799d8a7d..91469401f2c9 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -1072,10 +1072,7 @@ static int perf_push_sample(struct perf_event *event,
 	overflow = 0;
 	if (perf_event_exclude(event, &regs, sde_regs))
 		goto out;
-	if (perf_event_overflow(event, &data, &regs)) {
-		overflow = 1;
-		event->pmu->stop(event, 0);
-	}
+	overflow = perf_event_overflow(event, &data, &regs);
 	perf_event_update_userpage(event);
 out:
 	return overflow;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 07/15] perf/arm: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (5 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 06/15] s390/perf: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 08/15] perf/apple_m1: " kan.liang
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Rob Herring,
	Vincenzo Frascino, Will Deacon

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Rob Herring (Arm) <robh@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 drivers/perf/arm_pmuv3.c      | 3 +--
 drivers/perf/arm_v6_pmu.c     | 3 +--
 drivers/perf/arm_v7_pmu.c     | 3 +--
 drivers/perf/arm_xscale_pmu.c | 6 ++----
 4 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index e506d59654e7..3db9f4ed17e8 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -887,8 +887,7 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		 * an irq_work which will be taken care of in the handling of
 		 * IPI_IRQ_WORK.
 		 */
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 	armv8pmu_start(cpu_pmu);
 
diff --git a/drivers/perf/arm_v6_pmu.c b/drivers/perf/arm_v6_pmu.c
index b09615bb2bb2..7cb12c8e06c7 100644
--- a/drivers/perf/arm_v6_pmu.c
+++ b/drivers/perf/arm_v6_pmu.c
@@ -276,8 +276,7 @@ armv6pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
diff --git a/drivers/perf/arm_v7_pmu.c b/drivers/perf/arm_v7_pmu.c
index 17831e1920bd..a1e438101114 100644
--- a/drivers/perf/arm_v7_pmu.c
+++ b/drivers/perf/arm_v7_pmu.c
@@ -930,8 +930,7 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	/*
diff --git a/drivers/perf/arm_xscale_pmu.c b/drivers/perf/arm_xscale_pmu.c
index 638fea9b1263..c2ac41dd9e19 100644
--- a/drivers/perf/arm_xscale_pmu.c
+++ b/drivers/perf/arm_xscale_pmu.c
@@ -186,8 +186,7 @@ xscale1pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	irq_work_run();
@@ -519,8 +518,7 @@ xscale2pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			cpu_pmu->disable(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	irq_work_run();
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 08/15] perf/apple_m1: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (6 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 07/15] perf/arm: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 09/15] alpha/perf: " kan.liang
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Oliver Upton

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
---
 drivers/perf/apple_m1_cpu_pmu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
index df9a28ba69dc..81b6f1a62349 100644
--- a/drivers/perf/apple_m1_cpu_pmu.c
+++ b/drivers/perf/apple_m1_cpu_pmu.c
@@ -474,8 +474,7 @@ static irqreturn_t m1_pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			m1_pmu_disable_event(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	cpu_pmu->start(cpu_pmu);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 09/15] alpha/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (7 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 08/15] perf/apple_m1: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 10/15] arc/perf: " kan.liang
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, linux-alpha

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-alpha@vger.kernel.org
---
 arch/alpha/kernel/perf_event.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/alpha/kernel/perf_event.c b/arch/alpha/kernel/perf_event.c
index 1f0eb4f25c0f..a3eaab094ece 100644
--- a/arch/alpha/kernel/perf_event.c
+++ b/arch/alpha/kernel/perf_event.c
@@ -852,14 +852,9 @@ static void alpha_perf_event_irq_handler(unsigned long la_ptr,
 	alpha_perf_event_update(event, hwc, idx, alpha_pmu->pmc_max_period[idx]+1);
 	perf_sample_data_init(&data, 0, hwc->last_period);
 
-	if (alpha_perf_event_set_period(event, hwc, idx)) {
-		if (perf_event_overflow(event, &data, regs)) {
-			/* Interrupts coming too quickly; "throttle" the
-			 * counter, i.e., disable it for a little while.
-			 */
-			alpha_pmu_stop(event, 0);
-		}
-	}
+	if (alpha_perf_event_set_period(event, hwc, idx))
+		perf_event_overflow(event, &data, regs);
+
 	wrperfmon(PERFMON_CMD_ENABLE, cpuc->idx_mask);
 
 	return;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 10/15] arc/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (8 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 09/15] alpha/perf: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 11/15] csky/perf: " kan.liang
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Vineet Gupta

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Vineet Gupta <vgupta@ikernel.org>
---
 arch/arc/kernel/perf_event.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 6e5a651cd75c..ed6d4f0cd621 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -599,10 +599,8 @@ static irqreturn_t arc_pmu_intr(int irq, void *dev)
 
 		arc_perf_event_update(event, &event->hw, event->hw.idx);
 		perf_sample_data_init(&data, 0, hwc->last_period);
-		if (arc_pmu_event_set_period(event)) {
-			if (perf_event_overflow(event, &data, regs))
-				arc_pmu_stop(event, 0);
-		}
+		if (arc_pmu_event_set_period(event))
+			perf_event_overflow(event, &data, regs);
 
 		active_ints &= ~BIT(idx);
 	} while (active_ints);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 11/15] csky/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (9 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 10/15] arc/perf: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 12/15] loongarch/perf: " kan.liang
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Mao Han, Guo Ren

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Mao Han <han_mao@c-sky.com>
Cc: Guo Ren <ren_guo@c-sky.com>
---
 arch/csky/kernel/perf_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/csky/kernel/perf_event.c b/arch/csky/kernel/perf_event.c
index e5f18420ce64..e0a36acd265b 100644
--- a/arch/csky/kernel/perf_event.c
+++ b/arch/csky/kernel/perf_event.c
@@ -1139,8 +1139,7 @@ static irqreturn_t csky_pmu_handle_irq(int irq_num, void *dev)
 		perf_sample_data_init(&data, 0, hwc->last_period);
 		csky_pmu_event_set_period(event);
 
-		if (perf_event_overflow(event, &data, regs))
-			csky_pmu_stop_event(event);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	csky_pmu_enable(&csky_pmu.pmu);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 12/15] loongarch/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (10 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 11/15] csky/perf: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-07  1:17   ` 陈华才
  2025-05-06 16:47 ` [RFC PATCH 13/15] sparc/perf: " kan.liang
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Bibo Mao, Huacai Chen

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/kernel/perf_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/loongarch/kernel/perf_event.c b/arch/loongarch/kernel/perf_event.c
index f86a4b838dd7..8ad098703488 100644
--- a/arch/loongarch/kernel/perf_event.c
+++ b/arch/loongarch/kernel/perf_event.c
@@ -479,8 +479,7 @@ static void handle_associated_event(struct cpu_hw_events *cpuc, int idx,
 	if (!loongarch_pmu_event_set_period(event, hwc, idx))
 		return;
 
-	if (perf_event_overflow(event, data, regs))
-		loongarch_pmu_disable_event(idx);
+	perf_event_overflow(event, data, regs);
 }
 
 static irqreturn_t pmu_handle_irq(int irq, void *dev)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 13/15] sparc/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (11 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 12/15] loongarch/perf: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-06 16:47 ` [RFC PATCH 14/15] xtensa/perf: " kan.liang
  2025-05-06 16:47 ` [RFC PATCH 15/15] mips/perf: " kan.liang
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, David S . Miller

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
---
 arch/sparc/kernel/perf_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index f02a283a8e8f..cae4d33002a5 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1668,8 +1668,7 @@ static int __kprobes perf_event_nmi_handler(struct notifier_block *self,
 		if (!sparc_perf_event_set_period(event, hwc, idx))
 			continue;
 
-		if (perf_event_overflow(event, &data, regs))
-			sparc_pmu_stop(event, 0);
+		perf_event_overflow(event, &data, regs);
 	}
 
 	finish_clock = sched_clock();
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 14/15] xtensa/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (12 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 13/15] sparc/perf: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  2025-05-11 20:43   ` Max Filippov
  2025-05-06 16:47 ` [RFC PATCH 15/15] mips/perf: " kan.liang
  14 siblings, 1 reply; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Max Filippov

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
---
 arch/xtensa/kernel/perf_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/xtensa/kernel/perf_event.c b/arch/xtensa/kernel/perf_event.c
index 183618090d05..223f1d452310 100644
--- a/arch/xtensa/kernel/perf_event.c
+++ b/arch/xtensa/kernel/perf_event.c
@@ -388,8 +388,7 @@ irqreturn_t xtensa_pmu_irq_handler(int irq, void *dev_id)
 			struct pt_regs *regs = get_irq_regs();
 
 			perf_sample_data_init(&data, 0, last_period);
-			if (perf_event_overflow(event, &data, regs))
-				xtensa_pmu_stop(event, 0);
+			perf_event_overflow(event, &data, regs);
 		}
 
 		rc = IRQ_HANDLED;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 15/15] mips/perf: Remove driver-specific throttle support
  2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
                   ` (13 preceding siblings ...)
  2025-05-06 16:47 ` [RFC PATCH 14/15] xtensa/perf: " kan.liang
@ 2025-05-06 16:47 ` kan.liang
  14 siblings, 0 replies; 25+ messages in thread
From: kan.liang @ 2025-05-06 16:47 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Thomas Bogendoerfer

From: Kan Liang <kan.liang@linux.intel.com>

The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
---
 arch/mips/kernel/perf_event_mipsxx.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index c4d6b09136b1..196a070349b0 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -791,8 +791,7 @@ static void handle_associated_event(struct cpu_hw_events *cpuc,
 	if (!mipspmu_event_set_period(event, hwc, idx))
 		return;
 
-	if (perf_event_overflow(event, data, regs))
-		mipsxx_pmu_disable_event(idx);
+	perf_event_overflow(event, data, regs);
 }
 
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 12/15] loongarch/perf: Remove driver-specific throttle support
  2025-05-06 16:47 ` [RFC PATCH 12/15] loongarch/perf: " kan.liang
@ 2025-05-07  1:17   ` 陈华才
  2025-05-07 13:45     ` Liang, Kan
  0 siblings, 1 reply; 25+ messages in thread
From: 陈华才 @ 2025-05-07  1:17 UTC (permalink / raw)
  To: kan.liang
  Cc: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht, Bibo Mao, loongarch

Hi, Liang,

Since which version we need this patch? I mean which versions this patch should be backported.
And you need to CC loongarch@lists.linux.dev.

Huacai

> -----原始邮件-----
> 发件人: kan.liang@linux.intel.com
> 发送时间:2025-05-07 00:47:37 (星期三)
> 收件人: peterz@infradead.org, mingo@redhat.com, namhyung@kernel.org, irogers@google.com, mark.rutland@arm.com, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
> 抄送: eranian@google.com, ctshao@google.com, tmricht@linux.ibm.com, "Kan Liang" <kan.liang@linux.intel.com>, "Bibo Mao" <maobibo@loongson.cn>, "Huacai Chen" <chenhuacai@loongson.cn>
> 主题: [RFC PATCH 12/15] loongarch/perf: Remove driver-specific throttle support
> 
> From: Kan Liang <kan.liang@linux.intel.com>
> 
> The throttle support has been added in the generic code. Remove
> the driver-specific throttle support.
> 
> Besides the throttle, perf_event_overflow may return true because of
> event_limit. It already does an inatomic event disable. The pmu->stop
> is not required either.
> 
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Cc: Bibo Mao <maobibo@loongson.cn>
> Cc: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  arch/loongarch/kernel/perf_event.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/loongarch/kernel/perf_event.c b/arch/loongarch/kernel/perf_event.c
> index f86a4b838dd7..8ad098703488 100644
> --- a/arch/loongarch/kernel/perf_event.c
> +++ b/arch/loongarch/kernel/perf_event.c
> @@ -479,8 +479,7 @@ static void handle_associated_event(struct cpu_hw_events *cpuc, int idx,
>  	if (!loongarch_pmu_event_set_period(event, hwc, idx))
>  		return;
>  
> -	if (perf_event_overflow(event, data, regs))
> -		loongarch_pmu_disable_event(idx);
> +	perf_event_overflow(event, data, regs);
>  }
>  
>  static irqreturn_t pmu_handle_irq(int irq, void *dev)
> -- 
> 2.38.1

本邮件及其附件含有龙芯中科的商业秘密信息，仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用（包括但不限于全部或部分地泄露、复制或散发）本邮件及其附件中的信息。如果您错收本邮件，请您立即电话或邮件通知发件人并删除本邮件。 
This email and its attachments contain confidential information from Loongson Technology , which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it. 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 12/15] loongarch/perf: Remove driver-specific throttle support
  2025-05-07  1:17   ` 陈华才
@ 2025-05-07 13:45     ` Liang, Kan
  0 siblings, 0 replies; 25+ messages in thread
From: Liang, Kan @ 2025-05-07 13:45 UTC (permalink / raw)
  To: 陈华才
  Cc: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht, Bibo Mao, loongarch



On 2025-05-06 9:17 p.m., 陈华才 wrote:
> Hi, Liang,
> 
> Since which version we need this patch? I mean which versions this patch should be backported.

The generic change hasn't been merged. It's still under review.
https://lore.kernel.org/lkml/20250506164740.1317237-2-kan.liang@linux.intel.com/

The driver-specific change should be applied after the generic change is
merged.

> And you need to CC loongarch@lists.linux.dev.
> 

Sure.

Thanks,
Kan

> Huacai
> 
> 
>> -----原始邮件-----
>> 发件人: kan.liang@linux.intel.com
>> 发送时间:2025-05-07 00:47:37 (星期三)
>> 收件人: peterz@infradead.org, mingo@redhat.com, namhyung@kernel.org, irogers@google.com, mark.rutland@arm.com, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
>> 抄送: eranian@google.com, ctshao@google.com, tmricht@linux.ibm.com, "Kan Liang" <kan.liang@linux.intel.com>, "Bibo Mao" <maobibo@loongson.cn>, "Huacai Chen" <chenhuacai@loongson.cn>
>> 主题: [RFC PATCH 12/15] loongarch/perf: Remove driver-specific throttle support
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> The throttle support has been added in the generic code. Remove
>> the driver-specific throttle support.
>>
>> Besides the throttle, perf_event_overflow may return true because of
>> event_limit. It already does an inatomic event disable. The pmu->stop
>> is not required either.
>>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> Cc: Bibo Mao <maobibo@loongson.cn>
>> Cc: Huacai Chen <chenhuacai@loongson.cn>
>> ---
>>  arch/loongarch/kernel/perf_event.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/arch/loongarch/kernel/perf_event.c b/arch/loongarch/kernel/perf_event.c
>> index f86a4b838dd7..8ad098703488 100644
>> --- a/arch/loongarch/kernel/perf_event.c
>> +++ b/arch/loongarch/kernel/perf_event.c
>> @@ -479,8 +479,7 @@ static void handle_associated_event(struct cpu_hw_events *cpuc, int idx,
>>  	if (!loongarch_pmu_event_set_period(event, hwc, idx))
>>  		return;
>>  
>> -	if (perf_event_overflow(event, data, regs))
>> -		loongarch_pmu_disable_event(idx);
>> +	perf_event_overflow(event, data, regs);
>>  }
>>  
>>  static irqreturn_t pmu_handle_irq(int irq, void *dev)
>> -- 
>> 2.38.1
> 
> 
> 本邮件及其附件含有龙芯中科的商业秘密信息，仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用（包括但不限于全部或部分地泄露、复制或散发）本邮件及其附件中的信息。如果您错收本邮件，请您立即电话或邮件通知发件人并删除本邮件。 
> This email and its attachments contain confidential information from Loongson Technology , which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it. 
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 01/15] perf: Fix the throttle logic for a group
  2025-05-06 16:47 ` [RFC PATCH 01/15] perf: Fix the throttle logic for a group kan.liang
@ 2025-05-07 16:52   ` Ian Rogers
  2025-05-07 19:00     ` Liang, Kan
  2025-05-13  9:41   ` Peter Zijlstra
  1 sibling, 1 reply; 25+ messages in thread
From: Ian Rogers @ 2025-05-07 16:52 UTC (permalink / raw)
  To: kan.liang
  Cc: peterz, mingo, namhyung, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht

On Tue, May 6, 2025 at 9:48 AM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> The current throttle logic doesn't work well with a group, e.g., the
> following sampling-read case.
>
> $ perf record -e "{cycles,cycles}:S" ...
>
> $ perf report -D | grep THROTTLE | tail -2
>             THROTTLE events:        426  ( 9.0%)
>           UNTHROTTLE events:        425  ( 9.0%)
>
> $ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
> 0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
> ... sample_read:
> .... group nr 2
> ..... id 0000000000000327, value 000000000cbb993a, lost 0
> ..... id 0000000000000328, value 00000002211c26df, lost 0
>
> The second cycles event has a much larger value than the first cycles
> event in the same group.
>
> The current throttle logic in the generic code only logs the THROTTLE
> event. It relies on the specific driver implementation to disable
> events. However, for all ARCHs, the implementation is similar. It only
> disable the event, rather than the group.
>
> The logic to disable the group should be generic for all ARCHs. Add the
> logic in the generic code. The following patch will remove the buggy
> driver-specific implementation.
>
> The throttle only happens when an event is overflowed. Stop the entire
> group when any event in the group triggers the throttle. Set the
> MAX_INTERRUPTS to the leader event to indicate the group is throttled.
>
> The unthrottled could happen in 3 places.
> - event/group sched. All events in the group are scheduled one by one.
>   All of them will be unthrottled eventually. Nothing needs to be
>   changed.
> - The perf_adjust_freq_unthr_events for each tick. Needs to restart the
>   group altogether.
> - The __perf_event_period(). The whole group needs to be restarted
>   altogether as well.
>
> With the fix,
> $ sudo perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
> 0 3573470770332 0x12f5f8 [0x70]: PERF_RECORD_SAMPLE(IP, 0x2):
> ... sample_read:
> .... group nr 2
> ..... id 0000000000000a28, value 00000004fd3dfd8f, lost 0
> ..... id 0000000000000a29, value 00000004fd3dfd8f, lost 0

Thanks Kan! The patches look good to me. As I understand it patches 2
to 15 are just removing the logic where an event is unnecessarily
stopped twice, so is it possible to test just this patch in isolation?
Given the logic is generic it is applied to software events, so you
should be able to repeat the problem with `perf record -e
"{cpu-clock,cpu-clock}:S" ...` possibly by reducing the period or
increasing the frequency. This would be nice to show that it fixes the
problem more generically than just the Intel PMU.

Ian

> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> ---
>  kernel/events/core.c | 55 +++++++++++++++++++++++++++++++++-----------
>  1 file changed, 41 insertions(+), 14 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index a84abc2b7f20..eb0dc871f4f1 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2734,6 +2734,38 @@ void perf_event_disable_inatomic(struct perf_event *event)
>  static void perf_log_throttle(struct perf_event *event, int enable);
>  static void perf_log_itrace_start(struct perf_event *event);
>
> +static void perf_event_group_unthrottle(struct perf_event *event, bool start_event)
> +{
> +       struct perf_event *leader = event->group_leader;
> +       struct perf_event *sibling;
> +
> +       if (leader != event || start_event)
> +               leader->pmu->start(leader, 0);
> +       leader->hw.interrupts = 0;
> +
> +       for_each_sibling_event(sibling, leader) {
> +               if (sibling != event || start_event)
> +                       sibling->pmu->start(sibling, 0);
> +               sibling->hw.interrupts = 0;
> +       }
> +
> +       perf_log_throttle(leader, 1);
> +}
> +
> +static void perf_event_group_throttle(struct perf_event *event)
> +{
> +       struct perf_event *leader = event->group_leader;
> +       struct perf_event *sibling;
> +
> +       leader->hw.interrupts = MAX_INTERRUPTS;
> +       leader->pmu->stop(leader, 0);
> +
> +       for_each_sibling_event(sibling, leader)
> +               sibling->pmu->stop(sibling, 0);
> +
> +       perf_log_throttle(leader, 0);
> +}
> +
>  static int
>  event_sched_in(struct perf_event *event, struct perf_event_context *ctx)
>  {
> @@ -4389,10 +4421,8 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>                 hwc = &event->hw;
>
>                 if (hwc->interrupts == MAX_INTERRUPTS) {
> -                       hwc->interrupts = 0;
> -                       perf_log_throttle(event, 1);
> -                       if (!event->attr.freq || !event->attr.sample_freq)
> -                               event->pmu->start(event, 0);
> +                       perf_event_group_unthrottle(event,
> +                               !event->attr.freq || !event->attr.sample_freq);
>                 }
>
>                 if (!event->attr.freq || !event->attr.sample_freq)
> @@ -6421,14 +6451,6 @@ static void __perf_event_period(struct perf_event *event,
>         active = (event->state == PERF_EVENT_STATE_ACTIVE);
>         if (active) {
>                 perf_pmu_disable(event->pmu);
> -               /*
> -                * We could be throttled; unthrottle now to avoid the tick
> -                * trying to unthrottle while we already re-started the event.
> -                */
> -               if (event->hw.interrupts == MAX_INTERRUPTS) {
> -                       event->hw.interrupts = 0;
> -                       perf_log_throttle(event, 1);
> -               }
>                 event->pmu->stop(event, PERF_EF_UPDATE);
>         }
>
> @@ -6436,6 +6458,12 @@ static void __perf_event_period(struct perf_event *event,
>
>         if (active) {
>                 event->pmu->start(event, PERF_EF_RELOAD);
> +               /*
> +                * We could be throttled; unthrottle now to avoid the tick
> +                * trying to unthrottle while we already re-started the event.
> +                */
> +               if (event->group_leader->hw.interrupts == MAX_INTERRUPTS)
> +                       perf_event_group_unthrottle(event, false);
>                 perf_pmu_enable(event->pmu);
>         }
>  }
> @@ -10326,8 +10354,7 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
>         if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
>                 __this_cpu_inc(perf_throttled_count);
>                 tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
> -               hwc->interrupts = MAX_INTERRUPTS;
> -               perf_log_throttle(event, 0);
> +               perf_event_group_throttle(event);
>                 ret = 1;
>         }
>
> --
> 2.38.1
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 01/15] perf: Fix the throttle logic for a group
  2025-05-07 16:52   ` Ian Rogers
@ 2025-05-07 19:00     ` Liang, Kan
  0 siblings, 0 replies; 25+ messages in thread
From: Liang, Kan @ 2025-05-07 19:00 UTC (permalink / raw)
  To: Ian Rogers
  Cc: peterz, mingo, namhyung, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht



On 2025-05-07 12:52 p.m., Ian Rogers wrote:
> On Tue, May 6, 2025 at 9:48 AM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> The current throttle logic doesn't work well with a group, e.g., the
>> following sampling-read case.
>>
>> $ perf record -e "{cycles,cycles}:S" ...
>>
>> $ perf report -D | grep THROTTLE | tail -2
>>             THROTTLE events:        426  ( 9.0%)
>>           UNTHROTTLE events:        425  ( 9.0%)
>>
>> $ perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
>> 0 1020120874009167 0x74970 [0x68]: PERF_RECORD_SAMPLE(IP, 0x1):
>> ... sample_read:
>> .... group nr 2
>> ..... id 0000000000000327, value 000000000cbb993a, lost 0
>> ..... id 0000000000000328, value 00000002211c26df, lost 0
>>
>> The second cycles event has a much larger value than the first cycles
>> event in the same group.
>>
>> The current throttle logic in the generic code only logs the THROTTLE
>> event. It relies on the specific driver implementation to disable
>> events. However, for all ARCHs, the implementation is similar. It only
>> disable the event, rather than the group.
>>
>> The logic to disable the group should be generic for all ARCHs. Add the
>> logic in the generic code. The following patch will remove the buggy
>> driver-specific implementation.
>>
>> The throttle only happens when an event is overflowed. Stop the entire
>> group when any event in the group triggers the throttle. Set the
>> MAX_INTERRUPTS to the leader event to indicate the group is throttled.
>>
>> The unthrottled could happen in 3 places.
>> - event/group sched. All events in the group are scheduled one by one.
>>   All of them will be unthrottled eventually. Nothing needs to be
>>   changed.
>> - The perf_adjust_freq_unthr_events for each tick. Needs to restart the
>>   group altogether.
>> - The __perf_event_period(). The whole group needs to be restarted
>>   altogether as well.
>>
>> With the fix,
>> $ sudo perf report -D | grep PERF_RECORD_SAMPLE -a4 | tail -n 5
>> 0 3573470770332 0x12f5f8 [0x70]: PERF_RECORD_SAMPLE(IP, 0x2):
>> ... sample_read:
>> .... group nr 2
>> ..... id 0000000000000a28, value 00000004fd3dfd8f, lost 0
>> ..... id 0000000000000a29, value 00000004fd3dfd8f, lost 0
> 
> Thanks Kan! The patches look good to me. As I understand it patches 2
> to 15 are just removing the logic where an event is unnecessarily
> stopped twice, so is it possible to test just this patch in isolation?
> Given the logic is generic it is applied to software events, so you
> should be able to repeat the problem with `perf record -e
> "{cpu-clock,cpu-clock}:S" ...` possibly by reducing the period or
> increasing the frequency. 

I don't think the exact same value between the two cpu-clock events in a
group can be got.
The SW clock events work differently. It relies on per-event hrtimer
rather than the overflow interrupt. I don't think there is a global
control for SW events. Strictly, it doesn't support "group".

The above result should only be observed on a PMU, e.g., Intel PMU, that
strictly supports "group".

For a PMU which doesn't strictly support "group", the patch should just
minimize the impact from the throttle logic. It's hard to demonstrate.

> This would be nice to show that it fixes the
> problem more generically than just the Intel PMU.

I only have Intel machines.
Hope people from AMD, ARM, or POWER can give it a try.

Thanks,
Kan>
> Ian
> 
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>>  kernel/events/core.c | 55 +++++++++++++++++++++++++++++++++-----------
>>  1 file changed, 41 insertions(+), 14 deletions(-)
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index a84abc2b7f20..eb0dc871f4f1 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -2734,6 +2734,38 @@ void perf_event_disable_inatomic(struct perf_event *event)
>>  static void perf_log_throttle(struct perf_event *event, int enable);
>>  static void perf_log_itrace_start(struct perf_event *event);
>>
>> +static void perf_event_group_unthrottle(struct perf_event *event, bool start_event)
>> +{
>> +       struct perf_event *leader = event->group_leader;
>> +       struct perf_event *sibling;
>> +
>> +       if (leader != event || start_event)
>> +               leader->pmu->start(leader, 0);
>> +       leader->hw.interrupts = 0;
>> +
>> +       for_each_sibling_event(sibling, leader) {
>> +               if (sibling != event || start_event)
>> +                       sibling->pmu->start(sibling, 0);
>> +               sibling->hw.interrupts = 0;
>> +       }
>> +
>> +       perf_log_throttle(leader, 1);
>> +}
>> +
>> +static void perf_event_group_throttle(struct perf_event *event)
>> +{
>> +       struct perf_event *leader = event->group_leader;
>> +       struct perf_event *sibling;
>> +
>> +       leader->hw.interrupts = MAX_INTERRUPTS;
>> +       leader->pmu->stop(leader, 0);
>> +
>> +       for_each_sibling_event(sibling, leader)
>> +               sibling->pmu->stop(sibling, 0);
>> +
>> +       perf_log_throttle(leader, 0);
>> +}
>> +
>>  static int
>>  event_sched_in(struct perf_event *event, struct perf_event_context *ctx)
>>  {
>> @@ -4389,10 +4421,8 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>>                 hwc = &event->hw;
>>
>>                 if (hwc->interrupts == MAX_INTERRUPTS) {
>> -                       hwc->interrupts = 0;
>> -                       perf_log_throttle(event, 1);
>> -                       if (!event->attr.freq || !event->attr.sample_freq)
>> -                               event->pmu->start(event, 0);
>> +                       perf_event_group_unthrottle(event,
>> +                               !event->attr.freq || !event->attr.sample_freq);
>>                 }
>>
>>                 if (!event->attr.freq || !event->attr.sample_freq)
>> @@ -6421,14 +6451,6 @@ static void __perf_event_period(struct perf_event *event,
>>         active = (event->state == PERF_EVENT_STATE_ACTIVE);
>>         if (active) {
>>                 perf_pmu_disable(event->pmu);
>> -               /*
>> -                * We could be throttled; unthrottle now to avoid the tick
>> -                * trying to unthrottle while we already re-started the event.
>> -                */
>> -               if (event->hw.interrupts == MAX_INTERRUPTS) {
>> -                       event->hw.interrupts = 0;
>> -                       perf_log_throttle(event, 1);
>> -               }
>>                 event->pmu->stop(event, PERF_EF_UPDATE);
>>         }
>>
>> @@ -6436,6 +6458,12 @@ static void __perf_event_period(struct perf_event *event,
>>
>>         if (active) {
>>                 event->pmu->start(event, PERF_EF_RELOAD);
>> +               /*
>> +                * We could be throttled; unthrottle now to avoid the tick
>> +                * trying to unthrottle while we already re-started the event.
>> +                */
>> +               if (event->group_leader->hw.interrupts == MAX_INTERRUPTS)
>> +                       perf_event_group_unthrottle(event, false);
>>                 perf_pmu_enable(event->pmu);
>>         }
>>  }
>> @@ -10326,8 +10354,7 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
>>         if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
>>                 __this_cpu_inc(perf_throttled_count);
>>                 tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
>> -               hwc->interrupts = MAX_INTERRUPTS;
>> -               perf_log_throttle(event, 0);
>> +               perf_event_group_throttle(event);
>>                 ret = 1;
>>         }
>>
>> --
>> 2.38.1
>>
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 14/15] xtensa/perf: Remove driver-specific throttle support
  2025-05-06 16:47 ` [RFC PATCH 14/15] xtensa/perf: " kan.liang
@ 2025-05-11 20:43   ` Max Filippov
  0 siblings, 0 replies; 25+ messages in thread
From: Max Filippov @ 2025-05-11 20:43 UTC (permalink / raw)
  To: kan.liang
  Cc: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht

On Tue, May 6, 2025 at 9:48 AM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> The throttle support has been added in the generic code. Remove
> the driver-specific throttle support.
>
> Besides the throttle, perf_event_overflow may return true because of
> event_limit. It already does an inatomic event disable. The pmu->stop
> is not required either.
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Cc: Max Filippov <jcmvbkbc@gmail.com>
> ---
>  arch/xtensa/kernel/perf_event.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>

-- 
Thanks.
-- Max

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 03/15] perf/x86/amd: Remove driver-specific throttle support
  2025-05-06 16:47 ` [RFC PATCH 03/15] perf/x86/amd: " kan.liang
@ 2025-05-13  8:44   ` Ravi Bangoria
  0 siblings, 0 replies; 25+ messages in thread
From: Ravi Bangoria @ 2025-05-13  8:44 UTC (permalink / raw)
  To: kan.liang
  Cc: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht, Sandipan Das,
	Ravi Bangoria

Hi Kan,

On 06-May-25 10:17 PM, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
> 
> The throttle support has been added in the generic code. Remove
> the driver-specific throttle support.
> 
> Besides the throttle, perf_event_overflow may return true because of
> event_limit. It already does an inatomic event disable. The pmu->stop
> is not required either.
> 
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Cc: Sandipan Das <sandipan.das@amd.com>
> Cc: Ravi Bangoria <ravi.bangoria@amd.com>

Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>

Thanks,
Ravi

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 01/15] perf: Fix the throttle logic for a group
  2025-05-06 16:47 ` [RFC PATCH 01/15] perf: Fix the throttle logic for a group kan.liang
  2025-05-07 16:52   ` Ian Rogers
@ 2025-05-13  9:41   ` Peter Zijlstra
  2025-05-13  9:45     ` Peter Zijlstra
  2025-05-13 14:47     ` Liang, Kan
  1 sibling, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2025-05-13  9:41 UTC (permalink / raw)
  To: kan.liang
  Cc: mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht

On Tue, May 06, 2025 at 09:47:26AM -0700, kan.liang@linux.intel.com wrote:

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index a84abc2b7f20..eb0dc871f4f1 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2734,6 +2734,38 @@ void perf_event_disable_inatomic(struct perf_event *event)
>  static void perf_log_throttle(struct perf_event *event, int enable);
>  static void perf_log_itrace_start(struct perf_event *event);
>  
> +static void perf_event_group_unthrottle(struct perf_event *event, bool start_event)
> +{
> +	struct perf_event *leader = event->group_leader;
> +	struct perf_event *sibling;
> +
> +	if (leader != event || start_event)
> +		leader->pmu->start(leader, 0);
> +	leader->hw.interrupts = 0;
> +
> +	for_each_sibling_event(sibling, leader) {
> +		if (sibling != event || start_event)
> +			sibling->pmu->start(sibling, 0);
> +		sibling->hw.interrupts = 0;
> +	}
> +
> +	perf_log_throttle(leader, 1);
> +}
> +
> +static void perf_event_group_throttle(struct perf_event *event)
> +{
> +	struct perf_event *leader = event->group_leader;
> +	struct perf_event *sibling;
> +
> +	leader->hw.interrupts = MAX_INTERRUPTS;
> +	leader->pmu->stop(leader, 0);
> +
> +	for_each_sibling_event(sibling, leader)
> +		sibling->pmu->stop(sibling, 0);
> +
> +	perf_log_throttle(leader, 0);
> +}

Urgh, this seems inconsistent at best.

 - unthrottle will set interrupts to 0 for the whole group
 - throttle will set interrupts for leader
   while the old code would set interrupts for event.
 - interrupts was written with event stopped, while now you consistently
   write when started
 - both now issue perf_log_throttle() on leader, whereas previously it
   was issued on event.

IOW

before:				after:
 event stops			 leader MAX_INTERRUPTS
 event MAX_INTERRUPTS		 event group stops
 event logs			 leader logs

 event 0			 event group 0
 event starts			 event group starts
 event logs			 leader logs

Like said, a rather inconsistent and random collection of things.



What's wrong with something simple like:

static void perf_event_throttle(struct perf_event *event, bool start)
{
	event->hw.interrupts = 0;
	if (start)
		event->pmu->start(event, 0);
	perf_log_throttle(event, 1);
}

static void perf_event_unthrottle(struct perf_event *event)
{
	event->pmu->stop(event, 0);
	event->hw.interrupts = MAX_INTERRUPTS;
	perf_log_throttle(event, 0);
}

static void perf_event_throttle_group(struct perf_event *event, bool start)
{
	struct perf_event *sibling, *leader = event->group_leader;

	perf_event_throttle(leader, start);
	for_each_sibling_event(sibling, leader)
		perf_event_throttle(sibling, start);
}

static void perf_event_unthrottle_group(struct perf_event *event)
{
	struct perf_event *sibling, *leader = event->group_leader;

	perf_event_unthrottle(leader);
	for_each_sibling_event(sibling, leader)
		perf_event_unthrottle(sibling);
}

That way:

before:				after:
 event stops			 event group stops
 event MAX_INTERRUPTS		 event group MAX_INTERRUPTS
 event logs			 event group logs

 event 0			 event group 0
 event starts			 event group starts
 event logs			 event group logs

All that was done before is still done - no change in behaviour for
event. Its just that the rest of the group is now taken along for the
ride.

> @@ -6421,14 +6451,6 @@ static void __perf_event_period(struct perf_event *event,
>  	active = (event->state == PERF_EVENT_STATE_ACTIVE);
>  	if (active) {
>  		perf_pmu_disable(event->pmu);
> -		/*
> -		 * We could be throttled; unthrottle now to avoid the tick
> -		 * trying to unthrottle while we already re-started the event.
> -		 */
> -		if (event->hw.interrupts == MAX_INTERRUPTS) {
> -			event->hw.interrupts = 0;
> -			perf_log_throttle(event, 1);
> -		}
>  		event->pmu->stop(event, PERF_EF_UPDATE);
>  	}
>  
> @@ -6436,6 +6458,12 @@ static void __perf_event_period(struct perf_event *event,
>  
>  	if (active) {
>  		event->pmu->start(event, PERF_EF_RELOAD);
> +		/*
> +		 * We could be throttled; unthrottle now to avoid the tick
> +		 * trying to unthrottle while we already re-started the event.
> +		 */
> +		if (event->group_leader->hw.interrupts == MAX_INTERRUPTS)
> +			perf_event_group_unthrottle(event, false);
>  		perf_pmu_enable(event->pmu);
>  	}
>  }

This change seems random. Also, note that I'm kicking myself for the
total lack of useful information in commit 1e02cd40f151.

Notably, we're calling this from event_function_call(), this means we're
having IRQs disabled and are running on the events CPU. How can we
interact with the tick?



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 01/15] perf: Fix the throttle logic for a group
  2025-05-13  9:41   ` Peter Zijlstra
@ 2025-05-13  9:45     ` Peter Zijlstra
  2025-05-13 14:47     ` Liang, Kan
  1 sibling, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2025-05-13  9:45 UTC (permalink / raw)
  To: kan.liang
  Cc: mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht

On Tue, May 13, 2025 at 11:41:55AM +0200, Peter Zijlstra wrote:
> On Tue, May 06, 2025 at 09:47:26AM -0700, kan.liang@linux.intel.com wrote:
> 
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index a84abc2b7f20..eb0dc871f4f1 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -2734,6 +2734,38 @@ void perf_event_disable_inatomic(struct perf_event *event)
> >  static void perf_log_throttle(struct perf_event *event, int enable);
> >  static void perf_log_itrace_start(struct perf_event *event);
> >  
> > +static void perf_event_group_unthrottle(struct perf_event *event, bool start_event)
> > +{
> > +	struct perf_event *leader = event->group_leader;
> > +	struct perf_event *sibling;
> > +
> > +	if (leader != event || start_event)
> > +		leader->pmu->start(leader, 0);
> > +	leader->hw.interrupts = 0;
> > +
> > +	for_each_sibling_event(sibling, leader) {
> > +		if (sibling != event || start_event)
> > +			sibling->pmu->start(sibling, 0);
> > +		sibling->hw.interrupts = 0;
> > +	}
> > +
> > +	perf_log_throttle(leader, 1);
> > +}
> > +
> > +static void perf_event_group_throttle(struct perf_event *event)
> > +{
> > +	struct perf_event *leader = event->group_leader;
> > +	struct perf_event *sibling;
> > +
> > +	leader->hw.interrupts = MAX_INTERRUPTS;
> > +	leader->pmu->stop(leader, 0);
> > +
> > +	for_each_sibling_event(sibling, leader)
> > +		sibling->pmu->stop(sibling, 0);
> > +
> > +	perf_log_throttle(leader, 0);
> > +}
> 
> Urgh, this seems inconsistent at best.
> 
>  - unthrottle will set interrupts to 0 for the whole group
>  - throttle will set interrupts for leader
>    while the old code would set interrupts for event.
>  - interrupts was written with event stopped, while now you consistently
>    write when started
>  - both now issue perf_log_throttle() on leader, whereas previously it
>    was issued on event.
> 
> IOW
> 
> before:				after:
>  event stops			 leader MAX_INTERRUPTS
>  event MAX_INTERRUPTS		 event group stops
>  event logs			 leader logs
> 
>  event 0			 event group 0
>  event starts			 event group starts
>  event logs			 leader logs

Sorry, that should obviously be:

  event 0			 event group starts
  event starts			 event group 0
  event logs			 leader logs


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 01/15] perf: Fix the throttle logic for a group
  2025-05-13  9:41   ` Peter Zijlstra
  2025-05-13  9:45     ` Peter Zijlstra
@ 2025-05-13 14:47     ` Liang, Kan
  1 sibling, 0 replies; 25+ messages in thread
From: Liang, Kan @ 2025-05-13 14:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users, eranian, ctshao, tmricht



On 2025-05-13 5:41 a.m., Peter Zijlstra wrote:
> On Tue, May 06, 2025 at 09:47:26AM -0700, kan.liang@linux.intel.com wrote:
> 
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index a84abc2b7f20..eb0dc871f4f1 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -2734,6 +2734,38 @@ void perf_event_disable_inatomic(struct perf_event *event)
>>  static void perf_log_throttle(struct perf_event *event, int enable);
>>  static void perf_log_itrace_start(struct perf_event *event);
>>  
>> +static void perf_event_group_unthrottle(struct perf_event *event, bool start_event)
>> +{
>> +	struct perf_event *leader = event->group_leader;
>> +	struct perf_event *sibling;
>> +
>> +	if (leader != event || start_event)
>> +		leader->pmu->start(leader, 0);
>> +	leader->hw.interrupts = 0;
>> +
>> +	for_each_sibling_event(sibling, leader) {
>> +		if (sibling != event || start_event)
>> +			sibling->pmu->start(sibling, 0);
>> +		sibling->hw.interrupts = 0;
>> +	}
>> +
>> +	perf_log_throttle(leader, 1);
>> +}
>> +
>> +static void perf_event_group_throttle(struct perf_event *event)
>> +{
>> +	struct perf_event *leader = event->group_leader;
>> +	struct perf_event *sibling;
>> +
>> +	leader->hw.interrupts = MAX_INTERRUPTS;
>> +	leader->pmu->stop(leader, 0);
>> +
>> +	for_each_sibling_event(sibling, leader)
>> +		sibling->pmu->stop(sibling, 0);
>> +
>> +	perf_log_throttle(leader, 0);
>> +}
> 
> Urgh, this seems inconsistent at best.
> 
>  - unthrottle will set interrupts to 0 for the whole group
>  - throttle will set interrupts for leader
>    while the old code would set interrupts for event.
>  - interrupts was written with event stopped, while now you consistently
>    write when started
>  - both now issue perf_log_throttle() on leader, whereas previously it
>    was issued on event.
> 
> IOW
> 
> before:				after:
>  event stops			 leader MAX_INTERRUPTS
>  event MAX_INTERRUPTS		 event group stops
>  event logs			 leader logs
> 
>  event 0			 event group 0
>  event starts			 event group starts
>  event logs			 leader logs
> 
> Like said, a rather inconsistent and random collection of things.
> 
> 
> 
> What's wrong with something simple like:
> 
> static void perf_event_throttle(struct perf_event *event, bool start)
> {
> 	event->hw.interrupts = 0;
> 	if (start)
> 		event->pmu->start(event, 0);
> 	perf_log_throttle(event, 1);
> }
> 
> static void perf_event_unthrottle(struct perf_event *event)
> {
> 	event->pmu->stop(event, 0);
> 	event->hw.interrupts = MAX_INTERRUPTS;
> 	perf_log_throttle(event, 0);
> }

I think the name is reversed. An event/group should be stopped when
throttle.

> 
> static void perf_event_throttle_group(struct perf_event *event, bool start)
> {
> 	struct perf_event *sibling, *leader = event->group_leader;
> 
> 	perf_event_throttle(leader, start);
> 	for_each_sibling_event(sibling, leader)
> 		perf_event_throttle(sibling, start);
> }
> 
> static void perf_event_unthrottle_group(struct perf_event *event)
> {
> 	struct perf_event *sibling, *leader = event->group_leader;
> 
> 	perf_event_unthrottle(leader);
> 	for_each_sibling_event(sibling, leader)
> 		perf_event_unthrottle(sibling);
> }
> 
> That way:
> 
> before:				after:
>  event stops			 event group stops
>  event MAX_INTERRUPTS		 event group MAX_INTERRUPTS
>  event logs			 event group logs
> 
>  event 0			 event group 0
>  event starts			 event group starts
>  event logs			 event group logs
> 
> All that was done before is still done - no change in behaviour for
> event. Its just that the rest of the group is now taken along for the
> ride.

Except the naming, the change looks good to me.
I will update it in V2. Thanks.

> 
>> @@ -6421,14 +6451,6 @@ static void __perf_event_period(struct perf_event *event,
>>  	active = (event->state == PERF_EVENT_STATE_ACTIVE);
>>  	if (active) {
>>  		perf_pmu_disable(event->pmu);
>> -		/*
>> -		 * We could be throttled; unthrottle now to avoid the tick
>> -		 * trying to unthrottle while we already re-started the event.
>> -		 */
>> -		if (event->hw.interrupts == MAX_INTERRUPTS) {
>> -			event->hw.interrupts = 0;
>> -			perf_log_throttle(event, 1);
>> -		}
>>  		event->pmu->stop(event, PERF_EF_UPDATE);
>>  	}
>>  
>> @@ -6436,6 +6458,12 @@ static void __perf_event_period(struct perf_event *event,
>>  
>>  	if (active) {
>>  		event->pmu->start(event, PERF_EF_RELOAD);
>> +		/*
>> +		 * We could be throttled; unthrottle now to avoid the tick
>> +		 * trying to unthrottle while we already re-started the event.
>> +		 */
>> +		if (event->group_leader->hw.interrupts == MAX_INTERRUPTS)
>> +			perf_event_group_unthrottle(event, false);
>>  		perf_pmu_enable(event->pmu);
>>  	}
>>  }
> 
> This change seems random. Also, note that I'm kicking myself for the
> total lack of useful information in commit 1e02cd40f151.
> 
> Notably, we're calling this from event_function_call(), this means we're
> having IRQs disabled and are running on the events CPU. How can we
> interact with the tick?

The commit bad7192b842c indicates that an event should be start
immediately once the period is force-reset. If perf does so for a
throttled event, the MAX_INTERRUPTS must be cleared. Otherwise, in the
next tick, the start will be invoked. At least, for x86, there will be a
WARN when perf starts an non-stop event.

I moved the code down a little bit so the events in a group could start
all together. But I think it's possible that the member event is
force-reset. So the start of the leader may not be invoked first. But it
should be OK, since the PMU is disabled.

Thanks,
Kan


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2025-05-13 14:47 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-06 16:47 [RFC PATCH 00/15] perf: Fix the throttle logic for group kan.liang
2025-05-06 16:47 ` [RFC PATCH 01/15] perf: Fix the throttle logic for a group kan.liang
2025-05-07 16:52   ` Ian Rogers
2025-05-07 19:00     ` Liang, Kan
2025-05-13  9:41   ` Peter Zijlstra
2025-05-13  9:45     ` Peter Zijlstra
2025-05-13 14:47     ` Liang, Kan
2025-05-06 16:47 ` [RFC PATCH 02/15] perf/x86/intel: Remove driver-specific throttle support kan.liang
2025-05-06 16:47 ` [RFC PATCH 03/15] perf/x86/amd: " kan.liang
2025-05-13  8:44   ` Ravi Bangoria
2025-05-06 16:47 ` [RFC PATCH 04/15] perf/x86/zhaoxin: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 05/15] powerpc/perf: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 06/15] s390/perf: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 07/15] perf/arm: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 08/15] perf/apple_m1: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 09/15] alpha/perf: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 10/15] arc/perf: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 11/15] csky/perf: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 12/15] loongarch/perf: " kan.liang
2025-05-07  1:17   ` 陈华才
2025-05-07 13:45     ` Liang, Kan
2025-05-06 16:47 ` [RFC PATCH 13/15] sparc/perf: " kan.liang
2025-05-06 16:47 ` [RFC PATCH 14/15] xtensa/perf: " kan.liang
2025-05-11 20:43   ` Max Filippov
2025-05-06 16:47 ` [RFC PATCH 15/15] mips/perf: " kan.liang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).