linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3] perf: Fix the throttle error of some clock events
@ 2025-06-04 17:15 kan.liang
  2025-06-04 20:08 ` Ian Rogers
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: kan.liang @ 2025-06-04 17:15 UTC (permalink / raw)
  To: peterz, mingo, namhyung, irogers, mark.rutland, linux-kernel,
	linux-perf-users
  Cc: eranian, ctshao, tmricht, Kan Liang, Leo Yan, Aishwarya TCV,
	Alexei Starovoitov, Venkat Rao Bagalkote

From: Kan Liang <kan.liang@linux.intel.com>

Both ARM and IBM CI reports RCU stall, which can be reproduced by the
below perf command.
  perf record -a -e cpu-clock -- sleep 2

The issue is introduced by the generic throttle patch set, which
unconditionally invoke the event_stop() when throttle is triggered.

The cpu-clock and task-clock are two special SW events, which rely on
the hrtimer. The throttle is invoked in the hrtimer handler. The
event_stop()->hrtimer_cancel() waits for the handler to finish, which is
a deadlock. Instead of invoking the stop(), the HRTIMER_NORESTART should
be used to stop the timer.

There may be two ways to fix it.
- Introduce a PMU flag to track the case. Avoid the event_stop in
  perf_event_throttle() if the flag is detected.
  It has been implemented in the
  https://lore.kernel.org/lkml/20250528175832.2999139-1-kan.liang@linux.intel.com/
  The new flag was thought to be an overkill for the issue.
- Add a check in the event_stop. Return immediately if the throttle is
  invoked in the hrtimer handler. Rely on the existing HRTIMER_NORESTART
  method to stop the timer.

The latter is implemented here.

Move event->hw.interrupts = MAX_INTERRUPTS before the stop(). It makes
the order the same as perf_event_unthrottle(). Except the patch, no one
checks the hw.interrupts in the stop(). There is no impact from the
order change.

Reported-by: Leo Yan <leo.yan@arm.com>
Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Closes: https://lore.kernel.org/lkml/20250527161656.GJ2566836@e132581.arm.com/
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Closes: https://lore.kernel.org/lkml/djxlh5fx326gcenwrr52ry3pk4wxmugu4jccdjysza7tlc5fef@ktp4rffawgcw/
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/lkml/8e8f51d8-af64-4d9e-934b-c0ee9f131293@linux.ibm.com/
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---

Changes since V2:
- Apply a different way to fix the issue.
  Remove all Tested-by since a different way is applied
- Update the change log
- Add more Reported-by

 kernel/events/core.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f34c99f8ce8f..46441c23475d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2656,8 +2656,8 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
 
 static void perf_event_throttle(struct perf_event *event)
 {
-	event->pmu->stop(event, 0);
 	event->hw.interrupts = MAX_INTERRUPTS;
+	event->pmu->stop(event, 0);
 	if (event == event->group_leader)
 		perf_log_throttle(event, 0);
 }
@@ -11749,7 +11749,12 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
 
-	if (is_sampling_event(event)) {
+	/*
+	 * The throttle can be triggered in the hrtimer handler.
+	 * The HRTIMER_NORESTART should be used to stop the timer,
+	 * rather than hrtimer_cancel(). See perf_swevent_hrtimer()
+	 */
+	if (is_sampling_event(event) && (hwc->interrupts != MAX_INTERRUPTS)) {
 		ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
 		local64_set(&hwc->period_left, ktime_to_ns(remaining));
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-06-06 18:40 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-04 17:15 [PATCH V3] perf: Fix the throttle error of some clock events kan.liang
2025-06-04 20:08 ` Ian Rogers
2025-06-04 23:21 ` Alexei Starovoitov
2025-06-05 13:46   ` Liang, Kan
2025-06-05 16:58     ` Liang, Kan
2025-06-05 18:45     ` Alexei Starovoitov
2025-06-05 20:24       ` Liang, Kan
2025-06-05 20:46         ` Alexei Starovoitov
2025-06-05 23:50           ` Liang, Kan
2025-06-06  0:39             ` Alexei Starovoitov
2025-06-06 13:05               ` Liang, Kan
2025-06-06 17:42                 ` Alexei Starovoitov
2025-06-06 18:38                   ` Liang, Kan
2025-06-06 18:40                     ` Alexei Starovoitov
2025-06-05  6:39 ` Ingo Molnar
2025-06-05 13:51   ` Liang, Kan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).