public inbox for linux-perf-users@vger.kernel.org
* [PATCH 0/2] Two semi-related perf throttling fixes
@ 2026-03-31 15:25 Calvin Owens
  2026-03-31 15:25 ` [PATCH 1/2] perf/x86: Avoid double accounting of PMU NMI latencies Calvin Owens
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Calvin Owens @ 2026-03-31 15:25 UTC
  To: linux-kernel
  Cc: linux-perf-users, x86, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, Thomas Gleixner, Borislav Petkov, Dave Hansen,
	H. Peter Anvin

Hi all,

In the course of investigating [1], I set out to understand why this
sequence of messages is printed every boot, even when nobody is using
perf at all:

    perf: interrupt took too long (2516 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
    perf: interrupt took too long (3156 > 3145), lowering kernel.perf_event_max_sample_rate to 63000
    perf: interrupt took too long (4014 > 3945), lowering kernel.perf_event_max_sample_rate to 49000
    perf: interrupt took too long (5035 > 5017), lowering kernel.perf_event_max_sample_rate to 39000
    perf: interrupt took too long (6302 > 6293), lowering kernel.perf_event_max_sample_rate to 31000
    perf: interrupt took too long (7879 > 7877), lowering kernel.perf_event_max_sample_rate to 25000
    perf: interrupt took too long (9852 > 9848), lowering kernel.perf_event_max_sample_rate to 20000

It turns out this happens because of how the dynamic sample rate
throttling interacts with the perf hardware watchdog. Patch [2/2] is my
attempt to prevent the dynamic throttling logic from acting solely based
on the latency of the watchdog NMI.
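
For anyone who wants to check the arithmetic, the numbers in the log above
are consistent with each step raising the allowed latency to 125% of the
measured average, then dividing a per-tick time budget by that new
threshold. The sketch below is an illustration only and is not taken from
either patch. It assumes HZ=1000 and the default
kernel.perf_cpu_time_max_percent of 25, and the names sketch_throttle,
struct throttle_step and the SKETCH_* constants are made up for this
example.

```c
#include <stdint.h>

/* Assumptions for this sketch: HZ=1000, perf_cpu_time_max_percent=25. */
#define SKETCH_HZ                   1000u
#define SKETCH_TICK_NSEC            (1000000000u / SKETCH_HZ)
#define SKETCH_CPU_TIME_MAX_PERCENT 25u

/* Hypothetical result type: the next threshold and the lowered rate. */
struct throttle_step {
	uint64_t allowed_ns;   /* new "took too long" threshold, in ns */
	uint32_t sample_rate;  /* new perf_event_max_sample_rate, in Hz */
};

/*
 * Given the average interrupt latency that tripped the current threshold,
 * compute the next threshold and the reduced sample rate.
 */
static struct throttle_step sketch_throttle(uint64_t avg_len_ns)
{
	struct throttle_step step;

	/* Time per tick that sampling interrupts are allowed to consume. */
	uint64_t budget_ns = (SKETCH_TICK_NSEC / 100) * SKETCH_CPU_TIME_MAX_PERCENT;

	/* Raise the threshold to 125% of the measured average. */
	uint64_t allowed_ns = avg_len_ns + avg_len_ns / 4;

	/* Whole samples that fit in the budget, but never fewer than one. */
	uint64_t per_tick = allowed_ns < budget_ns ? budget_ns / allowed_ns : 1;

	step.allowed_ns = allowed_ns;
	step.sample_rate = (uint32_t)(per_tick * SKETCH_HZ);
	return step;
}
```

Feeding in the first logged average, sketch_throttle(2516) gives a new
threshold of 3145 and a rate of 79000, which matches the first two lines
of the log. Under the same assumptions, the remaining lines follow in the
same way.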

Intel CPUs were happy with that. But AMD CPUs still printed the messages!

That happens because AMD CPUs have a second PMU facility with its own
NMI handler, and both NMI handlers average in their latency, even when
they don't actually handle the NMI.

Patch [1/2] fixes that, which is a correctness issue entirely
independent of patch [2/2]. But it also happens to be required for patch
[2/2] to achieve its goal on AMD CPUs, so I sent them together.

Thanks,
Calvin

[1] https://lore.kernel.org/all/acMe-QZUel-bBYUh@mozart.vkv.me/

Calvin Owens (2):
  perf/x86: Avoid double accounting of PMU NMI latencies
  perf: Don't throttle based on NMI watchdog events

 arch/x86/events/amd/ibs.c |  6 +++---
 arch/x86/events/core.c    |  3 ++-
 kernel/events/core.c      | 14 ++++++++++++++
 3 files changed, 19 insertions(+), 4 deletions(-)

-- 
2.47.3



Thread overview: 8+ messages
2026-03-31 15:25 [PATCH 0/2] Two semi-related perf throttling fixes Calvin Owens
2026-03-31 15:25 ` [PATCH 1/2] perf/x86: Avoid double accounting of PMU NMI latencies Calvin Owens
2026-03-31 15:25 ` [PATCH 2/2] perf: Don't throttle based on NMI watchdog events Calvin Owens
2026-03-31 17:22   ` Calvin Owens
2026-03-31 17:43     ` Calvin Owens
2026-03-31 18:10     ` Calvin Owens
2026-03-31 21:07       ` Calvin Owens
2026-04-01  8:01 ` [PATCH 0/2] Two semi-related perf throttling fixes Andi Kleen
