linux-perf-users.vger.kernel.org archive mirror
* [PATCH] perf/x86/intel: Restrict period on Haswell
@ 2024-07-29 22:33 Li Huafei
From: Li Huafei @ 2024-07-29 22:33 UTC
  To: peterz, mingo
  Cc: acme, namhyung, mark.rutland, alexander.shishkin, jolsa, irogers,
	adrian.hunter, kan.liang, tglx, bp, dave.hansen, x86, hpa,
	linux-perf-users, linux-kernel, lihuafei1

On my Haswell machine, running the LTP test cve-2015-3290 concurrently
triggers the following warning:

  perfevents: irq loop stuck!
  WARNING: CPU: 31 PID: 32438 at arch/x86/events/intel/core.c:3174 intel_pmu_handle_irq+0x285/0x370
  CPU: 31 UID: 0 PID: 32438 Comm: cve-2015-3290 Kdump: loaded Tainted: G S      W          6.11.0-rc1+ #3
  ...
  Call Trace:
   <NMI>
   ? __warn+0xa4/0x220
   ? intel_pmu_handle_irq+0x285/0x370
   ? __report_bug+0x123/0x130
   ? intel_pmu_handle_irq+0x285/0x370
   ? __report_bug+0x123/0x130
   ? intel_pmu_handle_irq+0x285/0x370
   ? report_bug+0x3e/0xa0
   ? handle_bug+0x3c/0x70
   ? exc_invalid_op+0x18/0x50
   ? asm_exc_invalid_op+0x1a/0x20
   ? irq_work_claim+0x1e/0x40
   ? intel_pmu_handle_irq+0x285/0x370
   perf_event_nmi_handler+0x3d/0x60
   nmi_handle+0x104/0x330
   ? ___ratelimit+0xe4/0x1b0
   default_do_nmi+0x40/0x100
   exc_nmi+0x104/0x180
   end_repeat_nmi+0xf/0x53
   ...
   ? intel_pmu_lbr_enable_all+0x2a/0x90
   ? __intel_pmu_enable_all.constprop.0+0x16d/0x1b0
   ? __intel_pmu_enable_all.constprop.0+0x16d/0x1b0
   perf_ctx_enable+0x8e/0xc0
   __perf_install_in_context+0x146/0x3e0
   ? __pfx___perf_install_in_context+0x10/0x10
   remote_function+0x7c/0xa0
   ? __pfx_remote_function+0x10/0x10
   generic_exec_single+0xf8/0x150
   smp_call_function_single+0x1dc/0x230
   ? __pfx_remote_function+0x10/0x10
   ? __pfx_smp_call_function_single+0x10/0x10
   ? __pfx_remote_function+0x10/0x10
   ? lock_is_held_type+0x9e/0x120
   ? exclusive_event_installable+0x4f/0x140
   perf_install_in_context+0x197/0x330
   ? __pfx_perf_install_in_context+0x10/0x10
   ? __pfx___perf_install_in_context+0x10/0x10
   __do_sys_perf_event_open+0xb80/0x1100
   ? __pfx___do_sys_perf_event_open+0x10/0x10
   ? __pfx___lock_release+0x10/0x10
   ? lockdep_hardirqs_on_prepare+0x135/0x200
   ? ktime_get_coarse_real_ts64+0xee/0x100
   ? ktime_get_coarse_real_ts64+0x92/0x100
   do_syscall_64+0x70/0x180
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
   ...

My machine has 32 physical cores, each with two logical cores. During
testing, 100 instances of the cve-2015-3290 test case run concurrently.

This warning was already reported in [1], and a patch was proposed there
to limit the period to 128 on Haswell, but that patch was not merged into
mainline. In [2] the period on Nehalem was limited to 32. I tested
minimum periods of 16 and 32 on my machine: the problem still reproduced
with a limit of 16, but did not reproduce with a limit of 32. It looks
like we can limit the period to 32 on Haswell as well.

[1] https://lore.kernel.org/lkml/20150501070226.GB18957@gmail.com/#r
[2] https://lore.kernel.org/all/1566256411-18820-1-git-send-email-johunt@akamai.com/T/#mf1479ab3f25d3f7f3a899244081baa2e7b7bc0b9

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
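Note for readers, not part of the commit message: the clamp that the
diff below adds can be illustrated outside the kernel. This is only a
userspace sketch under stated assumptions. clamp_period_sketch() and
HSW_MIN_PERIOD_SKETCH are hypothetical names, not kernel symbols, and
the value 32 is the one that happened to work in the testing described
above, not a documented hardware limit.

```c
#include <stdint.h>

/* Hypothetical constant; mirrors the 32LL used by hsw_limit_period(). */
#define HSW_MIN_PERIOD_SKETCH 32LL

/*
 * Userspace stand-in for the clamp: return the remaining period that
 * would be used, never less than the minimum. The kernel hook updates
 * *left in place instead of returning a value.
 */
static int64_t clamp_period_sketch(int64_t left)
{
	return left < HSW_MIN_PERIOD_SKETCH ? HSW_MIN_PERIOD_SKETCH : left;
}
```

With this clamp, a remaining period of 1 or 16 is raised to 32, while
32 and 128 pass through unchanged.
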
 arch/x86/events/intel/core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 0c9c2706d4ec..459dec2f07e3 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4625,6 +4625,11 @@ static void glc_limit_period(struct perf_event *event, s64 *left)
 		*left = max(*left, 128LL);
 }
 
+static void hsw_limit_period(struct perf_event *event, s64 *left)
+{
+	*left = max(*left, 32LL);
+}
+
 PMU_FORMAT_ATTR(event,	"config:0-7"	);
 PMU_FORMAT_ATTR(umask,	"config:8-15"	);
 PMU_FORMAT_ATTR(edge,	"config:18"	);
@@ -6767,6 +6772,7 @@ __init int intel_pmu_init(void)
 		x86_pmu.hw_config = hsw_hw_config;
 		x86_pmu.get_event_constraints = hsw_get_event_constraints;
 		x86_pmu.lbr_double_abort = true;
+		x86_pmu.limit_period = hsw_limit_period;
 		extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
 			hsw_format_attr : nhm_format_attr;
 		td_attr  = hsw_events_attrs;
-- 
2.25.1


Thread overview: 17+ messages
2024-07-29 22:33 [PATCH] perf/x86/intel: Restrict period on Haswell Li Huafei
2024-07-31 19:20 ` Thomas Gleixner
2024-08-13 13:13   ` Li Huafei
2024-08-14 14:43     ` Thomas Gleixner
2024-08-14 14:52     ` Thomas Gleixner
2024-08-14 18:15       ` Liang, Kan
2024-08-14 19:01         ` Thomas Gleixner
2024-08-14 19:37           ` Liang, Kan
2024-08-14 22:47             ` Thomas Gleixner
2024-08-15 15:39               ` Liang, Kan
2024-08-15 18:26                 ` Thomas Gleixner
2024-08-15 20:15                   ` Liang, Kan
2024-08-15 23:43                     ` Thomas Gleixner
2024-08-16 19:27                       ` Liang, Kan
2024-08-17 12:22                         ` Liang, Kan
2024-08-17 12:23                         ` Thomas Gleixner
2024-08-15 19:01                 ` Vince Weaver
