* perf: fuzzer, lots of warnings @ 2015-07-02 15:42 Vince Weaver 2015-07-02 16:47 ` Peter Zijlstra 0 siblings, 1 reply; 8+ messages in thread From: Vince Weaver @ 2015-07-02 15:42 UTC (permalink / raw) To: linux-kernel Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian So this does seem to have gotten worse with 4.2-rc0. Four warnings are triggered more or less immediately after starting fuzzing. (#1) WARNING: CPU: 0 PID: 32434 at arch/x86/kernel/cpu/perf_event.c:1219 x86_pmu_start+0xaa/0x100() if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED))) [ 224.158535] [<ffffffff8102a1ea>] x86_pmu_start+0xaa/0x100 [ 224.164401] [<ffffffff81159f39>] perf_ioctl+0x429/0x470 [ 224.170101] [<ffffffff811f199a>] do_vfs_ioctl+0x2ba/0x4a0 [ 224.176012] [<ffffffff811fef14>] ? mntput+0x24/0x40 [ 224.181343] [<ffffffff811df3ea>] ? __fput+0x17a/0x1e0 [ 224.186858] [<ffffffff811f1bf9>] SyS_ioctl+0x79/0x90 [ 224.192275] [<ffffffff816a64b2>] entry_SYSCALL_64_fastpath+0x16/0x7a (#2) [ 584.352324] WARNING: CPU: 2 PID: 18924 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0() (this one was described in a previous e-mail) (#3) [ 1210.501445] WARNING: CPU: 6 PID: 0 at arch/x86/kernel/cpu/perf_event.c:1307 x86_pmu_stop+0xb8/0xd0() WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED); [ 1210.653328] [<ffffffff810291b8>] x86_pmu_stop+0xb8/0xd0 [ 1210.659466] [<ffffffff81029216>] x86_pmu_del+0x46/0x140 [ 1210.665607] [<ffffffff81156244>] event_sched_out.isra.80+0x124/0x320 [ 1210.672924] [<ffffffff81156553>] ? __perf_event_disable+0x53/0x130 [ 1210.680097] [<ffffffff81156488>] group_sched_out+0x48/0xc0 [ 1210.686530] [<ffffffff8101d2a8>] ? native_sched_clock+0x28/0x90 [ 1210.693445] [<ffffffff811565fe>] __perf_event_disable+0xfe/0x130 [ 1210.700408] [<ffffffff811532c0>] ? cpu_clock_event_start+0x40/0x40 [ 1210.707550] [<ffffffff81153302>] remote_function+0x42/0x50 [ 1210.713942] [<ffffffff810f0dab>] flush_smp_call_function_queue+0x7b/0x170 [ 1210.721762] [<ffffffff810f1793>] generic_smp_call_function_single_interrupt+0x13/0x60 [ 1210.730679] [<ffffffff81049228>] smp_trace_call_function_single_interrupt+0x38/0xb0 [ 1210.739467] [<ffffffff816a78db>] trace_call_function_single_interrupt+0x6b/0x70 [ 1210.747820] <EOI> [<ffffffff81543f24>] ? cpuidle_enter_state+0xf4/0x220 [ 1210.755614] [<ffffffff81543f00>] ? cpuidle_enter_state+0xd0/0x220 [ 1210.762708] [<ffffffff81544087>] cpuidle_enter+0x17/0x20 [ 1210.768966] [<ffffffff810b05db>] call_cpuidle+0x3b/0x70 [ 1210.775135] [<ffffffff81544063>] ? cpuidle_select+0x13/0x20 [ 1210.781637] [<ffffffff810b0855>] cpu_startup_entry+0x245/0x310 [ 1210.788411] [<ffffffff81049a40>] start_secondary+0x120/0x130 (#4) [ 1638.702578] WARNING: CPU: 2 PID: 25543 at arch/x86/kernel/cpu/perf_event_intel_lbr.c:349 intel_pmu_lbr_disable+0xbc/0xd0() WARN_ON_ONCE(cpuc->lbr_users < 0); [ 1638.857921] [<ffffffff8102e56c>] intel_pmu_lbr_disable+0xbc/0xd0 [ 1638.864874] [<ffffffff81031812>] intel_pmu_disable_event+0xc2/0x130 [ 1638.872125] [<ffffffff81030a49>] ? __intel_pmu_disable_all+0x49/0x70 [ 1638.879455] [<ffffffff81029175>] x86_pmu_stop+0x75/0xd0 [ 1638.885560] [<ffffffff81159f1a>] perf_ioctl+0x40a/0x470 [ 1638.891650] [<ffffffff811f199a>] do_vfs_ioctl+0x2ba/0x4a0 [ 1638.897945] [<ffffffff811dd9ac>] ? vfs_read+0x11c/0x130 [ 1638.904069] [<ffffffff811f27bd>] ? poll_select_set_timeout+0x4d/0x70 [ 1638.911417] [<ffffffff811f1bf9>] SyS_ioctl+0x79/0x90 [ 1638.917295] [<ffffffff816a64b2>] entry_SYSCALL_64_fastpath+0x16/0x7a ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf: fuzzer, lots of warnings 2015-07-02 15:42 perf: fuzzer, lots of warnings Vince Weaver @ 2015-07-02 16:47 ` Peter Zijlstra 2015-07-02 17:59 ` Vince Weaver 0 siblings, 1 reply; 8+ messages in thread From: Peter Zijlstra @ 2015-07-02 16:47 UTC (permalink / raw) To: Vince Weaver Cc: linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian On Thu, Jul 02, 2015 at 11:42:43AM -0400, Vince Weaver wrote: > (#4) > > [ 1638.702578] WARNING: CPU: 2 PID: 25543 at arch/x86/kernel/cpu/perf_event_intel_lbr.c:349 intel_pmu_lbr_disable+0xbc/0xd0() > > WARN_ON_ONCE(cpuc->lbr_users < 0); > > [ 1638.857921] [<ffffffff8102e56c>] intel_pmu_lbr_disable+0xbc/0xd0 > [ 1638.864874] [<ffffffff81031812>] intel_pmu_disable_event+0xc2/0x130 > [ 1638.872125] [<ffffffff81030a49>] ? __intel_pmu_disable_all+0x49/0x70 > [ 1638.879455] [<ffffffff81029175>] x86_pmu_stop+0x75/0xd0 > [ 1638.885560] [<ffffffff81159f1a>] perf_ioctl+0x40a/0x470 > [ 1638.891650] [<ffffffff811f199a>] do_vfs_ioctl+0x2ba/0x4a0 > [ 1638.897945] [<ffffffff811dd9ac>] ? vfs_read+0x11c/0x130 > [ 1638.904069] [<ffffffff811f27bd>] ? poll_select_set_timeout+0x4d/0x70 > [ 1638.911417] [<ffffffff811f1bf9>] SyS_ioctl+0x79/0x90 > [ 1638.917295] [<ffffffff816a64b2>] entry_SYSCALL_64_fastpath+0x16/0x7a This might maybe help: http://lkml.kernel.org/r/20150624144750.GJ18673@twins.programming.kicks-ass.net ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf: fuzzer, lots of warnings 2015-07-02 16:47 ` Peter Zijlstra @ 2015-07-02 17:59 ` Vince Weaver 2015-07-02 18:30 ` Ingo Molnar ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: Vince Weaver @ 2015-07-02 17:59 UTC (permalink / raw) To: Peter Zijlstra Cc: Vince Weaver, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian On Thu, 2 Jul 2015, Peter Zijlstra wrote: > On Thu, Jul 02, 2015 at 11:42:43AM -0400, Vince Weaver wrote: > > (#4) > > > > [ 1638.702578] WARNING: CPU: 2 PID: 25543 at arch/x86/kernel/cpu/perf_event_intel_lbr.c:349 intel_pmu_lbr_disable+0xbc/0xd0() > > > > WARN_ON_ONCE(cpuc->lbr_users < 0); > > > > [ 1638.857921] [<ffffffff8102e56c>] intel_pmu_lbr_disable+0xbc/0xd0 > > [ 1638.864874] [<ffffffff81031812>] intel_pmu_disable_event+0xc2/0x130 > > [ 1638.872125] [<ffffffff81030a49>] ? __intel_pmu_disable_all+0x49/0x70 > > [ 1638.879455] [<ffffffff81029175>] x86_pmu_stop+0x75/0xd0 > > [ 1638.885560] [<ffffffff81159f1a>] perf_ioctl+0x40a/0x470 > > [ 1638.891650] [<ffffffff811f199a>] do_vfs_ioctl+0x2ba/0x4a0 > > [ 1638.897945] [<ffffffff811dd9ac>] ? vfs_read+0x11c/0x130 > > [ 1638.904069] [<ffffffff811f27bd>] ? poll_select_set_timeout+0x4d/0x70 > > [ 1638.911417] [<ffffffff811f1bf9>] SyS_ioctl+0x79/0x90 > > [ 1638.917295] [<ffffffff816a64b2>] entry_SYSCALL_64_fastpath+0x16/0x7a > > This might maybe help: > > http://lkml.kernel.org/r/20150624144750.GJ18673@twins.programming.kicks-ass.net ah, OK, I missed that one somehow. It would also explain the [ 3971.156004] Uhhuh. NMI received for unknown reason 21 on CPU 2. [ 3971.162812] Do you have a strange power saving mode enabled? [ 3971.169363] Dazed and confused, but trying to continue messages I am getting. I'm also still getting a lot of perfevents: irq loop stuck! messages, I thought the workaround for that had gone in for 4.2 but I guess not. Vince ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf: fuzzer, lots of warnings 2015-07-02 17:59 ` Vince Weaver @ 2015-07-02 18:30 ` Ingo Molnar 2015-07-03 4:48 ` Vince Weaver 2015-07-02 19:48 ` Stephane Eranian 2015-07-03 4:46 ` Vince Weaver 2 siblings, 1 reply; 8+ messages in thread From: Ingo Molnar @ 2015-07-02 18:30 UTC (permalink / raw) To: Vince Weaver Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian * Vince Weaver <vincent.weaver@maine.edu> wrote: > I'm also still getting a lot of > perfevents: irq loop stuck! > messages, I thought the workaround for that had gone in for 4.2 but I > guess not. Hm, so I was waiting for your feedback regarding the precise period cutoff to use, and I guess that's where the patch got lost. Does the value of 2 below work for you? Also I bet we'd need the workaround on a lot more CPU models as well, I sometimes see that warning on an early Nehalem prototype, model 26 (Nehalem-EP). So my guess is that everything Nehalem and later is affected, i.e. NHM, WSM, SNB, IVB and HSW: case 30: /* 45nm Nehalem */ case 26: /* 45nm Nehalem-EP */ case 46: /* 45nm Nehalem-EX */ case 37: /* 32nm Westmere */ case 44: /* 32nm Westmere-EP */ case 47: /* 32nm Westmere-EX */ case 42: /* 32nm SandyBridge */ case 45: /* 32nm SandyBridge-E/EN/EP */ case 58: /* 22nm IvyBridge */ case 62: /* 22nm IvyBridge-EP/EX */ case 60: /* 22nm Haswell Core */ case 63: /* 22nm Haswell Server */ case 69: /* 22nm Haswell ULT */ case 70: /* 22nm Haswell + GT3e (Intel Iris Pro graphics) */ Has anyone ever seen that warning on Broadwell and later Intel CPUs? Thanks, Ingo Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/kernel/cpu/perf_event_intel.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c index 960e85de13fb..26b13ea8299c 100644 --- a/arch/x86/kernel/cpu/perf_event_intel.c +++ b/arch/x86/kernel/cpu/perf_event_intel.c @@ -2479,6 +2479,15 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, int idx, return c; } +/* + * Really short periods might create infinite PMC NMI loops on Haswell, + * so don't allow a period of 1. There's no official erratum for this AFAIK. + */ +static unsigned int hsw_limit_period(struct perf_event *event, unsigned int left) +{ + return max(left, 2U); +} + /* * Broadwell: @@ -2495,7 +2504,7 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, int idx, * Therefore the effective (average) period matches the requested period, * despite coarser hardware granularity. */ -static unsigned bdw_limit_period(struct perf_event *event, unsigned left) +static unsigned int bdw_limit_period(struct perf_event *event, unsigned left) { if ((event->hw.config & INTEL_ARCH_EVENT_MASK) == X86_CONFIG(.event=0xc0, .umask=0x01)) { @@ -3265,6 +3274,7 @@ __init int intel_pmu_init(void) x86_pmu.hw_config = hsw_hw_config; x86_pmu.get_event_constraints = hsw_get_event_constraints; x86_pmu.cpu_events = hsw_events_attrs; + x86_pmu.limit_period = hsw_limit_period; x86_pmu.lbr_double_abort = true; pr_cont("Haswell events, "); break; ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: perf: fuzzer, lots of warnings 2015-07-02 18:30 ` Ingo Molnar @ 2015-07-03 4:48 ` Vince Weaver 2015-07-06 16:36 ` Vince Weaver 0 siblings, 1 reply; 8+ messages in thread From: Vince Weaver @ 2015-07-03 4:48 UTC (permalink / raw) To: Ingo Molnar Cc: Vince Weaver, Peter Zijlstra, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian On Thu, 2 Jul 2015, Ingo Molnar wrote: > > * Vince Weaver <vincent.weaver@maine.edu> wrote: > > > I'm also still getting a lot of > > perfevents: irq loop stuck! > > messages, I thought the workaround for that had gone in for 4.2 but I > > guess not. > > Hm, so I was waiting for your feedback regarding the precise period cutoff to use, > and I guess that's where the patch got lost. > > Does the value of 2 below work for you? I still get the messages when using 2. I'll see if I can narrow down the cutoff. I know the messages definitely go away when using 128. Vince ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf: fuzzer, lots of warnings 2015-07-03 4:48 ` Vince Weaver @ 2015-07-06 16:36 ` Vince Weaver 0 siblings, 0 replies; 8+ messages in thread From: Vince Weaver @ 2015-07-06 16:36 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian On Fri, 3 Jul 2015, Vince Weaver wrote: > On Thu, 2 Jul 2015, Ingo Molnar wrote: > > > > > * Vince Weaver <vincent.weaver@maine.edu> wrote: > > > > > I'm also still getting a lot of > > > perfevents: irq loop stuck! > > > messages, I thought the workaround for that had gone in for 4.2 but I > > > guess not. > > > > Hm, so I was waiting for your feedback regarding the precise period cutoff to use, > > and I guess that's where the patch got lost. > > > > Does the value of 2 below work for you? > > I still get the messages when using 2. I'll see if I can narrow down the > cutoff. I know the messages definitely go away when using 128. >From what I can tell, a value of 2 will still generate "irq loop stuck", but 3 appears to be safe, but I'm still fuzzing to be sure. Previously I ran a kernel with 5 for a while and that also did not generate any messages. Vince ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf: fuzzer, lots of warnings 2015-07-02 17:59 ` Vince Weaver 2015-07-02 18:30 ` Ingo Molnar @ 2015-07-02 19:48 ` Stephane Eranian 2015-07-03 4:46 ` Vince Weaver 2 siblings, 0 replies; 8+ messages in thread From: Stephane Eranian @ 2015-07-02 19:48 UTC (permalink / raw) To: Vince Weaver; +Cc: Peter Zijlstra, LKML, Ingo Molnar, Arnaldo Carvalho de Melo On Thu, Jul 2, 2015 at 7:59 PM, Vince Weaver <vincent.weaver@maine.edu> wrote: > > On Thu, 2 Jul 2015, Peter Zijlstra wrote: > > > On Thu, Jul 02, 2015 at 11:42:43AM -0400, Vince Weaver wrote: > > > (#4) > > > > > > [ 1638.702578] WARNING: CPU: 2 PID: 25543 at arch/x86/kernel/cpu/perf_event_intel_lbr.c:349 intel_pmu_lbr_disable+0xbc/0xd0() > > > > > > WARN_ON_ONCE(cpuc->lbr_users < 0); > > > > > > [ 1638.857921] [<ffffffff8102e56c>] intel_pmu_lbr_disable+0xbc/0xd0 > > > [ 1638.864874] [<ffffffff81031812>] intel_pmu_disable_event+0xc2/0x130 > > > [ 1638.872125] [<ffffffff81030a49>] ? __intel_pmu_disable_all+0x49/0x70 > > > [ 1638.879455] [<ffffffff81029175>] x86_pmu_stop+0x75/0xd0 > > > [ 1638.885560] [<ffffffff81159f1a>] perf_ioctl+0x40a/0x470 > > > [ 1638.891650] [<ffffffff811f199a>] do_vfs_ioctl+0x2ba/0x4a0 > > > [ 1638.897945] [<ffffffff811dd9ac>] ? vfs_read+0x11c/0x130 > > > [ 1638.904069] [<ffffffff811f27bd>] ? poll_select_set_timeout+0x4d/0x70 > > > [ 1638.911417] [<ffffffff811f1bf9>] SyS_ioctl+0x79/0x90 > > > [ 1638.917295] [<ffffffff816a64b2>] entry_SYSCALL_64_fastpath+0x16/0x7a > > > > This might maybe help: > > > > http://lkml.kernel.org/r/20150624144750.GJ18673@twins.programming.kicks-ass.net > > ah, OK, I missed that one somehow. It would also explain the > > [ 3971.156004] Uhhuh. NMI received for unknown reason 21 on CPU 2. > [ 3971.162812] Do you have a strange power saving mode enabled? > [ 3971.169363] Dazed and confused, but trying to continue > > messages I am getting. > > > I'm also still getting a lot of > perfevents: irq loop stuck! Last I saw this was on WSM. I also suspect this may not happen if the event is only measured at the user level. > messages, I thought the workaround for that had gone in for 4.2 but I > guess not. > > Vince ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf: fuzzer, lots of warnings 2015-07-02 17:59 ` Vince Weaver 2015-07-02 18:30 ` Ingo Molnar 2015-07-02 19:48 ` Stephane Eranian @ 2015-07-03 4:46 ` Vince Weaver 2 siblings, 0 replies; 8+ messages in thread From: Vince Weaver @ 2015-07-03 4:46 UTC (permalink / raw) To: Vince Weaver Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian On Thu, 2 Jul 2015, Vince Weaver wrote: > On Thu, 2 Jul 2015, Peter Zijlstra wrote: > > > On Thu, Jul 02, 2015 at 11:42:43AM -0400, Vince Weaver wrote: > > > (#4) > > > > > > [ 1638.702578] WARNING: CPU: 2 PID: 25543 at arch/x86/kernel/cpu/perf_event_intel_lbr.c:349 intel_pmu_lbr_disable+0xbc/0xd0() > > > > > > WARN_ON_ONCE(cpuc->lbr_users < 0); > > > > > > [ 1638.857921] [<ffffffff8102e56c>] intel_pmu_lbr_disable+0xbc/0xd0 > > > [ 1638.864874] [<ffffffff81031812>] intel_pmu_disable_event+0xc2/0x130 > > > [ 1638.872125] [<ffffffff81030a49>] ? __intel_pmu_disable_all+0x49/0x70 > > > [ 1638.879455] [<ffffffff81029175>] x86_pmu_stop+0x75/0xd0 > > > [ 1638.885560] [<ffffffff81159f1a>] perf_ioctl+0x40a/0x470 > > > [ 1638.891650] [<ffffffff811f199a>] do_vfs_ioctl+0x2ba/0x4a0 > > > [ 1638.897945] [<ffffffff811dd9ac>] ? vfs_read+0x11c/0x130 > > > [ 1638.904069] [<ffffffff811f27bd>] ? poll_select_set_timeout+0x4d/0x70 > > > [ 1638.911417] [<ffffffff811f1bf9>] SyS_ioctl+0x79/0x90 > > > [ 1638.917295] [<ffffffff816a64b2>] entry_SYSCALL_64_fastpath+0x16/0x7a > > > > This might maybe help: > > > > http://lkml.kernel.org/r/20150624144750.GJ18673@twins.programming.kicks-ass.net I applied that patch and while the lbr_disable warning hasn't re-appeared I am still getting a lot of NMI dazed and confused messages. Though that might be normal, I think I've always gotten NMI dazed and confused messages (although a bit more frequently on the core2 box than the haswell one). Vince ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-07-06 16:30 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-07-02 15:42 perf: fuzzer, lots of warnings Vince Weaver 2015-07-02 16:47 ` Peter Zijlstra 2015-07-02 17:59 ` Vince Weaver 2015-07-02 18:30 ` Ingo Molnar 2015-07-03 4:48 ` Vince Weaver 2015-07-06 16:36 ` Vince Weaver 2015-07-02 19:48 ` Stephane Eranian 2015-07-03 4:46 ` Vince Weaver
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox