* [BUG] Core2 cpu triggers hard lockup with perf test
From: Jiri Olsa @ 2016-02-27 12:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, Andi Kleen,
	Stephane Eranian, Wang Nan, zheng.z.yan, Kan Liang
  Cc: LKML

hi,
we are getting hard lockups on Core2 CPUs (model 23)
just by running 'perf test':

PID: 10425  TASK: ffff880068562e00  CPU: 3   COMMAND: "perf"
 #0 [ffff88007d985a08] machine_kexec at ffffffff8105521b
 #1 [ffff88007d985a68] crash_kexec at ffffffff810f7412
 #2 [ffff88007d985b38] panic at ffffffff8163c031
 #3 [ffff88007d985bb8] watchdog_overflow_callback at ffffffff81120472
 #4 [ffff88007d985bc8] __perf_event_overflow at ffffffff81164e0e
 #5 [ffff88007d985c00] perf_event_overflow at ffffffff81165a44
 #6 [ffff88007d985c10] intel_pmu_handle_irq at ffffffff81033198
 #7 [ffff88007d985e60] perf_event_nmi_handler at ffffffff8164be8b
 #8 [ffff88007d985e80] nmi_handle at ffffffff8164b5d9
 #9 [ffff88007d985ec8] do_nmi at ffffffff8164b789
#10 [ffff88007d985ef0] end_repeat_nmi at ffffffff8164aa13
    [exception RIP: intel_pmu_enable_all+17]
    RIP: ffffffff81032301  RSP: ffff88005e917c98  RFLAGS: 00000046
    RAX: ffff88007d98cd20  RBX: ffff88005e991000  RCX: 000000000000038f
    RDX: 0000000000000007  RSI: 0000000000000003  RDI: 0000000000000000
    RBP: ffff88005e917cd8   R8: ffffffffffffff85   R9: 000000ffffffffff
    R10: ffff88007d98c100  R11: ffff88005e9179e0  R12: ffff88007d98bd10
    R13: ffff88007d98b9e0  R14: ffff88007d98bc08  R15: 0000000000000002
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#11 [ffff88005e917c98] intel_pmu_enable_all at ffffffff81032301
#12 [ffff88005e917c98] x86_pmu_enable at ffffffff8102ba24
#13 [ffff88005e917ce0] perf_pmu_enable at ffffffff81160457
#14 [ffff88005e917cf0] perf_event_context_sched_in at ffffffff81161930
#15 [ffff88005e917d20] perf_event_exec at ffffffff811621db
#16 [ffff88005e917d68] setup_new_exec at ffffffff811edffd
#17 [ffff88005e917d88] load_elf_binary at ffffffff81240ed9
#18 [ffff88005e917e58] search_binary_handler at ffffffff811ec89d
#19 [ffff88005e917ea0] do_execve_common at ffffffff811ede04
#20 [ffff88005e917f30] sys_execve at ffffffff811ee199
#21 [ffff88005e917f50] stub_execve at ffffffff816531a9

the reproducer seems to be a hw event with a very small
period, for example (thanks Arnaldo ;-):
  perf record -e cycles -c 123 kill

I bisected it down to:
  156174999dd1 perf/intel/x86: Enlarge the PEBS buffer

It looks like the bigger PEBS buffer, together with the event being
marked as PERF_X86_EVENT_FREERUNNING, blocks the CPU right after
the event is enabled, before it can reach local_irq_enable,
and that triggers the NMI watchdog.
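
To give a rough sense of scale for "bigger PEBS buffer", here is a minimal sketch of the arithmetic. The numbers are assumptions, not taken from this report: a 4 KiB PAGE_SIZE, an enlarged buffer of PAGE_SIZE << 4 (64 KiB), and a Core2-format PEBS record of 144 bytes (18 64-bit fields). The helper names are made up for illustration. It only shows how many records and counted events fit in each buffer at the reproducer's period of 123; it does not by itself explain the lockup.

```python
# Back-of-the-envelope PEBS buffer arithmetic.
# ASSUMPTIONS (not stated in the report): PAGE_SIZE is 4096 bytes, the
# enlarged buffer is PAGE_SIZE << 4, and one PEBS record is 144 bytes.
PAGE_SIZE = 4096
PEBS_RECORD_BYTES = 144
SAMPLE_PERIOD = 123  # from the reproducer: perf record -e cycles -c 123


def records_per_buffer(buffer_bytes, record_bytes=PEBS_RECORD_BYTES):
    """Number of whole records that fit in a buffer of buffer_bytes."""
    return buffer_bytes // record_bytes


def events_to_fill(buffer_bytes, period=SAMPLE_PERIOD,
                   record_bytes=PEBS_RECORD_BYTES):
    """Counted events needed to fill the buffer, one record per period."""
    return records_per_buffer(buffer_bytes, record_bytes) * period


if __name__ == "__main__":
    for label, size in (("PAGE_SIZE", PAGE_SIZE),
                        ("PAGE_SIZE << 4", PAGE_SIZE << 4)):
        print(f"{label:>15}: {size} bytes, "
              f"{records_per_buffer(size)} records, "
              f"{events_to_fill(size)} events to fill")
```

Under these assumptions the larger buffer holds roughly sixteen times as many records before it is full, which is the quantity the bisected commit changes.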

I can't find what's special about the Core2 CPU PEBS setup;
other CPUs seem to be ok (tried on ivb/snb/hsw).

reverting 156174999dd1 fixed the issue for me

ideas? thanks,
jirka

Thread overview: 18+ messages
2016-02-27 12:37 [BUG] Core2 cpu triggers hard lockup with perf test Jiri Olsa
2016-02-27 14:48 ` Peter Zijlstra
2016-02-27 15:46 ` Andi Kleen
2016-02-29 22:12 ` Liang, Kan
2016-03-01  6:55   ` Jiri Olsa
2016-03-01  9:17   ` Peter Zijlstra
2016-03-01 11:06     ` Jiri Olsa
2016-03-01 11:20       ` Peter Zijlstra
2016-03-01 14:51       ` Andi Kleen
2016-03-01 14:59         ` Peter Zijlstra
2016-03-01 17:17           ` Jiri Olsa
2016-03-01 17:32             ` Andi Kleen
2016-03-01 17:49             ` Peter Zijlstra
2016-03-01 18:04               ` Jiri Olsa
2016-03-01 18:14                 ` Peter Zijlstra
2016-03-01 18:12             ` Peter Zijlstra
2016-03-01 19:03               ` [PATCH] perf x86: Use PAGE_SIZE for PEBS buffer size on Core2 Jiri Olsa
2016-03-08 13:15                 ` [tip:perf/core] perf/x86/intel: " tip-bot for Jiri Olsa
