From: Mark Rutland <mark.rutland@arm.com>
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
"Yan, Zheng" <zheng.z.yan@intel.com>,
Stephane Eranian <eranian@google.com>,
Ingo Molnar <mingo@kernel.org>
Subject: Possible race between CPU hotplug and perf_pmu_migrate_context
Date: Mon, 1 Sep 2014 19:18:08 +0100
Message-ID: <20140901181808.GA6427@leverpostej>

Hi all,

While trying some rework of the ARM CCI PMU driver on v3.17-rc2, I
encountered what seems to be a race between CPU hotplug and perf event
context migration, which results in a BUG in mm/slub.c.

It looks like this is a generic issue, as I'm able to trigger the same
splat with the uncore_imc driver on a Haswell machine (on v3.16.1 at
least):

[ 66.621306] ------------[ cut here ]------------
[ 66.625933] kernel BUG at mm/slub.c:3380!
[ 66.629947] invalid opcode: 0000 [#1] SMP
[ 66.634101] Modules linked in: vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) x86_pkg_temp_thermal
[ 66.643476] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.16.1-uncore-pmu-test #2
[ 66.653132] Hardware name: LENOVO 10A6A03EUK/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 66.660530] task: ffff88040b584f50 ti: ffff88040b5d4000 task.ti: ffff88040b5d4000
[ 66.668009] RIP: 0010:[<ffffffff8114a443>] [<ffffffff8114a443>] kfree+0x133/0x140
[ 66.675615] RSP: 0018:ffff88041dc43ea8 EFLAGS: 00010246
[ 66.680930] RAX: 0200000000000400 RBX: ffff88041dc18100 RCX: 00000000000000c8
[ 66.688066] RDX: 0200000000000000 RSI: ffff8800db601800 RDI: ffff88041dc18100
[ 66.695202] RBP: ffff88041dc43ec0 R08: 00000000000156e0 R09: ffff88041dc556e0
[ 66.702334] R10: ffffea0010770600 R11: ffffea00036d8000 R12: ffffffff81c3dec0
[ 66.709472] R13: ffffffff8109dd33 R14: ffff880409b96b08 R15: 0000000000000006
[ 66.716607] FS: 0000000000000000(0000) GS:ffff88041dc40000(0000) knlGS:0000000000000000
[ 66.724697] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 66.730443] CR2: 00007fae8a93b000 CR3: 00000000dc962000 CR4: 00000000001407e0
[ 66.737580] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 66.744714] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 66.751852] Stack:
[ 66.753873] ffff88041dc4d300 ffffffff81c3dec0 000000000000000a ffff88041dc43f20
[ 66.761371] ffffffff8109dd33 ffff8800db600500 ffff88040b584f50 ffff88040b5d7fd8
[ 66.768873] ffff88041dc4d328 0000000000000000 0000000000000009 ffffffff81c090c8
[ 66.776371] Call Trace:
[ 66.778823] <IRQ>
[ 66.780759] [<ffffffff8109dd33>] rcu_process_callbacks+0x1e3/0x540
[ 66.787254] [<ffffffff8104e70e>] __do_softirq+0xee/0x280
[ 66.792654] [<ffffffff8104eaad>] irq_exit+0x9d/0xb0
[ 66.797625] [<ffffffff81032b4f>] smp_apic_timer_interrupt+0x3f/0x50
[ 66.803982] [<ffffffff817de68a>] apic_timer_interrupt+0x6a/0x70
[ 66.809994] <EOI>
[ 66.811926] [<ffffffff81590ce7>] ? cpuidle_enter_state+0x47/0xc0
[ 66.818250] [<ffffffff81590e12>] cpuidle_enter+0x12/0x20
[ 66.823650] [<ffffffff81086aa6>] cpu_startup_entry+0x256/0x3f0
[ 66.829572] [<ffffffff81030d82>] start_secondary+0x192/0x200
[ 66.835319] Code: 49 8b 02 31 f6 f6 c4 40 74 04 41 8b 72 68 4c 89 d7 e8 92 ed fb ff eb 93 4c 8b 50 30 48 8b 10 80 e6 80 4c 0f 44 d0 e9 36 ff ff ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 c7 c0 ea ff ff ff
[ 66.855859] RIP [<ffffffff8114a443>] kfree+0x133/0x140
[ 66.861113] RSP <ffff88041dc43ea8>
[ 66.864617] ---[ end trace 825fa0ba52ca10eb ]---
[ 66.869240] Kernel panic - not syncing: Fatal exception in interrupt
[ 66.875616] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[ 66.885791] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Has anyone seen anything like this before? Is this a known issue?

I'm testing by opening and closing uncore/system PMU events while
hotplugging CPUs to force migration. I run a few instances of the
following program and script in parallel (please forgive the hardcoded
numbers).

Thanks,
Mark.

---->8----
#include <errno.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
			   int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

#define PMU_TYPE	6 /* uncore_imc */
#define PMU_EVENT	1 /* data_read */

struct perf_event_attr attr = {
	.type = PMU_TYPE,
	.config = PMU_EVENT,
	.size = sizeof(attr),
};

int main(int argc, char *argv[])
{
	while (1) {
		int ret = perf_event_open(&attr, -1, 0, -1, 0);
		if (ret < 0) {
			fprintf(stderr, "Unable to open event: %d (%d)\n", ret, errno);
			return ret;
		}
		close(ret);
	}

	return 0;
}
----8<----
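
The hardcoded PMU_TYPE above is only valid for the machine it was read
from, since dynamic PMU type ids are assigned at registration time. A
minimal sketch of looking the id up at runtime instead, assuming the PMU
exposes the usual "type" file under /sys/bus/event_source/devices/ (the
helper name read_pmu_type is hypothetical and not part of the original
test):

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a dynamic PMU type id from a sysfs "type"
 * file, e.g. /sys/bus/event_source/devices/uncore_imc/type.
 * Returns the id on success, or -1 if the file cannot be opened or
 * does not start with a decimal number.
 */
static int read_pmu_type(const char *path)
{
	FILE *f = fopen(path, "r");
	int type;

	if (!f)
		return -1;

	if (fscanf(f, "%d", &type) != 1)
		type = -1;

	fclose(f);
	return type;
}
```

The result would then be assigned to attr.type before the open/close
loop, with the test bailing out if the helper returns -1.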
---->8----
#!/bin/sh

MAX_CPU=7

while true; do
	for i in $(seq 0 ${MAX_CPU}); do
		echo 0 > /sys/devices/system/cpu/cpu${i}/online;
		echo 1 > /sys/devices/system/cpu/cpu${i}/online;
	done
done
----8<----
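
MAX_CPU in the script is likewise specific to the test machine. A rough
sketch of deriving it from a sysfs CPU list string such as the contents
of /sys/devices/system/cpu/present (e.g. "0-7"), assuming the usual
comma/range list format; cpulist_max is a hypothetical helper and was
not used in the original test:

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the highest CPU number found in a sysfs
 * cpulist string such as "0-7" or "0,2-3", or -1 if it contains no
 * numbers. Separators ('-', ',', newline) are skipped explicitly so
 * that the '-' of a range is never parsed as a minus sign.
 */
static long cpulist_max(const char *list)
{
	const char *p = list;
	long max = -1;

	while (*p) {
		char *end;
		long v;

		if (!isdigit((unsigned char)*p)) {
			p++;
			continue;
		}

		v = strtol(p, &end, 10);
		if (v > max)
			max = v;
		p = end;
	}

	return max;
}
```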