From: Jiri Olsa <jolsa@redhat.com>
To: Fengguang Wu <fengguang.wu@intel.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Stephane Eranian <eranian@google.com>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [reboot] WARNING: CPU: 0 PID: 112 at kernel/events/core.c:5655 perf_swevent_add()
Date: Tue, 11 Mar 2014 01:56:12 +0100
Message-ID: <20140311005611.GA1286@krava.redhat.com>
In-Reply-To: <20140310224023.GA1205@krava.redhat.com>
On Mon, Mar 10, 2014 at 11:40:23PM +0100, Jiri Olsa wrote:
> On Mon, Mar 10, 2014 at 01:53:19PM +0100, Jiri Olsa wrote:
> > On Sat, Mar 08, 2014 at 02:51:53PM +0800, Fengguang Wu wrote:
> > >
> > > Hi all,
> > >
> > > This is a very old WARNING, too old to be bisectable. The below 3 different
> > > back traces show that it's always triggered by trinity at system reboot time.
> > > Any ideas to quiet it? Thank you!
> >
> > hi,
> > is there cpu hotplug involved? like writing to:
> > /sys/devices/system/cpu/cpu*/online
> >
>
> I think there's race with hotplug code,
> I can reproduce this with:
>
> $ ./perf record -e faults ./perf bench sched pipe
>
> and put one of the cpus offline:
>
> [root@krava cpu]# pwd
> /sys/devices/system/cpu
> [root@krava cpu]# echo 0 > cpu1/online

The perf cpu offline callback takes down all cpu context events
and releases swhash->swevent_hlist.

This can race with task context software events that are just being
scheduled in on this cpu via perf_swevent_add()
(note that only cpu ctx events are terminated in the hotplug code).

The race happens in the gap between the cpu notifier code running and the
cpu actually being taken down (and becoming un-schedulable).
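
To make the ordering concrete, here is a rough single-threaded user-space
model of that sequence. It is only a sketch: the race_* names are made up
for illustration and are not kernel API, and the counter increments on every
failed add, whereas the real WARN_ON_ONCE() fires only once.

```c
/* User-space model of the race; all race_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct race_swhash {
	int *hlist;     /* stands in for swhash->swevent_hlist */
	int warnings;   /* how many times the warning condition was hit */
};

/* Models the cpu coming up with the hash list allocated. */
static void race_cpu_online(struct race_swhash *sw)
{
	assert(sw->hlist == NULL);
	sw->hlist = calloc(1, sizeof(*sw->hlist));
}

/* Models the hotplug offline callback: the hash list is released. */
static void race_cpu_offline_callback(struct race_swhash *sw)
{
	free(sw->hlist);
	sw->hlist = NULL;
}

/* Models perf_swevent_add(): a missing hash list means warn and fail. */
static int race_swevent_add(struct race_swhash *sw)
{
	if (!sw->hlist) {
		sw->warnings++;
		return -EINVAL;
	}
	(*sw->hlist)++;   /* "add" the event to the list */
	return 0;
}
```

Calling race_cpu_online(), then race_cpu_offline_callback(), then
race_swevent_add() on the same struct corresponds to a task event being
scheduled in during the gap: the add fails and the warning condition is hit.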

I wonder what we should do:

- terminate task ctx events on the hotplugged cpu (same as for cpu ctx);
  this seems too much..
- schedule out task ctx events on the hotplugged cpu;
  we might race again with other events being scheduled in (during the race gap)
  (if this could be prevented, this would be the best option I think)
- don't release that 'struct swevent_hlist' at all.. it's about 2KB per cpu
- remove the warning ;-) or make it skip the hotplugged cpu case, so
  we don't lose warnings about potential bugs; please check the attached patch

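
The last option can be modelled in user space roughly as follows. This is a
sketch under assumptions, not the kernel code: the guard_* names are
hypothetical, and guard_init_cpu() always sets up the list, whereas the real
init path only allocates it when hlist_refcount > 0.

```c
/* User-space model of the offline guard; guard_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct guard_swhash {
	bool has_hlist;   /* stands in for swhash->swevent_hlist != NULL */
	bool offline;     /* set on cpu exit, cleared on cpu init */
	int warnings;     /* how many times the warning condition was hit */
};

/* Models the cpu init path: mark the cpu online again. */
static void guard_init_cpu(struct guard_swhash *sw)
{
	sw->offline = false;
	sw->has_hlist = true;   /* simplification, see the note above */
}

/* Models the cpu exit path: mark offline, then release the list. */
static void guard_exit_cpu(struct guard_swhash *sw)
{
	sw->offline = true;
	sw->has_hlist = false;
}

/* Models the add path: always fail without a list, warn only if online. */
static int guard_swevent_add(struct guard_swhash *sw)
{
	assert(sw != NULL);
	if (!sw->has_hlist) {
		if (!sw->offline)
			sw->warnings++;
		return -EINVAL;
	}
	return 0;
}
```

With this, an add after guard_exit_cpu() still returns -EINVAL but stays
quiet, while a missing list on an online cpu still counts as a warning.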
thoughts?
jirka
---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 661951a..a53857e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5423,6 +5423,8 @@ struct swevent_htable {
/* Recursion avoidance in each contexts */
int recursion[PERF_NR_CONTEXTS];
+
+ bool offline;
};
static DEFINE_PER_CPU(struct swevent_htable, swevent_htable);
@@ -5669,8 +5671,10 @@ static int perf_swevent_add(struct perf_event *event, int flags)
hwc->state = !(flags & PERF_EF_START);
head = find_swevent_head(swhash, event);
- if (WARN_ON_ONCE(!head))
+ if (!head) {
+ WARN_ON_ONCE(!swhash->offline);
return -EINVAL;
+ }
hlist_add_head_rcu(&event->hlist_entry, head);
@@ -7850,6 +7854,7 @@ static void perf_event_init_cpu(int cpu)
struct swevent_htable *swhash = &per_cpu(swevent_htable, cpu);
mutex_lock(&swhash->hlist_mutex);
+ swhash->offline = false;
if (swhash->hlist_refcount > 0) {
struct swevent_hlist *hlist;
@@ -7907,6 +7912,7 @@ static void perf_event_exit_cpu(int cpu)
perf_event_exit_cpu_context(cpu);
mutex_lock(&swhash->hlist_mutex);
+ swhash->offline = true;
swevent_hlist_release(swhash);
mutex_unlock(&swhash->hlist_mutex);
}