From: Peter Zijlstra <peterz@infradead.org>
To: Thomas-Mich Richter <tmricht@linux.ibm.com>
Cc: Kees Cook <keescook@chromium.org>,
acme@redhat.com,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Hendrik Brueckner <brueckner@linux.ibm.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>
Subject: Re: WARN_ON_ONCE() hit at kernel/events/core.c:330
Date: Thu, 4 Apr 2019 13:09:09 +0200 [thread overview]
Message-ID: <20190404110909.GY4038@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <eb3cbf9b-6851-72c6-2648-3e6eaa78fb7d@linux.ibm.com>
On Thu, Apr 04, 2019 at 11:15:39AM +0200, Thomas-Mich Richter wrote:
> On 4/3/19 12:41 PM, Peter Zijlstra wrote:
> > On Wed, Apr 03, 2019 at 11:47:00AM +0200, Thomas-Mich Richter wrote:
> >> I use linux 5.1.0-rc3 on s390 and got this WARN_ON_ONCE message:
> >>
> >> WARNING: CPU: 15 PID: 0 at kernel/events/core.c:330
> >> event_function_local.constprop.79+0xe2/0xe8
> >>
> >> which was introduced with
> >> commit cca2094605ef ("perf/core: Fix event_function_local()").
> >> ..snip....
> >>
> >> Any ideas or hints who to avoid/fix this warning?
> >
> > Some thoughts here:
> >
> > https://lkml.kernel.org/r/20190213101644.GN32534@hirez.programming.kicks-ass.net
> >
> > tl;dr, I've no frigging clue.
> >
>
> I have read this thread and at the end you mentioned:
>
> Humm, but in that case:
>
> context_switch()
> prepare_task_switch()
> perf_event_task_sched_out()
> __perf_event_task_sched_out()
> perf_event_context_sched_out()
> task_ctx_sched_out()
> ctx_sched_out()
> group_sched_out()
> event_sched_out()
> if (event->pending_disable)
>
> Would have already cleared the pending_disable state, so the IPI would
> not have ran perf_event_disable_local() in the first place.
>
> Our test system is configured to panic in WARN_ON_ONCE(). I looked
> at the dump. The event triggering WARN_ON_ONCE:
>
> crash> struct perf_event.oncpu 0x1f9b24800
> oncpu = 0
> crash> struct perf_event.state 0x1f9b24800
> state = PERF_EVENT_STATE_ACTIVE
> crash>
>
> This means the code in
> static void event_sched_out(....)
> {
> ....
> event->pmu->del(event, 0);
> event->oncpu = -1;
>
> if (event->pending_disable) {
> event->pending_disable = 0;
> state = PERF_EVENT_STATE_OFF;
> }
> perf_event_set_state(event, state);
> ...
> }
>
> has not finished and returned from this function. So the task was not completely context-switched
> out from CPU 0 while the interrupt handler was executing on CPU 15:
>
> static void perf_pending_event(...)
> {
> ....
> if (event->pending_disable) {
> event->pending_disable = 0;
> perf_event_disable_local(event); <--- Causes the WARN_ON_ONCE()
> }
> .....
> }
>
> I think there is a race, especially when the interrupt handler on CPU 15
> was invoked via timer interrupt an runs on a different CPU.
That is not entirely the scenario I talked about, but *groan*.
So what I meant was:
CPU-0 CPU-n
__schedule()
local_irq_disable()
...
deactivate_task(prev);
try_to_wake_up(@p)
...
smp_cond_load_acquire(&p->on_cpu, !VAL);
<PMI>
..
perf_event_disable_inatomic()
event->pending_disable = 1;
irq_work_queue() /* self-IPI */
</PMI>
context_switch()
prepare_task_switch()
perf_event_task_sched_out()
// the above chain that clears pending_disable
finish_task_switch()
finish_task()
smp_store_release(prev->on_cpu, 0);
/* finally.... */
// take woken
// context_switch to @p
finish_lock_switch()
raw_spin_unlock_irq()
/* w00t, IRQs enabled, self-IPI time */
<self-IPI>
perf_pending_event()
// event->pending_disable == 0
</self-IPI>
What you're suggesting, is that the time between:
smp_store_release(prev->on_cpu, 0);
and
<self-IPI>
on CPU-0 is sufficient for CPU-n to context switch to the task, enable
the event there, trigger a PMI that calls perf_event_disable_inatomic()
_again_ (this would mean irq_work_queue() failing, which we don't check)
(and schedule out again, although that's not required).
This being virt that might actually be possible if (v)CPU-0 takes a nap
I suppose.
Let me think about this a little more...
> > Does it reproduce on x86 without virt on? I don't have a PPC LPAR to
> > test things on.
> >
>
> s390 LPARs run under hipervisor control, no chance to run any OS without it.
Yah, I know.. Same difference though; I also don't have an s390.
next prev parent reply other threads:[~2019-04-04 11:09 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-03 9:47 WARN_ON_ONCE() hit at kernel/events/core.c:330 Thomas-Mich Richter
2019-04-03 10:41 ` Peter Zijlstra
2019-04-03 11:26 ` Thomas-Mich Richter
2019-04-04 9:15 ` Thomas-Mich Richter
2019-04-04 11:09 ` Peter Zijlstra [this message]
2019-04-04 12:02 ` Peter Zijlstra
2019-04-04 12:13 ` Peter Zijlstra
2019-04-04 13:03 ` Peter Zijlstra
2019-04-04 13:21 ` Thomas-Mich Richter
2019-04-05 10:18 ` Thomas-Mich Richter
2019-04-05 11:46 ` Peter Zijlstra
2019-04-08 7:12 ` Thomas-Mich Richter
2019-04-08 8:22 ` Peter Zijlstra
2019-04-08 8:47 ` Thomas-Mich Richter
2019-04-08 9:50 ` Peter Zijlstra
2019-04-08 13:28 ` Thomas-Mich Richter
2019-04-09 6:07 ` Thomas-Mich Richter
2019-04-09 8:29 ` Peter Zijlstra
2019-04-09 8:53 ` Mark Rutland
2019-04-10 13:51 ` Thomas-Mich Richter
2019-04-10 14:33 ` Peter Zijlstra
2019-04-11 12:06 ` Alexander Shishkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190404110909.GY4038@hirez.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=acme@redhat.com \
--cc=brueckner@linux.ibm.com \
--cc=heiko.carstens@de.ibm.com \
--cc=keescook@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=schwidefsky@de.ibm.com \
--cc=tmricht@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox