From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935183AbdAKQ1K (ORCPT ); Wed, 11 Jan 2017 11:27:10 -0500 Received: from foss.arm.com ([217.140.101.70]:53824 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1765706AbdAKQ1I (ORCPT ); Wed, 11 Jan 2017 11:27:08 -0500 Date: Wed, 11 Jan 2017 16:26:09 +0000 From: Mark Rutland To: Peter Zijlstra Cc: linux-kernel@vger.kernel.org, Ingo Molnar , Arnaldo Carvalho de Melo , Thomas Gleixner , Sebastian Andrzej Siewior , jeremy.linton@arm.com, Will Deacon Subject: Re: Perf hotplug lockup in v4.9-rc8 Message-ID: <20170111162609.GE26344@leverpostej> References: <20161207135217.GA25605@leverpostej> <20161207175347.GB13840@leverpostej> <20161207183455.GQ3124@twins.programming.kicks-ass.net> <20161209135900.GU3174@twins.programming.kicks-ass.net> <20170111145920.GB26344@leverpostej> <20170111160358.GA3081@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170111160358.GA3081@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 11, 2017 at 05:03:58PM +0100, Peter Zijlstra wrote: > On Wed, Jan 11, 2017 at 02:59:20PM +0000, Mark Rutland wrote: > > On Fri, Dec 09, 2016 at 02:59:00PM +0100, Peter Zijlstra wrote: > > > + * If we get a false negative, things are complicated. If we are after > > > + * perf_event_context_sched_in() ctx::lock will serialize us, and the > > > + * value must be correct. If we're before, it doesn't matter since > > > + * perf_event_context_sched_in() will program the counter. > > > + * > > > + * However, this hinges on the remote context switch having observed > > > + * our task->perf_event_ctxp[] store, such that it will in fact take > > > + * ctx::lock in perf_event_context_sched_in(). > > > > Sorry if I'm being thick here, but which store are we describing above? > > i.e. which function, how does that relate to perf_install_in_context()? > > The only store to perf_event_ctxp[] of interest is the initial one in > find_get_context(). Ah, I see. I'd missed the rcu_assign_pointer() when looking around for an assignment. > > I haven't managed to wrap my head around why this matters. :/ > > See the scenario from: > > https://lkml.kernel.org/r/20161212124228.GE3124@twins.programming.kicks-ass.net > > Its installing the first event on 't', which concurrently with the > install gets migrated to a third CPU. I was completely failing to consider that this was the installation of the first event; I should have read the existing comment. Things make a lot more sense now. > CPU0 CPU1 CPU2 > > (current == t) > > t->perf_event_ctxp[] = ctx; > smp_mb(); > cpu = task_cpu(t); > > switch(t, n); > migrate(t, 2); > switch(p, t); > > ctx = t->perf_event_ctxp[]; // must not be NULL > > smp_function_call(cpu, ..); > > generic_exec_single() > func(); > spin_lock(ctx->lock); > if (task_curr(t)) // false > > add_event_to_ctx(); > spin_unlock(ctx->lock); > > perf_event_context_sched_in(); > spin_lock(ctx->lock); > // sees event > > > > So its CPU0's store of t->perf_event_ctxp[] that must not go 'missing. > Because if CPU2's load of that variable were to observe NULL, it would > not try to schedule the ctx and we'd have a task running without its > counter, which would be 'bad'. > > As long as we observe !NULL, we'll acquire ctx->lock. If we acquire it > first and not see the event yet, then CPU0 must observe task_running() > and retry. If the install happens first, then we must see the event on > sched-in and all is well. I think I follow now. Thanks for bearing with me! > In any case, I'll try and write a proper Changelog for this... If it's just the commit message and/or comments changing, feel free to add: Tested-by: Mark Rutland Thanks, Mark.