From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933602AbcHDMhe (ORCPT ); Thu, 4 Aug 2016 08:37:34 -0400 Received: from bombadil.infradead.org ([198.137.202.9]:58252 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933388AbcHDMhd (ORCPT ); Thu, 4 Aug 2016 08:37:33 -0400 Date: Thu, 4 Aug 2016 14:37:24 +0200 From: Peter Zijlstra To: Vegard Nossum Cc: David Carrillo-Cisneros , Vineet Gupta , Kan Liang , Arnaldo Carvalho de Melo , LKML , Alexander Shishkin Subject: Re: NULL ptr deref in perf/filter_match Message-ID: <20160804123724.GN6862@twins.programming.kicks-ass.net> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jul 29, 2016 at 11:41:11PM +0200, Vegard Nossum wrote: > Digging a bit deeper into this, it seems the event itself is getting > created by perf_event_open() and it gets added to the pmu_event_list > through: > > perf_event_open() > - perf_event_alloc() > - account_event() > - account_pmu_sb_event() > - attach_sb_event() > > so at this point the event is being attached but its ->ctx is still > NULL. It seems like ->ctx is set just a bit later in > perf_event_open(), though. > > But before that, __schedule() comes along and creates a stack trace > similar to the one above: > > __schedule() > - __perf_event_task_sched_out() > - perf_iterate_sb() > - perf_iterate_sb_cpu() > - event_filter_match() > - perf_cgroup_match() > - __get_cpu_context() > - (dereference ctx which is NULL) > > So I guess the question is... should the event be attached (= put on > the list) before ->ctx gets set? Or should the cgroup code check for a > NULL ->ctx? Does this fix it? Ordering is a bit of a mess, adding the events to the list _after_ they've been installed has the risk of missing things I think, nor does that result in particularly nice code. Then again, this isn't pretty either. --- kernel/events/core.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index a19550d80ab1..87d02b8cb87e 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -1716,8 +1716,8 @@ static inline int pmu_filter_match(struct perf_event *event) static inline int event_filter_match(struct perf_event *event) { - return (event->cpu == -1 || event->cpu == smp_processor_id()) - && perf_cgroup_match(event) && pmu_filter_match(event); + return (event->cpu == -1 || event->cpu == smp_processor_id()) && + perf_cgroup_match(event) && pmu_filter_match(event); } static void @@ -1737,8 +1737,8 @@ event_sched_out(struct perf_event *event, * maintained, otherwise bogus information is return * via read() for time_enabled, time_running: */ - if (event->state == PERF_EVENT_STATE_INACTIVE - && !event_filter_match(event)) { + if (event->state == PERF_EVENT_STATE_INACTIVE && + !event_filter_match(event)) { delta = tstamp - event->tstamp_stopped; event->tstamp_running += delta; event->tstamp_stopped = tstamp; @@ -2236,10 +2236,15 @@ perf_install_in_context(struct perf_event_context *ctx, lockdep_assert_held(&ctx->mutex); - event->ctx = ctx; if (event->cpu != -1) event->cpu = cpu; + /* + * Ensures that if we can observe event->ctx, both the event and ctx + * will be 'complete'. See perf_iterate_sb_cpu(). + */ + smp_store_release(&event->ctx, ctx); + if (!task) { cpu_function_call(cpu, __perf_install_in_context, event); return; @@ -5969,6 +5974,14 @@ static void perf_iterate_sb_cpu(perf_iterate_f output, void *data) struct perf_event *event; list_for_each_entry_rcu(event, &pel->list, sb_list) { + /* + * Skip events that are not fully formed yet; ensure that + * if we observe event->ctx, both event and ctx will be + * complete enough. See perf_install_in_context(). + */ + if (!smp_load_acquire(&event->ctx)) + continue; + if (event->state < PERF_EVENT_STATE_INACTIVE) continue; if (!event_filter_match(event))