Date: Thu, 7 Jan 2016 08:26:04 -0800
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra
Cc: Eric Dumazet, syzkaller, LKML, acme@redhat.com, Linus Torvalds,
	Alexander Potapenko, Kostya Serebryany, Dmitry Vyukov, Sasha Levin,
	"H. Peter Anvin", jolsa@redhat.com, Ingo Molnar, vincent.weaver@maine.edu,
	Thomas Gleixner, acme@kernel.org, Stephane Eranian,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:perf/core] perf: Fix race in perf_event_exec()
Message-ID: <20160107162604.GP3818@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20151210195740.GG6357@twins.programming.kicks-ass.net>
	<20160107134008.GZ6344@twins.programming.kicks-ass.net>
In-Reply-To: <20160107134008.GZ6344@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.21 (2010-09-15)

On Thu, Jan 07, 2016 at 02:40:08PM +0100, Peter Zijlstra wrote:
> On Wed, Jan 06, 2016 at 01:56:56PM -0500, Eric Dumazet wrote:
> > On Wed, Jan 6, 2016 at 1:46 PM, tip-bot for Peter Zijlstra wrote:
> > >
> > > This is because context switches can swap the task_struct::perf_event_ctxp[]
> > > pointer around. Therefore you have to either disable preemption when looking
> > > at current, or hold ctx->lock.
> > >
> > >  void perf_event_exec(void)
> > >  {
> > > -	struct perf_event_context *ctx;
> > >  	int ctxn;
> > >
> > >  	rcu_read_lock();
> >
> > Do we still need this rcu_read_lock(), if perf_event_enable_on_exec()
> > uses local_irq_save()?
>
> Strictly speaking we should not rely on the fact that RCU grace periods
> do not progress with IRQs disabled, so yes.

What Peter said!  The current implementation would let you get away with
IRQs disabled (aside from lockdep splats), but it is easy to imagine
implementations that do not, hence the restriction.

Is the extra rcu_read_lock() and rcu_read_unlock() causing a performance
problem?  If so, one option would be use of synchronize_sched() and
call_rcu_sched(), which explicitly recognize disabled IRQs as read-side
critical sections.

Of course, this might well put you in a position where you would like to
use IRQ disabling in some cases, BH disabling (or softirq context) in
others, and rcu_read_lock() in yet other cases.  If so, the good news is
that RCU actually now supports this efficiently.  There is the shiny new
synchronize_rcu_mult() macro that can be used as follows to wait for all
three types of grace periods:

	synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched);

This would wait concurrently for all three grace periods to elapse.

							Thanx, Paul
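
To make the sched-flavor suggestion above concrete, here is a minimal
sketch of the pairing being described: a reader that relies on an
IRQ-disabled region as its RCU-sched read-side critical section, and an
updater that uses synchronize_sched() to wait for all such readers before
freeing.  This is illustrative only, not the perf code; the names my_data,
global_ptr, update_lock, read_value, and remove_data are made up for the
example.

#include <linux/rcupdate.h>
#include <linux/irqflags.h>
#include <linux/mutex.h>
#include <linux/slab.h>

struct my_data {
	int value;
};

static struct my_data __rcu *global_ptr;
static DEFINE_MUTEX(update_lock);

/*
 * Reader: the IRQ-disabled region serves as an RCU-sched read-side
 * critical section, so no rcu_read_lock()/rcu_read_unlock() is needed.
 */
static int read_value(void)
{
	struct my_data *p;
	unsigned long flags;
	int val = -1;

	local_irq_save(flags);
	p = rcu_dereference_sched(global_ptr);
	if (p)
		val = p->value;
	local_irq_restore(flags);

	return val;
}

/*
 * Updater: synchronize_sched() waits for all pre-existing IRQ-disabled
 * (and preemption-disabled) regions to complete before the free.
 */
static void remove_data(void)
{
	struct my_data *p;

	mutex_lock(&update_lock);
	p = rcu_dereference_protected(global_ptr,
				      lockdep_is_held(&update_lock));
	RCU_INIT_POINTER(global_ptr, NULL);
	mutex_unlock(&update_lock);

	synchronize_sched();	/* All pre-existing readers are now done. */
	kfree(p);
}

The tradeoff is that the update side now blocks for an RCU-sched grace
period rather than a normal RCU grace period, in exchange for readers
that pay only the cost of disabling and re-enabling IRQs.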
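
And, reusing the made-up global_ptr and update_lock from the sketch
above, here is one way an updater might use the synchronize_rcu_mult()
call quoted in the message when its readers may be using rcu_read_lock(),
BH disabling, or IRQ/preemption disabling interchangeably:

/*
 * Updater sketch: wait concurrently for RCU, RCU-bh, and RCU-sched grace
 * periods before freeing, so that readers of all three flavors are covered.
 */
static void remove_data_all_flavors(void)
{
	struct my_data *p;

	mutex_lock(&update_lock);
	p = rcu_dereference_protected(global_ptr,
				      lockdep_is_held(&update_lock));
	RCU_INIT_POINTER(global_ptr, NULL);
	mutex_unlock(&update_lock);

	synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched);
	kfree(p);
}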