From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758660AbdELSOG (ORCPT ); Fri, 12 May 2017 14:14:06 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:50705 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1757145AbdELSOE (ORCPT ); Fri, 12 May 2017 14:14:04 -0400 Date: Fri, 12 May 2017 11:13:57 -0700 From: "Paul E. McKenney" To: Steven Rostedt Cc: linux-kernel@vger.kernel.org, Thomas Gleixner , Peter Zijlstra , Ingo Molnar , Mathieu Desnoyers , Masami Hiramatsu Subject: Re: [RFC][PATCH 0/5] perf/tracing/cpuhotplug: Fix locking order Reply-To: paulmck@linux.vnet.ibm.com References: <20170512171544.100715273@goodmis.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170512171544.100715273@goodmis.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 17051218-0048-0000-0000-000001804AC9 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007051; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000211; SDB=6.00859613; UDB=6.00426121; IPR=6.00639196; BA=6.00005345; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00015430; XFM=3.00000015; UTC=2017-05-12 18:14:00 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17051218-0049-0000-0000-0000411AF431 Message-Id: <20170512181357.GW3956@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-05-12_10:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000 definitions=main-1705120340 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 12, 2017 at 01:15:44PM -0400, Steven Rostedt wrote: > NOTE: This was quickly written. The change logs and patches probably need > some loving. This is for discussion. These may become legitimate patches, > but for now, I'm seeing if this is an acceptable solution. > > Also note, I checked out the last branch that I had Linus pull, and then > merged with tip's commit b53e5129~1, which is the commit just before > "trace/perf: Cure hotplug lock ordering issues" in tip/smp/hotplug > that placed get_online_cpus() ahead of event_mutex. The issue is that > event_mutex is a high level mutex, and trying to make sure that > get_online_cpus() is held whenever it is held and is going to be a > nightmare. So I started ahead of that commit, but I needed changes > in my own repo to do full testing. I renamed this temp branch to > "tip/cpuhotplug", and you can pull it as well. But this is not going > to be a candidate for pushing to Linus. > > The 5 patches here are: > > 1) fix a lockdep splat when taking stack trace in irqsoff tracing > (one of my tests triggered this) > > 2) Allow for get_online_cpus() to nest > > 3) Have kprobes take get_online_cpus() before taking jump label lock. > > 4) Have tracepoints always take get_online_cpus > > 5) Have perf take event_mutex before taking get_online_cpus() > > Thoughts? First a thought on the use of get_online_cpus()... Your mileage may vary, but from what I have seen, life is very good if your piece of the kernel uses get_online_cpus(), but refrains from using the CPU-hotplug notifiers. Life is also very good if your piece of the kernel uses CPU-hotplug notifiers, but refrains from using get_online_cpus(). If your code uses both get_online_cpus() -and- CPU-hotplug notifiers for the same data structures, life can be quite complex and difficult. For RCU, I cannot avoid using notifiers, so I have been removing uses of get_online_cpus(). Removing each of these has made large numbers of race conditions with CPU hotplug simply disappear -- not much reduction in lines of code, but big reductions in complexity and in state space. I have one more to go in _rcu_barrier(). Again, your mileage may vary. Thanx, Paul > -- Steve > > > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git > tip/cpuhotplug > > Head SHA1: ddb06e08fa10ed5030285267a9b1e25d40c337c8 > > > Steven Rostedt (VMware) (5): > tracing: Make sure RCU is watching before calling a stack trace > cpu-hotplug: Allow get_online_cpus() to nest > kprobes: Take get_online_cpus() before taking jump_label_lock() > tracepoints: Grab get_online_cpus() before taking tracepoints_mutex > perf: Grab event_mutex before taking get_online_cpus() > > ---- > include/linux/sched.h | 1 + > include/linux/trace_events.h | 2 ++ > kernel/cpu.c | 9 +++++++++ > kernel/events/core.c | 9 +++++++++ > kernel/fork.c | 1 + > kernel/kprobes.c | 6 +++--- > kernel/trace/trace.c | 5 +++++ > kernel/trace/trace.h | 1 - > kernel/trace/trace_event_perf.c | 8 ++++---- > kernel/tracepoint.c | 5 +++++ > 10 files changed, 39 insertions(+), 8 deletions(-) >