From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756199AbdETQnB (ORCPT ); Sat, 20 May 2017 12:43:01 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:45089 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752608AbdETQm7 (ORCPT ); Sat, 20 May 2017 12:42:59 -0400 Date: Sat, 20 May 2017 09:42:54 -0700 From: "Paul E. McKenney" To: Konstantin Khlebnikov Cc: linux-kernel@vger.kernel.org, Peter Zijlstra , Lai Jiangshan , Josh Triplett , Steven Rostedt , Ingo Molnar , Mathieu Desnoyers Subject: Re: [PATCH] rcu: mark debug_lockdep_rcu_enabled() as pure Reply-To: paulmck@linux.vnet.ibm.com References: <149517743964.33034.3209718816521589307.stgit@buzz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <149517743964.33034.3209718816521589307.stgit@buzz> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 17052016-0056-0000-0000-0000036B940E X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007091; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000212; SDB=6.00863126; UDB=6.00428331; IPR=6.00642896; BA=6.00005362; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00015522; XFM=3.00000015; UTC=2017-05-20 16:42:57 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17052016-0057-0000-0000-000007A1C48E Message-Id: <20170520164254.GF3956@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-05-20_05:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=18 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000 definitions=main-1705200086 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 19, 2017 at 10:03:59AM +0300, Konstantin Khlebnikov wrote: > This allows to get rid of unneeded invocations. > > Function debug_lockdep_rcu_enabled() becomes really hot if several > debug options are enabled together with CONFIG_PROVE_RCU. > > Hottest path ends with: > debug_lockdep_rcu_enabled > is_ftrace_trampoline > __kernel_text_address > > Here debug_lockdep_rcu_enabled() is called from condition > (debug_lockdep_rcu_enabled() && !__warned && (c)) inside macro > do_for_each_ftrace_op(), where "c" is false. > > With this patch "netperf -H localhost" shows boost from 2400 to 2500. > > Signed-off-by: Konstantin Khlebnikov Nice performance increase! The gcc documentation says that __attribute__((pure)) functions are supposed to have return values that depend only at the function's arguments and the values of global variables. However, it also says: Interesting non-pure functions are functions with infinite loops or those depending on volatile memory or other system resource, that may change between two consecutive calls (such as feof in a multithreading environment). This is OK for current->lockdep_recursion because this variable is changed only by the current task (I think so, anyway). It is sort of OK for debug_locks. This could be set to zero at any time by any other task, but if we have a race condition that very rarely causes two lockdep splats instead of just one, so what? (But I am sure that some of the people on CC will correct me if I am wrong here.) It should be OK for rcu_scheduler_active because the transition from RCU_SCHEDULER_INACTIVE to RCU_SCHEDULER_INIT happens before the first context switch, and the various barrier() calls, implied as well as explicit, should keep things straight. But I don't totally trust my analysis. Could you please get someone more gcc-savvy to review this and give their ack/review? Given that, I will queue it. Thanx, Paul > --- > include/linux/rcupdate.h | 2 +- > kernel/rcu/update.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > index e1e5d002fdb9..9ecb3cb715bd 100644 > --- a/include/linux/rcupdate.h > +++ b/include/linux/rcupdate.h > @@ -472,7 +472,7 @@ extern struct lockdep_map rcu_lock_map; > extern struct lockdep_map rcu_bh_lock_map; > extern struct lockdep_map rcu_sched_lock_map; > extern struct lockdep_map rcu_callback_map; > -int debug_lockdep_rcu_enabled(void); > +int __pure debug_lockdep_rcu_enabled(void); > > int rcu_read_lock_held(void); > int rcu_read_lock_bh_held(void); > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c > index 273e869ca21d..a0c30abefdcd 100644 > --- a/kernel/rcu/update.c > +++ b/kernel/rcu/update.c > @@ -292,7 +292,7 @@ struct lockdep_map rcu_callback_map = > STATIC_LOCKDEP_MAP_INIT("rcu_callback", &rcu_callback_key); > EXPORT_SYMBOL_GPL(rcu_callback_map); > > -int notrace debug_lockdep_rcu_enabled(void) > +int __pure notrace debug_lockdep_rcu_enabled(void) > { > return rcu_scheduler_active != RCU_SCHEDULER_INACTIVE && debug_locks && > current->lockdep_recursion == 0; >