From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751753AbdG0R5x (ORCPT ); Thu, 27 Jul 2017 13:57:53 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:56509 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751551AbdG0R5w (ORCPT ); Thu, 27 Jul 2017 13:57:52 -0400 Date: Thu, 27 Jul 2017 10:57:47 -0700 From: "Paul E. McKenney" To: Konstantin Khlebnikov Cc: linux-kernel@vger.kernel.org, Peter Zijlstra , Lai Jiangshan , Josh Triplett , Steven Rostedt , Ingo Molnar , Mathieu Desnoyers Subject: Re: [PATCH] rcu: mark debug_lockdep_rcu_enabled() as pure Reply-To: paulmck@linux.vnet.ibm.com References: <149517743964.33034.3209718816521589307.stgit@buzz> <20170520164254.GF3956@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 17072717-0056-0000-0000-000003AD9E25 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007436; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000214; SDB=6.00893737; UDB=6.00446840; IPR=6.00673896; BA=6.00005495; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00016410; XFM=3.00000015; UTC=2017-07-27 17:57:49 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17072717-0057-0000-0000-000007E3BE19 Message-Id: <20170727175747.GI3730@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-07-27_10:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=18 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1706020000 definitions=main-1707270279 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, May 21, 2017 at 11:51:39AM +0300, Konstantin Khlebnikov wrote: > On 20.05.2017 19:42, Paul E. McKenney wrote: > >On Fri, May 19, 2017 at 10:03:59AM +0300, Konstantin Khlebnikov wrote: > >>This allows to get rid of unneeded invocations. > >> > >>Function debug_lockdep_rcu_enabled() becomes really hot if several > >>debug options are enabled together with CONFIG_PROVE_RCU. > >> > >>Hottest path ends with: > >> debug_lockdep_rcu_enabled > >> is_ftrace_trampoline > >> __kernel_text_address > >> > >>Here debug_lockdep_rcu_enabled() is called from condition > >>(debug_lockdep_rcu_enabled() && !__warned && (c)) inside macro > >>do_for_each_ftrace_op(), where "c" is false. > >> > >>With this patch "netperf -H localhost" shows boost from 2400 to 2500. > >> > >>Signed-off-by: Konstantin Khlebnikov > > > >Nice performance increase! > > > >The gcc documentation says that __attribute__((pure)) functions are > >supposed to have return values that depend only at the function's > >arguments and the values of global variables. However, it also says: > > > > Interesting non-pure functions are functions with infinite loops > > or those depending on volatile memory or other system resource, > > that may change between two consecutive calls (such as feof in > > a multithreading environment). > > > >This is OK for current->lockdep_recursion because this variable is changed > >only by the current task (I think so, anyway). > > > >It is sort of OK for debug_locks. This could be set to zero at any time > >by any other task, but if we have a race condition that very rarely causes > >two lockdep splats instead of just one, so what? (But I am sure that > >some of the people on CC will correct me if I am wrong here.) > > > >It should be OK for rcu_scheduler_active because the transition from > >RCU_SCHEDULER_INACTIVE to RCU_SCHEDULER_INIT happens before the first > >context switch, and the various barrier() calls, implied as well as > >explicit, should keep things straight. > > > >But I don't totally trust my analysis. Could you please get someone > >more gcc-savvy to review this and give their ack/review? Given that, > >I will queue it. > > > > Thanx, Paul > > Thank you for analisys. > > __attribute__((pure)) allows compiler to deduplicate / eliminate calls. > This indeed might expand race windows when global switches like debug_locks > changes their state. But strict synchronization here isn't required. Any objections to this patch? Thanx, Paul > >>--- > >> include/linux/rcupdate.h | 2 +- > >> kernel/rcu/update.c | 2 +- > >> 2 files changed, 2 insertions(+), 2 deletions(-) > >> > >>diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > >>index e1e5d002fdb9..9ecb3cb715bd 100644 > >>--- a/include/linux/rcupdate.h > >>+++ b/include/linux/rcupdate.h > >>@@ -472,7 +472,7 @@ extern struct lockdep_map rcu_lock_map; > >> extern struct lockdep_map rcu_bh_lock_map; > >> extern struct lockdep_map rcu_sched_lock_map; > >> extern struct lockdep_map rcu_callback_map; > >>-int debug_lockdep_rcu_enabled(void); > >>+int __pure debug_lockdep_rcu_enabled(void); > >> > >> int rcu_read_lock_held(void); > >> int rcu_read_lock_bh_held(void); > >>diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c > >>index 273e869ca21d..a0c30abefdcd 100644 > >>--- a/kernel/rcu/update.c > >>+++ b/kernel/rcu/update.c > >>@@ -292,7 +292,7 @@ struct lockdep_map rcu_callback_map = > >> STATIC_LOCKDEP_MAP_INIT("rcu_callback", &rcu_callback_key); > >> EXPORT_SYMBOL_GPL(rcu_callback_map); > >> > >>-int notrace debug_lockdep_rcu_enabled(void) > >>+int __pure notrace debug_lockdep_rcu_enabled(void) > >> { > >> return rcu_scheduler_active != RCU_SCHEDULER_INACTIVE && debug_locks && > >> current->lockdep_recursion == 0; > >> > > >