From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751777AbdANJb2 (ORCPT ); Sat, 14 Jan 2017 04:31:28 -0500 Received: from mail-wm0-f66.google.com ([74.125.82.66]:34949 "EHLO mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751524AbdANJbZ (ORCPT ); Sat, 14 Jan 2017 04:31:25 -0500 Date: Sat, 14 Jan 2017 10:31:15 +0100 From: Ingo Molnar To: "Paul E. McKenney" Cc: linux-kernel@vger.kernel.org, jiangshanlai@gmail.com, dipankar@in.ibm.com, akpm@linux-foundation.org, mathieu.desnoyers@efficios.com, josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com, fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com, Lance Roy Subject: Re: [PATCH tip/core/rcu 1/3] srcu: More efficient reader counts. Message-ID: <20170114093115.GA14970@gmail.com> References: <20170114091941.GA22961@linux.vnet.ibm.com> <1484385601-23379-1-git-send-email-paulmck@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1484385601-23379-1-git-send-email-paulmck@linux.vnet.ibm.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Noticed a few minor nits: * Paul E. McKenney wrote: > From: Lance Roy > > SRCU uses two per-cpu counters: a nesting counter to count the number of > active critical sections, and a sequence counter to ensure that the nesting > counters don't change while they are being added together in > srcu_readers_active_idx_check(). > > This patch instead uses per-cpu lock and unlock counters. Because the both > counters only increase and srcu_readers_active_idx_check() reads the unlock > counter before the lock counter, this achieves the same end without having > to increment two different counters in srcu_read_lock(). This also saves a > smp_mb() in srcu_readers_active_idx_check(). typo: s/Because the both counters Because both counters > > A possible problem with this patch is that it can only handle > ULONG_MAX - NR_CPUS simultaneous readers, whereas the old version could > handle up to ULONG_MAX. I don't think this is a problem! :-) > > Suggested-by: Mathieu Desnoyers > Signed-off-by: Lance Roy > Signed-off-by: Paul E. McKenney > Cc: Lai Jiangshan > Cc: Peter Zijlstra > --- > include/linux/srcu.h | 4 +- > kernel/rcu/rcutorture.c | 18 +++++++- > kernel/rcu/srcu.c | 117 ++++++++++++++++++------------------------------ > 3 files changed, 62 insertions(+), 77 deletions(-) > > diff --git a/include/linux/srcu.h b/include/linux/srcu.h > index dc8eb63c6568..0caea34d8c5f 100644 > --- a/include/linux/srcu.h > +++ b/include/linux/srcu.h > @@ -34,8 +34,8 @@ > #include > > struct srcu_struct_array { > - unsigned long c[2]; > - unsigned long seq[2]; > + unsigned long lock_count[2]; > + unsigned long unlock_count[2]; > }; > > struct rcu_batch { > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c > index 87c51225ceec..6e4fd7680c70 100644 > --- a/kernel/rcu/rcutorture.c > +++ b/kernel/rcu/rcutorture.c > @@ -564,10 +564,24 @@ static void srcu_torture_stats(void) > pr_alert("%s%s per-CPU(idx=%d):", > torture_type, TORTURE_FLAG, idx); > for_each_possible_cpu(cpu) { > + unsigned long l0, l1; > + unsigned long u0, u1; > long c0, c1; > + struct srcu_struct_array* counts = > + per_cpu_ptr(srcu_ctlp->per_cpu_ref, cpu); Please don't break the line to pacify checkpatch - if the line is too long then maybe split out the loop body into a helper function - but keeping it a bit longer than 80 cols is fine as well. > > - c0 = (long)per_cpu_ptr(srcu_ctlp->per_cpu_ref, cpu)->c[!idx]; > - c1 = (long)per_cpu_ptr(srcu_ctlp->per_cpu_ref, cpu)->c[idx]; > + u0 = counts->unlock_count[!idx]; > + u1 = counts->unlock_count[idx]; > + > + /* Make sure that a lock is always counted if the corresponding > + unlock is counted. */ > + smp_rmb(); That's not the standard kernel code comment style. > + > + l0 = counts->lock_count[!idx]; > + l1 = counts->lock_count[idx]; > + > + c0 = (long)(l0 - u0); > + c1 = (long)(l1 - u1); These type casts look unnecessary to me. > for_each_possible_cpu(cpu) { > - t = READ_ONCE(per_cpu_ptr(sp->per_cpu_ref, cpu)->seq[idx]); > + struct srcu_struct_array* cpu_counts = > + per_cpu_ptr(sp->per_cpu_ref, cpu); > + t = READ_ONCE(cpu_counts->lock_count[idx]); > sum += t; > for_each_possible_cpu(cpu) { > - t = READ_ONCE(per_cpu_ptr(sp->per_cpu_ref, cpu)->c[idx]); > + struct srcu_struct_array* cpu_counts = > + per_cpu_ptr(sp->per_cpu_ref, cpu); > + t = READ_ONCE(cpu_counts->unlock_count[idx]); > sum += t; These linebreak look ugly as well. Some abbreviation of types and variables might help: s/srcu_struct_array/srcu_array s/cpu_counts/cpuc ? > + * If the locks are the same as the unlocks, then there must of have > + * been no readers on this index at some time in between. This does not > + * mean that there are no more readers, as one could have read the > + * current index but have incremented the lock counter yet. > > + * Note that there can be at most NR_CPUS worth of readers using the old > + * index that haven't incremented ->lock_count[] yet. Therefore, the > + * sum of the ->lock_count[]s cannot increment enough times to overflow > + * and end up equal the sum of the ->unlock_count[]s, as long as there > + * are at most ULONG_MAX - NR_CPUS readers at a time. (Yes, this does > + * mean that systems having more than a billion or so CPUs need to be > + * 64-bit systems.) Therefore, the only way that the return values of > + * the two calls to srcu_readers_(un)lock_idx() can be equal is if there > + * are no active readers using this index. typo: s/must of have been no readers/ must have been no readers Also, maybe I'm misreading it, but shouldn't it be: s/as one could have read the current index but have incremented the lock counter yet. /as one could have read the current index but not have incremented the lock counter yet. ? Also, the title: srcu: More efficient reader counts. should have a verb and no full stop, i.e. something like: srcu: Implement more efficient reader counts Thanks, Ingo