Date: Fri, 11 Sep 2015 10:05:26 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Oleg Nesterov
Cc: Davidlohr Bueso, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/1] rcu_sync: Cleanup the CONFIG_PROVE_RCU checks
Message-ID: <20150911170526.GU4029@linux.vnet.ibm.com>
In-Reply-To: <20150911155901.GA27246@redhat.com>

On Fri, Sep 11, 2015 at 05:59:01PM +0200, Oleg Nesterov wrote:
> On 09/10, Paul E. McKenney wrote:
> >
> > On Thu, Sep 10, 2015 at 03:59:42PM +0200, Oleg Nesterov wrote:
> > > On 09/09, Paul E. McKenney wrote:
> > > >
> > > > This is obsolete, but its replacement is the same patch.
> > >
> > > fbe3b97183f84155d81e506b1aa7d2ce986f7a36 in linux-rcu.git#experimental
> > > I guess?
> > >
> > > > Oleg, Davidlohr, am I missing something on how percpu_rwsem or
> > > > locktorture work?
> > >
> > > No, I think the patch is fine. Thanks for doing this! I was going to
> > > send something like this change too.
> > > And in fact I am still thinking
> > > about another test which plays with rcu_sync only, but probably we
> > > need some cleanups first (and we need them anyway). I'll try to do
> > > this a bit later.
> >
> > I would welcome an rcu_sync-specific torture patch!
>
> I want it much more than you ;) I have already warned you, I'll send
> more rcu_sync patches. The current code is actually a very early draft
> which was written during the discussion with Peter long ago. I sent
> it unchanged because a) it was already reviewed and b) I tested it a
> bit in the past.
>
> We can greatly simplify this code and at the same time make it more
> useful. Actually I already have the patches. The first one removes
> rcu_sync->cb_state and gp_ops->sync(). This makes the state machine
> almost self-evident and allows other improvements. See the resulting
> (pseudo) code at the end.
>
> But again, I'll try very much to write the test before I send the patch.

That sounds very good! ;-)

> Until then, let me send this trivial cleanup. The CONFIG_PROVE_RCU
> code looks trivial but is IMO really annoying. And it is not complete,
> so let's document this at least. Plus rcu_lockdep_assert() looks more
> consistent.
>
> > > > +void torture_percpu_rwsem_init(void)
> > > > +{
> > > > +	BUG_ON(percpu_init_rwsem(&pcpu_rwsem));
> > > > +}
> > > > +
> > >
> > > Aha, we don't really need this... I mean we can use the static initialiser
> > > which can also be used by uprobes and cgroups. I'll try to send the patch
> > > tomorrow.
> >
> > Very good, please do!
>
> Hmm. I am a liar. I won't send this patch at least today.
>
> The change I had in mind is very simple,
>
>	#define DECLARE_PERCPU_RWSEM(sem)				\
>		static DEFINE_PER_CPU(unsigned int, sem##_counters);	\
>		struct percpu_rw_semaphore sem = {			\
>			.fast_read_ctr = &sem##_counters,		\
>			...						\
>		}
>
> and yes, uprobes and cgroups can use it.
>
> But somehow I missed that we can't use it to define a _static_ sem;
>
>	static DECLARE_PERCPU_RWSEM(sem);
>
> obviously won't work. And damn, I am ashamed to admit that I spent several
> hours trying to invent something but failed. Perhaps we can add two helpers,
> DECLARE_PERCPU_RWSEM_GLOBAL() and DECLARE_PERCPU_RWSEM_STATIC().

That is indeed what we do for SRCU for the same reason: DEFINE_SRCU()
and DEFINE_STATIC_SRCU(), but with a common __DEFINE_SRCU() doing the
actual work.

							Thanx, Paul

> Oleg.
>
> -------------------------------------------------------------------------------
> static const struct {
>	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
>	void (*wait)(void);	// TODO: remove this
> #ifdef CONFIG_PROVE_RCU
>	int (*held)(void);
> #endif
> } gp_ops[] = {
>	...
> };
>
> // COMMENT to explain these states
> enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
>
> #define rss_lock	gp_wait.lock
>
> // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> // XXX code must be removed when we split rcu_sync_enter() into start + wait
> // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>
> static void rcu_sync_func(struct rcu_head *rcu)
> {
>	struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
>	unsigned long flags;
>
>	BUG_ON(rsp->gp_state == GP_IDLE);
>	BUG_ON(rsp->gp_state == GP_PASSED);
>
>	spin_lock_irqsave(&rsp->rss_lock, flags);
>	if (rsp->gp_count) {
>		/*
>		 * COMMENT.
>		 */
>		rsp->gp_state = GP_PASSED;
>		wake_up_locked(&rsp->gp_wait);
>	} else if (rsp->gp_state == GP_REPLAY) {
>		/*
>		 * A new rcu_sync_exit() has happened; requeue the callback
>		 * to catch a later GP.
>		 */
>		rsp->gp_state = GP_EXIT;
>		gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
>	} else {
>		/*
>		 * We're at least a GP after rcu_sync_exit(); everybody will now
>		 * have observed the write side critical section. Let 'em rip!
>		 */
>		BUG_ON(rsp->gp_state == GP_ENTER);	// XXX
>		rsp->gp_state = GP_IDLE;
>	}
>	spin_unlock_irqrestore(&rsp->rss_lock, flags);
> }
>
> static void rcu_sync_call(struct rcu_sync *rsp)
> {
>	// TODO:
>	// This is called by might_sleep() code outside of ->rss_lock;
>	// we can avoid ->call() in some cases (say, rcu_blocking_is_gp())
>	gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
> }
>
> void rcu_sync_enter(struct rcu_sync *rsp)
> {
>	int gp_count, gp_state;
>
>	spin_lock_irq(&rsp->rss_lock);
>	gp_count = rsp->gp_count++;
>	gp_state = rsp->gp_state;
>	if (gp_state == GP_IDLE)
>		rsp->gp_state = GP_ENTER;
>	spin_unlock_irq(&rsp->rss_lock);
>
>	BUG_ON(gp_count != 0 && gp_state == GP_IDLE);
>	BUG_ON(gp_count == 0 && gp_state == GP_PASSED);
>	BUG_ON(gp_count == 0 && gp_state == GP_ENTER);	// XXX
>
>	if (gp_state == GP_IDLE)
>		rcu_sync_call(rsp);
>
>	wait_event(rsp->gp_wait, rsp->gp_state != GP_ENTER);
>	BUG_ON(rsp->gp_state < GP_PASSED);
> }
>
> void rcu_sync_exit(struct rcu_sync *rsp)
> {
>	bool need_call = false;
>
>	BUG_ON(rsp->gp_state == GP_IDLE);
>	BUG_ON(rsp->gp_state == GP_ENTER);	// XXX
>
>	spin_lock_irq(&rsp->rss_lock);
>	if (!--rsp->gp_count) {
>		if (rsp->gp_state == GP_PASSED) {
>			need_call = true;
>			rsp->gp_state = GP_EXIT;
>		} else if (rsp->gp_state == GP_EXIT) {
>			rsp->gp_state = GP_REPLAY;
>		}
>	}
>	spin_unlock_irq(&rsp->rss_lock);
>
>	// Comment to explain why we do not care if another enter()
>	// and perhaps even exit() comes after spin_unlock().
>	if (need_call)
>		rcu_sync_call(rsp);
> }
>
> void rcu_sync_dtor(struct rcu_sync *rsp)
> {
>	int gp_state;
>
>	BUG_ON(rsp->gp_count);
>	BUG_ON(rsp->gp_state == GP_ENTER);	// XXX
>	BUG_ON(rsp->gp_state == GP_PASSED);
>
>	spin_lock_irq(&rsp->rss_lock);
>	if (rsp->gp_state == GP_REPLAY)
>		rsp->gp_state = GP_EXIT;
>	gp_state = rsp->gp_state;
>	spin_unlock_irq(&rsp->rss_lock);
>
>	// TODO: add another wake_up_locked() into rcu_sync_func(),
>	// use wait_event + spin_lock_wait, remove gp_ops->wait().
>
>	if (gp_state != GP_IDLE) {
>		gp_ops[rsp->gp_type].wait();
>		BUG_ON(rsp->gp_state != GP_IDLE);
>	}
> }
>