From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753564AbbIKQBv (ORCPT ); Fri, 11 Sep 2015 12:01:51 -0400
Received: from mx1.redhat.com ([209.132.183.28]:48667 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753178AbbIKQBt (ORCPT ); Fri, 11 Sep 2015 12:01:49 -0400
Date: Fri, 11 Sep 2015 17:59:01 +0200
From: Oleg Nesterov
To: "Paul E. McKenney"
Cc: Davidlohr Bueso, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH 0/1] rcu_sync: Cleanup the CONFIG_PROVE_RCU checks
Message-ID: <20150911155901.GA27246@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On 09/10, Paul E. McKenney wrote:
>
> On Thu, Sep 10, 2015 at 03:59:42PM +0200, Oleg Nesterov wrote:
> > On 09/09, Paul E. McKenney wrote:
> > >
> > > This is obsolete, but its replacement is the same patch.
> >
> > fbe3b97183f84155d81e506b1aa7d2ce986f7a36 in linux-rcu.git#experimental
> > I guess?
> >
> > > Oleg, Davidlohr, am I missing something on how percpu_rwsem or
> > > locktorture work?
> >
> > No, I think the patch is fine. Thanks for doing this! I was going to
> > send something like this change too. And in fact I am still thinking
> > about another test which plays with rcu_sync only, but probably we
> > need some cleanups first (and we need them anyway). I'll try to do
> > this a bit later.
>
> I would welcome an rcu_sync-specific torture patch!

I want it much more than you ;) I have already warned you, I'll send
more rcu_sync patches. The current code is actually a very early draft
which was written during the discussion with Peter long ago. I sent it
unchanged because a) it was already reviewed and b) I tested it a bit
in the past.
We can greatly simplify this code and at the same time make it more
useful. Actually I already have the patches. The first one removes
rcu_sync->cb_state and gp_ops->sync(). This makes the state machine
almost self-obvious and allows other improvements. See the resulting
(pseudo) code at the end.

But again, I'll try very hard to write the test before I send the
patch. Until then, let me send this trivial cleanup. The
CONFIG_PROVE_RCU code looks trivial but is imo really annoying. And it
is not complete, so let's document this at least. Plus
rcu_lockdep_assert() looks more consistent.

> > > +void torture_percpu_rwsem_init(void)
> > > +{
> > > +	BUG_ON(percpu_init_rwsem(&pcpu_rwsem));
> > > +}
> > > +
> >
> > Aha, we don't really need this... I mean we can use the static
> > initialiser which can also be used by uprobes and cgroups. I'll try
> > to send the patch tomorrow.
>
> Very good, please do!

Hmm. I am a liar. I won't send this patch, at least not today. The
change I had in mind is very simple,

	#define DECLARE_PERCPU_RWSEM(sem)				\
		static DEFINE_PER_CPU(unsigned int, sem##_counters);	\
		struct percpu_rw_semaphore sem = {			\
			.fast_read_ctr	= &sem##_counters,		\
			...						\
		}

and yes, uprobes and cgroups can use it. But somehow I missed that we
can't use it to define a _static_ sem,

	static DECLARE_PERCPU_RWSEM(sem);

obviously won't work. And damn, I am embarrassed to admit that I spent
several hours trying to invent something but failed. Perhaps we can add
2 helpers, DECLARE_PERCPU_RWSEM_GLOBAL() and DECLARE_PERCPU_RWSEM_STATIC().

Oleg.

-------------------------------------------------------------------------------

static const struct {
	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
	void (*wait)(void);	// TODO: remove this
#ifdef CONFIG_PROVE_RCU
	int  (*held)(void);
#endif
} gp_ops[] = {
	...
};

// COMMENT to explain these states
enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };

#define	rss_lock	gp_wait.lock

// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
// XXX code must be removed when we split rcu_sync_enter() into start + wait
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

static void rcu_sync_func(struct rcu_head *rcu)
{
	struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
	unsigned long flags;

	BUG_ON(rsp->gp_state == GP_IDLE);
	BUG_ON(rsp->gp_state == GP_PASSED);

	spin_lock_irqsave(&rsp->rss_lock, flags);
	if (rsp->gp_count) {
		/*
		 * COMMENT.
		 */
		rsp->gp_state = GP_PASSED;
		wake_up_locked(&rsp->gp_wait);
	} else if (rsp->gp_state == GP_REPLAY) {
		/*
		 * A new rcu_sync_exit() has happened; requeue the callback
		 * to catch a later GP.
		 */
		rsp->gp_state = GP_EXIT;
		gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
	} else {
		/*
		 * We're at least a GP after rcu_sync_exit(); everybody will
		 * now have observed the write side critical section. Let
		 * 'em rip!
		 */
		BUG_ON(rsp->gp_state == GP_ENTER);	// XXX
		rsp->gp_state = GP_IDLE;
	}
	spin_unlock_irqrestore(&rsp->rss_lock, flags);
}

static void rcu_sync_call(struct rcu_sync *rsp)
{
	// TODO:
	// This is called by might_sleep() code outside of ->rss_lock,
	// we can avoid ->call() in some cases (say rcu_blocking_is_gp())
	gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
}

void rcu_sync_enter(struct rcu_sync *rsp)
{
	int gp_count, gp_state;

	spin_lock_irq(&rsp->rss_lock);
	gp_count = rsp->gp_count++;
	gp_state = rsp->gp_state;
	if (gp_state == GP_IDLE)
		rsp->gp_state = GP_ENTER;
	spin_unlock_irq(&rsp->rss_lock);

	BUG_ON(gp_count != 0 && gp_state == GP_IDLE);
	BUG_ON(gp_count == 0 && gp_state == GP_PASSED);
	BUG_ON(gp_count == 0 && gp_state == GP_ENTER);	// XXX

	if (gp_state == GP_IDLE)
		rcu_sync_call(rsp);

	wait_event(rsp->gp_wait, rsp->gp_state != GP_ENTER);
	BUG_ON(rsp->gp_state < GP_PASSED);
}

void rcu_sync_exit(struct rcu_sync *rsp)
{
	bool need_call = false;

	BUG_ON(rsp->gp_state == GP_IDLE);
	BUG_ON(rsp->gp_state == GP_ENTER);	// XXX

	spin_lock_irq(&rsp->rss_lock);
	if (!--rsp->gp_count) {
		if (rsp->gp_state == GP_PASSED) {
			need_call = true;
			rsp->gp_state = GP_EXIT;
		} else if (rsp->gp_state == GP_EXIT) {
			rsp->gp_state = GP_REPLAY;
		}
	}
	spin_unlock_irq(&rsp->rss_lock);

	// Comment to explain why we do not care if another enter()
	// and perhaps even exit() comes after spin_unlock().
	if (need_call)
		rcu_sync_call(rsp);
}

void rcu_sync_dtor(struct rcu_sync *rsp)
{
	int gp_state;

	BUG_ON(rsp->gp_count);
	BUG_ON(rsp->gp_state == GP_ENTER);	// XXX
	BUG_ON(rsp->gp_state == GP_PASSED);

	spin_lock_irq(&rsp->rss_lock);
	if (rsp->gp_state == GP_REPLAY)
		rsp->gp_state = GP_EXIT;
	gp_state = rsp->gp_state;
	spin_unlock_irq(&rsp->rss_lock);

	// TODO: add another wake_up_locked() into rcu_sync_func(),
	// use wait_event + spin_lock_wait, remove gp_ops->wait().
	if (gp_state != GP_IDLE) {
		gp_ops[rsp->gp_type].wait();
		BUG_ON(rsp->gp_state != GP_IDLE);
	}
}