From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: linux-doc@vger.kernel.org, peterz@infradead.org,
fweisbec@gmail.com, linux-kernel@vger.kernel.org,
mingo@kernel.org, linux-arch@vger.kernel.org,
linux@arm.linux.org.uk, xiaoguangrong@linux.vnet.ibm.com,
wangyun@linux.vnet.ibm.com, nikunj@linux.vnet.ibm.com,
linux-pm@vger.kernel.org, rusty@rustcorp.com.au,
rostedt@goodmis.org, rjw@sisk.pl, namhyung@kernel.org,
tglx@linutronix.de, linux-arm-kernel@lists.infradead.org,
netdev@vger.kernel.org, oleg@redhat.com, sbw@mit.edu,
tj@kernel.org, akpm@linux-foundation.org,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v5 05/45] percpu_rwlock: Make percpu-rwlocks IRQ-safe, optimally
Date: Fri, 8 Feb 2013 15:44:03 -0800
Message-ID: <20130208234403.GL2666@linux.vnet.ibm.com>
In-Reply-To: <20130122073400.13822.52336.stgit@srivatsabhat.in.ibm.com>
On Tue, Jan 22, 2013 at 01:04:11PM +0530, Srivatsa S. Bhat wrote:
> If interrupt handlers can also be readers, then one way to make per-CPU
> rwlocks safe is to disable interrupts on the reader side before trying
> to acquire the per-CPU rwlock, and to keep them disabled throughout the
> read-side critical section.
>
> The goal is to avoid cases such as:
>
> 1. writer is active and it holds the global rwlock for write
>
> 2. a regular reader comes in and marks itself as present (by incrementing
> its per-CPU refcount) before checking whether writer is active.
>
> 3. an interrupt hits the reader;
> [If it had not hit, the reader would have noticed that the writer is
> active and would have decremented its refcount and would have tried
> to acquire the global rwlock for read].
> Since the interrupt handler also happens to be a reader, it notices
> the non-zero refcount (which was due to the reader who got interrupted)
> and thinks that this is a nested read-side critical section and
> proceeds to take the fastpath, which is wrong. The interrupt handler
> should have noticed that the writer is active and taken the rwlock
> for read.
>
> So, disabling interrupts can help avoid this problem (at the cost of keeping
> interrupts disabled for the entire read-side critical section).
>
> But Oleg had a brilliant idea by which we can do much better than that:
> we can manage with disabling interrupts _just_ during the updates (writes to
> per-CPU refcounts) to safe-guard against races with interrupt handlers.
> Beyond that, we can keep the interrupts enabled and still be safe w.r.t
> interrupt handlers that can act as readers.
>
> Basically the idea is that we differentiate between the *part* of the
> per-CPU refcount that we use for reference counting vs the part that we use
> merely to make the writer wait for us to switch over to the right
> synchronization scheme.
>
> The scheme involves splitting the per-CPU refcounts into 2 parts:
> e.g., the lower 16 bits are used to track the nesting depth of the reader
> (a "nested-counter"), and the remaining (upper) bits are used merely to mark
> the presence of the reader.
>
> As long as the overall reader_refcnt is non-zero, the writer waits for the
> reader (assuming that the reader is still actively using per-CPU refcounts for
> synchronization).
>
> The reader first sets one of the higher bits to mark its presence, and then
> uses the lower 16 bits to manage the nesting depth. So, an interrupt handler
> coming in as illustrated above will be able to distinguish between "this is
> a nested read-side critical section" vs "we have merely marked our presence
> to make the writer wait for us to switch" by looking at the same refcount.
> Thus, it makes it unnecessary to keep interrupts disabled throughout the
> read-side critical section, despite having the possibility of interrupt
> handlers being readers themselves.
>
>
> Implement this logic and rename the locking functions appropriately, to
> reflect what they do.
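
The split-refcount scheme described in the changelog above can be sketched in
plain userspace C. This is only an illustration under stated assumptions: the
two macros are taken from the patch, but the helper functions (reader_nested,
writer_must_wait, mark_present, clear_present, enter_nested) are hypothetical
names invented here, and the refcount is an ordinary value rather than a
per-CPU variable.

```c
#include <assert.h>
#include <stdbool.h>

/* From the patch: bit 16 marks presence, the low 16 bits hold nesting depth. */
#define READER_PRESENT     (1UL << 16)
#define READER_REFCNT_MASK (READER_PRESENT - 1)

/* Hypothetical helper: true only if a read-side critical section is active. */
static bool reader_nested(unsigned long refcnt)
{
	return (refcnt & READER_REFCNT_MASK) != 0;
}

/* Hypothetical helper: the writer waits while any part of the count is set. */
static bool writer_must_wait(unsigned long refcnt)
{
	return refcnt != 0;
}

/* Hypothetical helper: announce presence without claiming a nesting level. */
static unsigned long mark_present(unsigned long refcnt)
{
	return refcnt + READER_PRESENT;
}

/* Hypothetical helper: withdraw the presence mark set by mark_present(). */
static unsigned long clear_present(unsigned long refcnt)
{
	assert(refcnt >= READER_PRESENT);
	return refcnt - READER_PRESENT;
}

/* Hypothetical helper: take one nesting level in the low 16 bits. */
static unsigned long enter_nested(unsigned long refcnt)
{
	assert((refcnt & READER_REFCNT_MASK) != READER_REFCNT_MASK);
	return refcnt + 1;
}
```

With these helpers, a count that has only been passed through mark_present()
makes writer_must_wait() true while reader_nested() stays false. That is the
distinction an interrupt handler relies on when it interrupts a reader that has
marked its presence but not yet checked for an active writer.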

One nit below.  The issues called out in the previous patch still seem
to me to apply.

							Thanx, Paul

> Based-on-idea-by: Oleg Nesterov <oleg@redhat.com>
> Cc: David Howells <dhowells@redhat.com>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
>
> include/linux/percpu-rwlock.h | 15 ++++++++++-----
> lib/percpu-rwlock.c | 41 +++++++++++++++++++++++++++--------------
> 2 files changed, 37 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/percpu-rwlock.h b/include/linux/percpu-rwlock.h
> index 6819bb8..856ba6b 100644
> --- a/include/linux/percpu-rwlock.h
> +++ b/include/linux/percpu-rwlock.h
> @@ -34,11 +34,13 @@ struct percpu_rwlock {
> rwlock_t global_rwlock;
> };
>
> -extern void percpu_read_lock(struct percpu_rwlock *);
> -extern void percpu_read_unlock(struct percpu_rwlock *);
> +extern void percpu_read_lock_irqsafe(struct percpu_rwlock *);
> +extern void percpu_read_unlock_irqsafe(struct percpu_rwlock *);
>
> -extern void percpu_write_lock(struct percpu_rwlock *);
> -extern void percpu_write_unlock(struct percpu_rwlock *);
> +extern void percpu_write_lock_irqsave(struct percpu_rwlock *,
> + unsigned long *flags);
> +extern void percpu_write_unlock_irqrestore(struct percpu_rwlock *,
> + unsigned long *flags);
>
> extern int __percpu_init_rwlock(struct percpu_rwlock *,
> const char *, struct lock_class_key *);
> @@ -68,11 +70,14 @@ extern void percpu_free_rwlock(struct percpu_rwlock *);
> __percpu_init_rwlock(pcpu_rwlock, #pcpu_rwlock, &rwlock_key); \
> })
>
> +#define READER_PRESENT (1UL << 16)
> +#define READER_REFCNT_MASK (READER_PRESENT - 1)
> +
> #define reader_uses_percpu_refcnt(pcpu_rwlock, cpu) \
> (ACCESS_ONCE(per_cpu(*((pcpu_rwlock)->reader_refcnt), cpu)))
>
> #define reader_nested_percpu(pcpu_rwlock) \
> - (__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) > 1)
> + (__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) & READER_REFCNT_MASK)
>
> #define writer_active(pcpu_rwlock) \
> (__this_cpu_read(*((pcpu_rwlock)->writer_signal)))
> diff --git a/lib/percpu-rwlock.c b/lib/percpu-rwlock.c
> index 992da5c..a8d177a 100644
> --- a/lib/percpu-rwlock.c
> +++ b/lib/percpu-rwlock.c
> @@ -62,19 +62,19 @@ void percpu_free_rwlock(struct percpu_rwlock *pcpu_rwlock)
> pcpu_rwlock->writer_signal = NULL;
> }
>
> -void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
> +void percpu_read_lock_irqsafe(struct percpu_rwlock *pcpu_rwlock)
> {
> preempt_disable();
>
> /* First and foremost, let the writer know that a reader is active */
> - this_cpu_inc(*pcpu_rwlock->reader_refcnt);
> + this_cpu_add(*pcpu_rwlock->reader_refcnt, READER_PRESENT);
>
> /*
> * If we are already using per-cpu refcounts, it is not safe to switch
> * the synchronization scheme. So continue using the refcounts.
> */
> if (reader_nested_percpu(pcpu_rwlock)) {
> - goto out;
> + this_cpu_inc(*pcpu_rwlock->reader_refcnt);

Hmmm...  If the reader is nested, it -doesn't- need the memory barrier at
the end of this function.  If there is lots of nesting, it might be
worth skipping that barrier on the nested path.

> } else {
> /*
> * The write to 'reader_refcnt' must be visible before we
> @@ -83,9 +83,19 @@ void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
> smp_mb(); /* Paired with smp_rmb() in sync_reader() */
>
> if (likely(!writer_active(pcpu_rwlock))) {
> - goto out;
> + this_cpu_inc(*pcpu_rwlock->reader_refcnt);
> } else {
> /* Writer is active, so switch to global rwlock. */
> +
> + /*
> + * While we are spinning on ->global_rwlock, an
> + * interrupt can hit us, and the interrupt handler
> + * might call this function. The distinction between
> + * READER_PRESENT and the refcnt helps ensure that the
> + * interrupt handler also takes this branch and spins
> + * on the ->global_rwlock, as long as the writer is
> + * active.
> + */
> read_lock(&pcpu_rwlock->global_rwlock);
>
> /*
> @@ -95,26 +105,27 @@ void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
> * back to per-cpu refcounts. (This also helps avoid
> * heterogeneous nesting of readers).
> */
> - if (writer_active(pcpu_rwlock))
> - this_cpu_dec(*pcpu_rwlock->reader_refcnt);
> - else
> + if (!writer_active(pcpu_rwlock)) {
> + this_cpu_inc(*pcpu_rwlock->reader_refcnt);
> read_unlock(&pcpu_rwlock->global_rwlock);
> + }
> }
> }
>
> -out:
> + this_cpu_sub(*pcpu_rwlock->reader_refcnt, READER_PRESENT);
> +
> /* Prevent reordering of any subsequent reads */
> smp_rmb();
> }
>
> -void percpu_read_unlock(struct percpu_rwlock *pcpu_rwlock)
> +void percpu_read_unlock_irqsafe(struct percpu_rwlock *pcpu_rwlock)
> {
> /*
> * We never allow heterogeneous nesting of readers. So it is trivial
> * to find out the kind of reader we are, and undo the operation
> * done by our corresponding percpu_read_lock().
> */
> - if (__this_cpu_read(*pcpu_rwlock->reader_refcnt)) {
> + if (reader_nested_percpu(pcpu_rwlock)) {
> this_cpu_dec(*pcpu_rwlock->reader_refcnt);
> smp_wmb(); /* Paired with smp_rmb() in sync_reader() */
> } else {
> @@ -184,7 +195,8 @@ static void sync_all_readers(struct percpu_rwlock *pcpu_rwlock)
> sync_reader(pcpu_rwlock, cpu);
> }
>
> -void percpu_write_lock(struct percpu_rwlock *pcpu_rwlock)
> +void percpu_write_lock_irqsave(struct percpu_rwlock *pcpu_rwlock,
> + unsigned long *flags)
> {
> /*
> * Tell all readers that a writer is becoming active, so that they
> @@ -192,10 +204,11 @@ void percpu_write_lock(struct percpu_rwlock *pcpu_rwlock)
> */
> announce_writer_active(pcpu_rwlock);
> sync_all_readers(pcpu_rwlock);
> - write_lock(&pcpu_rwlock->global_rwlock);
> + write_lock_irqsave(&pcpu_rwlock->global_rwlock, *flags);
> }
>
> -void percpu_write_unlock(struct percpu_rwlock *pcpu_rwlock)
> +void percpu_write_unlock_irqrestore(struct percpu_rwlock *pcpu_rwlock,
> + unsigned long *flags)
> {
> /*
> * Inform all readers that we are done, so that they can switch back
> @@ -203,6 +216,6 @@ void percpu_write_unlock(struct percpu_rwlock *pcpu_rwlock)
> * see it).
> */
> announce_writer_inactive(pcpu_rwlock);
> - write_unlock(&pcpu_rwlock->global_rwlock);
> + write_unlock_irqrestore(&pcpu_rwlock->global_rwlock, *flags);
> }
>
>
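
The nit about the nested reader's memory barrier could look roughly like the
following userspace model. It is a sketch under several assumptions, not the
patch author's code: the state is a single pair of ordinary variables standing
in for one CPU's reader_refcnt and writer_signal, the C11 fences are only rough
stand-ins for smp_mb(), smp_rmb() and smp_wmb(), the global-rwlock slow path is
not modelled, and the names model_read_lock, model_read_unlock and
trailing_barriers are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define READER_PRESENT     (1UL << 16)
#define READER_REFCNT_MASK (READER_PRESENT - 1)

/* Userspace stand-ins for one CPU's view of the lock state (assumptions). */
static unsigned long reader_refcnt;    /* models this CPU's reader_refcnt */
static bool writer_signal;             /* models this CPU's writer_signal */
static unsigned int trailing_barriers; /* counts trailing barriers issued */

static void model_read_lock(void)
{
	reader_refcnt += READER_PRESENT;

	if (reader_refcnt & READER_REFCNT_MASK) {
		/*
		 * Nested reader: take a nesting level, drop the presence
		 * mark, and return early so the trailing barrier is skipped.
		 */
		reader_refcnt++;
		reader_refcnt -= READER_PRESENT;
		return;
	}

	atomic_thread_fence(memory_order_seq_cst); /* stand-in for smp_mb() */

	/* The writer-active slow path is deliberately not modelled here. */
	assert(!writer_signal);
	reader_refcnt++;
	reader_refcnt -= READER_PRESENT;

	atomic_thread_fence(memory_order_acquire); /* stand-in for smp_rmb() */
	trailing_barriers++;
}

static void model_read_unlock(void)
{
	assert(reader_refcnt & READER_REFCNT_MASK);
	reader_refcnt--;
	atomic_thread_fence(memory_order_release); /* stand-in for smp_wmb() */
}
```

In this model, three nested calls to model_read_lock() issue the trailing
barrier only once, for the outermost acquisition. Whether the saving matters in
the real implementation depends on how deep and how frequent the nesting is.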