linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: linux-doc@vger.kernel.org, peterz@infradead.org,
	fweisbec@gmail.com, linux-kernel@vger.kernel.org,
	mingo@kernel.org, linux-arch@vger.kernel.org,
	linux@arm.linux.org.uk, xiaoguangrong@linux.vnet.ibm.com,
	wangyun@linux.vnet.ibm.com, paulmck@linux.vnet.ibm.com,
	nikunj@linux.vnet.ibm.com, linux-pm@vger.kernel.org,
	rusty@rustcorp.com.au, rostedt@goodmis.org, rjw@sisk.pl,
	namhyung@kernel.org, tglx@linutronix.de,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	oleg@redhat.com, sbw@mit.edu, tj@kernel.org,
	akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v5 01/45] percpu_rwlock: Introduce the global reader-writer lock backend
Date: Wed, 23 Jan 2013 01:11:55 +0530	[thread overview]
Message-ID: <50FEEB83.7030804@linux.vnet.ibm.com> (raw)
In-Reply-To: <20130122104506.32b4e581@nehalam.linuxnetplumber.net>

On 01/23/2013 12:15 AM, Stephen Hemminger wrote:
> On Tue, 22 Jan 2013 13:03:22 +0530
> "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> wrote:
> 
>> A straight-forward (and obvious) algorithm to implement Per-CPU Reader-Writer
>> locks can also lead to too many deadlock possibilities which can make it very
>> hard/impossible to use. This is explained in the example below, which helps
>> justify the need for a different algorithm to implement flexible Per-CPU
>> Reader-Writer locks.
>>
>> We can use global rwlocks as shown below safely, without fear of deadlocks:
>>
>> Readers:
>>
>>          CPU 0                                CPU 1
>>          ------                               ------
>>
>> 1.    spin_lock(&random_lock);             read_lock(&my_rwlock);
>>
>>
>> 2.    read_lock(&my_rwlock);               spin_lock(&random_lock);
>>
>>
>> Writer:
>>
>>          CPU 2:
>>          ------
>>
>>        write_lock(&my_rwlock);
>>
>>
>> We can observe that there is no possibility of deadlocks or circular locking
>> dependencies here. Its perfectly safe.
>>
>> Now consider a blind/straight-forward conversion of global rwlocks to per-CPU
>> rwlocks like this:
>>
>> The reader locks its own per-CPU rwlock for read, and proceeds.
>>
>> Something like: read_lock(per-cpu rwlock of this cpu);
>>
>> The writer acquires all per-CPU rwlocks for write and only then proceeds.
>>
>> Something like:
>>
>>   for_each_online_cpu(cpu)
>> 	write_lock(per-cpu rwlock of 'cpu');
>>
>>
>> Now let's say that for performance reasons, the above scenario (which was
>> perfectly safe when using global rwlocks) was converted to use per-CPU rwlocks.
>>
>>
>>          CPU 0                                CPU 1
>>          ------                               ------
>>
>> 1.    spin_lock(&random_lock);             read_lock(my_rwlock of CPU 1);
>>
>>
>> 2.    read_lock(my_rwlock of CPU 0);       spin_lock(&random_lock);
>>
>>
>> Writer:
>>
>>          CPU 2:
>>          ------
>>
>>       for_each_online_cpu(cpu)
>>         write_lock(my_rwlock of 'cpu');
>>
>>
>> Consider what happens if the writer begins his operation in between steps 1
>> and 2 at the reader side. It becomes evident that we end up in a (previously
>> non-existent) deadlock due to a circular locking dependency between the 3
>> entities, like this:
>>
>>
>> (holds              Waiting for
>>  random_lock) CPU 0 -------------> CPU 2  (holds my_rwlock of CPU 0
>>                                                for write)
>>                ^                   |
>>                |                   |
>>         Waiting|                   | Waiting
>>           for  |                   |  for
>>                |                   V
>>                 ------ CPU 1 <------
>>
>>                 (holds my_rwlock of
>>                  CPU 1 for read)
>>
>>
>>
>> So obviously this "straight-forward" way of implementing percpu rwlocks is
>> deadlock-prone. One simple measure for (or characteristic of) safe percpu
>> rwlock should be that if a user replaces global rwlocks with per-CPU rwlocks
>> (for performance reasons), he shouldn't suddenly end up in numerous deadlock
>> possibilities which never existed before. The replacement should continue to
>> remain safe, and perhaps improve the performance.
>>
>> Observing the robustness of global rwlocks in providing a fair amount of
>> deadlock safety, we implement per-CPU rwlocks as nothing but global rwlocks,
>> as a first step.
>>
>>
>> Cc: David Howells <dhowells@redhat.com>
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> 
> We got rid of brlock years ago, do we have to reintroduce it like this?
> The problem was that brlock caused starvation.
> 

Um? I still see it in include/linux/lglock.h and its users in fs/ directory.

BTW, I'm not advocating that everybody start converting their global reader-writer
locks to per-cpu rwlocks, because such a conversion probably won't make sense
in all scenarios.

The thing is, for CPU hotplug in particular, the "preempt_disable() at the reader;
stop_machine() at the writer" scheme had some very desirable properties at the
reader side (even though people might hate stop_machine() with all their
heart ;-)), namely : 

At the reader side:

o No need to hold locks to prevent CPU offline
o Extremely fast/optimized updates (the preempt count)
o No need for heavy memory barriers
o Extremely flexible nesting rules

So this made perfect sense at the reader for CPU hotplug, because it is expected
that CPU hotplug operations are very infrequent, and it is well-known that quite
a few atomic hotplug readers are in very hot paths. The problem was that the
stop_machine() at the writer was not only a little too heavy, but also inflicted
real-time latencies on the system because it needed cooperation from _all_ CPUs
synchronously, to take one CPU down.

So the idea is to get rid of stop_machine() without hurting the reader side.
And this scheme of per-cpu rwlocks comes close to ensuring that. (You can look
at the previous versions of this patchset [links given in cover letter] to see
what other schemes we hashed out before coming to this one).

The only reason I exposed this as a generic locking scheme was because Tejun
pointed out that, complex locking schemes implemented in individual subsystems
is not such a good idea. And also this comes at a time when per-cpu rwsemaphores
have just been introduced in the kernel and Oleg had ideas about converting the
cpu hotplug (sleepable) locking to use them.

Regards,
Srivatsa S. Bhat

  reply	other threads:[~2013-01-22 19:43 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-22  7:33 [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug Srivatsa S. Bhat
2013-01-22  7:33 ` [PATCH v5 01/45] percpu_rwlock: Introduce the global reader-writer lock backend Srivatsa S. Bhat
2013-01-22 18:45   ` Stephen Hemminger
2013-01-22 19:41     ` Srivatsa S. Bhat [this message]
2013-01-22 19:32   ` Steven Rostedt
2013-01-22 19:58     ` Srivatsa S. Bhat
2013-01-22 20:54       ` Steven Rostedt
2013-01-24  4:14     ` Michel Lespinasse
2013-01-24 15:58       ` Oleg Nesterov
2013-01-22  7:33 ` [PATCH v5 02/45] percpu_rwlock: Introduce per-CPU variables for the reader and the writer Srivatsa S. Bhat
2013-01-22  7:33 ` [PATCH v5 03/45] percpu_rwlock: Provide a way to define and init percpu-rwlocks at compile time Srivatsa S. Bhat
2013-01-22  7:33 ` [PATCH v5 04/45] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks Srivatsa S. Bhat
2013-01-23 18:55   ` Tejun Heo
2013-01-23 19:33     ` Srivatsa S. Bhat
2013-01-23 19:57       ` Tejun Heo
2013-01-24  4:30         ` Srivatsa S. Bhat
2013-01-29 11:12           ` Namhyung Kim
2013-02-08 22:47             ` Paul E. McKenney
2013-02-10 18:38               ` Srivatsa S. Bhat
2013-02-08 23:10   ` Paul E. McKenney
2013-02-10 18:06     ` Oleg Nesterov
2013-02-10 19:24       ` Srivatsa S. Bhat
2013-02-10 19:50         ` Oleg Nesterov
2013-02-10 20:09           ` Srivatsa S. Bhat
2013-02-10 22:13             ` Paul E. McKenney
2013-02-10 19:54       ` Paul E. McKenney
2013-02-12 16:15         ` Paul E. McKenney
2013-02-10 19:10     ` Srivatsa S. Bhat
2013-02-10 19:47       ` Paul E. McKenney
2013-02-10 19:57         ` Srivatsa S. Bhat
2013-02-10 20:13       ` Oleg Nesterov
2013-02-10 20:20         ` Srivatsa S. Bhat
2013-01-22  7:34 ` [PATCH v5 05/45] percpu_rwlock: Make percpu-rwlocks IRQ-safe, optimally Srivatsa S. Bhat
2013-02-08 23:44   ` Paul E. McKenney
2013-02-10 19:27     ` Srivatsa S. Bhat
2013-02-10 18:42   ` Oleg Nesterov
2013-02-10 19:30     ` Srivatsa S. Bhat
2013-01-22  7:34 ` [PATCH v5 06/45] percpu_rwlock: Allow writers to be readers, and add lockdep annotations Srivatsa S. Bhat
2013-02-08 23:47   ` Paul E. McKenney
2013-02-10 19:32     ` Srivatsa S. Bhat
2013-01-22  7:34 ` [PATCH v5 07/45] CPU hotplug: Provide APIs to prevent CPU offline from atomic context Srivatsa S. Bhat
2013-01-29 10:21   ` Namhyung Kim
2013-02-10 19:34     ` Srivatsa S. Bhat
2013-02-08 23:50   ` Paul E. McKenney
2013-01-22  7:35 ` [PATCH v5 08/45] CPU hotplug: Convert preprocessor macros to static inline functions Srivatsa S. Bhat
2013-02-08 23:51   ` Paul E. McKenney
2013-01-22  7:35 ` [PATCH v5 09/45] smp, cpu hotplug: Fix smp_call_function_*() to prevent CPU offline properly Srivatsa S. Bhat
2013-02-09  0:07   ` Paul E. McKenney
2013-02-10 19:41     ` Srivatsa S. Bhat
2013-02-10 19:56       ` Paul E. McKenney
2013-02-10 19:59         ` Srivatsa S. Bhat
2013-01-22  7:35 ` [PATCH v5 10/45] smp, cpu hotplug: Fix on_each_cpu_*() " Srivatsa S. Bhat
2013-01-22  7:35 ` [PATCH v5 11/45] sched/timer: Use get/put_online_cpus_atomic() to prevent CPU offline Srivatsa S. Bhat
2013-01-22  7:35 ` [PATCH v5 12/45] sched/migration: Use raw_spin_lock/unlock since interrupts are already disabled Srivatsa S. Bhat
2013-01-22  7:36 ` [PATCH v5 13/45] sched/rt: Use get/put_online_cpus_atomic() to prevent CPU offline Srivatsa S. Bhat
2013-01-22  7:36 ` [PATCH v5 14/45] rcu, CPU hotplug: Fix comment referring to stop_machine() Srivatsa S. Bhat
2013-02-09  0:14   ` Paul E. McKenney
2013-02-10 19:43     ` Srivatsa S. Bhat
2013-01-22  7:36 ` [PATCH v5 15/45] tick: Use get/put_online_cpus_atomic() to prevent CPU offline Srivatsa S. Bhat
2013-01-22  7:37 ` [PATCH v5 16/45] time/clocksource: " Srivatsa S. Bhat
2013-01-22  7:37 ` [PATCH v5 17/45] softirq: " Srivatsa S. Bhat
2013-01-22  7:38 ` [PATCH v5 18/45] irq: " Srivatsa S. Bhat
2013-01-22  7:38 ` [PATCH v5 19/45] net: " Srivatsa S. Bhat
2013-01-22  7:38 ` [PATCH v5 20/45] block: " Srivatsa S. Bhat
2013-01-22  7:38 ` [PATCH v5 21/45] crypto: pcrypt - Protect access to cpu_online_mask with get/put_online_cpus() Srivatsa S. Bhat
2013-01-22  7:39 ` [PATCH v5 22/45] infiniband: ehca: Use get/put_online_cpus_atomic() to prevent CPU offline Srivatsa S. Bhat
2013-01-22  7:39 ` [PATCH v5 23/45] [SCSI] fcoe: " Srivatsa S. Bhat
2013-01-22  7:39 ` [PATCH v5 24/45] staging: octeon: " Srivatsa S. Bhat
2013-01-22  7:39 ` [PATCH v5 25/45] x86: " Srivatsa S. Bhat
2013-01-22  7:39 ` [PATCH v5 26/45] perf/x86: " Srivatsa S. Bhat
2013-01-22  7:40 ` [PATCH v5 27/45] KVM: Use get/put_online_cpus_atomic() to prevent CPU offline from atomic context Srivatsa S. Bhat
2013-01-22  7:40 ` [PATCH v5 28/45] kvm/vmx: Use get/put_online_cpus_atomic() to prevent CPU offline Srivatsa S. Bhat
2013-01-22  7:40 ` [PATCH v5 29/45] x86/xen: " Srivatsa S. Bhat
2013-02-19 18:10   ` Konrad Rzeszutek Wilk
2013-02-19 18:29     ` Srivatsa S. Bhat
2013-01-22  7:41 ` [PATCH v5 30/45] alpha/smp: " Srivatsa S. Bhat
2013-01-22  7:41 ` [PATCH v5 31/45] blackfin/smp: " Srivatsa S. Bhat
2013-01-28  9:09   ` Bob Liu
2013-01-28 19:06     ` Tejun Heo
2013-01-29  1:14       ` Srivatsa S. Bhat
2013-01-22  7:41 ` [PATCH v5 32/45] cris/smp: " Srivatsa S. Bhat
2013-01-22  7:42 ` [PATCH v5 33/45] hexagon/smp: " Srivatsa S. Bhat
2013-01-22  7:42 ` [PATCH v5 34/45] ia64: " Srivatsa S. Bhat
2013-01-22  7:42 ` [PATCH v5 35/45] m32r: " Srivatsa S. Bhat
2013-01-22  7:42 ` [PATCH v5 36/45] MIPS: " Srivatsa S. Bhat
2013-01-22  7:43 ` [PATCH v5 37/45] mn10300: " Srivatsa S. Bhat
2013-01-22  7:43 ` [PATCH v5 38/45] parisc: " Srivatsa S. Bhat
2013-01-22  7:43 ` [PATCH v5 39/45] powerpc: " Srivatsa S. Bhat
2013-01-22  7:44 ` [PATCH v5 40/45] sh: " Srivatsa S. Bhat
2013-01-22  7:44 ` [PATCH v5 41/45] sparc: " Srivatsa S. Bhat
2013-01-22  7:44 ` [PATCH v5 42/45] tile: " Srivatsa S. Bhat
2013-01-22  7:44 ` [PATCH v5 43/45] cpu: No more __stop_machine() in _cpu_down() Srivatsa S. Bhat
2013-01-22  7:45 ` [PATCH v5 44/45] CPU hotplug, stop_machine: Decouple CPU hotplug from stop_machine() in Kconfig Srivatsa S. Bhat
2013-02-09  0:15   ` Paul E. McKenney
2013-02-10 19:45     ` Srivatsa S. Bhat
2013-01-22  7:45 ` [PATCH v5 45/45] Documentation/cpu-hotplug: Remove references to stop_machine() Srivatsa S. Bhat
2013-02-09  0:16   ` Paul E. McKenney
2013-02-04 13:47 ` [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug Srivatsa S. Bhat
2013-02-07  4:14   ` Rusty Russell
2013-02-07  6:11     ` Srivatsa S. Bhat
2013-02-08 15:41       ` Russell King - ARM Linux
2013-02-08 16:44         ` Srivatsa S. Bhat
2013-02-08 18:09           ` Srivatsa S. Bhat
2013-02-11 11:58             ` Vincent Guittot
2013-02-11 12:23               ` Srivatsa S. Bhat
2013-02-11 19:08                 ` Paul E. McKenney
2013-02-12  3:58                   ` Srivatsa S. Bhat
2013-02-15 13:28                     ` Vincent Guittot
2013-02-15 19:40                       ` Srivatsa S. Bhat
2013-02-18 10:24                         ` Vincent Guittot
2013-02-18 10:34                           ` Srivatsa S. Bhat
2013-02-18 10:51                             ` Srivatsa S. Bhat
2013-02-18 10:58                               ` Vincent Guittot
2013-02-18 15:30                                 ` Steven Rostedt
2013-02-18 16:50                                   ` Vincent Guittot
2013-02-18 19:53                                     ` Steven Rostedt
2013-02-18 19:53                                     ` Steven Rostedt
2013-02-19 10:33                                       ` Vincent Guittot
2013-02-18 10:54                             ` Thomas Gleixner
2013-02-18 10:57                               ` Srivatsa S. Bhat
2013-02-11 12:41 ` [PATCH v5 01/45] percpu_rwlock: Introduce the global reader-writer lock backend David Howells
2013-02-11 12:56   ` Srivatsa S. Bhat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50FEEB83.7030804@linux.vnet.ibm.com \
    --to=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=fweisbec@gmail.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nikunj@linux.vnet.ibm.com \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rjw@sisk.pl \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=sbw@mit.edu \
    --cc=stephen@networkplumber.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=wangyun@linux.vnet.ibm.com \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).