From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: tj@kernel.org, tglx@linutronix.de, peterz@infradead.org,
paulmck@linux.vnet.ibm.com, rusty@rustcorp.com.au,
mingo@kernel.org, akpm@linux-foundation.org, namhyung@kernel.org,
vincent.guittot@linaro.org, sbw@mit.edu,
amit.kucheria@linaro.org, rostedt@goodmis.org, rjw@sisk.pl,
wangyun@linux.vnet.ibm.com, xiaoguangrong@linux.vnet.ibm.com,
nikunj@linux.vnet.ibm.com, linux-pm@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 01/10] CPU hotplug: Provide APIs for "light" atomic readers to prevent CPU offline
Date: Fri, 07 Dec 2012 00:47:29 +0530
Message-ID: <50C0EF49.8050700@linux.vnet.ibm.com>
In-Reply-To: <50C0E88E.9050909@linux.vnet.ibm.com>
On 12/07/2012 12:18 AM, Srivatsa S. Bhat wrote:
> On 12/06/2012 09:48 PM, Oleg Nesterov wrote:
>> On 12/06, Srivatsa S. Bhat wrote:
>>>
>>> +void get_online_cpus_atomic(void)
>>> +{
>>> +        int c, old;
>>> +
>>> +        preempt_disable();
>>> +        read_lock(&hotplug_rwlock);
>>
>> Confused... Why it also takes hotplug_rwlock?
>
> To avoid ABBA deadlocks.
>
> hotplug_rwlock was meant for the "light" readers.
> The atomic counters were meant for the "heavy/full" readers.
> I wanted them to be able to nest in any manner they wanted,
> such as:
>
> Full inside light:
>
> get_online_cpus_atomic_light()
>     ...
>     get_online_cpus_atomic_full()
>     ...
>     put_online_cpus_atomic_full()
>     ...
> put_online_cpus_atomic_light()
>
> Or, light inside full:
>
> get_online_cpus_atomic_full()
>     ...
>     get_online_cpus_atomic_light()
>     ...
>     put_online_cpus_atomic_light()
>     ...
> put_online_cpus_atomic_full()
>
> To allow this, I made the two sets of APIs take the locks
> in the same order internally.
>
> (I had some more description of this logic in the changelog
> of 2/10; the only difference there is that instead of atomic
> counters, I used rwlocks for the full-readers as well.
> https://lkml.org/lkml/2012/12/5/320)
>
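To make the "same order internally" point above concrete, here is a
minimal sketch, assuming one rwlock per reader class (the lock names
and function bodies are illustrative reconstructions, not the posted
code):

static DEFINE_RWLOCK(light_hotplug_rwlock);     /* illustrative name */
static DEFINE_RWLOCK(full_hotplug_rwlock);      /* illustrative name */

void get_online_cpus_atomic_light(void)
{
        preempt_disable();
        read_lock(&light_hotplug_rwlock);
}

void get_online_cpus_atomic_full(void)
{
        preempt_disable();
        read_lock(&light_hotplug_rwlock);       /* same lock first... */
        read_lock(&full_hotplug_rwlock);        /* ...then the second */
}

Since both paths acquire in the fixed order light -> full (and the
put_*() counterparts release in reverse), light and full sections can
nest inside each other either way without creating an ABBA deadlock.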
One of the reasons I switched all the readers to global rwlocks
instead of per-cpu atomic counters was to avoid the lock-ordering
deadlocks that come with per-cpu locking.
E.g.:

        CPU 0                                   CPU 1
        ------                                  ------
1.      Acquire lock A                          Increment CPU1's atomic counter

2.      Increment CPU0's atomic counter         Try to acquire lock A
Now consider what happens if a hotplug writer (cpu_down) begins, and
starts looking at CPU0, trying to decrement its atomic counter, in
between steps 1 and 2.

The hotplug writer succeeds on CPU0 because CPU0 hasn't yet
incremented its counter. So now CPU0 spins, waiting for the hotplug
writer to reset the atomic counter.

When the hotplug writer looks at CPU1, it can't decrement the atomic
counter because CPU1 has already incremented it. So the hotplug writer
waits. CPU1 goes ahead, only to start spinning on lock A, which is
held by CPU0. So we end up in a deadlock due to a circular locking
dependency among the three entities.
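In (hypothetical) kernel-style C, the three contexts above look
roughly like this; lock_A, reader_refcnt and claim_counter_when_zero()
are made-up names, purely to illustrate the circular wait:

static DEFINE_PER_CPU(atomic_t, reader_refcnt); /* made-up */
static DEFINE_SPINLOCK(lock_A);                 /* made-up */

static void reader_on_cpu0(void)
{
        spin_lock(&lock_A);                     /* step 1 */
        /* Step 2: spins if the writer has already claimed this CPU's
         * counter, waiting for the writer to reset it. */
        atomic_inc(this_cpu_ptr(&reader_refcnt));
}

static void reader_on_cpu1(void)
{
        atomic_inc(this_cpu_ptr(&reader_refcnt)); /* step 1: succeeds */
        spin_lock(&lock_A);     /* step 2: blocks, CPU0 holds lock A */
}

static void hotplug_writer(void)
{
        unsigned int cpu;

        for_each_online_cpu(cpu) {
                /* Claims CPU0's counter (still zero), making CPU0's
                 * atomic_inc() above wait; then blocks forever on
                 * CPU1's raised counter.  CPU1 meanwhile waits on
                 * lock A, held by CPU0: a 3-way circular dependency. */
                claim_counter_when_zero(cpu);   /* made-up helper */
        }
}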
One way to deal with this would be for the writer to abort its loop
of atomic_dec()s on the per-cpu counters whenever it has to wait. But
that might prove too wasteful (and overly paranoid) in practice.
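A sketch of that alternative (try_claim_counter() and
release_claimed_counters() are made-up helpers): the writer backs off
completely whenever it finds any CPU's counter raised, instead of
waiting on it.

static void hotplug_writer_with_backoff(void)
{
        unsigned int cpu;

retry:
        for_each_online_cpu(cpu) {
                if (!try_claim_counter(cpu)) {
                        /* Undo every claim taken so far and start
                         * over, rather than waiting and risking the
                         * circular dependency described above. */
                        release_claimed_counters();
                        cpu_relax();
                        goto retry;
                }
        }
}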
So instead, with global locks (like the global rwlocks I finally used
when posting this v2), we avoid such messy issues entirely.
And why exactly was I concerned about all this? Because the current
code uses the extremely flexible preempt_disable()/preempt_enable()
pair, which imposes absolutely no ordering restrictions, and the
existing code probably _depends_ on that. So if the new APIs that
replace preempt_disable()/preempt_enable() started imposing ordering
restrictions, I feared they might become unusable. Hence the global
rwlocks, trading cache-line bouncing (on the global lock) for
lock-safety and flexibility.
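For completeness, a minimal sketch of the global-rwlock scheme: the
reader side matches the snippet quoted at the top of this mail, while
the writer side is a reconstruction of what the hotplug path would do,
not the posted patch.

static DEFINE_RWLOCK(hotplug_rwlock);

void get_online_cpus_atomic(void)
{
        preempt_disable();
        read_lock(&hotplug_rwlock);     /* one global lock: readers
                                         * impose no ordering on each
                                         * other across CPUs */
}

void put_online_cpus_atomic(void)
{
        read_unlock(&hotplug_rwlock);
        preempt_enable();
}

/* Reconstruction: the writer takes the same lock for writing, so it
 * simply waits for all readers to drain; there is no per-CPU scan and
 * hence no circular dependency to construct. */
static void disable_atomic_hotplug_readers(void)
{
        write_lock(&hotplug_rwlock);
}

static void enable_atomic_hotplug_readers(void)
{
        write_unlock(&hotplug_rwlock);
}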
Regards,
Srivatsa S. Bhat