* [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus()
@ 2009-05-29 8:29 Lai Jiangshan
2009-05-29 20:31 ` Andrew Morton
0 siblings, 1 reply; 6+ messages in thread
From: Lai Jiangshan @ 2009-05-29 8:29 UTC (permalink / raw)
To: Andrew Morton, Rusty Russell, Ingo Molnar, Paul E. McKenney, LKML
get_online_cpus() is a typically coarsely granular lock.
It's a source of ABBA deadlock.
Thanks to the CPU notifiers, Some subsystem's global lock will
be required after cpu_hotplug.rwlock. Subsystem's global lock
is coarsely granular lock too, thus a lot's of lock in kernel
should be required after cpu_hotplug.rwlock(if we need
cpu_hotplug.rwlock held too)
Otherwise it may come to a ABBA deadlock like this:
thread 1 | thread 2
_cpu_down() | Lock a-kernel-lock.
cpu_hotplug_begin() |
down_write(&cpu_hotplug.rwlock) |
__raw_notifier_call_chain(CPU_DOWN_PREPARE) | get_online_cpus()
------------------------------------------------------------------------
Lock a-kernel-lock.(wait thread2) | down_read(&cpu_hotplug.rwlock)
(wait thread 1)
But CPU online/offline are happened very rarely, get_online_cpus()
returns success quickly in all probability.
So it's an asinine behavior that get_online_cpus() is not allowed
to be required after we had held "a-kernel-lock".
To dispel the ABBA deadlock, this patch introduces
try_get_online_cpus(). It returns fail very rarely. It gives the
caller a chance to select an alternative way to finish works,
instead of sleeping or deadlock.
Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 2643d84..98f5c4b 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -104,6 +104,7 @@ extern struct sysdev_class cpu_sysdev_class;
extern void get_online_cpus(void);
extern void put_online_cpus(void);
+extern int try_get_online_cpus(void);
#define hotcpu_notifier(fn, pri) { \
static struct notifier_block fn##_nb __cpuinitdata = \
{ .notifier_call = fn, .priority = pri }; \
@@ -117,6 +118,7 @@ int cpu_down(unsigned int cpu);
#define get_online_cpus() do { } while (0)
#define put_online_cpus() do { } while (0)
+#define try_get_online_cpus() (1)
#define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 62198ec..e948f19 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -66,6 +66,15 @@ void put_online_cpus(void)
}
EXPORT_SYMBOL_GPL(put_online_cpus);
+int try_get_online_cpus(void)
+{
+ might_sleep();
+ if (cpu_hotplug.active_writer == current)
+ return 1;
+ return down_read_trylock(&cpu_hotplug.rwlock);
+
+}
+EXPORT_SYMBOL_GPL(try_get_online_cpus);
#endif /* CONFIG_HOTPLUG_CPU */
/*
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus()
2009-05-29 8:29 [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus() Lai Jiangshan
@ 2009-05-29 20:31 ` Andrew Morton
2009-06-01 2:42 ` Lai Jiangshan
2009-06-01 7:31 ` Rusty Russell
0 siblings, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2009-05-29 20:31 UTC (permalink / raw)
To: Lai Jiangshan
Cc: rusty, mingo, paulmck, linux-kernel, Oleg Nesterov,
Linus Torvalds
On Fri, 29 May 2009 16:29:34 +0800
Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
>
> get_online_cpus() is a typically coarsely granular lock.
> It's a source of ABBA deadlock.
>
> Thanks to the CPU notifiers, Some subsystem's global lock will
> be required after cpu_hotplug.rwlock. Subsystem's global lock
> is coarsely granular lock too, thus a lot's of lock in kernel
> should be required after cpu_hotplug.rwlock(if we need
> cpu_hotplug.rwlock held too)
>
> Otherwise it may come to a ABBA deadlock like this:
>
> thread 1 | thread 2
> _cpu_down() | Lock a-kernel-lock.
> cpu_hotplug_begin() |
> down_write(&cpu_hotplug.rwlock) |
> __raw_notifier_call_chain(CPU_DOWN_PREPARE) | get_online_cpus()
> ------------------------------------------------------------------------
> Lock a-kernel-lock.(wait thread2) | down_read(&cpu_hotplug.rwlock)
> (wait thread 1)
>
> But CPU online/offline are happened very rarely, get_online_cpus()
> returns success quickly in all probability.
> So it's an asinine behavior that get_online_cpus() is not allowed
> to be required after we had held "a-kernel-lock".
>
> To dispel the ABBA deadlock, this patch introduces
> try_get_online_cpus(). It returns fail very rarely. It gives the
> caller a chance to select an alternative way to finish works,
> instead of sleeping or deadlock.
>
We really really really don't want to add new trylocks. They're nasty
things, requiring that callers provide alternative fallback code paths
which are really hard to test. The original code writer _might_ have
performed runtime testing of the contention path when originally
writing the patch, but after that nobody will tesst that path at all
and there's a good chance that it will become broken over time and
nobody will know about it.
I assume that one advantage of your rwlock conversion patch is that
code paths such as the above are now checked by lockdep, yes? And this
is how you discovered the bug in some subsystem whcih you didn't tell
us about?
> ---
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index 2643d84..98f5c4b 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -104,6 +104,7 @@ extern struct sysdev_class cpu_sysdev_class;
>
> extern void get_online_cpus(void);
> extern void put_online_cpus(void);
> +extern int try_get_online_cpus(void);
> #define hotcpu_notifier(fn, pri) { \
> static struct notifier_block fn##_nb __cpuinitdata = \
> { .notifier_call = fn, .priority = pri }; \
> @@ -117,6 +118,7 @@ int cpu_down(unsigned int cpu);
>
> #define get_online_cpus() do { } while (0)
> #define put_online_cpus() do { } while (0)
> +#define try_get_online_cpus() (1)
> #define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
> /* These aren't inline functions due to a GCC bug. */
> #define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 62198ec..e948f19 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -66,6 +66,15 @@ void put_online_cpus(void)
> }
> EXPORT_SYMBOL_GPL(put_online_cpus);
>
> +int try_get_online_cpus(void)
> +{
> + might_sleep();
> + if (cpu_hotplug.active_writer == current)
> + return 1;
> + return down_read_trylock(&cpu_hotplug.rwlock);
> +
> +}
> +EXPORT_SYMBOL_GPL(try_get_online_cpus);
It's strange to add a might_sleep() to a function which doesn't sleep.
The patch adds no callers to this function. This is significant
because it would be quite interesting to find out which subsystem(s)
you've found to have this deadlock. I do think that we should look at
alternative (non-trylocky) ways of fixing them.
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus()
2009-05-29 20:31 ` Andrew Morton
@ 2009-06-01 2:42 ` Lai Jiangshan
2009-06-01 7:31 ` Rusty Russell
1 sibling, 0 replies; 6+ messages in thread
From: Lai Jiangshan @ 2009-06-01 2:42 UTC (permalink / raw)
To: Andrew Morton
Cc: rusty, mingo, paulmck, linux-kernel, Oleg Nesterov,
Linus Torvalds, Gautham R Shenoy
Andrew Morton wrote:
>
>> ---
>> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
>> index 2643d84..98f5c4b 100644
>> --- a/include/linux/cpu.h
>> +++ b/include/linux/cpu.h
>> @@ -104,6 +104,7 @@ extern struct sysdev_class cpu_sysdev_class;
>>
>> extern void get_online_cpus(void);
>> extern void put_online_cpus(void);
>> +extern int try_get_online_cpus(void);
>> #define hotcpu_notifier(fn, pri) { \
>> static struct notifier_block fn##_nb __cpuinitdata = \
>> { .notifier_call = fn, .priority = pri }; \
>> @@ -117,6 +118,7 @@ int cpu_down(unsigned int cpu);
>>
>> #define get_online_cpus() do { } while (0)
>> #define put_online_cpus() do { } while (0)
>> +#define try_get_online_cpus() (1)
>> #define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
>> /* These aren't inline functions due to a GCC bug. */
>> #define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
>> diff --git a/kernel/cpu.c b/kernel/cpu.c
>> index 62198ec..e948f19 100644
>> --- a/kernel/cpu.c
>> +++ b/kernel/cpu.c
>> @@ -66,6 +66,15 @@ void put_online_cpus(void)
>> }
>> EXPORT_SYMBOL_GPL(put_online_cpus);
>>
>> +int try_get_online_cpus(void)
>> +{
>> + might_sleep();
>> + if (cpu_hotplug.active_writer == current)
>> + return 1;
>> + return down_read_trylock(&cpu_hotplug.rwlock);
>> +
>> +}
>> +EXPORT_SYMBOL_GPL(try_get_online_cpus);
>
> It's strange to add a might_sleep() to a function which doesn't sleep.
It might sleep indeed. I prefer to add a might_sleep()/cond_sched()
for a sleepable function.
>
> The patch adds no callers to this function. This is significant
> because it would be quite interesting to find out which subsystem(s)
> you've found to have this deadlock. I do think that we should look at
> alternative (non-trylocky) ways of fixing them.
>
This problem exist in cgroup/cpuset. and Max Krasnyansky fix it:
(non-trylocky way)
*
* The rebuild_sched_domains() and partition_sched_domains()
* routines must nest cgroup_lock() inside get_online_cpus(),
* but such cpuset changes as these must nest that locking the
* other way, holding cgroup_lock() for much of the code.
*
* So in order to avoid an ABBA deadlock, the cpuset code handling
* these user changes delegates the actual sched domain rebuilding
* to a separate workqueue thread, which ends up processing the
* above do_rebuild_sched_domains() function.
*/
static void async_rebuild_sched_domains(void)
{
queue_work(cpuset_wq, &rebuild_sched_domains_work);
}
get_online_cpus() is so a coarsely granular lock, and
try_get_online_cpus() fails rarely. The kernel indeed needs
try_get_online_cpus().
Paul will use it:
http://lkml.org/lkml/2009/5/22/332
Lai
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus()
2009-05-29 20:31 ` Andrew Morton
2009-06-01 2:42 ` Lai Jiangshan
@ 2009-06-01 7:31 ` Rusty Russell
2009-06-01 16:19 ` Paul E. McKenney
1 sibling, 1 reply; 6+ messages in thread
From: Rusty Russell @ 2009-06-01 7:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Lai Jiangshan, mingo, paulmck, linux-kernel, Oleg Nesterov,
Linus Torvalds
On Sat, 30 May 2009 06:01:18 am Andrew Morton wrote:
> I do think that we should look at
> alternative (non-trylocky) ways of fixing them.
Speculating: we could add a "keep_cpu()" (FIXME: improve name) which is kind
of like get_cpu() only doesn't disable preemption and only stops *this* cpu
from going down.
Not sure where that gets us, but if someone's going to dig deep into this it
might help.
Thanks,
Rusty.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus()
2009-06-01 7:31 ` Rusty Russell
@ 2009-06-01 16:19 ` Paul E. McKenney
2009-06-04 0:16 ` Paul E. McKenney
0 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2009-06-01 16:19 UTC (permalink / raw)
To: Rusty Russell
Cc: Andrew Morton, Lai Jiangshan, mingo, linux-kernel, Oleg Nesterov,
Linus Torvalds
On Mon, Jun 01, 2009 at 05:01:50PM +0930, Rusty Russell wrote:
> On Sat, 30 May 2009 06:01:18 am Andrew Morton wrote:
> > I do think that we should look at
> > alternative (non-trylocky) ways of fixing them.
>
> Speculating: we could add a "keep_cpu()" (FIXME: improve name) which is kind
> of like get_cpu() only doesn't disable preemption and only stops *this* cpu
> from going down.
>
> Not sure where that gets us, but if someone's going to dig deep into this it
> might help.
I have been beating up on the approach of disabling preemption to pin down
a single CPU, and although it is working, it is no faster than simply
doing get_online_cpus() and it is much much more subtle and complex.
I am not sure that I have all the races properly accounted for, and I
am failing to see the point of having something quite this ugly in the
kernel when much simpler alternatives exist.
The main vulnerability is the possibility that someone will invoke
synchroniize_rcu_expedited() while holding a mutex that is also acquired
in a CPU-hotplug notifier, as Lai noted. But this is easily handled
given a primitive that will say whether the current CPU is executing in a
CPU-hotplug notifier. This primitive is permitted to sometimes mistakenly
say that the current CPU is executing in a CPU-hotplug notifier when it
is not (as long as it doesn't do so too often), but not vice versa.
One way to implement this would be to have such a primitive simply say
whether or not a CPU-hotplug operation is currently in effect. Yes, this
is racy, but not when it matters -- you cannot possibly exit a CPU-hotplug
operation while executing in a CPU-hotplug notifier. For example,
the following exported from kernel/cpu.c would work just fine:
bool cpu_hotplug_in_progress(void)
{
return cpu_hotplug.active_writer != NULL;
}
I believe that we should be OK moving forward with an updated version of
http://lkml.org/lkml/2009/5/22/332 even without the deadlock avoidance.
Having the deadlock avoidance would be better, of course, so I will use
something like the above on the next patch.
Thoughts?
Thanx, Paul
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus()
2009-06-01 16:19 ` Paul E. McKenney
@ 2009-06-04 0:16 ` Paul E. McKenney
0 siblings, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2009-06-04 0:16 UTC (permalink / raw)
To: Rusty Russell
Cc: Andrew Morton, Lai Jiangshan, mingo, linux-kernel, Oleg Nesterov,
Linus Torvalds
On Mon, Jun 01, 2009 at 09:19:31AM -0700, Paul E. McKenney wrote:
> On Mon, Jun 01, 2009 at 05:01:50PM +0930, Rusty Russell wrote:
> > On Sat, 30 May 2009 06:01:18 am Andrew Morton wrote:
> > > I do think that we should look at
> > > alternative (non-trylocky) ways of fixing them.
> >
> > Speculating: we could add a "keep_cpu()" (FIXME: improve name) which is kind
> > of like get_cpu() only doesn't disable preemption and only stops *this* cpu
> > from going down.
> >
> > Not sure where that gets us, but if someone's going to dig deep into this it
> > might help.
>
> I have been beating up on the approach of disabling preemption to pin down
> a single CPU, and although it is working, it is no faster than simply
> doing get_online_cpus() and it is much much more subtle and complex.
> I am not sure that I have all the races properly accounted for, and I
> am failing to see the point of having something quite this ugly in the
> kernel when much simpler alternatives exist.
>
> The main vulnerability is the possibility that someone will invoke
> synchroniize_rcu_expedited() while holding a mutex that is also acquired
> in a CPU-hotplug notifier, as Lai noted. But this is easily handled
> given a primitive that will say whether the current CPU is executing in a
> CPU-hotplug notifier. This primitive is permitted to sometimes mistakenly
> say that the current CPU is executing in a CPU-hotplug notifier when it
> is not (as long as it doesn't do so too often), but not vice versa.
>
> One way to implement this would be to have such a primitive simply say
> whether or not a CPU-hotplug operation is currently in effect. Yes, this
> is racy, but not when it matters -- you cannot possibly exit a CPU-hotplug
> operation while executing in a CPU-hotplug notifier. For example,
> the following exported from kernel/cpu.c would work just fine:
>
> bool cpu_hotplug_in_progress(void)
> {
> return cpu_hotplug.active_writer != NULL;
> }
>
> I believe that we should be OK moving forward with an updated version of
> http://lkml.org/lkml/2009/5/22/332 even without the deadlock avoidance.
> Having the deadlock avoidance would be better, of course, so I will use
> something like the above on the next patch.
Of course, the above does not actually solve the deadlock, instead
merely making it less likely to occur. I have absolutely no idea what
I was thinking!
Back to try_get_online_cpus().
Thanx, Paul
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2009-06-04 0:16 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-05-29 8:29 [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus() Lai Jiangshan
2009-05-29 20:31 ` Andrew Morton
2009-06-01 2:42 ` Lai Jiangshan
2009-06-01 7:31 ` Rusty Russell
2009-06-01 16:19 ` Paul E. McKenney
2009-06-04 0:16 ` Paul E. McKenney
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox