* [PATCH] sched: missing locking in sched_domains code
@ 2008-04-27 21:12 Heiko Carstens
2008-04-28 1:39 ` Andrew Morton
0 siblings, 1 reply; 15+ messages in thread
From: Heiko Carstens @ 2008-04-27 21:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Gautham R Shenoy, Ingo Molnar, Paul Jackson, linux-kernel
From: Heiko Carstens <heiko.carstens@de.ibm.com>
Concurrent calls to detach_destroy_domains and arch_init_sched_domains
were prevented by the old scheduler subsystem cpu hotplug mutex. When
this got converted to get_online_cpus() the locking got broken.
Unlike before now several processes can concurrently enter the critical
sections that were protected by the old lock.
So add a new sched_domains_mutex which protects these sections again.
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
include/linux/sched.h | 2 ++
kernel/cpuset.c | 2 ++
kernel/sched.c | 11 +++++++++++
3 files changed, 15 insertions(+)
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -7807,14 +7807,23 @@ match2:
unlock_doms_cur();
}
+/*
+ * Protects against concurrent calls to detach_destroy_domains
+ * and arch_init_sched_domains.
+ */
+DEFINE_MUTEX(sched_domains_mutex);
+
#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
int arch_reinit_sched_domains(void)
{
+ static DEFINE_MUTEX(arch_reinit_sched_domains_mutex);
int err;
get_online_cpus();
+ mutex_lock(&sched_domains_mutex);
detach_destroy_domains(&cpu_online_map);
err = arch_init_sched_domains(&cpu_online_map);
+ mutex_unlock(&sched_domains_mutex);
put_online_cpus();
return err;
@@ -7932,10 +7941,12 @@ void __init sched_init_smp(void)
BUG_ON(sched_group_nodes_bycpu == NULL);
#endif
get_online_cpus();
+ mutex_lock(&sched_domains_mutex);
arch_init_sched_domains(&cpu_online_map);
cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map);
if (cpus_empty(non_isolated_cpus))
cpu_set(smp_processor_id(), non_isolated_cpus);
+ mutex_unlock(&sched_domains_mutex);
put_online_cpus();
/* XXX: Theoretical race here - CPU may be hotplugged now */
hotcpu_notifier(update_sched_domains, 0);
Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -809,6 +809,8 @@ struct sched_domain {
#endif
};
+extern struct mutex sched_domains_mutex;
+
extern void partition_sched_domains(int ndoms_new, cpumask_t *doms_new,
struct sched_domain_attr *dattr_new);
extern int arch_reinit_sched_domains(void);
Index: linux-2.6/kernel/cpuset.c
===================================================================
--- linux-2.6.orig/kernel/cpuset.c
+++ linux-2.6/kernel/cpuset.c
@@ -684,7 +684,9 @@ restart:
rebuild:
/* Have scheduler rebuild sched domains */
get_online_cpus();
+ mutex_lock(&sched_domains_mutex);
partition_sched_domains(ndoms, doms, dattr);
+ mutex_unlock(&sched_domains_mutex);
put_online_cpus();
done:
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-27 21:12 [PATCH] sched: missing locking in sched_domains code Heiko Carstens @ 2008-04-28 1:39 ` Andrew Morton 2008-04-28 7:09 ` Heiko Carstens 0 siblings, 1 reply; 15+ messages in thread From: Andrew Morton @ 2008-04-28 1:39 UTC (permalink / raw) To: Heiko Carstens; +Cc: Gautham R Shenoy, Ingo Molnar, Paul Jackson, linux-kernel On Sun, 27 Apr 2008 23:12:24 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > From: Heiko Carstens <heiko.carstens@de.ibm.com> > > Concurrent calls to detach_destroy_domains and arch_init_sched_domains > were prevented by the old scheduler subsystem cpu hotplug mutex. When > this got converted to get_online_cpus() the locking got broken. > Unlike before now several processes can concurrently enter the critical > sections that were protected by the old lock. > > So add a new sched_domains_mutex which protects these sections again. > > Cc: Gautham R Shenoy <ego@in.ibm.com> > Cc: Ingo Molnar <mingo@elte.hu> > Cc: Paul Jackson <pj@sgi.com> > Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> > --- > include/linux/sched.h | 2 ++ > kernel/cpuset.c | 2 ++ > kernel/sched.c | 11 +++++++++++ > 3 files changed, 15 insertions(+) > > Index: linux-2.6/kernel/sched.c > =================================================================== > --- linux-2.6.orig/kernel/sched.c > +++ linux-2.6/kernel/sched.c > @@ -7807,14 +7807,23 @@ match2: > unlock_doms_cur(); > } > > +/* > + * Protects against concurrent calls to detach_destroy_domains > + * and arch_init_sched_domains. > + */ > +DEFINE_MUTEX(sched_domains_mutex); > + > #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT) > int arch_reinit_sched_domains(void) > { > + static DEFINE_MUTEX(arch_reinit_sched_domains_mutex); leftover hunk. > int err; > > get_online_cpus(); > + mutex_lock(&sched_domains_mutex); > detach_destroy_domains(&cpu_online_map); > err = arch_init_sched_domains(&cpu_online_map); > + mutex_unlock(&sched_domains_mutex); > put_online_cpus(); > > return err; > @@ -7932,10 +7941,12 @@ void __init sched_init_smp(void) > BUG_ON(sched_group_nodes_bycpu == NULL); > #endif > get_online_cpus(); > + mutex_lock(&sched_domains_mutex); > arch_init_sched_domains(&cpu_online_map); > cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map); > if (cpus_empty(non_isolated_cpus)) > cpu_set(smp_processor_id(), non_isolated_cpus); > + mutex_unlock(&sched_domains_mutex); > put_online_cpus(); > /* XXX: Theoretical race here - CPU may be hotplugged now */ > hotcpu_notifier(update_sched_domains, 0); > Index: linux-2.6/include/linux/sched.h > =================================================================== > --- linux-2.6.orig/include/linux/sched.h > +++ linux-2.6/include/linux/sched.h > @@ -809,6 +809,8 @@ struct sched_domain { > #endif > }; > > +extern struct mutex sched_domains_mutex; > + > extern void partition_sched_domains(int ndoms_new, cpumask_t *doms_new, > struct sched_domain_attr *dattr_new); > extern int arch_reinit_sched_domains(void); > Index: linux-2.6/kernel/cpuset.c > =================================================================== > --- linux-2.6.orig/kernel/cpuset.c > +++ linux-2.6/kernel/cpuset.c > @@ -684,7 +684,9 @@ restart: > rebuild: > /* Have scheduler rebuild sched domains */ > get_online_cpus(); > + mutex_lock(&sched_domains_mutex); > partition_sched_domains(ndoms, doms, dattr); > + mutex_unlock(&sched_domains_mutex); > put_online_cpus(); > It seems a bit fragile to take this lock in the caller without even adding a comment at the callee site which documents the new locking rule. It would be more robust to take the lock within partition_sched_domains(). partition_sched_domains() already covers itself with lock_doms_cur(). Can we take that in arch_reinit_sched_domains() rather than adding the new lock? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 1:39 ` Andrew Morton @ 2008-04-28 7:09 ` Heiko Carstens 2008-04-28 7:24 ` Ingo Molnar 2008-04-28 7:28 ` Andrew Morton 0 siblings, 2 replies; 15+ messages in thread From: Heiko Carstens @ 2008-04-28 7:09 UTC (permalink / raw) To: Andrew Morton; +Cc: Gautham R Shenoy, Ingo Molnar, Paul Jackson, linux-kernel On Sun, Apr 27, 2008 at 06:39:26PM -0700, Andrew Morton wrote: > On Sun, 27 Apr 2008 23:12:24 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > Index: linux-2.6/kernel/cpuset.c > > =================================================================== > > --- linux-2.6.orig/kernel/cpuset.c > > +++ linux-2.6/kernel/cpuset.c > > @@ -684,7 +684,9 @@ restart: > > rebuild: > > /* Have scheduler rebuild sched domains */ > > get_online_cpus(); > > + mutex_lock(&sched_domains_mutex); > > partition_sched_domains(ndoms, doms, dattr); > > + mutex_unlock(&sched_domains_mutex); > > put_online_cpus(); > > > > It seems a bit fragile to take this lock in the caller without even adding > a comment at the callee site which documents the new locking rule. > > It would be more robust to take the lock within partition_sched_domains(). > > partition_sched_domains() already covers itself with lock_doms_cur(). Can > we take that in arch_reinit_sched_domains() rather than adding the new lock? I think you meant taking it in partition_sched_domains? But anyway, I moved it all over to sched.c. So here's the new patch. Shorter and doesn't export a new lock :) Subject: [PATCH] sched: fix sched_domains locking From: Heiko Carstens <heiko.carstens@de.ibm.com> Concurrent calls to detach_destroy_domains and arch_init_sched_domains were prevented by the old scheduler subsystem cpu hotplug mutex. When this got converted to get_online_cpus() the locking got broken. Unlike before now several processes can concurrently enter the critical sections that were protected by the old lock. So add a new sched_domains_mutex which protects these sections again. Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> --- kernel/sched.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) Index: linux-2.6/kernel/sched.c =================================================================== --- linux-2.6.orig/kernel/sched.c +++ linux-2.6/kernel/sched.c @@ -7730,6 +7730,12 @@ static int dattrs_equal(struct sched_dom } /* + * Protects against concurrent calls to detach_destroy_domains + * and arch_init_sched_domains. + */ +static DEFINE_MUTEX(sched_domains_mutex); + +/* * Partition sched domains as specified by the 'ndoms_new' * cpumasks in the array doms_new[] of cpumasks. This compares * doms_new[] to the current sched domain partitioning, doms_cur[]. @@ -7756,7 +7762,8 @@ void partition_sched_domains(int ndoms_n int i, j; lock_doms_cur(); - + mutex_lock(&sched_domains_mutex); + /* always unregister in case we don't destroy any domains */ unregister_sched_domain_sysctl(); @@ -7804,6 +7811,7 @@ match2: register_sched_domain_sysctl(); + mutex_unlock(&sched_domains_mutex); unlock_doms_cur(); } @@ -7813,8 +7821,10 @@ int arch_reinit_sched_domains(void) int err; get_online_cpus(); + mutex_lock(&sched_domains_mutex); detach_destroy_domains(&cpu_online_map); err = arch_init_sched_domains(&cpu_online_map); + mutex_unlock(&sched_domains_mutex); put_online_cpus(); return err; @@ -7932,10 +7942,12 @@ void __init sched_init_smp(void) BUG_ON(sched_group_nodes_bycpu == NULL); #endif get_online_cpus(); + mutex_lock(&sched_domains_mutex); arch_init_sched_domains(&cpu_online_map); cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map); if (cpus_empty(non_isolated_cpus)) cpu_set(smp_processor_id(), non_isolated_cpus); + mutex_unlock(&sched_domains_mutex); put_online_cpus(); /* XXX: Theoretical race here - CPU may be hotplugged now */ hotcpu_notifier(update_sched_domains, 0); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 7:09 ` Heiko Carstens @ 2008-04-28 7:24 ` Ingo Molnar 2008-04-28 7:28 ` Andrew Morton 1 sibling, 0 replies; 15+ messages in thread From: Ingo Molnar @ 2008-04-28 7:24 UTC (permalink / raw) To: Heiko Carstens Cc: Andrew Morton, Gautham R Shenoy, Paul Jackson, linux-kernel, Peter Zijlstra * Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > /* > + * Protects against concurrent calls to detach_destroy_domains > + * and arch_init_sched_domains. > + */ > +static DEFINE_MUTEX(sched_domains_mutex); > + > +/* > * Partition sched domains as specified by the 'ndoms_new' > * cpumasks in the array doms_new[] of cpumasks. This compares > * doms_new[] to the current sched domain partitioning, doms_cur[]. > @@ -7756,7 +7762,8 @@ void partition_sched_domains(int ndoms_n > int i, j; > > lock_doms_cur(); > - > + mutex_lock(&sched_domains_mutex); i might be missing something but why not make doms_cur_mutex locking unconditional and extend it to detach_destroy_domains() as well? Ingo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 7:09 ` Heiko Carstens 2008-04-28 7:24 ` Ingo Molnar @ 2008-04-28 7:28 ` Andrew Morton 2008-04-28 7:52 ` Heiko Carstens 1 sibling, 1 reply; 15+ messages in thread From: Andrew Morton @ 2008-04-28 7:28 UTC (permalink / raw) To: Heiko Carstens; +Cc: Gautham R Shenoy, Ingo Molnar, Paul Jackson, linux-kernel On Mon, 28 Apr 2008 09:09:46 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > On Sun, Apr 27, 2008 at 06:39:26PM -0700, Andrew Morton wrote: > > On Sun, 27 Apr 2008 23:12:24 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > > Index: linux-2.6/kernel/cpuset.c > > > =================================================================== > > > --- linux-2.6.orig/kernel/cpuset.c > > > +++ linux-2.6/kernel/cpuset.c > > > @@ -684,7 +684,9 @@ restart: > > > rebuild: > > > /* Have scheduler rebuild sched domains */ > > > get_online_cpus(); > > > + mutex_lock(&sched_domains_mutex); > > > partition_sched_domains(ndoms, doms, dattr); > > > + mutex_unlock(&sched_domains_mutex); > > > put_online_cpus(); > > > > > > > It seems a bit fragile to take this lock in the caller without even adding > > a comment at the callee site which documents the new locking rule. > > > > It would be more robust to take the lock within partition_sched_domains(). > > > > partition_sched_domains() already covers itself with lock_doms_cur(). Can > > we take that in arch_reinit_sched_domains() rather than adding the new lock? > > I think you meant taking it in partition_sched_domains? What I meant was: rather than adding the new sched_domains_mutex, can we instead call lock_doms_cur() from arch_reinit_sched_domains() and sched_init_smp()? Borrow the existing lock? Whether that makes sense depends upon what lock_doms_cur() semantically *means*. As that appears to be somewhat of a secret, we get to decide ;) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 7:28 ` Andrew Morton @ 2008-04-28 7:52 ` Heiko Carstens 2008-04-28 8:11 ` Heiko Carstens 0 siblings, 1 reply; 15+ messages in thread From: Heiko Carstens @ 2008-04-28 7:52 UTC (permalink / raw) To: Andrew Morton; +Cc: Gautham R Shenoy, Ingo Molnar, Paul Jackson, linux-kernel On Mon, Apr 28, 2008 at 12:28:53AM -0700, Andrew Morton wrote: > On Mon, 28 Apr 2008 09:09:46 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > > On Sun, Apr 27, 2008 at 06:39:26PM -0700, Andrew Morton wrote: > > > On Sun, 27 Apr 2008 23:12:24 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > > > Index: linux-2.6/kernel/cpuset.c > > > > =================================================================== > > > > --- linux-2.6.orig/kernel/cpuset.c > > > > +++ linux-2.6/kernel/cpuset.c > > > > @@ -684,7 +684,9 @@ restart: > > > > rebuild: > > > > /* Have scheduler rebuild sched domains */ > > > > get_online_cpus(); > > > > + mutex_lock(&sched_domains_mutex); > > > > partition_sched_domains(ndoms, doms, dattr); > > > > + mutex_unlock(&sched_domains_mutex); > > > > put_online_cpus(); > > > > > > > > > > It seems a bit fragile to take this lock in the caller without even adding > > > a comment at the callee site which documents the new locking rule. > > > > > > It would be more robust to take the lock within partition_sched_domains(). > > > > > > partition_sched_domains() already covers itself with lock_doms_cur(). Can > > > we take that in arch_reinit_sched_domains() rather than adding the new lock? > > > > I think you meant taking it in partition_sched_domains? > > What I meant was: rather than adding the new sched_domains_mutex, can we > instead call lock_doms_cur() from arch_reinit_sched_domains() and > sched_init_smp()? Borrow the existing lock? > > Whether that makes sense depends upon what lock_doms_cur() semantically > *means*. As that appears to be somewhat of a secret, we get to decide ;) Oh, I didn't realize that lock_doms_cur() was only used in one function. So that should work. Will try. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 7:52 ` Heiko Carstens @ 2008-04-28 8:11 ` Heiko Carstens 2008-04-28 8:32 ` Ingo Molnar 0 siblings, 1 reply; 15+ messages in thread From: Heiko Carstens @ 2008-04-28 8:11 UTC (permalink / raw) To: Andrew Morton; +Cc: Gautham R Shenoy, Ingo Molnar, Paul Jackson, linux-kernel Subject: [PATCH] sched: fix locking in arch_reinit_sched_domains From: Heiko Carstens <heiko.carstens@de.ibm.com> Concurrent calls to detach_destroy_domains and arch_init_sched_domains were prevented by the old scheduler subsystem cpu hotplug mutex. When this got converted to get_online_cpus() the locking got broken. Unlike before now several processes can concurrently enter the critical sections that were protected by the old lock. So use the already present doms_cur_mutex to protect these sections again. Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> --- kernel/sched.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) Index: linux-2.6/kernel/sched.c =================================================================== --- linux-2.6.orig/kernel/sched.c +++ linux-2.6/kernel/sched.c @@ -311,6 +311,16 @@ static DEFINE_SPINLOCK(task_group_lock); /* doms_cur_mutex serializes access to doms_cur[] array */ static DEFINE_MUTEX(doms_cur_mutex); +static inline void lock_doms_cur(void) +{ + mutex_lock(&doms_cur_mutex); +} + +static inline void unlock_doms_cur(void) +{ + mutex_unlock(&doms_cur_mutex); +} + #ifdef CONFIG_FAIR_GROUP_SCHED #ifdef CONFIG_USER_SCHED # define INIT_TASK_GROUP_LOAD (2*NICE_0_LOAD) @@ -358,21 +368,9 @@ static inline void set_task_rq(struct ta #endif } -static inline void lock_doms_cur(void) -{ - mutex_lock(&doms_cur_mutex); -} - -static inline void unlock_doms_cur(void) -{ - mutex_unlock(&doms_cur_mutex); -} - #else static inline void set_task_rq(struct task_struct *p, unsigned int cpu) { } -static inline void lock_doms_cur(void) { } -static inline void unlock_doms_cur(void) { } #endif /* CONFIG_GROUP_SCHED */ @@ -7813,8 +7811,10 @@ int arch_reinit_sched_domains(void) int err; get_online_cpus(); + lock_doms_cur(); detach_destroy_domains(&cpu_online_map); err = arch_init_sched_domains(&cpu_online_map); + unlock_doms_cur(); put_online_cpus(); return err; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 8:11 ` Heiko Carstens @ 2008-04-28 8:32 ` Ingo Molnar 2008-04-28 8:49 ` Heiko Carstens 0 siblings, 1 reply; 15+ messages in thread From: Ingo Molnar @ 2008-04-28 8:32 UTC (permalink / raw) To: Heiko Carstens Cc: Andrew Morton, Gautham R Shenoy, Paul Jackson, linux-kernel * Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > /* doms_cur_mutex serializes access to doms_cur[] array */ > static DEFINE_MUTEX(doms_cur_mutex); > > +static inline void lock_doms_cur(void) > +{ > + mutex_lock(&doms_cur_mutex); > +} > @@ -7813,8 +7811,10 @@ int arch_reinit_sched_domains(void) > int err; > > get_online_cpus(); > + lock_doms_cur(); thanks, that looks a lot more clean already. May i ask for another thing, if you are hacking on this anyway? Please get rid of the lock_doms_cur() complication now that it's not conditional - an open coded mutex_lock(&sched_doms_mutex) looks more readable - it gives a clear idea about what's happening. Also, please rename sched_doms_mutex to something less tongue-twisting - such as sched_domains_mutex. Hm? Ingo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 8:32 ` Ingo Molnar @ 2008-04-28 8:49 ` Heiko Carstens 2008-04-28 8:57 ` Andrew Morton 0 siblings, 1 reply; 15+ messages in thread From: Heiko Carstens @ 2008-04-28 8:49 UTC (permalink / raw) To: Ingo Molnar; +Cc: Andrew Morton, Gautham R Shenoy, Paul Jackson, linux-kernel On Mon, Apr 28, 2008 at 10:32:22AM +0200, Ingo Molnar wrote: > > * Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > > /* doms_cur_mutex serializes access to doms_cur[] array */ > > static DEFINE_MUTEX(doms_cur_mutex); > > > > +static inline void lock_doms_cur(void) > > +{ > > + mutex_lock(&doms_cur_mutex); > > +} > > > @@ -7813,8 +7811,10 @@ int arch_reinit_sched_domains(void) > > int err; > > > > get_online_cpus(); > > + lock_doms_cur(); > > thanks, that looks a lot more clean already. May i ask for another > thing, if you are hacking on this anyway? Please get rid of the > lock_doms_cur() complication now that it's not conditional - an open > coded mutex_lock(&sched_doms_mutex) looks more readable - it gives a > clear idea about what's happening. Also, please rename sched_doms_mutex > to something less tongue-twisting - such as sched_domains_mutex. Hm? Your wish is my order: Subject: [PATCH] sched: fix locking in arch_reinit_sched_domains From: Heiko Carstens <heiko.carstens@de.ibm.com> Concurrent calls to detach_destroy_domains and arch_init_sched_domains were prevented by the old scheduler subsystem cpu hotplug mutex. When this got converted to get_online_cpus() the locking got broken. Unlike before now several processes can concurrently enter the critical sections that were protected by the old lock. So use the already present doms_cur_mutex to protect these sections again. Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> --- kernel/sched.c | 20 +++++--------------- 1 file changed, 5 insertions(+), 15 deletions(-) Index: linux-2.6/kernel/sched.c =================================================================== --- linux-2.6.orig/kernel/sched.c +++ linux-2.6/kernel/sched.c @@ -309,7 +309,7 @@ static DEFINE_PER_CPU(struct rt_rq, init static DEFINE_SPINLOCK(task_group_lock); /* doms_cur_mutex serializes access to doms_cur[] array */ -static DEFINE_MUTEX(doms_cur_mutex); +static DEFINE_MUTEX(sched_domains_mutex); #ifdef CONFIG_FAIR_GROUP_SCHED #ifdef CONFIG_USER_SCHED @@ -358,21 +358,9 @@ static inline void set_task_rq(struct ta #endif } -static inline void lock_doms_cur(void) -{ - mutex_lock(&doms_cur_mutex); -} - -static inline void unlock_doms_cur(void) -{ - mutex_unlock(&doms_cur_mutex); -} - #else static inline void set_task_rq(struct task_struct *p, unsigned int cpu) { } -static inline void lock_doms_cur(void) { } -static inline void unlock_doms_cur(void) { } #endif /* CONFIG_GROUP_SCHED */ @@ -7755,7 +7743,7 @@ void partition_sched_domains(int ndoms_n { int i, j; - lock_doms_cur(); + mutex_lock(&sched_domains_mutex); /* always unregister in case we don't destroy any domains */ unregister_sched_domain_sysctl(); @@ -7804,7 +7792,7 @@ match2: register_sched_domain_sysctl(); - unlock_doms_cur(); + mutex_unlock(&sched_domains_mutex); } #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT) @@ -7813,8 +7801,10 @@ int arch_reinit_sched_domains(void) int err; get_online_cpus(); + mutex_lock(&sched_domains_mutex); detach_destroy_domains(&cpu_online_map); err = arch_init_sched_domains(&cpu_online_map); + mutex_unlock(&sched_domains_mutex); put_online_cpus(); return err; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 8:49 ` Heiko Carstens @ 2008-04-28 8:57 ` Andrew Morton 2008-04-28 9:17 ` Heiko Carstens 0 siblings, 1 reply; 15+ messages in thread From: Andrew Morton @ 2008-04-28 8:57 UTC (permalink / raw) To: Heiko Carstens; +Cc: Ingo Molnar, Gautham R Shenoy, Paul Jackson, linux-kernel On Mon, 28 Apr 2008 10:49:04 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > On Mon, Apr 28, 2008 at 10:32:22AM +0200, Ingo Molnar wrote: > > > > * Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > > > > /* doms_cur_mutex serializes access to doms_cur[] array */ > > > static DEFINE_MUTEX(doms_cur_mutex); > > > > > > +static inline void lock_doms_cur(void) > > > +{ > > > + mutex_lock(&doms_cur_mutex); > > > +} > > > > > @@ -7813,8 +7811,10 @@ int arch_reinit_sched_domains(void) > > > int err; > > > > > > get_online_cpus(); > > > + lock_doms_cur(); > > > > thanks, that looks a lot more clean already. May i ask for another > > thing, if you are hacking on this anyway? Please get rid of the > > lock_doms_cur() complication now that it's not conditional - an open > > coded mutex_lock(&sched_doms_mutex) looks more readable - it gives a > > clear idea about what's happening. Also, please rename sched_doms_mutex > > to something less tongue-twisting - such as sched_domains_mutex. Hm? > > Your wish is my order: heh, let's all boss Heiko around. > /* doms_cur_mutex serializes access to doms_cur[] array */ > -static DEFINE_MUTEX(doms_cur_mutex); > +static DEFINE_MUTEX(sched_domains_mutex); The comment refers to a no-longer-existing lock, and no longer correctly describes the lock's usage. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 8:57 ` Andrew Morton @ 2008-04-28 9:17 ` Heiko Carstens 2008-04-28 9:31 ` Andrew Morton 2008-04-28 9:33 ` Heiko Carstens 0 siblings, 2 replies; 15+ messages in thread From: Heiko Carstens @ 2008-04-28 9:17 UTC (permalink / raw) To: Andrew Morton; +Cc: Ingo Molnar, Gautham R Shenoy, Paul Jackson, linux-kernel On Mon, Apr 28, 2008 at 01:57:23AM -0700, Andrew Morton wrote: > On Mon, 28 Apr 2008 10:49:04 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > /* doms_cur_mutex serializes access to doms_cur[] array */ > > -static DEFINE_MUTEX(doms_cur_mutex); > > +static DEFINE_MUTEX(sched_domains_mutex); > > The comment refers to a no-longer-existing lock, and no longer correctly > describes the lock's usage. version 42. Please feel free to change the comment if you think it could be better :) Subject: [PATCH] sched: fix locking in arch_reinit_sched_domains From: Heiko Carstens <heiko.carstens@de.ibm.com> Concurrent calls to detach_destroy_domains and arch_init_sched_domains were prevented by the old scheduler subsystem cpu hotplug mutex. When this got converted to get_online_cpus() the locking got broken. Unlike before now several processes can concurrently enter the critical sections that were protected by the old lock. So use the already present doms_cur_mutex to protect these sections again. Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> --- kernel/sched.c | 24 ++++++++---------------- 1 file changed, 8 insertions(+), 16 deletions(-) Index: linux-2.6/kernel/sched.c =================================================================== --- linux-2.6.orig/kernel/sched.c +++ linux-2.6/kernel/sched.c @@ -308,8 +308,10 @@ static DEFINE_PER_CPU(struct rt_rq, init */ static DEFINE_SPINLOCK(task_group_lock); -/* doms_cur_mutex serializes access to doms_cur[] array */ -static DEFINE_MUTEX(doms_cur_mutex); +/* sched_domains_mutex serializes calls to arch_init_sched_domains, + * detach_destroy_domains and partition_sched_domains. + */ +static DEFINE_MUTEX(sched_domains_mutex); #ifdef CONFIG_FAIR_GROUP_SCHED #ifdef CONFIG_USER_SCHED @@ -358,21 +360,9 @@ static inline void set_task_rq(struct ta #endif } -static inline void lock_doms_cur(void) -{ - mutex_lock(&doms_cur_mutex); -} - -static inline void unlock_doms_cur(void) -{ - mutex_unlock(&doms_cur_mutex); -} - #else static inline void set_task_rq(struct task_struct *p, unsigned int cpu) { } -static inline void lock_doms_cur(void) { } -static inline void unlock_doms_cur(void) { } #endif /* CONFIG_GROUP_SCHED */ @@ -7755,7 +7745,7 @@ void partition_sched_domains(int ndoms_n { int i, j; - lock_doms_cur(); + mutex_lock(&sched_domains_mutex); /* always unregister in case we don't destroy any domains */ unregister_sched_domain_sysctl(); @@ -7804,7 +7794,7 @@ match2: register_sched_domain_sysctl(); - unlock_doms_cur(); + mutex_unlock(&sched_domains_mutex); } #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT) @@ -7813,8 +7803,10 @@ int arch_reinit_sched_domains(void) int err; get_online_cpus(); + mutex_lock(&sched_domains_mutex); detach_destroy_domains(&cpu_online_map); err = arch_init_sched_domains(&cpu_online_map); + mutex_unlock(&sched_domains_mutex); put_online_cpus(); return err; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 9:17 ` Heiko Carstens @ 2008-04-28 9:31 ` Andrew Morton 2008-04-28 9:33 ` Heiko Carstens 1 sibling, 0 replies; 15+ messages in thread From: Andrew Morton @ 2008-04-28 9:31 UTC (permalink / raw) To: Heiko Carstens; +Cc: Ingo Molnar, Gautham R Shenoy, Paul Jackson, linux-kernel On Mon, 28 Apr 2008 11:17:45 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > On Mon, Apr 28, 2008 at 01:57:23AM -0700, Andrew Morton wrote: > > On Mon, 28 Apr 2008 10:49:04 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > > /* doms_cur_mutex serializes access to doms_cur[] array */ > > > -static DEFINE_MUTEX(doms_cur_mutex); > > > +static DEFINE_MUTEX(sched_domains_mutex); > > > > The comment refers to a no-longer-existing lock, and no longer correctly > > describes the lock's usage. > > version 42. Please feel free to change the comment if you think it could > be better :) Actually, it's a pretty bad comment ;) > +/* sched_domains_mutex serializes calls to arch_init_sched_domains, > + * detach_destroy_domains and partition_sched_domains. > + */ locks protect *data*, not "calls". This matters. Which data is actually protected by this lock?? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 9:17 ` Heiko Carstens 2008-04-28 9:31 ` Andrew Morton @ 2008-04-28 9:33 ` Heiko Carstens 2008-04-28 12:27 ` Ingo Molnar 2008-04-28 13:13 ` Ingo Molnar 1 sibling, 2 replies; 15+ messages in thread From: Heiko Carstens @ 2008-04-28 9:33 UTC (permalink / raw) To: Andrew Morton; +Cc: Ingo Molnar, Gautham R Shenoy, Paul Jackson, linux-kernel On Mon, Apr 28, 2008 at 11:17:45AM +0200, Heiko Carstens wrote: > On Mon, Apr 28, 2008 at 01:57:23AM -0700, Andrew Morton wrote: > > On Mon, 28 Apr 2008 10:49:04 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > > /* doms_cur_mutex serializes access to doms_cur[] array */ > > > -static DEFINE_MUTEX(doms_cur_mutex); > > > +static DEFINE_MUTEX(sched_domains_mutex); > > > > The comment refers to a no-longer-existing lock, and no longer correctly > > describes the lock's usage. > > version 42. Please feel free to change the comment if you think it could > be better :) I don't believe this... version 43 ;) I forgot to add the lock in sched_init_smp(). Not that it would really matter, but it should be there to avoid (even more) confusion. Subject: [PATCH] sched: fix locking in arch_reinit_sched_domains From: Heiko Carstens <heiko.carstens@de.ibm.com> Concurrent calls to detach_destroy_domains and arch_init_sched_domains were prevented by the old scheduler subsystem cpu hotplug mutex. When this got converted to get_online_cpus() the locking got broken. Unlike before now several processes can concurrently enter the critical sections that were protected by the old lock. So use the already present doms_cur_mutex to protect these sections again. Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> --- kernel/sched.c | 26 ++++++++++---------------- 1 file changed, 10 insertions(+), 16 deletions(-) Index: linux-2.6/kernel/sched.c =================================================================== --- linux-2.6.orig/kernel/sched.c +++ linux-2.6/kernel/sched.c @@ -308,8 +308,10 @@ static DEFINE_PER_CPU(struct rt_rq, init */ static DEFINE_SPINLOCK(task_group_lock); -/* doms_cur_mutex serializes access to doms_cur[] array */ -static DEFINE_MUTEX(doms_cur_mutex); +/* sched_domains_mutex serializes calls to arch_init_sched_domains, + * detach_destroy_domains and partition_sched_domains. + */ +static DEFINE_MUTEX(sched_domains_mutex); #ifdef CONFIG_FAIR_GROUP_SCHED #ifdef CONFIG_USER_SCHED @@ -358,21 +360,9 @@ static inline void set_task_rq(struct ta #endif } -static inline void lock_doms_cur(void) -{ - mutex_lock(&doms_cur_mutex); -} - -static inline void unlock_doms_cur(void) -{ - mutex_unlock(&doms_cur_mutex); -} - #else static inline void set_task_rq(struct task_struct *p, unsigned int cpu) { } -static inline void lock_doms_cur(void) { } -static inline void unlock_doms_cur(void) { } #endif /* CONFIG_GROUP_SCHED */ @@ -7755,7 +7745,7 @@ void partition_sched_domains(int ndoms_n { int i, j; - lock_doms_cur(); + mutex_lock(&sched_domains_mutex); /* always unregister in case we don't destroy any domains */ unregister_sched_domain_sysctl(); @@ -7804,7 +7794,7 @@ match2: register_sched_domain_sysctl(); - unlock_doms_cur(); + mutex_unlock(&sched_domains_mutex); } #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT) @@ -7813,8 +7803,10 @@ int arch_reinit_sched_domains(void) int err; get_online_cpus(); + mutex_lock(&sched_domains_mutex); detach_destroy_domains(&cpu_online_map); err = arch_init_sched_domains(&cpu_online_map); + mutex_unlock(&sched_domains_mutex); put_online_cpus(); return err; @@ -7932,10 +7924,12 @@ void __init sched_init_smp(void) BUG_ON(sched_group_nodes_bycpu == NULL); #endif get_online_cpus(); + mutex_lock(&sched_domains_mutex); arch_init_sched_domains(&cpu_online_map); cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map); if (cpus_empty(non_isolated_cpus)) cpu_set(smp_processor_id(), non_isolated_cpus); + mutex_unlock(&sched_domains_mutex); put_online_cpus(); /* XXX: Theoretical race here - CPU may be hotplugged now */ hotcpu_notifier(update_sched_domains, 0); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 9:33 ` Heiko Carstens @ 2008-04-28 12:27 ` Ingo Molnar 2008-04-28 13:13 ` Ingo Molnar 1 sibling, 0 replies; 15+ messages in thread From: Ingo Molnar @ 2008-04-28 12:27 UTC (permalink / raw) To: Heiko Carstens Cc: Andrew Morton, Gautham R Shenoy, Paul Jackson, linux-kernel * Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > version 42. Please feel free to change the comment if you think it > > could be better :) > > I don't believe this... version 43 ;) I forgot to add the lock in > sched_init_smp(). Not that it would really matter, but it should be > there to avoid (even more) confusion. > > Subject: [PATCH] sched: fix locking in arch_reinit_sched_domains thanks Heiko, applied. Ingo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] sched: missing locking in sched_domains code 2008-04-28 9:33 ` Heiko Carstens 2008-04-28 12:27 ` Ingo Molnar @ 2008-04-28 13:13 ` Ingo Molnar 1 sibling, 0 replies; 15+ messages in thread From: Ingo Molnar @ 2008-04-28 13:13 UTC (permalink / raw) To: Heiko Carstens Cc: Andrew Morton, Gautham R Shenoy, Paul Jackson, linux-kernel * Heiko Carstens <heiko.carstens@de.ibm.com> wrote: > > version 42. Please feel free to change the comment if you think it could > > be better :) > > I don't believe this... version 43 ;) I forgot to add the lock in > sched_init_smp(). Not that it would really matter, but it should be > there to avoid (even more) confusion. hey i just had to iterate it to version 44. this bit: > -/* doms_cur_mutex serializes access to doms_cur[] array */ > -static DEFINE_MUTEX(doms_cur_mutex); > +/* sched_domains_mutex serializes calls to arch_init_sched_domains, > + * detach_destroy_domains and partition_sched_domains. > + */ > +static DEFINE_MUTEX(sched_domains_mutex); was inside an #ifdef section ;-) Ingo ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2008-04-28 13:14 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-04-27 21:12 [PATCH] sched: missing locking in sched_domains code Heiko Carstens 2008-04-28 1:39 ` Andrew Morton 2008-04-28 7:09 ` Heiko Carstens 2008-04-28 7:24 ` Ingo Molnar 2008-04-28 7:28 ` Andrew Morton 2008-04-28 7:52 ` Heiko Carstens 2008-04-28 8:11 ` Heiko Carstens 2008-04-28 8:32 ` Ingo Molnar 2008-04-28 8:49 ` Heiko Carstens 2008-04-28 8:57 ` Andrew Morton 2008-04-28 9:17 ` Heiko Carstens 2008-04-28 9:31 ` Andrew Morton 2008-04-28 9:33 ` Heiko Carstens 2008-04-28 12:27 ` Ingo Molnar 2008-04-28 13:13 ` Ingo Molnar
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.