From: Heiko Carstens <heiko.carstens@de.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Gautham R Shenoy <ego@in.ibm.com>, Ingo Molnar <mingo@elte.hu>,
    Paul Jackson <pj@sgi.com>,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched: missing locking in sched_domains code
Date: Mon, 28 Apr 2008 09:09:46 +0200
Message-ID: <20080428070946.GA4507@osiris.boeblingen.de.ibm.com>
In-Reply-To: <20080427183926.acb66fff.akpm@linux-foundation.org>

On Sun, Apr 27, 2008 at 06:39:26PM -0700, Andrew Morton wrote:
> On Sun, 27 Apr 2008 23:12:24 +0200 Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> > Index: linux-2.6/kernel/cpuset.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/cpuset.c
> > +++ linux-2.6/kernel/cpuset.c
> > @@ -684,7 +684,9 @@ restart:
> > rebuild:
> > /* Have scheduler rebuild sched domains */
> > get_online_cpus();
> > + mutex_lock(&sched_domains_mutex);
> > partition_sched_domains(ndoms, doms, dattr);
> > + mutex_unlock(&sched_domains_mutex);
> > put_online_cpus();
> >
>
> It seems a bit fragile to take this lock in the caller without even adding
> a comment at the callee site which documents the new locking rule.
>
> It would be more robust to take the lock within partition_sched_domains().
>
> partition_sched_domains() already covers itself with lock_doms_cur(). Can
> we take that in arch_reinit_sched_domains() rather than adding the new lock?

I think you meant taking it in partition_sched_domains()? But anyway, I moved
it all over to sched.c. So here's the new patch. It is shorter and doesn't
export a new lock :)

Subject: [PATCH] sched: fix sched_domains locking
From: Heiko Carstens <heiko.carstens@de.ibm.com>

Concurrent calls to detach_destroy_domains() and arch_init_sched_domains()
were prevented by the old scheduler subsystem cpu hotplug mutex. When
this got converted to get_online_cpus(), the locking got broken.

Unlike before, several processes can now concurrently enter the critical
sections that were protected by the old lock.

So add a new sched_domains_mutex which protects these sections again.

Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
kernel/sched.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -7730,6 +7730,12 @@ static int dattrs_equal(struct sched_dom
}
/*
+ * Protects against concurrent calls to detach_destroy_domains
+ * and arch_init_sched_domains.
+ */
+static DEFINE_MUTEX(sched_domains_mutex);
+
+/*
* Partition sched domains as specified by the 'ndoms_new'
* cpumasks in the array doms_new[] of cpumasks. This compares
* doms_new[] to the current sched domain partitioning, doms_cur[].
@@ -7756,7 +7762,8 @@ void partition_sched_domains(int ndoms_n
int i, j;
lock_doms_cur();
-
+ mutex_lock(&sched_domains_mutex);
+
/* always unregister in case we don't destroy any domains */
unregister_sched_domain_sysctl();
@@ -7804,6 +7811,7 @@ match2:
register_sched_domain_sysctl();
+ mutex_unlock(&sched_domains_mutex);
unlock_doms_cur();
}
@@ -7813,8 +7821,10 @@ int arch_reinit_sched_domains(void)
int err;
get_online_cpus();
+ mutex_lock(&sched_domains_mutex);
detach_destroy_domains(&cpu_online_map);
err = arch_init_sched_domains(&cpu_online_map);
+ mutex_unlock(&sched_domains_mutex);
put_online_cpus();
return err;
@@ -7932,10 +7942,12 @@ void __init sched_init_smp(void)
BUG_ON(sched_group_nodes_bycpu == NULL);
#endif
get_online_cpus();
+ mutex_lock(&sched_domains_mutex);
arch_init_sched_domains(&cpu_online_map);
cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map);
if (cpus_empty(non_isolated_cpus))
cpu_set(smp_processor_id(), non_isolated_cpus);
+ mutex_unlock(&sched_domains_mutex);
put_online_cpus();
/* XXX: Theoretical race here - CPU may be hotplugged now */
hotcpu_notifier(update_sched_domains, 0);
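
The locking rule the patch establishes can be illustrated outside the kernel. The following is only a userspace sketch using POSIX threads, not kernel code: domains_mutex stands in for sched_domains_mutex, and detach_domains(), init_domains(), reinit_domains() and run_rebuild_demo() are hypothetical names made up for this illustration. It takes the lock inside the callee, as suggested in the review, so that a detach/init pair is never interleaved with another caller's pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

/* Hypothetical stand-in for sched_domains_mutex from the patch. */
static pthread_mutex_t domains_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Toy shared state, only ever touched while domains_mutex is held. */
static int in_section;      /* callers currently between detach and init */
static int max_in_section;  /* highest value in_section ever reached */
static long rebuilds;       /* completed detach/init pairs */

/* Toy analogue of detach_destroy_domains(): enter the critical section. */
static void detach_domains(void)
{
    in_section++;
    if (in_section > max_in_section)
        max_in_section = in_section;
}

/* Toy analogue of arch_init_sched_domains(): leave the critical section. */
static void init_domains(void)
{
    rebuilds++;
    in_section--;
}

/* The lock is taken in the callee, so no caller can forget it. */
static void reinit_domains(void)
{
    pthread_mutex_lock(&domains_mutex);
    detach_domains();
    init_domains();
    pthread_mutex_unlock(&domains_mutex);
}

static void *worker(void *arg)
{
    int iterations = *(const int *)arg;

    for (int i = 0; i < iterations; i++)
        reinit_domains();
    return NULL;
}

/*
 * Run nthreads workers that each rebuild 'iterations' times.
 * Returns 0 if no two callers overlapped and no rebuild was lost,
 * -1 on bad arguments, thread creation failure, or a detected overlap.
 */
int run_rebuild_demo(int nthreads, int iterations)
{
    pthread_t tids[MAX_THREADS];
    int created = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || iterations < 1)
        return -1;

    in_section = 0;
    max_in_section = 0;
    rebuilds = 0;

    while (created < nthreads) {
        if (pthread_create(&tids[created], NULL, worker, &iterations) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    if (created != nthreads)
        return -1;
    return (max_in_section == 1 &&
            rebuilds == (long)nthreads * iterations) ? 0 : -1;
}
```

Calling run_rebuild_demo(4, 1000) should return 0 when built with pthread support. Removing the lock/unlock pair from reinit_domains() turns the counter updates into a data race, which is the userspace counterpart of the problem described in the changelog above.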