* [PATCH 1/2] cpuset: use rebuild_sched_domains() in cpuset_hotplug_workfn()
@ 2013-04-27 9:53 Li Zefan
2013-04-27 9:53 ` [PATCH 2/2] cpuset: fix cpu hotplug vs rebuild_sched_domains() race Li Zefan
From: Li Zefan @ 2013-04-27 9:53 UTC
To: Tejun Heo; +Cc: Li Zhong, LKML, Cgroups
From: Li Zhong <zhong@linux.vnet.ibm.com>
In cpuset_hotplug_workfn(), partition_sched_domains() is called without
the hotplug lock held, although the function header of
partition_sched_domains() states that the lock is required.
This patch uses rebuild_sched_domains() instead, which fixes the above
issue and makes the code a little simpler.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
---
kernel/cpuset.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 943968d..b0f18ba 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2184,17 +2184,8 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
flush_workqueue(cpuset_propagate_hotplug_wq);
/* rebuild sched domains if cpus_allowed has changed */
- if (cpus_updated) {
- struct sched_domain_attr *attr;
- cpumask_var_t *doms;
- int ndoms;
-
- mutex_lock(&cpuset_mutex);
- ndoms = generate_sched_domains(&doms, &attr);
- mutex_unlock(&cpuset_mutex);
-
- partition_sched_domains(ndoms, doms, attr);
- }
+ if (cpus_updated)
+ rebuild_sched_domains();
}
void cpuset_update_active_cpus(bool cpu_online)
--
1.8.0.2
* [PATCH 2/2] cpuset: fix cpu hotplug vs rebuild_sched_domains() race
2013-04-27 9:53 [PATCH 1/2] cpuset: use rebuild_sched_domains() in cpuset_hotplug_workfn() Li Zefan
@ 2013-04-27 9:53 ` Li Zefan
2013-04-27 14:23 ` Tejun Heo
From: Li Zefan @ 2013-04-27 9:53 UTC
To: Tejun Heo; +Cc: Li Zhong, LKML, Cgroups
rebuild_sched_domains() might pass doms with offlined cpu to
partition_sched_domains(), which results in an oops:
general protection fault: 0000 [#1] SMP
...
RIP: 0010:[<ffffffff81077a1e>] [<ffffffff81077a1e>] get_group+0x6e/0x90
...
Call Trace:
[<ffffffff8107f07c>] build_sched_domains+0x70c/0xcb0
[<ffffffff8107f2a7>] ? build_sched_domains+0x937/0xcb0
[<ffffffff81173f64>] ? kfree+0xe4/0x1b0
[<ffffffff8107f6e0>] ? partition_sched_domains+0xc0/0x470
[<ffffffff8107f905>] partition_sched_domains+0x2e5/0x470
[<ffffffff8107f6e0>] ? partition_sched_domains+0xc0/0x470
[<ffffffff810c9007>] ? generate_sched_domains+0xc7/0x530
[<ffffffff810c94a8>] rebuild_sched_domains_locked+0x38/0x70
[<ffffffff810cb4a4>] cpuset_write_resmask+0x1a4/0x500
[<ffffffff810c8700>] ? cpuset_mount+0xe0/0xe0
[<ffffffff810c7f50>] ? cpuset_read_u64+0x100/0x100
[<ffffffff810be890>] ? cgroup_iter_next+0x90/0x90
[<ffffffff810cb300>] ? cpuset_css_offline+0x70/0x70
[<ffffffff810c1a73>] cgroup_file_write+0x133/0x2e0
[<ffffffff8118995b>] vfs_write+0xcb/0x130
[<ffffffff8118a174>] sys_write+0x64/0xa0
Reported-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
---
kernel/cpuset.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index b0f18ba..ef05901 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -769,12 +769,20 @@ static void rebuild_sched_domains_locked(void)
lockdep_assert_held(&cpuset_mutex);
get_online_cpus();
+ /*
+ * We have raced with CPU hotplug. Don't do anything to avoid
+ * passing doms with offlined cpu to partition_sched_domains().
+ * Anyways, hotplug work item will rebuild sched domains.
+ */
+ if (!cpumask_equal(top_cpuset.cpus_allowed, cpu_active_mask))
+ goto out;
+
/* Generate domain masks and attrs */
ndoms = generate_sched_domains(&doms, &attr);
/* Have scheduler rebuild the domains */
partition_sched_domains(ndoms, doms, attr);
-
+out:
put_online_cpus();
}
#else /* !CONFIG_SMP */
--
1.8.0.2
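For readers following the thread, the guard added above can be modelled in
user space. This is only a rough sketch, not kernel code: the struct, the
model_* function names and the 64-bit mask type are all hypothetical
stand-ins, and the model assumes that the active mask changes first and
that the hotplug work item syncs the top cpuset's mask afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel cpumask: one bit per CPU. */
typedef uint64_t cpumask_model_t;

struct cpuset_model {
	cpumask_model_t top_cpus_allowed; /* models top_cpuset.cpus_allowed */
	cpumask_model_t cpu_active;       /* models cpu_active_mask */
	int rebuild_count;                /* number of domain rebuilds done */
};

/*
 * Models the check added to rebuild_sched_domains_locked(): if the two
 * masks differ, we raced with hotplug, so skip the rebuild and leave it
 * to the hotplug work item. Returns true if a rebuild was performed.
 */
static bool model_rebuild_sched_domains(struct cpuset_model *m)
{
	if (m->top_cpus_allowed != m->cpu_active)
		return false;
	m->rebuild_count++;
	return true;
}

/* Assumption of this model: offlining clears the active bit first. */
static void model_cpu_offline(struct cpuset_model *m, unsigned int cpu)
{
	assert(cpu < 64);
	m->cpu_active &= ~((cpumask_model_t)1 << cpu);
}

/* Assumption of this model: the work item syncs the mask, then rebuilds. */
static void model_hotplug_work(struct cpuset_model *m)
{
	m->top_cpus_allowed = m->cpu_active;
	model_rebuild_sched_domains(m);
}
```

In this model, a rebuild requested between model_cpu_offline() and
model_hotplug_work() is skipped, and the rebuild happens once the work
item has brought the two masks back in sync.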
* Re: [PATCH 2/2] cpuset: fix cpu hotplug vs rebuild_sched_domains() race
2013-04-27 9:53 ` [PATCH 2/2] cpuset: fix cpu hotplug vs rebuild_sched_domains() race Li Zefan
@ 2013-04-27 14:23 ` Tejun Heo
From: Tejun Heo @ 2013-04-27 14:23 UTC
To: Li Zefan; +Cc: Li Zhong, LKML, Cgroups
On Sat, Apr 27, 2013 at 05:53:48PM +0800, Li Zefan wrote:
> rebuild_sched_domains() might pass doms with offlined cpu to
> partition_sched_domains(), which results in an oops:
>
> general protection fault: 0000 [#1] SMP
> ...
> RIP: 0010:[<ffffffff81077a1e>] [<ffffffff81077a1e>] get_group+0x6e/0x90
> ...
> Call Trace:
> [<ffffffff8107f07c>] build_sched_domains+0x70c/0xcb0
> [<ffffffff8107f2a7>] ? build_sched_domains+0x937/0xcb0
> [<ffffffff81173f64>] ? kfree+0xe4/0x1b0
> [<ffffffff8107f6e0>] ? partition_sched_domains+0xc0/0x470
> [<ffffffff8107f905>] partition_sched_domains+0x2e5/0x470
> [<ffffffff8107f6e0>] ? partition_sched_domains+0xc0/0x470
> [<ffffffff810c9007>] ? generate_sched_domains+0xc7/0x530
> [<ffffffff810c94a8>] rebuild_sched_domains_locked+0x38/0x70
> [<ffffffff810cb4a4>] cpuset_write_resmask+0x1a4/0x500
> [<ffffffff810c8700>] ? cpuset_mount+0xe0/0xe0
> [<ffffffff810c7f50>] ? cpuset_read_u64+0x100/0x100
> [<ffffffff810be890>] ? cgroup_iter_next+0x90/0x90
> [<ffffffff810cb300>] ? cpuset_css_offline+0x70/0x70
> [<ffffffff810c1a73>] cgroup_file_write+0x133/0x2e0
> [<ffffffff8118995b>] vfs_write+0xcb/0x130
> [<ffffffff8118a174>] sys_write+0x64/0xa0
>
> Reported-by: Li Zhong <zhong@linux.vnet.ibm.com>
> Signed-off-by: Li Zefan <lizefan@huawei.com>
Applied 1-2 to cgroup/for-3.10.
Thanks.
--
tejun