From: Tejun Heo <tj@kernel.org>
To: lizefan@huawei.com, paul@paulmenage.org, glommer@parallels.com
Cc: containers@lists.linux-foundation.org, cgroups@vger.kernel.org,
	peterz@infradead.org, mhocko@suse.cz, bsingharora@gmail.com,
	hannes@cmpxchg.org, kamezawa.hiroyu@jp.fujitsu.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Tejun Heo <tj@kernel.org>
Subject: [PATCH 11/13] cpuset: pin down cpus and mems while a task is being attached
Date: Thu,  3 Jan 2013 13:36:05 -0800
Message-ID: <1357248967-24959-12-git-send-email-tj@kernel.org>
In-Reply-To: <1357248967-24959-1-git-send-email-tj@kernel.org>

cpuset is scheduled to be decoupled from cgroup_lock, which will make
configuration updates race with task migration: a config update will
be able to happen between ->can_attach() and ->attach().  If such an
update removes all cpus or all mems, then by the time ->attach() is
called, the condition verified by ->can_attach(), that the cpuset is
capable of hosting the tasks, no longer holds.

This patch adds cpuset->attach_in_progress, which is incremented from
->can_attach() and decremented when the attach operation finishes,
either successfully (in ->attach()) or not (in ->cancel_attach()).
validate_change() treats cpusets with non-zero ->attach_in_progress
like cpusets with tasks and refuses to remove all cpus or mems from
them.

This currently doesn't make any functional difference as everything is
protected by cgroup_mutex but enables decoupling the locking.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
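Editorial illustration, not part of the patch: the pinning scheme
described above can be sketched as a small userspace model.  Everything
named toy_* is hypothetical; the masks are plain bitmaps, nr_tasks
stands in for cgroup_task_count(), and no locking is modeled.  It only
aims to show how a non-zero attach_in_progress makes the validation
step reject a change that would empty cpus or mems.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for struct cpuset; all names here are hypothetical. */
struct toy_cpuset {
	unsigned long cpus_allowed;	/* bitmap of allowed cpus */
	unsigned long mems_allowed;	/* bitmap of allowed memory nodes */
	int nr_tasks;			/* stands in for cgroup_task_count() */
	int attach_in_progress;		/* attaches between can_attach and attach */
};

/* Mirrors the check the patch extends in validate_change(). */
static int toy_validate_change(const struct toy_cpuset *cur,
			       const struct toy_cpuset *trial)
{
	if ((cur->nr_tasks || cur->attach_in_progress) &&
	    (!trial->cpus_allowed || !trial->mems_allowed))
		return -ENOSPC;
	return 0;
}

/* A config update: build a trial copy, validate it, then commit. */
static int toy_update(struct toy_cpuset *cs, unsigned long cpus,
		      unsigned long mems)
{
	struct toy_cpuset trial = *cs;
	int ret;

	trial.cpus_allowed = cpus;
	trial.mems_allowed = mems;
	ret = toy_validate_change(cs, &trial);
	if (ret)
		return ret;
	cs->cpus_allowed = cpus;
	cs->mems_allowed = mems;
	return 0;
}

/* Assumed precondition: the cpuset must be able to host tasks. */
static int toy_can_attach(struct toy_cpuset *cs)
{
	if (!cs->cpus_allowed || !cs->mems_allowed)
		return -ENOSPC;
	cs->attach_in_progress++;	/* pin cpus/mems until attach or cancel */
	return 0;
}

/* Attach failed elsewhere: drop the pin without adding a task. */
static void toy_cancel_attach(struct toy_cpuset *cs)
{
	assert(cs->attach_in_progress > 0);
	cs->attach_in_progress--;
}

/* Attach succeeded: the task now counts, so the pin can be dropped. */
static void toy_attach(struct toy_cpuset *cs)
{
	assert(cs->attach_in_progress > 0);
	cs->nr_tasks++;
	cs->attach_in_progress--;
}
```

In this model, a toy_update() that clears cpus_allowed after
toy_can_attach() has succeeded returns -ENOSPC and leaves the masks
untouched, while the same update on a cpuset whose attach was
cancelled (and which has no tasks) goes through.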
 kernel/cpuset.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 2d0b9bc..27e8614 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -91,6 +91,12 @@ struct cpuset {
 
 	struct fmeter fmeter;		/* memory_pressure filter */
 
+	/*
+	 * Tasks are being attached to this cpuset.  Used to prevent
+	 * zeroing cpus/mems_allowed between ->can_attach() and ->attach().
+	 */
+	int attach_in_progress;
+
 	/* partition number for rebuild_sched_domains() */
 	int pn;
 
@@ -468,9 +474,12 @@ static int validate_change(const struct cpuset *cur, const struct cpuset *trial)
 			goto out;
 	}
 
-	/* Cpusets with tasks can't have empty cpus_allowed or mems_allowed */
+	/*
+	 * Cpusets with tasks - existing or newly being attached - can't
+	 * have empty cpus_allowed or mems_allowed.
+	 */
 	ret = -ENOSPC;
-	if (cgroup_task_count(cur->css.cgroup) &&
+	if ((cgroup_task_count(cur->css.cgroup) || cur->attach_in_progress) &&
 	    (cpumask_empty(trial->cpus_allowed) ||
 	     nodes_empty(trial->mems_allowed)))
 		goto out;
@@ -1386,9 +1395,21 @@ static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 			return ret;
 	}
 
+	/*
+	 * Mark attach is in progress.  This makes validate_change() fail
+	 * changes which zero cpus/mems_allowed.
+	 */
+	cs->attach_in_progress++;
+
 	return 0;
 }
 
+static void cpuset_cancel_attach(struct cgroup *cgrp,
+				 struct cgroup_taskset *tset)
+{
+	cgroup_cs(cgrp)->attach_in_progress--;
+}
+
 /*
  * Protected by cgroup_mutex.  cpus_attach is used only by cpuset_attach()
  * but we can't allocate it dynamically there.  Define it global and
@@ -1441,6 +1462,8 @@ static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 					  &cpuset_attach_nodemask_to);
 		mmput(mm);
 	}
+
+	cs->attach_in_progress--;
 }
 
 /* The various types of files and directories in a cpuset file system */
@@ -1908,6 +1931,7 @@ struct cgroup_subsys cpuset_subsys = {
 	.css_offline = cpuset_css_offline,
 	.css_free = cpuset_css_free,
 	.can_attach = cpuset_can_attach,
+	.cancel_attach = cpuset_cancel_attach,
 	.attach = cpuset_attach,
 	.subsys_id = cpuset_subsys_id,
 	.base_cftypes = files,
-- 
1.8.0.2

