public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Paul Jackson <pj@sgi.com>
To: Paul Menage <menage@google.com>
Cc: rientjes@google.com, nickpiggin@yahoo.com.au,
	a.p.zijlstra@chello.nl, balbir@linux.vnet.ibm.com,
	linux-kernel@vger.kernel.org, clg@fr.ibm.com,
	ebiederm@xmission.com, containers@lists.osdl.org,
	serue@us.ibm.com, svaidy@linux.vnet.ibm.com,
	akpm@linux-foundation.org, xemul@openvz.org
Subject: Re: [RFC] cpuset update_cgroup_cpus_allowed
Date: Mon, 15 Oct 2007 19:34:39 -0700	[thread overview]
Message-ID: <20071015193439.fe67bc4d.pj@sgi.com> (raw)
In-Reply-To: <471403BF.4090203@google.com>

> currently against an older kernel 

ah .. which older kernel?

I tried it against the broken out 2.6.23-rc8-mm2 patch set,
inserting it before the task-containersv11-* patches, but
that blew up on me - three rejected hunks.

Any chance of getting this against a current cgroup (aka
container) kernel?

Could you use the diff --show-c-function option when composing
patches - they're easier to read that way - thanks.

+	if (!retval) {
+		cpus_allowed = cpuset_cpus_allowed(p);
+		if (!cpus_subset(new_mask, cpus_allowed)) {
+			/*
+			 * We must have raced with a concurrent cpuset
+			 * update. Just reset the cpus_allowed to the
+			 * cpuset's cpus_allowed
+			 */
+			new_mask = cpus_allowed;

This narrows the race, perhaps sufficiently, but I don't see that it
guarantees closure.  Memory accesses to two different locations are not
guaranteed to be ordered across nodes, as best I recall.  The second
line above, that rereads the cpuset cpus_allowed, could get an old
value, in essence.

	cpuset update task		sched_setaffinity task
	------------------		----------------------

	A. write cpuset	[Q]		V. read cpuset [Q]
	B. read task [P]		W. check ok
	C. write task [P]		X. write task [P]
					Y. reread cpuset [Q]
					Z. check ok again

Two memory locations:
	[P] the cpus_allowed mask in the task_struct of the
		task doing the sched_setaffinity call.
	[Q] the cpus_allowed mask in the cpuset of the cpuset
		to which the sched_setaffinity task is attached.

Even though, from the perspective of location [P], both B. and C.
happened before X., still from the perspective of location [Q] the
rereading in Y. could return the value the cpuset cpus_allowed had
before the write in A.  This could result in a task running with
a cpus_allowed that was totally outside its cpusets cpus_allowed.

I will grant that this is a narrow window.  I won't loose much sleep
over it.

>  - uses a priority heap to pick the processes to act on, based on start time

This adds a fair bit of code and complexity, relative to my patch.
This I do loose more sleep over.  There has to be a compelling
reason for doing this.

The point that David raises, regarding the interaction of this with
hotplug, seems to be a compelling reason for doing -something-
different than my patch proposal.

I don't know yet if it compels us to this much code, however.

Any chance you could provide a patch that works against cgroups?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

  reply	other threads:[~2007-10-16  2:34 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-15  7:11 [RFC] cpuset update_cgroup_cpus_allowed Paul Jackson
2007-10-15 18:49 ` David Rientjes
2007-10-16  2:32   ` Paul Jackson
2007-10-16  6:07   ` Paul Jackson
2007-10-16  6:21     ` David Rientjes
2007-10-16  9:16       ` Paul Jackson
2007-10-16 18:27         ` David Rientjes
2007-10-16 23:14           ` Paul Jackson
2007-10-15 21:24 ` Paul Menage
2007-10-16  0:16   ` Paul Jackson
2007-10-16  0:20     ` Paul Menage
2007-10-16  2:34       ` Paul Jackson [this message]
2007-10-16  5:12         ` Paul Menage
2007-10-16  5:20           ` Paul Jackson
2007-10-16 10:07         ` Paul Menage

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20071015193439.fe67bc4d.pj@sgi.com \
    --to=pj@sgi.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=clg@fr.ibm.com \
    --cc=containers@lists.osdl.org \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=menage@google.com \
    --cc=nickpiggin@yahoo.com.au \
    --cc=rientjes@google.com \
    --cc=serue@us.ibm.com \
    --cc=svaidy@linux.vnet.ibm.com \
    --cc=xemul@openvz.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox