public inbox for linux-kernel@vger.kernel.org
From: Paul Jackson <pj@sgi.com>
To: Robin Holt <holt@sgi.com>
Cc: suresh.b.siddha@intel.com, dino@in.ibm.com, menage@google.com,
	Simon.Derr@bull.net, linux-kernel@vger.kernel.org,
	mbligh@google.com, rohitseth@google.com, dipankar@in.ibm.com,
	nickpiggin@yahoo.com.au
Subject: Re: exclusive cpusets broken with cpu hotplug
Date: Wed, 18 Oct 2006 14:07:38 -0700	[thread overview]
Message-ID: <20061018140738.a0c1c845.pj@sgi.com> (raw)
In-Reply-To: <20061018105307.GA17027@lnx-holt.americas.sgi.com>

> You do, however, hopefully have enough information to create the
> calls you would make to partition_sched_domain if each had their
> cpu_exclusive flags cleared.  Essentially, what I am proposing is
> making all the calls as if the user had cleared each as the
> remove/add starts, and then behave as if each was set again.

Yes - hopefully we have enough information to rebuild the sched domains
each time, consistently.  And your proposal is probably an improvement
for that reason.

However, I'm afraid that only solves half the problem.  It makes the
sched domains more repeatable and predictable.  But I'm worried that
the cpuset control over sched domains is still broken; see the
example below.

I've half a mind to prepare a patch to just rip out the sched domain
defining code from kernel/cpuset.c, completely uncoupling the
cpu_exclusive flag, and any other cpuset flags, from sched domains.

Example:

    As best as I can tell (which is not very far ;), if some hapless
    user does the following:

	    /dev/cpuset		cpu_exclusive == 1; cpus == 0-7
	    /dev/cpuset/a	cpu_exclusive == 1; cpus == 0-3
	    /dev/cpuset/b	cpu_exclusive == 1; cpus == 4-7

    and then runs a big job in the top cpuset (/dev/cpuset), then that
    big job will not load balance correctly: any of its threads that
    land on cpus 0-3 are isolated, for load balancing purposes, from
    any of its threads that land on cpus 4-7.
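For concreteness, that configuration could be set up with something
like the following sketch.  This assumes the classic 2.6-era cpuset
pseudo-filesystem mounted at /dev/cpuset (plain "cpus" and
"cpu_exclusive" file names, no cgroup prefix); it needs root, and the
exact file names may differ on other kernel versions:

```shell
# Mount the cpuset pseudo-filesystem (2.6-era interface).
mkdir -p /dev/cpuset
mount -t cpuset cpuset /dev/cpuset

# Top cpuset already holds all cpus (0-7 here); mark it exclusive.
echo 1 > /dev/cpuset/cpu_exclusive

# Child 'a': exclusive, cpus 0-3.
mkdir /dev/cpuset/a
echo 0-3 > /dev/cpuset/a/cpus
echo 1   > /dev/cpuset/a/cpu_exclusive

# Child 'b': exclusive, cpus 4-7.
mkdir /dev/cpuset/b
echo 4-7 > /dev/cpuset/b/cpus
echo 1   > /dev/cpuset/b/cpu_exclusive

# The hapless user's big job is simply started in the top cpuset,
# where new tasks land by default -- it spans both child partitions.
```

Nothing in that sequence fails or warns, which is the point: the user
gets a top cpuset whose tasks straddle two disjoint sched domains
without any indication that load balancing between them has stopped.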

Is this correct?

If so, there is no practical way that I can see, on a production
system, for the system admin to realize they have misconfigured their
system this way.

If we can't make this work properly automatically, then we either need
to provide users the visibility and control to make it work by explicit
manual control (meaning my 'sched_domain' flag patch, plus some way of
exporting the sched domain topology in /sys), or we need to stop doing
this.

If the above example is not correct, then I'm afraid my education in
sched domains is in need of another lesson.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

Thread overview: 24+ messages
2006-10-18  2:25 exclusive cpusets broken with cpu hotplug Siddha, Suresh B
2006-10-18  7:14 ` Paul Jackson
2006-10-18  9:56   ` Robin Holt
2006-10-18 10:10     ` Paul Jackson
2006-10-18 10:53       ` Robin Holt
2006-10-18 21:07         ` Paul Jackson [this message]
2006-10-19  5:56           ` Paul Jackson
2006-10-18 12:16       ` Nick Piggin
2006-10-18 14:14         ` Siddha, Suresh B
2006-10-18 14:51           ` Nick Piggin
2006-10-19  6:15         ` Paul Jackson
2006-10-19  6:35           ` Nick Piggin
2006-10-19  6:57             ` Paul Jackson
2006-10-19  7:04               ` Nick Piggin
2006-10-19  7:33                 ` Paul Jackson
2006-10-19  8:16                   ` Nick Piggin
2006-10-19  8:31                     ` Paul Jackson
2006-10-19  7:34                 ` Paul Jackson
2006-10-19  8:07                   ` Nick Piggin
2006-10-19  8:11                     ` Paul Jackson
2006-10-19  8:22                       ` Nick Piggin
2006-10-19  8:42                         ` Paul Jackson
2006-10-18 17:54 ` Dinakar Guniguntala
2006-10-18 18:05   ` Paul Jackson
