public inbox for linux-kernel@vger.kernel.org
From: Paul Jackson <pj@sgi.com>
To: "Martin J. Bligh" <mbligh@aracnet.com>
Cc: pwil3058@bigpond.net.au, frankeh@watson.ibm.com,
	dipankar@in.ibm.com, akpm@osdl.org,
	ckrm-tech@lists.sourceforge.net, efocht@hpce.nec.com,
	lse-tech@lists.sourceforge.net, hch@infradead.org,
	steiner@sgi.com, jbarnes@sgi.com, sylvain.jeaugey@bull.net,
	djh@sgi.com, linux-kernel@vger.kernel.org, colpatch@us.ibm.com,
	Simon.Derr@bull.net, ak@suse.de, sivanich@sgi.com,
	raybry@sgi.com
Subject: Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement
Date: Sun, 3 Oct 2004 20:33:14 -0700
Message-ID: <20041003203314.04133167.pj@sgi.com>
In-Reply-To: <833710000.1096847229@[10.10.2.4]>

Martin wrote:
> Rebalance!  
> Ooooh, CPU 3 over there looks heavily loaded, I'll steal something.
> That one. Try to migrate. Oops, no cpus_allowed bars me.
> ...
> Humpf. I give up.
> ... ad infinitum.
> 
> Desperately boring, and rather ineffective.

Well ... I don't mind unemployed CPUs being borish.  It's not that they
have much useful work to do.  But if they keep beating down the doors of
their neighbors trying to find work, that seems disruptive.  Won't CPU 3
in your example waste time and suffer increased lock contention,
responding to its deadbeat neighbor?
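To make the cost concrete, here is a minimal sketch in user-space C, not kernel code. The struct and function names are hypothetical, invented for illustration. It shows an idle CPU scanning a busy neighbor's queue: every task is examined (in a real scheduler, under the busy queue's lock), yet none can be pulled when cpus_allowed excludes the idle CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task: only the affinity mask matters here. */
struct task {
    unsigned long cpus_allowed;   /* bit n set => task may run on CPU n */
};

/*
 * Count how many tasks on a busy CPU's queue the idle CPU could pull.
 * Every task is still examined, so the scan has a cost even when the
 * answer is zero. *examined reports how many tasks were looked at.
 */
static size_t count_stealable(const struct task *queue, size_t ntasks,
                              unsigned int idle_cpu, size_t *examined)
{
    size_t stealable = 0;

    assert(idle_cpu < 8 * sizeof(unsigned long));
    *examined = 0;
    for (size_t i = 0; i < ntasks; i++) {
        (*examined)++;
        if (queue[i].cpus_allowed & (1UL << idle_cpu))
            stealable++;
    }
    return stealable;
}
```

With three tasks all pinned to CPU 3, a scan on behalf of CPU 1 examines all three and finds nothing it may take, which is the "Humpf. I give up." case repeated on every rebalance.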


> > Likely your same concerns apply to the task->mems_allowed field that
> > I added, in the same fashion, in my cpuset patch of recent.
> 
> Mmm, I'm less concerned about that one, or at least I can't specifically
> see how it breaks.

Ray Bryant <raybry@sgi.com> is working on this now.  There are ways to get
memory allocated that hurt on our big boxes - such as blowing out one
node's memory with a disproportionate share of the system's page cache
pages, due to problems vaguely like the cpus_allowed ones.

The kernel allocator and numa placement policies don't really integrate
mems_allowed into their algorithms, but rather are just whacked upside
the head anytime they ask if they can allocate on a non-allowed node.
They can end up doing suboptimal placement on big boxes.

A common one is that the first node in a multiple-node cpuset gets a
bigger memory load from allocations initiated on nodes upstream of it
that weren't allowed to roost closer to home (or something like this ...
not sure I said this one just right).
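A rough sketch of how that skew could arise, in user-space C rather than kernel code. It assumes a simplified fallback order (the local node first, then ascending node numbers with wraparound); real zonelists are ordered by NUMA distance. NR_NODES and pick_node() are hypothetical names used only for this illustration.

```c
#include <assert.h>

#define NR_NODES 8   /* assumed node count for the illustration */

/*
 * Walk the fallback order starting at the node the allocation was
 * initiated on and return the first node that mems_allowed permits.
 * Returns -1 if no node is allowed.
 */
static int pick_node(int local_node, unsigned long mems_allowed)
{
    assert(local_node >= 0 && local_node < NR_NODES);
    for (int i = 0; i < NR_NODES; i++) {
        int node = (local_node + i) % NR_NODES;

        if (mems_allowed & (1UL << node))
            return node;
    }
    return -1;
}
```

With mems_allowed covering nodes 4, 5 and 6, allocations initiated on nodes 1, 2 or 3 all land on node 4, while nodes 5 and 6 only receive allocations initiated locally. The mask is consulted as a veto at each step; nothing spreads the overflow across the allowed nodes.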

Ray is leaning on me to get some kind of memory policy in each cpuset.
I'm giving him a hard time back over details of what this policy
structure should look like, buying time while I try to make more sense
of this all.

I've added him to the cc list here - hopefully he will find my
characterization of our discussions amusing ;).


> > Somewhat like dual-channeled disks, having more than one
> > sched_domain apply at the same time to a given CPU leads to confusions
> > best avoided unless desperately needed.
> 
> Agreed. The cpus_allowed mechanism doesn't seem well suited to heavy use
> anyway (I think John Hawkes had problems with it too).

The problems Hawkes had were various race conditions using the
new (at the time) set_cpus_allowed() that Ingo (I believe) added as part
of the O(1) scheduler.  SGI was on the bleeding edge of using the
set_cpus_allowed() call in new and exciting ways, and there were
race and locking issues, plus issues with making sure the per-cpu
migration threads stayed home.

Other than reminding us that this stuff is hard, these problems Hawkes
dealt with don't, to my understanding, shed any light on the new issue
uncovered in this thread, that a simple per-task cpus_allowed mask,
heavily used to affect affinity policy, can interact poorly with
sophisticated schedulers trying to balance an entire system.

===

In sum, I am tending further in the direction of thinking we need to
have scheduler and allocation policies handled on a "per-domain" basis,
where these domains take the form of a partition of the system into
equivalence classes corresponding to subtrees of the cpuset hierarchy.

For example, just to throw out a wild and crazy idea, perhaps instead of
one global set of zonelists (one per node, each containing all nodes,
sorted in various numa friendly orders), rather there should be a set of
zonelists per memory-domain, containing just the nodes therein
(subsetted from the global zonelists, preserving order).
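As a sketch of what "subsetted from the global zonelists, preserving order" might mean, in user-space C under simplifying assumptions: a zonelist is reduced to a plain array of node numbers, and a memory domain to a bitmask of nodes. subset_zonelist() is a hypothetical helper, not an existing kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy into out[] only those nodes of the global zonelist that belong
 * to the domain, keeping their relative (NUMA-friendly) order.
 * out[] must have room for n entries. Returns the number kept.
 */
static size_t subset_zonelist(const int *global, size_t n,
                              unsigned long domain_nodes, int *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        assert(global[i] >= 0 &&
               (size_t)global[i] < 8 * sizeof(unsigned long));
        if (domain_nodes & (1UL << global[i]))
            out[kept++] = global[i];
    }
    return kept;
}
```

Given a global list of nodes 2, 3, 1, 0, 4, 5 and a domain holding nodes 0, 2 and 5, the per-domain list comes out as 2, 0, 5. An allocator walking that list never has to be told "no" for a node outside the domain.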

We'll have to be careful here.  I suspect that the tolerance of those
running normal sized systems for this kind of crap will be pretty low.

Moreover, the scheduler in particular, and the allocator somewhat as
well, are areas with a long history of intense technical development.
Our impact on these areas has to be simple, so that folks doing the
real work here can keep our multi-domain stuff working with almost no
mind to it at all.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373
