From: Chris Snook <csnook@redhat.com>
To: Tong Li <tong.n.li@intel.com>
Cc: "Bill Huey (hui)" <billh@gnuppy.monkey.org>,
	Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC] scheduler: improve SMP fairness in CFS
Date: Sat, 28 Jul 2007 22:40:45 -0400
Message-ID: <46ABFE2D.1060505@redhat.com>
In-Reply-To: <Pine.LNX.4.64.0707281224100.22772@tongli.jf.intel.com>

Tong Li wrote:
> On Fri, 27 Jul 2007, Chris Snook wrote:
> 
>> Bill Huey (hui) wrote:
>>> You have to consider the target for this kind of code. There are applications
>>> where you need something that falls within a constant error bound. According
>>> to the numbers, the current CFS rebalancing logic doesn't achieve that to
>>> any degree of rigor. So CFS is ok for SCHED_OTHER, but not for anything more
>>> strict than that.
>>
>> I've said from the beginning that I think that anyone who desperately 
>> needs perfect fairness should be explicitly enforcing it with the aid 
>> of realtime priorities.  The problem is that configuring and tuning a 
>> realtime application is a pain, and people want to be able to 
>> approximate this behavior without doing a whole lot of dirty work 
>> themselves.  I believe that CFS can and should be enhanced to ensure 
>> SMP-fairness over potentially short, user-configurable intervals, even 
>> for SCHED_OTHER.  I do not, however, believe that we should take it to 
>> the extreme of wasting CPU cycles on migrations that will not improve 
>> performance for *any* task, just to avoid letting some tasks get ahead 
>> of others.  We should be as fair as possible but no fairer.  If we've 
>> already made it as fair as possible, we should account for the margin 
>> of error and correct for it the next time we rebalance.  We should not 
>> burn the surplus just to get rid of it.
> 
> Proportional-share scheduling actually has one of its roots in real-time 
> and having a p-fair scheduler is essential for real-time apps (soft 
> real-time).

Sounds like another scheduler class might be in order.  I find CFS to be 
fair enough for most purposes.  If the code that gives us near-perfect 
fairness at the expense of efficiency only runs when tasks have been 
given boosted priority by a privileged user, and only on the CPUs that 
have such tasks queued on them, the run time overhead and code 
complexity become much smaller concerns.
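
To make the gating idea concrete, here is a minimal sketch in Python (a toy
model, not kernel code). The Task class, its "boosted" flag, and the
runqueue mapping are all hypothetical names invented for illustration; the
only point is that the expensive fairness pass would be limited to CPUs
that have a boosted task queued.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    # Hypothetical flag: set when a privileged user has boosted the task.
    boosted: bool = False


def cpus_needing_strict_fairness(runqueues):
    """Return the ids of CPUs that have at least one boosted task queued.

    `runqueues` maps a CPU id to the list of tasks queued on that CPU.
    Every other CPU would keep the cheaper default balancing.
    """
    return sorted(
        cpu
        for cpu, tasks in runqueues.items()
        if any(task.boosted for task in tasks)
    )
```

With one boosted task on CPU 1 and none anywhere else, only CPU 1 is
selected, so the remaining CPUs pay nothing for the stricter pass.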

>>
>> On a non-NUMA box with single-socket, non-SMT processors, a constant 
>> error bound is fine.  Once we add SMT, go multi-core, go NUMA, and add 
>> inter-chassis interconnects on top of that, we need to multiply this 
>> error bound at each stage in the hierarchy, or else we'll end up 
>> wasting CPU cycles on migrations that actually hurt the processes 
>> they're supposed to be helping, and hurt everyone else even more.  I 
>> believe we should enforce an error bound that is proportional to 
>> migration cost.
>>
> 
> I think we are actually in agreement. When I say constant bound, it can 
> certainly be a constant that's determined based on inputs from the 
> memory hierarchy. The point is that it needs to be a constant 
> independent of things like # of tasks.

Agreed.
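
A rough sketch of what we seem to agree on, in Python (a toy model, not
kernel code). The level names, the cost factors, and the 100 us base bound
are made-up assumptions; real values would have to be measured. The bound
grows at each stage of the hierarchy in proportion to migration cost, and
nothing in it depends on the number of tasks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopologyLevel:
    name: str
    # Hypothetical: migration cost relative to the level below it.
    cost_factor: int


def error_bounds(base_bound_us, levels):
    """Map each level name to its tolerated imbalance, in microseconds."""
    bounds = {}
    bound = base_bound_us
    for level in levels:  # innermost level first
        bound *= level.cost_factor
        bounds[level.name] = bound
    return bounds


def worth_migrating(imbalance_us, level_name, bounds):
    """Migrate only when the imbalance exceeds the bound for that level."""
    return imbalance_us > bounds[level_name]


# Made-up factors for illustration only.
LEVELS = [
    TopologyLevel("smt", 1),
    TopologyLevel("core", 2),
    TopologyLevel("numa", 4),
]
BOUNDS = error_bounds(100, LEVELS)
```

Under these assumed numbers, a 500 us imbalance would justify a migration
between cores but not one across NUMA nodes.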

>> But this patch is only relevant to SCHED_OTHER.  The realtime 
>> scheduler doesn't have a concept of fairness, just priorities.  That's 
>> why each realtime priority level has its own separate runqueue.  
>> Realtime schedulers are supposed to be dumb as a post, so they cannot 
>> heuristically decide to do anything other than precisely what you 
>> configured them to do, and so they don't get in the way when you're 
>> context switching a million times a second.
> 
> Are you referring to hard real-time? As I said, an infrastructure that 
> enables p-fair scheduling, EDF, or things alike is the foundation for 
> real-time. I designed DWRR, however, with a target of non-RT apps, 
> although I was hoping the research results might be applicable to RT.

I'm referring to the static priority SCHED_FIFO and SCHED_RR schedulers, 
which are (intentionally) dumb as a post, allowing userspace to manage 
CPU time explicitly.  Proportionally fair scheduling is a cool 
capability, but not a design goal of those schedulers.
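
For illustration, a toy model in Python of what "dumb as a post" means
here (not the kernel's actual data structures). It assumes one FIFO queue
per priority level and that a higher number means higher priority; the
class and method names are hypothetical. The picker always takes the head
of the highest non-empty level and never tries to be fair between levels.

```python
from collections import deque


class StaticPriorityQueues:
    """One FIFO queue per static priority level."""

    def __init__(self, levels):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, task, priority):
        # Tasks at the same priority run in arrival order.
        self.queues[priority].append(task)

    def pick_next(self):
        # Scan from the highest priority down; no heuristics involved.
        for queue in reversed(self.queues):
            if queue:
                return queue.popleft()
        return None  # nothing runnable
```

A task at a lower level runs only once every higher level is empty, so
any proportional sharing has to be arranged by userspace.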

	-- Chris
