Re: [RFC][PATCH 0/6] Add group fairness to CFS - v1

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Ingo Molnar <mingo@elte.hu>
To: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
	efault@gmx.de, kernel@kolivas.org, containers@lists.osdl.org,
	ckrm-tech@lists.sourceforge.net, torvalds@linux-foundation.org,
	akpm@linux-foundation.org, pwil3058@bigpond.net.au,
	tingy@cs.umass.edu, tong.n.li@intel.com, wli@holomorphy.com,
	linux-kernel@vger.kernel.org, dmitry.adamushko@gmail.com,
	balbir@in.ibm.com, Kirill Korotaev <dev@sw.ru>
Subject: Re: [RFC][PATCH 0/6] Add group fairness to CFS - v1
Date: Tue, 12 Jun 2007 08:26:12 +0200	[thread overview]
Message-ID: <20070612062612.GA19637@elte.hu> (raw)
In-Reply-To: <20070612055024.GA26957@in.ibm.com>

* Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> wrote:

> On Mon, Jun 11, 2007 at 09:39:31PM +0200, Ingo Molnar wrote:
> > i mean bit 6, value 64. I flipped around its meaning in -v17-rc4, so the
> > new precise stats code there is now default-enabled - making SMP
> > load-balancing more accurate.
> 
> I must be missing something here. AFAICS, cpu_load calculation still 
> is timer-interrupt driven in the -v17 snapshot you sent me. Besides, 
> there is no change in default value of bit 6 b/n v16 and v17:
> 
> -unsigned int sysctl_sched_features __read_mostly = 1 | 2 | 4 | 8 | 0 | 0;
> +unsigned int sysctl_sched_features __read_mostly = 0 | 2 | 4 | 8 | 0 | 0;
> 
> So where's this precise stats based calculation of cpu_load?

but there's a change in the interpretation of bit 6:

-       if (!(sysctl_sched_features & 64)) {
-               this_load = this_rq->raw_weighted_load;
+       if (sysctl_sched_features & 64) {
+               this_load = this_rq->lrq.raw_weighted_load;

the update of the cpu_load[] value is timer interrupt driven, but the 
_value_ that is sampled is not. Previously we used ->raw_weighted_load 
(at whatever value it happened to be at the moment the timer irq hit the 
system), now we basically use a load derived from the fair-time passed 
since the last scheduler tick. (Mathematically it's close to an integral 
of load done over that period) So it takes all scheduling activities and 
all load values into account to calculate the average, not just the 
value that was sampled by the scheduler tick.

this, besides being more precise (it for example correctly samples 
short-lived, timer-interrupt-driven workloads too, which were largely 
'invisible' to the previous load calculation method), also enables us to 
make the scheduler tick hrtimer based in the (near) future. (in essence 
making the scheduler tick-less even when there are tasks running)

> Anyway, do you agree that splitting the cpu_load/nr_running fields so 
> that:
> 
> rq->nr_running 	   	  = total count of -all- tasks in runqueue
> rq->raw_weighted_load	  = total weight of -all- tasks in runqueue
> rq->lrq.nr_running 	  = total count of SCHED_NORMAL/BATCH tasks in runqueue
> rq->lrq.raw_weighted_load = total weight of SCHED_NORMAL/BATCH tasks in runqueue
> 
> is a good thing to avoid SCHED_RT<->SCHED_NORMAL/BATCH mixup (as 
> accomplished in Patch #4)?

yes, i agree in general, even though this causes some small overhead. 
This also has another advantage: the inter-policy isolation and load 
balancing is similar to what fair group scheduling does, so even 'plain' 
Linux will use the majority of the framework.

> If you don't agree, then I will make this split dependent on 
> CONFIG_FAIR_GROUP_SCHED

no, i'd rather avoid that #ifdeffery.

> > > Patch 6 hooks up scheduler with container patches in mm (as an 
> > > interface for task-grouping functionality).
> 
> Just to be clear, by container patches, I am referring to "process" 
> container patches from Paul Menage [1]. They aren't necessarily tied 
> to "virtualization-related" container support in -mm tree, although I 
> believe that "virtualization-related" container patches will make use 
> of the same "process-related" container patches for their 
> task-grouping requirements. Phew ..we need better names!

i'd still like to hear back from Kirill & co whether this framework is 
flexible enough for their work (OpenVZ, etc.) too.

	Ingo

next prev parent reply	other threads:[~2007-06-12  6:28 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-11 15:47 [RFC][PATCH 0/6] Add group fairness to CFS - v1 Srivatsa Vaddagiri
2007-06-11 15:50 ` [RFC][PATCH 1/6] Introduce struct sched_entity and struct lrq Srivatsa Vaddagiri
2007-06-11 18:48   ` Linus Torvalds
2007-06-11 18:56     ` Ingo Molnar
2007-06-12  2:15   ` [ckrm-tech] " Balbir Singh
2007-06-12  3:52     ` Srivatsa Vaddagiri
2007-06-11 15:52 ` [RFC][PATCH 2/6] task's cpu information needs to be always correct Srivatsa Vaddagiri
2007-06-12  2:17   ` [ckrm-tech] " Balbir Singh
2007-06-11 15:53 ` [RFC][PATCH 3/6] core changes in CFS Srivatsa Vaddagiri
2007-06-12  2:29   ` Balbir Singh
2007-06-12  4:22     ` Srivatsa Vaddagiri
2007-06-11 15:55 ` [RFC][PATCH 4/6] Fix (bad?) interactions between SCHED_RT and SCHED_NORMAL tasks Srivatsa Vaddagiri
2007-06-12  9:03   ` Dmitry Adamushko
2007-06-12 10:26     ` Srivatsa Vaddagiri
2007-06-12 12:23       ` Dmitry Adamushko
2007-06-12 13:30         ` Srivatsa Vaddagiri
2007-06-12 14:31           ` Dmitry Adamushko
2007-06-12 15:43             ` Srivatsa Vaddagiri
2007-06-11 15:56 ` [RFC][PATCH 5/6] core changes for group fairness Srivatsa Vaddagiri
2007-06-13 20:56   ` Dmitry Adamushko
2007-06-14 12:06     ` Srivatsa Vaddagiri
2007-06-11 15:58 ` [RFC][PATCH 6/6] Hook up to container infrastructure Srivatsa Vaddagiri
2007-06-11 16:02 ` [RFC][PATCH 0/6] Add group fairness to CFS - v1 Srivatsa Vaddagiri
2007-06-11 19:37 ` Ingo Molnar
2007-06-11 19:39   ` Ingo Molnar
2007-06-12  5:50   ` Srivatsa Vaddagiri
2007-06-12  6:26     ` Ingo Molnar [this message]
     [not found]       ` <20070612072742.GA785@in.ibm.com>
2007-06-12 10:56         ` Srivatsa Vaddagiri
2007-06-15 12:46       ` Kirill Korotaev
2007-06-15 14:06         ` Srivatsa Vaddagiri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070612062612.GA19637@elte.hu \
    --to=mingo@elte.hu \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@in.ibm.com \
    --cc=ckrm-tech@lists.sourceforge.net \
    --cc=containers@lists.osdl.org \
    --cc=dev@sw.ru \
    --cc=dmitry.adamushko@gmail.com \
    --cc=efault@gmx.de \
    --cc=kernel@kolivas.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=pwil3058@bigpond.net.au \
    --cc=tingy@cs.umass.edu \
    --cc=tong.n.li@intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vatsa@linux.vnet.ibm.com \
    --cc=wli@holomorphy.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox