From: Kirill Korotaev <dev@sw.ru>
To: Ingo Molnar <mingo@elte.hu>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	tingy@cs.umass.edu, wli@holomorphy.com,
	ckrm-tech@lists.sourceforge.net, efault@gmx.de,
	pwil3058@bigpond.net.au, kernel@kolivas.org,
	linux-kernel@vger.kernel.org,
	Guillaume Chazarain <guichaz@yahoo.fr>,
	tong.n.li@intel.com, containers@lists.osdl.org,
	akpm@linux-foundation.org, torvalds@linux-foundation.org
Subject: Re: [RFC] [PATCH 0/3] Add group fairness to CFS
Date: Fri, 25 May 2007 17:05:16 +0400
Message-ID: <4656DF0C.9090306@sw.ru>
In-Reply-To: <20070525082951.GA25280@elte.hu>

Ingo Molnar wrote:
> * Srivatsa Vaddagiri <vatsa@in.ibm.com> wrote:
> 
> 
>>Can you repeat your tests with this patch pls? With the patch applied, 
>>I am now getting the same split between nice 0 and nice 10 task as 
>>CFS-v13 provides (90:10 as reported by top )
>>
>> 5418 guest     20   0  2464  304  236 R   90  0.0   5:41.40 3 hog
>> 5419 guest     30  10  2460  304  236 R   10  0.0   0:43.62 3 nice10hog
> 
> 
> btw., what are you thoughts about SMP?
> 
> it's a natural extension of your current code. I think the best approach 
> would be to add a level of 'virtual CPU' objects above struct user. (how 
> to set the attributes of those objects is open - possibly combine it 
> with cpusets?)

> That way the scheduler would first pick a "virtual CPU" to schedule, and 
> then pick a user from that virtual CPU, and then a task from the user. 

Don't you mean the reverse:
first pick a user to schedule, then a VCPU (which is essentially a runqueue
or rbtree), then a task from that VCPU?
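
To make the order concrete, here is a toy model of "user first, then VCPU,
then task". It is only a sketch under assumptions: the names (User, VCPU,
pick_next), the "least weighted usage" rule for choosing a user and the
"longest runqueue" rule for choosing a VCPU are invented for illustration
and are not the OpenVZ or CFS implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class VCPU:
    name: str
    # Runnable task names in FIFO order (stands in for a runqueue or rbtree).
    runqueue: List[str] = field(default_factory=list)


@dataclass
class User:
    name: str
    weight: int            # relative share of this user (hypothetical)
    used: float = 0.0      # CPU time consumed so far
    vcpus: List[VCPU] = field(default_factory=list)

    def runnable_vcpus(self) -> List[VCPU]:
        return [v for v in self.vcpus if v.runqueue]


def pick_next(users: List[User]) -> Optional[Tuple[str, str, str]]:
    # Level 1: among users with something runnable, take the one that has
    # received the least CPU time relative to its weight.
    candidates = [u for u in users if u.runnable_vcpus()]
    if not candidates:
        return None
    user = min(candidates, key=lambda u: u.used / u.weight)
    # Level 2: take one of that user's VCPUs (here: the longest runqueue).
    vcpu = max(user.runnable_vcpus(), key=lambda v: len(v.runqueue))
    # Level 3: take the task at the head of that VCPU's runqueue.
    return user.name, vcpu.name, vcpu.runqueue[0]
```

With the order reversed (VCPU first, then user), level 1 and level 2 would
swap, and the user-level choice would only see the tasks on the VCPU that
was already picked.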

This is the approach we use in OpenVZ, and if you don't mind
I would propose going this way for fair scheduling in mainstream.
It has its own advantages and disadvantages.

This is not the easy way to go, and I can outline the problems/disadvantages
that appear along it:
- Tasks that bind to a CPU mask will bind to virtual CPUs.
  This is no problem for user tasks, but some kernel threads
  use binding to do CPU-related management (like cpufreq).
  This can actually be fixed using SMP IPIs.
- VCPUs should not change PCPUs very frequently,
  otherwise there is some overhead. Solvable.

Advantages:
- High precision and fairness.
- Allows different group scheduling algorithms to be used
  on top of the VCPU concept.
  OpenVZ uses fairscheduler with a CPU limiting feature, which allows
  setting the maximum CPU time given to a group of tasks.
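
As an illustration of what a CPU limit for a group can look like, here is
a minimal sketch of a per-period budget. The class name, the period length
and the budget rule are assumptions made for this example; it does not
describe how the OpenVZ fairscheduler implements limiting.

```python
class GroupLimit:
    """Cap a task group at `limit` (0..1) of one CPU per accounting period."""

    def __init__(self, limit: float, period_ms: int = 100):
        self.period_ms = period_ms
        self.budget_ms = limit * period_ms   # CPU time allowed per period
        self.period_start = 0
        self.used_ms = 0.0

    def _roll(self, now_ms: int) -> None:
        # Once the current period has elapsed, align to the start of the
        # period containing `now_ms` and reset the consumed time.
        elapsed = now_ms - self.period_start
        if elapsed >= self.period_ms:
            self.period_start = now_ms - elapsed % self.period_ms
            self.used_ms = 0.0

    def charge(self, now_ms: int, ran_ms: float) -> None:
        # Account `ran_ms` of CPU time that the group's tasks just used.
        self._roll(now_ms)
        self.used_ms += ran_ms

    def may_run(self, now_ms: int) -> bool:
        # The group is throttled for the rest of the period once its
        # budget is exhausted.
        self._roll(now_ms)
        return self.used_ms < self.budget_ms
```

A scheduler built this way would call may_run() before picking any task
from the group and charge() after each timeslice.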

> To make group accounting scalable, the accounting object attached to the 
> user struct should/must be per-cpu (per-vcpu) too. That way we'd have a 
> clean hierarchy like:
> 
>   CPU #0 => VCPU A [ 40% ] + VCPU B [ 60% ]
>   CPU #1 => VCPU C [ 30% ] + VCPU D [ 70% ]

How did you select these 40%:60% and 30%:70% splits?

>   VCPU A => USER X [ 10% ] + USER Y [ 90% ]
>   VCPU B => USER X [ 10% ] + USER Y [ 90% ]
>   VCPU C => USER X [ 10% ] + USER Y [ 90% ]
>   VCPU D => USER X [ 10% ] + USER Y [ 90% ]
> 
> the scheduler first picks a vcpu, then a user from a vcpu. (the actual 
> external structure of the hierarchy should be opaque to the scheduler 
> core, naturally, so that we can use other hierarchies too)
> 
> whenever the scheduler does accounting, it knows where in the hierarchy 
> it is and updates all higher level entries too. This means that the 
> accounting object for USER X is replicated for each VCPU it participates 
> in.

So if 2 VCPUs running on 2 physical CPUs do accounting, they have to update
the same user X accounting information, which is not per-[v]cpu?
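
For reference, one reading of "replicated for each VCPU" is that every VCPU
only writes its own copy of user X's counter, and a user-wide figure is
computed by summing the copies when it is needed. The sketch below shows
that pattern; the names are hypothetical, and whether a shared user-wide
object still has to be updated on the hot path is exactly the open question.

```python
from collections import defaultdict


class UserAccounting:
    """Per-VCPU replicas of one user's runtime counter."""

    def __init__(self):
        # One counter per VCPU; a VCPU only ever writes its own entry.
        self.per_vcpu = defaultdict(int)

    def charge(self, vcpu: str, delta_ns: int) -> None:
        # Hot path: touch only the replica of the VCPU that just ran.
        self.per_vcpu[vcpu] += delta_ns

    def total(self) -> int:
        # Slow path: aggregate all replicas to get the user-wide runtime.
        return sum(self.per_vcpu.values())
```

The cost of this layout is that total() reads every replica, so the
user-wide value is cheap to update but more expensive (and possibly
slightly stale) to read.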

> SMP balancing is straightforward: it would fundamentally iterate through 
> the same hierarchy and would attempt to keep all levels balanced - i 
> abstracted away its iterators already.

Thanks,
Kirill
