From: Peter Zijlstra <peterz@infradead.org>
To: pjt@google.com
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Srivatsa Vaddagiri <vatsa@in.ibm.com>,
	Chris Friesen <cfriesen@nortel.com>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	Pierre Bourdon <pbourdon@excellency.fr>,
	Bharata B Rao <bharata@linux.vnet.ibm.com>
Subject: Re: [RFC tg_shares_up improvements - v1 00/12] [RFC tg_shares_up - v1 00/12] Reducing cost of tg->shares distribution
Date: Sat, 16 Oct 2010 21:46:54 +0200
Message-ID: <1287258414.1998.133.camel@laptop>
In-Reply-To: <20101016044349.830426011@google.com>

On Fri, 2010-10-15 at 21:43 -0700, pjt@google.com wrote:
> Hi all,
> 
> Peter previously posted a patchset that attempted to improve the problem of
> task_group share distribution.  This is something that has been a long-time
> pain point for group scheduling.  The existing algorithm considers
> distributions on a per-cpu-per-domain basis and carries a fairly high update
> overhead, especially on larger machines.
> 
> I was previously looking at improving this using Fenwick trees to allow a
> single sum without the exorbitant cost but then Peter's idea above was better :).
> 
> The kernel of the idea is that by monitoring the average contribution to load on a
> per-cpu-per-taskgroup basis we can distribute the weight for which we are
> expected to consume.
> 
> This set extends the original posting with a focus on increased fairness and
> reduced convergence (to true average) time.  In particular the case of large
> over-commit in the case of a distributed wake-up is a concern which is now
> fairly well addressed.
> 
> Obviously everything's experimental but it should be stable/fair.
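
For illustration, the proportional distribution described in the quoted
text could be sketched roughly as below. This is a minimal userspace
sketch and not the code from the patch set: distribute_shares() and all
of its parameters are hypothetical names, and the even split for an idle
group is an assumption made only to keep the example well defined.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patches): split a task group's total
 * shares across CPUs in proportion to each CPU's averaged load
 * contribution for that group.
 */
static void distribute_shares(uint64_t tg_shares, const uint64_t *load_avg,
			      uint64_t *cpu_shares, size_t ncpus)
{
	uint64_t total = 0;
	size_t i;

	/* Sum the per-cpu averages to get the group-wide load. */
	for (i = 0; i < ncpus; i++)
		total += load_avg[i];

	for (i = 0; i < ncpus; i++) {
		if (total == 0)
			/* Assumption: an idle group splits its shares evenly. */
			cpu_shares[i] = tg_shares / ncpus;
		else
			cpu_shares[i] = tg_shares * load_avg[i] / total;
	}
}
```

With tg_shares = 1024 and per-cpu averages of 100, 300 and 0, the three
CPUs would receive 256, 768 and 0 shares respectively.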

I like what you've done with it. My only worry is 10/12, where you allow
for extra updates to the global state. I think they should be fairly
limited in number, and I can see the need for the update if we get too
far out of whack, but it is something to look at while testing this
stuff.
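
To make the concern concrete, a thresholded global update might look
something like the sketch below. This is only an illustration under
stated assumptions: the struct and function names are hypothetical, the
1/8 threshold is an arbitrary illustrative value, and the sketch is
single-threaded, whereas shared group state in the kernel would need
atomic updates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the group-wide and per-cpu load state. */
struct group_load {
	int64_t global_weight;	/* sum of all per-cpu contributions */
};

struct cpu_load {
	int64_t contribution;	/* what this cpu last published globally */
};

/*
 * Publish this cpu's new load average to the global sum only when forced
 * or when it has drifted too far from the last published value.
 * Returns true when the global state was touched.
 */
static bool maybe_update_global(struct group_load *tg, struct cpu_load *cpu,
				int64_t load_avg, bool force)
{
	int64_t delta = load_avg - cpu->contribution;
	int64_t magnitude = delta < 0 ? -delta : delta;

	/* Assumption: "too far out of whack" means more than 1/8 drift. */
	if (force || magnitude > cpu->contribution / 8) {
		tg->global_weight += delta;
		cpu->contribution += delta;
		return true;
	}
	return false;
}
```

Counting how often this returns true on the non-forced path is one way to
check that the extra global updates stay limited in number.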

> TODO:
> - Validate any RT interaction

I don't think there's anything to worry about there. The only
interaction between this and the rt scheduling classes is the initial
sharing of the load-avg window, but you 'cure' that in 7/12.

(I think that sysctl wants a _us postfix someplace and we thus want some
NSEC_PER_USEC multiplication in there).
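
Roughly what I mean, as a userspace sketch: the tunable is kept in
microseconds and converted to nanoseconds at the point of use. The
function name shares_window_ns() and the parameter name are hypothetical,
and NSEC_PER_USEC is redefined locally here so the example stands alone.

```c
#include <stdint.h>

/* Local stand-in for the kernel constant of the same name. */
#define NSEC_PER_USEC 1000ULL

/*
 * Hypothetical helper: convert a shares window given in microseconds (as
 * a "_us" sysctl would store it) into the nanoseconds the scheduler
 * works in. Widening to 64 bits before multiplying avoids 32-bit
 * overflow for large window values.
 */
static uint64_t shares_window_ns(unsigned int window_us)
{
	return (uint64_t)window_us * NSEC_PER_USEC;
}
```

A window of 10000 us comes out as 10000000 ns.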

> - Continue collecting/analyzing performance and fairness data

Yes please ;-), I'll try and run this on some machines as well.

> - Should the shares period just be the sched_latency?

Interesting idea... let's keep it a separate sysctl for now for easy
tuning, if things settle down and we're still good in that range we can
consider merging them.


