Re: sched: tweak select_idle_sibling to look for idle threads

From: Yuyang Du <yuyang.du@intel.com>
To: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>, Chris Mason <clm@fb.com>,
	Ingo Molnar <mingo@kernel.org>,
	Matt Fleming <matt@codeblueprint.co.uk>,
	linux-kernel@vger.kernel.org
Subject: Re: sched: tweak select_idle_sibling to look for idle threads
Date: Wed, 11 May 2016 09:23:47 +0800
Message-ID: <20160511012347.GA8790@intel.com>
In-Reply-To: <1462940271.3717.57.camel@gmail.com>

On Wed, May 11, 2016 at 06:17:51AM +0200, Mike Galbraith wrote:
> > >  static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq)
> > >  {
> > > +	if (sched_feat(LB_TIP_AVG_HIGH) && cfs_rq->load.weight > cfs_rq->runnable_load_avg*2)
> > > +		return cfs_rq->runnable_load_avg + min_t(unsigned long, NICE_0_LOAD,
> > > +							 cfs_rq->load.weight/2);
> > >  	return cfs_rq->runnable_load_avg;
> > >  }
> >   
> > cfs_rq->runnable_load_avg is for sure no greater than (in this case much less
> > than, maybe 1/2 of) load.weight, whereas load_avg is not necessarily a rock
> > in the gearbox that only impedes speeding up; it also impedes slowing down.
> 
> Yeah, just like everything else, it'll cut both ways (why you can't
> win the sched game).  If I can believe tbench, at tasks=cpus, reducing
> lag increased utilization and reduced latency a wee bit, as did the
> reserve thing once a booboo got fixed up.

Ok, so you have a secret IDLE_RESERVE? Good luck and show it, ;)

> Makes sense, robbing Peter
> to pay Paul should work out better for Paul.
> 
> NO_LB_TIP_AVG_HIGH
> Throughput 27132.9 MB/sec  96 clients  96 procs  max_latency=7.656 ms
> Throughput 28464.1 MB/sec  96 clients  96 procs  max_latency=9.905 ms
> Throughput 25369.8 MB/sec  96 clients  96 procs  max_latency=7.192 ms
> Throughput 25670.3 MB/sec  96 clients  96 procs  max_latency=5.874 ms
> Throughput 29309.3 MB/sec  96 clients  96 procs  max_latency=1.331 ms
> avg        27189   1.000                                     6.391   1.000
> 
> NO_LB_TIP_AVG_HIGH IDLE_RESERVE
> Throughput 24437.5 MB/sec  96 clients  96 procs  max_latency=1.837 ms
> Throughput 29464.7 MB/sec  96 clients  96 procs  max_latency=1.594 ms
> Throughput 28023.6 MB/sec  96 clients  96 procs  max_latency=1.494 ms
> Throughput 28299.0 MB/sec  96 clients  96 procs  max_latency=10.404 ms
> Throughput 29072.1 MB/sec  96 clients  96 procs  max_latency=5.575 ms
> avg        27859   1.024                                     4.180   0.654
> 
> LB_TIP_AVG_HIGH NO_IDLE_RESERVE
> Throughput 29068.1 MB/sec  96 clients  96 procs  max_latency=5.599 ms
> Throughput 26435.6 MB/sec  96 clients  96 procs  max_latency=3.703 ms
> Throughput 23930.0 MB/sec  96 clients  96 procs  max_latency=7.742 ms
> Throughput 29464.2 MB/sec  96 clients  96 procs  max_latency=1.549 ms
> Throughput 24250.9 MB/sec  96 clients  96 procs  max_latency=1.518 ms
> avg        26629   0.979                                     4.022   0.629
> 
> LB_TIP_AVG_HIGH IDLE_RESERVE
> Throughput 30340.1 MB/sec  96 clients  96 procs  max_latency=1.465 ms
> Throughput 29042.9 MB/sec  96 clients  96 procs  max_latency=4.515 ms
> Throughput 26718.7 MB/sec  96 clients  96 procs  max_latency=1.822 ms
> Throughput 28694.4 MB/sec  96 clients  96 procs  max_latency=1.503 ms
> Throughput 28918.2 MB/sec  96 clients  96 procs  max_latency=7.599 ms
> avg        28742   1.057                                     3.380   0.528
> 
> > But I really don't know what kind of load references select_task_rq()
> > should use. So maybe the real issue is a mix of them, i.e., conflating
> > balancing with just wanting an idle cpu?
> 
> Depends on the goal.  For both, load lagging reality means the high
> frequency component is squelched, meaning less migration cost, but also
> higher latency due to stacking.  It's a tradeoff where Chris' "latency
> is everything" benchmark, and _maybe_ the real world load it's based
> upon, is on Peter's end of the rob Peter to pay Paul transaction.  The
> benchmark says it definitely is, the real world load may have already
> been fixed up by the select_idle_sibling() rewrite.
 
Obviously, load avgs are good at balancing at a larger scale over a timeframe,
so they should be used for comparing/balancing sd's, not cpus. However, this
is not the case currently: avgs are mixed with idle cpu/core selection, so
I think a better job can be done before and after select_idle_sibling().

For example, I don't know what the complex wake_affine() is really doing, or
what it is for. Am I missing something, you think?

Kudos to the select_idle_sibling() rewrite: like Peter said, the second and
even third scan steps are really helping, in addition to the many cleanups
and refactors.
