From: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
To: "Chris Friesen" <cfriesen@nortel.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
	a.p.zijlstra@chello.nl, pj@sgi.com,
	Balbir Singh <balbir@in.ibm.com>,
	aneesh.kumar@linux.vnet.ibm.com, dhaval@linux.vnet.ibm.com
Subject: Re: fair group scheduler not so fair?
Date: Fri, 30 May 2008 17:06:53 +0530
Message-ID: <20080530113653.GI12836@linux.vnet.ibm.com>
In-Reply-To: <483F207D.4010908@nortel.com>

On Thu, May 29, 2008 at 03:30:37PM -0600, Chris Friesen wrote:
> Overall the group scheduler results look better, but I'm seeing an odd 
> scenario within a single group where sometimes I get a 67/67/66 breakdown 
> but sometimes it gives 100/50/50.

Hmm, I can't recreate this 100/50/50 situation (tried about 10 times).

> Also, although the long-term results are good, the shorter-term fairness 
> isn't great.  Is there a tuneable that would allow for a tradeoff between 
> performance and fairness?

The tuneables I can think of are:

- HZ (the higher the better)
- min/max_interval and imbalance_pct for each domain (the lower the better)

> I have people that are looking for within 4% fairness over a 1sec interval.

That seems pretty difficult to achieve with the per-cpu runqueue
and smpnice-based load balancing approach we have now.

> Initially I tried a simple setup with three hogs all in the default "sys" 
> group.  Over multiple retries using 10-sec intervals, sometimes it gave 
> roughly 67% for each task, other times it settled into a 100/50/50 split 
> that remained stable over time.

Was this with imbalance_pct set to 105? Does it make any difference if
you change imbalance_pct to, say, 102?

> 3 tasks in sys
>  2471 cfriesen  20   0  3800  392  336 R 99.9  0.0   0:29.97 cat
>  2470 cfriesen  20   0  3800  392  336 R 50.3  0.0   0:17.83 cat
>  2469 cfriesen  20   0  3800  392  336 R 49.6  0.0   0:17.96 cat
>
> retry
>  2475 cfriesen  20   0  3800  392  336 R 68.3  0.0   0:28.46 cat
>  2476 cfriesen  20   0  3800  392  336 R 67.3  0.0   0:28.24 cat
>  2474 cfriesen  20   0  3800  392  336 R 64.3  0.0   0:28.73 cat
>
>  2476 cfriesen  20   0  3800  392  336 R 67.1  0.0   0:41.79 cat
>  2474 cfriesen  20   0  3800  392  336 R 66.6  0.0   0:41.96 cat
>  2475 cfriesen  20   0  3800  392  336 R 66.1  0.0   0:41.67 cat
>
> retry
>  2490 cfriesen  20   0  3800  392  336 R 99.7  0.0   0:22.23 cat
>  2489 cfriesen  20   0  3800  392  336 R 49.9  0.0   0:21.02 cat
>  2491 cfriesen  20   0  3800  392  336 R 49.9  0.0   0:13.94 cat
>
>
> With three groups, one task in each, I tried both 10 and 60 second 
> intervals.  The longer interval looked better but was still up to 0.8% off:

I honestly don't know if we can do better than 0.8%! In any case, I'd
expect that it would require more drastic changes.

> 10-sec
>  2490 cfriesen  20   0  3800  392  336 R 68.9  0.0   1:35.13 cat
>  2491 cfriesen  20   0  3800  392  336 R 65.8  0.0   1:04.65 cat
>  2489 cfriesen  20   0  3800  392  336 R 64.5  0.0   1:26.48 cat
>
> 60-sec
>  2490 cfriesen  20   0  3800  392  336 R 67.5  0.0   3:19.85 cat
>  2491 cfriesen  20   0  3800  392  336 R 66.3  0.0   2:48.93 cat
>  2489 cfriesen  20   0  3800  392  336 R 66.2  0.0   3:10.86 cat
>
>
> Finally, a more complicated scenario.  three tasks in A, two in B, and one 
> in C.  The 60-sec trial was up to 0.8 off, while a 3-second trial (just for 
> fun) was 8.5% off.
>
> 60-sec
>  2491 cfriesen  20   0  3800  392  336 R 65.9  0.0   5:06.69 cat
>  2499 cfriesen  20   0  3800  392  336 R 33.6  0.0   0:55.35 cat
>  2490 cfriesen  20   0  3800  392  336 R 33.5  0.0   4:47.94 cat
>  2497 cfriesen  20   0  3800  392  336 R 22.6  0.0   0:38.76 cat
>  2489 cfriesen  20   0  3800  392  336 R 22.2  0.0   4:28.03 cat
>  2498 cfriesen  20   0  3800  392  336 R 22.2  0.0   0:35.13 cat
>
> 3-sec
>  2491 cfriesen  20   0  3800  392  336 R 58.2  0.0  13:29.60 cat
>  2490 cfriesen  20   0  3800  392  336 R 34.8  0.0   9:07.73 cat
>  2499 cfriesen  20   0  3800  392  336 R 31.0  0.0   5:15.69 cat
>  2497 cfriesen  20   0  3800  392  336 R 29.4  0.0   3:37.25 cat
>  2489 cfriesen  20   0  3800  392  336 R 23.3  0.0   7:26.25 cat
>  2498 cfriesen  20   0  3800  392  336 R 23.0  0.0   3:33.24 cat

I ran with this configuration:

	HZ = 1000
	min/max_interval = 1
	imbalance_pct = 102
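
For anyone wanting to try the same per-domain settings, here is a rough
sketch of how they might be applied at run time. It assumes the knobs are
exposed as files under /proc/sys/kernel/sched_domain/cpuN/domainM/ (which,
as far as I know, needs CONFIG_SCHED_DEBUG); the helper name
set_domain_tunables and its root parameter are made up for illustration.
HZ is a build-time option and is not covered here.

```python
from pathlib import Path

# Assumed location of the per-domain knobs; pass a different root if your
# kernel exposes them elsewhere.
DEFAULT_ROOT = Path("/proc/sys/kernel/sched_domain")


def set_domain_tunables(values, root=DEFAULT_ROOT):
    """Write each name -> value pair into every cpu*/domain* directory.

    Knobs that do not exist in a given domain are skipped.
    Returns the list of files that were written.
    """
    written = []
    for domain in sorted(Path(root).glob("cpu*/domain*")):
        for name, value in values.items():
            knob = domain / name
            if knob.is_file():
                knob.write_text(f"{value}\n")
                written.append(knob)
    return written
```

Run as root, set_domain_tunables({"min_interval": 1, "max_interval": 1,
"imbalance_pct": 102}) would apply the values listed above to every domain
of every CPU, and return an empty list if the assumed directory is absent.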

My 10-sec fairness looks like this (error = 1.5%):

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  #C COMMAND
 4549 root      20   0  1384  228  176 R 65.2  0.0   0:36.02  0 hogc
 4547 root      20   0  1384  228  176 R 32.8  0.0   0:17.87  0 hogb
 4548 root      20   0  1384  228  176 R 32.6  0.0   0:18.28  1 hogb
 4546 root      20   0  1384  232  176 R 22.9  0.0   0:11.82  1 hoga
 4545 root      20   0  1384  228  176 R 22.3  0.0   0:11.74  1 hoga
 4544 root      20   0  1384  232  176 R 22.1  0.0   0:11.93  1 hoga
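
The error metric is not spelled out in this thread. One plausible reading
is the largest absolute deviation of any task's %CPU from its ideal share.
The sketch below works that out for the 10-sec sample above, assuming two
CPUs, equal shares for the three groups, and that the hoga/hogb/hogc names
map to groups with three, two and one task. The function names are
illustrative only.

```python
def ideal_shares(groups, ncpus):
    """Ideal %CPU per task: groups split the machine equally, and tasks
    split their group's share equally."""
    per_group = 100.0 * ncpus / len(groups)
    return {name: per_group / ntasks for name, ntasks in groups.items()}


def max_deviation(samples, groups, ncpus):
    """Largest absolute gap, in percentage points, between an observed
    %CPU value and the ideal share for that task's group."""
    ideal = ideal_shares(groups, ncpus)
    return max(abs(cpu - ideal[group]) for group, cpu in samples)


# Assumed mapping: group name -> number of tasks in it.
groups = {"hoga": 3, "hogb": 2, "hogc": 1}

# %CPU column of the 10-sec sample above.
ten_sec = [
    ("hogc", 65.2),
    ("hogb", 32.8), ("hogb", 32.6),
    ("hoga", 22.9), ("hoga", 22.3), ("hoga", 22.1),
]

print(round(max_deviation(ten_sec, groups, ncpus=2), 1))  # prints 1.5
```

Under these assumptions the ideal shares are about 66.7% for hogc, 33.3%
for each hogb and 22.2% for each hoga; the worst case is hogc at 65.2%.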

3-sec fairness (error = 2.3%, though it sometimes went up to 6.7%):

 4549 root      20   0  1384  228  176 R 69.0  0.0   1:33.56  1 hogc
 4548 root      20   0  1384  228  176 R 32.7  0.0   0:46.74  1 hogb
 4547 root      20   0  1384  228  176 R 29.3  0.0   0:47.16  0 hogb
 4546 root      20   0  1384  232  176 R 22.3  0.0   0:30.80  0 hoga
 4544 root      20   0  1384  232  176 R 20.3  0.0   0:30.95  0 hoga
 4545 root      20   0  1384  228  176 R 19.4  0.0   0:31.17  0 hoga

-- 
Regards,
vatsa
