public inbox for linux-kernel@vger.kernel.org
* [wake_afine fixes/improvements 0/3] Introduction
@ 2011-01-15  1:57 Paul Turner
From: Paul Turner @ 2011-01-15  1:57 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Mike Galbraith, Nick Piggin,
	Srivatsa Vaddagiri


I've been looking at the wake_affine path to improve the group scheduling case
(wake affine performance for fair group sched has historically lagged) as well
as tweaking performance in general.

The current series of patches is attached.  The first should probably be
considered for 2.6.38, since it fixes a bug/regression in the case of waking up
onto a previously (group) empty cpu; the others can be considered more
forward-looking.

I've been using an rpc ping-pong workload, which is known to be sensitive to
poor affine decisions, to benchmark these changes; I'm happy to run these
patches against other workloads.  In particular, improvements on reaim have been demonstrated,
but since it's not as stable a benchmark the numbers are harder to present in
a representative fashion.  Suggestions/pet benchmarks greatly appreciated
here.

Some other things experimented with (but didn't pan out as a performance win):
- Considering instantaneous load on prev_cpu as well as current_cpu
- Using more gentle wl/wg values to reflect that a task's contribution to
load_contribution is likely less than its weight.

Performance:

(throughput is measured in txn/s across a 5 minute interval, with a 30 second
warmup)

tip (no group scheduling):
throughput=57798.701988 reqs/sec.
throughput=58098.876188 reqs/sec.

tip (autogroup + current shares code and associated broken effective_load):
throughput=49824.283179 reqs/sec.
throughput=48527.942386 reqs/sec.

tip (autogroup + old tg_shares code): [parity goal post]
throughput=57846.575060 reqs/sec.
throughput=57626.442034 reqs/sec.

tip (autogroup + effective_load rewrite):
throughput=58534.073595 reqs/sec.
throughput=58068.072052 reqs/sec.

tip (autogroup + effective_load + no affine moves for hot tasks):
throughput=60907.794697 reqs/sec.
throughput=61208.305629 reqs/sec.
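
The relative differences between the configurations above can be worked out
from the listed runs.  A minimal sketch in Python, assuming the two samples per
configuration are simply averaged; the short labels and the summarize() helper
are illustrative names, not anything from the series:

```python
from statistics import mean

# Throughput samples (reqs/sec) copied from the runs listed above.
# The short labels are illustrative names, not identifiers from the series.
RUNS = {
    "no group scheduling": [57798.701988, 58098.876188],
    "autogroup, broken effective_load": [49824.283179, 48527.942386],
    "autogroup, old tg_shares": [57846.575060, 57626.442034],
    "autogroup, effective_load rewrite": [58534.073595, 58068.072052],
    "autogroup, rewrite + no hot affine": [60907.794697, 61208.305629],
}


def summarize(runs, baseline):
    """Return {label: (mean throughput, percent change vs. baseline mean)}."""
    base = mean(runs[baseline])
    return {
        label: (mean(samples), 100.0 * (mean(samples) - base) / base)
        for label, samples in runs.items()
    }


if __name__ == "__main__":
    for label, (avg, pct) in summarize(RUNS, "no group scheduling").items():
        print(f"{label:36s} {avg:12.2f} reqs/sec  {pct:+6.2f}%")
```

With only two samples per configuration this is a rough comparison: the broken
effective_load case comes out roughly 15% below the no-group baseline, the
rewrite lands within about 1% of it, and avoiding affine moves for hot tasks is
roughly 5% above it.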

Thanks,

- Paul





end of thread, other threads:[~2011-01-18 21:52 UTC | newest]

Thread overview: 12+ messages
2011-01-15  1:57 [wake_afine fixes/improvements 0/3] Introduction Paul Turner
2011-01-15  1:57 ` [wake_afine fixes/improvements 1/3] sched: update effective_load() to use global share weights Paul Turner
2011-01-17 14:11   ` Peter Zijlstra
2011-01-17 14:20     ` Peter Zijlstra
2011-01-18 19:04   ` [tip:sched/urgent] sched: Update " tip-bot for Paul Turner
2011-01-15  1:57 ` [wake_afine fixes/improvements 2/3] sched: clean up task_hot() Paul Turner
2011-01-17 14:14   ` Peter Zijlstra
2011-01-18 21:52     ` Paul Turner
2011-01-15  1:57 ` [wake_afine fixes/improvements 3/3] sched: introduce sched_feat(NO_HOT_AFFINE) Paul Turner
2011-01-15 14:29 ` [wake_afine fixes/improvements 0/3] Introduction Mike Galbraith
2011-01-15 19:29   ` Paul Turner
2011-01-15 21:34 ` Nick Piggin
