[ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0

* [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
@ 2012-03-24  9:39 Con Kolivas
  2012-03-24  9:53 ` Gene Heskett
  0 siblings, 1 reply; 18+ messages in thread
From: Con Kolivas @ 2012-03-24  9:39 UTC (permalink / raw)
  To: linux-kernel

This is to announce the first stable release of the BFS CPU scheduler
for linux 3.3.0 designed for optimal interactivity, responsiveness and
throughput on commodity hardware.

The changes since BFS version 0.416 include a fairly large
architectural change just to bring the codebase in sync with 3.3, but
none of the changes should be noticeable in any way. One change that
may be user-visible is that the high resolution IRQ accounting now
appears to be on by default for x86 architectures. There is an issue
that system time accounting is wrong without this feature enabled in
BFS so this should correct that problem.

Other changes:
416-417: A number of ints were changed to bool, which, though unlikely
to have any performance impact, does make the code cleaner, and the
compiled code does often come out different. rq_running_iso was
converted from a function to a macro to avoid it being a separate
function call when compiled in, with the attendant overhead.
requeue_task within the scheduler tick was moved to being done under
lock which may prevent rare races.  test_ret_isorefractory() was
optimised. set_rq_task() was not being called on tasks that were being
requeued within schedule() which could possibly have led to issues if
the task ran out of timeslice during that requeue and should have had
its deadline offset. The need_resched() check that occurs at the end
of schedule() was changed to unlikely() since it really is that. Moved
the scheduler version print function to bfs.c to avoid recompiling the
entire kernel if the version number is changed.

417-418: Fixed a problem with the accounting resync for linux 3.3.

418-419: There was a small possibility that an unnecessary resched
would occur in try_preempt if a task had changed affinity and called
try_preempt with its ->cpu still set to the old cpu it could no longer
run on, so try_preempt was reworked slightly. Reintroduced the
deadline offset based on CPU cache locality on sticky tasks in a way
that is cheaper than how we currently offset the deadline.

419-420: Finally rewrote the earliest_deadline_task code. This has
long been one of the hottest code paths in the scheduler and small
changes here that made it look nice would often slow it down. I spent
quite a few hours reworking it to include fewer GOTOs while
disassembling the code to make sure it was actually getting smaller
with every change.  Then I wrote a scheduler specific version of
find_next_bit which could be inlined into this code and avoid another
function call in the hot path. The overall behaviour is unchanged from
previous BFS versions, but initial benchmarking confirms slight
improvements in throughput.
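
As a rough illustration of the kind of lookup described above, here is a
minimal Python sketch of a "find next set bit" helper. It is not the BFS
code (which is C inside the kernel); it only assumes a bitmap in which bit
i is set when queue i holds a runnable task. The name next_set_bit, the
16-bit size and the example bitmap are made up for illustration. The
return convention (the size when nothing is found) follows the kernel's
find_next_bit().

```python
def next_set_bit(bitmap: int, nbits: int, start: int = 0) -> int:
    """Return the index of the first set bit at or after `start`.

    Returns `nbits` when no such bit exists, matching the contract of the
    kernel's find_next_bit(). Illustrative sketch only, not the BFS code.
    """
    if start >= nbits:
        return nbits
    # Drop the bits below `start` and ignore anything at or above `nbits`.
    masked = (bitmap >> start) & ((1 << (nbits - start)) - 1)
    if masked == 0:
        return nbits
    # (x & -x) isolates the lowest set bit; bit_length() - 1 is its index.
    return start + (masked & -masked).bit_length() - 1


# Hypothetical bitmap: queues 3 and 7 hold runnable tasks.
queues = (1 << 3) | (1 << 7)
first = next_set_bit(queues, 16)          # lowest occupied queue
second = next_set_bit(queues, 16, 4)      # next occupied queue after 3
none_left = next_set_bit(queues, 16, 8)   # nothing at or above bit 8
```

In C the point of inlining such a helper is to avoid a function call in
the hot path; the Python version only shows the logic.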

While interactivity is the prime concern for BFS, as part of the
regression testing, throughput benchmarks are performed with
kernbench. This is a plot of BFS 418/420 and mainline archlinux 3.3.0
kernel on a dual quad hyperthread core2 (lower is better):

http://postimage.org/image/wavusknl1/

Enjoy!
お楽しみ下さい

--
-ck

* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-24  9:39 Con Kolivas
@ 2012-03-24  9:53 ` Gene Heskett
  2012-03-24 10:00   ` Con Kolivas
  2012-03-25  2:05   ` Valdis.Kletnieks
  0 siblings, 2 replies; 18+ messages in thread
From: Gene Heskett @ 2012-03-24  9:53 UTC (permalink / raw)
  To: Con Kolivas, linux-kernel

On Saturday, March 24, 2012, Con Kolivas wrote:
>This is to announce the first stable release of the BFS CPU scheduler
>for linux 3.3.0 designed for optimal interactivity, responsiveness and
>throughput on commodity hardware.

URL?

I for one am happy to see this, Con.  I have been running an earlier patch 
as pclos applies it to 2.6.38.8, and I must say the desktop interactivity 
is very much improved over the non-bfs version.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: <http://coyoteden.dyndns-free.com:85/gene>
I would like to know
What I was fencing in
And what I was fencing out.
		-- Robert Frost

* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-24  9:53 ` Gene Heskett
@ 2012-03-24 10:00   ` Con Kolivas
  2012-03-25  2:05   ` Valdis.Kletnieks
  1 sibling, 0 replies; 18+ messages in thread
From: Con Kolivas @ 2012-03-24 10:00 UTC (permalink / raw)
  To: Gene Heskett; +Cc: linux-kernel

2012/3/24 Gene Heskett <gene.heskett@gmail.com>:
> On Saturday, March 24, 2012, Con Kolivas wrote:
>>This is to announce the first stable release of the BFS CPU scheduler
>>for linux 3.3.0 designed for optimal interactivity, responsiveness and
>>throughput on commodity hardware.
>
> URL?

My bad.

This patch is:
http://ck.kolivas.org/patches/bfs/3.3.0/3.3-sched-bfs-420.patch

All BFS patches are here:
http://ck.kolivas.org/patches/bfs/
>
> I for one am happy to see this, Con.  I have been running an earlier patch
> as pclos applies it to 2.6.38.8, and I must say the desktop interactivity
> is very much improved over the non-bfs version.

Thanks for the feedback. Enjoy.

Regards,
Con

* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-24  9:53 ` Gene Heskett
  2012-03-24 10:00   ` Con Kolivas
@ 2012-03-25  2:05   ` Valdis.Kletnieks
  2012-03-25  2:33     ` Con Kolivas
                       ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Valdis.Kletnieks @ 2012-03-25  2:05 UTC (permalink / raw)
  To: Gene Heskett; +Cc: Con Kolivas, linux-kernel

On Sat, 24 Mar 2012 05:53:32 -0400, Gene Heskett said:

> I for one am happy to see this, Con.  I have been running an earlier patch
> as pclos applies it to 2.6.38.8, and I must say the desktop interactivity
> is very much improved over the non-bfs version.

I've always wondered what people are using to measure interactivity. Do we have
some hard numbers from scheduler traces, or is it a "feels faster"?  And if
it's a subjective thing, how are people avoiding confirmation bias (where you
decide it feels faster because it's the new kernel and *should* feel faster)?
Anybody doing blinded boots, where a random kernel old/new is booted and the
user grades the performance without knowing which one was actually running?

And yes, this can be a real issue - anybody who's been a sysadmin for
a while will have at least one story of scheduling an upgrade, scratching it
at the last minute, and then having users complain about how the upgrade
ruined performance and introduced bugs...
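
One way such a blinded comparison could be set up is sketched below in
Python. This is only an assumption about how one might do it, not
something anyone in this thread reports using. The labels "old" and "new",
the function name blinded_schedule and the trial count are placeholders,
and actually selecting the boot entry for each trial is left out.

```python
import random


def blinded_schedule(kernels, trials_per_kernel, seed=None):
    """Return a shuffled list of kernel labels, one entry per boot.

    The person grading interactivity sees only the trial number. The list
    itself is the key and stays hidden until all grades are recorded.
    """
    rng = random.Random(seed)
    schedule = [k for k in kernels for _ in range(trials_per_kernel)]
    rng.shuffle(schedule)
    return schedule


# Placeholder labels; a fixed seed lets a third party regenerate the key.
key = blinded_schedule(["old", "new"], trials_per_kernel=5, seed=1)
grades = {trial: None for trial in range(1, len(key) + 1)}  # filled in blind
```

Each kernel gets the same number of boots, so a difference in the grades
cannot come from one kernel simply being tried more often.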

* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-25  2:05   ` Valdis.Kletnieks
@ 2012-03-25  2:33     ` Con Kolivas
  2012-03-25 13:37     ` Mike Galbraith
  2012-03-28  5:12     ` Heinz Diehl
  2 siblings, 0 replies; 18+ messages in thread
From: Con Kolivas @ 2012-03-25  2:33 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Gene Heskett, linux-kernel

On 25 March 2012 13:05,  <Valdis.Kletnieks@vt.edu> wrote:
> On Sat, 24 Mar 2012 05:53:32 -0400, Gene Heskett said:
>
>> I for one am happy to see this, Con.  I have been running an earlier patch
>> as pclos applies it to 2.6.38.8, and I must say the desktop interactivity
>> is very much improved over the non-bfs version.
>
> I've always wondered what people are using to measure interactivity. Do we have
> some hard numbers from scheduler traces, or is it a "feels faster"?  And if
> it's a subjective thing, how are people avoiding confirmation bias (where you
> decide it feels faster because it's the new kernel and *should* feel faster)?
> Anybody doing blinded boots, where a random kernel old/new is booted and the
> user grades the performance without knowing which one was actually running?
>
> And yes, this can be a real issue - anybody who's been a sysadmin for
> a while will have at least one story of scheduling an upgrade, scratching it
> at the last minute, and then having users complain about how the upgrade
> ruined performance and introduced bugs...
>

I would say the vast majority of -ck/BFS users rely purely on
subjective feeling. On the other hand I have done numerous benchmarks
in the past trying to show the bound latencies of bfs are better than
mainline on regular workloads which is not surprising since BFS is
deterministic with respect to its latencies whereas mainline is not
(except on uniprocessor). I also documented interbench numbers showing
worst case latencies are bound better with BFS but since interbench is
a complicated benchmark that also displays fairness, most people don't
know how to read the values. Since I was never out to displace the
mainline scheduler but to demonstrate alternatives and provide a
standard for comparison I didn't bother with the benchmarks much
further than the occasional one I've posted. Since the main mailing
list seems distinctly disinterested in said results, I've only
published the throughput benchmarks as a kind of baseline regression
point to show that BFS' throughput is not significantly adversely
affected on the commodity hardware that people are using it on.

A comprehensive comparison of (an earlier) BFS with CFS and the old
O(1) scheduler, evaluating throughput and fairness, was in the
excellent thesis by Joseph T. Meehean entitled "Towards Transparent
CPU Scheduling":
http://research.cs.wisc.edu/wind/Publications/meehean-thesis11.html

A few of the latency benchmarks that still remain published on my site
can be found here:
http://ck.kolivas.org/patches/bfs/bfs404-cfs/
http://ck.kolivas.org/patches/bfs/2.6.35v2.6.35-ck1-interbench.log

Note how old they are. Not much has been done to repeat them since
then, but BFS' main design has not drastically changed in that time.
Some may be found on the old mailing list posts, but not a lot has
been documented with regards to this.

Some throughput benchmarks:
http://ck.kolivas.org/patches/bfs/benchmark3-results-for-announcement-20110410.txt

Current version:
http://s14.postimage.org/4gr5z8nxr/anova_x3360.png
http://postimage.org/image/wavusknl1/

Yes the results are from relatively simple benchmarks and limited in
scope. Yes there is hardly a decent benchmark for either interactivity
or responsiveness (interbench and contest were my attempt to benchmark
both of those).
Here's my very brief summary of the difference between interactivity
and responsiveness as I see it that I wrote many years ago:
http://ck.kolivas.org/readme.interactivity

Regards,
Con

* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-25  2:05   ` Valdis.Kletnieks
  2012-03-25  2:33     ` Con Kolivas
@ 2012-03-25 13:37     ` Mike Galbraith
  2012-03-26 22:30       ` Con Kolivas
       [not found]       ` <CABqErrGaBLisO4YK5dP2O9Pv0QonZ+q9G43jm=Nf12yWVG,<1332825236.7411.54.camel@marge.simpson.net>
  2012-03-28  5:12     ` Heinz Diehl
  2 siblings, 2 replies; 18+ messages in thread
From: Mike Galbraith @ 2012-03-25 13:37 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Gene Heskett, Con Kolivas, linux-kernel

On Sat, 2012-03-24 at 22:05 -0400, Valdis.Kletnieks@vt.edu wrote: 
> On Sat, 24 Mar 2012 05:53:32 -0400, Gene Heskett said:
> 
> > I for one am happy to see this, Con.  I have been running an earlier patch
> > as pclos applies it to 2.6.38.8, and I must say the desktop interactivity
> > is very much improved over the non-bfs version.
> 
> I've always wondered what people are using to measure interactivity. Do we have
> some hard numbers from scheduler traces, or is it a "feels faster"?  And if
> it's a subjective thing, how are people avoiding confirmation bias (where you
> decide it feels faster because it's the new kernel and *should* feel faster)?
> Anybody doing blinded boots, where a random kernel old/new is booted and the
> user grades the performance without knowing which one was actually running?
> 
> And yes, this can be a real issue - anybody who's been a sysadmin for
> a while will have at least one story of scheduling an upgrade, scratching it
> at the last minute, and then having users complain about how the upgrade
> ruined performance and introduced bugs...

Yeah.  In all the interactivity testing I've ever done, it's really hard
to not see what you expect and/or hope to see.  For normal desktop use,
I don't see any real difference with BFS vs CFS unless I load test of
course, and that can go either way, depending on the load.

Example:

3.3.0-bfs vs 3.3.0-cfs - identical config

Q6600 desktop box doing a measured interactivity test.

time mplayer BigBuckBunny-DivXPlusHD.mkv, with massive_intr 8 as competition

no bg load real    9m56.627s              1.000
CFS        real    9m59.199s              1.004
BFS        real    12m8.166s              1.220

As you can see, neither scheduler can run that perfectly on my box, as
the load needs a tad more than its fair share.  However, the Interactive
Experience was far better in CFS in this case due to it being more fair.
In BFS, the interactive tasks (mplayer/Xorg) could not get their fair
share, causing interactivity to measurably suffer.
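
For anyone wanting to reproduce the normalized column above, this is a
small Python sketch that converts the `time` output to seconds and divides
by the unloaded run. The helper name to_seconds is made up here; the three
durations are the ones quoted above.

```python
import re


def to_seconds(real: str) -> float:
    """Convert a `time`-style duration such as '9m56.627s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real)
    if m is None:
        raise ValueError(f"unrecognised duration: {real!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


baseline = to_seconds("9m56.627s")   # no background load
cfs = to_seconds("9m59.199s")
bfs = to_seconds("12m8.166s")

# Wall-clock time relative to the unloaded run (1.000 = no slowdown).
cfs_ratio = cfs / baseline
bfs_ratio = bfs / baseline
print(f"CFS {cfs_ratio:.3f}  BFS {bfs_ratio:.3f}")
```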

It could just as well flip in favor of the unfair scheduler with the
right load mix.  Is this a big desktop deal?  No.  Neither scheduler
totally sucks, both have weaknesses and strengths (contrary to hype).

CFS vs BFS fairness:

CFS
  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  P COMMAND
18598 root      20   0  8216  104    0 R     25  0.0   0:30.64 3 massive_intr
18597 root      20   0  8216  104    0 R     25  0.0   0:30.63 3 massive_intr
18600 root      20   0  3956  344  272 R     25  0.0   0:30.62 3 cpuhog
18599 root      20   0  8216  104    0 R     25  0.0   0:30.63 3 massive_intr

BFS
  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  P COMMAND
 7447 root       3   0  8216  104    0 R     27  0.0   0:31.20 3 massive_intr
 7448 root       5   0  8216  104    0 R     27  0.0   0:30.78 3 massive_intr
 7449 root       4   0  8216  104    0 R     26  0.0   0:30.65 3 massive_intr
 7446 root       7   0  3956  344  272 R     21  0.0   0:24.71 3 cpuhog

BFS is roughly fair, but demonstrably not as fair as CFS.  Is that a
strength or a weakness?  A: It depends.
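
One possible way to put a number on "roughly fair" is Jain's fairness
index. It is not a metric used anywhere in this thread; it is swapped in
here purely for illustration, and a single top snapshot is a thin data
source for it. The sketch below feeds it the %CPU columns from the two
snapshots above.

```python
def jain_index(shares):
    """Jain's fairness index: 1.0 for equal shares, 1/n if one task gets all."""
    total = sum(shares)
    return total * total / (len(shares) * sum(s * s for s in shares))


cfs_cpu = [25, 25, 25, 25]   # %CPU column of the CFS snapshot
bfs_cpu = [27, 27, 26, 21]   # %CPU column of the BFS snapshot

cfs_fairness = jain_index(cfs_cpu)
bfs_fairness = jain_index(bfs_cpu)
print(f"CFS {cfs_fairness:.4f}  BFS {bfs_fairness:.4f}")
```

Both values come out close to 1, with the BFS snapshot slightly lower
because cpuhog received a smaller share than the massive_intr threads.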

What about low latency?  A couple latency bound loads:

tbench 8
Q6600 desktop box
CFS Throughput 1159.6 MB/sec 8 procs      1.000
BFS Throughput 701.2 MB/sec 8 procs        .604 (L2 misses hurt like hell)

E5620 (x3550 M3)
CFS Throughput 1505.09 MB/sec 8 procs     1.000
BFS Throughput 1269.87 MB/sec 8 procs      .843 (less pain, can't miss L3 at least)

Nobody likes vmark, but it sends a pretty clear message too.

marge:/vmark2.5.0.9 # ./volanomark.sh && grep throughput *.log

CFS
test-1.log:Average throughput = 148507 messages per second
test-2.log:Average throughput = 150017 messages per second
test-3.log:Average throughput = 147072 messages per second

BFS
test-1.log:Average throughput = 74042 messages per second
test-2.log:Average throughput = 73520 messages per second
test-3.log:Average throughput = 73134 messages per second

(Imagine this localhost throughput is your desktop applications
jabbering back and forth)
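
The three runs per scheduler can be condensed into a single ratio. The
Python sketch below only averages the figures quoted above; the variable
names are made up.

```python
from statistics import mean

# Average throughput (messages per second) from the three logs of each run.
cfs_runs = [148507, 150017, 147072]
bfs_runs = [74042, 73520, 73134]

cfs_avg = mean(cfs_runs)
bfs_avg = mean(bfs_runs)

# BFS throughput relative to CFS (1.0 would mean parity).
ratio = bfs_avg / cfs_avg
print(f"CFS {cfs_avg:.0f}  BFS {bfs_avg:.0f}  BFS/CFS {ratio:.3f}")
```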

Right, BFS generally does have a tighter worst case, mostly because of
CFS's more accurate distribution.  OTOH, BFS pays a heavy price for being
single queue with zero load balancing overhead.  It has advantages, but
affinity problems result (not to mention scalability).

Let's see what lmbench has to say.

                 L M B E N C H  3 . 0   S U M M A R Y
                 ------------------------------------
		 (Alpha software, do not distribute)

Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz  tlb  cache  mem   scal
                                                     pages line   par   load
                                                           bytes  
--------- ------------- ----------------------- ---- ----- ----- ------ ----
marge         3.3.0-bfs        x86_64-linux-gnu 2401         128           1
marge         3.3.0-bfs        x86_64-linux-gnu 2401         128           1
marge         3.3.0-bfs        x86_64-linux-gnu 2401         128           1
marge         3.3.0-cfs        x86_64-linux-gnu 2401         128           1
marge         3.3.0-cfs        x86_64-linux-gnu 2401         128           1
marge         3.3.0-cfs        x86_64-linux-gnu 2401         128           1

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh  
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
marge         3.3.0-bfs 2401 0.12 0.16 1.32 1.93 2.99 0.23 1.22 191. 463. 1989
marge         3.3.0-bfs 2401 0.11 0.16 1.31 1.93 2.98 0.23 1.22 193. 463. 1991
marge         3.3.0-bfs 2401 0.11 0.17 1.31 1.93 3.02 0.23 1.23 192. 463. 1987
marge         3.3.0-cfs 2401 0.12 0.16 1.32 1.91 3.03 0.23 1.23 187. 458. 2237
marge         3.3.0-cfs 2401 0.11 0.16 1.29 1.89 3.04 0.23 1.23 185. 459. 2235
marge         3.3.0-cfs 2401 0.11 0.16 1.30 1.89 3.00 0.23 1.22 191. 455. 2227

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
marge         3.3.0-bfs 1.4900 2.3600 1.9000 2.6500 2.8000 2.71000 2.16000
marge         3.3.0-bfs 1.4600 2.8800 2.9100 2.7300 2.0800 2.75000 3.50000
marge         3.3.0-bfs 1.4400 2.6500 2.3000 2.6400 2.2700 2.69000 3.82000
marge         3.3.0-cfs 1.6900 1.6800 1.6900 2.3700 1.9100 2.37000 1.94000
marge         3.3.0-cfs 1.6500 1.7100 1.6800 2.3600 1.8400 2.37000 1.89000
marge         3.3.0-cfs 1.6800 1.7900 1.6900 2.4100 1.8800 2.38000 2.06000

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
marge         3.3.0-bfs 1.490 4.393 14.5  12.1  22.3  22.7  28.7  24.
marge         3.3.0-bfs 1.460 4.369 15.0  12.1  22.0  22.2  29.0  25.
marge         3.3.0-bfs 1.440 4.370 15.2  12.1  22.1  22.8  28.9  25.
marge         3.3.0-cfs 1.690 4.780 5.90  10.1  13.4  12.9  16.7  20.
marge         3.3.0-cfs 1.650 4.790 5.68  10.2  13.4  12.9  16.7  20.
marge         3.3.0-cfs 1.680 4.819 5.53  10.1  13.3  12.8  16.7  20.

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
marge         3.3.0-bfs                               775.0 0.447 0.96890 1.443
marge         3.3.0-bfs                               776.0 0.464 0.97250 1.441
marge         3.3.0-bfs                               783.0 0.461 0.97380 1.432
marge         3.3.0-cfs                               788.0 0.475 0.95950 1.441
marge         3.3.0-cfs                               774.0 0.473 0.96820 1.442
marge         3.3.0-cfs                               778.0 0.458 0.96040 1.432

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host                OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
marge         3.3.0-bfs 2275 2102 1310 2959.7 5199.2 1881.3 1848.7 4912 2347.
marge         3.3.0-bfs 2242 2105 1321 2964.8 5199.6 1895.9 1849.4 4896 2345.
marge         3.3.0-bfs 2269 2115 1302 2961.5 5197.2 1903.1 1851.2 4882 2337.
marge         3.3.0-cfs 2452 4956 2885 3000.8 5121.2 1929.8 1829.7 4843 2032.
marge         3.3.0-cfs 2443 4965 2807 3010.7 5204.9 1900.6 1851.2 4900 2350.
marge         3.3.0-cfs 2449 4987 2834 2959.5 5194.0 1900.7 1829.2 4832 2305.
make[1]: Leaving directory `/usr/local/tmp/lmbench3/results.smp'


* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-25 13:37     ` Mike Galbraith
@ 2012-03-26 22:30       ` Con Kolivas
  2012-03-27  5:13         ` Mike Galbraith
       [not found]       ` <CABqErrGaBLisO4YK5dP2O9Pv0QonZ+q9G43jm=Nf12yWVG,<1332825236.7411.54.camel@marge.simpson.net>
  1 sibling, 1 reply; 18+ messages in thread
From: Con Kolivas @ 2012-03-26 22:30 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Valdis.Kletnieks, Gene Heskett, linux-kernel

On 26 March 2012 00:37, Mike Galbraith <efault@gmx.de> wrote:
> Yeah.  In all the interactivity testing I've ever done, it's really hard
> to not see what you expect and/or hope to see.  For normal desktop use,
> I don't see any real difference with BFS vs CFS unless I load test of
> course, and that can go either way, depending on the load.
>
> Example:
>
> 3.3.0-bfs vs 3.3.0-cfs - identical config
>
> Q6600 desktop box doing a measured interactivity test.
>
> time mplayer BigBuckBunny-DivXPlusHD.mkv, with massive_intr 8 as competition
>
> no bg load real    9m56.627s              1.000
> CFS        real    9m59.199s              1.004
> BFS        real    12m8.166s              1.220
>
> As you can see, neither scheduler can run that perfectly on my box, as
> the load needs a tad more than its fair share.  However, the Interactive
> Experience was far better in CFS in this case due to it being more fair.
> In BFS, the interactive tasks (mplayer/Xorg) could not get their fair
> share, causing interactivity to measurably suffer.

massive_intr runs a number of threads that each run for 8ms and then
sleep for 1ms. That means they are 89% cpu bound. Run 8 of them and
your CPU load is 0.889 * 8 = 7.1. So now you're testing a difficult
mplayer benchmark in the presence of a load of 7.1 on a CPU with 4
cores. I don't know how much CPU the playback of your particular video
is but I suspect it does require a fair amount of CPU based on the CPU
it got back in your test. I can virtually guarantee that the amount of
CPU BFS is giving to mplayer is proportional to how much CPU is left.
Ergo as far as I can see, BFS is likely being absolutely perfectly
fair. This sort of fairness equation has already been elucidated in
the PhD thesis that I linked to in my original post. Its author has
done a much more thorough analysis than this kind of drive-by test
that you're doing and misinterpreting, and has already shown that BFS
is fair to a fault.
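
The load arithmetic above can be checked with a few lines of Python. The
8ms run and 1ms sleep figures, the 8 threads and the 4 cores are the ones
stated above; the helper name duty_cycle is made up, and CPU demand from
mplayer and Xorg is deliberately left out because it was not measured.

```python
def duty_cycle(run_ms: float, sleep_ms: float) -> float:
    """Fraction of time a thread wants the CPU when it runs, then sleeps."""
    return run_ms / (run_ms + sleep_ms)


per_thread = duty_cycle(8, 1)      # 8/9, about 0.889
threads = 8
cores = 4

total_load = threads * per_thread          # about 7.1 CPUs' worth of demand
load_per_core = total_load / cores         # about 1.8 runnable threads per core
print(f"per thread {per_thread:.3f}, total {total_load:.2f}, "
      f"per core {load_per_core:.2f}")
```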

snip the rest

'top' snapshots are uninteresting because CFS and BFS report cpu time
completely differently and a single snapshot tells us nothing.

Snip uninteresting-to-desktop-user throughput benchmarks.

* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-26 22:30       ` Con Kolivas
@ 2012-03-27  5:13         ` Mike Galbraith
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Galbraith @ 2012-03-27  5:13 UTC (permalink / raw)
  To: Con Kolivas; +Cc: Valdis.Kletnieks, Gene Heskett, linux-kernel

On Tue, 2012-03-27 at 09:30 +1100, Con Kolivas wrote: 
> On 26 March 2012 00:37, Mike Galbraith <efault@gmx.de> wrote:
> > Yeah.  In all the interactivity testing I've ever done, it's really hard
> > to not see what you expect and/or hope to see.  For normal desktop use,
> > I don't see any real difference with BFS vs CFS unless I load test of
> > course, and that can go either way, depending on the load.
> >
> > Example:
> >
> > 3.3.0-bfs vs 3.3.0-cfs - identical config
> >
> > Q6600 desktop box doing a measured interactivity test.
> >
> > time mplayer BigBuckBunny-DivXPlusHD.mkv, with massive_intr 8 as competition
> >
> > no bg load real    9m56.627s              1.000
> > CFS        real    9m59.199s              1.004
> > BFS        real    12m8.166s              1.220
> >
> > As you can see, neither scheduler can run that perfectly on my box, as
> > the load needs a tad more than its fair share.  However, the Interactive
> > Experience was far better in CFS in this case due to it being more fair.
> > In BFS, the interactive tasks (mplayer/Xorg) could not get their fair
> > share, causing interactivity to measurably suffer.
> 
> massive_intr runs a number of threads that each run for 8ms and then
> sleep for 1ms. That means they are 89% cpu bound. Run 8 of them and
> your CPU load is 0.889 * 8 = 7.1. So now you're testing a difficult
> mplayer benchmark in the presence of a load of 7.1 on a CPU with 4
> cores. I don't know how much CPU the playback of your particular video
> is but I suspect it does require a fair amount of CPU based on the CPU
> it got back in your test. I can virtually guarantee that the amount of
> CPU BFS is giving to mplayer is proportional to how much CPU is left.
> Ergo as far as I can see, BFS is likely being absolutely perfectly
> fair. This sort of fairness equation has already been elucidated in
> the PhD thesis that I linked to in my original post. Its author has
> done a much more thorough analysis than this kind of drive-by test
> that you're doing and misinterpreting, and has already shown that BFS
> is fair to a fault.
> 
> snip the rest
> 
> 'top' snapshots are uninteresting because CFS and BFS report cpu time
> completely differently and a single snapshot tells us nothing.
> 
> Snip uninteresting-to-desktop-user throughput benchmarks.

You accuse others of disinterest in their dirty underwear, and wave away
the wide skid marks in your own with a flick of the wrist.  Amazing.

-Mike


* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
@ 2012-03-27 20:17 Mike Blue
  2012-03-27 20:45 ` Pekka Enberg
  0 siblings, 1 reply; 18+ messages in thread
From: Mike Blue @ 2012-03-27 20:17 UTC (permalink / raw)
  To: linux-kernel


> You accuse others of disinterest in their dirty underwear, and wave away
> the wide skid marks in your own with a flick of the wrist. Amazing.

Really, Mike? That is the sum total of your response to well articulated,
data-supported arguments/constructive criticism to your own poorly designed
experiment? Amazing and shame on you.

* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-27 20:17 Mike Blue
@ 2012-03-27 20:45 ` Pekka Enberg
  0 siblings, 0 replies; 18+ messages in thread
From: Pekka Enberg @ 2012-03-27 20:45 UTC (permalink / raw)
  To: Mike Blue; +Cc: linux-kernel

On Tue, Mar 27, 2012 at 11:17 PM, Mike Blue <micahmaggie@hotmail.com> wrote:
> Really, Mike? That is the sum total of your response to well articulated,
> data-supported arguments/constructive criticism to your own poorly designed
> experiment? Amazing and shame on you.

...and so the crazies crawl out of the woodwork. Thanks a bunch.

* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
@ 2012-03-27 20:52 Micheal Blue
  0 siblings, 0 replies; 18+ messages in thread
From: Micheal Blue @ 2012-03-27 20:52 UTC (permalink / raw)
  To: linux-kernel



> ----- Original Message -----
> From: Mike Galbraith
> Sent: 03/27/12 01:13 AM
> To: Con Kolivas
> Subject: Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
> 
> 
> You accuse others of disinterest in their dirty underwear, and wave away
> the wide skid marks in your own with a flick of the wrist. Amazing.

That is the sum total of your response to well-articulated, data-supported arguments/constructive criticism to your own poorly designed experiment? Amazing.  

* RE: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
       [not found]         ` <SNT112-W24CBD78928F4DD6AFDA564A14A0@phx.gbl>
@ 2012-03-28  3:48           ` Mike Galbraith
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Galbraith @ 2012-03-28  3:48 UTC (permalink / raw)
  To: Mike Blue; +Cc: kernel, valdis.kletnieks, gene.heskett, linux-kernel

On Tue, 2012-03-27 at 16:23 -0400, Mike Blue wrote: 


> Amazing and shame on you.     
> 
Shrug, let the chips fall where they may.

-Mike 


* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-25  2:05   ` Valdis.Kletnieks
  2012-03-25  2:33     ` Con Kolivas
  2012-03-25 13:37     ` Mike Galbraith
@ 2012-03-28  5:12     ` Heinz Diehl
  2012-03-28 12:39       ` Nikos Chantziaras
  2 siblings, 1 reply; 18+ messages in thread
From: Heinz Diehl @ 2012-03-28  5:12 UTC (permalink / raw)
  To: linux-kernel

On 25.03.2012, Valdis.Kletnieks@vt.edu wrote: 

> I've always wondered what people are using to measure interactivity. Do we have
> some hard numbers from scheduler traces, or is it a "feels faster"?

I guess it's a "feels faster", because it's the only thing that
counts. Given that there is strong evidence that scheduler A is
"faster, more interactive", whatever... than scheduler B, but a
controlled trial shows a significantly better "feels faster"
experience using scheduler B, I'm quite sure that people would choose
scheduler B over A, and that's quite ok. It does what they expect it
to do, despite evidence which documents the opposite.

> And if it's a subjective thing, how are people avoiding confirmation bias (where you
> decide it feels faster because it's the new kernel and *should* feel faster)?

Confirmation bias is one thing, and it surely does exist. So it's up
to the users whether they want evidence, or whether it's enough that it
feels faster. I guess that evidence doesn't really matter to most
users as long as they have a positive experience.

> Anybody doing blinded boots, where a random kernel old/new is booted and the
> user grades the performance without knowing which one was actually running?

Hey, we could construct a randomized controlled trial on this :-)
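
As a rough illustration of the blinded-boot idea quoted above, here is a
minimal sketch in Python. Everything in it is an assumption made for
illustration: the helper names make_schedule and mean_rating_by_kernel,
the "old"/"new" labels and the rating scale are hypothetical, and how the
chosen kernel actually gets booted without the user seeing which one it
is has been left out entirely.

```python
import random


def make_schedule(kernels, rounds, seed):
    """Return a shuffled boot order in which each kernel appears `rounds` times.

    The schedule is kept by someone other than the person grading, so the
    grader does not know which kernel is running on a given boot.
    """
    rng = random.Random(seed)          # fixed seed so the key can be re-derived
    schedule = list(kernels) * rounds  # balanced: same number of boots each
    rng.shuffle(schedule)
    return schedule


def mean_rating_by_kernel(schedule, ratings):
    """Unblind the trial: average the grades given to each kernel."""
    grouped = {}
    for kernel, rating in zip(schedule, ratings):
        grouped.setdefault(kernel, []).append(rating)
    return {kernel: sum(vals) / len(vals) for kernel, vals in grouped.items()}


if __name__ == "__main__":
    # Hypothetical labels; the grader only ever sees "boot 1", "boot 2", ...
    hidden_key = make_schedule(["old", "new"], rounds=5, seed=2012)
    print(len(hidden_key), "boots scheduled")
```

The grades would only be matched against the hidden schedule after all
boots have been rated, which is what keeps the expectation that the new
kernel "should" feel faster out of the grading.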

> And yes, this can be a real issue - anybody who's been a sysadmin for
> a while will have at least one story of scheduling an upgrade, scratching it
> at the last minute, and then having users complain about how the upgrade
> ruined performance and introduced bugs...

Yep. Those who have to do "real work" will rather base it on evidence
than trust their own feelings.


* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-28  5:12     ` Heinz Diehl
@ 2012-03-28 12:39       ` Nikos Chantziaras
  2012-03-28 13:53         ` Heinz Diehl
  2012-03-28 16:44         ` Mike Galbraith
  0 siblings, 2 replies; 18+ messages in thread
From: Nikos Chantziaras @ 2012-03-28 12:39 UTC (permalink / raw)
  To: linux-kernel

On 28/03/12 08:12, Heinz Diehl wrote:
> On 25.03.2012, Valdis.Kletnieks@vt.edu wrote:
>
>> I've always wondered what people are using to measure interactivity. Do we have
>> some hard numbers from scheduler traces, or is it a "feels faster"?
>
> I guess it's a "feels faster", because it's the only thing that
> counts. Given that there is strong evidence that scheduler A is
> "faster, more interactive", whatever... than scheduler B, but a
> controlled trial shows a significantly better "feels faster"
> experience using scheduler B, I'm quite sure that people would choose
> scheduler B over A, and that's quite ok. It does what they expect it
> to do, despite evidence which documents the opposite.

CFS: ALSA XRUNs in JACK.
BFS: far fewer ALSA XRUNs in JACK



* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-28 12:39       ` Nikos Chantziaras
@ 2012-03-28 13:53         ` Heinz Diehl
  2012-03-28 15:28           ` Nikos Chantziaras
  2012-03-28 16:44         ` Mike Galbraith
  1 sibling, 1 reply; 18+ messages in thread
From: Heinz Diehl @ 2012-03-28 13:53 UTC (permalink / raw)
  To: linux-kernel

On 28.03.2012, Nikos Chantziaras wrote: 

> CFS: ALSA XRUNs in JACK.
> BFS: far fewer ALSA XRUNs in JACK

BFS runs on all of my machines, and I know why. But that's not the
point here. Why do people not accept and learn from each other? I'm
quite sure that both CFS and BFS have good and bad things, why not
take the best from both of them and improve it further?

Just my 5ø.


* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-28 13:53         ` Heinz Diehl
@ 2012-03-28 15:28           ` Nikos Chantziaras
  0 siblings, 0 replies; 18+ messages in thread
From: Nikos Chantziaras @ 2012-03-28 15:28 UTC (permalink / raw)
  To: linux-kernel

On 28/03/12 16:53, Heinz Diehl wrote:
> On 28.03.2012, Nikos Chantziaras wrote:
>
>> CFS: ALSA XRUNs in JACK.
>> BFS: far fewer ALSA XRUNs in JACK
>
> BFS runs on all of my machines, and I know why. But that's not the
> point here. Why do people not accept and learn from each other? I'm
> quite sure that both CFS and BFS have good and bad things, why not
> take the best from both of them and improve it further?

I totally agree.  What ticks me off is people who claim that using BFS 
means you must be schizophrenic, even though some of them posted 
numbers.  Even on servers, BFS helped people.  For example, a server 
running mainline was behaving badly until it was switched to BFS.  The 
difference was quite impressive:

http://ck-hack.blogspot.com/2011/08/phoronix-revisits-bfs.html

But still, many people decide to keep the "you're imagining it" 
attitude, as if they're preparing flame-bait.


* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
  2012-03-28 12:39       ` Nikos Chantziaras
  2012-03-28 13:53         ` Heinz Diehl
@ 2012-03-28 16:44         ` Mike Galbraith
  1 sibling, 0 replies; 18+ messages in thread
From: Mike Galbraith @ 2012-03-28 16:44 UTC (permalink / raw)
  To: Nikos Chantziaras; +Cc: linux-kernel

On Wed, 2012-03-28 at 15:39 +0300, Nikos Chantziaras wrote: 
> On 28/03/12 08:12, Heinz Diehl wrote:
> > On 25.03.2012, Valdis.Kletnieks@vt.edu wrote:
> >
> >> I've always wondered what people are using to measure interactivity. Do we have
> >> some hard numbers from scheduler traces, or is it a "feels faster"?
> >
> > I guess it's a "feels faster", because it's the only thing that
> > counts. Given that there is strong evidence that scheduler A is
> > "faster, more interactive", whatever... than scheduler B, but a
> > controlled trial shows a significantly better "feels faster"
> > experience using scheduler B, I'm quite sure that people would choose
> > scheduler B over A, and that's quite ok. It does what they expect it
> > to do, despite evidence which documents the opposite.
> 
> CFS: ALSA XRUNs in JACK.
> BFS: far fewer ALSA XRUNs in JACK

Something like that could be interesting to look into.  Do you have a
setup and recipe for inducing these xruns I can try?  I don't have any
audio problems of my own to fiddle with, but then the few apps I use
buffer a lot, so I wouldn't.

-Mike



* Re: [ANNOUNCE] BFS CPU scheduler version 0.420 AKA "Smoking" for linux kernel 3.3.0
       [not found]     ` <iJTsB-44L-1@gated-at.bofh.it>
@ 2012-03-28 19:39       ` Martin Rogge
  0 siblings, 0 replies; 18+ messages in thread
From: Martin Rogge @ 2012-03-28 19:39 UTC (permalink / raw)
  To: Linux Kernel Mailing List

On 03/28/2012 07:20 AM, Heinz Diehl wrote:
> On 25.03.2012, Valdis.Kletnieks@vt.edu wrote:
>
>> I've always wondered what people are using to measure interactivity. Do we have
>> some hard numbers from scheduler traces, or is it a "feels faster"?
>
> I guess it's a "feels faster", because it's the only thing that
> counts.

I think it's inherently difficult to find an objective measure for the 
"felt" smoothness and interactivity of a desktop machine under load. 
The measuring process probably needs to reside outside the system under 
test and interpret the HDMI output. And control the mouse input.

On a side note, I maintain a little http server and found during testing 
that different servers respond differently to different schedulers. 
(Note this is about throughput, not interactivity, since this is about 
servers and throughput is so easy to measure.) The following is an 
unedited quote from the README:

"As an example look at the
following result reading 1000000 static files at concurrency level 100:

Kernel       Server             Keep-     Requests       Rate     Max.
Version                         Alive     per sec.                [ms]
----------------------------------------------------------------------
3.1.4        Apache/2.2.21       yes        25427        33388     265
3.1.4        lighttpd/1.4.29     yes        54741        68892       6
3.1.4        MrHTTPD/2.2.0       yes        88898       100888       5
3.1.4-ck2    Apache/2.2.21       yes        43760        57462      17
3.1.4-ck2    lighttpd/1.4.29     yes        49227        61652       6
3.1.4-ck2    MrHTTPD/2.2.0       yes       163158       185150       2

The throughput offered by MrHTTPD is three to four times higher than
Apache and Lighttpd. Result sets at higher concurrency continue this
trend, and at insane concurrency levels like 10000 or 20000 MrHTTPD is
the only responsive server remaining.

The other stunning fact proven by these figures is the significant
impact of the BFS CPU scheduler and other optimizations inherent in the
CK patch, leading to almost twice the throughput compared to the
vanilla kernel (which was btw configured to use the CFQ scheduler).
This is even more surprising as the design goal of the CK patch is
desktop interactivity rather than server throughput.

It is also noticeable that a fully threaded design like MrHTTPD benefits
particularly well from the CK patch, whereas the event-driven design of
Lighttpd often seems to perform slightly better under the vanilla
kernel."
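
For readers who want to check the ratios behind the README's claims, the
"Requests per sec." column of the table above can be compared directly.
The following is a small sketch, not part of the README; the dictionary
layout and the function names ck_speedup and relative_to are made up for
illustration, and only the numbers are taken from the table.

```python
# "Requests per sec." values copied from the table above,
# keyed by (kernel version, server name).
requests_per_sec = {
    ("3.1.4", "Apache"): 25427,
    ("3.1.4", "lighttpd"): 54741,
    ("3.1.4", "MrHTTPD"): 88898,
    ("3.1.4-ck2", "Apache"): 43760,
    ("3.1.4-ck2", "lighttpd"): 49227,
    ("3.1.4-ck2", "MrHTTPD"): 163158,
}


def ck_speedup(server):
    """Throughput under 3.1.4-ck2 divided by throughput under vanilla 3.1.4."""
    return requests_per_sec[("3.1.4-ck2", server)] / requests_per_sec[("3.1.4", server)]


def relative_to(server, baseline, kernel):
    """Throughput of `server` divided by that of `baseline` on the same kernel."""
    return requests_per_sec[(kernel, server)] / requests_per_sec[(kernel, baseline)]


if __name__ == "__main__":
    for server in ("Apache", "lighttpd", "MrHTTPD"):
        print(f"{server:10s} ck2/vanilla = {ck_speedup(server):.2f}")
    for kernel in ("3.1.4", "3.1.4-ck2"):
        for baseline in ("Apache", "lighttpd"):
            ratio = relative_to("MrHTTPD", baseline, kernel)
            print(f"{kernel:10s} MrHTTPD/{baseline} = {ratio:.2f}")
```

On these figures the -ck2 kernel gives roughly 1.7x for Apache and 1.8x
for MrHTTPD, while lighttpd drops to about 0.9x of its vanilla result,
which matches the README's remark about the event-driven design.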

