public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Gregory Haskins" <ghaskins@novell.com>
To: "Ingo Molnar" <mingo@elte.hu>
Cc: <rostedt@goodmis.org>, <linux-kernel@vger.kernel.org>,
	<linux-rt-users@vger.kernel.org>
Subject: Re: [PATCH 3/3] Subject: SCHED - Use a 2-d bitmap for searching lowest-pri CPU
Date: Wed, 05 Dec 2007 05:19:52 -0500	[thread overview]
Message-ID: <475634F8.BA47.005A.0@novell.com> (raw)
In-Reply-To: <20071205093412.GA17911@elte.hu>

>>> On Wed, Dec 5, 2007 at  4:34 AM, in message <20071205093412.GA17911@elte.hu>,
Ingo Molnar <mingo@elte.hu> wrote: 

> * Gregory Haskins <ghaskins@novell.com> wrote:
> 
>> The current code use a linear algorithm which causes scaling issues on 
>> larger SMP machines.  This patch replaces that algorithm with a 
>> 2-dimensional bitmap to reduce latencies in the wake-up path.
> 
> hm, what kind of scaling issues

well, the linear algorithm scales with the number of online-cpus, so as you add more cpus the latencies increase, thus decreasing determinism.

> - do you have any numbers?  What workload did you measure and on what hardware?
> 

The vast majority of my own work was on 4-way and 8-way systems with -rt which I can dig up and share with you if you like.  Generally speaking, I would present various loads (make -j 128 being the defacto standard) and simultaneously measure RT latency performance with tools such as cyclictest and/or preempt-test.  Typically, I would let these loads/measurements run overnight to really seek out the worst-case maximums.

As a synopsis, the 2-d alg helps even on the 4-way, but even more so on the 8-way.  I expect that trend to continue as we add more cpus.  The general numbers I recall from memory are that the 2-d alg reduces max latencies by about 25-40% on the 8-way.

However, that said, Steven's testing work on the mainline port of our series sums it up very nicely, so I will present that in lieu of digging up my -rt numbers unless you specifically want them too.  Here they are:


http://people.redhat.com/srostedt/rt-benchmarks/

As I believe you are aware, these numbers were generated on a 64-way SMP beast.  To break down what we are looking at, lets start with the first row of graphs (under Kernel and System Information).  What we see here are scatterplots of wake-latencies generated from Steven's "rt-migrate" test.  We start with the vanilla "rc3" kernel, and make our way through "sdr" (the first 7-8 patches from v7), "gh" (patches 8-20), and finally "cpupri" (patches 21-23).  (Ignore the "noavoid" runs, that was an experiment that Steven and I agree was a flop so its dropped from the series). 

The test has a default run-interval of 20ms, which is employed here in this test.  What we expect to see is that the red points (high-priority) should be allowed to start immediately (close to 0 latency), whereas green points (low-priority) may sometimes happen to start first, or may start after the 20ms interval depending on luck of the draw.

You can see that "rc3" has issues keeping the red-points near zero, which is part of the problem we were addressing in this series.

http://people.redhat.com/srostedt/rt-benchmarks/imgs/results-rt-migrate-test-2.6.24-rc3.out.png

Rolling in "sdr" we fix the wandering/20ms red-points, but you still see a degree of jitter in the red-points staying close to zero.

http://people.redhat.com/srostedt/rt-benchmarks/imgs/results-rt-migrate-test-2.6.24-rc3-sdr.out-sm.png

This is "ok".  The code technically passes the test and does the right thing from a balancing perspective.  The remainder of the series is performance related.

We then apply "gh" which adds support for considering things like NUMA/MC distance and we can see a higher degree of determinism, particularly on the red-points.  

http://people.redhat.com/srostedt/rt-benchmarks/imgs/results-rt-migrate-test-2.6.24-rc3-gh.out-sm.png

Rolling in "cpupri" we substitute the linear search for the 2-d bitmap optimization, are we observe a further increase in determinism:

http://people.redhat.com/srostedt/rt-benchmarks/imgs/results-rt-migrate-test-2.6.24-rc3-gh-cpupri.out-sm.png

The numbers presented in these scatterplots can be summarized in the following bar-graph for easier comparison:

http://people.redhat.com/srostedt/rt-benchmarks/imgs/results-rt-migrate-hist-sm.png

You can clearly see the latencies dropping as we move through the 23 patches.  This is similar to my findings under -rt with the 4/8 ways that I mentioned earlier, but I can find those numbers for you next if you feel it is necessary.

I hope this helps to show the difference.  Thanks for hanging in there on this long mail!

Regards,
-Greg







  reply	other threads:[~2007-12-05 10:25 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-04 20:44 [PATCH 00/23] RT balance v7 Gregory Haskins
2007-12-04 20:44 ` [PATCH 01/23] Subject: SCHED - Add rt_nr_running accounting Gregory Haskins
2007-12-04 20:44 ` [PATCH 02/23] Subject: SCHED - track highest prio queued on runqueue Gregory Haskins
2007-12-04 20:44 ` [PATCH 03/23] Subject: SCHED - push RT tasks Gregory Haskins
2007-12-04 20:44 ` [PATCH 04/23] Subject: SCHED - RT overloaded runqueues accounting Gregory Haskins
2007-12-04 20:44 ` [PATCH 05/23] Subject: SCHED - pull RT tasks Gregory Haskins
2007-12-04 20:44 ` [PATCH 06/23] Subject: SCHED - wake up balance RT Gregory Haskins
2007-12-04 20:45 ` [PATCH 07/23] Subject: SCHED - disable CFS RT load balancing Gregory Haskins
2007-12-04 20:45 ` [PATCH 08/23] Subject: SCHED - Cache cpus_allowed weight for optimizing migration Gregory Haskins
2007-12-04 20:45 ` [PATCH 09/23] Subject: SCHED - Consistency cleanup for this_rq usage Gregory Haskins
2007-12-04 20:45 ` [PATCH 10/23] Subject: SCHED - Remove some CFS specific code from the wakeup path of RT tasks Gregory Haskins
2007-12-04 20:45 ` [PATCH 11/23] Subject: SCHED - Break out the search function Gregory Haskins
2007-12-04 20:45 ` [PATCH 12/23] Subject: SCHED - Allow current_cpu to be included in search Gregory Haskins
2007-12-04 20:45 ` [PATCH 13/23] Subject: SCHED - Pre-route RT tasks on wakeup Gregory Haskins
2007-12-04 20:45 ` [PATCH 14/23] Subject: SCHED - Optimize our cpu selection based on topology Gregory Haskins
2007-12-04 20:45 ` [PATCH 15/23] Subject: SCHED - Optimize rebalancing Gregory Haskins
2007-12-04 20:45 ` [PATCH 16/23] Subject: SCHED - Avoid overload Gregory Haskins
2007-12-04 20:45 ` [PATCH 17/23] Subject: SCHED - restore the migratable conditional Gregory Haskins
2007-12-04 20:45 ` [PATCH 18/23] Subject: SCHED - Optimize cpu search with hamming weight Gregory Haskins
2007-12-04 20:46 ` [PATCH 19/23] Subject: SCHED - Optimize out cpu_clears Gregory Haskins
2007-12-04 20:46 ` [PATCH 20/23] Subject: SCHED - balance RT tasks no new wake up Gregory Haskins
2007-12-04 20:46 ` [PATCH 21/23] Subject: SCHED - Add sched-domain roots Gregory Haskins
2007-12-04 20:46 ` [PATCH 22/23] Subject: SCHED - Only balance our RT tasks within our root-domain Gregory Haskins
2007-12-04 20:46 ` [PATCH 23/23] Subject: SCHED - Use a 2-d bitmap for searching lowest-pri CPU Gregory Haskins
2007-12-04 21:27 ` [PATCH 00/23] RT balance v7 Ingo Molnar
2007-12-04 21:35   ` Gregory Haskins
2007-12-05  2:55   ` [PATCH 0/3] RT balance v7a Gregory Haskins
2007-12-05  2:55     ` [PATCH 1/3] Subject: SCHED - Add sched-domain roots Gregory Haskins
2007-12-05  2:55     ` [PATCH 2/3] Subject: SCHED - Only balance our RT tasks within our root-domain Gregory Haskins
2007-12-05  2:55     ` [PATCH 3/3] Subject: SCHED - Use a 2-d bitmap for searching lowest-pri CPU Gregory Haskins
2007-12-05  9:34       ` Ingo Molnar
2007-12-05 10:19         ` Gregory Haskins [this message]
2007-12-05 11:44           ` Ingo Molnar
2007-12-05 13:41             ` Gregory Haskins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=475634F8.BA47.005A.0@novell.com \
    --to=ghaskins@novell.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox