From: William Lee Irwin III <wli@holomorphy.com>
To: Anton Blanchard <anton@samba.org>
Cc: Andrew Morton <akpm@osdl.org>,
	Jesse Barnes <jbarnes@engr.sgi.com>,
	linux-kernel@vger.kernel.org
Subject: Re: 2.6.8.1-mm3
Date: Sat, 21 Aug 2004 08:22:18 -0700
Message-ID: <20040821152218.GZ11200@holomorphy.com>
In-Reply-To: <20040821000343.GR11200@holomorphy.com>

At some point in the past, I wrote:
>>> Parallel compilation is an extremely poor benchmark in general, as the
>>> workload is incapable of being effectively scaled to system sizes, the
>>> linking phase is inherently unparallelizable and the compilation phase
>>> too parallelizable to actually stress anything. The benchmark also has
>>> precisely zero relevance to anything real users would do.

The load is too parallelizable in that literally the only shared data
accessed are the directories and the structures related to process
creation. By and large the dcache is useless to beat on, as it has long
been known what needs to happen there: the hashtable has to die in
favor of a data structure that interoperates properly with lockless
synchronization and has some remote hint of cache locality. Real
workloads run by real users perform nontrivial communication between
processes, not just wait4(); they access more devices than merely disk;
and furthermore, they don't fork() and exit() all day in preference to
doing real work.


On Fri, Aug 20, 2004 at 05:03:43PM -0700, William Lee Irwin III wrote:
> Kernel hacking is not an end in itself, regardless of the fact there
> are some, such as myself, who use computers for no other purpose. A
> real user generally has some purpose to their activity beyond working
> on the software or hardware they are "using". e.g. various real users
> use their systems for entertainment: playing games, music, and movies.
> Others may use their systems to make money somehow, e.g. archiving
> information about customers so they can look up what they've bought
> and paid for or have yet to pay for.
> Regardless of the social issue, the rather serious technical deficits
> of compilation of any software as a benchmark are showstopping issues.
> Frankly, even the issues I've dredged up are nowhere near comprehensive.
> There are further issues: no "kernel compile benchmarking suite"
> includes stable versions (i.e. not varying across the benchmarks being
> done on various systems at various times) of the software being
> compiled or of the toolchain used to compile it, and worse still, the
> variance in the toolchain's target architecture also defeats any
> attempt at meaningful benchmarking.

To be useful at all, benchmarks also have to be usable for comparing
different machines. For instance, to evaluate the scalability of the
kernel to different sizes of machines, different machines must be
comparable. Likewise, to evaluate how well utilizing a particular
hardware feature of an architecture improves kernel performance
relative to an architecture without the hardware feature, different
machines must be compared. The relative performance of the machines
before the kernel feature is utilized must be compared to the relative
performance of the machines after the feature is utilized.

Proper benchmarks are furthermore explicitly used to evaluate hardware.
Results posted by individual users from their own userspace, frozen at
gcc versions of their choice, are worthless for this.

If there were to be an attempt at a proper kernel and system benchmark
using kernel compiles, which is unlikely ever to happen, one would do
the following:

(a) Bundle a toolchain and all supporting userspace required for it
	into the benchmark so that the gcc, make, etc. are identical
	for all users of the benchmark.
(b) Run O(num_cpus_online()) kernel compiles in parallel instead of
	a single make -j so the workload can be sized appropriately
	for the system. The fact is that 32x+ systems can't even be
	kept loaded by the "benchmark" because there is just not enough
	work to distribute, so this has to be done.
(c) Measure throughput in terms of kernel compiles per minute, and
	explicitly measure the variance during the runs.
(d) This still doesn't fix the fact that there are no nontrivial
	shared resources amongst the processes. It still doesn't benchmark
	anything useful, as it's not modeled on any real end user workload
	and not targeted at any specific kernel functionality. It will
	merely produce self-consistent results with these fixes.
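
A minimal sketch of how (b) and (c) might be scripted, not an existing
tool. The function names are hypothetical, and each command is assumed
to build in its own separate tree, e.g. one "make -C <tree>" argv list
per online CPU. It times a batch of concurrent builds and then reports
throughput in compiles per minute together with the sample variance
across repeated runs.

```python
import statistics
import subprocess
import time


def timed_parallel_run(cmds):
    """Start every command in cmds at once; return wall-clock seconds until all exit.

    cmds is a list of argv lists, one per concurrent build (assumed to
    use separate build trees so the copies do not interfere).
    """
    start = time.perf_counter()
    procs = [subprocess.Popen(argv) for argv in cmds]
    codes = [p.wait() for p in procs]
    elapsed = time.perf_counter() - start
    if any(codes):
        raise RuntimeError("at least one build failed: %r" % codes)
    return elapsed


def summarize_runs(durations_s, compiles_per_run):
    """Return (mean compiles per minute, sample variance of that rate)."""
    if len(durations_s) < 2:
        raise ValueError("need at least two runs to estimate variance")
    rates = [compiles_per_run * 60.0 / d for d in durations_s]
    return statistics.mean(rates), statistics.variance(rates)
```

Running timed_parallel_run() several times with the same list of
commands and passing the resulting durations to summarize_runs() gives
the two numbers (c) asks for. As (d) says, this only makes the results
self-consistent; it does not make the workload meaningful.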

i.e. the methodology now used for "kernel compile benchmarks" is poor
and all of the "results" obtained from it are highly questionable and
should be ignored.


On Fri, Aug 20, 2004 at 05:03:43PM -0700, William Lee Irwin III wrote:
> If you're truly concerned about compilation speed, userspace is going
> to be the most productive area to work on anyway, as the vast majority
> of time during compilation is spent in userspace. AIUI the userspace
> algorithms in gcc are not particularly cognizant of cache locality and
> in various instances have suboptimal time and space behavior, so it's
> not as if there isn't work to be done there. Improving the compactness
> and cache locality of data structures is important in userspace also,
> and most (perhaps all) userspace programs are grossly ignorant of this.
> FWIW, there are notable kernel hackers known to use very downrev gcc
> versions due to regressions in compilation speed in subsequent versions,
> so there are already large known differences in compilation speed that
> can be obtained just by choosing a different compiler version.

The point here is what to do if you are literally trying to improve
compilation speed.

If you are trying to benchmark the kernel, you should do a vaguely
realistic simulation of something a user might do to stress the kernel
or a real microbenchmark instead of repeating mistakes with poor
methodology. The results of kernel compiles are so bad they have to be
thrown away after every post, as the baselines of the relative results
are effectively untraceable. The benchmark is also needlessly complex:
it does too many different things at once, and to no useful effect, as
it's not a meaningful macrobenchmark either, so its results are even
misleading. As it has been used, it is logically impossible for it to
have properly motivated any improvement of the kernel, or ever to do so
in the future.

If you care about fork(), then use a fork() microbenchmark; every
meaningful improvement of fork() has been measured by such.
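
For illustration only, a fork() microbenchmark can be as small as the
sketch below. The function name is hypothetical, and because it is
written in Python the interpreter overhead is included in every
iteration, so it shows the shape of the measurement rather than giving
numbers comparable to a C implementation.

```python
import os
import time


def fork_exit_wait_per_second(iterations=1000):
    """Measure fork() + _exit() + waitpid() round trips per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately, skipping atexit handlers and flushes.
            os._exit(0)
        # Parent: reap the child before forking again.
        os.waitpid(pid, 0)
    elapsed = time.perf_counter() - start
    return iterations / elapsed
```

Unlike a kernel compile, this exercises one code path in isolation, so
a change in the reported rate can be attributed to process creation and
teardown rather than to the compiler or the filesystem.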

If you care about the parallelism of the vfs, then use a vfs
microbenchmark; every meaningful improvement of the vfs has been
measured by such.
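
As a rough sketch of what such a vfs microbenchmark could look like,
the hypothetical function below forks several workers that all stat()
the same path, so every lookup walks the same directory entries
concurrently. The worker count, iteration count, and use of a temporary
directory are assumptions made for the example; fork and exit overhead
is included in the timing.

```python
import os
import tempfile
import time


def parallel_stat_rate(workers=4, count=20000):
    """Return aggregate stat() calls per second across forked workers."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "target")
        open(path, "w").close()
        start = time.perf_counter()
        pids = []
        for _ in range(workers):
            pid = os.fork()
            if pid == 0:
                status = 1
                try:
                    for _ in range(count):
                        os.stat(path)
                    status = 0
                finally:
                    # Child never returns into the parent's code.
                    os._exit(status)
            pids.append(pid)
        for pid in pids:
            _, wait_status = os.waitpid(pid, 0)
            if wait_status != 0:
                raise RuntimeError("a stat() worker failed")
        elapsed = time.perf_counter() - start
    return workers * count / elapsed
```

Comparing the rate at 1 worker against the rate at N workers gives a
direct view of how path lookup scales, which a kernel compile cannot
separate from everything else it does.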

Kernel compiles should not be used as benchmarks as they are now, and
their results should not be taken into consideration as performance
metrics. I highly encourage those concerned about performance to use
other benchmarks, e.g. reaim and similar multiuser simulations that
measure interactive response, or another properly-constructed
benchmark targeted at their performance concerns.
I encourage readers of kernel compile performance results to disregard them.

The only real point of interest regarding this kind of affair on
supercomputers is as a stress test to verify that the kernel doesn't
livelock or deadlock due to the extreme performance characteristics
of such large systems, and I encourage other readers likewise to limit
their interest and involvement in this thread to stress testing results.


-- wli
