Re: [BENCHMARK] Corrected gcc3.2 v gcc2.95.3 contest results

From: jw schultz <jw@pegasys.ws>
To: linux-kernel@vger.kernel.org
Subject: Re: [BENCHMARK] Corrected gcc3.2 v gcc2.95.3 contest results
Date: Mon, 23 Sep 2002 04:03:55 -0700
Message-ID: <20020923110355.GF10841@pegasys.ws>
In-Reply-To: <1032777021.3d8eed3d55f53@kolivas.net>

On Mon, Sep 23, 2002 at 08:30:21PM +1000, Con Kolivas wrote:
> Quoting Ingo Molnar <mingo@elte.hu>:
> 
> > On Mon, 23 Sep 2002, Con Kolivas wrote:
> > 
> > > IO Full Load:
> > > 2.5.38                  170.21          42%
> > > 2.5.38-gcc32            230.77          30%
> > 
> > how many times are you running each test? You should run them at least
> > twice (ideally 3 times at least), to establish some sort of statistical
> > noise measure. Especially IO benchmarks tend to fluctuate very heavily
> > depending on various things - they are also very dependent on the initial
> > state - ie. how the pagecache happens to lay out, etc. Ie. a meaningful
> > measurement result would be something like:
> 
> Yes you make a very valid point and something I've been stewing over privately
> for some time. contest runs benchmarks in a fixed order with a "priming" compile
> to try and get pagecaches etc back to some sort of baseline (I've been trying
> hard to make the results accurate and repeatable). 
> 
> Despite that, you're correct in assuming the IO load will fluctuate widely. My
> initial tests show that noload and process_load (not surprisingly) vary very
> little. Mem_load varies a little. IO Loads can vary wildly, and the worse the
> average performance is, the greater the variation (I mean percentage variation
> not just absolute).
> 
> >  IO Full Load:
> >  2.5.38                  170.21 +- 55.21 sec        42%
> >  2.5.38-gcc32            230.77 +- 60.22 sec        30%
> > 
> > where the first column is the average of two measurements, the second
> > column is the delta of the two measurements divided by 2. This way we can
> > see the 'spread' of the results.
> 
> I'll create some results based on 3 runs soon. 
> 
> > I simply cannot believe that gcc32 can produce any visible effect in any
> > of the IO benchmarks, the only explanation would be heavy fluctuation of
> > IO results.
> 
> Agreed. There probably is no statistically significant difference in the
> different gcc versions.
> 
> Contest is very new and I appreciate any feedback I can get to make it as
> worthwhile a benchmark as possible to those who know.

What happened to the relative improvement (ratio against
baseline)?  In this test it didn't matter much because the
baselines were almost identical, but in other recent results,
especially between different platforms, it would have helped.
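
To make concrete what I mean by the ratio, here is a minimal
sketch.  It assumes the ratio is simply the time under load
divided by the no-load time for the same kernel.  The
function name and the 70.0 sec baseline are made up for
illustration; this message does not give the no-load times.
Only the two IO Full Load times come from Con's table.

```python
def ratio_against_baseline(load_time, baseline_time):
    """Return how many times longer the run took under load than with no load."""
    if baseline_time <= 0:
        raise ValueError("baseline_time must be positive")
    return load_time / baseline_time


# Hypothetical no-load time in seconds; NOT a measured value.
NOLOAD_BASELINE = 70.0

# IO Full Load times quoted in this thread.
io_full_load = {"2.5.38": 170.21, "2.5.38-gcc32": 230.77}

for kernel, seconds in io_full_load.items():
    ratio = ratio_against_baseline(seconds, NOLOAD_BASELINE)
    print(f"{kernel:<16} {seconds:8.2f} sec   ratio {ratio:.2f}")
```

With a per-platform baseline in the denominator, results
from machines of different speeds become comparable.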

Perhaps someone who is a statistician could give Con a hand?
This looks like a good test but Ingo is right.  We need
p-values and/or confidence intervals, and enough runs to
reach at least 90% confidence if possible.  I only know
enough to look at them when reported and say (non)random or
(in)significant correlation.  I couldn't begin to calculate
them.  Of course we don't want the measured data smothered in
analytical data, just enough to see if the numbers
are meaningful.
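
For whoever picks this up, a rough sketch of the sort of
summary I have in mind.  It reports the mean, the spread as
Ingo described it (delta of the extremes divided by 2), and
the half-width of a 90% confidence interval from Student's
t.  The function name is made up, the critical values are
the usual two-sided table entries, and the two run times are
back-calculated from Ingo's illustrative "170.21 +- 55.21",
so they are not real measurements.  It also assumes the runs
are independent and roughly normal, which IO loads may not be.

```python
import math
import statistics

# Two-sided 90% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_90 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132}


def summarize(times):
    """Return (mean, half_range, ci90_half_width) for a list of run times."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = statistics.mean(times)
    # Ingo's spread: difference between the extremes, divided by 2.
    half_range = (max(times) - min(times)) / 2
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(times) / math.sqrt(n)
    t_crit = T_CRIT_90.get(n - 1)
    # None when the table above has no entry for this many runs.
    ci90 = None if t_crit is None else t_crit * sem
    return mean, half_range, ci90


# Illustrative pair of runs whose mean is 170.21 and half-range is 55.21.
runs = [115.00, 225.42]
mean, half_range, ci90 = summarize(runs)
print(f"mean {mean:.2f}  +- {half_range:.2f} (half range)  90% CI +- {ci90:.2f}")
```

With only two runs the 90% interval comes out several times
wider than the half range, which is the argument for doing
more runs.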

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw@pegasys.ws

		Remember Cernan and Schmitt
