From: Hans Reiser <reiser@namesys.com>
To: Edward Shishkin <edward@namesys.com>
Cc: PFC <lists@peufeu.com>,
	jpiszcz@lucidpixels.com, reiserfs-list@namesys.com,
	Alexander Zarochentcev <zam@namesys.com>
Subject: Re: Linux Gazette benchmark Reiser 4
Date: Mon, 09 Jan 2006 23:57:36 -0800
Message-ID: <43C368F0.7020202@namesys.com>
In-Reply-To: <43C18D32.8020106@namesys.com>

Did we really do the sophisticated statistical analysis below?  I
assumed we had just taken a look at how much our numbers tended to vary,
and based on experience assumed anything less than 2% was not above
noise.;-) 

The other rule of thumb I have is that really short times can be
amazingly unreliable indicators even when reproducible.  I am not
entirely sure why, but I know it to be true.;-)  People suggest such
things as timer inaccuracy, but perhaps there is more inaccuracy than
that could explain.  Perhaps it is related to scheduler timing?  I don't
know why it is, I just know it is so.

I do like the way Zam did the red/green/black numbers, by the way.  I
think I forgot to compliment him on it (it was Zam who did it?).

We need to reproduce Justin's benchmark, fixing the mistakes he made in
its design, and then see how we do at it.  We need to know such things
as how he generated filenames, etc.  When people get back....

Hans

Edward Shishkin wrote:

> Hans Reiser wrote:
>
>> PFC wrote:
>>
>>> Hehe. Wow. Sure, a benchmark that runs in 0.03 seconds for the
>>> fastest one and 0.07 seconds for the slowest one looks pretty
>>> reliable to me. How much time does it take to spawn the "touch"
>>> process 10k times? Hm... I'd guess most of the benchmark time?
>>
>
> Let's consider this important aspect of benchmarking more carefully.
> There is an interesting question: how large must a difference be
> before we can say that some fs really wins on a given statistic? Is
> there any guarantee you won't get, say, 0.05 and 0.02 on the next run?
> Sorry, but I didn't find any answer in Justin's notes. NOTE5 (Tests
> Performed) says that questionable tests were re-run, but it seems we
> need something more like research here rather than a re-run.
>
> Below are some comments on how this problem is resolved (1*) in the
> mongo benchmark. Look, for example, at this table:
> http://www.namesys.com/benchmarks.html#mongo.2.6.11
> Fractions like 0.982 (D/A) and 1.017 (C/A) are shown in black, which
> means that we _cannot_ make any assumption about the winner because
> |1 - X/A| < 0.02. Where does the magic M = 0.02 come from?
> Let's run the same phase with the same settings (file system, file
> set, etc.) 10 times. For the same statistic X we will obtain a set of
> different (because of errors) values x1, x2, ..., x10. Suppose that
> X has a normal distribution (any objections?). It means that we can
> calculate its trusted (confidence) interval for a single measurement
> (2*) as [X - d(P), X + d(P)], where d(P) = D*U(P), D is the dispersion
> (the standard deviation, see (5*)), and U(P) is found from the
> standard table for any chosen value of the trusted probability P (3*).
> Now we have the following simple criterion (4*):
>
> |A - X| >= 2d(P), i.e. |1 - X/A| >= 2D*U(P)/A
>
>             |<-d->|    |<-d->|
> ------<-----|----->----<-----|----->------
>             A                X
>
> The magic M = 0.02 for the mongo benchmark was calculated as
> 2D*U(P)/A for the trusted probability P=0.85 (5*).
> Now it is clear from the formula above why statistics shouldn't be
> too small: the criterion is no longer satisfied. I am sure (and it
> is easy to check) that 2d(P=0.85) is much more than |0.07 - 0.03|, as
> in the case of find 10000 files. By the way, some settings that
> give small values (~5 sec) of the mongo STATS statistics also
> make this criterion false.
>
>
> (1*) Maybe this is not a perfect way, but it is better than nothing.
> (2*) For N measurements the expression for the boundaries becomes a
>     bit more complicated.
> (3*) For P=0.85 (as can be found in any scientific book) U(P)=1.44
> (4*) One more assumption here: A and X have identical distributions.
> (5*) Actually D = max(D_create, D_copy, D_read, D_delete, D_dd), where
>     D for each phase was estimated once from 10 measurements with some
>     fixed settings in the standard way:
>     D^2 = ((x - x1)^2 + ... + (x - x10)^2)/(10 - 1), where
>     x = (x1 + ... + x10)/10 is the average value.
>
> Edward.
>
>
>
>
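
The criterion Edward describes can be sketched in Python roughly as
follows. This is only a minimal illustration, not the mongo benchmark's
code: the function names and the ten sample timings are hypothetical,
A is taken to be the sample mean, and U(P) is computed from the standard
normal distribution with statistics.NormalDist instead of being read
from a printed table.

```python
from statistics import NormalDist, mean, stdev


def u_of_p(p):
    """Two-sided standard-normal quantile U(P), i.e. P(|Z| <= U) = p."""
    return NormalDist().inv_cdf((1.0 + p) / 2.0)


def trusted_half_width(samples, p=0.85):
    """d(P) = D * U(P), with D the sample standard deviation (n - 1 divisor)."""
    return stdev(samples) * u_of_p(p)


def noise_threshold(samples, p=0.85):
    """Relative threshold M = 2 * D * U(P) / A (assumption: A = sample mean)."""
    return 2.0 * trusted_half_width(samples, p) / mean(samples)


def is_significant(a, x, d):
    """Criterion from the message: |A - X| >= 2 * d(P)."""
    return abs(a - x) >= 2.0 * d


# Hypothetical timings (seconds) for ten runs of one phase; not real data.
runs = [100.2, 99.5, 100.9, 99.1, 100.4, 99.8, 100.6, 99.3, 100.1, 100.1]

d = trusted_half_width(runs)
m = noise_threshold(runs)

print(f"U(0.85) = {u_of_p(0.85):.2f}")
print(f"d(P) = {d:.3f} s, M = {m:.4f}")
print("100 s vs 103 s significant:", is_significant(100.0, 103.0, d))
print("100 s vs 101 s significant:", is_significant(100.0, 101.0, d))
```

Since M = 2D*U(P)/A, the same absolute noise D yields a larger relative
threshold as the measured time A shrinks, which is the reason very short
runs tend to fail the criterion.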

