From: Jens Axboe <axboe@kernel.dk>
To: Karl Cronburg <kcronbur@redhat.com>, fio@vger.kernel.org
Subject: Re: computing percentiles from fio data
Date: Tue, 7 Jun 2016 23:10:27 -0600
Message-ID: <5757A8C3.4000704@kernel.dk>
In-Reply-To: <CAJ9yE1BYp_jqjwVVap1U-zjYfGT6bsOuXbOMcJV0r8Mz-5LQBQ@mail.gmail.com>
On 06/06/2016 03:00 PM, Karl Cronburg wrote:
> Hello,
>
> In benchmarking ceph I've been using fio / fiologparser, and want to
> get out the sort of stats & percentiles fiologparser currently gives
> (min, avg, max, percentiles). However I'm concerned the data coming
> out of fio is insufficient when I pass it the log_avg_msec argument.
> Namely using the average of a possibly asymmetric sample distribution
> (the set of I/O request samples over which fio is averaging when I
> pass it this argument) will not give accurate percentiles.
The normal stats (percentiles, min/max/avg, etc.) are not averaged, even
if log_avg_msec is set. Averaging only applies to the logs, if you enable
any of the latency (or iops/bw) logging. The stats that fio outputs at
the end of a run in the normal output are not averaged.

So which problem are you attacking? If you want to improve the logged
values, then that could be useful. You want to look at
stat.c:add_log_sample() for that code.
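To illustrate what averaging does to logged samples, here is a minimal
Python sketch. It is not fio's code: average_windows() and the sample
data are hypothetical, and the only assumption is that every window of
log_avg_msec milliseconds is collapsed into a single averaged log entry.

```python
from collections import defaultdict


def average_windows(samples, window_msec):
    """Collapse (time_msec, value) samples into one mean per window.

    Hypothetical helper for illustration only, not fio's implementation.
    """
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[t // window_msec].append(v)
    # One (window_start_msec, mean) entry per window, in time order.
    return [(w * window_msec, sum(vs) / len(vs))
            for w, vs in sorted(buckets.items())]


# Nine fast I/Os and one slow one within the same second (latency in usec).
raw = [(i * 100, 100) for i in range(9)] + [(900, 10000)]
averaged = average_windows(raw, 1000)
print(averaged)
```

The single 10000 usec outlier disappears into one averaged entry, so any
tail statistic taken from such a log can no longer see it.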
> Something like this argument is necessary though to keep the log files
> a reasonable size. Would it be a good idea to push the sort of
> statistics done in the log parser directly into fio? I'm considering
> writing some code to compute the quantiles directly in fio, either
> brute-force by maintaining a sorted list or implementing something
> like the algorithm described here:
>
> http://www.cs.rutgers.edu/~muthu/bquant.pdf
>
> with some acceptable user-defined level of error given to fio when
> asked to compute the percentiles on long-running / large data sets.
I still don't quite follow this... You already have the percentiles. If
you re-compute them from a latency log with log_avg_msec set, then yes,
it won't be completely accurate. But why not just use the percentiles
directly?
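A rough sketch of why re-computing from an averaged log is off, using
synthetic data (the numbers and the nearest-rank percentile() helper are
assumptions for illustration, not fio or fiologparser code):

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (illustrative helper)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic latencies in msec: 100 averaging windows of 10 samples each,
# every window holding nine 1 ms samples and one 50 ms outlier.
windows = [[1.0] * 9 + [50.0] for _ in range(100)]
raw = [v for w in windows for v in w]
window_means = [sum(w) / len(w) for w in windows]

p99_raw = percentile(raw, 99)           # from the individual samples
p99_avg = percentile(window_means, 99)  # from the averaged "log"
print(p99_raw, p99_avg)
```

With this data the 99th percentile of the raw samples is the 50 ms
outlier, while the averaged entries never exceed the per-window mean, so
the tail is badly underestimated.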
> Is there any interest in having this added directly into fio? If so
> where in the code should I be looking?
It might be, if I know exactly what problem we are trying to solve here :-)
--
Jens Axboe