From: Jens Axboe <axboe@kernel.dk>
To: Vasily Tarasov <tarasov@vasily.name>,
fio@vger.kernel.org, michael.mattsson@gmail.com
Subject: Re: Fix for a race when fio prints I/O statistics periodically
Date: Sat, 08 Nov 2014 09:28:54 -0700 [thread overview]
Message-ID: <545E44C6.5060505@kernel.dk> (raw)
In-Reply-To: <20141105171024.GC25731@vass-desktop>
On 2014-11-05 10:10, Vasily Tarasov wrote:
> Below is the demonstration for the latest code in git:
>
> Job-file:
>
> [job]
> filename=/dev/sda
> rw=randread
> blocksize=4096
> direct=1
> time_based=1
> runtime=300
> numjobs=16
>
> Command line:
>
> fio --output-format=json --status-interval=10 /tmp/job.file | tee /tmp/fio.out
>
> After parsing json and printing per-job "read->runtime" and "read->iops" in two
> columns, one can see this:
>
> runtime iops
> 9854 4762
> 19861 4724
> 29867 4711
> 39873 4705
> 49880 4701
> 59886 4698
> 69894 4696
> 79900 4694
> 89907 4693
> 99913 4692
> 109919 4692
> 119926 4691
> 129933 4691
> 139940 4691
> 149946 4691
> 159953 4690
> 169959 4690
> 179966 4690
> 189972 4690
> 199978 4690
> 209984 4690
> 219990 4690
> 229996 4690
> 240002 4690
> 250009 4690
> 260015 4690
> 270021 4690
> 280027 4690
> 290033 4690
> 600043 2344 <--- You can see a spurious jump in runtime here
> 300001 4690
>
> The problem is two-fold, IMHO:
>
> 1) td->ts.runtime[DDIR_READ/DDIR_WRITE/DDIR_TRIM] are set in microseconds
> instead of milliseconds in thread_main() function. The values are later
> converted to milliseconds (after exiting from the loop in thread_main()),
> but it is too late for the out-of-band access by the helper_thread
> (the one that prints periodic statistics).
>
> 2) The helper_thread in __show_running_run_stats() temporarily adds to
> td->ts.runtime[DDIR_READ/DDIR_WRITE/DDIR_TRIM]
> the time that has passed since the beginning of the loop (td->start).
> However, the same addition happens in thread_main() by the job
> thread and td->start is not reset atomically with this addition.
> As a result, the runtime can be added twice. (That's why in the output
> we see 600043, which is approximately twice larger than the
> previous reading).
>
> My proposal is to make the job threads updating both td->ts.runtime and
> td->start atomically by using an already existing stat_mutex.
> __show_running_run_stats() already takes this mutex where needed.
>
> A proposed patch is attached.
Patch looks pretty good, probably not worth it to make this a per-td
mutex instead. But why the extra setting of td->start?
Michael, did you get a chance to test this?
--
Jens Axboe
next prev parent reply other threads:[~2014-11-08 16:28 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-05 17:10 Fix for a race when fio prints I/O statistics periodically Vasily Tarasov
2014-11-05 17:19 ` Michael Mattsson
2014-11-08 16:28 ` Jens Axboe [this message]
2014-11-08 16:33 ` Michael Mattsson
2014-11-10 14:19 ` Vasily Tarasov
2014-11-10 15:33 ` Jens Axboe
2014-12-02 23:00 ` Michael Mattsson
2014-12-02 23:36 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=545E44C6.5060505@kernel.dk \
--to=axboe@kernel.dk \
--cc=fio@vger.kernel.org \
--cc=michael.mattsson@gmail.com \
--cc=tarasov@vasily.name \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.