From: Jens Axboe <axboe@kernel.dk>
To: Vasily Tarasov <tarasov@vasily.name>,
fio@vger.kernel.org, michael.mattsson@gmail.com
Subject: Re: Fix for a race when fio prints I/O statistics periodically
Date: Sat, 08 Nov 2014 09:28:54 -0700 [thread overview]
Message-ID: <545E44C6.5060505@kernel.dk> (raw)
In-Reply-To: <20141105171024.GC25731@vass-desktop>
On 2014-11-05 10:10, Vasily Tarasov wrote:
> Below is the demonstration for the latest code in git:
>
> Job-file:
>
> [job]
> filename=/dev/sda
> rw=randread
> blocksize=4096
> direct=1
> time_based=1
> runtime=300
> numjobs=16
>
> Command line:
>
> fio --output-format=json --status-interval=10 /tmp/job.file | tee /tmp/fio.out
>
> After parsing json and printing per-job "read->runtime" and "read->iops" in two
> columns, one can see this:
>
> runtime iops
> 9854 4762
> 19861 4724
> 29867 4711
> 39873 4705
> 49880 4701
> 59886 4698
> 69894 4696
> 79900 4694
> 89907 4693
> 99913 4692
> 109919 4692
> 119926 4691
> 129933 4691
> 139940 4691
> 149946 4691
> 159953 4690
> 169959 4690
> 179966 4690
> 189972 4690
> 199978 4690
> 209984 4690
> 219990 4690
> 229996 4690
> 240002 4690
> 250009 4690
> 260015 4690
> 270021 4690
> 280027 4690
> 290033 4690
> 600043 2344 <--- You can see a spurious jump in runtime here
> 300001 4690
>
> The problem is two-fold, IMHO:
>
> 1) td->ts.runtime[DDIR_READ/DDIR_WRITE/DDIR_TRIM] are set in microseconds
> instead of milliseconds in thread_main() function. The values are later
> converted to milliseconds (after exiting from the loop in thread_main()),
> but it is too late for the out-of-band access by the helper_thread
> (the one that prints periodic statistics).
>
> 2) The helper_thread in __show_running_run_stats() temporarily adds to
> td->ts.runtime[DDIR_READ/DDIR_WRITE/DDIR_TRIM]
> the time that has passed since the beginning of the loop (td->start).
> However, the same addition happens in thread_main() by the job
> thread and td->start is not reset atomically with this addition.
> As a result, the runtime can be added twice. (That's why in the output
> we see 600043, which is approximately twice larger than the
> previous reading).
>
> My proposal is to make the job threads updating both td->ts.runtime and
> td->start atomically by using an already existing stat_mutex.
> __show_running_run_stats() already takes this mutex where needed.
>
> A proposed patch is attached.
Patch looks pretty good, probably not worth it to make this a per-td
mutex instead. But why the extra setting of td->start?
Michael, did you get a chance to test this?
--
Jens Axboe
next prev parent reply other threads:[~2014-11-08 16:28 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-05 17:10 Fix for a race when fio prints I/O statistics periodically Vasily Tarasov
2014-11-05 17:19 ` Michael Mattsson
2014-11-08 16:28 ` Jens Axboe [this message]
2014-11-08 16:33 ` Michael Mattsson
2014-11-10 14:19 ` Vasily Tarasov
2014-11-10 15:33 ` Jens Axboe
2014-12-02 23:00 ` Michael Mattsson
2014-12-02 23:36 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=545E44C6.5060505@kernel.dk \
--to=axboe@kernel.dk \
--cc=fio@vger.kernel.org \
--cc=michael.mattsson@gmail.com \
--cc=tarasov@vasily.name \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox