* reporting two fio bug fix when ramp_time / rate_iops options is specified
@ 2013-03-07 1:08 SEOKYOUNG KO
2013-03-07 11:36 ` Jens Axboe
0 siblings, 1 reply; 2+ messages in thread
From: SEOKYOUNG KO @ 2013-03-07 1:08 UTC
To: fio@vger.kernel.org
Hi,
We found two problems related to latency logging.
We analyzed both problems and fixed the relevant fio code.
(0) test condition
- kernel 3.0.0
- fio 2.0.13
- fio --filename=/dev/sdb1 --ioengine=libaio --direct=1 --norandommap --randrepeat=0 --time_based --runtime=$RUNTIME --blocksize=4k --rw=randrw
--rwmixwrite=66 --iodepth=8 --overwrite=1 --refill_buffers
- additional parameters: --ramp_time=$RAMP_TIME / --rate_iops=<75% of the average IOPS>
- tested on a recent SSD
(1) --ramp_time option
■ Problem Statement
- When the --ramp_time option is specified, fio reports a worse latency distribution (the xx.xx% latency percentiles): higher WRITE latency for mixed R:W=33:66.
■ Analysis Result 1 - Comparison Experiment
- Without --ramp_time, fio generates a uniformly mixed R/W workload, as expected.
- With --ramp_time, fio behaves abnormally: it generates a burst of WRITEs right after it leaves the ramp-up stage.
- Because of this WRITE burst, the fio logs are contaminated by many long-latency samples ==> false report of a worse latency distribution.
■ Analysis Result 2 - Code inspection
- When the --ramp_time option is specified, fio calls reset_all_stats() once it has run for ramp_time.
- However, reset_all_stats() has a bug: it does not reset td->rwmix_issues.
- For an rwmix workload, fio compares td->io_issues to td->rwmix_issues in order to decide when to select the next I/O direction.
- If td->rwmix_issues is not reset (original code), td->io_issues restarts from 0 while td->rwmix_issues keeps its old value, so after ramp_time ends fio keeps issuing I/O in the same direction for as long as td->io_issues < td->rwmix_issues.
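To illustrate the effect described above, here is a toy model in C. It is only a sketch under simplifying assumptions, not fio's actual code: the struct mix_state, the alternating direction policy, the fixed batch size, and the helper names next_ddir() and first_run_after_reset() are all hypothetical. Only the names io_issues and rwmix_issues are borrowed from fio.

```c
/* Toy model of the stale-threshold problem. NOT fio code. */
enum { DDIR_READ = 0, DDIR_WRITE = 1 };

struct mix_state {
	unsigned long io_issues[2];  /* per-direction issue counters */
	unsigned long rwmix_issues;  /* threshold for re-picking the direction */
	int ddir;                    /* current direction */
};

/* Pick the direction of the next I/O and account for it.
 * Assumed policy: alternate direction every 'batch' I/Os. */
int next_ddir(struct mix_state *s, unsigned long batch)
{
	if (s->io_issues[s->ddir] >= s->rwmix_issues) {
		s->ddir = !s->ddir;
		s->rwmix_issues = s->io_issues[s->ddir] + batch;
	}
	s->io_issues[s->ddir]++;
	return s->ddir;
}

/* Run 'warmup' I/Os, reset the counters (as a stats reset would), and
 * return the length of the first same-direction run afterwards.
 * reset_threshold != 0 models the patched behaviour. */
unsigned long first_run_after_reset(unsigned long warmup,
				    unsigned long batch,
				    int reset_threshold)
{
	struct mix_state s = { {0, 0}, 0, DDIR_READ };
	unsigned long i, run = 1;
	int first;

	for (i = 0; i < warmup; i++)
		next_ddir(&s, batch);

	s.io_issues[DDIR_READ] = 0;
	s.io_issues[DDIR_WRITE] = 0;
	if (reset_threshold)
		s.rwmix_issues = 0;

	first = next_ddir(&s, batch);
	while (next_ddir(&s, batch) == first)
		run++;
	return run;
}
```

In this model, first_run_after_reset(1000, 4, 0) returns 500 (one long same-direction burst because the threshold is stale), while first_run_after_reset(1000, 4, 1) returns 4 (the normal batch length).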
■ Patch
--- fio-2.0.13_orig/libfio.c 2013-03-04 11:01:59.012419225 +0900
+++ fio-2.0.13_fixed/libfio.c 2013-03-04 11:03:46.048422630 +0900
@@ -116,6 +116,7 @@
 		td->io_issues[i] = 0;
 		td->ts.total_io_u[i] = 0;
 		td->ts.runtime[i] = 0;
+		td->rwmix_issues = 0;
 	}
 	fio_gettime(&tv, NULL);
(2) --rate_iops option
■ Problem Statement
- When the --rate_iops option is specified, fio reports a worse latency distribution (higher WRITE latency for mixed R:W=33:66).
■ Analysis Result 1 - Comparison Experiment
- We compared the latencies logged by fio against those reported by blktrace.
=> Observation: fio falsely reported much higher I/O latency than blktrace.
■ Analysis Result 2 – FIO code inspection
- When --rate_iops is specified, fio periodically calls usleep() to limit IOPS.
- Before calling usleep(), fio always waits for all pending I/O to complete.
- All of the I/Os completed during that wait are logged with the latency of the slowest one instead of their own latency.
- With QD=8 and mixed R:W=33:66, up to 7 samples may get a false latency logged each time fio waits for all I/O to complete.
=> False report of the latency distribution
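The inflation described above can be sketched with a small C model. This is an illustration under stated assumptions, not fio code: it assumes all in-flight I/Os were issued at time 0 and that a wait for all completions stamps every sample at the moment the wait returns. The function names and the sample latencies (in microseconds) are made up.

```c
#include <stddef.h>

/* Wait for all n completions in one call: every sample is stamped when
 * the wait returns, i.e. at the completion time of the slowest I/O. */
void log_wait_all(const unsigned long *true_us, size_t n,
		  unsigned long *logged_us)
{
	unsigned long slowest = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (true_us[i] > slowest)
			slowest = true_us[i];
	for (i = 0; i < n; i++)
		logged_us[i] = slowest;
}

/* Reap one completion at a time: each sample keeps its own latency. */
void log_one_by_one(const unsigned long *true_us, size_t n,
		    unsigned long *logged_us)
{
	size_t i;

	for (i = 0; i < n; i++)
		logged_us[i] = true_us[i];
}

/* Count samples whose logged latency exceeds their true latency. */
size_t count_inflated(const unsigned long *true_us,
		      const unsigned long *logged_us, size_t n)
{
	size_t i, bad = 0;

	for (i = 0; i < n; i++)
		if (logged_us[i] > true_us[i])
			bad++;
	return bad;
}

/* QD=8 example with one slow I/O (hypothetical numbers). */
size_t demo_inflated(int wait_all)
{
	const unsigned long true_us[8] = {100, 120, 90, 300, 110, 95, 105, 2000};
	unsigned long logged_us[8];

	if (wait_all)
		log_wait_all(true_us, 8, logged_us);
	else
		log_one_by_one(true_us, 8, logged_us);
	return count_inflated(true_us, logged_us, 8);
}
```

With these numbers, demo_inflated(1) returns 7 (seven of the eight samples are logged with the 2000 us latency of the slowest I/O), and demo_inflated(0) returns 0.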
■ Patch
--- fio-2.0.13_orig/io_u.c 2013-03-04 11:01:59.008419251 +0900
+++ fio-2.0.13_fixed/io_u.c 2013-03-04 11:05:32.828426032 +0900
@@ -457,10 +457,10 @@
 	 * io's that have been actually submitted to an async engine,
 	 * and cur_depth is meaningless for sync engines.
 	 */
-	if (td->io_u_in_flight) {
+	while (td->io_u_in_flight) {
 		int fio_unused ret;
-		ret = io_u_queued_complete(td, td->io_u_in_flight, NULL);
+		ret = io_u_queued_complete(td, 1, NULL);
 	}
 	fio_gettime(&t, NULL);