From: Jens Axboe <axboe@kernel.dk>
To: Gavin Martin <gavin.martin@seagate.com>,
"fio@vger.kernel.org" <fio@vger.kernel.org>
Subject: Re: Verify concentrated on first few drives
Date: Sat, 20 Jun 2015 13:01:59 -0400 [thread overview]
Message-ID: <55859C87.7060502@kernel.dk> (raw)
In-Reply-To: <55859A90.1060604@kernel.dk>
On 06/20/2015 12:53 PM, Jens Axboe wrote:
> On 06/17/2015 10:45 AM, Gavin Martin wrote:
>> Hi,
>>
>> Running FIO (2.2.8) with the verify option I see that it seems to
>> concentrate the Reads (from the Verify) on the first few drives and
>> tails off as you run more drives, as reported in the Disk Stats.
>>
>> Is this normal? Does the Verify follow the Round-Robin approach of
>> normal workloads?
>>
>> Config file:-
>> [global]
>> ioengine=libaio
>> iodepth=2
>> direct=1
>> runtime=30s
>> time_based
>> bs=64K
>> verify=meta
>> verify_backlog=32
>> verify_state_save=0
>> stonewall
>> norandommap
>> atomic=1
>>
>> filename=/dev/sdb
>> filename=/dev/sdc
>> filename=/dev/sdd
>> filename=/dev/sde
>> filename=/dev/sdf
>> filename=/dev/sdg
>> filename=/dev/sdh
>> filename=/dev/sdi
>> filename=/dev/sdj
>> filename=/dev/sdk
>> filename=/dev/sdl
>> filename=/dev/sdm
>> filename=/dev/sdn
>> filename=/dev/sdo
>> filename=/dev/sdp
>> filename=/dev/sdq
>> filename=/dev/sdr
>> filename=/dev/sds
>> filename=/dev/sdt
>> filename=/dev/sdu
>> filename=/dev/sdv
>> filename=/dev/sdw
>> filename=/dev/sdx
>> filename=/dev/sdy
>>
>> [RUN01]
>> rw=randwrite
>> do_verify=1
>>
>> Output:-
>> RUN01: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K,
>> ioengine=libaio, iodepth=2
>> fio-2.2.8
>> Starting 1 process
>> Jobs: 1 (f=24): [w(1)] [100.0% done] [10199KB/9816KB/0KB /s]
>> [159/153/0 iops] [eta 00m:00s]
>> RUN01: (groupid=0, jobs=1): err= 0: pid=22749: Wed Jun 17 10:28:23 2015
>> write: io=296000KB, bw=9862.5KB/s, iops=154, runt= 30013msec
>> slat (usec): min=10, max=86, avg=32.74, stdev= 8.18
>> clat (msec): min=1, max=50, avg=11.80, stdev= 3.94
>> lat (msec): min=2, max=50, avg=11.83, stdev= 3.94
>> clat percentiles (usec):
>> | 1.00th=[ 3120], 5.00th=[ 9408], 10.00th=[10432],
>> 20.00th=[10816],
>> | 30.00th=[10944], 40.00th=[11072], 50.00th=[11200],
>> 60.00th=[11328],
>> | 70.00th=[11456], 80.00th=[11584], 90.00th=[12480],
>> 95.00th=[19328],
>> | 99.00th=[30336], 99.50th=[33024], 99.90th=[44288],
>> 99.95th=[44800],
>> | 99.99th=[50432]
>> bw (KB /s): min= 8431, max=11241, per=100.00%, avg=9876.58,
>> stdev=453.15
>> lat (usec) : 250=6.93%, 500=36.19%, 750=3.56%, 1000=0.17%
>> lat (msec) : 2=0.19%, 4=1.19%, 10=4.42%, 20=44.42%, 50=2.91%
>> lat (msec) : 100=0.01%
>> cpu : usr=0.68%, sys=0.99%, ctx=9001, majf=0, minf=87
>> IO depths : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
>> >=64=0.0%
>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>> >=64=0.0%
>> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>> >=64=0.0%
>> issued : total=r=4608/w=4625/d=0, short=r=0/w=0/d=0,
>> drop=r=0/w=0/d=0
>> latency : target=0, window=0, percentile=100.00%, depth=2
>>
>> Run status group 0 (all jobs):
>> WRITE: io=296000KB, aggrb=9862KB/s, minb=9862KB/s, maxb=9862KB/s,
>> mint=30013msec, maxt=30013msec
>>
>> Disk stats (read/write):
>> sdb: ios=274/192, merge=0/0, ticks=305/2299, in_queue=2604, util=8.65%
>> sdc: ios=225/192, merge=0/0, ticks=359/2232, in_queue=2591, util=8.61%
>> sdd: ios=191/192, merge=0/0, ticks=222/2278, in_queue=2500, util=8.31%
>> sde: ios=217/192, merge=0/0, ticks=509/2235, in_queue=2744, util=9.12%
>> sdf: ios=191/192, merge=0/0, ticks=199/2209, in_queue=2408, util=8.01%
>> sdg: ios=191/192, merge=0/0, ticks=390/2254, in_queue=2644, util=8.80%
>> sdh: ios=191/192, merge=0/0, ticks=122/2339, in_queue=2461, util=8.19%
>> sdi: ios=191/192, merge=0/0, ticks=123/2270, in_queue=2393, util=7.95%
>> sdj: ios=191/191, merge=0/0, ticks=107/2188, in_queue=2302, util=7.64%
>> sdk: ios=191/191, merge=0/0, ticks=82/2168, in_queue=2255, util=7.51%
>> sdl: ios=191/191, merge=0/0, ticks=134/2322, in_queue=2456, util=8.17%
>> sdm: ios=191/191, merge=0/0, ticks=131/2152, in_queue=2283, util=7.61%
>> sdn: ios=191/191, merge=0/0, ticks=127/2288, in_queue=2415, util=8.05%
>> sdo: ios=191/191, merge=0/0, ticks=134/2312, in_queue=2446, util=8.13%
>> sdp: ios=191/191, merge=0/0, ticks=128/2258, in_queue=2386, util=7.95%
>> sdq: ios=191/191, merge=0/0, ticks=292/2296, in_queue=2587, util=8.49%
>> sdr: ios=191/191, merge=0/0, ticks=189/2341, in_queue=2530, util=8.27%
>> sds: ios=191/191, merge=0/0, ticks=167/2379, in_queue=2545, util=8.47%
>> sdt: ios=191/191, merge=0/0, ticks=297/2286, in_queue=2583, util=8.39%
>> sdu: ios=191/191, merge=0/0, ticks=284/2182, in_queue=2465, util=8.15%
>> sdv: ios=189/191, merge=0/0, ticks=191/2134, in_queue=2325, util=7.72%
>> sdw: ios=189/191, merge=0/0, ticks=189/2337, in_queue=2526, util=8.36%
>> sdx: ios=189/191, merge=0/0, ticks=122/2291, in_queue=2413, util=8.06%
>> sdy: ios=189/191, merge=0/0, ticks=195/2164, in_queue=2359, util=7.86%
>>
>> If I increase the iodepth to 8 it gets even worse. The Writes
>> and utilisation
>> has increased, but the last few drives are not going through any
>> verification:-
>>
>> Disk stats (read/write):
>> sdb: ios=970/899, merge=0/0, ticks=624/9107, in_queue=9736,
>> util=32.25%
>> sdc: ios=901/899, merge=0/0, ticks=622/8927, in_queue=9549,
>> util=31.58%
>> sdd: ios=899/899, merge=0/0, ticks=679/9151, in_queue=9831,
>> util=32.60%
>> sde: ios=899/899, merge=0/0, ticks=657/8991, in_queue=9652,
>> util=32.02%
>> sdf: ios=899/900, merge=0/0, ticks=591/9044, in_queue=9634,
>> util=31.96%
>> sdg: ios=899/899, merge=0/0, ticks=543/9260, in_queue=9805,
>> util=32.56%
>> sdh: ios=899/899, merge=0/0, ticks=547/9435, in_queue=9982,
>> util=32.82%
>> sdi: ios=899/899, merge=0/0, ticks=488/9365, in_queue=9853,
>> util=32.68%
>> sdj: ios=899/899, merge=0/0, ticks=422/9349, in_queue=9771,
>> util=32.46%
>> sdk: ios=899/899, merge=0/0, ticks=577/9414, in_queue=9991,
>> util=33.11%
>> sdl: ios=899/899, merge=0/0, ticks=507/9199, in_queue=9703,
>> util=32.19%
>> sdm: ios=899/899, merge=0/0, ticks=495/9125, in_queue=9619,
>> util=31.92%
>> sdn: ios=899/899, merge=0/0, ticks=485/9103, in_queue=9587,
>> util=31.82%
>> sdo: ios=899/899, merge=0/0, ticks=501/9240, in_queue=9741,
>> util=32.23%
>> sdp: ios=899/899, merge=0/0, ticks=477/9110, in_queue=9586,
>> util=31.76%
>> sdq: ios=899/899, merge=0/0, ticks=557/9169, in_queue=9726,
>> util=32.11%
>> sdr: ios=899/899, merge=0/0, ticks=545/11115, in_queue=11656,
>> util=35.65%
>> sds: ios=899/899, merge=0/0, ticks=1266/11542, in_queue=12808,
>> util=36.97%
>> sdt: ios=330/899, merge=0/0, ticks=363/9283, in_queue=9644,
>> util=31.43%
>> sdu: ios=0/899, merge=0/0, ticks=0/9211, in_queue=9209, util=30.77%
>> sdv: ios=0/899, merge=0/0, ticks=0/9232, in_queue=9231, util=30.86%
>> sdw: ios=0/899, merge=0/0, ticks=0/9470, in_queue=9470, util=31.68%
>> sdx: ios=0/899, merge=0/0, ticks=0/9342, in_queue=9342, util=31.26%
>> sdy: ios=0/898, merge=0/0, ticks=0/9174, in_queue=9181, util=30.74%
>>
>> Have I missed an option or have the wrong one in my config file that
>> could be causing this?
>
> That does look very odd. I'll take a look at this. As a temporary
> work-around, you could have separate jobs for each of the devices. Or
> try using your original job file, but add:
>
> verifysort=0
>
> I suspect the latter would also fix it for you.
I looked at the code, and it is an artifact of how the verify sorting
works. It has a builtin check for overlaps, and includes the file in the
sorting. This means that it will favor the starting files when you do
verify like this. If the job ran to completion, you would get full
verification. Setting verifysort=0 should make the time limited backlog
verification more fair for you, since it'll turn the backlog into FIFO
behavior.
--
Jens Axboe
next prev parent reply other threads:[~2015-06-20 17:01 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-17 14:45 Verify concentrated on first few drives Gavin Martin
2015-06-20 16:53 ` Jens Axboe
2015-06-20 17:01 ` Jens Axboe [this message]
2015-06-22 8:41 ` Gavin Martin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55859C87.7060502@kernel.dk \
--to=axboe@kernel.dk \
--cc=fio@vger.kernel.org \
--cc=gavin.martin@seagate.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox