* time_based option broken
@ 2016-01-06 18:45 Jeff Furlong
2016-01-07 6:03 ` Sitsofe Wheeler
0 siblings, 1 reply; 11+ messages in thread
From: Jeff Furlong @ 2016-01-06 18:45 UTC (permalink / raw)
To: fio@vger.kernel.org
Hi All,
Back in version 2.1.14, --time_based let a sequential workload wrap around the device under test when it reached the end of the device or the end of the range set by --size. In version 2.3, that wrap-around is broken.
Here is 2.1.14 with a --runtime of 60s, and a runt of 60s:
# fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme0n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.1.14
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/995.1MB/0KB /s] [0/995/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=71018: Wed Jan 6 09:14:53 2016
write: io=56738MB, bw=968345KB/s, iops=945, runt= 59999msec
slat (usec): min=357, max=14590, avg=1049.50, stdev=181.80
clat (usec): min=175, max=73870, avg=32747.94, stdev=3397.28
lat (usec): min=947, max=75129, avg=33798.39, stdev=3491.75
clat percentiles (usec):
| 1.00th=[30592], 5.00th=[30592], 10.00th=[30592], 20.00th=[30848],
| 30.00th=[30848], 40.00th=[31104], 50.00th=[31360], 60.00th=[31360],
| 70.00th=[31872], 80.00th=[36608], 90.00th=[38144], 95.00th=[38656],
| 99.00th=[44288], 99.50th=[46336], 99.90th=[57600], 99.95th=[63744],
| 99.99th=[70144]
bw (KB /s): min=790528, max=1028096, per=99.84%, avg=966794.02, stdev=80569.23
lat (usec) : 250=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.04%, 20=0.07%, 50=99.74%
lat (msec) : 100=0.13%
cpu : usr=25.69%, sys=75.38%, ctx=155, majf=0, minf=3823
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.8%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=56738/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: io=56738MB, aggrb=968344KB/s, minb=968344KB/s, maxb=968344KB/s, mint=59999msec, maxt=59999msec
Disk stats (read/write):
nvme0n1: ios=364/509460, merge=0/0, ticks=21/93539, in_queue=93551, util=80.61%
Here is 2.3 with a --runtime of 60s, and a runt of only 16s, and an error:
# fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme0n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-11-g5f3b
Starting 1 process
fio: io_u error on file /dev/nvme0n1: Invalid argument: write offset=153764, buflen=1048576
fio: io_u error on file /dev/nvme0n1: Invalid argument: write offset=1202340, buflen=1048576
fio: pid=68391, err=22/file:io_u.c:1596, func=io_u error, error=Invalid argument
SW_1MB_QD32: (groupid=0, jobs=1): err=22 (file:io_u.c:1596, func=io_u error, error=Invalid argument): pid=68391: Wed Jan 6 08:48:18 2016
write: io=15262MB, bw=957733KB/s, iops=937, runt= 16318msec
slat (usec): min=90, max=7299, avg=1057.41, stdev=223.10
clat (msec): min=6, max=81, avg=33.10, stdev= 4.04
lat (msec): min=7, max=82, avg=34.16, stdev= 4.15
clat percentiles (usec):
| 1.00th=[30336], 5.00th=[30592], 10.00th=[30848], 20.00th=[30848],
| 30.00th=[31104], 40.00th=[31104], 50.00th=[31360], 60.00th=[31616],
| 70.00th=[32128], 80.00th=[37120], 90.00th=[38656], 95.00th=[39680],
| 99.00th=[46336], 99.50th=[46848], 99.90th=[74240], 99.95th=[78336],
| 99.99th=[81408]
bw (KB /s): min=745472, max=1028096, per=99.83%, avg=956100.72, stdev=83519.38
lat (msec) : 10=0.04%, 20=0.08%, 50=99.39%, 100=0.28%
cpu : usr=25.12%, sys=75.00%, ctx=53, majf=0, minf=2253
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.8%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=15294/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: io=15262MB, aggrb=957733KB/s, minb=957733KB/s, maxb=957733KB/s, mint=16318msec, maxt=16318msec
Disk stats (read/write):
nvme0n1: ios=2/135369, merge=0/0, ticks=0/87547, in_queue=87546, util=82.23%
The issue seems to be that version 2.3 no longer wraps around the device for sequential workloads. Random workloads seem fine. Thoughts?
Thanks.
Regards,
Jeff
HGST E-mail Confidentiality Notice & Disclaimer:
This e-mail and any files transmitted with it may contain confidential or legally privileged information of HGST and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: time_based option broken
2016-01-06 18:45 time_based option broken Jeff Furlong
@ 2016-01-07 6:03 ` Sitsofe Wheeler
2016-01-08 14:03 ` Jens Rosenboom
0 siblings, 1 reply; 11+ messages in thread
From: Sitsofe Wheeler @ 2016-01-07 6:03 UTC (permalink / raw)
To: Jeff Furlong; +Cc: fio@vger.kernel.org
Hi,
On 6 January 2016 at 18:45, Jeff Furlong <jeff.furlong@hgst.com> wrote:
>
> Here is 2.3 with a --runtime of 60s, and a runt of only 16s, and an error:
>
> # fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme0n1
> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
> fio-2.3-11-g5f3b
> Starting 1 process
> fio: io_u error on file /dev/nvme0n1: Invalid argument: write offset=153764, buflen=1048576
> fio: io_u error on file /dev/nvme0n1: Invalid argument: write offset=1202340, buflen=1048576
> fio: pid=68391, err=22/file:io_u.c:1596, func=io_u error, error=Invalid argument
>
> SW_1MB_QD32: (groupid=0, jobs=1): err=22 (file:io_u.c:1596, func=io_u error, error=Invalid argument): pid=68391: Wed Jan 6 08:48:18 2016
Can you say how big /dev/nvme0n1 is, and can you repeat your test but with
size set in bytes? I'm wondering if your 1MByte blocksize doesn't
divide evenly into the 1% space the job has to work with...
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: time_based option broken
2016-01-07 6:03 ` Sitsofe Wheeler
@ 2016-01-08 14:03 ` Jens Rosenboom
2016-01-08 19:35 ` Jeff Furlong
0 siblings, 1 reply; 11+ messages in thread
From: Jens Rosenboom @ 2016-01-08 14:03 UTC (permalink / raw)
To: Sitsofe Wheeler; +Cc: Jeff Furlong, fio@vger.kernel.org
2016-01-07 7:03 GMT+01:00 Sitsofe Wheeler <sitsofe@gmail.com>:
> Hi,
>
> On 6 January 2016 at 18:45, Jeff Furlong <jeff.furlong@hgst.com> wrote:
>>
>> Here is 2.3 with a --runtime of 60s, and a runt of only 16s, and an error:
>>
>> # fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme0n1
>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
>> fio-2.3-11-g5f3b
>> Starting 1 process
>> fio: io_u error on file /dev/nvme0n1: Invalid argument: write offset=153764, buflen=1048576
>> fio: io_u error on file /dev/nvme0n1: Invalid argument: write offset=1202340, buflen=1048576
>> fio: pid=68391, err=22/file:io_u.c:1596, func=io_u error, error=Invalid argument
>>
>> SW_1MB_QD32: (groupid=0, jobs=1): err=22 (file:io_u.c:1596, func=io_u error, error=Invalid argument): pid=68391: Wed Jan 6 08:48:18 2016
>
> Can say how big /dev/nvme0n1 is and can you repeat your test but with
> size set in bytes? I'm wondering if your 1MByte blocksize doesn't
> divide evenly in the 1% space the job has to work with...
Indeed I can pretty easily reproduce this by setting size to some odd
value. git bisect then finds c82ea3d49aa as the culprit.
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: time_based option broken
2016-01-08 14:03 ` Jens Rosenboom
@ 2016-01-08 19:35 ` Jeff Furlong
2016-01-14 17:33 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Jeff Furlong @ 2016-01-08 19:35 UTC (permalink / raw)
To: Jens Rosenboom, Sitsofe Wheeler; +Cc: fio@vger.kernel.org
Good points. Here is further data:
# blockdev --getsize64 /dev/nvme2n1
800166076416
# fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1000m --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-11-g5f3b
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/990.2MB/0KB /s] [0/990/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=19511: Fri Jan 8 11:08:55 2016
write: io=58620MB, bw=976.11MB/s, iops=976, runt= 60001msec
So using a fixed --size that is a multiple of --bs, as above, is one workaround.
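As a rough illustration of why the fixed size behaves differently from 1%, here is a minimal sketch. It assumes a percentage --size is truncated to whole bytes, which this thread does not confirm, and the helper name is hypothetical rather than anything from fio.

```python
# Sketch only: check whether a job's --size is a whole number of --bs blocks.
MIB = 1 << 20  # "1m" block size in bytes

def is_multiple_of_bs(size_bytes: int, bs_bytes: int) -> bool:
    """Return True when size_bytes divides evenly into bs-sized blocks."""
    return size_bytes % bs_bytes == 0

device_bytes = 800166076416        # from blockdev --getsize64 above
one_percent = device_bytes // 100  # assumption: percentage truncated to bytes

print(is_multiple_of_bs(1000 * MIB, MIB))   # --size=1000m -> True
print(is_multiple_of_bs(one_percent, MIB))  # --size=1%    -> False
```

Under that assumption, the 1% region ends partway through a 1 MB block, while 1000m ends exactly on a block boundary.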
To write more than 100% of the device (wrapping around after the end of the device), we can use a fixed size instead of --size=100% (800166076416 bytes is about 763097.8 MB). This also works when --time_based is not used, by setting --io_size instead:
# fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=763097m --io_size=764000m --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-11-g5f3b
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/567.9MB/0KB /s] [0/567/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=129928: Thu Jan 7 12:45:52 2016
write: io=764000MB, bw=600510KB/s, iops=586, runt=1302786msec
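The sizes in the command above can be derived as follows. This is only a sketch of the arithmetic, with a hypothetical helper name; it rounds the device size down to a whole number of 1 MB blocks and shows how much of the requested --io_size has to land after the wrap.

```python
# Sketch only: derive a bs-aligned --size and the amount written after wrapping.
MIB = 1 << 20

def round_down_to_bs(size_bytes: int, bs_bytes: int) -> int:
    """Largest multiple of bs_bytes that is not larger than size_bytes."""
    return (size_bytes // bs_bytes) * bs_bytes

device_bytes = 800166076416
aligned_bytes = round_down_to_bs(device_bytes, MIB)

size_mib = aligned_bytes // MIB       # value used for --size=763097m
io_size_mib = 764000                  # value used for --io_size=764000m
wrapped_mib = io_size_mib - size_mib  # data written after wrapping to the start

print(size_mib, wrapped_mib)  # 763097 903
```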
Alternatively, if we simply back out the commit c82ea3d49aa then we can make the 1% or 100% options (--time_based or --io_size) work fine:
# fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-11-g5f3b
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/954.2MB/0KB /s] [0/954/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=21853: Fri Jan 8 11:20:04 2016
write: io=56931MB, bw=971606KB/s, iops=948, runt= 60001msec
Thanks.
Regards,
Jeff
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: time_based option broken
2016-01-08 19:35 ` Jeff Furlong
@ 2016-01-14 17:33 ` Jens Axboe
2016-01-14 19:51 ` Jeff Furlong
0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2016-01-14 17:33 UTC (permalink / raw)
To: Jeff Furlong, Jens Rosenboom, Sitsofe Wheeler; +Cc: fio@vger.kernel.org
On 01/08/2016 12:35 PM, Jeff Furlong wrote:
> Good points. Here is further data:
>
> # blockdev --getsize64 /dev/nvme2n1
> 800166076416
>
> # fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1000m --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
> fio-2.3-11-g5f3b
> Starting 1 process
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/990.2MB/0KB /s] [0/990/0 iops] [eta 00m:00s]
> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=19511: Fri Jan 8 11:08:55 2016
> write: io=58620MB, bw=976.11MB/s, iops=976, runt= 60001msec
>
> So the above using a fixed --size that is compatible with --bs is one workaround.
>
> When trying to write more than 100% of the device (wrapping around after end of device), instead of --size=100% we could use a fixed size (where above 800166076416 bytes is 763097.8 MB), even if time_based is not used (use --io_size instead):
>
> # fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=763097m --io_size=764000m --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
> fio-2.3-11-g5f3b
> Starting 1 process
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/567.9MB/0KB /s] [0/567/0 iops] [eta 00m:00s]
> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=129928: Thu Jan 7 12:45:52 2016
> write: io=764000MB, bw=600510KB/s, iops=586, runt=1302786msec
>
> Alternatively, if we simply back out the commit c82ea3d49aa then we can make the 1% or 100% options (--time_based or --io_size) work fine:
>
> # fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
> fio-2.3-11-g5f3b
> Starting 1 process
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/954.2MB/0KB /s] [0/954/0 iops] [eta 00m:00s]
> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=21853: Fri Jan 8 11:20:04 2016
> write: io=56931MB, bw=971606KB/s, iops=948, runt= 60001msec
So the bug isn't really in the bisected commit, it just changes how fio
backs out and then shows another bug. The real bug is that we don't
align the new start properly, so we end up with these weird unaligned
start offsets that don't work with O_DIRECT IO.
Try current -git, I just committed this fix:
http://git.kernel.dk/cgit/fio/commit/?id=19ddc35b9b97be5af371bb65e93a4864d1dce7b6
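To make the unaligned start offsets concrete, here is a small check of the two offsets from the original error report. It is not the code from the fix; the 512-byte logical block size is an assumption (the thread does not state it), and the function name is hypothetical.

```python
# Sketch only: show that the failing offsets are not block aligned.
LOGICAL_BLOCK = 512  # assumption: logical block size of the device, in bytes

def misalignment(offset: int, block: int = LOGICAL_BLOCK) -> int:
    """Bytes by which offset overshoots the previous block boundary."""
    return offset % block

for off in (153764, 1202340):  # offsets from the io_u error messages
    print(off, misalignment(off))  # both print a remainder of 164

# The two offsets are exactly one 1 MB block apart, so the first unaligned
# start carries over to the I/O that follows it.
print(1202340 - 153764)  # 1048576
```

A nonzero remainder is consistent with the "Invalid argument" errors, since O_DIRECT I/O generally requires block-aligned offsets.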
--
Jens Axboe
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: time_based option broken
2016-01-14 17:33 ` Jens Axboe
@ 2016-01-14 19:51 ` Jeff Furlong
2016-01-14 20:10 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Jeff Furlong @ 2016-01-14 19:51 UTC (permalink / raw)
To: Jens Axboe, Jens Rosenboom, Sitsofe Wheeler; +Cc: fio@vger.kernel.org
The latest commit seems to help, but I found one more issue:
# blockdev --getsize64 /dev/nvme2n1
800166076416
800166076416B = 763097.8MB
Test1: Round io_size to match bs: PASS
# ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-26-g19dd
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1380MB/0KB /s] [0/1380/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=80311: Thu Jan 14 10:11:36 2016
write: io=84966MB, bw=1415.1MB/s, iops=1415, runt= 60009msec
Test2: Loop around max device size and continue IO with fixed runtime: PASS
# ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --runtime=15m --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-26-g19dd
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1388MB/0KB /s] [0/1388/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=80377: Thu Jan 14 10:28:27 2016
write: io=1231.9GB, bw=1401.6MB/s, iops=1401, runt=900008msec
Test3: Loop around max device size and continue IO with fixed total IO, round total IO to bs: FAIL (does not loop around to start LBA)
# ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --io_size=810000000000 --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-26-g19dd
Starting 1 process
Jobs: 1 (f=0): [W(1)] [98.9% done] [0KB/1453MB/0KB /s] [0/1453/0 iops] [eta 00m:06s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=81065: Thu Jan 14 10:49:53 2016
write: io=763097MB, bw=1399.5MB/s, iops=1399, runt=545278msec
Test4: Loop around max device size and continue IO with fixed total IO, total IO is already aligned to bs: FAIL (does not loop around to start LBA)
# ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --io_size=810675077120 --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-26-g19dd
Starting 1 process
Jobs: 1 (f=1): [W(1)] [98.6% done] [0KB/1404MB/0KB /s] [0/1404/0 iops] [eta 00m:08s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=81339: Thu Jan 14 11:01:57 2016
write: io=763097MB, bw=1399.9MB/s, iops=1399, runt=545142msec
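For reference, the arithmetic behind Test3 and Test4 can be sketched like this. It only compares the requested --io_size with one pass over the device and makes no claim about fio's internals; the function name is hypothetical.

```python
# Sketch only: compare requested io_size with one full pass over the device.
MIB = 1 << 20
device_bytes = 800166076416
one_pass_mib = device_bytes // MIB  # 763097, the io= value reported above

def whole_blocks(io_size_bytes: int, bs_bytes: int = MIB) -> int:
    """Number of complete bs-sized writes that fit in io_size_bytes."""
    return io_size_bytes // bs_bytes

for io_size in (810000000000, 810675077120):  # Test3, Test4
    aligned = io_size % MIB == 0
    missing_mib = whole_blocks(io_size) - one_pass_mib
    print(io_size, aligned, whole_blocks(io_size), missing_mib)
```

In both cases the reported io=763097MB equals one pass over the device, so the blocks that should have been written after the wrap are the ones that are missing.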
Regards,
Jeff
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: time_based option broken
2016-01-14 19:51 ` Jeff Furlong
@ 2016-01-14 20:10 ` Jens Axboe
2016-01-14 20:22 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2016-01-14 20:10 UTC (permalink / raw)
To: Jeff Furlong, Jens Rosenboom, Sitsofe Wheeler; +Cc: fio@vger.kernel.org
On 01/14/2016 12:51 PM, Jeff Furlong wrote:
> Latest commit seems to help, but found one more issue:
>
> # blockdev --getsize64 /dev/nvme2n1
> 800166076416
>
> 800166076416B = 763097.8MB
>
> Test1: Round io_size to match bs: PASS
> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
> fio-2.3-26-g19dd
> Starting 1 process
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1380MB/0KB /s] [0/1380/0 iops] [eta 00m:00s]
> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=80311: Thu Jan 14 10:11:36 2016
> write: io=84966MB, bw=1415.1MB/s, iops=1415, runt= 60009msec
>
> Test2: Loop around max device size and continue IO with fixed runtime: PASS
> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --runtime=15m --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
> fio-2.3-26-g19dd
> Starting 1 process
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1388MB/0KB /s] [0/1388/0 iops] [eta 00m:00s]
> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=80377: Thu Jan 14 10:28:27 2016
> write: io=1231.9GB, bw=1401.6MB/s, iops=1401, runt=900008msec
>
> Test3: Loop around max device size and continue IO with fixed total IO, round total IO to bs: FAIL (does not loop around to start LBA)
> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --io_size=810000000000 --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
> fio-2.3-26-g19dd
> Starting 1 process
> Jobs: 1 (f=0): [W(1)] [98.9% done] [0KB/1453MB/0KB /s] [0/1453/0 iops] [eta 00m:06s]
> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=81065: Thu Jan 14 10:49:53 2016
> write: io=763097MB, bw=1399.5MB/s, iops=1399, runt=545278msec
>
> Test4: Loop around max device size and continue IO with fixed total IO, total IO is already aligned to bs: FAIL (does not loop around to start LBA)
> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --io_size=810675077120 --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
> fio-2.3-26-g19dd
> Starting 1 process
> Jobs: 1 (f=1): [W(1)] [98.6% done] [0KB/1404MB/0KB /s] [0/1404/0 iops] [eta 00m:08s]
> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=81339: Thu Jan 14 11:01:57 2016
> write: io=763097MB, bw=1399.9MB/s, iops=1399, runt=545142msec
Very strange, I can't reproduce the failures in test cases 3 and 4. I've got a small
nvme0n1 device here:
axboe@dell:~ $ sudo blockdev --getsize64 /dev/nvme0n1
4286578688
and I tried:
./fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=1 --size=100% --io_size=4386578688 --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme0n1
and also 4300000000 for the size, and both produce the desired outcome of writing
the bytes specified by io_size. I added a quick dump of the offset, and
that looks correct too:
[...]
off=4281335808
off=4282384384
off=4283432960
off=4284481536
off=4285530112
off=0
off=1048576
off=2097152
off=3145728
[...]
which loops around and starts writing from 0 again.
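The wrap-around pattern in that dump can be modelled with a few lines. This is a sketch of the expected behaviour only, not fio's offset code, and the generator name is hypothetical.

```python
# Sketch only: sequential block offsets that wrap to 0 at the end of the device.
from itertools import islice

MIB = 1 << 20

def sequential_offsets(region_bytes: int, bs_bytes: int):
    """Yield block offsets forever, restarting at 0 after the last whole block."""
    blocks = region_bytes // bs_bytes  # only whole blocks are addressable
    i = 0
    while True:
        yield (i % blocks) * bs_bytes
        i += 1

device_bytes = 4286578688  # the small nvme0n1 device above (exactly 4088 blocks)

# Writes number 4083 to 4091 (counting from 0) straddle the end of the device.
dump = list(islice(sequential_offsets(device_bytes, MIB), 4083, 4092))
for off in dump:
    print(f"off={off}")
```

With these inputs the printed offsets match the dump above: the last block starts at 4285530112 and the next write goes to offset 0.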
I'll try a bigger device, though it'd be weird if that behaved differently.
--
Jens Axboe
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: time_based option broken
2016-01-14 20:10 ` Jens Axboe
@ 2016-01-14 20:22 ` Jens Axboe
2016-01-14 21:46 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2016-01-14 20:22 UTC (permalink / raw)
To: Jeff Furlong, Jens Rosenboom, Sitsofe Wheeler; +Cc: fio@vger.kernel.org
On 01/14/2016 01:10 PM, Jens Axboe wrote:
> On 01/14/2016 12:51 PM, Jeff Furlong wrote:
>> Latest commit seems to help, but found one more issue:
>>
>> # blockdev --getsize64 /dev/nvme2n1
>> 800166076416
>>
>> 800166076416B = 763097.8MB
>>
>> Test1: Round io_size to match bs: PASS
>> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>> --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m
>> --overwrite=1 --filename=/dev/nvme2n1
>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
>> iodepth=32
>> fio-2.3-26-g19dd
>> Starting 1 process
>> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1380MB/0KB /s] [0/1380/0
>> iops] [eta 00m:00s]
>> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=80311: Thu Jan 14
>> 10:11:36 2016
>> write: io=84966MB, bw=1415.1MB/s, iops=1415, runt= 60009msec
>>
>> Test2: Loop around max device size and continue IO with fixed runtime:
>> PASS
>> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>> --iodepth=32 --size=100% --runtime=15m --time_based --numjobs=1
>> --bs=1m --overwrite=1 --filename=/dev/nvme2n1
>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
>> iodepth=32
>> fio-2.3-26-g19dd
>> Starting 1 process
>> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1388MB/0KB /s] [0/1388/0
>> iops] [eta 00m:00s]
>> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=80377: Thu Jan 14
>> 10:28:27 2016
>> write: io=1231.9GB, bw=1401.6MB/s, iops=1401, runt=900008msec
>>
>> Test3: Loop around max device size and continue IO with fixed total
>> IO, round total IO to bs: FAIL (does not loop around to start LBA)
>> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>> --iodepth=32 --size=100% --io_size=810000000000 --numjobs=1 --bs=1m
>> --overwrite=1 --filename=/dev/nvme2n1
>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
>> iodepth=32
>> fio-2.3-26-g19dd
>> Starting 1 process
>> Jobs: 1 (f=0): [W(1)] [98.9% done] [0KB/1453MB/0KB /s] [0/1453/0 iops]
>> [eta 00m:06s]
>> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=81065: Thu Jan 14
>> 10:49:53 2016
>> write: io=763097MB, bw=1399.5MB/s, iops=1399, runt=545278msec
>>
>> Test4: Loop around max device size and continue IO with fixed total
>> IO, total IO is already aligned to bs: FAIL (does not loop around to
>> start LBA)
>> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>> --iodepth=32 --size=100% --io_size=810675077120 --numjobs=1 --bs=1m
>> --overwrite=1 --filename=/dev/nvme2n1
>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
>> iodepth=32
>> fio-2.3-26-g19dd
>> Starting 1 process
>> Jobs: 1 (f=1): [W(1)] [98.6% done] [0KB/1404MB/0KB /s] [0/1404/0 iops]
>> [eta 00m:08s]
>> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=81339: Thu Jan 14
>> 11:01:57 2016
>> write: io=763097MB, bw=1399.9MB/s, iops=1399, runt=545142msec
>
> Very strange, I can't reproduce 3/4 test case failures. I've got a small
> nvme0n1 device here:
>
> axboe@dell:~ $ sudo blockdev --getsize64 /dev/nvme0n1
> 4286578688
>
> and I tried:
>
> ./fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
> --iodepth=1 --size=100% --io_size=4386578688 --numjobs=1 --bs=1m
> --overwrite=1 --filename=/dev/nvme0n1
>
> and 4300000000 for sizes, and it produces the desired outcome in writing
> the bytes specified by io_size. I added a quick dump to the offset, and
> that looks correct too:
>
> [...]
> off=4281335808
> off=4282384384
> off=4283432960
> off=4284481536
> off=4285530112
> off=0
> off=1048576
> off=2097152
> off=3145728
> [...]
>
> which loops around and starts writing from 0 again.
>
> I'll try a bigger device, though it'd be weird if that behaved differently.
It does reproduce with a bigger device, just not on the 4G/16G ones I
have. Very odd, but at least I can reproduce it.
--
Jens Axboe
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: time_based option broken
2016-01-14 20:22 ` Jens Axboe
@ 2016-01-14 21:46 ` Jens Axboe
2016-01-14 23:04 ` Jeff Furlong
0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2016-01-14 21:46 UTC (permalink / raw)
To: Jeff Furlong, Jens Rosenboom, Sitsofe Wheeler; +Cc: fio@vger.kernel.org
On 01/14/2016 01:22 PM, Jens Axboe wrote:
> On 01/14/2016 01:10 PM, Jens Axboe wrote:
>> On 01/14/2016 12:51 PM, Jeff Furlong wrote:
>>> Latest commit seems to help, but found one more issue:
>>>
>>> # blockdev --getsize64 /dev/nvme2n1
>>> 800166076416
>>>
>>> 800166076416B = 763097.8MB
>>>
>>> Test1: Round io_size to match bs: PASS
>>> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>>> --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m
>>> --overwrite=1 --filename=/dev/nvme2n1
>>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
>>> iodepth=32
>>> fio-2.3-26-g19dd
>>> Starting 1 process
>>> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1380MB/0KB /s] [0/1380/0
>>> iops] [eta 00m:00s]
>>> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=80311: Thu Jan 14
>>> 10:11:36 2016
>>> write: io=84966MB, bw=1415.1MB/s, iops=1415, runt= 60009msec
>>>
>>> Test2: Loop around max device size and continue IO with fixed runtime:
>>> PASS
>>> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>>> --iodepth=32 --size=100% --runtime=15m --time_based --numjobs=1
>>> --bs=1m --overwrite=1 --filename=/dev/nvme2n1
>>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
>>> iodepth=32
>>> fio-2.3-26-g19dd
>>> Starting 1 process
>>> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1388MB/0KB /s] [0/1388/0
>>> iops] [eta 00m:00s]
>>> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=80377: Thu Jan 14
>>> 10:28:27 2016
>>> write: io=1231.9GB, bw=1401.6MB/s, iops=1401, runt=900008msec
>>>
>>> Test3: Loop around max device size and continue IO with fixed total
>>> IO, round total IO to bs: FAIL (does not loop around to start LBA)
>>> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>>> --iodepth=32 --size=100% --io_size=810000000000 --numjobs=1 --bs=1m
>>> --overwrite=1 --filename=/dev/nvme2n1
>>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
>>> iodepth=32
>>> fio-2.3-26-g19dd
>>> Starting 1 process
>>> Jobs: 1 (f=0): [W(1)] [98.9% done] [0KB/1453MB/0KB /s] [0/1453/0 iops]
>>> [eta 00m:06s]
>>> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=81065: Thu Jan 14
>>> 10:49:53 2016
>>> write: io=763097MB, bw=1399.5MB/s, iops=1399, runt=545278msec
>>>
>>> Test4: Loop around max device size and continue IO with fixed total
>>> IO, total IO is already aligned to bs: FAIL (does not loop around to
>>> start LBA)
>>> # ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>>> --iodepth=32 --size=100% --io_size=810675077120 --numjobs=1 --bs=1m
>>> --overwrite=1 --filename=/dev/nvme2n1
>>> SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
>>> iodepth=32
>>> fio-2.3-26-g19dd
>>> Starting 1 process
>>> Jobs: 1 (f=1): [W(1)] [98.6% done] [0KB/1404MB/0KB /s] [0/1404/0 iops]
>>> [eta 00m:08s]
>>> SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=81339: Thu Jan 14
>>> 11:01:57 2016
>>> write: io=763097MB, bw=1399.9MB/s, iops=1399, runt=545142msec
>>
>> Very strange, I can't reproduce 3/4 test case failures. I've got a small
>> nvme0n1 device here:
>>
>> axboe@dell:~ $ sudo blockdev --getsize64 /dev/nvme0n1
>> 4286578688
>>
>> and I tried:
>>
>> ./fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write
>> --iodepth=1 --size=100% --io_size=4386578688 --numjobs=1 --bs=1m
>> --overwrite=1 --filename=/dev/nvme0n1
>>
>> and 4300000000 for sizes, and it produces the desired outcome in writing
>> the bytes specified by io_size. I added a quick dump to the offset, and
>> that looks correct too:
>>
>> [...]
>> off=4281335808
>> off=4282384384
>> off=4283432960
>> off=4284481536
>> off=4285530112
>> off=0
>> off=1048576
>> off=2097152
>> off=3145728
>> [...]
>>
>> which loops around and starts writing from 0 again.
>>
>> I'll try a bigger device, though it'd be weird if that behaved
>> differently.
>
> It does reproduce with a bigger device, just not on the 4G/16G ones I
> have. Very odd, but at least I can reproduce it.
It's not an absolute size, but rather one where we run into an issue if
the device size isn't a multiple of the block size we use. This is a
common case for real devices. Can you try current -git again? Hopefully
it should be fixed.
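To make the block-size point concrete, here is a small Python sketch using the two device sizes quoted above. It is not fio's code: the helper names `tail_bytes` and `sequential_offsets`, and the wrap rule (restart at offset 0 once a full block no longer fits), are assumptions made purely for illustration.

```python
MIB = 1024 * 1024  # matches --bs=1m

# Device sizes reported by `blockdev --getsize64` in this thread.
SMALL_DEV = 4286578688    # /dev/nvme0n1, where the failure did not reproduce
LARGE_DEV = 800166076416  # /dev/nvme2n1, where Test3/Test4 failed


def tail_bytes(dev_size, bs=MIB):
    """Bytes left over after the last full block of size bs."""
    return dev_size % bs


def sequential_offsets(dev_size, bs, count, start=0):
    """Yield `count` sequential offsets, wrapping to 0 when a full block
    no longer fits before the end of the device (assumed rule, not fio's)."""
    off = start
    for _ in range(count):
        if off + bs > dev_size:
            off = 0
        yield off
        off += bs


if __name__ == "__main__":
    # The small device is an exact multiple of 1 MiB; the large one is not.
    print("small tail:", tail_bytes(SMALL_DEV))
    print("large tail:", tail_bytes(LARGE_DEV))
    # Offsets near the end of the small device, as in the dump above.
    for off in sequential_offsets(SMALL_DEV, MIB, 5, start=4283432960):
        print("off=%d" % off)
```

The small device divides evenly by 1 MiB, so the last block ends exactly at the device boundary. The large device leaves a partial block at the end, which is the case described above.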
--
Jens Axboe
* RE: time_based option broken
2016-01-14 21:46 ` Jens Axboe
@ 2016-01-14 23:04 ` Jeff Furlong
2016-01-15 15:41 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Jeff Furlong @ 2016-01-14 23:04 UTC (permalink / raw)
To: Jens Axboe, Jens Rosenboom, Sitsofe Wheeler; +Cc: fio@vger.kernel.org
Success!
Test1: Round io_size to match bs: PASS
# ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=1% --runtime=60s --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-27-g543e
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1422MB/0KB /s] [0/1422/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=89722: Thu Jan 14 14:39:48 2016
write: io=84664MB, bw=1410.1MB/s, iops=1410, runt= 60007msec
Test2: Loop around max device size and continue IO with fixed runtime: PASS
# ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --runtime=15m --time_based --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-27-g543e
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1429MB/0KB /s] [0/1429/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=89923: Thu Jan 14 14:55:41 2016
write: io=1227.2GB, bw=1397.2MB/s, iops=1397, runt=900010msec
Test3: Loop around max device size and continue IO with fixed total IO, round total IO to bs: PASS
# ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --io_size=810000000000 --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-27-g543e
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1377MB/0KB /s] [0/1377/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=89175: Thu Jan 14 14:24:35 2016
write: io=772477MB, bw=1399.1MB/s, iops=1399, runt=551791msec
Test4: Loop around max device size and continue IO with fixed total IO, total IO is already aligned to bs: PASS
# ./fio/fio --name=SW_1MB_QD32 --ioengine=libaio --direct=1 --rw=write --iodepth=32 --size=100% --io_size=810675077120 --numjobs=1 --bs=1m --overwrite=1 --filename=/dev/nvme2n1
SW_1MB_QD32: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
fio-2.3-27-g543e
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/1433MB/0KB /s] [0/1433/0 iops] [eta 00m:00s]
SW_1MB_QD32: (groupid=0, jobs=1): err= 0: pid=89446: Thu Jan 14 14:36:52 2016
write: io=773120MB, bw=1398.9MB/s, iops=1398, runt=552709msec
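As a sanity check on the Test3/Test4 totals, the Python sketch below compares the reported io= values against the requested --io_size. It assumes that fio's "MB" here means MiB (consistent with the 763097.8MB conversion quoted below) and that a job stops after the first 1 MiB block that reaches io_size. The helper name `blocks_to_cover` is hypothetical.

```python
MIB = 1024 * 1024  # matches --bs=1m

DEV_SIZE = 800166076416         # blockdev --getsize64 /dev/nvme2n1
IO_SIZE_TEST3 = 810000000000    # not a multiple of 1 MiB
IO_SIZE_TEST4 = 810675077120    # exact multiple of 1 MiB


def blocks_to_cover(io_size, bs=MIB):
    """Number of whole bs-sized blocks needed to reach io_size (ceiling)."""
    return -(-io_size // bs)


if __name__ == "__main__":
    # Before the fix both tests reported io=763097MB, i.e. they stopped
    # at the last full block of the device instead of wrapping.
    print("full blocks on device:", DEV_SIZE // MIB)
    # After the fix the totals follow io_size instead.
    print("Test3 blocks:", blocks_to_cover(IO_SIZE_TEST3))
    print("Test4 blocks:", blocks_to_cover(IO_SIZE_TEST4))
```

Under those assumptions the numbers line up with the output above: 772477 and 773120 blocks for the fixed build, against 763097 full blocks on the device for the earlier failing runs.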
Regards,
Jeff
end of thread, other threads:[~2016-01-15 15:41 UTC | newest]
Thread overview: 11+ messages
2016-01-06 18:45 time_based option broken Jeff Furlong
2016-01-07 6:03 ` Sitsofe Wheeler
2016-01-08 14:03 ` Jens Rosenboom
2016-01-08 19:35 ` Jeff Furlong
2016-01-14 17:33 ` Jens Axboe
2016-01-14 19:51 ` Jeff Furlong
2016-01-14 20:10 ` Jens Axboe
2016-01-14 20:22 ` Jens Axboe
2016-01-14 21:46 ` Jens Axboe
2016-01-14 23:04 ` Jeff Furlong
2016-01-15 15:41 ` Jens Axboe