* Spurious verification mismatch?
From: Sitsofe Wheeler @ 2014-08-05 15:22 UTC
To: fio
When trying to use fio from git
ae7e055f755e77dfa71ae9040250ce8eec238721 (commit dated July 25th) the
following line always fails at the same point:
./fio --randseed=1 --ioengine=libaio --iodepth=1 --bs=64k \
    --rw=randwrite --verify=meta --verify_backlog=50 \
    --filename /dev/shm/foo --size=8M --time_based --runtime=1m --name=go
go: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=1
fio-2.1.11-10-gae7e
Starting 1 process
meta: verify failed at file /dev/shm/foo offset 6291456, length 65536
fio: pid=55393, err=84/file:io_u.c:1806, func=io_u_queued_complete,
error=Invalid or incomplete multibyte or wide character
go: (groupid=0, jobs=1): err=84 (file:io_u.c:1806,
func=io_u_queued_complete, error=Invalid or incomplete multibyte or
wide character): pid=55393: Tue Aug 5 15:56:56 2014
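For reference, the failing offset in the output above can be decoded with plain shell arithmetic. This is only a sketch; the variable names are illustrative and the values are copied from the command line and the error message.

```shell
#!/bin/sh
# Values from the job above: --bs=64k --size=8M, failing offset 6291456
offset=6291456
bs=$((64 * 1024))
size=$((8 * 1024 * 1024))

# Zero-based index of the failing block and total number of blocks in the file
index=$((offset / bs))
total=$((size / bs))

echo "failing block index: $index of $total"
echo "offset in MiB: $((offset / 1024 / 1024))"
```

So the mismatch is reported on block 96 of 128, i.e. at the 6 MiB mark of the 8 MiB file.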
Replacing --time_based --runtime=1m with --loops=4 doesn't result in
the same error.
Is this fio job malformed?
--
Sitsofe | http://sucs.org/~sits/
* Re: Spurious verification mismatch?
From: Jens Axboe @ 2014-08-07 21:27 UTC
To: Sitsofe Wheeler, fio
On 08/05/2014 09:22 AM, Sitsofe Wheeler wrote:
> When trying to use fio from git
> ae7e055f755e77dfa71ae9040250ce8eec238721 (commit dated July 25th) the
> following line always fails at the same point:
>
> ./fio --randseed=1 --ioengine=libaio --iodepth=1 --bs=64k
> --rw=randwrite --verify=meta --verify_backlog=50 --filename
> /dev/shm/foo --size=8M --time_based --runtime=1m --name=go
>
> go: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=1
> fio-2.1.11-10-gae7e
> Starting 1 process
> meta: verify failed at file /dev/shm/foo offset 6291456, length 65536
> fio: pid=55393, err=84/file:io_u.c:1806, func=io_u_queued_complete,
> error=Invalid or incomplete multibyte or wide character
> go: (groupid=0, jobs=1): err=84 (file:io_u.c:1806,
> func=io_u_queued_complete, error=Invalid or incomplete multibyte or
> wide character): pid=55393: Tue Aug 5 15:56:56 2014
>
> Replacing --time_based --runtime=1m with --loops=4 doesn't result in
> the same error.
>
> Is this fio job malformed?
No, the job is fine, it's yet another case where the numberio
verification for meta fails miserably. Try current git, I committed a
one liner for it.
--
Jens Axboe
* Re: Spurious verification mismatch?
From: Sitsofe Wheeler @ 2014-08-12 4:56 UTC
To: Jens Axboe; +Cc: fio
(Resending with list cc)
On 7 August 2014 22:27, Jens Axboe <axboe@kernel.dk> wrote:
> On 08/05/2014 09:22 AM, Sitsofe Wheeler wrote:
>> When trying to use fio from git
>> ae7e055f755e77dfa71ae9040250ce8eec238721 (commit dated July 25th) the
>> following line always fails at the same point:
>>
>> ./fio --randseed=1 --ioengine=libaio --iodepth=1 --bs=64k
>> --rw=randwrite --verify=meta --verify_backlog=50 --filename
>> /dev/shm/foo --size=8M --time_based --runtime=1m --name=go
>>
>> Replacing --time_based --runtime=1m with --loops=4 doesn't result in
>> the same error.
>>
>> Is this fio job malformed?
>
> No, the job is fine, it's yet another case where the numberio
> verification for meta fails miserably. Try current git, I committed a
> one liner for it.
Yup, that fixes it. I've got one more mismatch query, though with a
different job file:
fio --thread --direct=1 --ioengine=libaio --iodepth=128 --bs=64k \
    --rw=randwrite --verify=xxhash --verify_backlog=50 \
    --filename /dev/sdaw --name=go --time_based --runtime=2m --size=1M
I've got a case where this generates a verification error before the job
completes. However, on most of the other disks I have tried (virtual
disks, RAM disks, etc.) this job runs without issues. Once again,
replacing --time_based --runtime with --loops makes the issue go
away. Further, switching the I/O scheduler for this disk from cfq to
noop also makes the problem go away (I've disabled merging by echoing
1 into nomerges)...
Could it be this job submits different I/O to the same position in the
same batch? If so isn't the result undefined? Why would using noop
solve the issue?
--
Sitsofe | http://sucs.org/~sits/
* Re: Spurious verification mismatch?
From: Jens Axboe @ 2014-08-13 20:15 UTC
To: Sitsofe Wheeler; +Cc: fio
On 08/11/2014 10:56 PM, Sitsofe Wheeler wrote:
> (Resending with list cc)
>
> On 7 August 2014 22:27, Jens Axboe <axboe@kernel.dk> wrote:
>> On 08/05/2014 09:22 AM, Sitsofe Wheeler wrote:
>>> When trying to use fio from git
>>> ae7e055f755e77dfa71ae9040250ce8eec238721 (commit dated July 25th) the
>>> following line always fails at the same point:
>>>
>>> ./fio --randseed=1 --ioengine=libaio --iodepth=1 --bs=64k
>>> --rw=randwrite --verify=meta --verify_backlog=50 --filename
>>> /dev/shm/foo --size=8M --time_based --runtime=1m --name=go
>>>
>>> Replacing --time_based --runtime=1m with --loops=4 doesn't result in
>>> the same error.
>>>
>>> Is this fio job malformed?
>>
>> No, the job is fine, it's yet another case where the numberio
>> verification for meta fails miserably. Try current git, I committed a
>> one liner for it.
>
> Yup that fixes it. I've got one more mismatch query although with a
> different job file:
>
> fio --thread --direct=1 --ioengine=libaio --iodepth=128 --bs=64k
> --rw=randwrite --verify=xxhash --verify_backlog=50 --filename
> /dev/sdaw --name=go --time_based --runtime=2m --size=1M
>
> I've got a case where this generates a verification error before the job
> completes. However, on most of the other disks I have tried (virtual
> disks, RAM disks, etc.) this job runs without issues. Once again,
> replacing --time_based --runtime with --loops makes the issue go
> away. Further, switching the I/O scheduler for this disk from cfq to
> noop also makes the problem go away (I've disabled merging by echoing
> 1 into nomerges)...
>
> Could it be this job submits different I/O to the same position in the
> same batch? If so isn't the result undefined? Why would using noop
> solve the issue?
It's most likely a bug that is timing dependent, which is why certain
changes to the IO path below fio will make a difference. I'll take a
look at this.
--
Jens Axboe
* Re: Spurious verification mismatch?
From: Sitsofe Wheeler @ 2014-08-28 22:03 UTC
To: Jens Axboe; +Cc: fio@vger.kernel.org
On 13 August 2014 21:15, Jens Axboe <axboe@kernel.dk> wrote:
> On 08/11/2014 10:56 PM, Sitsofe Wheeler wrote:
>>
>> Yup that fixes it. I've got one more mismatch query although with a
>> different job file:
>>
>> fio --thread --direct=1 --ioengine=libaio --iodepth=128 --bs=64k
>> --rw=randwrite --verify=xxhash --verify_backlog=50 --filename
>> /dev/sdaw --name=go --time_based --runtime=2m --size=1M
>>
>> I've got a case where this generates a verification error before the job
>> completes. However, on most of the other disks I have tried (virtual
>> disks, RAM disks, etc.) this job runs without issues. Once again,
>> replacing --time_based --runtime with --loops makes the issue go
>> away. Further, switching the I/O scheduler for this disk from cfq to
>> noop also makes the problem go away (I've disabled merging by echoing
>> 1 into nomerges)...
>>
>> Could it be this job submits different I/O to the same position in the
>> same batch? If so isn't the result undefined? Why would using noop
>> solve the issue?
>
> It's most likely a bug that is timing dependent, which is why certain
> changes to the IO path below fio will make a difference. I'll take a
> look at this.
I've got some logs from fio-2.1.11-11-gb7f5 and a blktrace from
3.14.2-200.fc20.x86_64 on
http://sucs.org/~sits/test/fio/fio-trace.tar.bz2 - does this help?
--
Sitsofe | http://sucs.org/~sits/
* RE: Spurious verification mismatch?
From: Elliott, Robert (Server Storage) @ 2014-08-28 22:57 UTC
To: Sitsofe Wheeler, Jens Axboe; +Cc: fio@vger.kernel.org
> > On 08/11/2014 10:56 PM, Sitsofe Wheeler wrote:
...
> >> (I've disabled merging by echoing 1 into nomerges)...
Note that:
nomerges=1 just disables lengthier merge searches;
nomerges=2 disables all of them.
blk-mq has had some bugs where that wasn't always honored;
use iostat -x to see if merges are happening (rrqm/s
and wrqm/s columns) during the run, and/or check the
Disk stats line when fio exits (merge=n/n).
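As a complement to iostat, the merge counters can also be read straight from sysfs. The following is a rough sketch that assumes the usual Linux layout, where /sys/block/<dev>/queue/nomerges holds the setting and fields 2 and 6 of /sys/block/<dev>/stat are the read and write merge counts. The merge_counts helper name is made up for illustration, and sdaw is simply the device from the job quoted above.

```shell
#!/bin/sh
# Print "read_merges write_merges" from a block device stat file.
# In /sys/block/<dev>/stat, field 2 is reads merged and field 6 is writes merged.
merge_counts() {
    awk '{ print $2, $6 }' "$1"
}

dev=sdaw   # device from the job above; adjust as needed
stat_file="/sys/block/$dev/stat"
nomerges_file="/sys/block/$dev/queue/nomerges"

if [ -r "$stat_file" ]; then
    echo "nomerges: $(cat "$nomerges_file")"
    before=$(merge_counts "$stat_file")
    # ... run the fio job here ...
    after=$(merge_counts "$stat_file")
    echo "merges before (read write): $before"
    echo "merges after  (read write): $after"
fi

# To disable all merging (needs root): echo 2 > "$nomerges_file"
```

If the two pairs of counters differ after the run, merges still happened despite the nomerges setting.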
---
Rob Elliott HP Server Storage