* Spurious verification mismatch?
@ 2014-08-05 15:22 Sitsofe Wheeler
  2014-08-07 21:27 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Sitsofe Wheeler @ 2014-08-05 15:22 UTC
  To: fio

When trying to use fio from git
ae7e055f755e77dfa71ae9040250ce8eec238721 (commit dated July 25th) the
following line always fails at the same point:

./fio --randseed=1 --ioengine=libaio --iodepth=1 --bs=64k
--rw=randwrite --verify=meta --verify_backlog=50 --filename
/dev/shm/foo --size=8M --time_based --runtime=1m --name=go

go: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=1
fio-2.1.11-10-gae7e
Starting 1 process
meta: verify failed at file /dev/shm/foo offset 6291456, length 65536
fio: pid=55393, err=84/file:io_u.c:1806, func=io_u_queued_complete,
error=Invalid or incomplete multibyte or wide character
go: (groupid=0, jobs=1): err=84 (file:io_u.c:1806,
func=io_u_queued_complete, error=Invalid or incomplete multibyte or
wide character): pid=55393: Tue Aug  5 15:56:56 2014
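
Editor's note: err=84 in the output above is errno EILSEQ on Linux, which
fio uses to report a verify mismatch. The "Invalid or incomplete multibyte
or wide character" text is only the C library's message for that number and
has nothing to do with character encodings here. Below is a minimal Python
sketch that pulls the offset and length out of the "verify failed" line and
works out which block failed. The helper name parse_verify_failure and the
regular expression are assumptions made for illustration, not part of fio,
and the line format is inferred from the single sample above.

```python
import re

# Pattern assumed from the one sample line in this thread; other fio
# versions may format the message differently.
VERIFY_RE = re.compile(
    r"verify failed at file (?P<file>\S+) "
    r"offset (?P<offset>\d+), length (?P<length>\d+)"
)


def parse_verify_failure(line):
    """Return (filename, offset, length, block_index), or None if no match."""
    m = VERIFY_RE.search(line)
    if m is None:
        return None
    offset = int(m.group("offset"))
    length = int(m.group("length"))
    # With a fixed block size, offset // length is the failing block's index.
    return m.group("file"), offset, length, offset // length


sample = "meta: verify failed at file /dev/shm/foo offset 6291456, length 65536"
print(parse_verify_failure(sample))
```

For the sample line this gives block index 96, out of the 128 blocks of 64k
that make up the 8M file.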

Replacing --time_based --runtime=1m with --loops=4 doesn't result in
the same error.

Is this fio job malformed?

-- 
Sitsofe | http://sucs.org/~sits/


* Re: Spurious verification mismatch?
  2014-08-05 15:22 Spurious verification mismatch? Sitsofe Wheeler
@ 2014-08-07 21:27 ` Jens Axboe
  2014-08-12  4:56   ` Sitsofe Wheeler
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2014-08-07 21:27 UTC
  To: Sitsofe Wheeler, fio

On 08/05/2014 09:22 AM, Sitsofe Wheeler wrote:
> When trying to use fio from git
> ae7e055f755e77dfa71ae9040250ce8eec238721 (commit dated July 25th) the
> following line always fails at the same point:
> 
> ./fio --randseed=1 --ioengine=libaio --iodepth=1 --bs=64k
> --rw=randwrite --verify=meta --verify_backlog=50 --filename
> /dev/shm/foo --size=8M --time_based --runtime=1m --name=go
> 
> go: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=1
> fio-2.1.11-10-gae7e
> Starting 1 process
> meta: verify failed at file /dev/shm/foo offset 6291456, length 65536
> fio: pid=55393, err=84/file:io_u.c:1806, func=io_u_queued_complete,
> error=Invalid or incomplete multibyte or wide character
> go: (groupid=0, jobs=1): err=84 (file:io_u.c:1806,
> func=io_u_queued_complete, error=Invalid or incomplete multibyte or
> wide character): pid=55393: Tue Aug  5 15:56:56 2014
> 
> Replacing --time_based --runtime=1m with --loops=4 doesn't result in
> the same error.
> 
> Is this fio job malformed?

No, the job is fine, it's yet another case where the numberio
verification for meta fails miserably. Try current git, I committed a
one liner for it.


-- 
Jens Axboe




* Re: Spurious verification mismatch?
  2014-08-07 21:27 ` Jens Axboe
@ 2014-08-12  4:56   ` Sitsofe Wheeler
  2014-08-13 20:15     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Sitsofe Wheeler @ 2014-08-12  4:56 UTC
  To: Jens Axboe; +Cc: fio

(Resending with list cc)

On 7 August 2014 22:27, Jens Axboe <axboe@kernel.dk> wrote:
> On 08/05/2014 09:22 AM, Sitsofe Wheeler wrote:
>> When trying to use fio from git
>> ae7e055f755e77dfa71ae9040250ce8eec238721 (commit dated July 25th) the
>> following line always fails at the same point:
>>
>> ./fio --randseed=1 --ioengine=libaio --iodepth=1 --bs=64k
>> --rw=randwrite --verify=meta --verify_backlog=50 --filename
>> /dev/shm/foo --size=8M --time_based --runtime=1m --name=go
>>
>> Replacing --time_based --runtime=1m with --loops=4 doesn't result in
>> the same error.
>>
>> Is this fio job malformed?
>
> No, the job is fine, it's yet another case where the numberio
> verification for meta fails miserably. Try current git, I committed a
> one liner for it.

Yup that fixes it. I've got one more mismatch query although with a
different job file:

fio  --thread --direct=1 --ioengine=libaio --iodepth=128 --bs=64k
--rw=randwrite --verify=xxhash --verify_backlog=50 --filename
/dev/sdaw --name=go --time_based --runtime=2m --size=1M

I've got a case where this generates a verification error before the job
completes. However, on most of the other disks I have tried (virtual
disks, RAM disks etc.) this job runs without issues. Once again,
replacing --time_based --runtime with --loops makes the issue go
away. Further, switching the I/O scheduler for this disk from cfq to
noop also makes the problem go away (I've disabled merging by echoing
1 into nomerges)...

Could it be this job submits different I/O to the same position in the
same batch? If so isn't the result undefined? Why would using noop
solve the issue?
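
Editor's note: some counting on the overlap question. With --size=1M and
--bs=64k there are only 16 distinct block offsets, while --iodepth=128
allows up to 128 I/Os in flight. The sketch below does nothing more than
that pigeonhole arithmetic; it says nothing about whether fio actually
keeps more than 16 writes queued for this job. The function name is
hypothetical.

```python
def min_overlapping_ios(size_bytes, block_bytes, in_flight):
    """Return (slots, forced_overlaps) for a region split into fixed blocks.

    Pure pigeonhole counting: with `slots` distinct block offsets, at most
    `slots` in-flight I/Os can target pairwise distinct offsets, so any
    beyond that must land on an offset that is already targeted.
    """
    slots = size_bytes // block_bytes
    return slots, max(0, in_flight - slots)


# Values taken from the command line above: 1M region, 64k blocks, depth 128.
slots, forced = min_overlapping_ios(1 << 20, 64 << 10, 128)
print(slots, forced)
```

So if the queue ever did hold all 128 writes at once, at least 112 of them
would target an offset that another queued write already covers.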

-- 
Sitsofe | http://sucs.org/~sits/



* Re: Spurious verification mismatch?
  2014-08-12  4:56   ` Sitsofe Wheeler
@ 2014-08-13 20:15     ` Jens Axboe
  2014-08-28 22:03       ` Sitsofe Wheeler
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2014-08-13 20:15 UTC
  To: Sitsofe Wheeler; +Cc: fio

On 08/11/2014 10:56 PM, Sitsofe Wheeler wrote:
> (Resending with list cc)
> 
> On 7 August 2014 22:27, Jens Axboe <axboe@kernel.dk> wrote:
>> On 08/05/2014 09:22 AM, Sitsofe Wheeler wrote:
>>> When trying to use fio from git
>>> ae7e055f755e77dfa71ae9040250ce8eec238721 (commit dated July 25th) the
>>> following line always fails at the same point:
>>>
>>> ./fio --randseed=1 --ioengine=libaio --iodepth=1 --bs=64k
>>> --rw=randwrite --verify=meta --verify_backlog=50 --filename
>>> /dev/shm/foo --size=8M --time_based --runtime=1m --name=go
>>>
>>> Replacing --time_based --runtime=1m with --loops=4 doesn't result in
>>> the same error.
>>>
>>> Is this fio job malformed?
>>
>> No, the job is fine, it's yet another case where the numberio
>> verification for meta fails miserably. Try current git, I committed a
>> one liner for it.
> 
> Yup that fixes it. I've got one more mismatch query although with a
> different job file:
> 
> fio  --thread --direct=1 --ioengine=libaio --iodepth=128 --bs=64k
> --rw=randwrite --verify=xxhash --verify_backlog=50 --filename
> /dev/sdaw --name=go --time_based --runtime=2m --size=1M
> 
> I've got a case where this generates a verification error before the job
> completes. However, on most of the other disks I have tried (virtual
> disks, RAM disks etc.) this job runs without issues. Once again,
> replacing --time_based --runtime with --loops makes the issue go
> away. Further, switching the I/O scheduler for this disk from cfq to
> noop also makes the problem go away (I've disabled merging by echoing
> 1 into nomerges)...
> 
> Could it be this job submits different I/O to the same position in the
> same batch? If so isn't the result undefined? Why would using noop
> solve the issue?

It's most likely a bug that is timing dependent, which is why certain
changes to the IO path below fio will make a difference. I'll take a
look at this.

-- 
Jens Axboe




* Re: Spurious verification mismatch?
  2014-08-13 20:15     ` Jens Axboe
@ 2014-08-28 22:03       ` Sitsofe Wheeler
  2014-08-28 22:57         ` Elliott, Robert (Server Storage)
  0 siblings, 1 reply; 6+ messages in thread
From: Sitsofe Wheeler @ 2014-08-28 22:03 UTC
  To: Jens Axboe; +Cc: fio@vger.kernel.org

On 13 August 2014 21:15, Jens Axboe <axboe@kernel.dk> wrote:
> On 08/11/2014 10:56 PM, Sitsofe Wheeler wrote:
>>
>> Yup that fixes it. I've got one more mismatch query although with a
>> different job file:
>>
>> fio  --thread --direct=1 --ioengine=libaio --iodepth=128 --bs=64k
>> --rw=randwrite --verify=xxhash --verify_backlog=50 --filename
>> /dev/sdaw --name=go --time_based --runtime=2m --size=1M
>>
>> I've got a case where this generates a verification error before the job
>> completes. However, on most of the other disks I have tried (virtual
>> disks, RAM disks etc.) this job runs without issues. Once again,
>> replacing --time_based --runtime with --loops makes the issue go
>> away. Further, switching the I/O scheduler for this disk from cfq to
>> noop also makes the problem go away (I've disabled merging by echoing
>> 1 into nomerges)...
>>
>> Could it be this job submits different I/O to the same position in the
>> same batch? If so isn't the result undefined? Why would using noop
>> solve the issue?
>
> It's most likely a bug that is timing dependent, which is why certain
> changes to the IO path below fio will make a difference. I'll take a
> look at this.

I've got some logs from fio-2.1.11-11-gb7f5 and a blktrace from
3.14.2-200.fc20.x86_64 on
http://sucs.org/~sits/test/fio/fio-trace.tar.bz2 - does this help?

-- 
Sitsofe | http://sucs.org/~sits/



* RE: Spurious verification mismatch?
  2014-08-28 22:03       ` Sitsofe Wheeler
@ 2014-08-28 22:57         ` Elliott, Robert (Server Storage)
  0 siblings, 0 replies; 6+ messages in thread
From: Elliott, Robert (Server Storage) @ 2014-08-28 22:57 UTC
  To: Sitsofe Wheeler, Jens Axboe; +Cc: fio@vger.kernel.org

> > On 08/11/2014 10:56 PM, Sitsofe Wheeler wrote:
...
> >> (I've disabled merging by echoing 1 into nomerges)...

Note that:
  nomerges=1 just disables lengthier merge searches;
  nomerges=2 disables all of them.

blk-mq has had some bugs where that wasn't always honored;
use iostat -x to see if merges are happening (rrqm/s
and wrqm/s columns) during the run, and/or check the
Disk stats line when fio exits (merge=n/n).
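
Editor's note: the same check can be sketched without iostat. The second
and sixth fields of /sys/block/<dev>/stat are the cumulative read and
write merge counts (per the kernel's block stat documentation), so
sampling them before and after a run shows whether any merges happened.
The helper names and the sample numbers below are made up for
illustration; in the case discussed here the device would be sdaw.

```python
from pathlib import Path


def parse_merges(stat_text):
    """Return (read_merges, write_merges) from a /sys/block/<dev>/stat line."""
    fields = stat_text.split()
    # Field order: read I/Os, read merges, read sectors, read ticks,
    # write I/Os, write merges, ... (later fields are ignored here).
    return int(fields[1]), int(fields[5])


def sample_merges(dev):
    """Read the current merge counters for a block device such as 'sdaw'."""
    return parse_merges(Path(f"/sys/block/{dev}/stat").read_text())


def merges_during(before, after):
    """Difference between two (read_merges, write_merges) samples."""
    return after[0] - before[0], after[1] - before[1]


# Made-up sample lines standing in for two reads of the stat file.
before = parse_merges("100 0 800 50 200 3 1600 70 0 90 120")
after = parse_merges("100 0 800 50 900 3 46400 300 0 400 370")
print(merges_during(before, after))
```

On a real system, call sample_merges() before starting fio and again after
it exits; a non-zero difference means requests were merged despite the
nomerges setting.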


---
Rob Elliott    HP Server Storage





