RFC: Data pattern buffer filling race condition fix

* RFC: Data pattern buffer filling race condition fix
@ 2010-11-06  9:35 Bart Van Assche
  2010-11-07 10:02 ` Bart Van Assche
  2010-11-07 11:43 ` Jens Axboe
  0 siblings, 2 replies; 7+ messages in thread
From: Bart Van Assche @ 2010-11-06  9:35 UTC (permalink / raw)
  To: fio@vger.kernel.org; +Cc: Jens Axboe, Radha Ramachandran

On multicore non-x86 CPUs fio has been observed to frequently report false
data verification failures with I/O engine libaio and I/O depths above one.
This is because of a race condition in the function fill_pattern(). The code
in that function only works correctly if all CPUs of a multicore system
observe store instructions in the order they were issued. That is the case for
multicore x86 systems but not for all other CPU families, such as the
POWER CPU family.
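
The ordering requirement described above can be illustrated with a minimal
sketch. It is not fio code: struct filled_buf, publish() and check() are
hypothetical names, and C11 release/acquire atomics stand in for explicit
barriers. The writer fills the buffer and only then publishes the length; a
reader that observes the new length is then guaranteed to observe the new
content as well.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical stand-in for the per-io_u buffer state; not fio's layout. */
struct filled_buf {
	unsigned char data[64];
	atomic_uint filled_len;		/* 0 means "nothing filled yet" */
};

/* Writer side: fill the buffer first, then publish the length. */
static void publish(struct filled_buf *fb, unsigned char pattern,
		    unsigned int len)
{
	assert(len <= sizeof(fb->data));
	memset(fb->data, pattern, len);
	/* Release store: the memset above cannot be reordered after it. */
	atomic_store_explicit(&fb->filled_len, len, memory_order_release);
}

/*
 * Reader side: returns 0 if fewer than len bytes have been published,
 * 1 if the published bytes match the pattern, -1 if stale data is seen.
 */
static int check(struct filled_buf *fb, unsigned char pattern,
		 unsigned int len)
{
	unsigned int i;

	/* Acquire load: the buffer reads below cannot be reordered before it. */
	if (atomic_load_explicit(&fb->filled_len, memory_order_acquire) < len)
		return 0;
	for (i = 0; i < len; i++)
		if (fb->data[i] != pattern)
			return -1;
	return 1;
}
```

With a plain store and a plain load of the length instead, a weakly ordered
CPU would be allowed to make the new length visible before the new buffer
content, which is the failure mode described above.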

As far as I can see this bug was introduced via commit
cbe8d7561cf6d81d741d87eb7940db2a111d2144 (July 14, 2010).

I'm posting this patch as an RFC since the fix is GCC-specific.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>

diff --git a/verify.c b/verify.c
index ea1a911..3826198 100644
--- a/verify.c
+++ b/verify.c
@@ -31,18 +31,27 @@ void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u
 		fill_random_buf(p, len);
 		break;
 	case 1:
+#ifdef __GNUC__
+		__sync_synchronize();
+#endif
 		if (io_u->buf_filled_len >= len) {
 			dprint(FD_VERIFY, "using already filled verify pattern b=0 len=%u\n", len);
 			return;
 		}
 		dprint(FD_VERIFY, "fill verify pattern b=0 len=%u\n", len);
 		memset(p, td->o.verify_pattern[0], len);
+#ifdef __GNUC__
+		__sync_synchronize();
+#endif
 		io_u->buf_filled_len = len;
 		break;
 	default: {
 		unsigned int i = 0, size = 0;
 		unsigned char *b = p;
 
+#ifdef __GNUC__
+		__sync_synchronize();
+#endif
 		if (io_u->buf_filled_len >= len) {
 			dprint(FD_VERIFY, "using already filled verify pattern b=%d len=%u\n",
 					td->o.verify_pattern_bytes, len);
@@ -58,6 +67,9 @@ void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u
 			memcpy(b+i, td->o.verify_pattern, size);
 			i += size;
 		}
+#ifdef __GNUC__
+		__sync_synchronize();
+#endif
 		io_u->buf_filled_len = len;
 		break;
 		}



* Re: RFC: Data pattern buffer filling race condition fix
  2010-11-06  9:35 RFC: Data pattern buffer filling race condition fix Bart Van Assche
@ 2010-11-07 10:02 ` Bart Van Assche
  2010-11-07 11:43 ` Jens Axboe
  1 sibling, 0 replies; 7+ messages in thread
From: Bart Van Assche @ 2010-11-07 10:02 UTC (permalink / raw)
  To: fio@vger.kernel.org; +Cc: Jens Axboe, Radha Ramachandran

On Saturday 06 November 2010 10:35:18 Bart Van Assche wrote:
> On multicore non-x86 CPUs fio has been observed to frequently report false
> data verification failures with I/O engine libaio and I/O depths above one.
> This is because of a race condition in the function fill_pattern(). The code
> in that function only works correctly if all CPUs of a multicore system
> observe store instructions in the order they were issued. That is the case for
> multicore x86 systems but not for all other CPU families, such as the
> POWER CPU family.

Note: more information about the x86 and PowerPC memory consistency models can be found here:
[1] Scott Owens, Susmit Sarkar, Peter Sewell, A Better x86 Memory Model: x86-TSO, Proceedings of the 22nd International Conference on Theorem Proving in Higher Order Logics, 2009, http://portal.acm.org/citation.cfm?id=1616107.
[2] Power Instruction Set Architecture Version 2.06 Revision B, July 23, 2010, http://www.power.org.

A quote from [2]:
<quote>
The order in which the processor performs storage accesses, the order in which
those accesses are performed with respect to another processor or mechanism,
and the order in which those accesses are performed in main storage may all be
different.
</quote>


* Re: RFC: Data pattern buffer filling race condition fix
  2010-11-06  9:35 RFC: Data pattern buffer filling race condition fix Bart Van Assche
  2010-11-07 10:02 ` Bart Van Assche
@ 2010-11-07 11:43 ` Jens Axboe
  2010-11-07 12:58   ` Bart Van Assche
  1 sibling, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2010-11-07 11:43 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: fio@vger.kernel.org, Radha Ramachandran

On 2010-11-06 10:35, Bart Van Assche wrote:
> On multicore non-x86 CPUs fio has been observed to frequently report false
> data verification failures with I/O engine libaio and I/O depths above one.
> This is because of a race condition in the function fill_pattern(). The code
> in that function only works correctly if all CPUs of a multicore system
> observe store instructions in the order they were issued. That is the case for
> multicore x86 systems but not for all other CPU families, such as the
> POWER CPU family.
> 
> As far as I can see this bug was introduced via commit
> cbe8d7561cf6d81d741d87eb7940db2a111d2144 (July 14, 2010).
> 
> I'm posting this patch as an RFC since the fix is GCC-specific.

ppc is notorious for its weaker memory ordering. I do have a ppc test
box, but haven't used it in a while. It used to expose race conditions
immediately that x86 would never trigger. So since you are pinpointing
that particular commit, you are convinced that this bug manifests itself
due to bad ordering between the filled buffer and the fill length?

> diff --git a/verify.c b/verify.c
> index ea1a911..3826198 100644
> --- a/verify.c
> +++ b/verify.c
> @@ -31,18 +31,27 @@ void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u
>  		fill_random_buf(p, len);
>  		break;
>  	case 1:
> +#ifdef __GNUC__
> +		__sync_synchronize();
> +#endif

You should be able to use the fio included write_barrier() and
read_barrier(), which are hooked to the architectures. Then you don't
need to use GNUC additions.
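
As an illustration of what "hooked to the architectures" can look like, here
is a minimal sketch of per-architecture barrier macros. The sketch_* names
are hypothetical and the instruction choices are assumptions made for this
example; fio's actual write_barrier() and read_barrier() definitions are not
shown in this thread and may differ. The fallback uses the
__sync_synchronize() builtin from the original patch.

```c
/*
 * Hypothetical barrier macros, selected per architecture at compile time.
 * The names and the instructions chosen here are assumptions for this
 * sketch, not copies of fio's definitions.
 */
#if defined(__x86_64__)
#define sketch_read_barrier()	__asm__ __volatile__("lfence" ::: "memory")
#define sketch_write_barrier()	__asm__ __volatile__("sfence" ::: "memory")
#elif defined(__powerpc__) || defined(__powerpc64__)
/* A full "sync" is stronger than strictly needed, but always sufficient. */
#define sketch_read_barrier()	__asm__ __volatile__("sync" ::: "memory")
#define sketch_write_barrier()	__asm__ __volatile__("sync" ::: "memory")
#else
/* Generic fallback: full barrier via the GCC builtin. */
#define sketch_read_barrier()	__sync_synchronize()
#define sketch_write_barrier()	__sync_synchronize()
#endif

/* Tiny smoke test: both macros must expand to valid statements. */
static int sketch_barrier_smoke_test(void)
{
	int x = 0;

	x = 1;
	sketch_write_barrier();
	sketch_read_barrier();
	return x;
}
```

Callers then use the macros without any compiler-specific #ifdef at the call
site, which is the point of the suggestion above.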

-- 
Jens Axboe



* Re: RFC: Data pattern buffer filling race condition fix
  2010-11-07 11:43 ` Jens Axboe
@ 2010-11-07 12:58   ` Bart Van Assche
  2010-11-08 13:03     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2010-11-07 12:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio@vger.kernel.org, Radha Ramachandran

On Sun, Nov 7, 2010 at 12:43 PM, Jens Axboe <jaxboe@fusionio.com> wrote:
>
> On 2010-11-06 10:35, Bart Van Assche wrote:
> > On multicore non-x86 CPUs fio has been observed to frequently report false
> > data verification failures with I/O engine libaio and I/O depths above one.
> > This is because of a race condition in the function fill_pattern(). The code
> > in that function only works correctly if all CPUs of a multicore system
> > observe store instructions in the order they were issued. That is the case for
> > multicore x86 systems but not for all other CPU families, such as the
> > POWER CPU family.
> >
> > As far as I can see this bug was introduced via commit
> > cbe8d7561cf6d81d741d87eb7940db2a111d2144 (July 14, 2010).
> >
> > I'm posting this patch as an RFC since the fix is GCC-specific.
>
> ppc is notorious for its weaker memory ordering. I do have a ppc test
> box, but haven't used it in a while. But it used to find bugs
> immediately for race conditions, that x86 would never trigger. So since
> you are pinpointing that particular commit, you are convinced that this
> bug manifests itself due to bad ordering between the filled buffer and
> the fill length?

I haven't done a full bisect, but fio version 1.41.6 (Jul 9, 2010) + a
backport of the PowerPC version of get_cpu_clock() (commit
5f39d8f797fcf01bd94b89ef7ed2bdb76deb2601 from August 10, 2010) was working
fine. And it is easy to see how reordered writes will cause commit
cbe8d7561cf6d81d741d87eb7940db2a111d2144 to make fio behave incorrectly.

> > diff --git a/verify.c b/verify.c
> > index ea1a911..3826198 100644
> > --- a/verify.c
> > +++ b/verify.c
> > @@ -31,18 +31,27 @@ void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u
> >               fill_random_buf(p, len);
> >               break;
> >       case 1:
> > +#ifdef __GNUC__
> > +             __sync_synchronize();
> > +#endif
>
> You should be able to use the fio included write_barrier() and
> read_barrier(), which are hooked to the architectures. Then you don't
> need to use GNUC additions.

This patch also works for me on PowerPC:

Signed-off-by: Bart Van Assche <bvanassche@acm.org>

diff --git a/verify.c b/verify.c
index ea1a911..c5b2fbe 100644
--- a/verify.c
+++ b/verify.c
@@ -31,18 +31,21 @@ void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u
 		fill_random_buf(p, len);
 		break;
 	case 1:
+		write_barrier();
 		if (io_u->buf_filled_len >= len) {
 			dprint(FD_VERIFY, "using already filled verify pattern b=0 len=%u\n", len);
 			return;
 		}
 		dprint(FD_VERIFY, "fill verify pattern b=0 len=%u\n", len);
 		memset(p, td->o.verify_pattern[0], len);
+		write_barrier();
 		io_u->buf_filled_len = len;
 		break;
 	default: {
 		unsigned int i = 0, size = 0;
 		unsigned char *b = p;

+		write_barrier();
 		if (io_u->buf_filled_len >= len) {
 			dprint(FD_VERIFY, "using already filled verify pattern b=%d len=%u\n",
 					td->o.verify_pattern_bytes, len);
@@ -58,6 +61,7 @@ void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u
 			memcpy(b+i, td->o.verify_pattern, size);
 			i += size;
 		}
+		write_barrier();
 		io_u->buf_filled_len = len;
 		break;
 		}



* Re: RFC: Data pattern buffer filling race condition fix
  2010-11-07 12:58   ` Bart Van Assche
@ 2010-11-08 13:03     ` Jens Axboe
  2010-11-09 11:53       ` Bart Van Assche
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2010-11-08 13:03 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: fio@vger.kernel.org, Radha Ramachandran

On 2010-11-07 13:58, Bart Van Assche wrote:
> On Sun, Nov 7, 2010 at 12:43 PM, Jens Axboe <jaxboe@fusionio.com> wrote:
>>
>> On 2010-11-06 10:35, Bart Van Assche wrote:
>>> On multicore non-x86 CPUs fio has been observed to frequently report false
>>> data verification failures with I/O engine libaio and I/O depths above one.
>>> This is because of a race condition in the function fill_pattern(). The code
>>> in that function only works correctly if all CPUs of a multicore system
>>> observe store instructions in the order they were issued. That is the case for
>>> multicore x86 systems but not for all other CPU families, such as the
>>> POWER CPU family.
>>>
>>> As far as I can see this bug was introduced via commit
>>> cbe8d7561cf6d81d741d87eb7940db2a111d2144 (July 14, 2010).
>>>
>>> I'm posting this patch as an RFC since the fix is GCC-specific.
>>
>> ppc is notorious for its weaker memory ordering. I do have a ppc test
>> box, but haven't used it in a while. But it used to find bugs
>> immediately for race conditions, that x86 would never trigger. So since
>> you are pinpointing that particular commit, you are convinced that this
>> bug manifests itself due to bad ordering between the filled buffer and
>> the fill length?
> 
> I haven't done a full bisect, but fio version 1.41.6 (Jul 9, 2010) + a
> backport of the PowerPC version of get_cpu_clock() (commit
> 5f39d8f797fcf01bd94b89ef7ed2bdb76deb2601 from August 10, 2010) was working
> fine. And it is easy to see how reordered writes will cause commit
> cbe8d7561cf6d81d741d87eb7940db2a111d2144 to make fio behave incorrectly.
> 
>>> diff --git a/verify.c b/verify.c
>>> index ea1a911..3826198 100644
>>> --- a/verify.c
>>> +++ b/verify.c
>>> @@ -31,18 +31,27 @@ void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u
>>>               fill_random_buf(p, len);
>>>               break;
>>>       case 1:
>>> +#ifdef __GNUC__
>>> +             __sync_synchronize();
>>> +#endif
>>
>> You should be able to use the fio included write_barrier() and
>> read_barrier(), which are hooked to the architectures. Then you don't
>> need to use GNUC additions.
> 
> This patch also works for me on PowerPC:
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> 
> diff --git a/verify.c b/verify.c
> index ea1a911..c5b2fbe 100644
> --- a/verify.c
> +++ b/verify.c
> @@ -31,18 +31,21 @@ void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u
>  		fill_random_buf(p, len);
>  		break;
>  	case 1:
> +		write_barrier();
>  		if (io_u->buf_filled_len >= len) {
>  			dprint(FD_VERIFY, "using already filled verify pattern b=0 len=%u\n", len);
>  			return;
>  		}
>  		dprint(FD_VERIFY, "fill verify pattern b=0 len=%u\n", len);
>  		memset(p, td->o.verify_pattern[0], len);
> +		write_barrier();
>  		io_u->buf_filled_len = len;
>  		break;

Forgive me, but I'm still a little confused. This second write_barrier()
is now protecting against the order of the fill and the length
assignment. IOW, if you see the new length, you are guaranteed to also
see the new content. This means that the first memory barrier should be
a read_barrier().

And ditto for the other case.

Can you verify whether that works as expected and send an updated patch?
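
The pairing described above can be sketched as follows. This is one possible
reading, not the patch itself: struct sketch_io_u, sketch_fill_single_byte()
and the sketch_* barrier macros are hypothetical, C11 fences stand in for
fio's read_barrier() and write_barrier(), and the read barrier is placed
after the length load, where it orders that load against later reads of the
buffer.

```c
#include <stdatomic.h>
#include <string.h>

/* Hypothetical stand-ins for fio's read_barrier() and write_barrier(). */
#define sketch_read_barrier()	atomic_thread_fence(memory_order_acquire)
#define sketch_write_barrier()	atomic_thread_fence(memory_order_release)

/* Reduced to the one field that matters for this sketch. */
struct sketch_io_u {
	unsigned int buf_filled_len;
};

/*
 * Single-byte pattern case. Returns 0 if the buffer was already filled
 * for at least len bytes, 1 if it was filled by this call.
 */
static int sketch_fill_single_byte(struct sketch_io_u *io_u, void *p,
				   unsigned char pattern, unsigned int len)
{
	unsigned int filled = io_u->buf_filled_len;

	/* Pairs with the write barrier below: length first, content after. */
	sketch_read_barrier();
	if (filled >= len)
		return 0;

	memset(p, pattern, len);
	/* Make the buffer content visible before the new length. */
	sketch_write_barrier();
	io_u->buf_filled_len = len;
	return 1;
}
```

The multi-byte pattern case would follow the same shape, with the memcpy()
loop in place of the memset().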

-- 
Jens Axboe



* Re: RFC: Data pattern buffer filling race condition fix
  2010-11-08 13:03     ` Jens Axboe
@ 2010-11-09 11:53       ` Bart Van Assche
  2010-11-09 13:16         ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2010-11-09 11:53 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio@vger.kernel.org, Radha Ramachandran

On Mon, Nov 8, 2010 at 2:03 PM, Jens Axboe <jaxboe@fusionio.com> wrote:
>
> On 2010-11-07 13:58, Bart Van Assche wrote:
> > On Sun, Nov 7, 2010 at 12:43 PM, Jens Axboe <jaxboe@fusionio.com> wrote:
> >>
> >> On 2010-11-06 10:35, Bart Van Assche wrote:
> >>> On multicore non-x86 CPUs fio has been observed to frequently report false
> >>> data verification failures with I/O engine libaio and I/O depths above one.
> >>> This is because of a race condition in the function fill_pattern(). The code
> >>> in that function only works correctly if all CPUs of a multicore system
> >>> observe store instructions in the order they were issued. That is the case for
> >>> multicore x86 systems but not for all other CPU families, such as the
> >>> POWER CPU family.
> >>>
> >>> [ ... ]
>
> Forgive me, but I'm still a little confused. This second write_barrier()
> is now protecting against the order of the fill and the length
> assignment. IOW, if you see the new length, you are guaranteed to also
> see the new content. This means that the first memory barrier should be
> a read_barrier().
>
> And ditto for the other case.
>
> Can you verify whether that works as expected and send an updated patch?

Hello Jens,

I'm afraid that I will have to do more testing and that I'll have to
make sure that I understand the entire fio code base before I can
develop and send a new patch - something I do not have the time for
now unfortunately. I ran into this issue on a 32-bit 2.6.34.7 kernel
while running a test on a local ext3 filesystem, something I will have
to analyze further before I can proceed:

$ valgrind ./fio --ioengine=libaio --overwrite=1 --verify=md5 --iodepth=10 --direct=1 --loops=10 --size=1MB --name=test --thread --numjobs=10 --group_reporting
==13318== Memcheck, a memory error detector
==13318== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==13318== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==13318== Command: ./fio --ioengine=libaio --overwrite=1 --verify=md5 --iodepth=10 --direct=1 --loops=10 --size=1MB --name=test --thread --numjobs=10 --group_reporting
==13318==
test: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=10
...
test: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=10
Starting 10 threads
verify: bad magic header 0, wanted f00baaef at file test.9.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.2.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.3.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.3.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.8.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.8.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.2.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.1.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.1.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.6.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.6.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.5.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.5.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.4.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.4.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.7.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.7.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.10.0 offset 0, length 0
verify: bad magic header 0, wanted f00baaef at file test.10.0 offset 4096, length 0
verify: bad magic header 0, wanted f00baaef at file test.9.0 offset 4096, length 0
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
fio: pid=13318, err=84/file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character

test: (groupid=0, jobs=10): err=84 (file:io_u.c:1346, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character): pid=13318
  read : io=81920 B, bw=576901 B/s, iops=704 , runt=   142msec
    slat (usec): min=26 , max=6642 , avg=40.86, stdev=21.59
    clat (msec): min=1 , max=99 , avg=41.51, stdev= 6.25
     lat (msec): min=1 , max=100 , avg=41.56, stdev= 6.25
  cpu          : usr=73.15%, sys=20.23%, ctx=598, majf=0, minf=1405
  IO depths    : 1=10.0%, 2=20.0%, 4=40.0%, 8=30.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=100/0/0, short=0/0/0

     lat (msec): 2=1.00%, 4=3.00%, 10=1.00%, 50=6.00%, 100=9.00%

Run status group 0 (all jobs):
   READ: io=80KB, aggrb=563KB/s, minb=576KB/s, maxb=576KB/s, mint=142msec, maxt=142msec

Disk stats (read/write):
  sda: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=-nan%
==13318==
==13318== HEAP SUMMARY:
==13318==     in use at exit: 18,102 bytes in 114 blocks
==13318==   total heap usage: 370 allocs, 256 frees, 849,743 bytes allocated
==13318==
==13318== LEAK SUMMARY:
==13318==    definitely lost: 2,401 bytes in 40 blocks
==13318==    indirectly lost: 15,680 bytes in 70 blocks
==13318==      possibly lost: 0 bytes in 0 blocks
==13318==    still reachable: 21 bytes in 4 blocks
==13318==         suppressed: 0 bytes in 0 blocks
==13318== Rerun with --leak-check=full to see details of leaked memory
==13318==
==13318== For counts of detected and suppressed errors, rerun with: -v
==13318== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 5 from 5)

Bart.



* Re: RFC: Data pattern buffer filling race condition fix
  2010-11-09 11:53       ` Bart Van Assche
@ 2010-11-09 13:16         ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2010-11-09 13:16 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: fio@vger.kernel.org, Radha Ramachandran

On 2010-11-09 12:53, Bart Van Assche wrote:
> On Mon, Nov 8, 2010 at 2:03 PM, Jens Axboe <jaxboe@fusionio.com> wrote:
>>
>> On 2010-11-07 13:58, Bart Van Assche wrote:
>>> On Sun, Nov 7, 2010 at 12:43 PM, Jens Axboe <jaxboe@fusionio.com> wrote:
>>>>
>>>> On 2010-11-06 10:35, Bart Van Assche wrote:
>>>>> On multicore non-x86 CPUs fio has been observed to frequently report false
>>>>> data verification failures with I/O engine libaio and I/O depths above one.
>>>>> This is because of a race condition in the function fill_pattern(). The code
>>>>> in that function only works correctly if all CPUs of a multicore system
>>>>> observe store instructions in the order they were issued. That is the case for
>>>>> multicore x86 systems but not for all other CPU families, such as the
>>>>> POWER CPU family.
>>>>>
>>>>> [ ... ]
>>
>> Forgive me, but I'm still a little confused. This second write_barrier()
>> is now protecting against the order of the fill and the length
>> assignment. IOW, if you see the new length, you are guaranteed to also
>> see the new content. This means that the first memory barrier should be
>> a read_barrier().
>>
>> And ditto for the other case.
>>
>> Can you verify whether that works as expected and send an updated patch?
> 
> Hello Jens,
> 
> I'm afraid that I will have to do more testing and that I'll have to
> make sure that I understand the entire fio code base before I can
> develop and send a new patch - something I do not have the time for
> now unfortunately. I ran into this issue on a 32-bit 2.6.34.7 kernel
> while running a test on a local ext3 filesystem, something I will have
> to analyze further before I can proceed:
> 
> $ valgrind ./fio --ioengine=libaio --overwrite=1 --verify=md5 --iodepth=10 --direct=1 --loops=10 --size=1MB --name=test --thread --numjobs=10 --group_reporting
> ==13318== Memcheck, a memory error detector
> ==13318== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
> ==13318== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
> ==13318== Command: ./fio --ioengine=libaio --overwrite=1 --verify=md5 --iodepth=10 --direct=1 --loops=10 --size=1MB --name=test --thread --numjobs=10 --group_reporting
> ==13318==
> test: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=10
> ...
> test: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=10

This looks pretty straightforward - the file is created, but not filled
with a verifiable pattern. You want to run the workload with rw=write
at least once first; then you can use a read-only verify workload later
if you want.

-- 
Jens Axboe

