public inbox for linux-ext4@vger.kernel.org
* fio test triggering bad data on ext4
@ 2010-06-18  8:07 Jens Axboe
  2010-06-18 14:02 ` Eric Sandeen
  2010-07-07 14:26 ` Eric Sandeen
  0 siblings, 2 replies; 13+ messages in thread
From: Jens Axboe @ 2010-06-18  8:07 UTC (permalink / raw)
  To: tytso, adilger; +Cc: linux-ext4

Hi,

I was writing a small fio job file to do writes and read verifies on a
device. It forks 32 processes, each writing randomly to 4 files with a
block size between 4k and 16k. When it has written 1024 of those blocks,
it'll verify the oldest 512 of them. Each block is checksummed at
512-byte intervals. It uses libaio and O_DIRECT.

It works on ext2 and btrfs. I haven't run it to completion there yet, but
both survive 15-20 minutes just fine. ext4 doesn't even last a full minute
before this triggers:

Bad verify header 0 at 10137600
fio: pid=9943, err=84/file:io_u.c:1212, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character

writers: (groupid=0, jobs=32): err=84 (file:io_u.c:1212, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character): pid=9943

which tells us that where we expected to find the correct verify magic
in the header, it was all zeroes. The job file used is below, and to
reproduce you want to use the latest fio (1.40) since some earlier
versions don't do verify_interval properly for non-pattern verifies. You
can get fio here:

http://brick.kernel.dk/snaps/fio-1.40.tar.gz

or from git at:

git://git.kernel.dk/fio.git

The kernel used is 2.6.35-rc3 and I ran this on a raid0 that had 8 SSD
drives.
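
To illustrate the failure mode described above (a verify header read back as all zeroes), here is a minimal Python sketch of per-interval verification. It is not fio's code: the header layout, the `MAGIC` value, and the function names `stamp` and `verify` are hypothetical, and `zlib.crc32` stands in for the crc32c checksum the job uses.

```python
import struct
import zlib

INTERVAL = 512                  # mirrors verify_interval=512 in the job file
MAGIC = 0xACCA                  # hypothetical magic value, not taken from fio
HEADER = struct.Struct("<II")   # assumed layout: magic, CRC of the payload


def stamp(block: bytes) -> bytes:
    """Write a header at the start of every INTERVAL-sized chunk."""
    assert len(block) % INTERVAL == 0, "sketch only handles whole intervals"
    out = bytearray(block)
    for off in range(0, len(out), INTERVAL):
        payload = bytes(out[off + HEADER.size:off + INTERVAL])
        # zlib.crc32 is plain CRC-32, used here in place of crc32c
        HEADER.pack_into(out, off, MAGIC, zlib.crc32(payload))
    return bytes(out)


def verify(block: bytes, base_offset: int = 0):
    """Return None if every chunk checks out, else a tuple
    (kind, magic_found, file_offset) for the first bad chunk."""
    for off in range(0, len(block), INTERVAL):
        magic, crc = HEADER.unpack_from(block, off)
        if magic != MAGIC:
            return ("bad header", magic, base_offset + off)
        payload = block[off + HEADER.size:off + INTERVAL]
        if zlib.crc32(payload) != crc:
            return ("bad checksum", magic, base_offset + off)
    return None


# A 4k block whose third 512-byte chunk comes back zeroed from disk.
good = stamp(bytes(range(256)) * 16)
zeroed = good[:1024] + bytes(INTERVAL) + good[1536:]
result = verify(zeroed, base_offset=10137600 - 1024)
print(result)
```

In this sketch a zeroed chunk fails the magic check before any checksum is compared, which corresponds to a "Bad verify header 0" report rather than a checksum mismatch.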

--- snip job file ---

[global]
direct=1
group_reporting=1
exitall
runtime=4h
time_based=1

# writers, will repeatedly randomly write and verify data
[writers]
rw=randwrite
bsrange=4k-16k
ioengine=libaio
iodepth=4
directory=/data
verify=crc32c
verify_backlog=1024
verify_backlog_batch=512
verify_interval=512
size=512m
nrfiles=4
filesize=64m-256m
numjobs=32
create_serialize=0

--- snip job file ---
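
As a quick cross-check of the job file against the description at the top of this mail (32 processes, 4 files each), the options can be read with Python's `configparser`. This is only an approximation of how fio parses job files, and the helper name `load_job` is hypothetical.

```python
import configparser

JOB = """\
[global]
direct=1
group_reporting=1
exitall
runtime=4h
time_based=1

# writers, will repeatedly randomly write and verify data
[writers]
rw=randwrite
bsrange=4k-16k
ioengine=libaio
iodepth=4
directory=/data
verify=crc32c
verify_backlog=1024
verify_backlog_batch=512
verify_interval=512
size=512m
nrfiles=4
filesize=64m-256m
numjobs=32
create_serialize=0
"""


def load_job(text: str, section: str = "writers") -> dict:
    """Merge [global] options into one job section (approximates fio)."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(text)
    merged = dict(parser["global"])   # valueless flags like exitall map to None
    merged.update(parser[section])
    return merged


opts = load_job(JOB)
processes = int(opts["numjobs"])
files = processes * int(opts["nrfiles"])
print(f"{processes} processes, {files} files, "
      f"verify {opts['verify_backlog_batch']} of every "
      f"{opts['verify_backlog']} blocks written")
```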

-- 
Jens Axboe


* Re: fio test triggering bad data on ext4
@ 2010-06-21  9:37 Frank Mehnert
  0 siblings, 0 replies; 13+ messages in thread
From: Frank Mehnert @ 2010-06-21  9:37 UTC (permalink / raw)
  To: linux-ext4


Hi,

I would like to add that we have a similar testcase which probably triggers
much faster than Jens's testcase, see here:

  https://bugzilla.kernel.org/show_bug.cgi?id=16165

We believe that this bug is responsible for data corruption of VirtualBox
disk images located on an ext4 file system. Please let me know how we can
help you debug this issue.

Kind regards,

Frank
-- 
Dr.-Ing. Frank Mehnert

Registered office:
Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Commercial register: Amtsgericht München, HRB 161028
Managing director: Jürgen Kunz


Thread overview: 13+ messages
2010-06-18  8:07 fio test triggering bad data on ext4 Jens Axboe
2010-06-18 14:02 ` Eric Sandeen
2010-06-18 14:59   ` Eric Sandeen
2010-06-18 15:13     ` Jens Axboe
2010-06-18 15:28       ` Eric Sandeen
2010-06-18 17:32         ` Jens Axboe
2010-06-18 18:04           ` Eric Sandeen
2010-06-18 18:14             ` Jens Axboe
2010-06-21 10:20               ` Jens Axboe
2010-06-18 17:36       ` Jens Axboe
2010-07-07 14:26 ` Eric Sandeen
2010-07-07 19:39   ` Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2010-06-21  9:37 Frank Mehnert
