From: Chris Mason <chris.mason@fusionio.com>
To: David Sterba <dsterba@suse.cz>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	Josef Bacik <jbacik@fusionio.com>
Subject: Re: [bug] csum mismatches and failed xfstests with 3.8-rc1
Date: Fri, 4 Jan 2013 08:01:44 -0500
Message-ID: <20130104130144.GF14537@shiny>
In-Reply-To: <20130104125059.GE20089@twin.jikos.cz>

Thanks Dave,

On Fri, Jan 04, 2013 at 05:50:59AM -0700, David Sterba wrote:
> Hi,
> 
> I've noticed a few csum mismatch messages, and a few failed xfstests:
> 
> - 3.8.0-rc1
> - default mkfs options
> - MOUNT_OPTIONS -- -o space_cache,noatime,inode_cache
> - test device:    40G
> - scratch device: 10G

Josef, are the problems you see with 083 showing up on the scratch drive
or on the main disk?

> 
> 091:
> --- 091.out     2011-11-01 10:31:12.000000000 +0100
> +++ 091.out.bad 2013-01-03 21:07:29.000000000 +0100
> @@ -1,7 +1,45 @@
>  QA output created by 091
>  fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W
> +fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> +mapped writes DISABLED
> +truncating to largest ever: 0x12a00
> +truncating to largest ever: 0x75400
> +doread: read: Input/output error
> +LOG DUMP (35 total operations):
> +1(  1 mod 256): SKIPPED (no operation)
> +2(  2 mod 256): WRITE    0x62600 thru 0x6bdff  (0x9800 bytes) HOLE
> +3(  3 mod 256): FALLOC   0x2e0f2 thru 0x3134a  (0x3258 bytes) INTERIOR
> +4(  4 mod 256): TRUNCATE DOWN  from 0x6be00 to 0x12a00
> +5(  5 mod 256): READ     0x0 thru 0xdfff       (0xe000 bytes)
> +6(  6 mod 256): FALLOC   0x7048 thru 0x9f54    (0x2f0c bytes) INTERIOR
> +7(  7 mod 256): WRITE    0x5ea00 thru 0x6e7ff  (0xfe00 bytes) HOLE
> +8(  8 mod 256): READ     0x16000 thru 0x17fff  (0x2000 bytes)
> +9(  9 mod 256): FALLOC   0x4957f thru 0x5298e  (0x940f bytes) INTERIOR
> +10( 10 mod 256): SKIPPED (no operation)
> +11( 11 mod 256): WRITE    0x10a00 thru 0x173ff (0x6a00 bytes)
> +12( 12 mod 256): WRITE    0x53800 thru 0x5a7ff (0x7000 bytes)
> +13( 13 mod 256): WRITE    0x5ae00 thru 0x5afff (0x200 bytes)
> +14( 14 mod 256): READ     0x5d000 thru 0x66fff (0xa000 bytes)
> +15( 15 mod 256): SKIPPED (no operation)
> +16( 16 mod 256): READ     0x21000 thru 0x2bfff (0xb000 bytes)
> +17( 17 mod 256): SKIPPED (no operation)
> +18( 18 mod 256): READ     0x47000 thru 0x4ffff (0x9000 bytes)
> +19( 19 mod 256): WRITE    0x17600 thru 0x25bff (0xe600 bytes)
> +20( 20 mod 256): READ     0x3f000 thru 0x48fff (0xa000 bytes)
> +21( 21 mod 256): FALLOC   0xea89 thru 0x19800  (0xad77 bytes) INTERIOR
> +22( 22 mod 256): FALLOC   0x569aa thru 0x586ea (0x1d40 bytes) INTERIOR
> +23( 23 mod 256): WRITE    0x35c00 thru 0x453ff (0xf800 bytes)
> +24( 24 mod 256): SKIPPED (no operation)
> +25( 25 mod 256): SKIPPED (no operation)
> +26( 26 mod 256): READ     0x21000 thru 0x26fff (0x6000 bytes)
> +27( 27 mod 256): READ     0x5e000 thru 0x61fff (0x4000 bytes)
> +28( 28 mod 256): WRITE    0x6f600 thru 0x6f7ff (0x200 bytes) HOLE
> +29( 29 mod 256): READ     0x13000 thru 0x19fff (0x7000 bytes)
> +30( 30 mod 256): TRUNCATE UP   from 0x6f800 to 0x75400
> +31( 31 mod 256): READ     0x4000 thru 0xafff   (0x7000 bytes)
> +32( 32 mod 256): SKIPPED (no operation)
> +33( 33 mod 256): FALLOC   0x31d49 thru 0x3c520 (0xa7d7 bytes) INTERIOR
> +34( 34 mod 256): FALLOC   0x2bbb3 thru 0x37ad8 (0xbf25 bytes) INTERIOR
> +35( 35 mod 256): READ     0x68000 thru 0x73fff (0xc000 bytes)
> +Correct content saved for comparison
> +(maybe hexdump "/mnt/a1/junk" vs "/mnt/a1/junk.fsxgood")
> 
> I'm not quite sure if the messages match the test (best guess, neighbouring
> tests were fine):
> 
> [102885.667444] btrfs csum failed ino 638 off 425984 csum 1842675109 private 2279232751
> [102885.676804] btrfs csum failed ino 638 off 430080 csum 1842675109 private 1192041375
> [102885.686094] btrfs csum failed ino 638 off 434176 csum 2297282744 private 1619428542
> [102885.686103] btrfs csum failed ino 638 off 438272 csum 3709984297 private 2868627320
> [102885.686112] btrfs csum failed ino 638 off 442368 csum 1504116677 private 1239355148
> [102885.686121] btrfs csum failed ino 638 off 446464 csum 1957839041 private 3848200057
> [102885.686129] btrfs csum failed ino 638 off 450560 csum 3836729483 private 2867416946

I think fsx leaves the bad file behind. Can you check its inode number?
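
One way to cross-check the csum messages against the fsx failure is
sketched below. It assumes the kernel message format quoted above and
the file path fsx printed (/mnt/a1/junk). The helper names are
hypothetical and only illustrate the comparison. For reference, the
seven failing offsets (425984 through 450560) are 0x68000 through
0x6e000 in hex, which all lie inside the last fsx operation, READ
0x68000 thru 0x73fff.

```python
import os
import re

# Matches the kernel message format quoted above, e.g.
# "btrfs csum failed ino 638 off 425984 csum ... private ..."
CSUM_RE = re.compile(r"btrfs csum failed ino (\d+) off (\d+)")


def parse_csum_failures(log_text):
    """Return (inode, byte offset) pairs, one per csum failure line."""
    return [(int(m.group(1)), int(m.group(2)))
            for m in CSUM_RE.finditer(log_text)]


def offsets_in_range(failures, start, end):
    """True if every failing offset lies inside [start, end] (inclusive)."""
    return all(start <= off <= end for _, off in failures)


def inode_of(path):
    """Inode number of a file, the same value `ls -i` prints."""
    return os.stat(path).st_ino


if __name__ == "__main__":
    log = (
        "[102885.667444] btrfs csum failed ino 638 off 425984 "
        "csum 1842675109 private 2279232751\n"
        "[102885.686129] btrfs csum failed ino 638 off 450560 "
        "csum 3836729483 private 2867416946\n"
    )
    failures = parse_csum_failures(log)
    # Operation 35 in the fsx log: READ 0x68000 thru 0x73fff
    print(offsets_in_range(failures, 0x68000, 0x73FFF))
    # On the test machine, compare against the file fsx left behind:
    # print(inode_of("/mnt/a1/junk") == 638)
```

If the inode of the leftover file is 638, the messages and the fsx
failure most likely describe the same read.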

> 
> 
> 113:
> - it hung last evening and is still in that state, no disk or cpu activity,
>   there were only the tests running
> - no process is in D state, no btrfs kernel thread is active
> - the only interesting process is
> 
>   PID TTY      STAT   TIME COMMAND
> 15585 pts/0    Sl+    0:01 /root/xfstests/ltp/aio-stress -t 20 -s 10  -O -S -I	\
> 	1000 /mnt/a1/aiostress.15188.4 /mnt/a1/aiostress.15188.4.20		\
> 	/mnt/a1/aiostress.15188.4.19 /mnt/a1/aiostress.15188.4.18		\
> [<ffffffff810af447>] futex_wait_queue_me+0xc7/0x100
> [<ffffffff810affa1>] futex_wait+0x191/0x280
> [<ffffffff810b1cb6>] do_futex+0xd6/0xbd0
> [<ffffffff810b282b>] sys_futex+0x7b/0x180
> [<ffffffff8195fe99>] system_call_fastpath+0x16/0x1b
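
The "no process is in D state" check above can be repeated with a small
script. This is a minimal sketch, assuming a Linux /proc filesystem. The
function names are hypothetical. It looks only at each process's main
thread, so a multi-threaded program such as aio-stress would also need
its /proc/<pid>/task/ entries scanned.

```python
import os


def proc_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the last ')'.
    """
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def tasks_in_state(wanted="D", proc_root="/proc"):
    """Return PIDs whose main thread is currently in the given state."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as f:
                if proc_state(f.read()) == wanted:
                    pids.append(int(entry))
        except (OSError, ValueError):
            continue  # process exited or produced an unparsable line
    return pids


if __name__ == "__main__":
    # An empty list means no main thread is in uninterruptible sleep.
    print(tasks_in_state("D"))
```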

Hmmm, I wonder if something else in rc1 is causing this?

-chris


