From: Dmitry Monakhov <dmonakhov@openvz.org>
To: Theodore Ts'o <tytso@mit.edu>, EUNBONG SONG <eunb.song@samsung.com>
Cc: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-xfs@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: Nasty memory corrution v3.9-12555-g2dbd3ca
Date: Sat, 11 May 2013 13:17:38 +0400
Message-ID: <87txm96fkd.fsf@openvz.org>
In-Reply-To: <87y5bm53z3.fsf@openvz.org>
[-- Attachment #1: Type: text/plain, Size: 912 bytes --]
On Sat, 11 May 2013 12:13:20 +0400, Dmitry Monakhov <dmonakhov@openvz.org> wrote:
> On Fri, 10 May 2013 15:27:47 -0400, Theodore Ts'o <tytso@mit.edu> wrote:
> > Hmm, since you seem to be able to reproduce the problem reliably, any
> > chance you can try bisecting the problem? I've looked at the commits
> > that touch fs/jbd2 and nothing is jumping out at me.
> >
> > Also, how many CPUs do you have in your system, and what kind of storage
> > device were you using when you were running iozone (5400rpm HDD,
> > 7200RPM HDD, RAID array, SSD, etc.)?
OK, I've been able to reproduce the corruption on ext4.
So at this moment we have:
Slab corruption on XFS, testcase: xfstests/generic/013
Slab corruption on EXT4, testcase: xfstests/generic/299
In fact both test cases (069 and 299) are just stress tests,
so this is likely a regression in the mm layer. I am trying to bisect it now.
#Testcase: xfstests generic/299
#DMESG
[-- Attachment #2: xfstests-generic-299--ext4--v3.9-12555-g2dbd3ca --]
[-- Type: text/plain, Size: 5495 bytes --]
------------[ cut here ]------------
WARNING: at fs/ext4/inode.c:3223 ext4_ext_direct_IO+0x2cb/0x3c0()
Modules linked in: cpufreq_ondemand usb_storage acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
CPU: 3 PID: 30537 Comm: fio Not tainted 3.9.0+ #14
Hardware name: /DQ67SW, BIOS SWQ6710H.86A.0052.2011.0520.1802 05/20/2011
ffffffff81e44ed6 ffff8801259ffa28 ffffffff818b20ff ffff8801259ffa68
ffffffff81060a87 ffff8801259ffa58 ffff880205a921d0 ffff880233192740
ffff880176496400 0000000000000000 ffffffffffffffe4 ffff8801259ffa78
Call Trace:
[<ffffffff818b20ff>] dump_stack+0x19/0x22
[<ffffffff81060a87>] warn_slowpath_common+0x87/0xb0
[<ffffffff81060aca>] warn_slowpath_null+0x1a/0x20
[<ffffffff813131eb>] ext4_ext_direct_IO+0x2cb/0x3c0
[<ffffffff813161c0>] ? ext4_get_block_write_nolock+0x20/0x20
[<ffffffff813132e0>] ? ext4_ext_direct_IO+0x3c0/0x3c0
[<ffffffff8131ac9f>] ext4_direct_IO+0x22f/0x3c0
[<ffffffff8118db75>] generic_file_direct_write+0x175/0x240
[<ffffffff81191bc6>] __generic_file_aio_write+0x556/0x770
[<ffffffff8130d2de>] ext4_file_dio_write+0x35e/0x4f0
[<ffffffff812929b3>] ? aio_rw_vect_retry+0xc3/0x250
[<ffffffff8130d5ae>] ext4_file_write+0x13e/0x190
[<ffffffff8130d470>] ? ext4_file_dio_write+0x4f0/0x4f0
[<ffffffff812929e3>] aio_rw_vect_retry+0xf3/0x250
[<ffffffff811c7f53>] ? might_fault+0x73/0xe0
[<ffffffff8130d470>] ? ext4_file_dio_write+0x4f0/0x4f0
[<ffffffff812949da>] aio_run_iocb+0x25a/0x3b0
[<ffffffff81294e26>] io_submit_one+0x2f6/0x3a0
[<ffffffff8129512e>] do_io_submit+0x25e/0x2f0
[<ffffffff812926b2>] ? lookup_ioctx+0xc2/0x100
[<ffffffff812951d0>] SyS_io_submit+0x10/0x20
[<ffffffff818c4ac2>] system_call_fastpath+0x16/0x1b
---[ end trace c96126e84d56efc2 ]---
------------[ cut here ]------------
WARNING: at fs/ext4/inode.c:3223 ext4_ext_direct_IO+0x2cb/0x3c0()
Modules linked in: cpufreq_ondemand usb_storage acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
CPU: 1 PID: 30539 Comm: fio Tainted: G W 3.9.0+ #14
Hardware name: /DQ67SW, BIOS SWQ6710H.86A.0052.2011.0520.1802 05/20/2011
ffffffff81e44ed6 ffff880131209a28 ffffffff818b20ff ffff880131209a68
ffffffff81060a87 ffff880131209a58 ffff8801f5347990 ffff88022b9b7e00
ffff88023342c508 0000000000000000 ffffffffffffffe4 ffff880131209a78
Call Trace:
[<ffffffff818b20ff>] dump_stack+0x19/0x22
[<ffffffff81060a87>] warn_slowpath_common+0x87/0xb0
[<ffffffff81060aca>] warn_slowpath_null+0x1a/0x20
[<ffffffff813131eb>] ext4_ext_direct_IO+0x2cb/0x3c0
[<ffffffff813161c0>] ? ext4_get_block_write_nolock+0x20/0x20
[<ffffffff813132e0>] ? ext4_ext_direct_IO+0x3c0/0x3c0
[<ffffffff8131ac9f>] ext4_direct_IO+0x22f/0x3c0
[<ffffffff8118db75>] generic_file_direct_write+0x175/0x240
[<ffffffff81191bc6>] __generic_file_aio_write+0x556/0x770
[<ffffffff8130d2de>] ext4_file_dio_write+0x35e/0x4f0
[<ffffffff812929b3>] ? aio_rw_vect_retry+0xc3/0x250
[<ffffffff8130d5ae>] ext4_file_write+0x13e/0x190
[<ffffffff8130d470>] ? ext4_file_dio_write+0x4f0/0x4f0
[<ffffffff812929e3>] aio_rw_vect_retry+0xf3/0x250
[<ffffffff811c7f53>] ? might_fault+0x73/0xe0
[<ffffffff8130d470>] ? ext4_file_dio_write+0x4f0/0x4f0
[<ffffffff812949da>] aio_run_iocb+0x25a/0x3b0
[<ffffffff81294e26>] io_submit_one+0x2f6/0x3a0
[<ffffffff8129512e>] do_io_submit+0x25e/0x2f0
[<ffffffff812926b2>] ? lookup_ioctx+0xc2/0x100
[<ffffffff812951d0>] SyS_io_submit+0x10/0x20
[<ffffffff818c4ac2>] system_call_fastpath+0x16/0x1b
---[ end trace c96126e84d56efc3 ]---
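A note on reading the two traces above: each frame has the form [<address>] symbol+0xOFFSET/0xSIZE, and a leading "?" marks an address that was found on the stack but could not be confirmed as part of the call chain. The Python sketch below is not part of the original report, and the helper name parse_frame is hypothetical. It splits a frame into its parts and derives the start address of the function as address minus offset, which is one way to check that two frames refer to the same function.

```python
import re

# Matches frames such as "[<ffffffff813131eb>] ext4_ext_direct_IO+0x2cb/0x3c0".
# The optional "?" marks a frame the unwinder could not verify.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frame(line):
    """Return the parts of one call-trace frame, or None if it does not match."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    addr = int(match.group("addr"), 16)
    offset = int(match.group("offset"), 16)
    return {
        "symbol": match.group("symbol"),
        "reliable": match.group("unreliable") is None,
        "offset": offset,
        "size": int(match.group("size"), 16),
        # The function starts 'offset' bytes before the reported address.
        "start": addr - offset,
    }


warn_site = parse_frame("[<ffffffff813131eb>] ext4_ext_direct_IO+0x2cb/0x3c0")
stale = parse_frame("[<ffffffff813132e0>] ? ext4_ext_direct_IO+0x3c0/0x3c0")
print(hex(warn_site["start"]), hex(stale["start"]))
```

Both frames resolve to the same start address for ext4_ext_direct_IO. The "?" frame has an offset equal to the function size, so its address points just past the end of that function.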
Slab corruption (Tainted: G W ): ext4_io_end start=ffff88023342c508, len=64
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [<ffffffff8131fe9b>](ext4_release_io_end+0x12b/0x130)
030: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b a5 kkkkkkkkkkkkjkk.
Prev obj: start=ffff88023342c4b0, len=64
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [<ffffffff813200d3>](ext4_init_io_end+0x23/0x60)
000: 58 cf 42 33 02 88 ff ff c8 27 33 62 01 88 ff ff X.B3.....'3b....
010: d0 51 a4 30 02 88 ff ff 05 00 00 00 00 00 00 00 .Q.0............
Next obj: start=ffff88023342c560, len=64
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [<ffffffff813200d3>](ext4_init_io_end+0x23/0x60)
000: a8 d3 1e 27 02 88 ff ff a0 8d 47 2b 02 88 ff ff ...'......G+....
010: d0 51 a4 30 02 88 ff ff 05 00 00 00 00 00 00 00 .Q.0............
Slab corruption (Tainted: G W ): ext4_io_end start=ffff880176496400, len=64
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [<ffffffff8131fe9b>](ext4_release_io_end+0x12b/0x130)
030: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b a5 kkkkkkkkkkkkjkk.
Prev obj: start=ffff8801764963a8, len=64
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [<ffffffff813200d3>](ext4_init_io_end+0x23/0x60)
000: a8 63 49 76 01 88 ff ff a8 63 49 76 01 88 ff ff .cIv.....cIv....
010: 90 b9 79 2b 02 88 ff ff 00 00 00 00 00 00 00 00 ..y+............
Next obj: start=ffff880176496458, len=64
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [<ffffffff8131fe9b------------[ cut here ]------------
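A note on decoding the dumps above by hand: a freed object in these reports is expected to be filled with the poison byte 0x6b, with 0xa5 as its last byte (POISON_FREE and POISON_END in include/linux/poison.h). The Python sketch below is not part of the original report, and its helper names are hypothetical. It assumes 16 hex bytes per dump line, as in this log, and lists every byte that differs from the expected poison value, noting whether the difference is a single bit.

```python
POISON_FREE = 0x6B  # fill byte of a freed object
POISON_END = 0xA5   # last byte of a freed object
HEX_DIGITS = set("0123456789abcdef")


def parse_dump_line(line):
    """Split '030: 6b 6b ... kkkk' into (offset, list of byte values)."""
    offset_text, rest = line.split(":", 1)
    values = []
    for token in rest.split():
        # Stop at the ASCII column or after 16 bytes (assumed line width).
        if len(values) == 16 or len(token) != 2 or not set(token) <= HEX_DIGITS:
            break
        values.append(int(token, 16))
    return int(offset_text, 16), values


def find_poison_deviations(line, obj_len):
    """Return (offset, found, expected, single_bit) for each unexpected byte."""
    offset, values = parse_dump_line(line)
    found = []
    for index, value in enumerate(values):
        position = offset + index
        expected = POISON_END if position == obj_len - 1 else POISON_FREE
        diff = value ^ expected
        if diff:
            # Exactly one bit set means the byte differs by a single bit.
            single_bit = (diff & (diff - 1)) == 0
            found.append((position, value, expected, single_bit))
    return found


ext4_line = "030: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b a5 kkkkkkkkkkkkjkk."
for position, value, expected, single_bit in find_poison_deviations(ext4_line, 64):
    print(hex(position), hex(value), hex(expected), single_bit)
```

For both ext4_io_end dumps this reports one byte at offset 0x3c holding 0x6a instead of 0x6b, a one-bit difference. The xfs_efi_item dump shows the same 0x6a pattern at offset 0x70.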
[-- Attachment #3: Type: text/plain, Size: 1716 bytes --]
> XFS too...
> It is definitely not just an ext3/4 related issue.
> I ran xfstests on XFS and almost immediately got slab corruption.
> I am using the following HEAD: 2dbd3cac87250a0d44e07acc86c4224a08522709
>
> 2013-05-11 11:59:30 Slab corruption (Not tainted): xfs_efi_item start=ffff8802335063f0, len=400
> 2013-05-11 11:59:30 Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
> 2013-05-11 11:59:30 Last user: [<ffffffffa03ba36f>](xfs_efi_item_free+0x3f/0x50 [xfs])
> 2013-05-11 11:59:30 070: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk
> 2013-05-11 11:59:30 Single bit error detected. Probably bad RAM.
> 2013-05-11 11:59:30 Run memtest86+ or a similar memory test tool.
> 2013-05-11 11:59:30 Prev obj: start=ffff880233506248, len=400
> 2013-05-11 11:59:30 Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
> 2013-05-11 11:59:30 Last user: [<ffffffffa034d7bb>](kmem_zone_alloc+0xbb/0x190 [xfs])
> 2013-05-11 11:59:30 000: 48 62 50 33 02 88 ff ff 48 62 50 33 02 88 ff ff HbP3....HbP3....
> 2013-05-11 11:59:30 010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 2013-05-11 11:59:30 Next obj: start=ffff880233506598, len=400
> 2013-05-11 11:59:30 Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
> >
> > Thanks,
> >
> > - Ted
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html