From: Christian Schmid <webmaster@rapidforum.com>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: xfs@oss.sgi.com
Subject: Re: Critical xfs bug in 2.6.17.11?
Date: Mon, 11 Sep 2006 01:35:08 +0200
Message-ID: <4504A12C.9090608@rapidforum.com>
In-Reply-To: <Pine.LNX.4.64.0609101835390.29937@p34.internal.lan>
I am using www.linuxfromscratch.org. The kernel is vanilla 2.6.17.11 from kernel.org, xfsprogs is 2.7.3,
libc is 2.3.3. I was not doing anything special. In fact, it's a heavy-duty tmpfs with up to 300
write streams at once, plus reads and deletes. So basically a heavy stress test on an SMP system. Maybe a
race condition? Pure speculation on my part. But the memory is OK: a memory test with ECC disabled ran
for 12 hours without any errors. ECC is on now, of course, so the possibility of a simple
hardware problem is eliminated on my side.
Justin Piszcz wrote:
> ACK, scary, will wait for Nathan Scott's/other SGI members reply on this
> one...
>
> I have not had that happen to me yet. What were you doing that caused
> the problem? Is it repeatable? Have you checked the XFS FAQ for the FS
> fix for 2.6.17 -> 2.6.17.6? Just to check whether there are indeed any
> problems (basically an xfs check on your FS). If you do it, don't use
> Knoppix 5.0.2 (contains the 2.6.17 XFS corruption bug); use 4.0.2.
>
> Justin.
>
> On Mon, 11 Sep 2006, Christian Schmid wrote:
>
>> This file-system was created 2 days before with the same kernel.
>>
>> Justin Piszcz wrote:
>>
>>> I hope this is not a repeat of 2.6.17 -> 2.6.17.6..
>>>
>>> $ grep -i xfs ChangeLog-2.6.17.*
>>> ChangeLog-2.6.17.7: XFS: corruption fix
>>> ChangeLog-2.6.17.7: check in xfs_dir2_leafn_remove() fails every
>>> time and xfs_dir2_shrink_inode()
>>>
>>> It appears the only changes to the XFS code though went into 2.6.17.7
>>> so I am not sure what you are seeing there, had you fixed your
>>> filesystem from the 2.6.17 -> .17.6 bug?
>>>
>>> Justin.
>>>
>>> On Sun, 10 Sep 2006, Christian Schmid wrote:
>>>
>>>> Hello.
>>>>
>>>> Instead of a tmpfs, I use a raid 10 softraid. Unfortunately it
>>>> crashed after 10 hours of extreme activities (read/block-writes with
>>>> up to 250 streams/deletes)
>>>>
>>>> 12 gb memory-test successful. 2 cpu xeon smp system.
>>>>
>>>> Tell me if this helps you:
>>>>
>>>> Sep 9 18:08:49 inode430 kernel: [87433.143498] 0x0: 58 41 47 46 00 00 00 01 00 00 00 00 00 04 34 a0
>>>> Sep 9 18:08:49 inode430 kernel: [87433.143672] Filesystem "md5": XFS internal error xfs_alloc_read_agf at line 2176 of file fs/xfs/xfs_alloc.c. Caller 0xffffffff80314069
>>>> Sep 9 18:08:49 inode430 kernel: [87433.143904]
>>>> Sep 9 18:08:49 inode430 kernel: [87433.143905] Call Trace: <ffffffff8033c909>{xfs_corruption_error+244}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.143995] <ffffffff80346efa>{xfs_iext_insert+65} <ffffffff803588d1>{xfs_trans_read_buf+203}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.144353] <ffffffff803121d9>{xfs_alloc_read_agf+281} <ffffffff80314069>{xfs_alloc_fix_freelist+356}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.144628] <ffffffff80314069>{xfs_alloc_fix_freelist+356} <ffffffff80515a1f>{__down_read+18}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.144855] <ffffffff8031450f>{xfs_alloc_vextent+289} <ffffffff80323945>{xfs_bmapi+4061}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.145091] <ffffffff803215ba>{xfs_bmap_search_multi_extents+175}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.145226] <ffffffff80349aad>{xfs_iomap_write_allocate+675} <ffffffff80348c29>{xfs_iomap+701}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.145473] <ffffffff803797f1>{generic_make_request+515} <ffffffff803632f3>{xfs_map_blocks+67}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.145846] <ffffffff80363a09>{xfs_page_state_convert+722} <ffffffff80364344>{xfs_vm_writepage+179}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.146079] <ffffffff802986ed>{mpage_writepages+459} <ffffffff80364291>{xfs_vm_writepage+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.146330] <ffffffff802553c1>{do_writepages+41} <ffffffff80296de0>{__writeback_single_inode+559}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.146583] <ffffffff8022a82c>{default_wake_function+0} <ffffffff8022a82c>{default_wake_function+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.146847] <ffffffff80357f56>{xfs_trans_first_ail+28} <ffffffff802974be>{sync_sb_inodes+501}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.147230] <ffffffff802444b4>{keventd_create_kthread+0} <ffffffff802977cd>{writeback_inodes+144}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.147463] <ffffffff802551fc>{wb_kupdate+148} <ffffffff80255bfd>{pdflush+313}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.147825] <ffffffff80255168>{wb_kupdate+0} <ffffffff80255ac4>{pdflush+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.148142] <ffffffff80244479>{kthread+218} <ffffffff8020a992>{child_rip+8}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.148420] <ffffffff802444b4>{keventd_create_kthread+0} <ffffffff8024439f>{kthread+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.148775] <ffffffff8020a98a>{child_rip+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.149105] Filesystem "md5": XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c. Caller 0xffffffff80349bf8
>>>> Sep 9 18:08:49 inode430 kernel: [87433.149262]
>>>> Sep 9 18:08:49 inode430 kernel: [87433.149263] Call Trace: <ffffffff803574e9>{xfs_trans_cancel+111}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.149348] <ffffffff80349bf8>{xfs_iomap_write_allocate+1006} <ffffffff80348c29>{xfs_iomap+701}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.149568] <ffffffff803797f1>{generic_make_request+515} <ffffffff803632f3>{xfs_map_blocks+67}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.149847] <ffffffff80363a09>{xfs_page_state_convert+722} <ffffffff80364344>{xfs_vm_writepage+179}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.150169] <ffffffff802986ed>{mpage_writepages+459} <ffffffff80364291>{xfs_vm_writepage+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.150435] <ffffffff802553c1>{do_writepages+41} <ffffffff80296de0>{__writeback_single_inode+559}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.150593] <ffffffff8022a82c>{default_wake_function+0} <ffffffff8022a82c>{default_wake_function+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.150807] <ffffffff80357f56>{xfs_trans_first_ail+28} <ffffffff802974be>{sync_sb_inodes+501}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.151042] <ffffffff802444b4>{keventd_create_kthread+0} <ffffffff802977cd>{writeback_inodes+144}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.151271] <ffffffff802551fc>{wb_kupdate+148} <ffffffff80255bfd>{pdflush+313}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.151439] <ffffffff80255168>{wb_kupdate+0} <ffffffff80255ac4>{pdflush+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.151680] <ffffffff80244479>{kthread+218} <ffffffff8020a992>{child_rip+8}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.151922] <ffffffff802444b4>{keventd_create_kthread+0} <ffffffff8024439f>{kthread+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.152086] <ffffffff8020a98a>{child_rip+0}
>>>> Sep 9 18:08:49 inode430 kernel: [87433.152489] xfs_force_shutdown(md5,0x8) called from line 1151 of file fs/xfs/xfs_trans.c. Return address = 0xffffffff80357507
>>>> Sep 9 18:08:49 inode430 kernel: [87433.168623] Filesystem "md5": Corruption of in-memory data detected. Shutting down filesystem: md5
>>>> Sep 9 18:08:49 inode430 kernel: [87433.168903] Please umount the filesystem, and rectify the problem(s)
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
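
For reference, the 16 bytes on the "0x0:" line of the quoted log can be decoded by hand. This is only a minimal sketch: it assumes those bytes are the start of the on-disk AGF header and that the header begins with four big-endian 32-bit fields (magic, version, sequence number, length). The variable names are illustrative, not taken from the kernel source.

```python
import struct

# The 16 bytes from the "0x0:" line of the quoted kernel log.
raw = bytes.fromhex("58 41 47 46 00 00 00 01 00 00 00 00 00 04 34 a0")

# Assumption: the AGF header starts with four big-endian 32-bit fields.
magic, version, seqno, length = struct.unpack(">4sIII", raw)

print(magic)    # b'XAGF'
print(version)  # 1
print(seqno)    # 0
print(length)   # 275616 (0x434a0)
```

Under that assumption, the magic and version in these first 16 bytes look sane, so if the dump is accurate, whatever xfs_alloc_read_agf objected to is presumably in a later field that the log does not show.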
Thread overview: 9+ messages
2006-09-10 13:37 Critical xfs bug in 2.6.17.11? Christian Schmid
2006-09-10 21:31 ` Justin Piszcz
2006-09-10 22:13 ` Christian Schmid
2006-09-10 22:37 ` Justin Piszcz
2006-09-10 23:35 ` Christian Schmid [this message]
2006-09-11 1:00 ` David Chinner
2006-09-11 12:31 ` Christian Schmid
2006-09-11 0:54 ` David Chinner
2006-09-11 12:29 ` Christian Schmid