From: Mark Tinguely <tinguely@sgi.com>
To: Brian Foster <bfoster@redhat.com>
Cc: "Carlos E. R." <carlos.e.r@opensuse.org>,
XFS mail list <xfs@oss.sgi.com>
Subject: Re: Got "Internal error XFS_WANT_CORRUPTED_GOTO". Filesystem needs reformatting to correct issue.
Date: Wed, 02 Jul 2014 08:07:51 -0500
Message-ID: <53B40427.6060302@sgi.com>
In-Reply-To: <20140702120441.GA51757@bfoster.bfoster>
On 07/02/14 07:04, Brian Foster wrote:
> On Wed, Jul 02, 2014 at 11:57:25AM +0200, Carlos E. R. wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>>
>> Hi,
>>
>> I got this error:
>>
>>
>> <0.6> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.186436] r8169 0000:06:00.0 eth0: link up
>> <0.6> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.615073] PM: restore of devices complete after 2735.034 msecs
>> <0.1> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626346] XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 1602 of file /home/abuild/rpmbuild/BUILD/kernel-desktop-3.11.10/linux-3.11/fs/xfs/xfs_alloc.c. Caller 0xffffffffa0c39fe9
>> <0.1> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626346]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626348] CPU: 0 PID: 28875 Comm: kworker/0:2 Tainted: P O 3.11.10-11-desktop #1
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626348] Hardware name: MICRO-STAR INTERNATIONAL CO.,LTD MS-7516/MS-7516, BIOS V1.5 10/10/2008
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626388] Workqueue: xfs-eofblocks/sde5 xfs_eofblocks_worker [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626390] 0000000000000002 ffffffff815a0252 00000000002a61c2 ffffffffa0c38996
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626391] ffff8800b7025680 ffff88022eb74180 ffff880121c3fe50 0000000000000002
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626393] 0000000000000000 0000000100000000 0000000000000000 0000000000000001
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626393] Call Trace:
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626403] [<ffffffff81004a28>] dump_trace+0x88/0x310
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626406] [<ffffffff81004d80>] show_stack_log_lvl+0xd0/0x1d0
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626408] [<ffffffff810061bc>] show_stack+0x1c/0x50
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626411] [<ffffffff815a0252>] dump_stack+0x50/0x89
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626425] [<ffffffffa0c38996>] xfs_free_ag_extent+0x226/0x860 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626468] [<ffffffffa0c39fe9>] xfs_free_extent+0xb9/0xf0 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626510] [<ffffffffa0c4c39e>] xfs_bmap_finish+0x11e/0x170 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626560] [<ffffffffa0c6b4c0>] xfs_itruncate_extents+0x190/0x340 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626623] [<ffffffffa0c33633>] xfs_free_eofblocks+0x1e3/0x260 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626659] [<ffffffffa0c291ef>] xfs_inode_free_eofblocks+0x6f/0x150 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626688] [<ffffffffa0c27f82>] xfs_inode_ag_walk.isra.10+0x1c2/0x310 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626716] [<ffffffffa0c28a8e>] xfs_inode_ag_iterator_tag+0x6e/0xb0 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626744] [<ffffffffa0c28d82>] xfs_eofblocks_worker+0x12/0x20 [xfs]
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626763] [<ffffffff8106ac78>] process_one_work+0x168/0x490
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626765] [<ffffffff8106b914>] worker_thread+0x114/0x3a0
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626768] [<ffffffff81071c3f>] kthread+0xaf/0xc0
>> <0.4> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626771] [<ffffffff815addfc>] ret_from_fork+0x7c/0xb0
>> <0.5> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.626776] XFS (sde5): xfs_do_force_shutdown(0x8) called from line 916 of file /home/abuild/rpmbuild/BUILD/kernel-desktop-3.11.10/linux-3.11/fs/xfs/xfs_bmap.c. Return address = 0xffffffffa0c4c3d8
>> <0.1> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.706440] XFS (sde5): Corruption of in-memory data detected. Shutting down filesystem
>> <0.1> 2014-06-29 12:32:18 Telcontar kernel - - - [212890.706440] XFS (sde5): Please umount the filesystem and rectify the problem(s)
>>
>
> This is the background eofblocks scanner attempting to free preallocated
> space on a file. The scanner looks for files that have been recently
> grown and since been flushed to disk (i.e., no longer concurrently being
> written to) and trims the post-eof preallocation that comes along with
> growing files.
>
> The corruption errors at xfs_alloc.c:1602,1629 on v3.11 fire if the
> extent we are attempting to free is already accounted for in the
> by-block allocation btree. IOW, this is attempting to free an extent
> that the allocation metadata thinks is already free.
>
>>
>> Brief description:
>>
>>
>> * It happens only on restore from hibernation.
>
> Interesting, could you elaborate a bit more on the behavior this system
> is typically subjected to? i.e., is this a server that sees a constant
> workload that is also frequently hibernated/awakened?
>
>> * It happens randomly, spaced a month or two.
>> * It happens always on the same partition, the one that holds /home
>> (I have 10 XFS partitions spread on 4 internal hard disks, and a few
>> more external). It is a new disk, 2 TB, traditional MBR partitions.
>> * Disk has no defects, or at least so says smartctl long test.
>> * When it happens, recovery is impossible: xfs_repair does not seem to
>> find anything, or maybe it does, silently; but on system reuse,
>> it crashes again, fast.
>> * Thus recovery procedure is to use "xfsdump" to get a backup copy,
>> reformat the partition, and recover the files with xfsrestore.
>>
>>
>> The worst issue for me is that "xfs_repair" fails to repair it.

What version of xfs_repair are you using? Did you try mounting the
filesystem to replay the log before running repair?

Besides Brian's good advice, is kdump configured to capture a vmcore?

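For reference, a rough sketch of the checks asked about above. It assumes
the affected filesystem is /dev/sde5 (taken from the log) and that
/mnt/test is an empty mount point; the mount point and the run() helper
are made up for illustration. By default it only prints the commands; set
RUN=1 as root to actually execute them.

```shell
#!/bin/sh
# Sketch only: DEV comes from the log above, MNT is a hypothetical mount point.
DEV=${DEV:-/dev/sde5}
MNT=${MNT:-/mnt/test}

# Print each command; execute it only when RUN=1 (requires root).
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

run xfs_repair -V                        # report the xfs_repair (xfsprogs) version
run mount "$DEV" "$MNT"                  # a normal mount replays the log
run umount "$MNT"                        # unmount cleanly before checking
run xfs_repair -n "$DEV"                 # no-modify pass: report problems only
run cat /sys/kernel/kexec_crash_loaded   # 1 means a crash (kdump) kernel is loaded
```

Running xfs_repair -n first keeps the check read-only, so its output can be
posted to the list before anything on disk is changed.
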
--Mark.