public inbox for linux-xfs@vger.kernel.org
From: Paul Slootman <paul@wurtel.net>
To: xfs@oss.sgi.com
Subject: XFS internal error XFS_WANT_CORRUPTED_GOTO
Date: Mon, 14 Aug 2006 16:17:31 +0200
Message-ID: <20060814141731.GA9098@wurtel.net>
In-Reply-To: <20060812091451.GA16661@wurtel.net>

On Sat 12 Aug 2006, Paul Slootman wrote:
> 
> I've now zapped that directory with xfs_db, and am running the (daily?!)
> xfs_repair at this moment. As the filesystem is 1.1TB, it takes a couple
> of hours :(

That showed the following message in phase 3 because of the xfs_db action:

    imap claims a free inode 261 is in use, correcting imap and clearing inode

and then in phase 4:

    entry "lost+found.x" at block 0 offset 584 in directory inode 256 references free inode 261
            clearing inode number in entry at offset 584...

and in phase 6:

    rebuilding directory inode 256

and phase 7:

    resetting inode 256 nlinks from 17 to 16

but nothing beyond that.


However, that night:

Aug 13 08:28:00 boes kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 874 of file fs/xfs/xfs_ialloc.c.  Caller 0xffffffff8803be2f
Aug 13 08:28:00 boes kernel: 
Aug 13 08:28:00 boes kernel: Call Trace: <ffffffff880366d6>{:xfs:xfs_dialloc+1958}
Aug 13 08:28:00 boes kernel:        <ffffffff8805d8e7>{:xfs:_xfs_buf_lookup_pages+711} <ffffffff88045858>{:xfs:xlog_state_get_iclog_space+56}
Aug 13 08:28:00 boes kernel:        <ffffffff8803be2f>{:xfs:xfs_ialloc+95} <ffffffff8805b83b>{:xfs:kmem_zone_alloc+91}
Aug 13 08:28:00 boes kernel:        <ffffffff88052116>{:xfs:xfs_dir_ialloc+134} <ffffffff88043913>{:xfs:xfs_log_reserve+195}
Aug 13 08:28:00 boes kernel:        <ffffffff8805867b>{:xfs:xfs_mkdir+923} <ffffffff88007f1b>{:xfs:xfs_acl_get_attr+91}
Aug 13 08:28:00 boes kernel:        <ffffffff880623a1>{:xfs:xfs_vn_mknod+465} <ffffffff80292ab0>{d_rehash+112}
Aug 13 08:28:00 boes kernel:        <ffffffff804a136f>{__mutex_unlock_slowpath+415} <ffffffff80287f9d>{real_lookup+157}
Aug 13 08:28:00 boes kernel:        <ffffffff8033fac1>{_atomic_dec_and_lock+65} <ffffffff80296544>{mntput_no_expire+36}
Aug 13 08:28:00 boes kernel:        <ffffffff80289138>{__link_path_walk+3576} <ffffffff80342cd1>{__up_read+33}
Aug 13 08:28:00 boes kernel:        <ffffffff8803a816>{:xfs:xfs_iunlock+102} <ffffffff880560aa>{:xfs:xfs_access+74}
Aug 13 08:28:00 boes kernel:        <ffffffff88062b44>{:xfs:xfs_vn_permission+20} <ffffffff80287c48>{permission+104}
Aug 13 08:28:00 boes kernel:        <ffffffff802883ea>{__link_path_walk+170} <ffffffff880560aa>{:xfs:xfs_access+74}
Aug 13 08:28:00 boes kernel:        <ffffffff8028ab02>{vfs_mkdir+130} <ffffffff8028abf5>{sys_mkdirat+165}
Aug 13 08:28:00 boes kernel:        <ffffffff80209b5a>{system_call+126}
Aug 13 08:28:00 boes kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 874 of file fs/xfs/xfs_ialloc.c.  Caller 0xffffffff8803be2f
Aug 13 08:28:00 boes kernel: 
Aug 13 08:28:00 boes kernel: Call Trace: <ffffffff880366d6>{:xfs:xfs_dialloc+1958}
Aug 13 08:28:00 boes kernel:        <ffffffff80331a11>{__generic_unplug_device+33} <ffffffff80340aa0>{kobject_release+0}
Aug 13 08:28:00 boes kernel:        <ffffffff88045858>{:xfs:xlog_state_get_iclog_space+56}
Aug 13 08:28:00 boes kernel:        <ffffffff8803be2f>{:xfs:xfs_ialloc+95} <ffffffff8805b83b>{:xfs:kmem_zone_alloc+91}
Aug 13 08:28:00 boes kernel:        <ffffffff88052116>{:xfs:xfs_dir_ialloc+134} <ffffffff88043913>{:xfs:xfs_log_reserve+195}
Aug 13 08:28:00 boes kernel:        <ffffffff8805867b>{:xfs:xfs_mkdir+923} <ffffffff88007f1b>{:xfs:xfs_acl_get_attr+91}
Aug 13 08:28:00 boes kernel:        <ffffffff880623a1>{:xfs:xfs_vn_mknod+465} <ffffffff80292ab0>{d_rehash+112}
Aug 13 08:28:00 boes kernel:        <ffffffff804a136f>{__mutex_unlock_slowpath+415} <ffffffff80287f9d>{real_lookup+157}
Aug 13 08:28:00 boes kernel:        <ffffffff8033fac1>{_atomic_dec_and_lock+65} <ffffffff80296544>{mntput_no_expire+36}
Aug 13 08:28:00 boes kernel:        <ffffffff80289138>{__link_path_walk+3576} <ffffffff80342cd1>{__up_read+33}
Aug 13 08:28:00 boes kernel:        <ffffffff8803a816>{:xfs:xfs_iunlock+102} <ffffffff880560aa>{:xfs:xfs_access+74}
Aug 13 08:28:00 boes kernel:        <ffffffff88062b44>{:xfs:xfs_vn_permission+20} <ffffffff80287c48>{permission+104}
Aug 13 08:28:00 boes kernel:        <ffffffff802883ea>{__link_path_walk+170} <ffffffff880560aa>{:xfs:xfs_access+74}
Aug 13 08:28:00 boes kernel:        <ffffffff8028ab02>{vfs_mkdir+130} <ffffffff8028abf5>{sys_mkdirat+165}
Aug 13 08:28:00 boes kernel:        <ffffffff80209b5a>{system_call+126}

Variations of this trace repeat a number of times, and then:

Aug 13 08:31:09 boes kernel: xfs_force_shutdown(md6,0x8) called from line 1151 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffff88065ba8
Aug 13 08:31:09 boes kernel: Filesystem "md6": Corruption of in-memory data detected.  Shutting down filesystem: md6
Aug 13 08:31:09 boes kernel: Please umount the filesystem, and rectify the problem(s)


The repair after this gave the following messages:

Phase 3: correcting nblocks for inode 3080162495, was 2034 - counted 4
Phase 7: resetting inode 256 nlinks from 17 to 16
         resetting inode 3080162495 nlinks from 1 to 10

That's all.

Needless to say, the night after that repair it all went pear-shaped again:

Aug 14 01:00:03 boes kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 874 of file fs/xfs/xfs_ialloc.c.  Caller 0xffffffff8803be2f
Aug 14 01:00:03 boes kernel: 
Aug 14 01:00:03 boes kernel: Call Trace: <ffffffff880366d6>{:xfs:xfs_dialloc+1958}
Aug 14 01:00:03 boes kernel:        <ffffffff8805d8e7>{:xfs:_xfs_buf_lookup_pages+711} <ffffffff88045858>{:xfs:xlog_state_get_iclog_space+56}
Aug 14 01:00:03 boes kernel:        <ffffffff8803be2f>{:xfs:xfs_ialloc+95} <ffffffff8805b83b>{:xfs:kmem_zone_alloc+91}
Aug 14 01:00:03 boes kernel:        <ffffffff88052116>{:xfs:xfs_dir_ialloc+134} <ffffffff88043913>{:xfs:xfs_log_reserve+195}
Aug 14 01:00:03 boes kernel:        <ffffffff8805867b>{:xfs:xfs_mkdir+923} <ffffffff88007f1b>{:xfs:xfs_acl_get_attr+91}
Aug 14 01:00:03 boes kernel:        <ffffffff880623a1>{:xfs:xfs_vn_mknod+465} <ffffffff80292ab0>{d_rehash+112}
Aug 14 01:00:03 boes kernel:        <ffffffff804a136f>{__mutex_unlock_slowpath+415} <ffffffff80287f9d>{real_lookup+157}
Aug 14 01:00:03 boes kernel:        <ffffffff8033fac1>{_atomic_dec_and_lock+65} <ffffffff80296544>{mntput_no_expire+36}
Aug 14 01:00:03 boes kernel:        <ffffffff80289138>{__link_path_walk+3576} <ffffffff80342cd1>{__up_read+33}
Aug 14 01:00:03 boes kernel:        <ffffffff8803a816>{:xfs:xfs_iunlock+102} <ffffffff880560aa>{:xfs:xfs_access+74}
Aug 14 01:00:03 boes kernel:        <ffffffff88062b44>{:xfs:xfs_vn_permission+20} <ffffffff80287c48>{permission+104}
Aug 14 01:00:03 boes kernel:        <ffffffff802883ea>{__link_path_walk+170} <ffffffff880560aa>{:xfs:xfs_access+74}
Aug 14 01:00:03 boes kernel:        <ffffffff8028ab02>{vfs_mkdir+130} <ffffffff8028abf5>{sys_mkdirat+165}
Aug 14 01:00:03 boes kernel:        <ffffffff80209b5a>{system_call+126}
Aug 14 01:00:03 boes kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 874 of file fs/xfs/xfs_ialloc.c.  Caller 0xffffffff8803be2f
Aug 14 01:00:03 boes kernel: 
Aug 14 01:00:03 boes kernel: Call Trace: <ffffffff880366d6>{:xfs:xfs_dialloc+1958}
Aug 14 01:00:03 boes kernel:        <ffffffff8803be2f>{:xfs:xfs_ialloc+95} <ffffffff8805b83b>{:xfs:kmem_zone_alloc+91}
Aug 14 01:00:03 boes kernel:        <ffffffff88052116>{:xfs:xfs_dir_ialloc+134} <ffffffff88043913>{:xfs:xfs_log_reserve+195}
Aug 14 01:00:03 boes kernel:        <ffffffff8805867b>{:xfs:xfs_mkdir+923} <ffffffff88007f1b>{:xfs:xfs_acl_get_attr+91}
Aug 14 01:00:04 boes kernel:        <ffffffff880623a1>{:xfs:xfs_vn_mknod+465} <ffffffff80292ab0>{d_rehash+112}
Aug 14 01:00:04 boes kernel:        <ffffffff804a136f>{__mutex_unlock_slowpath+415} <ffffffff80287f9d>{real_lookup+157}
Aug 14 01:00:04 boes kernel:        <ffffffff8033fac1>{_atomic_dec_and_lock+65} <ffffffff80296544>{mntput_no_expire+36}
Aug 14 01:00:04 boes kernel:        <ffffffff80289138>{__link_path_walk+3576} <ffffffff80342cd1>{__up_read+33}
Aug 14 01:00:04 boes kernel:        <ffffffff8805076c>{:xfs:xfs_trans_unlocked_item+44}
Aug 14 01:00:04 boes kernel:        <ffffffff880560aa>{:xfs:xfs_access+74} <ffffffff88062b44>{:xfs:xfs_vn_permission+20}
Aug 14 01:00:04 boes kernel:        <ffffffff80287c48>{permission+104} <ffffffff802883ea>{__link_path_walk+170}
Aug 14 01:00:04 boes kernel:        <ffffffff880560aa>{:xfs:xfs_access+74} <ffffffff8028ab02>{vfs_mkdir+130}
Aug 14 01:00:04 boes kernel:        <ffffffff8028abf5>{sys_mkdirat+165} <ffffffff80209b5a>{system_call+126}

etc.
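Since variations of the same trace repeat with only minor differences, it can help to reduce each dump to just its symbol names before comparing them. The following is a small triage sketch (my own hypothetical helper, not part of xfsprogs or the kernel) that extracts the function names from these 2.6-era x86_64 call-trace lines:

```python
import re

# Matches frames like {:xfs:xfs_dialloc+1958} or {d_rehash+112};
# the optional ":xfs:" module prefix is dropped, keeping the symbol name.
SYM_RE = re.compile(r"\{(?::xfs:)?(\w+)\+\d+\}")

def trace_symbols(lines):
    """Extract ordered symbol names from kernel call-trace log lines."""
    syms = []
    for line in lines:
        syms.extend(SYM_RE.findall(line))
    return syms

# Example: comparing two dumps symbol-by-symbol shows whether they are
# really the same path or differ only in stale stack residue.
trace = [
    "<ffffffff880366d6>{:xfs:xfs_dialloc+1958}",
    "<ffffffff8803be2f>{:xfs:xfs_ialloc+95} <ffffffff8805b83b>{:xfs:kmem_zone_alloc+91}",
]
print(trace_symbols(trace))
```

Comparing the reduced symbol lists of two traces quickly shows they share the xfs_dialloc -> xfs_ialloc -> xfs_dir_ialloc -> xfs_mkdir path and differ only in incidental frames.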


I had umounted and mounted the filesystem after that. I tried removing
a couple of junk directories at this point (probably a bad idea in retrospect),
and when I tried to umount the filesystem again in preparation for the repair,
the system stopped responding. The kernel was spewing these messages:

Aug 14 12:23:45 boes kernel: BUG: soft lockup detected on CPU#0!
Aug 14 12:23:45 boes kernel: 
Aug 14 12:23:45 boes kernel: Call Trace: <IRQ> <ffffffff802511a9>{softlockup_tick+233}
Aug 14 12:23:45 boes kernel:        <ffffffff802367e0>{update_process_times+80} <ffffffff802163e3>{smp_local_timer_interrupt+35}
Aug 14 12:23:45 boes kernel:        <ffffffff80216451>{smp_apic_timer_interrupt+65} <ffffffff8020a69a>{apic_timer_interrupt+98} <EOI>
Aug 14 12:23:45 boes kernel:        <ffffffff8803a578>{:xfs:xfs_iextract+264} <ffffffff80245591>{debug_mutex_add_waiter+161}
Aug 14 12:23:45 boes kernel:        <ffffffff8803e226>{:xfs:xfs_iflush_all+22} <ffffffff804a10df>{__mutex_lock_slowpath+767}
Aug 14 12:23:45 boes kernel:        <ffffffff804a10b4>{__mutex_lock_slowpath+724} <ffffffff8803e226>{:xfs:xfs_iflush_all+22}
Aug 14 12:23:45 boes kernel:        <ffffffff8804c733>{:xfs:xfs_unmountfs+19} <ffffffff8805368d>{:xfs:xfs_unmount+301}
Aug 14 12:23:45 boes kernel:        <ffffffff880659f8>{:xfs:vfs_unmount+40} <ffffffff88065342>{:xfs:xfs_fs_put_super+50}
Aug 14 12:23:45 boes kernel:        <ffffffff802805ff>{generic_shutdown_super+159} <ffffffff802811dd>{kill_block_super+45}
Aug 14 12:23:45 boes kernel:        <ffffffff8028048f>{deactivate_super+79} <ffffffff80296d79>{sys_umount+137}
Aug 14 12:23:45 boes kernel:        <ffffffff80342d82>{__up_write+34} <ffffffff8020a7ed>{error_exit+0}
Aug 14 12:23:45 boes kernel:        <ffffffff80209b5a>{system_call+126}
Aug 14 12:23:55 boes kernel: BUG: soft lockup detected on CPU#0!
Aug 14 12:23:55 boes kernel: 
Aug 14 12:23:55 boes kernel: Call Trace: <IRQ> <ffffffff802511a9>{softlockup_tick+233}
Aug 14 12:23:55 boes kernel:        <ffffffff802367e0>{update_process_times+80} <ffffffff802163e3>{smp_local_timer_interrupt+35}
Aug 14 12:23:55 boes kernel:        <ffffffff80216451>{smp_apic_timer_interrupt+65} <ffffffff8020a69a>{apic_timer_interrupt+98} <EOI>
Aug 14 12:23:56 boes kernel:        <ffffffff8803e226>{:xfs:xfs_iflush_all+22} <ffffffff80245591>{debug_mutex_add_waiter+161}
Aug 14 12:23:56 boes kernel:        <ffffffff804a10df>{__mutex_lock_slowpath+767} <ffffffff8803e261>{:xfs:xfs_iflush_all+81}
Aug 14 12:23:56 boes kernel:        <ffffffff804a13b8>{__mutex_unlock_slowpath+488} <ffffffff8803e261>{:xfs:xfs_iflush_all+81}
Aug 14 12:23:56 boes kernel:        <ffffffff8804c733>{:xfs:xfs_unmountfs+19} <ffffffff8805368d>{:xfs:xfs_unmount+301}
Aug 14 12:23:56 boes kernel:        <ffffffff880659f8>{:xfs:vfs_unmount+40} <ffffffff88065342>{:xfs:xfs_fs_put_super+50}
Aug 14 12:23:56 boes kernel:        <ffffffff802805ff>{generic_shutdown_super+159} <ffffffff802811dd>{kill_block_super+45}
Aug 14 12:23:56 boes kernel:        <ffffffff8028048f>{deactivate_super+79} <ffffffff80296d79>{sys_umount+137}
Aug 14 12:23:56 boes kernel:        <ffffffff80342d82>{__up_write+34} <ffffffff8020a7ed>{error_exit+0}
Aug 14 12:23:56 boes kernel:        <ffffffff80209b5a>{system_call+126}

Dumping the locks held via magic-sysreq showed:

Aug 14 12:26:46 boes kernel: #009:             [ffff81013020d488] {alloc_super}
Aug 14 12:26:46 boes kernel: .. held by:            umount:18733 [ffff810154498340, 117]
Aug 14 12:26:46 boes kernel: ... acquired at:               generic_shutdown_super+0x63/0x150
 


kernel: 2.6.17.7 x86_64
xfstools: 2.8.11 from CVS last week

I'm now running the "standard" debian xfs_repair (version 2.6.20) for kicks,
as the 2.8.11 version didn't really seem to help much. I'm now getting
plenty of these errors:

entry "img-050806-090_onlin_81895f.jpg" at block 4 offset 2752 in directory inode 1343503044 references free inode 2511243327
        clearing inode number in entry at offset 2752...
entry "img-050806-090_onlin_81895f.jpg" at block 4 offset 2704 in directory inode 2160247870 references free inode 2511243327
        clearing inode number in entry at offset 2704...
entry "xbase-clients" at block 1 offset 1248 in directory inode 2457926717 references free inode 2511243327
        clearing inode number in entry at offset 1248...
entry "img-050806-090_onlin_81895f.jpg" at block 5 offset 592 in directory inode 2508332587 references free inode 2511243327
        clearing inode number in entry at offset 592...

Phase 6:
rebuilding directory inode 256
rebuilding directory inode 1343503044
rebuilding directory inode 2508332587
rebuilding directory inode 2160247870
rebuilding directory inode 2457926717

Phase 7:
resetting inode 256 nlinks from 17 to 16
resetting inode 2457926717 nlinks from 12 to 2
resetting inode 3080162495 nlinks from 1 to 10

Note the recurring theme of "resetting inode 256 nlinks from 17 to 16".
It seems that xfs_repair 2.8.11 doesn't, in fact, reset the nlinks.
(Or it's the deletion and recreation of lost+found, as 256 is the root dir,
but that doesn't explain the other two inodes' nlink counts.)
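To spot inodes that keep getting "fixed" across successive runs, the repair output can be compared mechanically. A small sketch (my own hypothetical helper, not an xfsprogs tool) that pulls the nlink resets out of saved xfs_repair logs:

```python
import re

# Matches xfs_repair phase 7 lines of the form:
#   resetting inode 256 nlinks from 17 to 16
RESET_RE = re.compile(r"resetting inode (\d+) nlinks from (\d+) to (\d+)")

def nlink_resets(repair_output):
    """Return (inode, old_nlinks, new_nlinks) tuples from xfs_repair output."""
    return [tuple(map(int, m.groups()))
            for m in RESET_RE.finditer(repair_output)]

# Sample taken from the phase 7 output above:
log = """\
resetting inode 256 nlinks from 17 to 16
resetting inode 2457926717 nlinks from 12 to 2
resetting inode 3080162495 nlinks from 1 to 10
"""
print(nlink_resets(log))
```

Intersecting the inode lists from consecutive runs would show directly whether a reset actually stuck on disk, which is what the repeated inode 256 line calls into question.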

Help! :-(


Paul Slootman

