linux-kernel.vger.kernel.org archive mirror
* [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
@ 2009-01-19 11:49 Jacek Luczak
  2009-01-19 18:44 ` Eric Sandeen
  0 siblings, 1 reply; 22+ messages in thread
From: Jacek Luczak @ 2009-01-19 11:49 UTC (permalink / raw)
  To: LKML; +Cc: hch, sandeen


Hi All,

I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
didn't help; the same error appeared.

Some info:
[1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
[2] kernel logs:
http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
[3] most interesting part of log below.

Regards,
-Jacek

----------- BUG START HERE -----------
Jan 19 11:18:32 difrost kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at
line 3327 of file fs/xfs/xfs_btree.c.  Caller 0xc01b047c
Jan 19 11:18:32 difrost kernel: Pid: 680, comm: mount Not tainted 2.6.29-rc2 #1
Jan 19 11:18:32 difrost kernel: Call Trace:
Jan 19 11:18:32 difrost kernel:  [<c01afef0>] xfs_btree_delrec+0x657/0xbc2
Jan 19 11:18:32 difrost kernel:  [<c01b047c>] xfs_btree_delete+0x21/0x66
Jan 19 11:18:32 difrost kernel:  [<c01ac0b0>] xfs_bmbt_init_key_from_rec+0xa/0x16
Jan 19 11:18:32 difrost kernel:  [<c01b047c>] xfs_btree_delete+0x21/0x66
Jan 19 11:18:32 difrost kernel:  [<c01a8de3>] xfs_bmap_del_extent+0x309/0x974
Jan 19 11:18:32 difrost kernel:  [<c01d8d33>] kmem_zone_alloc+0x53/0x90
Jan 19 11:18:32 difrost kernel:  [<c01a9b28>] xfs_bunmapi+0x59c/0x95d
Jan 19 11:18:32 difrost kernel:  [<c01c1ac1>] xfs_itruncate_finish+0x1c7/0x2f3
Jan 19 11:18:32 difrost kernel:  [<c01d742b>] xfs_inactive+0x1d2/0x3ce
Jan 19 11:18:32 difrost kernel:  [<c01c23b5>] xfs_imap_to_bp+0x5d/0xcb
Jan 19 11:18:32 difrost kernel:  [<c0171bc4>] clear_inode+0x6c/0xb8
Jan 19 11:18:32 difrost kernel:  [<c01720f1>] generic_delete_inode+0x72/0xcc
Jan 19 11:18:32 difrost kernel:  [<c0171703>] iput+0x48/0x4a
Jan 19 11:18:32 difrost kernel:  [<c01cba97>]
xlog_recover_process_one_iunlink+0xb0/0xda
Jan 19 11:18:32 difrost kernel:  [<c01cbb38>]
xlog_recover_process_iunlinks+0x77/0xd8
Jan 19 11:18:32 difrost kernel:  [<c01cbbd8>] xlog_recover_finish+0x3f/0x8d
Jan 19 11:18:32 difrost kernel:  [<c01cff74>] xfs_mountfs+0x44e/0x54b
Jan 19 11:18:32 difrost kernel:  [<c01d8e3a>] kmem_alloc+0x57/0xa8
Jan 19 11:18:32 difrost kernel:  [<c01d06e9>] xfs_mru_cache_create+0xe6/0x11c
Jan 19 11:18:32 difrost kernel:  [<c01e1616>] xfs_fs_fill_super+0x182/0x2d8
Jan 19 11:18:32 difrost kernel:  [<c0165a80>] get_sb_bdev+0xe8/0x130
Jan 19 11:18:32 difrost kernel:  [<c01750e6>] alloc_vfsmnt+0x69/0xf5
Jan 19 11:18:32 difrost kernel:  [<c01dfe39>] xfs_fs_get_sb+0x12/0x16
Jan 19 11:18:32 difrost kernel:  [<c01e1494>] xfs_fs_fill_super+0x0/0x2d8
Jan 19 11:18:32 difrost kernel:  [<c0164cca>] vfs_kern_mount+0x39/0x72
Jan 19 11:18:32 difrost kernel:  [<c0164d41>] do_kern_mount+0x2f/0xb4
Jan 19 11:18:32 difrost kernel:  [<c0175c6a>] do_mount+0x632/0x66d
Jan 19 11:18:32 difrost kernel:  [<c0175d14>] sys_mount+0x6f/0xaf
Jan 19 11:18:32 difrost kernel:  [<c0102d05>] sysenter_do_call+0x12/0x25
Jan 19 11:18:32 difrost kernel: Filesystem "sda5": XFS internal error
xfs_trans_cancel at line 1164 of file fs/xfs/xfs_trans.c.  Caller 0xc01d7444
Jan 19 11:18:32 difrost kernel:
Jan 19 11:18:32 difrost kernel: Pid: 680, comm: mount Not tainted 2.6.29-rc2 #1
Jan 19 11:18:32 difrost kernel: Call Trace:
Jan 19 11:18:32 difrost kernel:  [<c01d2255>] xfs_trans_cancel+0x49/0xcf
Jan 19 11:18:32 difrost kernel:  [<c01d7444>] xfs_inactive+0x1eb/0x3ce
Jan 19 11:18:32 difrost kernel:  [<c01d7444>] xfs_inactive+0x1eb/0x3ce
Jan 19 11:18:32 difrost kernel:  [<c01c23b5>] xfs_imap_to_bp+0x5d/0xcb
Jan 19 11:18:32 difrost kernel:  [<c0171bc4>] clear_inode+0x6c/0xb8
Jan 19 11:18:32 difrost kernel:  [<c01720f1>] generic_delete_inode+0x72/0xcc
Jan 19 11:18:32 difrost kernel:  [<c0171703>] iput+0x48/0x4a
Jan 19 11:18:32 difrost kernel:  [<c01cba97>]
xlog_recover_process_one_iunlink+0xb0/0xda
Jan 19 11:18:32 difrost kernel:  [<c01cbb38>]
xlog_recover_process_iunlinks+0x77/0xd8
Jan 19 11:18:32 difrost kernel:  [<c01cbbd8>] xlog_recover_finish+0x3f/0x8d
Jan 19 11:18:32 difrost kernel:  [<c01cff74>] xfs_mountfs+0x44e/0x54b
Jan 19 11:18:32 difrost kernel:  [<c01d8e3a>] kmem_alloc+0x57/0xa8
Jan 19 11:18:32 difrost kernel:  [<c01d06e9>] xfs_mru_cache_create+0xe6/0x11c
Jan 19 11:18:32 difrost kernel:  [<c01e1616>] xfs_fs_fill_super+0x182/0x2d8
Jan 19 11:18:32 difrost kernel:  [<c0165a80>] get_sb_bdev+0xe8/0x130
Jan 19 11:18:32 difrost kernel:  [<c01750e6>] alloc_vfsmnt+0x69/0xf5
Jan 19 11:18:32 difrost kernel:  [<c01dfe39>] xfs_fs_get_sb+0x12/0x16
Jan 19 11:18:32 difrost kernel:  [<c01e1494>] xfs_fs_fill_super+0x0/0x2d8
Jan 19 11:18:32 difrost kernel:  [<c0164cca>] vfs_kern_mount+0x39/0x72
Jan 19 11:18:32 difrost kernel:  [<c0164d41>] do_kern_mount+0x2f/0xb4
Jan 19 11:18:32 difrost kernel:  [<c0175c6a>] do_mount+0x632/0x66d
Jan 19 11:18:32 difrost kernel:  [<c0175d14>] sys_mount+0x6f/0xaf
Jan 19 11:18:32 difrost kernel:  [<c0102d05>] sysenter_do_call+0x12/0x25
Jan 19 11:18:32 difrost kernel: Filesystem "sda5": Corruption of in-memory data
detected.  Shutting down filesystem: sda5
Jan 19 11:18:32 difrost kernel: Please umount the filesystem, and rectify the
problem(s)
Jan 19 11:18:32 difrost kernel: BUG: unable to handle kernel NULL pointer
dereference at 0000005c
Jan 19 11:18:32 difrost kernel: IP: [<c01cbb51>]
xlog_recover_process_iunlinks+0x90/0xd8
Jan 19 11:18:32 difrost kernel: *pde = 00000000
Jan 19 11:18:32 difrost kernel: Oops: 0000 [#1] SMP
Jan 19 11:18:32 difrost kernel: last sysfs file:
/sys/devices/platform/i8042/modalias
Jan 19 11:18:32 difrost kernel: Modules linked in: psmouse arc4 ecb cryptomgr
aead crypto_blkcipher crypto_hash crypto_algapi iwl3945 rfkill mac80211 lib80211
cfg80211 sky2 sg
Jan 19 11:18:32 difrost kernel:
Jan 19 11:18:32 difrost kernel: Pid: 680, comm: mount Not tainted (2.6.29-rc2
#1) AMILO Pro Edition V3505
Jan 19 11:18:32 difrost kernel: EIP: 0060:[<c01cbb51>] EFLAGS: 00010286 CPU: 0
Jan 19 11:18:32 difrost kernel: EIP is at xlog_recover_process_iunlinks+0x90/0xd8
Jan 19 11:18:32 difrost kernel: EAX: 00000000 EBX: f6a71e40 ECX: 00000005 EDX:
f695fe20
Jan 19 11:18:32 difrost kernel: ESI: ffffffff EDI: f6825400 EBP: 00000026 ESP:
f695fe14
Jan 19 11:18:32 difrost kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Jan 19 11:18:32 difrost kernel: Process mount (pid: 680, ti=f695e000
task=f6aa5980 task.ti=f695e000)
Jan 19 11:18:32 difrost kernel: Stack:
Jan 19 11:18:32 difrost kernel:  00000026 00000002 00000000 00000000 f699c500
00000000 00000003 f6825400
Jan 19 11:18:32 difrost kernel:  c01cbbd8 00000003 00000000 00000000 c01cff74
00000013 00000246 00400004
Jan 19 11:18:32 difrost kernel:  00000000 00000000 00000001 00000058 00000000
c01d8e3a 00000001 00000058
Jan 19 11:18:32 difrost kernel: Call Trace:
Jan 19 11:18:32 difrost kernel:  [<c01cbbd8>] xlog_recover_finish+0x3f/0x8d
Jan 19 11:18:32 difrost kernel:  [<c01cff74>] xfs_mountfs+0x44e/0x54b
Jan 19 11:18:32 difrost kernel:  [<c01d8e3a>] kmem_alloc+0x57/0xa8
Jan 19 11:18:32 difrost kernel:  [<c01d06e9>] xfs_mru_cache_create+0xe6/0x11c
Jan 19 11:18:32 difrost kernel:  [<c01e1616>] xfs_fs_fill_super+0x182/0x2d8
Jan 19 11:18:32 difrost kernel:  [<c0165a80>] get_sb_bdev+0xe8/0x130
Jan 19 11:18:32 difrost kernel:  [<c01750e6>] alloc_vfsmnt+0x69/0xf5
Jan 19 11:18:32 difrost kernel:  [<c01dfe39>] xfs_fs_get_sb+0x12/0x16
Jan 19 11:18:32 difrost kernel:  [<c01e1494>] xfs_fs_fill_super+0x0/0x2d8
Jan 19 11:18:32 difrost kernel:  [<c0164cca>] vfs_kern_mount+0x39/0x72
Jan 19 11:18:32 difrost kernel:  [<c0164d41>] do_kern_mount+0x2f/0xb4
Jan 19 11:18:32 difrost kernel:  [<c0175c6a>] do_mount+0x632/0x66d
Jan 19 11:18:32 difrost kernel:  [<c0175d14>] sys_mount+0x6f/0xaf
Jan 19 11:18:32 difrost kernel:  [<c0102d05>] sysenter_do_call+0x12/0x25
Jan 19 11:18:32 difrost kernel: Code: 1d fd 00 00 89 f1 55 89 f8 8b 54 24 04 e8
af fe ff ff 31 d2 89 c6 8d 44 24 0c 50 89 f8 8b 4c 24 08 e8 0b 14 ff ff 8b 44 24
10 5a <8b> 40 5c 59 83 fe ff 75 b8 45 83 fd 40 75 aa 8b 5c 24 08 83 7b
Jan 19 11:18:32 difrost kernel: EIP: [<c01cbb51>]
xlog_recover_process_iunlinks+0x90/0xd8 SS:ESP 0068:f695fe14
Jan 19 11:18:32 difrost kernel: ---[ end trace 0d722cd205608c78 ]---
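
A note on reading the oops above: the byte marked <8b> in the Code: line is
where the fault hit. On 32-bit x86, opcode 8b is a register load
(mov r32, r/m32); the next byte, 40, is the ModRM byte that selects its
operands, and 5c is an 8-bit displacement. The sketch below only splits that
ModRM byte into its fields. decode_modrm is an illustrative helper written for
this note, not a kernel tool.

```shell
#!/bin/bash
# Hypothetical helper: split an x86 ModRM byte (given as two hex digits)
# into its mod / reg / rm fields.
decode_modrm()
{
	local byte=$(( 16#$1 ))

	echo "mod=$(( byte >> 6 )) reg=$(( (byte >> 3) & 7 )) rm=$(( byte & 7 ))"
}

decode_modrm 40    # the byte that follows <8b> in the Code: line
```

mod=1 means an 8-bit displacement follows, and register number 0 is EAX for
both fields, so the instruction loads from EAX + 0x5c into EAX. With
EAX: 00000000 in the register dump, that address is 0x5c, which matches the
reported NULL pointer dereference at 0000005c.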


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-19 11:49 [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO Jacek Luczak
@ 2009-01-19 18:44 ` Eric Sandeen
  2009-01-20  0:46   ` Dave Chinner
  2009-01-20  9:24   ` Jacek Luczak
  0 siblings, 2 replies; 22+ messages in thread
From: Eric Sandeen @ 2009-01-19 18:44 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: LKML, hch, xfs mailing list

Jacek Luczak wrote:
> Hi All,
> 
> I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
> no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
> didn't help; the same error appeared.
> 
> Some info:
> [1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
> [2] kernel logs:
> http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
> [3] most interesting part of log below.

so this happens every mount?  Reproducible is good.  How large is the
filesystem (too large to extract elsewhere for analysis...?) (plus I
suppose it'll be hard to get to it when you can't even boot....)

-Eric



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-19 18:44 ` Eric Sandeen
@ 2009-01-20  0:46   ` Dave Chinner
  2009-01-20  9:26     ` Jacek Luczak
  2009-01-20 11:29     ` Christoph Hellwig
  2009-01-20  9:24   ` Jacek Luczak
  1 sibling, 2 replies; 22+ messages in thread
From: Dave Chinner @ 2009-01-20  0:46 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Jacek Luczak, LKML, hch, xfs mailing list

On Mon, Jan 19, 2009 at 12:44:48PM -0600, Eric Sandeen wrote:
> Jacek Luczak wrote:
> > Hi All,
> > 
> > I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
> > no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
> > didn't help; the same error appeared.
> > 
> > Some info:
> > [1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
> > [2] kernel logs:
> > http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
> > [3] most interesting part of log below.
> 
> so this happens every mount?  Reproducible is good.  How large is the
> filesystem (too large to extract elsewhere for analysis...?) (plus I
> suppose it'll be hard to get to it when you can't even boot....)

XFS folks, I suspect the common link between all the reports of this
bug is that they are on 32-bit kernels. I can't reproduce this on
a 64 bit kernel, and I'm trying to get a 32-bit UML built right now
to test this theory.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-19 18:44 ` Eric Sandeen
  2009-01-20  0:46   ` Dave Chinner
@ 2009-01-20  9:24   ` Jacek Luczak
  2009-01-20 10:42     ` Jacek Luczak
  1 sibling, 1 reply; 22+ messages in thread
From: Jacek Luczak @ 2009-01-20  9:24 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: LKML, hch, xfs mailing list

Eric Sandeen wrote:
> Jacek Luczak wrote:
>> Hi All,
>>
>> I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
>> no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
>> didn't help; the same error appeared.
>>
>> Some info:
>> [1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
>> [2] kernel logs:
>> http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
>> [3] most interesting part of log below.
> 
> so this happens every mount?  Reproducible is good.  How large is the
> filesystem (too large to extract elsewhere for analysis...?) (plus I
> suppose it'll be hard to get to it when you can't even boot....)
> 
> -Eric
> 

Hi Eric,

The funny (or sad) thing is that this happens while mounting only one of the
partitions (/home), which is:
$ df -h | grep /home
/dev/sda5              20G   14G  6,0G  69% /home

This bug is quite strange: as I mentioned, the first boot on the new kernel
went OK, and the next two resulted in this behavior. Now I'm running
2.6.29-rc2-12097-gf3b8436 and there is no bug here. Nevertheless I'm not fully
happy: as was seen before, the bug might still happen (I will boot a few times
to test it and report back to you).

Yesterday's first boot was buggy, but that went unnoticed, so I started fluxbox
and firefox as usual (errors seen in the log) and only then found that
something was wrong. The second boot was also buggy, then I returned to the old
kernel, where everything was OK ... nearly everything, because firefox suddenly
"forgot" all its configuration.

I will boot my notebook on all those 2.6.29-* kernels a few times and maybe I
will be able to reproduce the bug.

-Jacek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20  0:46   ` Dave Chinner
@ 2009-01-20  9:26     ` Jacek Luczak
  2009-01-20 11:29     ` Christoph Hellwig
  1 sibling, 0 replies; 22+ messages in thread
From: Jacek Luczak @ 2009-01-20  9:26 UTC (permalink / raw)
  To: Eric Sandeen, Jacek Luczak, LKML, hch, xfs mailing list

Dave Chinner wrote:
> On Mon, Jan 19, 2009 at 12:44:48PM -0600, Eric Sandeen wrote:
>> Jacek Luczak wrote:
>>> Hi All,
>>>
>>> I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
>>> no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
>>> didn't help; the same error appeared.
>>>
>>> Some info:
>>> [1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
>>> [2] kernel logs:
>>> http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
>>> [3] most interesting part of log below.
>> so this happens every mount?  Reproducible is good.  How large is the
>> filesystem (too large to extract elsewhere for analysis...?) (plus I
>> suppose it'll be hard to get to it when you can't even boot....)
> 
> XFS folks, I suspect the common link between all the reports of this
> bug is that they are on 32-bit kernels. I can't reproduce this on
> a 64 bit kernel, and I'm trying to get a 32-bit UML built right now
> to test this theory.
> 

Yep, 32-bit here. I googled for a while looking for an answer, and it looks
like this has happened before in various kernel versions (no reports regarding
2.6.29, AFAIR).

-Jacek


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20  9:24   ` Jacek Luczak
@ 2009-01-20 10:42     ` Jacek Luczak
  0 siblings, 0 replies; 22+ messages in thread
From: Jacek Luczak @ 2009-01-20 10:42 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: LKML, hch, xfs mailing list

Jacek Luczak wrote:
> Eric Sandeen wrote:
>> Jacek Luczak wrote:
>>> Hi All,
>>>
>>> I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
>>> no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
>>> didn't help; the same error appeared.
>>>
>>> Some info:
>>> [1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
>>> [2] kernel logs:
>>> http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
>>> [3] most interesting part of log below.
>> so this happens every mount?  Reproducible is good.  How large is the
>> filesystem (too large to extract elsewhere for analysis...?) (plus I
>> suppose it'll be hard to get to it when you can't even boot....)
>>
>> -Eric
>>
> 
> Hi Eric,
> 
> The funny (or sad) thing is that this happens while mounting only one of the
> partitions (/home), which is:
> $ df -h | grep /home
> /dev/sda5              20G   14G  6,0G  69% /home
> 
> This bug is quite strange: as I mentioned, the first boot on the new kernel
> went OK, and the next two resulted in this behavior. Now I'm running
> 2.6.29-rc2-12097-gf3b8436 and there is no bug here. Nevertheless I'm not fully
> happy: as was seen before, the bug might still happen (I will boot a few times
> to test it and report back to you).
> 
> Yesterday's first boot was buggy, but that went unnoticed, so I started fluxbox
> and firefox as usual (errors seen in the log) and only then found that
> something was wrong. The second boot was also buggy, then I returned to the old
> kernel, where everything was OK ... nearly everything, because firefox suddenly
> "forgot" all its configuration.
> 
> I will boot my notebook on all those 2.6.29-* kernels a few times and maybe I
> will be able to reproduce the bug.
> 

I've run some tests: basically umount + mount (with time measurement):

$ for i in $(seq 1 20); do
      echo "[=> umount [$MNT]: $i"
      time umount "$MNT" || break
      echo "[=> mount [$MNT]: $i"
      time mount "$MNT" || break
  done 2>&1 | tee ~/${MNT##*/}_mount_git.log

where MNT was set to each of three different partitions (all XFS). On both rc2
and the git version, no bug appeared. If anyone is interested (I've seen some
delays in mounting before), some timing results are here:
http://pin.if.uz.zgora.pl/~difrost/linux-next/mount_logs/

I will keep experimenting, so maybe it will appear once more; then I will try
to do some more tests (suggestions are welcome).

-Jacek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20  0:46   ` Dave Chinner
  2009-01-20  9:26     ` Jacek Luczak
@ 2009-01-20 11:29     ` Christoph Hellwig
  2009-01-20 11:47       ` Jacek Luczak
  1 sibling, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2009-01-20 11:29 UTC (permalink / raw)
  To: Eric Sandeen, Jacek Luczak, LKML, hch, xfs mailing list

On Tue, Jan 20, 2009 at 11:46:11AM +1100, Dave Chinner wrote:
> On Mon, Jan 19, 2009 at 12:44:48PM -0600, Eric Sandeen wrote:
> > Jacek Luczak wrote:
> > > Hi All,
> > > 
> > > I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
> > > no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
> > > didn't help; the same error appeared.
> > > 
> > > Some info:
> > > [1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
> > > [2] kernel logs:
> > > http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
> > > [3] most interesting part of log below.
> > 
> > so this happens every mount?  Reproducible is good.  How large is the
> > filesystem (too large to extract elsewhere for analysis...?) (plus I
> > suppose it'll be hard to get to it when you can't even boot....)
> 
> XFS folks, I suspect the common link between all the reports of this
> bug is that they are on 32-bit kernels. I can't reproduce this on
> a 64 bit kernel, and I'm trying to get a 32-bit UML built right now
> to test this theory.

I'm doing about half of my testing on 32-bit x86, and I couldn't
reproduce it using the detailed recipe in the kernel.org bugzilla yet.

Just curious:  do you have CONFIG_LBD set?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 11:29     ` Christoph Hellwig
@ 2009-01-20 11:47       ` Jacek Luczak
  2009-01-20 11:49         ` Christoph Hellwig
  0 siblings, 1 reply; 22+ messages in thread
From: Jacek Luczak @ 2009-01-20 11:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Eric Sandeen, LKML, xfs mailing list

Christoph Hellwig wrote:
> On Tue, Jan 20, 2009 at 11:46:11AM +1100, Dave Chinner wrote:
>> On Mon, Jan 19, 2009 at 12:44:48PM -0600, Eric Sandeen wrote:
>>> Jacek Luczak wrote:
>>>> Hi All,
>>>>
>>>> I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
>>>> no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
>>>> didn't help; the same error appeared.
>>>>
>>>> Some info:
>>>> [1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
>>>> [2] kernel logs:
>>>> http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
>>>> [3] most interesting part of log below.
>>> so this happens every mount?  Reproducible is good.  How large is the
>>> filesystem (too large to extract elsewhere for analysis...?) (plus I
>>> suppose it'll be hard to get to it when you can't even boot....)
>> XFS folks, I suspect the common link between all the reports of this
>> bug is that they are on 32-bit kernels. I can't reproduce this on
>> a 64 bit kernel, and I'm trying to get a 32-bit UML built right now
>> to test this theory.
> 
> I'm doing about half of my testing on 32-bit x86, and I couldn't
> reproduce it using the detailed recipe in the kernel.org bugzilla yet.
> 
> Just curious:  do you have CONFIG_LBD set?
> 
Hi Christoph,

the answer is:
$ grep LBD .config
# CONFIG_LBD is not set

-Jacek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 11:47       ` Jacek Luczak
@ 2009-01-20 11:49         ` Christoph Hellwig
  2009-01-20 12:13           ` Christoph Hellwig
  2009-01-20 13:35           ` Dave Chinner
  0 siblings, 2 replies; 22+ messages in thread
From: Christoph Hellwig @ 2009-01-20 11:49 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: Christoph Hellwig, Eric Sandeen, LKML, xfs mailing list

On Tue, Jan 20, 2009 at 12:47:16PM +0100, Jacek Luczak wrote:
> Christoph Hellwig wrote:
> > On Tue, Jan 20, 2009 at 11:46:11AM +1100, Dave Chinner wrote:
> >> On Mon, Jan 19, 2009 at 12:44:48PM -0600, Eric Sandeen wrote:
> >>> Jacek Luczak wrote:
> >>>> Hi All,
> >>>>
> >>>> I've run into an XFS issue/bug. Yesterday I compiled 2.6.29-rc2 and found
> >>>> no errors. Today I booted my notebook and an XFS bug occurred. Rebooting
> >>>> didn't help; the same error appeared.
> >>>>
> >>>> Some info:
> >>>> [1] config: http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2.config
> >>>> [2] kernel logs:
> >>>> http://pin.if.uz.zgora.pl/~difrost/linux-next/2.6.29-rc2_XFS-bug.log
> >>>> [3] most interesting part of log below.
> >>> so this happens every mount?  Reproducible is good.  How large is the
> >>> filesystem (too large to extract elsewhere for analysis...?) (plus I
> >>> suppose it'll be hard to get to it when you can't even boot....)
> >> XFS folks, I suspect the common link between all the reports of this
> >> bug is that they are on 32-bit kernels. I can't reproduce this on
> >> a 64 bit kernel, and I'm trying to get a 32-bit UML built right now
> >> to test this theory.
> > 
> > I'm doing about half of my testing on 32-bit x86, and I couldn't
> > reproduce it using the detailed recipe in the kernel.org bugzilla yet.
> > 
> > Just curious:  do you have CONFIG_LBD set?
> > 
> Hi Christoph,
> 
> the answer is:
> $ grep LBD .config
> # CONFIG_LBD is not set

Ok, let me reproduce it without that set..

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 11:49         ` Christoph Hellwig
@ 2009-01-20 12:13           ` Christoph Hellwig
  2009-01-20 12:45             ` Christoph Hellwig
  2009-01-20 13:35           ` Dave Chinner
  1 sibling, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2009-01-20 12:13 UTC (permalink / raw)
  To: Jacek Luczak
  Cc: Christoph Hellwig, Eric Sandeen, LKML, xfs mailing list,
	Dave Chinner

On Tue, Jan 20, 2009 at 06:49:06AM -0500, Christoph Hellwig wrote:
> > > Just curious:  do you have CONFIG_LBD set?
> > > 
> > Hi Christoph,
> > 
> > the answer is:
> > $ grep LBD .config
> > # CONFIG_LBD is not set
> 
> Ok, let me reproduce it without that set..

Ok, on 32-bit x86 without CONFIG_LBD I can reliably reproduce the issue
with the following script:


#!/bin/bash

TESTDIR=/mnt/test
SCRATCHMNT=/mnt/scratch
file=$SCRATCHMNT/f

# Write the 512-byte sector range [$1, $2) of $file using direct I/O.
do_pwrite()
{
	offset=$(($1 * 512))
	end=$(($2 * 512))
	length=$((end - offset))

	xfs_io -d -f "$file" -c "pwrite $offset $length" >/dev/null
}


mkfs.xfs \
	-b size=1024 \
	-d file,name=$TESTDIR/fsfile,size=40146592b,agcount=16 \
	-i attr=0 \
	-l version=1

mount -o loop,rw,noatime,nodiratime $TESTDIR/fsfile $SCRATCHMNT

do_pwrite 30792 31039
do_pwrite 30320 30791
do_pwrite 29688 30319
do_pwrite 29536 29687
do_pwrite 27216 29535
do_pwrite 24368 27215
do_pwrite 21616 24367
do_pwrite 20608 21615
do_pwrite 19680 20607
do_pwrite 19232 19679
do_pwrite 17840 19231
do_pwrite 16928 17839
do_pwrite 15168 16927
do_pwrite 14048 15167
do_pwrite 12152 14047
do_pwrite 11344 12151
do_pwrite 8792 11343
do_pwrite 6456 8791
do_pwrite 5000 6455
do_pwrite 1728 4999
do_pwrite 0 1727

sync
sync

> $SCRATCHMNT/bigfile

#umount $SCRATCHMNT

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 12:13           ` Christoph Hellwig
@ 2009-01-20 12:45             ` Christoph Hellwig
  2009-01-20 13:58               ` Jacek Luczak
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2009-01-20 12:45 UTC (permalink / raw)
  To: Jacek Luczak
  Cc: Christoph Hellwig, Eric Sandeen, LKML, xfs mailing list,
	Dave Chinner

On Tue, Jan 20, 2009 at 07:13:35AM -0500, Christoph Hellwig wrote:
> On Tue, Jan 20, 2009 at 06:49:06AM -0500, Christoph Hellwig wrote:
> > > > Just curious:  do you have CONFIG_LBD set?
> > > > 
> > > Hi Christoph,
> > > 
> > > the answer is:
> > > $ grep LBD .config
> > > # CONFIG_LBD is not set
> > 
> > Ok, let me reproduce it without that set..
> 
> Ok, on 32-bit x86 without CONFIG_LBD I can reliably reproduce the issue
> with the following script:

Bisected down to:

commit 91cca5df9bc85efdabfa645f51d54259ed09f4bf
Author: Christoph Hellwig <hch@infradead.org>
Date:   Thu Oct 30 16:58:01 2008 +1100

    [XFS] implement generic xfs_btree_delete/delrec

    Make the btree delete code generic. Based on a patch from David Chinner
    with lots of changes to follow the original btree implementations more
    closely. While this loses some of the generic helper routines for
    inserting/moving/removing records it also solves some of the one off bugs
    in the original code and makes it easier to verify.

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 11:49         ` Christoph Hellwig
  2009-01-20 12:13           ` Christoph Hellwig
@ 2009-01-20 13:35           ` Dave Chinner
  2009-01-20 23:03             ` [PATCH] " Dave Chinner
  1 sibling, 1 reply; 22+ messages in thread
From: Dave Chinner @ 2009-01-20 13:35 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jacek Luczak, Eric Sandeen, LKML, xfs mailing list

On Tue, Jan 20, 2009 at 06:49:06AM -0500, Christoph Hellwig wrote:
> On Tue, Jan 20, 2009 at 12:47:16PM +0100, Jacek Luczak wrote:
> > > Christoph Hellwig wrote:
> > > > I'm doing about half of my testing on 32 bit x86, and I couldn't
> > > > reproduce the detailed recipe in the kernel.org bugzilla yet.
> > > 
> > > Just curious:  do you have CONFIG_LBD set?
> > 
> > the answer is:
> > $ grep LBD .config
> > # CONFIG_LBD is not set
> 
> Ok, let me reproduce it without that set..

Good call, Christoph. I have a reproducer on ia32, CONFIG_LBD=n,
1k block size, 16 AGs in 4GB. The filesystem was pre-prepared by
copying a built kernel tree onto it and then running 'make mrproper'
to put holes in it. Then, on boot:

dave@xfs-32:/mnt$ sudo mount /dev/sdb /mnt; cd /mnt
dave@xfs-32:/mnt$ cp /home/dave/linux-2.6.tar.gz . ; sync
dave@xfs-32:/mnt$ sudo xfs_bmap -v linux-2.6.tar.gz
linux-2.6.tar.gz:
 EXT: FILE-OFFSET       BLOCK-RANGE      AG AG-OFFSET         TOTAL
   0: [0..150271]:      92112..242383     0 (92112..242383)  150272
   1: [150272..346879]: 256188..452795    0 (256188..452795) 196608
   2: [346880..445183]: 906390..1004693   1 (382102..480405)  98304
   3: [445184..494335]: 770022..819173    1 (245734..294885)  49152
   4: [494336..543487]: 720870..770021    1 (196582..245733)  49152
   5: [543488..592639]: 671718..720869    1 (147430..196581)  49152
   6: [592640..641791]: 622566..671717    1 (98278..147429)   49152
   7: [641792..737023]: 1398498..1493729  2 (349922..445153)  95232
   8: [737024..781055]: 1353872..1397903  2 (305296..349327)  44032
   9: [781056..830207]: 1304720..1353871  2 (256144..305295)  49152
  10: [830208..879359]: 1255566..1304717  2 (206990..256141)  49152
  11: [879360..925367]: 1209558..1255565  2 (160982..206989)  46008
dave@xfs-32:/mnt$ > linux-2.6.tar.gz
Connection to xfs-32 closed.

I'll see if this is reproducible, and if it is I'll start
instrumenting in the morning during the LCA keynote. ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 12:45             ` Christoph Hellwig
@ 2009-01-20 13:58               ` Jacek Luczak
  2009-01-20 14:05                 ` Christoph Hellwig
  0 siblings, 1 reply; 22+ messages in thread
From: Jacek Luczak @ 2009-01-20 13:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Eric Sandeen, LKML, xfs mailing list, Dave Chinner

Christoph Hellwig wrote:
> On Tue, Jan 20, 2009 at 07:13:35AM -0500, Christoph Hellwig wrote:
>> On Tue, Jan 20, 2009 at 06:49:06AM -0500, Christoph Hellwig wrote:
>>>>> Just curious:  do you have CONFIG_LBD set?
>>>>>
>>>> Hi Christoph,
>>>>
>>>> the answer is:
>>>> $ grep LBD .config
>>>> # CONFIG_LBD is not set
>>> Ok, let me reproduce it without that set..
>> Ok, on 32-bit x86 without CONFIG_LBD I can reliably reproduce the issue
>> with the following script:
> 
> Bisected down to:
> 
> commit 91cca5df9bc85efdabfa645f51d54259ed09f4bf
> Author: Christoph Hellwig <hch@infradead.org>
> Date:   Thu Oct 30 16:58:01 2008 +1100
> 
>     [XFS] implement generic xfs_btree_delete/delrec
> 
>     Make the btree delete code generic. Based on a patch from David Chinner
>     with lots of changes to follow the original btree implementations more
>     closely. While this loses some of the generic helper routines for
>     inserting/moving/removing records it also solves some of the one off bugs
>     in the original code and makes it easier to verify.
> 

Good job! Is there some "quick" fix?

-Jacek

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 13:58               ` Jacek Luczak
@ 2009-01-20 14:05                 ` Christoph Hellwig
  2009-01-20 14:13                   ` Jacek Luczak
  2009-01-20 14:23                   ` Jacek Luczak
  0 siblings, 2 replies; 22+ messages in thread
From: Christoph Hellwig @ 2009-01-20 14:05 UTC (permalink / raw)
  To: Jacek Luczak
  Cc: Christoph Hellwig, Eric Sandeen, LKML, xfs mailing list,
	Dave Chinner

On Tue, Jan 20, 2009 at 02:58:35PM +0100, Jacek Luczak wrote:
> Good job! Is there some "quick" fix?

The patch below makes it go away for me, alternatively just enable
CONFIG_LBD.


Index: linux-2.6/fs/xfs/xfs_types.h
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_types.h	2009-01-20 14:55:55.806068213 +0100
+++ linux-2.6/fs/xfs/xfs_types.h	2009-01-20 14:56:01.437945154 +0100
@@ -96,7 +96,7 @@ typedef	__uint64_t	xfs_dfilblks_t;	/* nu
 /*
  * Memory based types are conditional.
  */
-#if XFS_BIG_BLKNOS
+#if 1 //XFS_BIG_BLKNOS
 typedef	__uint64_t	xfs_fsblock_t;	/* blockno in filesystem (agno|agbno) */
 typedef __uint64_t	xfs_rfsblock_t;	/* blockno in filesystem (raw) */
 typedef __uint64_t	xfs_rtblock_t;	/* extent (block) in realtime area */

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 14:05                 ` Christoph Hellwig
@ 2009-01-20 14:13                   ` Jacek Luczak
  2009-01-20 14:23                   ` Jacek Luczak
  1 sibling, 0 replies; 22+ messages in thread
From: Jacek Luczak @ 2009-01-20 14:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Eric Sandeen, LKML, xfs mailing list, Dave Chinner

Christoph Hellwig wrote:
> On Tue, Jan 20, 2009 at 02:58:35PM +0100, Jacek Luczak wrote:
>> Good job! Is there some "quick" fix?
> 
> The patch below makes it go away for me, alternatively just enable
> CONFIG_LBD.
> 
> 
> Index: linux-2.6/fs/xfs/xfs_types.h
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_types.h	2009-01-20 14:55:55.806068213 +0100
> +++ linux-2.6/fs/xfs/xfs_types.h	2009-01-20 14:56:01.437945154 +0100
> @@ -96,7 +96,7 @@ typedef	__uint64_t	xfs_dfilblks_t;	/* nu
>  /*
>   * Memory based types are conditional.
>   */
> -#if XFS_BIG_BLKNOS
> +#if 1 //XFS_BIG_BLKNOS
>  typedef	__uint64_t	xfs_fsblock_t;	/* blockno in filesystem (agno|agbno) */
>  typedef __uint64_t	xfs_rfsblock_t;	/* blockno in filesystem (raw) */
>  typedef __uint64_t	xfs_rtblock_t;	/* extent (block) in realtime area */
> 

Applied. Thanks. Will do some tests with your script.

-Jacek

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 14:05                 ` Christoph Hellwig
  2009-01-20 14:13                   ` Jacek Luczak
@ 2009-01-20 14:23                   ` Jacek Luczak
  2009-01-20 14:32                     ` Christoph Hellwig
  2009-01-21  4:05                     ` Dave Chinner
  1 sibling, 2 replies; 22+ messages in thread
From: Jacek Luczak @ 2009-01-20 14:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Eric Sandeen, LKML, xfs mailing list, Dave Chinner

Christoph Hellwig wrote:
> On Tue, Jan 20, 2009 at 02:58:35PM +0100, Jacek Luczak wrote:
>> Good job! Is there some "quick" fix?
> 
> The patch below makes it go away for me, alternatively just enable
> CONFIG_LBD.
> 
> 
> Index: linux-2.6/fs/xfs/xfs_types.h
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_types.h	2009-01-20 14:55:55.806068213 +0100
> +++ linux-2.6/fs/xfs/xfs_types.h	2009-01-20 14:56:01.437945154 +0100
> @@ -96,7 +96,7 @@ typedef	__uint64_t	xfs_dfilblks_t;	/* nu
>  /*
>   * Memory based types are conditional.
>   */
> -#if XFS_BIG_BLKNOS
> +#if 1 //XFS_BIG_BLKNOS
>  typedef	__uint64_t	xfs_fsblock_t;	/* blockno in filesystem (agno|agbno) */
>  typedef __uint64_t	xfs_rfsblock_t;	/* blockno in filesystem (raw) */
>  typedef __uint64_t	xfs_rtblock_t;	/* extent (block) in realtime area */
> 

I've applied it and am now running the "fixed" kernel. What I've noticed is:
$ LC_ALL=C df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              20G  -40E   40E   -  /
/dev/sda5              20G  -23E   23E   -  /home
/dev/sda6              56G   56G  774M  99% /NORA
/dev/sda7              45G   44G  1.2G  98% /MAGAZYN

-Jacek

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 14:23                   ` Jacek Luczak
@ 2009-01-20 14:32                     ` Christoph Hellwig
  2009-01-21  4:05                     ` Dave Chinner
  1 sibling, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2009-01-20 14:32 UTC (permalink / raw)
  To: Jacek Luczak
  Cc: Christoph Hellwig, Eric Sandeen, LKML, xfs mailing list,
	Dave Chinner

On Tue, Jan 20, 2009 at 03:23:01PM +0100, Jacek Luczak wrote:
> I've applied it and am now running the "fixed" kernel. What I've noticed is:
> $ LC_ALL=C df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda1              20G  -40E   40E   -  /
> /dev/sda5              20G  -23E   23E   -  /home
> /dev/sda6              56G   56G  774M  99% /NORA
> /dev/sda7              45G   44G  1.2G  98% /MAGAZYN

Yeah, it's more of a hack.  If you drop the patch and just enable
CONFIG_LBD it should be fine.


* [PATCH] Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 13:35           ` Dave Chinner
@ 2009-01-20 23:03             ` Dave Chinner
  2009-01-20 23:22               ` Christoph Hellwig
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Chinner @ 2009-01-20 23:03 UTC (permalink / raw)
  To: Christoph Hellwig, Jacek Luczak, Eric Sandeen, LKML,
	xfs mailing list

On Wed, Jan 21, 2009 at 12:35:21AM +1100, Dave Chinner wrote:
> On Tue, Jan 20, 2009 at 06:49:06AM -0500, Christoph Hellwig wrote:
> > On Tue, Jan 20, 2009 at 12:47:16PM +0100, Jacek Luczak wrote:
> > > Christoph Hellwig wrote:
> > > > I'm doing about half of my testing on 32 bit x86, and I couldn't
> > > > reproduce the detailed recipe in the kernel.org bugzilla yet.
> > > > 
> > > > Just curious:  do you have CONFIG_LBD set?
> > > 
> > > the answer is:
> > > $ grep LBD .config
> > > # CONFIG_LBD is not set
> > 
> > Ok, let me reproduce it without that set..
> 
> Good call, Christoph. I have a reproducer on ia32, CONFIG_LBD=n,
> 1k block size, 16 AGs in 4GB.

Christoph nailed it down to a problem with xfs_fsblock_t last night.

> I'll see if this is reproducible, and if it is I'll start
> instrumenting in the morning during the LCA keynote. ;)

And here's the patch to fix it, posted direct from the keynote ;)

Cheers,

Dave.

------

[XFS] Long btree pointers are still 64 bit on disk

On 32 bit machines with CONFIG_LBD=n, XFS reduces the
in memory size of xfs_fsblock_t to 32 bits so that it
will fit within 32 bit addressing. However, the disk format
for long btree pointers is still 64 bits in size.

The recent btree rewrite failed to take this into account
when initialising new btree blocks, setting sibling pointers
to NULL and checking if they are NULL. Hence checking whether
a 64 bit NULL was the same as a 32 bit NULL was failing,
resulting in NULL sibling pointers failing to be detected
correctly. This showed up as WANT_CORRUPTED_GOTO shutdowns
in xfs_btree_delrec.

Fix this by making all comparisons and assignments of long
btree NULL block pointers use the disk format, not the
in memory format, i.e. use NULLDFSBNO.

Signed-off-by: Dave Chinner <david@fromorbit.com>
---
 fs/xfs/xfs_btree.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/xfs_btree.c b/fs/xfs/xfs_btree.c
index 2c3ef20..6bc2136 100644
--- a/fs/xfs/xfs_btree.c
+++ b/fs/xfs/xfs_btree.c
@@ -843,7 +843,7 @@ xfs_btree_ptr_is_null(
 	union xfs_btree_ptr	*ptr)
 {
 	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
-		return be64_to_cpu(ptr->l) == NULLFSBLOCK;
+		return be64_to_cpu(ptr->l) == NULLDFSBNO;
 	else
 		return be32_to_cpu(ptr->s) == NULLAGBLOCK;
 }
@@ -854,7 +854,7 @@ xfs_btree_set_ptr_null(
 	union xfs_btree_ptr	*ptr)
 {
 	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
-		ptr->l = cpu_to_be64(NULLFSBLOCK);
+		ptr->l = cpu_to_be64(NULLDFSBNO);
 	else
 		ptr->s = cpu_to_be32(NULLAGBLOCK);
 }
@@ -918,8 +918,8 @@ xfs_btree_init_block(
 	new->bb_numrecs = cpu_to_be16(numrecs);
 
 	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
-		new->bb_u.l.bb_leftsib = cpu_to_be64(NULLFSBLOCK);
-		new->bb_u.l.bb_rightsib = cpu_to_be64(NULLFSBLOCK);
+		new->bb_u.l.bb_leftsib = cpu_to_be64(NULLDFSBNO);
+		new->bb_u.l.bb_rightsib = cpu_to_be64(NULLDFSBNO);
 	} else {
 		new->bb_u.s.bb_leftsib = cpu_to_be32(NULLAGBLOCK);
 		new->bb_u.s.bb_rightsib = cpu_to_be32(NULLAGBLOCK);
@@ -971,7 +971,7 @@ xfs_btree_ptr_to_daddr(
 	union xfs_btree_ptr	*ptr)
 {
 	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
-		ASSERT(be64_to_cpu(ptr->l) != NULLFSBLOCK);
+		ASSERT(be64_to_cpu(ptr->l) != NULLDFSBNO);
 
 		return XFS_FSB_TO_DADDR(cur->bc_mp, be64_to_cpu(ptr->l));
 	} else {

-- 
Dave Chinner
david@fromorbit.com

* Re: [PATCH] Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 23:03             ` [PATCH] " Dave Chinner
@ 2009-01-20 23:22               ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2009-01-20 23:22 UTC (permalink / raw)
  To: Christoph Hellwig, Jacek Luczak, Eric Sandeen, LKML,
	xfs mailing list

On Wed, Jan 21, 2009 at 10:03:06AM +1100, Dave Chinner wrote:
> [XFS] Long btree pointers are still 64 bit on disk
> 
> On 32 bit machines with CONFIG_LBD=n, XFS reduces the
> in memory size of xfs_fsblock_t to 32 bits so that it
> will fit within 32 bit addressing. However, the disk format
> for long btree pointers is still 64 bits in size.
> 
> The recent btree rewrite failed to take this into account
> when initialising new btree blocks, setting sibling pointers
> to NULL and checking if they are NULL. Hence checking whether
> a 64 bit NULL was the same as a 32 bit NULL was failing,
> resulting in NULL sibling pointers failing to be detected
> correctly. This showed up as WANT_CORRUPTED_GOTO shutdowns
> in xfs_btree_delrec.
> 
> Fix this by making all comparisons and assignments of long
> btree NULL block pointers use the disk format, not the
> in memory format, i.e. use NULLDFSBNO.

Thanks, this fixes the testcase for me.
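
For anyone following along, the width mismatch described in the commit
message can be sketched in plain userspace C. This is only an
illustration under stated assumptions: the type, macro and function
names below are hypothetical stand-ins, not the kernel's definitions,
and the 32 bit typedef mimics what xfs_fsblock_t becomes on a 32 bit
kernel with CONFIG_LBD=n.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for illustration only (not the kernel's
 * definitions): the in-memory block number is 32 bits wide, as on a
 * 32 bit kernel with CONFIG_LBD=n, while the on-disk long btree
 * pointer is always 64 bits wide.
 */
typedef uint32_t mem_fsblock_t;
typedef uint64_t disk_fsblock_t;

#define NULL_MEM_FSBLOCK	((mem_fsblock_t)-1)	/* 0xffffffff */
#define NULL_DISK_FSBLOCK	((disk_fsblock_t)-1)	/* 0xffffffffffffffff */

/* Buggy check: the 32 bit NULL is zero-extended before the compare. */
static int is_null_buggy(disk_fsblock_t ptr)
{
	return ptr == NULL_MEM_FSBLOCK;
}

/* Fixed check: compare against the NULL value of the on-disk width. */
static int is_null_fixed(disk_fsblock_t ptr)
{
	return ptr == NULL_DISK_FSBLOCK;
}

/* Returns 1 if the two checks behave as described above. */
static int demo_null_checks(void)
{
	disk_fsblock_t sibling = NULL_DISK_FSBLOCK;	/* a real on-disk NULL */

	assert(!is_null_buggy(sibling));	/* NULL sibling is missed */
	assert(is_null_fixed(sibling));		/* NULL sibling is detected */
	return 1;
}
```

If mem_fsblock_t is widened to 64 bits the two NULL values coincide and
both checks agree, which would be consistent with the problem not
showing up on 64 bit kernels or with CONFIG_LBD enabled.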

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-20 14:23                   ` Jacek Luczak
  2009-01-20 14:32                     ` Christoph Hellwig
@ 2009-01-21  4:05                     ` Dave Chinner
  2009-01-21  9:04                       ` Jacek Luczak
  1 sibling, 1 reply; 22+ messages in thread
From: Dave Chinner @ 2009-01-21  4:05 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: Christoph Hellwig, Eric Sandeen, LKML, xfs mailing list

On Tue, Jan 20, 2009 at 03:23:01PM +0100, Jacek Luczak wrote:
> Christoph Hellwig wrote:
> > On Tue, Jan 20, 2009 at 02:58:35PM +0100, Jacek Luczak wrote:
> >> Good job! Is there some "quick" fix?
> > 
> > The patch below makes it go away for me, alternatively just enable
> > CONFIG_LBD.
> > 
> > 
> > Index: linux-2.6/fs/xfs/xfs_types.h
> > ===================================================================
> > --- linux-2.6.orig/fs/xfs/xfs_types.h	2009-01-20 14:55:55.806068213 +0100
> > +++ linux-2.6/fs/xfs/xfs_types.h	2009-01-20 14:56:01.437945154 +0100
> > @@ -96,7 +96,7 @@ typedef	__uint64_t	xfs_dfilblks_t;	/* nu
> >  /*
> >   * Memory based types are conditional.
> >   */
> > -#if XFS_BIG_BLKNOS
> > +#if 1 //XFS_BIG_BLKNOS
> >  typedef	__uint64_t	xfs_fsblock_t;	/* blockno in filesystem (agno|agbno) */
> >  typedef __uint64_t	xfs_rfsblock_t;	/* blockno in filesystem (raw) */
> >  typedef __uint64_t	xfs_rtblock_t;	/* extent (block) in realtime area */
> > 
> 
> I've applied it and am now running the "fixed" kernel. What I've noticed is:
> $ LC_ALL=C df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda1              20G  -40E   40E   -  /
> /dev/sda5              20G  -23E   23E   -  /home
> /dev/sda6              56G   56G  774M  99% /NORA
> /dev/sda7              45G   44G  1.2G  98% /MAGAZYN

Please try the patch I posted this morning - it fixes the problem
properly and shouldn't have this side effect.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-21  4:05                     ` Dave Chinner
@ 2009-01-21  9:04                       ` Jacek Luczak
  2009-01-21 22:58                         ` Dave Chinner
  0 siblings, 1 reply; 22+ messages in thread
From: Jacek Luczak @ 2009-01-21  9:04 UTC (permalink / raw)
  To: Jacek Luczak, Christoph Hellwig, Eric Sandeen, LKML,
	xfs mailing list

Dave Chinner wrote:
> On Tue, Jan 20, 2009 at 03:23:01PM +0100, Jacek Luczak wrote:
>> Christoph Hellwig wrote:
>>> On Tue, Jan 20, 2009 at 02:58:35PM +0100, Jacek Luczak wrote:
>>>> Good job! Is there some "quick" fix?
>>> The patch below makes it go away for me, alternatively just enable
>>> CONFIG_LBD.
>>>
>>>
>>> Index: linux-2.6/fs/xfs/xfs_types.h
>>> ===================================================================
>>> --- linux-2.6.orig/fs/xfs/xfs_types.h	2009-01-20 14:55:55.806068213 +0100
>>> +++ linux-2.6/fs/xfs/xfs_types.h	2009-01-20 14:56:01.437945154 +0100
>>> @@ -96,7 +96,7 @@ typedef	__uint64_t	xfs_dfilblks_t;	/* nu
>>>  /*
>>>   * Memory based types are conditional.
>>>   */
>>> -#if XFS_BIG_BLKNOS
>>> +#if 1 //XFS_BIG_BLKNOS
>>>  typedef	__uint64_t	xfs_fsblock_t;	/* blockno in filesystem (agno|agbno) */
>>>  typedef __uint64_t	xfs_rfsblock_t;	/* blockno in filesystem (raw) */
>>>  typedef __uint64_t	xfs_rtblock_t;	/* extent (block) in realtime area */
>>>
>> I've applied it and am now running the "fixed" kernel. What I've noticed is:
>> $ LC_ALL=C df -h
>> Filesystem            Size  Used Avail Use% Mounted on
>> /dev/sda1              20G  -40E   40E   -  /
>> /dev/sda5              20G  -23E   23E   -  /home
>> /dev/sda6              56G   56G  774M  99% /NORA
>> /dev/sda7              45G   44G  1.2G  98% /MAGAZYN
> 
> Please try the patch I posted this morning - it fixes the problem
> properly and shouldn't have this side effect.
> 

Your patch works for me. I've also run some tests along the lines of the
one proposed by Christoph, and everything works there too. Good work guys!

Have a nice day,

-Jacek

* Re: [XFS] 2.6.29-rc2: XFS internal error XFS_WANT_CORRUPTED_GOTO
  2009-01-21  9:04                       ` Jacek Luczak
@ 2009-01-21 22:58                         ` Dave Chinner
  0 siblings, 0 replies; 22+ messages in thread
From: Dave Chinner @ 2009-01-21 22:58 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: Christoph Hellwig, Eric Sandeen, LKML, xfs mailing list

On Wed, Jan 21, 2009 at 10:04:36AM +0100, Jacek Luczak wrote:
> Dave Chinner wrote:
> > Please try the patch I posted this morning - it fixes the problem
> > properly and shouldn't have this side effect.
> 
> Your patch works for me. I've also run some tests along the lines of the
> one proposed by Christoph, and everything works there too. Good work guys!

Thanks for testing the fix, Jacek. I'll get it pushed upstream
now.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
