Reporting a bug

public inbox for linux-xfs@vger.kernel.org

* Reporting a bug
@ 2007-06-05  2:10 Germán Poó-Caamaño
  2007-06-05  2:21 ` Germán Poó-Caamaño
  2007-06-05 23:24 ` David Chinner
  0 siblings, 2 replies; 4+ messages in thread
From: Germán Poó-Caamaño @ 2007-06-05  2:10 UTC
  To: xfs

I am having some problems with an XFS partition in Debian Sarge:

After what was supposed to be a clean reboot, my machine started up
with kernel error messages such as XFS_WANT_CORRUPTED_GOTO and
XFS_WANT_CORRUPTED_RETURN.

The problem was mainly located in /var.  But after cleaning that up, I
checked the other partitions.  I suspected that my root partition
(/dev/sda5) also had problems.  I mounted it read-only and ran
xfs_repair on it.  xfs_repair moved 6 files (all of them ELF binaries)
to lost+found.  After rebooting, the machine can't boot anymore.

Trying with Sysrescue 0.3.5 I get the following:

# xfs_check /dev/sda5
[...]
dir 1310848 block 8388608 extra leaf entry fc4e7e74 e7
dir 1310848 block 8388608 extra leaf entry fcdbb5f3 8f
dir 1310848 block 8388608 extra leaf entry fddcbf74 164
/usr/bin/xfs_check: line 28: 14691 Segmentation fault
xfs_db$DBOPTS -i -p xfs_check -c "check$OPTS" $1



# xfs_repair -n /dev/sda5
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad nextents 12 for inode 786561, would reset to 13
bad directory leaf magic # 0x46e for directory inode 786561 block 8388610
        - agno = 4
        - agno = 5
bmap rec out of order, inode 1310848 entry 2 [o s c] [8388608 81946
1], 1 [8388608 81946 1]
bad data fork in inode 1310848
would have cleared inode 1310848
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
entry "etc" at block 0 offset 128 in directory inode 128 references
free inode 1310848
        would clear inode number in entry at offset 128...
entry ".." at block 0 offset 32 in directory inode 149 references free
inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 970 references free
inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 17805 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 42528 references
free inode 1310848
        would clear inode number in entry at offset 32...
        - agno = 1
entry ".." at block 0 offset 32 in directory inode 262276 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 262288 references
free inode 1310848
        would clear inode number in entry at offset 32...
        - agno = 2
entry ".." at block 0 offset 32 in directory inode 524569 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 560783 references
free inode 1310848
        would clear inode number in entry at offset 32...
        - agno = 3
bad nextents 12 for inode 786561, would reset to 13
entry ".." at block 0 offset 32 in directory inode 786608 references
free inode 1310848
        would clear inode number in entry at offset 32...
        - agno = 4
entry ".." at block 0 offset 32 in directory inode 1067905 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 1067924 references
free inode 1310848
        would clear inode number in entry at offset 32...
        - agno = 5
bmap rec out of order, inode 1310848 entry 2 [o s c] [8388608 81946
1], 1 [8388608 81946 1]
bad data fork in inode 1310848
would have cleared inode 1310848
entry ".." at block 0 offset 32 in directory inode 1310944 references
free inode 1310848
        would clear inode number in entry at offset 32...
        - agno = 6
entry ".." at block 0 offset 32 in directory inode 1573000 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 1573094 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 1573120 references
free inode 1310848
        would clear inode number in entry at offset 32...
        - agno = 7
entry ".." at block 0 offset 32 in directory inode 1835140 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 1835168 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 1854273 references
free inode 1310848
        would clear inode number in entry at offset 32...
entry ".." at block 0 offset 32 in directory inode 1854300 references
free inode 1310848
        would clear inode number in entry at offset 32...
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem starting at / ...
entry "etc" in directory inode 128 points to free inode 1310848, would
junk entry
corrupt dinode 786561, (btree extents).  This is a bug.
Please report it to xfs@oss.sgi.com.
corrupt dinode 786561, (btree extents).  This is a bug.
Please report it to xfs@oss.sgi.com.
corrupt dinode 786561, (btree extents).  This is a bug.
Please report it to xfs@oss.sgi.com.
Segmentation fault
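
[Editor's sketch] The dry-run report above repeats the same complaint
many times: directory entries that reference free inode 1310848.  A
small script can collapse that noise into a per-inode count.  The
following is only a minimal sketch, assuming the message wording shown
in this report; the function name summarize_free_inode_refs and the
regular expression are illustrative and are not part of xfsprogs.

```python
import re
from collections import Counter

# Matches the "references free inode N" complaints as worded in the
# report above. The wording is taken from this thread only; other
# xfsprogs versions may phrase it differently (assumption).
FREE_REF = re.compile(
    r'entry "(?P<name>[^"]+)" at block \d+ offset \d+ in directory inode '
    r'(?P<dir>\d+) references free inode (?P<target>\d+)'
)


def summarize_free_inode_refs(report: str) -> Counter:
    """Count how many directory entries point at each free inode."""
    # The mail client wrapped long lines, so collapse all whitespace
    # first; a message split over two lines then matches as one.
    flat = " ".join(report.split())
    return Counter(int(m.group("target")) for m in FREE_REF.finditer(flat))


sample = '''entry "etc" at block 0 offset 128 in directory inode 128 references
free inode 1310848
        would clear inode number in entry at offset 128...
entry ".." at block 0 offset 32 in directory inode 149 references free
inode 1310848
        would clear inode number in entry at offset 32...
'''

print(summarize_free_inode_refs(sample))
```

A single inode with a very high count, as here, suggests one damaged
directory (the "etc" entry in inode 128) whose children all lost their
".." target, rather than many unrelated problems.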

-- 
Germán Poó Caamaño
http://www.gnome.org/~gpoo/

* Re: Reporting a bug
  2007-06-05  2:10 Reporting a bug Germán Poó-Caamaño
@ 2007-06-05  2:21 ` Germán Poó-Caamaño
  2007-06-05 23:24 ` David Chinner
  1 sibling, 0 replies; 4+ messages in thread
From: Germán Poó-Caamaño @ 2007-06-05  2:21 UTC
  To: xfs

Adding a little more information:

Sysrescue has kernel 2.6.20.

Filesystem "sda5": corrupt dinode 786561, (btree extents).  Unmount
and run xfs_repair.
Filesystem "sda5": XFS internal error xfs_bmap_read_extents(1) at line
4565 of file fs/xfs/xfs_bmap.c.  Caller 0xc02e7c99
 [<c02ca11e>] xfs_bmap_read_extents+0x488/0x4a2
 [<c02e7c99>] xfs_iread_extents+0xa0/0xbb
 [<c02e5b5f>] xfs_iext_realloc_direct+0xb3/0xc1
 [<c02e7c99>] xfs_iread_extents+0xa0/0xbb
 [<c02c2a54>] xfs_bmap_last_offset+0x94/0xdc
 [<c02d5269>] xfs_dir2_isblock+0x1b/0x60
 [<c0324085>] __make_request+0x384/0x495
 [<c02d59fb>] xfs_dir_lookup+0x8e/0xeb
 [<c02c7615>] xfs_bmapi+0x25b/0x1fd7
 [<c02fb04f>] xfs_dir_lookup_int+0x2c/0xd4
 [<c01230c4>] down_write+0x8/0x10
 [<c02e41ad>] xfs_ilock+0x47/0x67
 [<c02fe944>] xfs_lookup+0x50/0x76
 [<c05ff4cc>] __mutex_lock_slowpath+0x1ac/0x1b4
 [<c030a580>] xfs_vn_lookup+0x3b/0x70
 [<c0149b58>] do_lookup+0xa3/0x140
 [<c014b369>] __link_path_walk+0x61d/0xa24
 [<c014b7b2>] link_path_walk+0x42/0xaf
 [<c03000d8>] xfs_setattr+0xdbe/0xe7c
 [<c014ba67>] do_path_lookup+0x144/0x164
 [<c0145349>] get_empty_filp+0x4f/0xca
 [<c014c3d9>] __path_lookup_intent_open+0x43/0x72
 [<c014c47c>] path_lookup_open+0x20/0x25
 [<c014c546>] open_namei+0x6e/0x523
 [<c010b7cd>] do_page_fault+0x278/0x53f
 [<c0143049>] do_filp_open+0x2a/0x3e
 [<c03000d8>] xfs_setattr+0xdbe/0xe7c
 [<c01430a4>] do_sys_open+0x47/0xcf
 [<c0143165>] sys_open+0x1c/0x1e
 [<c0102c94>] syscall_call+0x7/0xb
 [<c05f0033>] svcauth_gss_accept+0x76a/0xadb
 =======================
Filesystem "sda5": corrupt dinode 786561, (btree extents).  Unmount
and run xfs_repair.
Filesystem "sda5": XFS internal error xfs_bmap_read_extents(1) at line
4565 of file fs/xfs/xfs_bmap.c.  Caller 0xc02e7c99
 [<c02ca11e>] xfs_bmap_read_extents+0x488/0x4a2
 [<c02e7c99>] xfs_iread_extents+0xa0/0xbb
 [<c02e5b5f>] xfs_iext_realloc_direct+0xb3/0xc1
 [<c02e7c99>] xfs_iread_extents+0xa0/0xbb
 [<c0324085>] __make_request+0x384/0x495
 [<c02c2a54>] xfs_bmap_last_offset+0x94/0xdc
 [<c02d5269>] xfs_dir2_isblock+0x1b/0x60
 [<c02d59fb>] xfs_dir_lookup+0x8e/0xeb
 [<c02d59ea>] xfs_dir_lookup+0x7d/0xeb
 [<c02fb04f>] xfs_dir_lookup_int+0x2c/0xd4
 [<c01230c4>] down_write+0x8/0x10
 [<c02e41ad>] xfs_ilock+0x47/0x67
 [<c02fe944>] xfs_lookup+0x50/0x76
 [<c05ff4cc>] __mutex_lock_slowpath+0x1ac/0x1b4
 [<c02fb04f>] xfs_dir_lookup_int+0x2c/0xd4
 [<c030a580>] xfs_vn_lookup+0x3b/0x70
 [<c0149b58>] do_lookup+0xa3/0x140
 [<c014b369>] __link_path_walk+0x61d/0xa24
 [<c014b7b2>] link_path_walk+0x42/0xaf
 [<c014ba67>] do_path_lookup+0x144/0x164
 [<c014c236>] __user_walk_fd+0x30/0x45
 [<c0146b8a>] vfs_stat_fd+0x19/0x40
 [<c0146c3c>] sys_stat64+0xf/0x23
 [<c0102c94>] syscall_call+0x7/0xb

and so on.
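
[Editor's sketch] Traces like the two above are easier to compare once
each frame is split into address, symbol, offset and function size.
This is a minimal sketch, assuming the "[<address>] symbol+0xoff/0xsize"
layout printed by this 2.6.20 kernel; Frame, parse_frames and
xfs_symbols are hypothetical helper names.  Entries in such a trace may
include stale stack contents rather than live callers, so the result is
a hint, not an exact call chain.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    address: int   # return address found on the stack
    symbol: str    # function the address falls in
    offset: int    # byte offset of the address inside that function
    size: int      # total size of the function in bytes


# Layout assumed from the trace in this thread: " [<c02ca11e>] name+0x488/0x4a2"
FRAME = re.compile(r'\[<([0-9a-f]+)>\]\s+([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)')


def parse_frames(trace: str) -> List[Frame]:
    """Turn every frame line in a kernel trace into a Frame tuple."""
    return [
        Frame(int(addr, 16), sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME.findall(trace)
    ]


def xfs_symbols(trace: str) -> List[str]:
    """Keep only the XFS functions, in the order they were printed."""
    return [f.symbol for f in parse_frames(trace) if f.symbol.startswith("xfs_")]


trace = """\
 [<c02ca11e>] xfs_bmap_read_extents+0x488/0x4a2
 [<c02e7c99>] xfs_iread_extents+0xa0/0xbb
 [<c0324085>] __make_request+0x384/0x495
"""

for frame in parse_frames(trace):
    print(hex(frame.address), frame.symbol, frame.offset, frame.size)
```

Running xfs_symbols on both traces shows that they share the same
leading frames (xfs_bmap_read_extents called from xfs_iread_extents),
which matches the "Caller 0xc02e7c99" line in each report.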

-- 
Germán Poó Caamaño
http://www.gnome.org/~gpoo/

* Re: Reporting a bug
  2007-06-05  2:10 Reporting a bug Germán Poó-Caamaño
  2007-06-05  2:21 ` Germán Poó-Caamaño
@ 2007-06-05 23:24 ` David Chinner
  2007-06-06  3:22   ` Germán Poó-Caamaño
  1 sibling, 1 reply; 4+ messages in thread
From: David Chinner @ 2007-06-05 23:24 UTC
  To: Germán Poó-Caamaño; +Cc: xfs

On Mon, Jun 04, 2007 at 10:10:06PM -0400, Germán Poó-Caamaño wrote:
> I am having some problems with an XFS partition in Debian Sarge:
> 
> After what was supposed to be a clean reboot, my machine started up
> with kernel error messages such as XFS_WANT_CORRUPTED_GOTO and
> XFS_WANT_CORRUPTED_RETURN.
> 
> The problem was mainly located in /var.  But after cleaning that up, I
> checked the other partitions.  I suspected that my root partition
> (/dev/sda5) also had problems.  I mounted it read-only and ran
> xfs_repair on it.  xfs_repair moved 6 files (all of them ELF binaries)
> to lost+found.  After rebooting, the machine can't boot anymore.

Sounds like a critical binary for boot got lost...

> Trying with Sysrescue 0.3.5 I get the following:

What version of the XFS utilities has that got?
You might do better booting knoppix and then downloading the
latest tools and running them....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

* Re: Reporting a bug
  2007-06-05 23:24 ` David Chinner
@ 2007-06-06  3:22   ` Germán Poó-Caamaño
  0 siblings, 0 replies; 4+ messages in thread
From: Germán Poó-Caamaño @ 2007-06-06  3:22 UTC
  To: David Chinner; +Cc: xfs

2007/6/5, David Chinner <dgc@sgi.com>:
> On Mon, Jun 04, 2007 at 10:10:06PM -0400, Germán Poó-Caamaño wrote:
> > I am having some problems with an XFS partition in Debian Sarge:
> >
> > After what was supposed to be a clean reboot, my machine started up
> > with kernel error messages such as XFS_WANT_CORRUPTED_GOTO and
> > XFS_WANT_CORRUPTED_RETURN.
> >
> > The problem was mainly located in /var.  But after cleaning that up, I
> > checked the other partitions.  I suspected that my root partition
> > (/dev/sda5) also had problems.  I mounted it read-only and ran
> > xfs_repair on it.  xfs_repair moved 6 files (all of them ELF binaries)
> > to lost+found.  After rebooting, the machine can't boot anymore.
>
> Sounds like a critical binary for boot got lost...

I thought so at first.  But the crash and segfaults I pasted in my
second message also occurred when I booted Sysrescue (a LiveCD) and
tried to work with that filesystem.

Anyway, I ran objdump -T on each file in lost+found.  A lot of them
were important (libgcc, tls/lpthreads, tls/libm, and such).
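
[Editor's sketch] Before running objdump on files that xfs_repair has
renamed to inode numbers, it can help to see which of them are ELF
objects at all.  This is a minimal sketch that checks only the four
ELF magic bytes at the start of each file; elf_files is a hypothetical
helper name and the lost+found path in the usage line is just an
example.

```python
import os

# Every ELF object starts with these four bytes (0x7f, 'E', 'L', 'F').
ELF_MAGIC = b"\x7fELF"


def elf_files(directory: str) -> list:
    """Return the names of regular files in `directory` that start with the ELF magic."""
    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and special files
        try:
            with open(path, "rb") as handle:
                if handle.read(4) == ELF_MAGIC:
                    found.append(name)
        except OSError:
            # On a damaged filesystem a read may fail; skip that entry
            # instead of aborting the whole scan.
            continue
    return found


if __name__ == "__main__":
    # Example path only; adjust to wherever the filesystem is mounted.
    target = "/mnt/lost+found"
    if os.path.isdir(target):
        print(elf_files(target))
```

The names it returns can then be passed to objdump -T, as described
above, to work out which library each orphaned file used to be.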

They seem to have been duplicates, because for each file in lost+found
there was a corresponding library in /lib or /lib/tls.

I also saw some nasty behavior, something like:
  # mkdir foo
  # cd foo
  # foo: No such file or directory

ls on /etc showed me group, passwd and shadow, but none of them could
be accessed.  I copied group- to group, and ls then showed me two
'group' files.

> > Trying with Sysrescue 0.3.5 I get the following:
>
> What version of the XFS utilities has that got?
> You might do better booting knoppix and then downloading the
> latest tools and running them....

I used xfsprogs 2.8.18.

Unfortunately, I don't have another disk or partition to 'dd' the
problematic partition onto.
I ruled out a memory problem: the machine passed an overnight memtest
run without errors.

On another partition (137 GB) with *a lot of* files (it's used for
maildir), I got 17000+ files in lost+found.  It passed xfs_repair.
After I mounted it again and put a little load on it, some files were
deleted but still listed by ls.  If I try to stat one of those files I
get 'No such file or directory', yet it is still listed in the
directory.
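
[Editor's sketch] The symptom described here, names that a directory
listing returns but that cannot be stat'ed, can be collected
automatically.  This is a minimal sketch, assuming the failure shows up
as ENOENT ('No such file or directory') as reported in this message;
dangling_entries is a hypothetical helper name.

```python
import os


def dangling_entries(directory: str) -> list:
    """Return names listed in `directory` whose lstat fails with ENOENT."""
    missing = []
    for name in os.listdir(directory):
        try:
            # lstat does not follow symlinks, so a broken symlink is
            # not reported as a dangling entry.
            os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        for name in dangling_entries(path):
            print(os.path.join(path, name))
```

On a healthy filesystem the list is normally empty, apart from files
that another process deletes between the listing and the lstat call,
so any output on an idle filesystem is worth including in a bug
report.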

I knew that XFS required robust hardware; I'm not sure whether that is
still true.  Anyway, I thought the hardware was robust, and I was
probably mistaken.

-- 
Germán Poó Caamaño
http://www.gnome.org/~gpoo/
