public inbox for linux-xfs@vger.kernel.org
* xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock
@ 2011-10-17 14:42 Richard Ems
  2011-10-17 15:02 ` Christoph Hellwig
  2011-10-17 22:52 ` Dave Chinner
  0 siblings, 2 replies; 6+ messages in thread
From: Richard Ems @ 2011-10-17 14:42 UTC (permalink / raw)
  To: xfs

Hi all !

We have an XFS filesystem that started giving errors a few days ago.
This is on an openSUSE 11.4 64-bit system.
The filesystem is 12 TB in size, 9.8 TB of which are used, on hardware RAID 6 behind an Areca 1680 controller.

Mounting the XFS with ro,norecovery works almost always.
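For reference, such a read-only, no-log-recovery mount would look roughly like this. This is only a sketch: the mount point is a placeholder, and the command is built as a string rather than executed, since it needs the real device and root privileges.

```shell
# Sketch: mount an XFS volume read-only without replaying the log.
# /mnt/rescue is a placeholder mount point; /dev/sdb1 is the device
# from this report. The command is printed, not run.
DEV=/dev/sdb1
MNT=/mnt/rescue
CMD="mount -t xfs -o ro,norecovery $DEV $MNT"
echo "$CMD"
```

The norecovery option skips log recovery entirely, which is why it can succeed on a filesystem whose log or metadata is too damaged for a normal mount; note that such a mount must be read-only.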

But xfs_repair crashes with a Segmentation fault. I tried both v3.1.4 from 
openSUSE 11.4 and xfs_repair v3.1.6 downloaded from the git repo.

Now, after a reboot (the ___production___ system froze completely while running 
the last xfs_repair v3.1.6!), the XFS got mounted read-write, but just trying to 
touch a file generated the following error:


Oct 17 16:33:02 c3m kernel: [  794.628715] Filesystem "sdb1": XFS internal error xfs_btree_check_sblock at line 120 of file /usr/src/packages/BUILD/kernel-default-2.6.37.6/linux-2.6.37/fs/xfs/xfs_btree.c.  Caller 0xffffffffa0376cbe
Oct 17 16:33:02 c3m kernel: [  794.628718] 
Oct 17 16:33:02 c3m kernel: [  794.628722] Pid: 9066, comm: touch Not tainted 2.6.37.6-0.7-default #1
Oct 17 16:33:02 c3m kernel: [  794.628724] Call Trace:
Oct 17 16:33:02 c3m kernel: [  794.628737]  [<ffffffff81005819>] dump_trace+0x69/0x2e0
Oct 17 16:33:02 c3m kernel: [  794.628744]  [<ffffffff814ba5c3>] dump_stack+0x69/0x6f
Oct 17 16:33:02 c3m kernel: [  794.628776]  [<ffffffffa0376666>] xfs_btree_check_sblock+0x86/0x120 [xfs]
Oct 17 16:33:02 c3m kernel: [  794.628864]  [<ffffffffa0376cbe>] xfs_btree_read_buf_block.clone.0+0x9e/0xc0 [xfs]
Oct 17 16:33:02 c3m kernel: [  794.628947]  [<ffffffffa0378a3e>] xfs_btree_increment+0x1ee/0x290 [xfs]
Oct 17 16:33:02 c3m kernel: [  794.629036]  [<ffffffffa038e522>] xfs_dialloc+0x5e2/0x900 [xfs]
Oct 17 16:33:02 c3m kernel: [  794.629148]  [<ffffffffa0390bd5>] xfs_ialloc+0x75/0x6d0 [xfs]
Oct 17 16:33:02 c3m kernel: [  794.629259]  [<ffffffffa03aaa25>] xfs_dir_ialloc+0x95/0x340 [xfs]
Oct 17 16:33:02 c3m kernel: [  794.629409]  [<ffffffffa03adbb6>] xfs_create+0x406/0x6c0 [xfs]
Oct 17 16:33:02 c3m kernel: [  794.629560]  [<ffffffffa03b9fdf>] xfs_vn_mknod+0xaf/0x1d0 [xfs]
Oct 17 16:33:02 c3m kernel: [  794.629717]  [<ffffffff811552c3>] vfs_create+0x113/0x190
Oct 17 16:33:02 c3m kernel: [  794.629724]  [<ffffffff81155c52>] do_last+0x572/0x600
Oct 17 16:33:02 c3m kernel: [  794.629730]  [<ffffffff81155e88>] do_filp_open+0x1a8/0x610
Oct 17 16:33:02 c3m kernel: [  794.629736]  [<ffffffff81147496>] do_sys_open+0x66/0x110
Oct 17 16:33:02 c3m kernel: [  794.629743]  [<ffffffff81002e4b>] system_call_fastpath+0x16/0x1b
Oct 17 16:33:02 c3m kernel: [  794.629754]  [<00007fbb0de57ce0>] 0x7fbb0de57ce0


The last lines before the " xfs_repair -n -P /dev/sdb1 " Segmentation fault were:

would clear forw/back pointers in block 0 for attributes in inode 4319273
bad attribute leaf magic # 0x250 for dir ino 4319273
problem with attribute contents in inode 4319273
would clear attr fork
bad nblocks 2 for inode 4319273, would reset to 1
bad anextents 1 for inode 4319273, would reset to 0
-bash: line 5:  6488 Segmentation fault      /opt/xfsprogs-3.1.6/sbin/xfs_repair -n -P /dev/sdb1

The complete " xfs_repair -n -P /dev/sdb1 " output file is 1.2 MB gzipped. If anyone wants 
to have a look at it please ask and I will send it as a private mail.


What can I do to recover/repair this XFS? We have important data in there!

Many thanks,
Richard


-- 
Richard Ems       mail: Richard.Ems@Cape-Horn-Eng.com

Cape Horn Engineering S.L.
C/ Dr. J.J. Dómine 1, 5º piso
46011 Valencia
Tel : +34 96 3242923 / Fax 924
http://www.cape-horn-eng.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock
  2011-10-17 14:42 xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock Richard Ems
@ 2011-10-17 15:02 ` Christoph Hellwig
  2011-10-17 15:08   ` Richard Ems
  2011-10-17 22:52 ` Dave Chinner
  1 sibling, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2011-10-17 15:02 UTC (permalink / raw)
  To: Richard Ems; +Cc: xfs

Can you send me a metadump of the filesystem?  It is generated using the
xfs_metadump tool from xfsprogs, and it obfuscates the file names used.
Feel free to upload it to a private place and send the link, or send it
to me directly if you don't want to see it on the list.
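A plausible invocation, sketched (the output path is a placeholder; the command is built as a string and printed rather than executed, since it needs the real device):

```shell
# Sketch: create a metadata-only dump of an XFS device with xfs_metadump.
# File names are obfuscated by default, so the dump can be shared.
# /tmp/sdb1.metadump is a placeholder output path.
DEV=/dev/sdb1
OUT=/tmp/sdb1.metadump
CMD="xfs_metadump $DEV $OUT"
echo "$CMD"
```

The resulting file contains only metadata (superblocks, btrees, inodes, directory blocks), not file contents, and compresses well with gzip before mailing.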


* Re: xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock
  2011-10-17 15:02 ` Christoph Hellwig
@ 2011-10-17 15:08   ` Richard Ems
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Ems @ 2011-10-17 15:08 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On 10/17/2011 05:02 PM, Christoph Hellwig wrote:
> Can you send me a metadump of the filesystem?  It is generated using the
> xfs_metadump tool from xfsprogs, and it obfuscates the file names used.
> Feel free to upload it to a private place and send the link and/or
> directly send it to me if you don't want to see it on the list.
> 

Hi, thanks for responding!

xfs_metadump also throws a segmentation fault, with both v3.1.6 and v3.1.4:

# /opt/xfsprogs-3.1.6/sbin/xfs_metadump /dev/sdb1 xfs_metadump_dev-sdb1.log
*** stack smashing detected ***: xfs_db terminated
/opt/xfsprogs-3.1.6/sbin/xfs_metadump: line 31: 22049 Segmentation fault      xfs_db$DBOPTS -F -i -p xfs_metadump -c "metadump$OPTS $2" $1


I will send the gzipped metadump file to you directly.

Many thanks,
Richard



* Re: xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock
  2011-10-17 14:42 xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock Richard Ems
  2011-10-17 15:02 ` Christoph Hellwig
@ 2011-10-17 22:52 ` Dave Chinner
  2011-10-18  9:58   ` Richard Ems
  1 sibling, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2011-10-17 22:52 UTC (permalink / raw)
  To: Richard Ems; +Cc: xfs

On Mon, Oct 17, 2011 at 04:42:55PM +0200, Richard Ems wrote:
> Hi all !
> 
> We have an XFS filesystem that started giving errors a few days
> ago.  This is on an openSUSE 11.4 64-bit system.  The filesystem is
> 12 TB in size, 9.8 TB of which are used, on hardware RAID 6 behind
> an Areca 1680 controller.
> 
> Mounting the XFS with ro,norecovery works almost always.
> 
> But xfs_repair crashes with a Segmentation fault. I tried both v3.1.4 from 
> openSUSE 11.4 and xfs_repair v3.1.6 downloaded from the git repo.

Ok, so not a new issue.

> Now, after a reboot (the ___production___ system froze completely while running 
> the last xfs_repair v3.1.6!), the XFS got mounted read-write, but just trying to 
> touch a file generated the following error:
> 
> Oct 17 16:33:02 c3m kernel: [  794.628715] Filesystem "sdb1": XFS internal error xfs_btree_check_sblock at line 120 of file /usr/src/packages/BUILD/kernel-default-2.6.37.6/linux-2.6.37/fs/xfs/xfs_btree.c.  Caller 0xffffffffa0376cbe
> Oct 17 16:33:02 c3m kernel: [  794.628718] 
> Oct 17 16:33:02 c3m kernel: [  794.628722] Pid: 9066, comm: touch Not tainted 2.6.37.6-0.7-default #1
> Oct 17 16:33:02 c3m kernel: [  794.628724] Call Trace:
> Oct 17 16:33:02 c3m kernel: [  794.628737]  [<ffffffff81005819>] dump_trace+0x69/0x2e0
> Oct 17 16:33:02 c3m kernel: [  794.628744]  [<ffffffff814ba5c3>] dump_stack+0x69/0x6f
> Oct 17 16:33:02 c3m kernel: [  794.628776]  [<ffffffffa0376666>] xfs_btree_check_sblock+0x86/0x120 [xfs]
> Oct 17 16:33:02 c3m kernel: [  794.628864]  [<ffffffffa0376cbe>] xfs_btree_read_buf_block.clone.0+0x9e/0xc0 [xfs]
> Oct 17 16:33:02 c3m kernel: [  794.628947]  [<ffffffffa0378a3e>] xfs_btree_increment+0x1ee/0x290 [xfs]
> Oct 17 16:33:02 c3m kernel: [  794.629036]  [<ffffffffa038e522>] xfs_dialloc+0x5e2/0x900 [xfs]
> Oct 17 16:33:02 c3m kernel: [  794.629148]  [<ffffffffa0390bd5>] xfs_ialloc+0x75/0x6d0 [xfs]

A corrupt inode allocation btree - not a particularly common type of
corruption to be reported. Do you know what caused the errors to
start being reported? A crash, a bad disk, a raid rebuild, something
else?  That information always helps us understand how badly damaged
the filesystem might be....

> The last lines before the " xfs_repair -n -P /dev/sdb1 " Segmentation fault were:
> 
> would clear forw/back pointers in block 0 for attributes in inode 4319273
> bad attribute leaf magic # 0x250 for dir ino 4319273
> problem with attribute contents in inode 4319273
> would clear attr fork
> bad nblocks 2 for inode 4319273, would reset to 1
> bad anextents 1 for inode 4319273, would reset to 0
> -bash: line 5:  6488 Segmentation fault      /opt/xfsprogs-3.1.6/sbin/xfs_repair -n -P /dev/sdb1

And I'd guess that is failing on a different problem - a corrupt
inode most likely. You've built xfs_repair from the source code -
can you run it under gdb so we can see where it is dying?
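The thread doesn't show the exact invocation, but one plausible way to follow that suggestion looks like this. It is only a sketch: the binary path and device come from the output quoted earlier in the thread, the command is printed rather than run (it needs the real device), and the gdb session is indicated in comments.

```shell
# Sketch: run the dry-run repair under gdb to capture where it segfaults.
# Paths are taken from the report above; command printed, not executed.
REPAIR=/opt/xfsprogs-3.1.6/sbin/xfs_repair
DEV=/dev/sdb1
CMD="gdb --args $REPAIR -n -P $DEV"
echo "$CMD"
# Inside gdb:
#   (gdb) run        # reproduces the crash
#   (gdb) bt full    # after SIGSEGV, prints the full backtrace
```

Building xfsprogs with debug symbols (CFLAGS including -g) makes the resulting backtrace far more useful to the developers.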

> The complete " xfs_repair -n -P /dev/sdb1 " output file is 1.2 MB
> gzipped. If anyone wants to have a look at it please ask and I
> will send it as a private mail.

That sounds like there's a *lot* of damage to the filesystem. That
makes it even more important that we understand what caused the
damage in the first place....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock
  2011-10-17 22:52 ` Dave Chinner
@ 2011-10-18  9:58   ` Richard Ems
  2011-10-24 16:55     ` Michael Monnerie
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Ems @ 2011-10-18  9:58 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 10/18/2011 12:52 AM, Dave Chinner wrote:
> A corrupt inode allocation btree - not a particularly common type of
> corruption to be reported. Do you know what caused the errors to
> start being reported? A crash, a bad disk, a raid rebuild, something
> else?  That information always helps us understand how badly damaged
> the filesystem might be....

We had a hard disc failure on an Areca 1680 RAID controller, RAID 6. I
checked the firmware, and the latest available version is already installed.


> And I'd guess that is failing on a different problem - a corrupt
> inode most likely. You've build xfs-repair from the source code -
> can yo urun it under gdb so we can see where it is dying?

I already ran it under gdb and sent some mails to Christoph Hellwig. He
found an *issue* in the xfsprogs/repair/attr_repair.c code and sent me a
patch that fixed it. Now "xfs_repair -n -P /dev/sdb1" runs without
errors. But before repairing the XFS I have to rsync as much as I can
from this filesystem to another one, which is not yet available. So it
will take a couple of days before I can run xfs_repair on it.
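A salvage copy of that kind might be sketched as follows. The source and destination paths are placeholders, and the command is built as a string rather than executed; the flag choice is an assumption, not what Richard actually ran.

```shell
# Sketch: copy as much as possible off a damaged (read-only mounted)
# filesystem before attempting repair. Paths are placeholders; the
# command is printed, not run.
SRC=/mnt/rescue/
DST=/mnt/newfs/
# -aHAX preserves permissions, hard links, ACLs and xattrs;
# --ignore-errors keeps going past I/O errors on corrupt files.
CMD="rsync -aHAX --partial --ignore-errors $SRC $DST"
echo "$CMD"
```

Copying data off first is the safe order of operations here: xfs_repair modifies the filesystem in place, so anything it discards during repair is gone for good.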



> That sounds like there's a *lot* of damage to the filesystem. That
> makes it even more important that we understand what caused the
> damage in the first place....

Yes, lots of damage. 8(


Thanks for your help,
Richard



* Re: xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock
  2011-10-18  9:58   ` Richard Ems
@ 2011-10-24 16:55     ` Michael Monnerie
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Monnerie @ 2011-10-24 16:55 UTC (permalink / raw)
  To: xfs; +Cc: Richard Ems



On Dienstag, 18. Oktober 2011 Richard Ems wrote:
> We had a hard disc failure on an Areca 1680 RAID controller, RAID 6.
> I checked the firmware and the last available version is already
> installed.

Did you check that your disks are compatible? Areca's 1680 is pretty 
strict about disk firmware; we've had bad behaviour when a drive and 
firmware were not on the support list.

If the disks are compatible and one broke, it may well have behaved very 
badly and written different data than it read back. If you rebuilt your 
RAID, let it run a verify again to confirm it's good now. We've had that 
error once... in the end we rebuilt the whole damn RAID from scratch.

-- 
with kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531


end of thread, other threads:[~2011-10-24 16:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-17 14:42 xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock Richard Ems
2011-10-17 15:02 ` Christoph Hellwig
2011-10-17 15:08   ` Richard Ems
2011-10-17 22:52 ` Dave Chinner
2011-10-18  9:58   ` Richard Ems
2011-10-24 16:55     ` Michael Monnerie
