public inbox for linux-xfs@vger.kernel.org
From: Yann Dupont <Yann.Dupont@univ-nantes.fr>
To: David Chinner <dgc@sgi.com>
Cc: xfs@oss.sgi.com, Jacky Carimalo <jacky.carimalo@univ-nantes.fr>
Subject: Re: kernel oops on debian , 2.6.18-5
Date: Tue, 18 Dec 2007 15:41:36 +0100
Message-ID: <4767DC20.1080406@univ-nantes.fr>
In-Reply-To: <20071218123259.GL4396912@sgi.com>

David Chinner wrote:
> On Tue, Dec 18, 2007 at 10:20:21AM +0100, Yann Dupont wrote:
>   
>> Hello, we got a kernel oops, probably in xfs on a debian kernel.
>>
>> This volume is on SAN + device mapper.
>> This is a 1 TB volume. It has been in service for 2 or 3 years.
>> There is a high number of files on it, as this volume serves an
>> rsyncd to which 200+ servers sync their root filesystems every day.
>>
>> here is the oops :
>>
>> Dec 16 23:27:32 inchgower kernel: XFS internal error
>> XFS_WANT_CORRUPTED_GOTO at line 1561 of file fs/xfs/xfs_alloc.c.  Caller
>> 0xffffffff881857b7
>> Dec 16 23:27:32 inchgower kernel:
>> Dec 16 23:27:32 inchgower kernel: Call Trace:
>> Dec 16 23:27:32 inchgower kernel:  [<ffffffff88183ec0>]
>> :xfs:xfs_free_ag_extent+0x19f/0x67f
>>     
>
> Corrupted freespace btree. What does xfs_check tell you about the
> filesystem on dm-3?
>
>   
xfs_check tells me to run xfs_repair -L, because the attempts to mount
the FS to clear the log end in a kernel oops:

XFS internal error XFS_WANT_CORRUPTED_RETURN at line 281 of file 
fs/xfs/xfs_alloc.c.  Caller 0xffffffff88182f74

Call Trace:
  [<ffffffff881816ed>] :xfs:xfs_alloc_fixup_trees+0x2fa/0x30b
  [<ffffffff88198822>] :xfs:xfs_btree_setbuf+0x1f/0x89
  [<ffffffff88182f74>] :xfs:xfs_alloc_ag_vextent+0xbd4/0xf5e
  [<ffffffff88183aa5>] :xfs:xfs_alloc_vextent+0x2ce/0x401
  [<ffffffff88191a70>] :xfs:xfs_bmapi+0x1068/0x1c85
  [<ffffffff881c85f2>] :xfs:kmem_zone_alloc+0x56/0xa3
  [<ffffffff8819ca78>] :xfs:xfs_dir2_grow_inode+0xca/0x2d4
  [<ffffffff8819d8df>] :xfs:xfs_dir2_sf_to_block+0xad/0x5ba
  [<ffffffff881b001b>] :xfs:xfs_inode_item_init+0x1e/0x7a
  [<ffffffff881a4348>] :xfs:xfs_dir2_sf_addname+0x19d/0x4cf
  [<ffffffff8819d43e>] :xfs:xfs_dir_createname+0xc4/0x134
  [<ffffffff881c865d>] :xfs:kmem_zone_zalloc+0x1e/0x2f
  [<ffffffff881b001b>] :xfs:xfs_inode_item_init+0x1e/0x7a
  [<ffffffff881c6065>] :xfs:xfs_create+0x39d/0x5dd
  [<ffffffff881ce702>] :xfs:xfs_vn_mknod+0x1bd/0x3c8
  [<ffffffff80220a18>] __up_read+0x13/0x8a
  [<ffffffff881aa75e>] :xfs:xfs_iunlock+0x57/0x79
  [<ffffffff881c3392>] :xfs:xfs_access+0x3d/0x46
  [<ffffffff8819d112>] :xfs:xfs_dir_lookup+0xa2/0x122
  [<ffffffff8020e0c5>] link_path_walk+0xd3/0xe5
  [<ffffffff80239138>] vfs_create+0xe7/0x12c
  [<ffffffff80219430>] open_namei+0x18c/0x6a0
  [<ffffffff881cc5bb>] :xfs:xfs_file_open+0x27/0x2c
  [<ffffffff80225d1d>] do_filp_open+0x1c/0x3d
  [<ffffffff802180e0>] do_sys_open+0x44/0xc5
  [<ffffffff8025d2a2>] ia32_sysret+0x0/0xa

Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1138 of 
file fs/xfs/xfs_trans.c.  Caller 0xffffffff881c6253

Call Trace:
  [<ffffffff881bdeac>] :xfs:xfs_trans_cancel+0x5b/0xfe
  [<ffffffff881c6253>] :xfs:xfs_create+0x58b/0x5dd
  [<ffffffff881ce702>] :xfs:xfs_vn_mknod+0x1bd/0x3c8
  [<ffffffff80220a18>] __up_read+0x13/0x8a
  [<ffffffff881aa75e>] :xfs:xfs_iunlock+0x57/0x79
  [<ffffffff881c3392>] :xfs:xfs_access+0x3d/0x46
  [<ffffffff8819d112>] :xfs:xfs_dir_lookup+0xa2/0x122
  [<ffffffff8020e0c5>] link_path_walk+0xd3/0xe5
  [<ffffffff80239138>] vfs_create+0xe7/0x12c
  [<ffffffff80219430>] open_namei+0x18c/0x6a0
  [<ffffffff881cc5bb>] :xfs:xfs_file_open+0x27/0x2c
  [<ffffffff80225d1d>] do_filp_open+0x1c/0x3d
  [<ffffffff802180e0>] do_sys_open+0x44/0xc5
  [<ffffffff8025d2a2>] ia32_sysret+0x0/0xa
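
When several of these reports pile up in the logs, it can help to pull out just the failing check, source location, and caller address from each one. The following is a minimal sketch, assuming the log text has been saved or pasted as a plain string without mail quoting prefixes; the function name `parse_xfs_errors` is hypothetical and not part of any XFS tool.

```python
import re

# Matches the "XFS internal error ..." reports as they appear in this message.
XFS_ERROR_RE = re.compile(
    r"XFS internal error (?P<check>\w+) at line (?P<line>\d+) of\s+file\s+"
    r"(?P<file>\S+?)\.\s+Caller\s+(?P<caller>0x[0-9a-fA-F]+)"
)


def parse_xfs_errors(log_text):
    """Return one dict per 'XFS internal error' report found in log_text."""
    # Mail clients and consoles wrap long lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    return [
        {
            "check": m.group("check"),
            "file": m.group("file"),
            "line": int(m.group("line")),
            "caller": m.group("caller"),
        }
        for m in XFS_ERROR_RE.finditer(flat)
    ]
```

Feeding it the two reports above would yield one entry for `fs/xfs/xfs_alloc.c` line 281 and one for `fs/xfs/xfs_trans.c` line 1138.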


I've upgraded xfs_repair to the latest version available on Debian
(xfs_repair version 2.9.4).

It reports lots of errors
(I don't have the beginning of the output on the console):

...
data fork in ino 3628932549 claims free block 226749351
data fork in ino 3628932549 claims free block 226749352
data fork in ino 3628932549 claims free block 226749353
data fork in ino 3628932549 claims free block 226749354
data fork in ino 3628932549 claims free block 226749355
data fork in ino 3628932549 claims free block 226749356
data fork in ino 3628932549 claims free block 226749357
data fork in ino 3628932549 claims free block 226749358
data fork in ino 3628932549 claims free block 226749359
data fork in ino 3628932549 claims free block 226749360
data fork in ino 3628932549 claims free block 226749361
data fork in ino 3628932549 claims free block 226749362
data fork in ino 3628932549 claims free block 226749363
imap claims a free inode 3629547632 is in use, correcting imap and 
clearing inode
         - agno = 28
         - agno = 29
data fork in ino 3894217924 claims free block 243388605
data fork in ino 3894217924 claims free block 243388606
data fork in ino 3899211601 claims free block 243702250
data fork in ino 3899211601 claims free block 243702251
data fork in ino 3899211601 claims free block 243702252
data fork in ino 3907562994 claims free block 244222632
data fork in ino 3907562994 claims free block 244222633
data fork in ino 3907562994 claims free block 244222634
data fork in ino 3907562994 claims free block 244222635
data fork in ino 3907562994 claims free block 244222636
data fork in ino 3910289697 claims free block 244393117
data fork in ino 3910289697 claims free block 244393118
data fork in ino 3910289699 claims free block 244393113
....
and at the end:


         - agno = 31
correcting imap
correcting imap
correcting imap
correcting imap
correcting imap
         - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
         - setting up duplicate extent list...
         - check for inodes claiming duplicate blocks...
         - agno = 0
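
The "claims free block" lines are very repetitive, so a short script can condense them into per-inode block ranges. This is only a sketch, assuming the xfs_repair output was captured to a string; `summarize_claims` is a hypothetical helper, and it only restates what xfs_repair printed.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   data fork in ino 3894217924 claims free block 243388605
CLAIM_RE = re.compile(r"data fork in ino (\d+) claims free block (\d+)")


def summarize_claims(repair_output):
    """Map inode number -> list of (first, last) runs of consecutive blocks."""
    blocks_by_inode = defaultdict(list)
    for ino, block in CLAIM_RE.findall(repair_output):
        blocks_by_inode[int(ino)].append(int(block))

    summary = {}
    for ino, blocks in blocks_by_inode.items():
        runs = []
        for block in sorted(set(blocks)):
            if runs and block == runs[-1][1] + 1:
                runs[-1] = (runs[-1][0], block)  # extend the current run
            else:
                runs.append((block, block))  # start a new run
        summary[ino] = runs
    return summary
```

For the excerpt above, inode 3628932549 would collapse into a single run from block 226749351 to 226749363.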




And now the process seems stuck.
There is no activity on the SAN disk.

ps shows this:

root      7885  6466  7885  0    6 1447133 5660020 6 09:55 pts/0  00:00:19 xfs_repair -L /dev/evms/DATAXFS2
root      7885  6466 17190  0    6 1447133 5660020 6 10:16 pts/0  00:00:00 xfs_repair -L /dev/evms/DATAXFS2
root      7885  6466 17191  0    6 1447133 5660020 6 10:16 pts/0  00:00:00 xfs_repair -L /dev/evms/DATAXFS2
root      7885  6466 17192  0    6 1447133 5660020 6 10:16 pts/0  00:00:00 xfs_repair -L /dev/evms/DATAXFS2
root      7885  6466 17193  0    6 1447133 5660020 6 10:16 pts/0  00:00:00 xfs_repair -L /dev/evms/DATAXFS2
root      7885  6466 17194  0    6 1447133 5660020 6 10:16 pts/0  00:00:00 xfs_repair -L /dev/evms/DATAXFS2


and strace shows this:
inchgower:~# strace -fp 7885
Process 17194 attached with 6 threads - interrupt to quit
[pid 17191] futex(0x2aab3c8fa884, FUTEX_WAIT, 44, NULL <unfinished ...>
[pid 17192] futex(0x2aab3c8fa884, FUTEX_WAIT, 44, NULL <unfinished ...>
[pid 17193] futex(0x2aab3c8fa884, FUTEX_WAIT, 44, NULL <unfinished ...>
[pid 17194] futex(0x2aab3c8fa884, FUTEX_WAIT, 44, NULL <unfinished ...>
[pid 17190] futex(0x67e4f8, FUTEX_WAIT, 2, NULL
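
To see at a glance which threads are blocked on which futex, the strace lines can be grouped by futex address. This is a rough sketch, assuming strace output in the format shown above; `futex_waiters` is a hypothetical helper. It only groups what strace printed and says nothing about which thread is expected to wake the waiters.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   [pid 17191] futex(0x2aab3c8fa884, FUTEX_WAIT, 44, NULL <unfinished ...>
FUTEX_RE = re.compile(r"\[pid\s+(\d+)\]\s+futex\((0x[0-9a-f]+),\s*FUTEX_WAIT\b")


def futex_waiters(strace_output):
    """Map futex address -> list of thread IDs blocked in FUTEX_WAIT on it."""
    waiters = defaultdict(list)
    for pid, addr in FUTEX_RE.findall(strace_output):
        waiters[addr].append(int(pid))
    return dict(waiters)
```

On the output above this groups threads 17191 to 17194 under 0x2aab3c8fa884 and thread 17190 under 0x67e4f8, which matches the picture of every thread sleeping and none doing I/O.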

Can I stop the process and start another version without risking problems?

> Could be a hardware problem. Could be an XFS problem. Could be a dm problem.
> I really can't say from a shutdown message like this - all it tells us is
> that a btree block was corrupted by something since the last time it was
> checked....
>
> Cheers,
>
> Dave.
>   

OK,
cheers,

--
Yann Dupont, Cri de l'université de Nantes
Tel: 02.51.12.53.91 - Fax: 02.51.12.58.60 - Yann.Dupont@univ-nantes.fr

Thread overview: 5+ messages
2007-12-18  9:20 kernel oops on debian , 2.6.18-5 Yann Dupont
2007-12-18 12:32 ` David Chinner
2007-12-18 14:41   ` Yann Dupont [this message]
2007-12-19  0:38     ` Barry Naujok
2007-12-19 10:27       ` Yann Dupont
