From: Eric Sandeen <sandeen@sandeen.net>
To: Chris Samuel <csamuel@vpac.org>
Cc: xfs@oss.sgi.com
Subject: Re: XFS filesystem shutting down on linux 2.6.28.10 (xfs_rename)
Date: Mon, 10 Aug 2009 09:29:48 -0500
Message-ID: <4A802EDC.1080706@sandeen.net>
In-Reply-To: <1367391532.793061249444829356.JavaMail.root@mail.vpac.org>

Chris Samuel wrote:
> Hi folks,
> 
> I believe we've been hitting the same issue that
> Gabriel Barazer reported in 2.6.28.9 on the 22nd
> of July on our NFS server for our HPC Linux clusters.
> 
> Here is the backtrace we got this morning:
> 
> Aug  5 11:44:27 stg7 kernel: [680506.864506] Pid: 5271, comm: nfsd Not tainted 2.6.28.10-vpac-1 #1
> Aug  5 11:44:27 stg7 kernel: [680506.864508] Call Trace:
> Aug  5 11:44:27 stg7 kernel: [680506.864541]  [<ffffffffa032c8d5>] xfs_rename+0x5ac/0x5af [xfs]
> Aug  5 11:44:27 stg7 kernel: [680506.864567]  [<ffffffffa032d793>] xfs_trans_cancel+0x56/0xee [xfs]
> Aug  5 11:44:27 stg7 kernel: [680506.864589]  [<ffffffffa032c8d5>] xfs_rename+0x5ac/0x5af [xfs]
> Aug  5 11:44:27 stg7 kernel: [680506.864609]  [<ffffffffa033b8d0>] xfs_vn_rename+0x61/0x69 [xfs]
> Aug  5 11:44:27 stg7 kernel: [680506.864615]  [<ffffffff8029a798>] vfs_rename+0x28a/0x404
> Aug  5 11:44:27 stg7 kernel: [680506.864642]  [<ffffffffa045322c>] nfsd_rename+0x2ba/0x35f [nfsd]
> Aug  5 11:44:27 stg7 kernel: [680506.864654]  [<ffffffffa045a898>] nfsd3_proc_rename+0x120/0x131 [nfsd]
> Aug  5 11:44:27 stg7 kernel: [680506.864681]  [<ffffffffa044f23b>] nfsd_dispatch+0xdd/0x1b9 [nfsd]
> Aug  5 11:44:27 stg7 kernel: [680506.864706]  [<ffffffffa03b3cdd>] svc_process+0x3e6/0x70e [sunrpc]
> Aug  5 11:44:27 stg7 kernel: [680506.864711]  [<ffffffff8022f9f2>] default_wake_function+0x0/0xe
> Aug  5 11:44:27 stg7 kernel: [680506.864717]  [<ffffffff8040dfac>] __down_read+0x15/0x99
> Aug  5 11:44:27 stg7 kernel: [680506.864740]  [<ffffffffa044f7d1>] nfsd+0x1a0/0x26c [nfsd]
> Aug  5 11:44:27 stg7 kernel: [680506.864750]  [<ffffffffa044f631>] nfsd+0x0/0x26c [nfsd]
> Aug  5 11:44:27 stg7 kernel: [680506.864754]  [<ffffffff802470de>] kthread+0x47/0x73
> Aug  5 11:44:27 stg7 kernel: [680506.864757]  [<ffffffff80232f9a>] schedule_tail+0x27/0x60
> Aug  5 11:44:27 stg7 kernel: [680506.864761]  [<ffffffff8020ccd9>] child_rip+0xa/0x11
> Aug  5 11:44:27 stg7 kernel: [680506.864764]  [<ffffffff80247097>] kthread+0x0/0x73
> Aug  5 11:44:27 stg7 kernel: [680506.864766]  [<ffffffff8020cccf>] child_rip+0x0/0x11
> Aug  5 11:44:27 stg7 kernel: [680506.864770] xfs_force_shutdown(md25,0x8) called from line 1165 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffa032d7ac

...

Just for the record, Chris let me know offline that he tried ext4 and
got an error:

> EXT4-fs: mounted filesystem sde1 with ordered data mode
> end_request: I/O error, dev sde, sector 1430524111
> Aborting journal on device sde1:8.
> ext4_abort called.
> EXT4-fs error (device sde1): ext4_journal_start_sb: Detected aborted journal
> Remounting filesystem read-only
> <snip>
> ext4_abort called.
> EXT4-fs error (device sde1): ext4_put_super: Couldn't clean up the journal
> end_request: I/O error, dev sde, sector 63

So he got I/O errors at sector 1430524111 and at sector 63 (!).
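
As a rough illustration of pulling those sector numbers out of a log, here
is a minimal Python sketch. It assumes the block layer reports sectors in
512-byte units; the names find_io_errors and SECTOR_SIZE are hypothetical
and not part of any kernel tool.

```python
import re

# Matches lines such as "end_request: I/O error, dev sde, sector 63".
IO_ERROR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")

SECTOR_SIZE = 512  # assumption: sectors are reported in 512-byte units


def find_io_errors(log_text):
    """Return (device, sector, byte_offset) for each I/O error line found."""
    results = []
    for line in log_text.splitlines():
        match = IO_ERROR_RE.search(line)
        if match:
            device = match.group(1)
            sector = int(match.group(2))
            results.append((device, sector, sector * SECTOR_SIZE))
    return results


if __name__ == "__main__":
    sample = """\
EXT4-fs: mounted filesystem sde1 with ordered data mode
end_request: I/O error, dev sde, sector 1430524111
Aborting journal on device sde1:8.
end_request: I/O error, dev sde, sector 63
"""
    for device, sector, offset in find_io_errors(sample):
        print(f"{device}: sector {sector} -> byte offset {offset}")
```

Running the same scan over the full dmesg or syslog output would show
whether any I/O errors were logged around the time of the XFS shutdown.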

The question now may be whether XFS also got an I/O error, which caused the
dirty transaction cancellation, but didn't report it as such.

It is also interesting that no other layers complained about the I/O error ...

What does your storage stack look like?

-Eric

> This kernel is built with XFS as a kernel module so I've
> been able to attach the objdump output that Eric Sandeen
> had originally requested from Gabriel.
> 
> Like Gabriel we're stuck on 2.6.28.x as the last working
> NFS exporting XFS kernel due to kernel bug #13375 (the
> radix bug), so I hope this helps!
> 
> cheers,
> Chris
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

