Re: kernel OOPS for XFS in xfs_iget_core (using NFS+SMP+MD)

public inbox for linux-kernel@vger.kernel.org

From: Gregory Brauer <greg@wildbrain.com>
To: linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com
Cc: Joshua Baker-LePain <jlb17@duke.edu>,
	Jakob Oestergaard <jakob@unthought.net>,
	Chris Wedgwood <cw@f00f.org>
Subject: Re: kernel OOPS for XFS in xfs_iget_core (using NFS+SMP+MD)
Date: Wed, 18 May 2005 13:43:16 -0700
Message-ID: <428BA8E4.2040108@wildbrain.com>
In-Reply-To: <Pine.LNX.4.58.0505181556410.6834@chaos.egr.duke.edu>

Joshua Baker-LePain wrote:
> Do you have a test case that would show this up?  I've been testing a 
> centos-4 based server with the RH-derived 2.6.9-based kernel tweaked to 
> disable 4K stacks and enable XFS and haven't run into any issues yet.  
> This includes running the parallel IOR benchmark from 10 clients (and 
> getting 200MiB/s throughput on reads).
> 

For Jakob,
Note that the last OOPS I posted was for 2.6.11.10.

For Joshua,

We first saw the problem after 5 days in production.  Since then
we have taken the server out of production and have been testing
with the script nfs_fsstress.sh located in this package:

http://prdownloads.sourceforge.net/ltp/ltp-full-20050505.tgz?download

We run the script on 5 client machines that are running RedHat 9
with kernel-smp-2.4.20-20.9 and nfs-utils-1.0.1-3.9.1.legacy and
are NFS mounting our 2.6 kernel server.  The longest time to OOPS
has been about 8 hours.  We have not tried the parallel IOR
benchmark.  (Where can we get that?)

You didn't mention if you are using md at all.  We have a
software RAID-0 of 4 x 3ware 8506-4 controllers running the
latest 3ware driver from their site.  The filesystem is XFS.
The network driver is e1000 (two interfaces, not bonded).  The
system is a dual Xeon.  We raised the number of NFS daemons
from 8 to 64.  The nfs_fsstress.sh client mounts the server
over both UDP and TCP, and our in-production OOPS likely also
happened with both protocols in use simultaneously.  We've seen
the OOPS with both the default and 32K read and write NFS block
sizes.  The machine was stable for over a year with RedHat 9
and 2.4.20.
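
The client-side mount combinations described above (UDP and TCP,
default and 32K block sizes) can be enumerated as in the following
sketch.  The server name, export path, and mount points are
hypothetical placeholders, not values from this report, and trying
each combination as a separate mount is an assumption rather than
what nfs_fsstress.sh necessarily does.

```python
from itertools import product

# Hypothetical names for illustration only; the report gives neither.
SERVER = "nfs-server"
EXPORT = "/export"


def mount_commands(server=SERVER, export=EXPORT):
    """Build one NFS mount command per protocol / block-size combination."""
    protocols = ["udp", "tcp"]
    block_sizes = [None, 32768]  # None = leave rsize/wsize at the client default
    commands = []
    for proto, size in product(protocols, block_sizes):
        opts = [proto]
        if size is not None:
            opts += [f"rsize={size}", f"wsize={size}"]
        # Mount point naming is an arbitrary choice made for this sketch.
        mountpoint = f"/mnt/{proto}-{size or 'default'}"
        commands.append(
            f"mount -t nfs -o {','.join(opts)} {server}:{export} {mountpoint}"
        )
    return commands


if __name__ == "__main__":
    for cmd in mount_commands():
        print(cmd)
```

The commands are only printed, never executed, so the list can be
reviewed before being run as root on a test client.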

I'm grasping for any subtle details that might help...

Here is the list of modules loaded on the server:

Module                  Size  Used by
nfsd                  185569  65
exportfs                9921  1 nfsd
lockd                  59625  2 nfsd
md5                     8001  1
ipv6                  236769  16
parport_pc             29701  1
lp                     15405  0
parport                37129  2 parport_pc,lp
sunrpc                135077  28 nfsd,lockd
xfs                   487809  1
dm_mod                 57925  0
video                  19653  0
button                 10577  0
battery                13253  0
ac                      8773  0
uhci_hcd               33497  0
hw_random               9429  0
i2c_i801               11981  0
i2c_core               24513  1 i2c_i801
e1000                  84629  0
bonding                59817  0
floppy                 56913  0
ext3                  117961  2
jbd                    57177  1 ext3
raid0                  11840  1
3w_xxxx                30561  4
sd_mod                 20545  4
scsi_mod              116033  2 3w_xxxx,sd_mod

Let me know if there is anything else I can provide.

Thanks.

Greg
