From: "J. Bruce Fields" <bfields@fieldses.org>
To: Erez Zadok <ezk-EX0cT3Az47bauI2f2gSDlQ@public.gmane.org>
Cc: Trond.Myklebust@netapp.com, linux-nfs@vger.kernel.org,
nfs@lists.sourceforge.net
Subject: Re: nfs2/3 ESTALE bug on mount point (v2.6.24-rc8)
Date: Mon, 21 Jan 2008 17:08:28 -0500
Message-ID: <20080121220828.GR17468@fieldses.org>
In-Reply-To: <200801212028.m0LKSpwA002924-zop+azHP2WsZjdeEBZXbMidm6ipF23ct@public.gmane.org>

On Mon, Jan 21, 2008 at 03:28:51PM -0500, Erez Zadok wrote:
> In message <20080121193116.GM17468@fieldses.org>, "J. Bruce Fields" writes:
> > On Mon, Jan 21, 2008 at 01:19:30PM -0500, Erez Zadok wrote:
> > > Since around 2.6.24-rc5 or so I've had an occasional problem: I get an
> > > ESTALE error on the mount point after setting up a localhost exported mount
> > > point, and trying to mkdir something there (this is part of my setup scripts
> > > prior to running unionfs regression tests).
> > >
> > > I'm CC'ing both client and server maintainers/list, b/c I'm not certain
> > > where the problem is. The problem doesn't exist in 2.6.23 or earlier stable
> > > kernels. It doesn't appear in nfs4 either, only nfs2 and nfs3.
> > >
> > > The problem is seen intermittently, and is probably some form of a race. I
> > > was finally able to narrow it down a bit. I was able to write a shell
> > > script that for me reproduces the problem within a few minutes (I tried it
> > > on v2.6.24-rc8-74-ga7da60f and several different machine configurations).
> > >
> > > I've included the shell script below. Hopefully you can use it to track the
> > > problem down. The mkdir command in the middle of the script is the one
> > > that'll eventually cause an ESTALE error and cause the script to abort; you
> > > can run "df" afterward to see the stale mount points.
> > >
> > > Notes: the one anecdotal factor that seems to make the bug appear sooner is
> > > if you increase the number of total mounts that the script below creates
> > > ($MAX in the script).
> >
> > OK, so to summarize:
> >
> > 1. create $MAX ext2 filesystem images, loopback-mount them, and export
> > the result.
> > 2. nfs-mount each of those $MAX exports.
> > 3. create a directory under each of those nfs-mounts.
> > 4. unmount and unexport
> >
> > Repeat that a thousand times, and eventually you get ESTALE at step 3?
>
> Your description is correct.
>
> > I guess one step would be to see if it's possible to get a network trace
> > showing what happened in the bad case....
>
> Here you go. See the tcpdump in here:
>
> http://agora.fsl.cs.sunysb.edu/tmp/nfs/
>
> I captured it on an x86_64 machine using
>
> tcpdump -s 0 -i lo -w tcpdump2
>
> And it shows near the very end the ESTALE error.

Yep, thanks! So frame 107855 has the MNT reply that returns the
filehandle in question, which is used in an ACCESS call in frame 107855
that gets an ESTALE. Looks like an unhappy server!

> Do you think this could be related to nfs-utils? I find that I can easily
> trigger this problem on an FC7 machine with nfs-utils-1.1.0-4.fc7 (within
> 10-30 runs of the above loop); but so far I cannot trigger the problem on an
> FC6 machine with nfs-utils-1.0.10-14.fc6 (even after 300+ runs of the above
> loop).

Yes, it's quite likely, though on a quick skim through the git logs I
don't see an obviously related commit....

--b.
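
For anyone trying to reproduce this, the four steps summarized earlier in this message can be sketched as a dry-run command plan. This is not Erez's script, which is not included in this message. The base path, the 16 MB image size, the export options, the `nfsvers=3` mount option, and the `testdir` name are all assumptions chosen for illustration. The function only builds the command strings; nothing is executed.

```python
import shlex


def plan_iteration(max_mounts, base="/tmp/estale-test"):
    """Return the shell commands for one pass of steps 1-4 as strings.

    All paths, sizes, and options are illustrative assumptions, not
    values taken from the original reproduction script.
    """
    cmds = []
    ids = range(max_mounts)

    # Step 1: create ext2 images, loopback-mount them, and export them.
    for i in ids:
        img, exp, nfs = f"{base}/img{i}", f"{base}/export{i}", f"{base}/nfs{i}"
        cmds.append(["mkdir", "-p", exp, nfs])
        cmds.append(["dd", "if=/dev/zero", f"of={img}", "bs=1M", "count=16"])
        cmds.append(["mkfs.ext2", "-q", "-F", img])
        cmds.append(["mount", "-o", "loop", img, exp])
        cmds.append(["exportfs", "-o", "rw,no_root_squash", f"localhost:{exp}"])

    # Step 2: NFS-mount each export over localhost (v3 assumed here).
    for i in ids:
        cmds.append(["mount", "-t", "nfs", "-o", "nfsvers=3",
                     f"localhost:{base}/export{i}", f"{base}/nfs{i}"])

    # Step 3: create a directory under each NFS mount.
    # This is the step reported to fail intermittently with ESTALE.
    for i in ids:
        cmds.append(["mkdir", f"{base}/nfs{i}/testdir"])

    # Step 4: unmount the NFS mounts, unexport, and unmount the images.
    for i in ids:
        cmds.append(["umount", f"{base}/nfs{i}"])
        cmds.append(["exportfs", "-u", f"localhost:{base}/export{i}"])
        cmds.append(["umount", f"{base}/export{i}"])

    return [shlex.join(c) for c in cmds]


if __name__ == "__main__":
    # Print the plan for two mounts; a real run would need root and
    # would loop over this plan many times until mkdir fails.
    print("\n".join(plan_iteration(2)))
```

Raising `max_mounts` corresponds to the $MAX setting that the original report says makes the failure show up sooner.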