Linux NFS development
* [NFS] nfs2/3 ESTALE bug on mount point (v2.6.24-rc8)
@ 2008-01-21 18:19 Erez Zadok
  0 siblings, 1 reply; 11+ messages in thread
From: Erez Zadok @ 2008-01-21 18:19 UTC
  To: Trond.Myklebust, bfields; +Cc: linux-nfs, nfs

Since around 2.6.24-rc5 or so I've had an occasional problem: I get an
ESTALE error on the mount point after setting up a localhost exported mount
point, and trying to mkdir something there (this is part of my setup scripts
prior to running unionfs regression tests).

I'm CC'ing both client and server maintainers/list, b/c I'm not certain
where the problem is.  The problem doesn't exist in 2.6.23 or earlier stable
kernels.  It doesn't appear in nfs4 either, only nfs2 and nfs3.

The problem is intermittent, and is probably some form of a race.  I was
finally able to narrow it down a bit: I wrote a shell script that, for me,
reproduces the problem within a few minutes (I tried it on
v2.6.24-rc8-74-ga7da60f and several different machine configurations).

I've included the shell script below.  Hopefully you can use it to track the
problem down.  The mkdir command in the middle of the script is the one
that'll eventually cause an ESTALE error and cause the script to abort; you
can run "df" afterward to see the stale mount points.

Note: the one factor that, anecdotally, seems to make the bug appear sooner
is increasing the total number of mounts that the script below creates
($MAX in the script).

Hope this helps.

Thanks,
Erez.


#!/bin/bash
# (bash rather than plain sh: the script uses "function" and "let")
# script to tickle a "stale filehandle" mount-point bug in nfs2/3
# Erez Zadok.

# mount flags
FLAGS=no_root_squash,rw,async
# highest mount index: the loops below run 0..$MAX, so this creates
# MAX+1 nfs mounts (each using a loop device)
MAX=6
# total no. of times to try test
COUNT=1000

function runcmd
{
    echo "CMD: $*"
    "$@"
    ret=$?
    test $ret -ne 0 && exit $ret
}

function doit
{
    for c in `seq 0 $MAX`; do
	runcmd dd if=/dev/zero of=/tmp/fs.$$.$c bs=1024k count=1 seek=100
	runcmd losetup /dev/loop$c /tmp/fs.$$.$c
	runcmd mkfs -t ext2 -q /dev/loop$c
	runcmd mkdir -p /n/export/b$c
	runcmd mount -t ext2 /dev/loop$c /n/export/b$c
	runcmd exportfs -o $FLAGS localhost:/n/export/b$c
	runcmd mkdir -p /n/lower/b$c
	runcmd mount -t nfs -o nfsvers=3 localhost:/n/export/b$c /n/lower/b$c
    done

    # this mkdir command will eventually cause an ESTALE error on the mnt pt
    for c in `seq 0 $MAX`; do
	runcmd mkdir -p /n/lower/b$c/dir
    done

    # check if "df" prints "stale file handle"
    for i in `seq 1 10` ; do
	sleep 0.1
	echo -n "."
	if test -n "`df 2>&1 | grep -i stale`" ; then
	    df
	    exit 123
	fi
    done
    echo

    for c in `seq 0 $MAX`; do
	runcmd umount /n/lower/b$c
	runcmd exportfs -u localhost:/n/export/b$c
	runcmd umount /n/export/b$c
	runcmd losetup -d /dev/loop$c
	runcmd rm -f /tmp/fs.$$.$c
    done
}

count=$COUNT
while test $count -gt 0 ; do
    echo "------------------------------------------------------------------"
    echo "COUNT $count"
    doit
    let count=count-1
done
##############################################################################
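
Because runcmd exits on the first failure, an aborted run leaves the nfs
mounts, the exports, the ext2 mounts and the loop devices in place.  What
follows is only a sketch of a teardown helper, not part of the script above:
the names cleanup_step and cleanup_all and the RUN switch are made up for
illustration, and using "umount -f" on the stale nfs mounts is an assumption
about what will get them unmounted.  By default it only prints the commands;
the /tmp/fs.* image files are left for manual removal, since the PID in their
names is not known after the abort.

```shell
#!/bin/sh
# Hypothetical cleanup helper (not part of the original reproducer).
# Prints the teardown commands for mounts b0..b$MAX; set RUN=1 to execute them.
MAX=${MAX:-6}
RUN=${RUN:-0}

cleanup_step()
{
    echo "CMD: $*"
    # Run the command only when asked to, and ignore its exit status so
    # that one stuck mount does not stop the remaining steps.
    test "$RUN" = 1 && "$@"
    return 0
}

cleanup_all()
{
    c=0
    while test "$c" -le "$MAX"; do
        # Same order as the teardown loop in the script above.
        cleanup_step umount -f "/n/lower/b$c"
        cleanup_step exportfs -u "localhost:/n/export/b$c"
        cleanup_step umount "/n/export/b$c"
        cleanup_step losetup -d "/dev/loop$c"
        c=$((c + 1))
    done
}

cleanup_all
```

Run it once as is to review the printed commands, then again as root with
RUN=1 in the environment to actually tear things down.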

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
_______________________________________________
Please note that nfs@lists.sourceforge.net is being discontinued.
Please subscribe to linux-nfs@vger.kernel.org instead.
    http://vger.kernel.org/vger-lists.html#linux-nfs


* [NFS] nfs2/3 ESTALE bug on mount point (v2.6.24-rc8)
@ 2008-01-21  3:27 Erez Zadok
  0 siblings, 0 replies; 11+ messages in thread
From: Erez Zadok @ 2008-01-21  3:27 UTC
  To: nfs


end of thread, other threads:[~2008-01-29  3:48 UTC | newest]

Thread overview: 11+ messages
2008-01-21 18:19 [NFS] nfs2/3 ESTALE bug on mount point (v2.6.24-rc8) Erez Zadok
     [not found] ` <200801211819.m0LIJU6Y017173-zop+azHP2WsZjdeEBZXbMidm6ipF23ct@public.gmane.org>
2008-01-21 19:31   ` J. Bruce Fields
2008-01-21 20:28     ` [NFS] " Erez Zadok
     [not found]       ` <200801212028.m0LKSpwA002924-zop+azHP2WsZjdeEBZXbMidm6ipF23ct@public.gmane.org>
2008-01-21 22:08         ` J. Bruce Fields
2008-01-22 16:41           ` J. Bruce Fields
2008-01-28  4:37             ` [NFS] " Erez Zadok
     [not found]               ` <200801280437.m0S4bxcE001453-zop+azHP2WsZjdeEBZXbMidm6ipF23ct@public.gmane.org>
2008-01-28 15:35                 ` Kevin Coffman
2008-01-29  1:08                 ` J. Bruce Fields
2008-01-29  3:03                   ` [NFS] " Erez Zadok
     [not found]                     ` <200801290303.m0T33miE028199-zop+azHP2WsZjdeEBZXbMidm6ipF23ct@public.gmane.org>
2008-01-29  3:48                       ` J. Bruce Fields
  -- strict thread matches above, loose matches on Subject: below --
2008-01-21  3:27 Erez Zadok
