From: Pascal Dameme <pascal.dameme@evidian.com>
To: nfs@lists.sourceforge.net
Subject: NFSD bug when using nfsdfs (2.6.10 kernel, with reiserfs or ext3 backing filesystem) ?
Date: Wed, 26 Jan 2005 10:19:01 +0100
Message-ID: <41F76085.7040303@evidian.com>

Hello,

When a locally exported directory is mounted over itself using NFSv3,
then after a few minutes the NFS server starts returning ESTALE for files
that were previously perfectly accessible. There is no other activity
except for the test script, which runs "ls" in a loop.

This behavior has been observed on Red Hat Fedora Core 2, SUSE SLES 9,
*and* plain 2.6.10 (from kernel.org) kernels.
The less activity there is, the faster the problem appears.

For some reason, it manifests *only if the nfsdfs filesystem is mounted*.
In "legacy" mode, where the filesystem is not mounted, the system behaves
normally for at least a week, whereas with the filesystem mounted, ESTALE
is returned after at most 30 minutes.
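
Since the behavior depends on whether the nfsd filesystem is mounted, it
helps to record that state alongside each test run. A minimal sketch,
assuming the filesystem appears with type "nfsd" in /proc/mounts (it is
usually mounted on /proc/fs/nfsd); the function name and its optional file
argument are illustrative only:

```shell
#!/bin/sh
# Succeed (exit 0) if a filesystem of type "nfsd" is listed in the mounts table.
# Assumption: the type is the third field of each /proc/mounts line.
# The optional argument (hypothetical, for testing) names an alternative file.
nfsd_fs_mounted() {
    mounts_file="${1:-/proc/mounts}"
    awk '$3 == "nfsd" { found = 1 } END { exit !found }' "$mounts_file"
}
```

For example, "if nfsd_fs_mounted; then echo mounted; else echo legacy; fi"
prints which of the two modes a given run was made in.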

Below you will find a test scenario to reproduce the problem,
as well as all the information I have dug up so far. I searched the
archives but did not find anything related.

Any ideas, anyone?

Best regards,
-- 
Pascal Dameme.


---------------------------------------------------------------------------------------------------------------- 

The test scenario to reproduce the problem is as follows (the test machine
is a SuSE distribution running a freshly compiled 2.6.10 kernel):

# start nfsd
/etc/rc.d/nfsserver start
# export the test directory to the loopback address
exportfs -o rw,insecure,no_root_squash,no_subtree_check 127.0.0.1:/test/dir
# mount the export over itself (NFSv3 over UDP)
mount -o hard,nolock,vers=3,proto=udp 127.0.0.1:/test/dir /test/dir

The following is a trace of what happens:

atchoum:~ # while true; do date;ls -ld /test/dir; sleep 60;done
Thu Jan  6 14:33:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:34:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:35:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:36:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:37:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:38:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:39:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:40:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:41:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:42:41 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:43:41 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:44:41 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:45:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:46:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:47:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:48:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:49:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:50:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
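
To compare runs, the time of the first failure can be pulled out of a
captured trace automatically. A minimal sketch, assuming the loop output
was saved with stderr merged into stdout (the "ls" error goes to stderr)
and that each "date" line directly precedes its "ls" line; first_stale is
a made-up helper name:

```shell
#!/bin/sh
# Read the loop output on stdin and print the timestamp line that precedes
# the first "Stale NFS file handle" error. Exit status is 1 if none is found.
first_stale() {
    awk '/Stale NFS file handle/ { print prev; found = 1; exit }
         { prev = $0 }
         END { exit !found }'
}
```

For the trace above, "first_stale < trace.log" (with trace.log being a
hypothetical capture file) would print "Thu Jan  6 14:42:41 CET 2005",
nine minutes after the loop started.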

I enabled the NFS debug messages; this is what appears in the syslog file
around the time of the failure:

Jan  6 14:40:41 atchoum kernel: NFS: revalidating (0:f/113840)
Jan  6 14:40:41 atchoum kernel: NFS call  getattr
Jan  6 14:40:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 14:40:41 atchoum kernel: nfsd: GETATTR(3)  12: 00000001 02000800 0001bcb0 00000000 00000000 00000000
Jan  6 14:40:41 atchoum kernel: nfsd: fh_verify(12: 00000001 02000800 0001bcb0 00000000 00000000 00000000)
Jan  6 14:40:41 atchoum kernel: NFS reply getattr
Jan  6 14:40:41 atchoum kernel: NFS: nfs_update_inode(0:f/113840 ct=1 info=0x6)
Jan  6 14:40:41 atchoum kernel: NFS: (0:f/113840) revalidation complete
Jan  6 14:41:41 atchoum kernel: NFS: revalidating (0:f/113840)
Jan  6 14:41:41 atchoum kernel: NFS call  getattr
Jan  6 14:41:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 14:41:41 atchoum kernel: nfsd: GETATTR(3)  12: 00000001 02000800 0001bcb0 00000000 00000000 00000000
Jan  6 14:41:41 atchoum kernel: nfsd: fh_verify(12: 00000001 02000800 0001bcb0 00000000 00000000 00000000)
Jan  6 14:41:41 atchoum kernel: NFS reply getattr
Jan  6 14:41:41 atchoum kernel: NFS: nfs_update_inode(0:f/113840 ct=1 info=0x6)
Jan  6 14:41:41 atchoum kernel: NFS: (0:f/113840) revalidation complete
Jan  6 14:41:42 atchoum kernel: exp_export: export of non-dev fs without fsidfound domain localhost
Jan  6 14:41:42 atchoum kernel: found fsidtype 0
Jan  6 14:41:42 atchoum kernel: found fsid length 8
Jan  6 14:41:42 atchoum kernel: Path seems to be <>
Jan  6 14:42:41 atchoum kernel: NFS: revalidating (0:f/113840)
Jan  6 14:42:41 atchoum kernel: NFS call  getattr
Jan  6 14:42:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 14:42:41 atchoum kernel: nfsd: GETATTR(3)  12: 00000001 02000800 0001bcb0 00000000 00000000 00000000
Jan  6 14:42:41 atchoum kernel: nfsd: fh_verify(12: 00000001 02000800 0001bcb0 00000000 00000000 00000000)
Jan  6 14:42:41 atchoum kernel: NFS reply getattr
Jan  6 14:42:41 atchoum kernel: nfs_revalidate_inode: (0:f/113840) getattr failed, error=-116
Jan  6 14:43:41 atchoum kernel: NFS: revalidating (0:f/113840)
Jan  6 14:43:41 atchoum kernel: NFS call  getattr
Jan  6 14:43:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 14:43:41 atchoum kernel: nfsd: GETATTR(3)  12: 00000001 02000800 0001bcb0 00000000 00000000 00000000
Jan  6 14:43:41 atchoum kernel: nfsd: fh_verify(12: 00000001 02000800 0001bcb0 00000000 00000000 00000000)
Jan  6 14:43:41 atchoum kernel: NFS reply getattr
Jan  6 14:43:41 atchoum kernel: nfs_revalidate_inode: (0:f/113840) getattr failed, error=-116

Somehow, it seems that check_export gets confused ...

I tried exporting the directory with the fsid= option. This seems to help
a little, but after some time the following messages appear in the syslog:

Jan  6 18:35:41 atchoum kernel: NFS: nfs_update_inode(0:f/113904 ct=1 info=0x6)
Jan  6 18:35:41 atchoum kernel: NFS: (0:f/113904) revalidation complete
Jan  6 18:36:41 atchoum kernel: NFS: revalidating (0:f/113904)
Jan  6 18:36:41 atchoum kernel: NFS call  getattr
Jan  6 18:36:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:36:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:36:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:36:41 atchoum kernel: NFS reply getattr
Jan  6 18:36:41 atchoum kernel: NFS: nfs_update_inode(0:f/113904 ct=1 info=0x6)
Jan  6 18:36:41 atchoum kernel: NFS: (0:f/113904) revalidation complete
Jan  6 18:37:41 atchoum kernel: NFS: revalidating (0:f/113904)
Jan  6 18:37:41 atchoum kernel: NFS call  getattr
Jan  6 18:37:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:37:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:37:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:37:41 atchoum kernel: nfsd: Dropping request due to malloc failure!
Jan  6 18:37:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:37:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:37:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:37:41 atchoum kernel: nfsd: Dropping request due to malloc failure!
Jan  6 18:37:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:37:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:37:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:37:41 atchoum kernel: nfsd: Dropping request due to malloc failure!
Jan  6 18:37:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:37:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:37:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:37:41 atchoum kernel: nfsd: Dropping request due to malloc failure!

Looks like a memory leak ...
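
As a cross-check on the two logs above, the hex words that nfsd prints for
each file handle can be decoded by hand. The sketch below assumes a
little-endian host and the following handle layout, neither of which is
stated in the logs themselves: the fsid type sits in bits 16..23 of the
first word; for fsid type 0 the third word is the inode number; for fsid
type 1 the second word is the value given to fsid=. decode_fh is a made-up
helper name.

```shell
#!/bin/sh
# Decode the first three hex words of a file handle as printed by nfsd.
# Assumptions (see the text above): little-endian host; fsid type in
# bits 16..23 of word 0; type 0 carries the inode in word 2; type 1
# carries the fsid= value in word 1. Other types are not decoded.
decode_fh() {
    hdr=$1 w1=$2 w2=$3
    fsid_type=$(( (0x$hdr >> 16) & 0xff ))
    case $fsid_type in
        0) echo "fsid_type=0 inode=$((0x$w2))" ;;
        1) echo "fsid_type=1 fsid=$((0x$w1))" ;;
        *) echo "fsid_type=$fsid_type (not decoded)" ;;
    esac
}
```

Under those assumptions, "decode_fh 00000001 02000800 0001bcb0" gives
fsid_type=0 and inode=113840, the same number the client prints in
"NFS: revalidating (0:f/113840)", and "decode_fh 00010001 00000309 00000000"
gives fsid_type=1 and fsid=777. The handle bytes are identical before and
after the first ESTALE, so whatever changes is on the server side and not
in the handle the client sends.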



