From: Pascal Dameme <pascal.dameme@evidian.com>
To: linux-kernel@vger.kernel.org
Subject: [nfsd-bug ?] ESTALE returned after a while when using nfsdfs (2.6.10 kernel)
Date: Mon, 24 Jan 2005 14:50:30 +0100
Message-ID: <41F4FD26.2050307@evidian.com>

Hello,

When a locally exported directory is mounted over itself using NFS v3,
the NFS server starts returning ESTALE after a few minutes for files
that were previously perfectly accessible.

This behavior has been observed on Red Hat Fedora Core 2, SuSE SLES 9
*and* 2.6.10 (from kernel.org) kernels.
The less activity there is, the sooner the problem appears.

For some reason, it manifests *only if the nfsdfs filesystem is mounted*.
In "legacy" mode, where the filesystem is not mounted, the system behaves
normally for at least a week, whereas with the filesystem mounted, ESTALE
is returned after at most 30 minutes.

Below you will find a test scenario to reproduce the problem,
as well as all the information I have dug up so far.

Does anyone have an idea?

Best regards,
-- 
Pascal Dameme.


----------------------------------------------------------------------------------------------------------------
The test scenario to reproduce the problem is as follows (the test machine
is a SuSE distribution running a freshly compiled 2.6.10 kernel):

# start nfsd
/etc/rc.d/nfsserver start
# export the test directory to localhost
exportfs -o rw,insecure,no_root_squash,no_subtree_check 127.0.0.1:/test/dir
# mount the export over its own directory
mount -o hard,nolock,vers=3,proto=udp 127.0.0.1:/test/dir /test/dir

The following is a trace of what happens:

atchoum:~ # while true; do date;ls -ld /test/dir; sleep 60;done
Thu Jan  6 14:33:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:34:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:35:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:36:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:37:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:38:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:39:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:40:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:41:41 CET 2005
drwxr-xr-x  8 root root 360 Dec  7 13:46 /test/dir
Thu Jan  6 14:42:41 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:43:41 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:44:41 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:45:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:46:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:47:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:48:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:49:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
Thu Jan  6 14:50:42 CET 2005
/bin/ls: /test/dir: Stale NFS file handle
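The polling loop above can also be scripted so that it stops at the first
stale handle and reports how many polls it took. This is only a sketch:
wait_for_estale and its parameters are hypothetical names, and the stat and
sleep arguments exist solely so the loop can be exercised without a real
NFS mount.

```python
import errno
import os
import time


def wait_for_estale(path, interval=60.0, max_polls=None,
                    stat=os.stat, sleep=time.sleep):
    """Poll `path` until stat() fails with ESTALE.

    Returns the 1-based number of the poll that saw ESTALE, or None if
    max_polls polls completed without one. Any other OSError is re-raised.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        try:
            stat(path)
        except OSError as exc:
            if exc.errno == errno.ESTALE:
                return polls
            raise
        sleep(interval)
    return None


# Hypothetical invocation on the test machine (blocks until ESTALE shows up):
#   wait_for_estale("/test/dir", interval=60.0)
```

With a 60 second interval, the returned poll count gives roughly the number
of minutes the mount stayed healthy.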

I enabled the NFS debug messages; this is what appears in the syslog file
around the time the problem starts:

Jan  6 14:40:41 atchoum kernel: NFS: revalidating (0:f/113840)
Jan  6 14:40:41 atchoum kernel: NFS call  getattr
Jan  6 14:40:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 14:40:41 atchoum kernel: nfsd: GETATTR(3)  12: 00000001 02000800 0001bcb0 00000000 00000000 00000000
Jan  6 14:40:41 atchoum kernel: nfsd: fh_verify(12: 00000001 02000800 0001bcb0 00000000 00000000 00000000)
Jan  6 14:40:41 atchoum kernel: NFS reply getattr
Jan  6 14:40:41 atchoum kernel: NFS: nfs_update_inode(0:f/113840 ct=1 info=0x6)
Jan  6 14:40:41 atchoum kernel: NFS: (0:f/113840) revalidation complete
Jan  6 14:41:41 atchoum kernel: NFS: revalidating (0:f/113840)
Jan  6 14:41:41 atchoum kernel: NFS call  getattr
Jan  6 14:41:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 14:41:41 atchoum kernel: nfsd: GETATTR(3)  12: 00000001 02000800 0001bcb0 00000000 00000000 00000000
Jan  6 14:41:41 atchoum kernel: nfsd: fh_verify(12: 00000001 02000800 0001bcb0 00000000 00000000 00000000)
Jan  6 14:41:41 atchoum kernel: NFS reply getattr
Jan  6 14:41:41 atchoum kernel: NFS: nfs_update_inode(0:f/113840 ct=1 info=0x6)
Jan  6 14:41:41 atchoum kernel: NFS: (0:f/113840) revalidation complete
Jan  6 14:41:42 atchoum kernel: exp_export: export of non-dev fs without fsidfound domain localhost
Jan  6 14:41:42 atchoum kernel: found fsidtype 0
Jan  6 14:41:42 atchoum kernel: found fsid length 8
Jan  6 14:41:42 atchoum kernel: Path seems to be <>
Jan  6 14:42:41 atchoum kernel: NFS: revalidating (0:f/113840)
Jan  6 14:42:41 atchoum kernel: NFS call  getattr
Jan  6 14:42:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 14:42:41 atchoum kernel: nfsd: GETATTR(3)  12: 00000001 02000800 0001bcb0 00000000 00000000 00000000
Jan  6 14:42:41 atchoum kernel: nfsd: fh_verify(12: 00000001 02000800 0001bcb0 00000000 00000000 00000000)
Jan  6 14:42:41 atchoum kernel: NFS reply getattr
Jan  6 14:42:41 atchoum kernel: nfs_revalidate_inode: (0:f/113840) getattr failed, error=-116
Jan  6 14:43:41 atchoum kernel: NFS: revalidating (0:f/113840)
Jan  6 14:43:41 atchoum kernel: NFS call  getattr
Jan  6 14:43:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 14:43:41 atchoum kernel: nfsd: GETATTR(3)  12: 00000001 02000800 0001bcb0 00000000 00000000 00000000
Jan  6 14:43:41 atchoum kernel: nfsd: fh_verify(12: 00000001 02000800 0001bcb0 00000000 00000000 00000000)
Jan  6 14:43:41 atchoum kernel: NFS reply getattr
Jan  6 14:43:41 atchoum kernel: nfs_revalidate_inode: (0:f/113840) getattr failed, error=-116
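
The error=-116 value in the nfs_revalidate_inode lines is a negated errno.
On Linux, errno 116 is ESTALE, which matches the "Stale NFS file handle"
message printed by ls. A minimal sketch for decoding such values follows;
describe_kernel_error is a hypothetical helper name, and errno numbering is
platform-specific, so the 116 mapping is only claimed for Linux.

```python
import errno
import os


def describe_kernel_error(code):
    """Map a kernel-style negative error code to (symbolic name, message)."""
    num = abs(code)
    # errno.errorcode maps numeric values to names such as "ESTALE".
    name = errno.errorcode.get(num, "UNKNOWN")
    return name, os.strerror(num)


# The value logged by nfs_revalidate_inode in the trace.
name, text = describe_kernel_error(-116)
print(name, "-", text)
```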

Somehow, it seems that check_export gets confused ...

I tried exporting the directory with the fsid= option; this seems to help
a little, but after some time, the following messages appear in the syslog:

Jan  6 18:35:41 atchoum kernel: NFS: nfs_update_inode(0:f/113904 ct=1 info=0x6)
Jan  6 18:35:41 atchoum kernel: NFS: (0:f/113904) revalidation complete
Jan  6 18:36:41 atchoum kernel: NFS: revalidating (0:f/113904)
Jan  6 18:36:41 atchoum kernel: NFS call  getattr
Jan  6 18:36:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:36:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:36:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:36:41 atchoum kernel: NFS reply getattr
Jan  6 18:36:41 atchoum kernel: NFS: nfs_update_inode(0:f/113904 ct=1 info=0x6)
Jan  6 18:36:41 atchoum kernel: NFS: (0:f/113904) revalidation complete
Jan  6 18:37:41 atchoum kernel: NFS: revalidating (0:f/113904)
Jan  6 18:37:41 atchoum kernel: NFS call  getattr
Jan  6 18:37:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:37:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:37:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:37:41 atchoum kernel: nfsd: Dropping request due to malloc failure!
Jan  6 18:37:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:37:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:37:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:37:41 atchoum kernel: nfsd: Dropping request due to malloc failure!
Jan  6 18:37:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:37:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:37:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:37:41 atchoum kernel: nfsd: Dropping request due to malloc failure!
Jan  6 18:37:41 atchoum kernel: nfsd_dispatch: vers 3 proc 1
Jan  6 18:37:41 atchoum kernel: nfsd: GETATTR(3)  8: 00010001 00000309 00000000 00000000 00000000 00000000
Jan  6 18:37:41 atchoum kernel: nfsd: fh_verify(8: 00010001 00000309 00000000 00000000 00000000 00000000)
Jan  6 18:37:41 atchoum kernel: nfsd: Dropping request due to malloc failure!

Looks like a memory leak ...
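
The hex words in the fh_verify lines of both traces can be unpacked by
hand. The sketch below rests on two assumptions that the logs do not
state: a little-endian machine, and the 2.6-era knfsd file handle layout
(one byte each for version, auth type, fsid type and fileid type, followed
by the fsid). decode_fh is a hypothetical helper, not a kernel or nfs-utils
function. Under those assumptions the third word of the first handle,
0x0001bcb0, is 113840, the same inode number the client prints as
0:f/113840, while the handle logged after exporting with fsid= carries
fsid type 1 and the value 0x309.

```python
import struct


def decode_fh(size, words):
    """Decode the leading 32-bit words of an nfsd debug file handle.

    Assumes the words were printed with %08x from host-order values on a
    little-endian machine, using the layout described in the lead-in.
    """
    # Recover the four header bytes in memory order from the first word.
    version, auth_type, fsid_type, fileid_type = struct.pack("<I", words[0])
    fields = {
        "size": size,
        "version": version,
        "auth_type": auth_type,
        "fsid_type": fsid_type,
        "fileid_type": fileid_type,
    }
    if fsid_type == 0:
        # Assumed: device number stored big-endian as (major << 16) | minor,
        # followed by the inode number of the export root in host order.
        dev = struct.unpack(">I", struct.pack("<I", words[1]))[0]
        fields["major"] = dev >> 16
        fields["minor"] = dev & 0xFFFF
        fields["inode"] = words[2]
    elif fsid_type == 1:
        # Assumed: a single 32-bit fsid taken from the export options.
        fields["fsid"] = words[1]
    return fields


# Handles copied from the two traces (trailing zero words omitted).
print(decode_fh(12, [0x00000001, 0x02000800, 0x0001BCB0]))
print(decode_fh(8, [0x00010001, 0x00000309]))
```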

