From: Peter Staubach <staubach@redhat.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: uketinen@us.ibm.com, nfs@lists.sourceforge.net
Subject: Re: NFS dentry caching mechanism
Date: Fri, 27 Jan 2006 08:38:08 -0500
Message-ID: <43DA2240.2090900@redhat.com>
In-Reply-To: <1138317247.8770.39.camel@lade.trondhjem.org>

Trond Myklebust wrote:

>On Thu, 2006-01-26 at 14:40 -0800, Usha Ketineni wrote:
>
>>We are investigating an issue with the NFS client code in 2.4.21
>>kernel:
>>
>>To reproduce the issue: 
>>
>>(using machine A and machine B, and a file system mounted off an NFS
>>server called /home)
>>
>>1) On Machine A: 
>>
>>ls /home/source 
>>ls: /home/source: No such file or directory 
>>
>>2) On machine B: 
>>touch /home/source 
>>
>>3) back on machine A: 
>>rm /home/source 
>>rm: cannot lstat `source': No such file or directory 
>>
>>But source *does* exist. 
>
>Why on earth is 'rm' trying to lstat the file? That is both racy and
>unnecessary.
>
>>This shows the problem.
>>
>>===
>>
>>There are workarounds:
>>
>>1) Mount the file system with acdirmin=0 and acdirmax=0. But this then
>>affects all system calls, not just unlink(). And it hurts NFS performance.
>>
>>2) Mount the file system with the noac option, but the same negative
>>effect as in #1 applies.
>>
>>What happens is this: 
>>
>>0) Let F be a filename on the NFS file system. Initially this file
>>does not exist.
>>
>>1) The application on the machine A does a stat() on F. The NFS
>>client in the kernel sends a LOOKUP request to the NFS server, which
>>obviously returns failure. The stat() fails with ENOENT. OK so far.
>>
>>2) Immediately afterwards (a few seconds max), the application on
>>machine B creates the file F. No problems so far.
>>
>>3) When B is done with F, a few seconds later the application on
>>machine A does an unlink() on F. Because of the negative dentry
>>caching in the Linux kernel, it doesn't even bother to send an NFS
>>REMOVE request to the NFS server, as (it thinks) it knows for sure the
>>file doesn't exist. It lets the unlink() fail with ENOENT. But the
>>file definitely exists. 
>>
>>Is there any other solution for this (including moving to a newer
>>kernel)? 
>
>I suppose one could add a VFS intent for unlink in order to force
>nfs_lookup_revalidate() to drop the negative dentry. We don't do that on
>any existing kernels though (particularly not on 2.4 kernels, as they
>don't support intents).
>
>However I suspect that most non-linux clients will similarly cache
>negative DNLC entries, and be vulnerable to the same problem.
>

For systems which are based on the ONC+ code (i.e., Solaris), on writable
file systems, the negative cache entries are _always_ validated using a
forced over-the-wire GETATTR operation.  Read-only file systems are
treated slightly differently: they use the normal attribute cache mechanism
to do the validation.  This keeps the client from falling into this trap.
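
To make the policy above concrete, here is a toy model in Python.  It is
only a sketch and not the ONC+ code: ToyServer, ToyClient and all of their
methods are names made up for illustration.  It also assumes that the
GETATTR is issued against the parent directory and that a negative entry
is dropped when the directory's modification time has changed; that detail
is my assumption, not something stated above.

```python
import time


class ToyServer:
    """Hypothetical stand-in for the NFS server: one directory with an mtime."""

    def __init__(self):
        self.names = set()
        self.mtime = 0            # bumped on every directory change
        self.getattr_calls = 0    # counts over-the-wire GETATTRs

    def create(self, name):
        self.names.add(name)
        self.mtime += 1

    def getattr_dir(self):
        self.getattr_calls += 1
        return self.mtime

    def lookup(self, name):
        return name in self.names


class ToyClient:
    """Hypothetical client holding a negative name cache for that directory."""

    def __init__(self, server, read_only=False, attr_timeout=3.0,
                 clock=time.monotonic):
        self.server = server
        self.read_only = read_only
        self.attr_timeout = attr_timeout
        self.clock = clock
        self.negative = {}        # name -> directory mtime when the miss was cached
        self.cached_mtime = None  # attribute cache for the directory
        self.cached_at = None

    def _dir_mtime(self, force):
        now = self.clock()
        fresh = (self.cached_at is not None
                 and now - self.cached_at < self.attr_timeout)
        if force or not fresh:
            # Go over the wire and refresh the attribute cache.
            self.cached_mtime = self.server.getattr_dir()
            self.cached_at = now
        return self.cached_mtime

    def exists(self, name):
        if name in self.negative:
            # Writable mount: always force a GETATTR.
            # Read-only mount: trust the attribute cache until it times out.
            mtime = self._dir_mtime(force=not self.read_only)
            if mtime == self.negative[name]:
                return False      # directory unchanged, negative entry still valid
            del self.negative[name]
        found = self.server.lookup(name)
        if not found:
            self.negative[name] = self._dir_mtime(force=True)
        return found


if __name__ == "__main__":
    server = ToyServer()
    machine_a = ToyClient(server)           # writable mount
    print(machine_a.exists("source"))       # miss is cached
    server.create("source")                 # "machine B" creates the file
    print(machine_a.exists("source"))       # forced GETATTR exposes the change
```

With read_only=True the same sequence keeps answering "no such file" until
the attribute cache times out, which is the slightly different treatment
mentioned above.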

It is okay for the client to think that a file exists when it may not, because
the client can detect the difference.  It is not okay for a client to decide
that a file does not exist without a strong validation mechanism, because
there is no way for the application to determine otherwise.
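
The asymmetry can be shown with another small Python sketch.  Again this
is illustration only: Server, unlink() and the cache dictionary are
hypothetical names, and the toy server only ever answers success or ENOENT.

```python
import errno


class Server:
    """Hypothetical server that counts the REMOVE requests it receives."""

    def __init__(self, names=()):
        self.names = set(names)
        self.requests = 0

    def remove(self, name):
        self.requests += 1
        if name not in self.names:
            return errno.ENOENT
        self.names.discard(name)
        return 0


def unlink(server, cache, name):
    """cache maps name -> True (believed present) or False (believed absent)."""
    if cache.get(name) is False:
        # Negative entry trusted without validation: the server is never asked.
        return errno.ENOENT
    status = server.remove(name)
    # In this toy model the name is gone after either reply (0 or ENOENT).
    cache[name] = False
    return status


if __name__ == "__main__":
    # Stale positive entry: the server corrects the client.
    gone = Server()
    print(unlink(gone, {"source": True}, "source") == errno.ENOENT, gone.requests)

    # Stale negative entry: wrong answer, and the file is still there.
    present = Server({"source"})
    print(unlink(present, {"source": False}, "source") == errno.ENOENT,
          present.requests, "source" in present.names)
```

In the first case the ENOENT comes from the server and is correct.  In the
second case the ENOENT is produced locally, no request is sent, and the
file remains on the server.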

       ps


