From: Theodore Tso <tytso@mit.edu>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "Jörn Engel" <joern@lazybastard.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Ulrich Drepper" <drepper@gmail.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Neil Brown" <neilb@suse.de>
Subject: Re: If not readdir() then what?
Date: Mon, 9 Apr 2007 09:19:18 -0400
Message-ID: <20070409131918.GC18580@thunk.org>
In-Reply-To: <1176121897.6210.8.camel@heimdal.trondhjem.org>

On Mon, Apr 09, 2007 at 08:31:37AM -0400, Trond Myklebust wrote:
> On Mon, 2007-04-09 at 13:09 +0200, Jörn Engel wrote:
> > That surely doesn't make life any easier for filesystem developers, I
> > agree.  From that point of view, all telldir cookies should end their
> > life at closedir time.  For "rm -r" it would be sufficient if the nfs
> > client simply didn't seekdir at all.  For "ls -lR", this would return
> > duplicate dentries.
> 
> Please go read the NFS spec. The only thing an NFS client has in order
> to read a directory is a READDIR operation that in essence takes a
> filehandle and a cookie as its arguments. Unless the server is able to
> return the entire rest of the directory in one RPC reply, the client
> needs to send a second READDIR operation with a cookie from the previous
> READDIR operation. The server is expected to return cookies for _each_
> entry in the directory.
> 
> That is a protocol limitation, not a client limitation.

<Groan>

And after quickly checking RFC 3010, I see this limitation hasn't been
lifted in NFSv4.

Speaking of which, right now ext3 doesn't know whether it's talking to
an NFSv2 or an NFSv3/v4 server, so it's always passing a 32-bit cookie.
If NFSv3/v4 could use an explicit interface to request a 64-bit
cookie, instead of just relying on the f_pos field in the struct file,
we could significantly reduce the chance of hash collisions when
reading an ext3 directory.
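
To put rough numbers on that, here is a back-of-the-envelope sketch
using the birthday approximation.  It assumes the cookie is a uniformly
distributed hash that uses every bit of its width, which is an
idealization and not a description of what ext3 actually does; the
function name is made up for illustration.

```python
import math


def collision_probability(n_entries: int, cookie_bits: int) -> float:
    """Approximate chance that at least two of n_entries share a cookie.

    Birthday approximation: P ~= 1 - exp(-n * (n - 1) / (2 * 2**bits)).
    Assumes uniformly distributed cookies (an idealization).
    """
    space = 2.0 ** cookie_bits
    exponent = n_entries * (n_entries - 1) / (2.0 * space)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for bits in (32, 64):
        p = collision_probability(100_000, bits)
        print(f"{bits}-bit cookie, 100000 entries: P(collision) ~= {p:.3g}")
```

Under those assumptions, a directory with 100,000 entries is more
likely than not to contain at least one collision with a 32-bit cookie,
while with a 64-bit cookie the chance is negligible.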

If there are 2 or 3 directory entries that have a hash collision,
would the NFS protocol allow the server to juggle things so that those
2-3 directory entries with the hash collision are sent back in a
single readdir RPC reply?  Is it acceptable/legal to have multiple
entries in the same READDIR reply packet have the same cookie value?
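
To make the question concrete, here is a toy model of what I mean.  It
is only a sketch: the (hash, name) tuples, the readdir() and
list_directory() helpers, and the use of 0 as the "start of directory"
cookie are all invented for illustration, and it says nothing about
whether the protocol actually permits duplicate cookies in a reply.

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (hash used as cookie, name); hypothetical layout


def readdir(entries: List[Entry], cookie: int, max_entries: int) -> List[Entry]:
    """Toy server side: return entries whose hash is greater than cookie.

    At most max_entries are returned, except that a run of entries with
    the same hash is never split across two replies.
    """
    pending = sorted(e for e in entries if e[0] > cookie)
    reply = pending[:max_entries]
    i = len(reply)
    # Keep colliding entries together, even if the reply grows past the limit.
    while reply and i < len(pending) and pending[i][0] == reply[-1][0]:
        reply.append(pending[i])
        i += 1
    return reply


def list_directory(entries: List[Entry], max_entries: int) -> List[str]:
    """Toy client side: repeat readdir(), resuming from the last cookie."""
    names: List[str] = []
    cookie = 0  # assumed start cookie; this sketch requires all hashes > 0
    while True:
        reply = readdir(entries, cookie, max_entries)
        if not reply:
            return names
        names.extend(name for _, name in reply)
        cookie = reply[-1][0]
```

With entries hashing to 10, 20, 20, 20 and 30 and a limit of two
entries per reply, the first reply in this model carries four entries
(the whole run at 20), and the client resumes after cookie 20 without
skipping or repeating anything.  If the run were split instead,
resuming after cookie 20 would silently drop the rest of it.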

						- Ted
