linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Rob Ross <rross@mcs.anl.gov>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Christoph Hellwig <hch@infradead.org>,
	Andreas Dilger <adilger@clusterfs.com>,
	Sage Weil <sage@newdream.net>, Brad Boyer <flar@allandria.com>,
	Anton Altaparmakov <aia21@cam.ac.uk>,
	Gary Grider <ggrider@lanl.gov>,
	linux-fsdevel@vger.kernel.org
Subject: Re: NFSv4/pNFS possible POSIX I/O API standards
Date: Tue, 05 Dec 2006 16:11:16 -0600	[thread overview]
Message-ID: <4575EE84.6070604@mcs.anl.gov> (raw)
In-Reply-To: <1165330516.5742.24.camel@lade.trondhjem.org>

Trond Myklebust wrote:
> On Tue, 2006-12-05 at 10:07 +0000, Christoph Hellwig wrote:
>>> ...and we have pointed out how nicely this ignores the realities of
>>> current caching models. There is no need for a  readdirplus() system
>>> call. There may be a need for a caching barrier, but AFAICS that is all.
>> I think Andreas mentioned that it is useful for clustered filesystems
>> that can avoid additional roundtrips this way.  That alone might now
>> be enough reason for API additions, though.  The again statlite and
>> readdirplus really are the most sane bits of these proposals as they
>> fit nicely into the existing set of APIs.  The filehandle idiocy on
>> the other hand is way of into crackpipe land.
> 
> They provide no benefits whatsoever for the two most commonly used
> networked filesystems NFS and CIFS. As far as they are concerned, the
> only new thing added by readdirplus() is the caching barrier semantics.
> I don't see why you would want to add that into a generic syscall like
> readdir() though: it is
>         
>         a) networked filesystem specific. The mask stuff etc adds no
>         value whatsoever to actual "posix" filesystems. In fact it is
>         telling the kernel that it can violate posix semantics.

It isn't violating POSIX semantics if we get the calls passed as an 
extension to POSIX :).

>         b) quite unnatural to impose caching semantics on all the
>         directory _entries_ using a syscall that refers to the directory
>         itself (see the explanations by both myself and Peter Staubach
>         of the synchronisation difficulties). Consider in particular
>         that it is quite possible for directory contents to change in
>         between readdirplus calls.

I want to make sure that I understand this correctly. NFS semantics 
dictate that if someone stat()s a file that all changes from that client 
need to be propagated to the server? And this call complicates that 
semantic because now there's an operation on a different object (the 
directory) that would cause this flush on the files?

Of course directory contents can change in between readdirplus() calls, 
just as they can between readdir() calls. That's expected, and we do not 
attempt to create consistency between calls.

>         i.e. the "strict posix caching model' is pretty much impossible
>         to implement on something like NFS or CIFS using these
>         semantics. Why then even bother to have "masks" to tell you when
>         it is OK to violate said strict model.

We're trying to obtain improved performance for distributed file systems 
with stronger consistency guarantees than these two.

>         c) Says nothing about what should happen to non-stat() metadata
>         such as ACL information and other extended attributes (for
>         example future selinux context info). You would think that the
>         'ls -l' application would care about this.

Honestly, we hadn't thought about other non-stat() metadata because we 
didn't think it was part of the use case, and we were trying to stay 
close to the flavor of POSIX. If you have ideas here, we'd like to hear 
them.

Thanks for the comments,

Rob

  reply	other threads:[~2006-12-05 22:11 UTC|newest]

Thread overview: 124+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-11-28  4:34 NFSv4/pNFS possible POSIX I/O API standards Gary Grider
2006-11-28  5:54 ` Christoph Hellwig
2006-11-28 10:54   ` Andreas Dilger
2006-11-28 11:28     ` Anton Altaparmakov
2006-11-28 20:17     ` Russell Cattelan
2006-11-28 23:28     ` Wendy Cheng
2006-11-29  9:12       ` Christoph Hellwig
2006-11-29  9:04   ` Christoph Hellwig
2006-11-29  9:14     ` Christoph Hellwig
2006-11-29  9:48     ` Andreas Dilger
2006-11-29 10:18       ` Anton Altaparmakov
2006-11-29  8:26         ` Brad Boyer
2006-11-30  9:25           ` Christoph Hellwig
2006-11-30 17:49             ` Sage Weil
2006-12-01  5:26               ` Trond Myklebust
2006-12-01  7:08                 ` Sage Weil
2006-12-01 14:41                   ` Trond Myklebust
2006-12-01 16:47                     ` Sage Weil
2006-12-01 18:07                       ` Trond Myklebust
2006-12-01 18:42                         ` Sage Weil
2006-12-01 19:13                           ` Trond Myklebust
2006-12-01 20:32                             ` Sage Weil
2006-12-04 18:02                           ` Peter Staubach
2006-12-05 23:20                             ` readdirplus() as possible POSIX I/O API Sage Weil
2006-12-06 15:48                               ` Peter Staubach
2006-12-03  1:57                         ` NFSv4/pNFS possible POSIX I/O API standards Andreas Dilger
2006-12-03  7:34                           ` Kari Hurtta
2006-12-03  1:52                     ` Andreas Dilger
2006-12-03 16:10                       ` Sage Weil
2006-12-04  7:32                         ` Andreas Dilger
2006-12-04 15:15                           ` Trond Myklebust
2006-12-05  0:59                             ` Rob Ross
2006-12-05  4:44                               ` Gary Grider
2006-12-05 10:05                                 ` Christoph Hellwig
2006-12-05  5:56                               ` Trond Myklebust
2006-12-05 10:07                                 ` Christoph Hellwig
2006-12-05 14:20                                   ` Matthew Wilcox
2006-12-06 15:04                                     ` Rob Ross
2006-12-06 15:44                                       ` Matthew Wilcox
2006-12-06 16:15                                         ` Rob Ross
2006-12-05 14:55                                   ` Trond Myklebust
2006-12-05 22:11                                     ` Rob Ross [this message]
2006-12-05 23:24                                       ` Trond Myklebust
2006-12-06 16:42                                         ` Rob Ross
2006-12-06 12:22                                     ` Ragnar Kjørstad
2006-12-06 15:14                                       ` Trond Myklebust
2006-12-05 16:55                                   ` Latchesar Ionkov
2006-12-05 22:12                                     ` Christoph Hellwig
2006-12-06 23:12                                       ` Latchesar Ionkov
2006-12-06 23:33                                         ` Trond Myklebust
2006-12-05 21:50                                   ` Rob Ross
2006-12-05 22:05                                     ` Christoph Hellwig
2006-12-05 23:18                                       ` Sage Weil
2006-12-05 23:55                                       ` Ulrich Drepper
2006-12-06 10:06                                         ` Andreas Dilger
2006-12-06 17:19                                           ` Ulrich Drepper
2006-12-06 17:27                                             ` Rob Ross
2006-12-06 17:42                                               ` Ulrich Drepper
2006-12-06 18:01                                                 ` Ragnar Kjørstad
2006-12-06 18:13                                                   ` Ulrich Drepper
2006-12-17 14:41                                                     ` Ragnar Kjørstad
2006-12-17 19:07                                                       ` Ulrich Drepper
2006-12-17 19:38                                                         ` Matthew Wilcox
2006-12-17 21:51                                                           ` Ulrich Drepper
2006-12-18  2:57                                                             ` Ragnar Kjørstad
2006-12-18  3:54                                                               ` Gary Grider
2006-12-07  5:57                                                 ` Andreas Dilger
2006-12-15 22:37                                                   ` Ulrich Drepper
2006-12-16 18:13                                                     ` Andreas Dilger
2006-12-16 19:08                                                       ` Ulrich Drepper
2006-12-14 23:58                                         ` statlite() Rob Ross
2006-12-07 23:39                                       ` NFSv4/pNFS possible POSIX I/O API standards Nikita Danilov
2006-12-05 14:37                               ` Peter Staubach
2006-12-05 10:26                             ` readdirplus() as possible POSIX I/O API Andreas Dilger
2006-12-05 15:23                               ` Trond Myklebust
2006-12-06 10:28                                 ` Andreas Dilger
2006-12-06 15:10                                   ` Trond Myklebust
2006-12-05 17:06                               ` Latchesar Ionkov
2006-12-05 22:48                                 ` Rob Ross
2006-11-29 10:25       ` NFSv4/pNFS possible POSIX I/O API standards Steven Whitehouse
2006-11-30 12:29         ` Christoph Hellwig
2006-12-01 15:52       ` Ric Wheeler
2006-11-29 12:23     ` Matthew Wilcox
2006-11-29 12:35       ` Matthew Wilcox
2006-11-29 16:26         ` Gary Grider
2006-11-29 17:18           ` Christoph Hellwig
2006-11-29 12:39       ` Christoph Hellwig
2006-12-01 22:29         ` Rob Ross
2006-12-02  2:35           ` Latchesar Ionkov
2006-12-05  0:37             ` Rob Ross
2006-12-05 10:02               ` Christoph Hellwig
2006-12-05 16:47               ` Latchesar Ionkov
2006-12-05 17:01                 ` Matthew Wilcox
     [not found]                   ` <f158dc670612050909m366594c5ubaa87d9a9ecc8c2a@mail.gmail.com>
2006-12-05 17:10                     ` Latchesar Ionkov
2006-12-05 17:39                     ` Matthew Wilcox
2006-12-05 21:55                       ` Rob Ross
2006-12-05 21:50                   ` Peter Staubach
2006-12-05 21:44                 ` Rob Ross
2006-12-06 11:01                   ` openg Christoph Hellwig
2006-12-06 15:41                     ` openg Trond Myklebust
2006-12-06 15:42                     ` openg Rob Ross
2006-12-06 23:32                       ` openg Christoph Hellwig
2006-12-14 23:36                         ` openg Rob Ross
2006-12-06 23:25                   ` Re: NFSv4/pNFS possible POSIX I/O API standards Latchesar Ionkov
2006-12-06  9:48                 ` David Chinner
2006-12-06 15:53                   ` openg and path_to_handle Rob Ross
2006-12-06 16:04                     ` Matthew Wilcox
2006-12-06 16:20                       ` Rob Ross
2006-12-06 20:57                         ` David Chinner
2006-12-06 20:40                     ` David Chinner
2006-12-06 20:50                       ` Matthew Wilcox
2006-12-06 21:09                         ` David Chinner
2006-12-06 22:09                         ` Andreas Dilger
2006-12-06 22:17                           ` Matthew Wilcox
2006-12-06 22:41                             ` Andreas Dilger
2006-12-06 23:39                           ` Christoph Hellwig
2006-12-14 22:52                             ` Rob Ross
2006-12-06 20:50                       ` Rob Ross
2006-12-06 21:01                         ` David Chinner
2006-12-06 23:19                     ` Latchesar Ionkov
2006-12-14 21:00                       ` Rob Ross
2006-12-14 21:20                         ` Matthew Wilcox
2006-12-14 23:02                           ` Rob Ross
2006-11-28 15:08 ` NFSv4/pNFS possible POSIX I/O API standards Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4575EE84.6070604@mcs.anl.gov \
    --to=rross@mcs.anl.gov \
    --cc=adilger@clusterfs.com \
    --cc=aia21@cam.ac.uk \
    --cc=flar@allandria.com \
    --cc=ggrider@lanl.gov \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=sage@newdream.net \
    --cc=trond.myklebust@fys.uio.no \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).