From: Chuck Lever <chuck.lever@oracle.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>,
Chris Perl <cperl@janestreet.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
Chris Perl <chris.perl@gmail.com>
Subject: Re: File Read Returns Non-existent Null Bytes
Date: Fri, 27 Feb 2015 18:33:24 -0500
Message-ID: <0836BFA3-1F2D-4CD0-AB25-3DBA916D941C@oracle.com>
In-Reply-To: <20150227224029.GA8750@fieldses.org>
On Feb 27, 2015, at 5:40 PM, bfields@fieldses.org wrote:
> On Thu, Feb 26, 2015 at 08:29:51AM -0500, Trond Myklebust wrote:
>> On Thu, Feb 26, 2015 at 7:41 AM, Chris Perl <cperl@janestreet.com> wrote:
>>>>> Ok, thanks for helping me understand this a little more clearly. For
>>>>> my own edification, is there somewhere I can find the details where
>>>>> these things are spelled out (or is it just somewhere in rfc1813 that
>>>>> I haven't seen)?
>>>>
>>>> There is a short description here:
>>>> http://nfs.sourceforge.net/#faq_a8
>>>
>>> Yes, thanks. I had come across that when trying to do a little
>>> research after your initial reply.
>>>
>>> However, I was hoping for something with a little more detail. For
>>> example, you said earlier that "the close-to-open cache consistency
>>> model is clear ...", which implied to me that there was a more formal
>>> description somewhere outlining the semantics and constraints. Or is
>>> it just more of an implementation detail?
>>>
>>> Also, reading that FAQ entry seems to reinforce my original notion
>>> that a client reading a file that is being updated might get stale
>>> data returned from its cache, but shouldn't get corrupt data returned
>>> from its cache. Perhaps the FAQ entry should be updated to explicitly
>>> note that corrupt data can be returned?
>>>
>>> FWIW, I realize that the use case I've given as a reproducer isn't
>>> supported and isn't supposed to work. I accept that and that is fine.
>>> However, we do run into this problem in "everyday types of file
>>> sharing" (to quote that FAQ). Sometimes (not very often, but enough
>>> that it's annoying), someone will cat a file that is already in their
>>> client's page cache, and it happens to be at just the wrong time,
>>> resulting in corrupt data being read.
>>
>> If you are saying that we're not making it clear enough that "you
>> ignore these rules at your peril" then, fair enough, I'm sure Chuck
>> would be able to add a line to the faq stating just that.
>
> Yeah, I don't think that FAQ answer is clear. It talks a little about
> how close-to-open is implemented but doesn't really state clearly what
> applications can assume. The first paragraph comes close, but it's
> really just a motivating example.
>
> A rough attempt, but it feels a little overboard while still
> incomplete:
I’m in favor of staying more hand-wavy. Otherwise you will end up
making promises you don’t intend to keep ;-)
Something like:
> Because NFS is not a cluster or “single system image” filesystem,
> applications must provide proper serialization of reads and writes
> among multiple clients to ensure correct application behavior and
> prevent corruption of file data. The close-to-open mechanism is not
> adequate in the presence of concurrent opens for write when multiple
> clients are involved.
Plus or minus some word-smithing.
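As an illustration of what "proper serialization" could look like from
the application side, here is a minimal Python sketch that wraps every
read and write in a whole-file POSIX advisory lock. The helper names
(locked, write_record, read_all) are hypothetical and not taken from
any NFS tooling. It assumes the mount supports NFS locking (not
mounted with nolock or a local_lock setting) and that the client
revalidates cached data when a lock is acquired and flushes dirty data
before it is released; check nfs(5) for your kernel before relying on
either assumption.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(path, exclusive):
    """Open `path` and hold a whole-file advisory lock inside the block."""
    # An exclusive fcntl-style lock needs a descriptor open for writing,
    # a shared lock needs one open for reading.
    flags = (os.O_RDWR | os.O_CREAT) if exclusive else os.O_RDONLY
    fd = os.open(path, flags, 0o644)
    try:
        # Blocks until the lock is granted.
        fcntl.lockf(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield fd
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)


def write_record(path, data):
    """Append `data` while holding the exclusive lock."""
    with locked(path, exclusive=True) as fd:
        os.lseek(fd, 0, os.SEEK_END)
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Push the data out before the lock is dropped.
        os.fsync(fd)


def read_all(path):
    """Read the whole file while holding the shared lock."""
    with locked(path, exclusive=False) as fd:
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
```

Every reader and writer on every client has to go through the same
locking convention for this to help; advisory locks do nothing against
a plain cat of the file.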
And, let’s consider updating the DATA AND METADATA COHERENCY section
of nfs(5), which contains a similar discussion of close-to-open
cache consistency.
> - access from multiple processes on the same client provides the
> same guarantees as on local filesystems.
>
> - access from multiple clients will provide the same guarantees
> as long as no client's open for write overlaps any other open
> from another client.
>
> - if a client does open a file for read while another holds it
> open for write, results of that client's reads are undefined.
>
> - More specifically, a read of a byte at offset X could return a
> byte that has been written to that offset during a concurrent
> write open, or the value stored there at offset X at the start
> of that write open, or 0 if X happened to be past the end of
> file at any point during the concurrent write open. The
> reader may not assume any relationships among values at
> different offsets or the file size, updates to any of which
> may be seen in any order.
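To make the proposed rule for a single offset concrete, here is a
small Python sketch that enumerates the byte values a read at offset X
might return under that wording. It is a toy model of the rule as
written above, not of what any NFS client actually does, and the
function and parameter names are made up for illustration.

```python
def possible_read_values(offset, before, written, sizes_seen):
    """Candidate byte values for a read at `offset` under the proposed rule.

    before     -- file contents (bytes) at the start of the write open
    written    -- dict mapping offset -> iterable of byte values written
                  there during the concurrent write open
    sizes_seen -- file sizes the file had at any point during that open
    """
    values = set()
    # The value stored at the offset when the write open began.
    if offset < len(before):
        values.add(before[offset])
    # Any byte written to that offset during the write open.
    values |= set(written.get(offset, ()))
    # Zero, if the offset was ever past end of file during the open.
    if any(offset >= size for size in sizes_seen):
        values.add(0)
    return values


# A two-byte file "AB" that a writer extends to four bytes.
before = b"AB"
written = {1: [ord("Z")], 3: [ord("Q")]}
sizes_seen = [2, 4]
candidates = {x: possible_read_values(x, before, written, sizes_seen)
              for x in range(4)}
```

The model treats each offset independently, which matches the last
sentence of the rule: nothing can be assumed about how the values at
different offsets, or the file size, relate to one another.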
>
> ?
>
> --b.
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com