From: Peter Staubach <staubach@redhat.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "Lever, Charles" <Charles.Lever@netapp.com>,
	Charles Duffy <ccd@mailcall.com.au>,
	nfs@lists.sourceforge.net
Subject: Re: Data coherency trouble with multiple clients, on 2.6.14-rc5
Date: Wed, 26 Oct 2005 17:22:57 -0400
Message-ID: <435FF3B1.5030200@redhat.com>
In-Reply-To: <1130360742.8859.56.camel@lade.trondhjem.org>

Trond Myklebust wrote:

>On Wed, 26.10.2005 at 15:53 (-0400), Peter Staubach wrote:
>  
>
>>This brings lots of extra guarantees, actually.  Just because the file is
>>open for writing does not mean that there are any dirty pages hanging
>>around waiting to be written.  And, even if there are, they will get
>>flushed when the conflict is detected.  The last one there wins.  This
>>is even the policy when local processes conflict on the same file in the
>>same region.
>>
>>This policy would address the situation that was reported here.
>>
>>This policy will definitely result in _much_ stronger caching semantics
>>than does close-to-open.  These two policies together can usually result
>>in reasonable cache consistency, enough for most applications.  Applications
>>which need stronger cache consistency should use advisory locking in order
>>to synchronize access to the file.
>>    
>>
>
>Sure, but the big issue here is how to actually detect conflicts (and
>avoid excessive false positives).
>
>  
>

I would say that it is better to be safe first and fast second.  A few
unnecessary cache invalidations caused by false positives are better than
missing invalidations which were required.

>NFSv3 does in theory give you the option of detecting conflicts using
>weak cache consistency. In practice, write reordering and the fact that
>most servers violate the requirement given by RFC1813 that pre/post-op
>attributes should be atomic w.r.t. the main operation prevents you from
>closing the hole.
>NFSv2 and NFSv4 don't even have support for WCC, so your detection
>scheme ends up being very dependent on one particular version of NFS.
>
>  
>

Actually NFSv4 does have an attribute that the client can use, doesn't it?
Something like change_attr or some such?

The write reordering issue only exists for multiple concurrent operations
such as WRITE operations.  I will agree that if the wcc_data for WRITE
operations is used, then many false positives will probably occur.  However,
useful and valid cache validations can be done using GETATTR or other
operations such as ACCESS or LOOKUP, even while a file is open for writing.
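
As a rough illustration of the kind of validation I mean, here is a minimal
sketch in Python.  It is not the Linux client's code.  The names Attrs,
CachedFile and revalidate are hypothetical, and it assumes the server hands
back size, mtime and (for NFSv4) a change value on GETATTR, ACCESS or LOOKUP.
It makes no attempt to tell our own writes from someone else's, so it accepts
false positives in exchange for not missing a real change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attrs:
    """Hypothetical subset of the attributes returned by the server."""
    size: int
    mtime_ns: int
    change: Optional[int] = None  # NFSv4-style change value, if the server has one


class CachedFile:
    """Hypothetical per-file client cache state."""

    def __init__(self, attrs: Attrs) -> None:
        self.attrs = attrs
        self.pages_valid = True

    def revalidate(self, fresh: Attrs) -> bool:
        """Compare fresh attributes with the cached ones.

        Returns True if the cached data was invalidated.  Any mismatch
        invalidates: safe first, fast second.
        """
        if self.attrs.change is not None and fresh.change is not None:
            # Prefer the change value when both sides have it.
            changed = fresh.change != self.attrs.change
        else:
            # Otherwise fall back to comparing size and mtime.
            changed = (fresh.size, fresh.mtime_ns) != (
                self.attrs.size,
                self.attrs.mtime_ns,
            )
        self.attrs = fresh
        if changed:
            self.pages_valid = False
        return changed
```

A caller would run revalidate() on the attributes from each such reply and
refetch the data whenever it returns True.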

>Basically, what I'm saying is that as long as we cannot implement the
>above ideal, we should not be issuing promises to application developers
>that they can rely on it. O_DIRECT was specifically developed in order
>to give database implementers a reliable uncached I/O interface, and so
>that is what we should direct them towards.
>The worst thing to do when someone asks IMHO is to reply that "we can
>almost but not quite fix noac".
>

O_DIRECT is pretty much only useful to the database folks, because its
lack of readahead and write-behind kills performance for everyone else.
They can utilize O_DIRECT because they use multiple contexts or AIO to
issue the I/O requests.
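
To make the "multiple contexts" point concrete, here is a small sketch in
Python of an application supplying its own parallelism in place of kernel
readahead.  The function names, the 4096-byte block size and the thread count
are my assumptions for illustration.  The direct flag is Linux-only, and
whether O_DIRECT is accepted at all depends on the filesystem.

```python
import mmap
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # assumed alignment; real code should ask the device or filesystem


def read_block(fd: int, index: int) -> bytes:
    """Read one block at an aligned offset into a page-aligned buffer."""
    # An anonymous mmap is page-aligned, which O_DIRECT reads generally need.
    buf = mmap.mmap(-1, BLOCK)
    try:
        n = os.preadv(fd, [buf], index * BLOCK)
        return buf[:n]
    finally:
        buf.close()


def parallel_read(path: str, nblocks: int, direct: bool = False,
                  workers: int = 4) -> bytes:
    """Issue block reads from several threads and return them in order."""
    flags = os.O_RDONLY
    if direct:
        flags |= os.O_DIRECT  # bypass the page cache; some filesystems reject it
    fd = os.open(path, flags)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Each worker is one "context" with its own request in flight.
            return b"".join(pool.map(lambda i: read_block(fd, i), range(nblocks)))
    finally:
        os.close(fd)
```

An ordinary application doing simple sequential read() calls has none of
this machinery, which is why it depends on readahead in the first place.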

The application developers are already aware of the loose cache consistency
that NFS offers.  This is not a reason to loosen it further though.  We
can and should do the best job that we can.  We have to make some
assumptions about how well NFS servers implement the correct semantics.
If an NFS
server is truly broken, then let's get that NFS server fixed.  Avoiding
useful semantics because some servers in the market may not get them
right seems self-defeating to me and just furthers the myth that NFS is
not useful as a distributed file system.

    Thanx...

       ps

