public inbox for linux-nfs@vger.kernel.org
From: Trond Myklebust <trondmy@hammerspace.com>
To: "davenoveck@gmail.com" <davenoveck@gmail.com>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>,
	"devel@lists.nfs-ganesha.org" <devel@lists.nfs-ganesha.org>,
	"linux-nfs-owner@vger.kernel.org"
	<linux-nfs-owner@vger.kernel.org>,
	"tigran.mkrtchyan@desy.de" <tigran.mkrtchyan@desy.de>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"rmacklem@uoguelph.ca" <rmacklem@uoguelph.ca>
Subject: Re: [nfsv4] file size and getattr
Date: Wed, 27 Feb 2019 18:05:39 +0000
Message-ID: <90c104ee477cdc8e999ddaff4ccab32332d0f71d.camel@hammerspace.com>
In-Reply-To: <CADaq8jfcFCuM5u30bXqTvg+0Hc-V+1VO9aUdSJLA090j48LCvA@mail.gmail.com>

On Wed, 2019-02-27 at 12:52 -0500, David Noveck wrote:
> > However note that the counter argument to what you state above is
> > that _if_ the server requires a layoutcommit before it will
> > acknowledge a file size change, then pNFS is likely to underperform
> > for applications such as databases or VMs where each record is
> > required to be written in stable mode.
> > IOW: If all writes that need to be stable are also required to be
> > acknowledged with a layoutcommit (to the MDS),
> 
> But it is not true that *all* writes that need to be stable are also
> required to be acknowledged with a layoutcommit (to the MDS). Only
> those that potentially change the file size require this.

That's true for POSIX O_DSYNC writes, but it is not true for O_SYNC. In
the latter case, the timestamps are required to be updated
synchronously as well, which implies a layoutcommit.
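
(A minimal sketch of the distinction above, not taken from any client or
server implementation. It assumes a hypothetical server policy in which
the MDS only learns of size and timestamp changes through layoutcommit;
the names SyncMode and needs_sync_layoutcommit are made up for
illustration.)

```python
from enum import Enum


class SyncMode(Enum):
    NONE = "none"      # ordinary buffered write
    DSYNC = "O_DSYNC"  # data plus the metadata needed to read it back
    SYNC = "O_SYNC"    # data plus all metadata, timestamps included


def needs_sync_layoutcommit(mode, offset, length, known_size):
    """Return True if this write must be followed by a layoutcommit
    before it can be reported stable, under the assumed policy."""
    if mode is SyncMode.SYNC:
        # Timestamps must be stable too, so every write goes to the MDS.
        return True
    if mode is SyncMode.DSYNC:
        # Only a write that extends the file changes metadata that a
        # later read depends on (the file size).
        return offset + length > known_size
    # Unstable writes can defer the layoutcommit until fsync or close.
    return False
```

With a preallocated 1 MiB file, an O_DSYNC overwrite of the first 4 KiB
returns False here, while the same write under O_SYNC returns True.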

>  
> > then your ability to
> > scale out your server will be in doubt
> 
> For many applications, particularly databases, it will be easy to
> make sure that the writes that potentially change the file size are
> few and far between.

If the database uses O_DSYNC, yes.

> 
> On Tue, Feb 26, 2019 at 8:12 PM Trond Myklebust
> <trondmy@hammerspace.com> wrote:
> > On Wed, 2019-02-27 at 00:13 +0000, Rick Macklem wrote:
> > > Trond Myklebust wrote:
> > > [stuff snipped]
> > > > Please see the Errata ID 2751 
> > > > http://www.rfc-editor.org/errata/eid2751
> > > 
> > > I'll admit I hadn't seen this errata before. However, it seems to
> > > be specific to the File Layout. For the Flexible File Layout...
> > > 
> > > When I look in RFC-8435, I cannot find anything that states that
> > > a LayoutCommit is only required for case(s) where a Commit to the
> > > Storage Server is required.
> > > Sec. 2.1
> > >    Clearly states that a Commit to the Storage Server is required
> > >    before the client does a LayoutCommit when the write(s) were
> > >    not done FILE_SYNC.
> > >    However, I do not see any indication that the LayoutCommit is
> > >    not to be done for the case where the write(s) are done
> > >    FILE_SYNC.
> > > 
> > > FF_FLAGS_NO_LAYOUTCOMMIT can be used to indicate to a client that
> > > LayoutCommits are not required, but this does not depend on how
> > > the write(s) to the Storage Server were done.
> > > 
> > > The only way a Flexible File layout Metadata server can know what
> > > the current file size is (when a read/write layout is issued to a
> > > client) is to do a Getattr to the Storage Server.
> > > If a client is not required to do a LayoutCommit when the
> > > write(s) to the Storage Server are done FILE_SYNC, then the
> > > Metadata server must do Getattr RPCs to the Storage Server
> > > whenever it needs an up to date file size if a read/write layout
> > > is issued to a client.
> > > 
> > > This can result in a lot of overhead that can be avoided by
> > > requiring the LayoutCommit to be done by a client after writing
> > > to a Storage Server, irrespective of the need for a Commit to the
> > > Storage Server.
> > > As such, I would rather not have this errata applied to RFC-8435.
> > > 
> > 
> > Fair enough. I agree that the errata in question only applies to
> > the pNFS files layout, however you were talking about RFC5661 and
> > whether or not we were interpreting that correctly. Since RFC5661
> > only refers to the behaviour of the pNFS files layout, I assumed
> > that was what you were referring to.
> > 
> > For flexfiles we may have a bug in the layoutcommit case. However
> > note that the counter argument to what you state above is that _if_
> > the server requires a layoutcommit before it will acknowledge a
> > file size change, then pNFS is likely to underperform for
> > applications such as databases or VMs where each record is required
> > to be written in stable mode.
> > IOW: If all writes that need to be stable are also required to be
> > acknowledged with a layoutcommit (to the MDS), then your ability to
> > scale out your server will be in doubt.
> > 
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



