public inbox for linux-nfs@vger.kernel.org
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Peter Staubach <staubach@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>,
	NFS list <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v2] flow control for WRITE requests
Date: Tue, 24 Mar 2009 17:19:17 -0400	[thread overview]
Message-ID: <20090324211917.GJ19389@fieldses.org> (raw)
In-Reply-To: <49C93526.70303@redhat.com>

On Tue, Mar 24, 2009 at 03:31:50PM -0400, Peter Staubach wrote:
> Hi.
> 
> Attached is a patch which implements some flow control for the
> NFS client to control dirty pages.  The flow control is
> implemented on a per-file basis and causes dirty pages to be
> written out when the client can detect that the application is
> writing in a serial fashion and has dirtied enough pages to
> fill a complete over-the-wire transfer.
> 
> This work was precipitated by working on a situation where a
> server at a customer site was not able to adequately handle
> the behavior of the Linux NFS client.  This particular server
> required that all data written to the file be written in a
> strictly serial fashion.  It also had problems
> handling the Linux NFS client semantic of caching a large
> amount of data and then sending out that data all at once.
> 
> The sequential ordering problem was resolved by a previous
> patch which was submitted to the linux-nfs list.  This patch
> addresses the capacity problem.
> 
> The problem is resolved by sending WRITE requests much
> earlier in the process of the application writing to the file.
> The client keeps track of the number of dirty pages associated
> with the file and also the last offset of the data being
> written.  When the client detects that a full over-the-wire
> transfer could be constructed and that the application is
> writing sequentially, it generates an UNSTABLE write to the
> server for the currently dirty data.
> 
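The trigger described above can be modeled roughly as follows. This is a userspace sketch, not code from the patch; the struct, field, and function names are all invented here, and the page-based bookkeeping is a simplification of what the client would actually track:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the per-file state the patch tracks. */
struct file_write_state {
	unsigned long ndirty;      /* dirty pages not yet sent */
	unsigned long next_offset; /* page index one past the last write */
	bool sequential;           /* writes have arrived strictly in order */
};

/* Record a write of one page at page index 'offset'. */
static void note_write(struct file_write_state *st, unsigned long offset)
{
	if (st->ndirty && offset != st->next_offset)
		st->sequential = false;
	st->next_offset = offset + 1;
	st->ndirty++;
}

/*
 * Flush (issue an UNSTABLE WRITE) once a full over-the-wire
 * transfer's worth of sequential dirty pages has accumulated,
 * where wsize_pages = wsize / PAGE_SIZE.
 */
static bool should_flush(const struct file_write_state *st,
			 unsigned long wsize_pages)
{
	return st->sequential && st->ndirty >= wsize_pages;
}
```

The point of the sequentiality check is that a randomly-writing application gains nothing from early flushes, so the early-write behavior only kicks in for the serial pattern the problem server cared about.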
> The client also keeps track of the number of these WRITE
> requests which have been generated.  It applies flow control
> based on a configurable maximum.  This keeps the client from
> completely overwhelming the server.
> 
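The throttling itself amounts to a bounded counter of in-flight WRITE RPCs. In the kernel the writer would sleep when the limit is reached; this userspace sketch (all names invented) just reports whether another WRITE may be issued:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-file counter of WRITE RPCs on the wire. */
struct write_throttle {
	unsigned int outstanding; /* WRITEs issued but not yet completed */
	unsigned int max;         /* 0 means no flow control (the default) */
};

/*
 * Returns true if another WRITE may be issued now, and accounts
 * for it.  The real client would block the writing task here
 * instead of returning false.
 */
static bool write_may_issue(struct write_throttle *t)
{
	if (t->max == 0 || t->outstanding < t->max) {
		t->outstanding++;
		return true;
	}
	return false;
}

/* Called from the WRITE completion path. */
static void write_done(struct write_throttle *t)
{
	if (t->outstanding)
		t->outstanding--;
}
```

With max == 0 the counter still ticks but never gates anything, matching the "defaults to 0, meaning no flow control" behavior described below.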
> A nice side effect of the framework is that the issue of
> stat()'ing a file being written can be handled much more
> quickly than before.  The amount of data that must be
> transmitted to the server to satisfy the "latest mtime"
> requirement is limited.  Also, the application writing to
> the file is blocked until the over-the-wire GETATTR is
> completed.  This allows the GETATTR to be sent and the
> response received without competing with the data being
> written.
> 
> No performance regressions were seen during informal
> performance testing.
> 
> As a side note -- the more natural model of flow control
> would seem to be at the client/server level instead of
> the per-file level.  However, that level was too coarse
> with the particular server that was required to be used
> because its requirements were at the per-file level.

I don't understand what you mean by "its requirements were at the
per-file level".

> The new functionality in this patch is controlled via the
> use of the sysctl, nfs_max_outstanding_writes.  It defaults
> to 0, meaning no flow control and the current behaviors.
> Setting it to any non-zero value enables the functionality.
> The value of 16 seems to be a good number and aligns with
> other NFS and RPC tunables.
> 
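For reference, enabling this would be a one-line sysctl change. The key name comes from the patch description, but where it lands in the sysctl tree is an assumption here (fs.nfs is where other NFS client tunables live):

```shell
# nfs_max_outstanding_writes is added by the patch; the fs.nfs prefix
# is a guess.  0, the default, disables the flow control entirely.
sysctl -w fs.nfs.nfs_max_outstanding_writes=16
```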
> Lastly, the functionality of starting WRITE requests sooner
> to smooth out the I/O pattern should probably be done by the
> VM subsystem.  I am looking into this, but in the meantime
> and to solve the immediate problem, this support is proposed.

It seems unfortunate if we add a sysctl to work around a problem that
ends up being fixed some other way a version or two later.

Would be great to have some progress on these problems, though....

--b.


Thread overview: 15+ messages
2009-03-24 19:31 [PATCH v2] flow control for WRITE requests Peter Staubach
2009-03-24 21:19 ` J. Bruce Fields [this message]
2009-03-25 13:15   ` Peter Staubach
2009-05-27 19:18   ` Peter Staubach
2009-05-27 20:45     ` Trond Myklebust
     [not found]       ` <1243457149.8522.68.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-28 15:41         ` Peter Staubach
2009-05-28 15:48           ` Chuck Lever
2009-06-01 21:48           ` Trond Myklebust
     [not found]             ` <1243892886.4868.74.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-06-02 18:37               ` Peter Staubach
2009-06-02 22:12                 ` Trond Myklebust
     [not found]                   ` <1243980736.4868.314.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-06-03 14:17                     ` Peter Staubach
2009-06-09 22:32                       ` Peter Staubach
2009-06-09 23:05                         ` Trond Myklebust
     [not found]                           ` <1244588719.24750.20.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-06-10 19:43                             ` Peter Staubach
2009-07-06  0:48                               ` Neil Brown
