From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Greg Banks <gnb-xTcybq6BZ68@public.gmane.org>,
Brian R Cowan <brcowan@us.ibm.com>,
linux-nfs@vger.kernel.org, linux-nfs-owner@vger.kernel.org,
Peter Staubach <staubach@redhat.com>
Subject: Re: Read/Write NFS I/O performance degraded by FLUSH_STABLE page flushing
Date: Tue, 02 Jun 2009 13:27:11 -0400
Message-ID: <1243963631.4868.124.camel@heimdal.trondhjem.org>
In-Reply-To: <BA67D2A4-1752-4789-ADB9-D1B3C6D197F6@oracle.com>

On Tue, 2009-06-02 at 11:00 -0400, Chuck Lever wrote:
> On May 30, 2009, at 9:02 AM, Greg Banks wrote:
> > On Sat, May 30, 2009 at 10:26 PM, Trond Myklebust
> > <trond.myklebust@fys.uio.no> wrote:
> >> On Sat, 2009-05-30 at 10:22 +1000, Greg Banks wrote:
> >>> On Sat, May 30, 2009 at 3:35 AM, Trond Myklebust
> >>> <trond.myklebust@fys.uio.no> wrote:
> >>>> On Fri, 2009-05-29 at 13:25 -0400, Brian R Cowan wrote:
> >>>>>
> >>>
> >>
> >> Firstly, the server only uses O_SYNC if you turn off write gathering
> >> (a.k.a. the 'wdelay' option). The default behaviour for the Linux nfs
> >> server is to always try write gathering and hence no O_SYNC.
> >
> > Well, write gathering is a total crock that AFAICS only helps
> > single-file writes on NFSv2. For today's workloads all it does is
> > provide a hotspot on the two global variables that track writes in an
> > attempt to gather them. Back when I worked on a server product,
> > no_wdelay was one of the standard options for new exports.
>
> Really? Even for NFSv3/4 FILE_SYNC? I can understand that it
> wouldn't have any real effect on UNSTABLE.

The question is why would a sensible client ever want to send more than
1 NFSv3 write with FILE_SYNC? If you need to send multiple writes in
parallel to the same file, then it makes much more sense to use
UNSTABLE.

Write gathering relies on waiting an arbitrary length of time in order
to see if someone is going to send another write. The protocol offers no
guidance as to how long that wait should be, and so (at least on the
Linux server) we've coded in a hard wait of 10ms if and only if we see
that something else has the file open for writing.

One problem with the Linux implementation is that the "something else"
could be another nfs server thread that happens to be in nfsd_write();
however, it could also be another open NFSv4 stateid, or an NLM lock, or
a local process that has the file open for writing.

Another problem is that the nfs server keeps a record of the last file
that was accessed, and also waits if it sees you are writing again to
that same file. Of course it has no idea if this is truly a parallel
write, or if it just happens that you are writing again to the same file
using O_SYNC...
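
The two triggers described above can be modelled roughly as follows. This
is only a userspace sketch of the decision, not the nfsd code: the struct
file_id type, the gather_delay_ms() function and the GATHER_DELAY_MS
constant are hypothetical names made up for illustration, and the writer
count stands in for whatever the server actually checks to see that
"something else" has the file open for writing.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-file state the heuristic consults. */
struct file_id {
	unsigned long dev;	/* device number */
	unsigned long ino;	/* inode number */
	int writers;		/* openers with write access, including this request */
};

#define GATHER_DELAY_MS 10	/* the hard-coded wait mentioned above */

/* Identity of the file targeted by the previous stable write (assumed global). */
static struct file_id last_written;
static bool have_last;

/*
 * Return how long (in ms) a stable write to *f would wait before syncing.
 * The wait is taken if anything else holds the file open for writing, or
 * if the previous write went to the same file. Neither test can tell a
 * genuinely parallel write from an unrelated opener or a lone O_SYNC writer.
 */
int gather_delay_ms(const struct file_id *f)
{
	bool other_writer = f->writers > 1;
	bool same_as_last = have_last &&
			    last_written.dev == f->dev &&
			    last_written.ino == f->ino;
	int delay = (other_writer || same_as_last) ? GATHER_DELAY_MS : 0;

	/* Remember this file for the next call. */
	last_written = *f;
	have_last = true;
	return delay;
}
```

In this model, two back-to-back calls for the same file with a single
writer return 0 and then 10: the second write is delayed even though
nothing is being sent in parallel.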

Trond