From: Peter Staubach <staubach@redhat.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Brian R Cowan <brcowan@us.ibm.com>, linux-nfs@vger.kernel.org
Subject: [PATCH v2] read-modify-write page updating
Date: Thu, 09 Jul 2009 10:12:24 -0400 [thread overview]
Message-ID: <4A55FAC8.5040309@redhat.com> (raw)
In-Reply-To: <4A428482.60403@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 2869 bytes --]
Hi.
I have a proposal for possibly resolving this issue.
I believe that this situation occurs due to the way that the
Linux NFS client handles writes which modify partial pages.
The Linux NFS client handles partial page modifications by
allocating a page from the page cache, copying the data from
the user level into the page, and then keeping track of the
offset and length of the modified portions of the page. The
page is not marked as up to date because there are portions
of the page which do not contain valid file contents.
When a read call comes in for a portion of the page, the
contents of the page must be read in the from the server.
However, since the page may already contain some modified
data, that modified data must be written to the server
before the file contents can be read back in the from server.
And, since the writing and reading can not be done atomically,
the data must be written and committed to stable storage on
the server for safety purposes. This means either a
FILE_SYNC WRITE or a UNSTABLE WRITE followed by a COMMIT.
This has been discussed at length previously.
This algorithm could be described as modify-write-read. It
is most efficient when the application only updates pages
and does not read them.
My proposed solution is to add a heuristic to decide whether
to do this modify-write-read algorithm or switch to a read-
modify-write algorithm when initially allocating the page
in the write system call path. The heuristic uses the modes
that the file was opened with, the offset in the page to
read from, and the size of the region to read.
If the file was opened for reading in addition to writing
and the page would not be filled completely with data from
the user level, then read in the old contents of the page
and mark it as Uptodate before copying in the new data. If
the page would be completely filled with data from the user
level, then there would be no reason to read in the old
contents because they would just be copied over.
This would optimize for applications which randomly access
and update portions of files. The linkage editor for the
C compiler is an example of such a thing.
I tested the attached patch by using rpmbuild to build the
current Fedora rawhide kernel. The kernel without the
patch generated about 269,500 WRITE requests. The modified
kernel containing the patch generated about 261,000 WRITE
requests. Thus, about 8,500 fewer WRITE requests were
generated. I suspect that many of these additional
WRITE requests were probably FILE_SYNC requests to WRITE
a single page, but I didn't test this theory.
The previous version of this patch caused the NFS client to
generate around 3,000 more READ requests. This version
actually causes the NFS client to generate almost 500 fewer
READ requests.
Thanx...
ps
Signed-off-by: Peter Staubach <staubach@redhat.com>
[-- Attachment #2: read-modify-write.devel.2 --]
[-- Type: application/x-troff-man, Size: 2713 bytes --]
next prev parent reply other threads:[~2009-07-09 14:12 UTC|newest]
Thread overview: 94+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-30 20:12 Read/Write NFS I/O performance degraded by FLUSH_STABLE page flushing Brian R Cowan
2009-04-30 20:25 ` Christoph Hellwig
2009-04-30 20:28 ` Chuck Lever
2009-04-30 20:41 ` Peter Staubach
2009-04-30 21:13 ` Chuck Lever
2009-04-30 21:23 ` Trond Myklebust
2009-05-01 16:39 ` Brian R Cowan
[not found] ` <1241126587.15476.62.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-29 15:55 ` Brian R Cowan
2009-05-29 16:46 ` Trond Myklebust
[not found] ` <1243615595.7155.48.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-29 17:25 ` Brian R Cowan
2009-05-29 17:35 ` Trond Myklebust
[not found] ` <1243618500.7155.56.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-30 0:22 ` Greg Banks
[not found] ` <ac442c870905291722x1ec811b2sda997d464898fcda-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-05-30 7:57 ` Christoph Hellwig
2009-06-01 22:30 ` J. Bruce Fields
2009-06-05 14:54 ` Christoph Hellwig
2009-06-05 16:01 ` J. Bruce Fields
2009-06-05 16:12 ` Trond Myklebust
[not found] ` <1244218328.5410.38.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-06-05 19:54 ` J. Bruce Fields
2009-06-05 21:21 ` Trond Myklebust
2009-05-30 12:26 ` Trond Myklebust
[not found] ` <1243686363.5209.16.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-30 12:43 ` Trond Myklebust
2009-05-30 13:02 ` Greg Banks
[not found] ` <ac442c870905300602v6950ec42y5195d2d6ea7dd4c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-06-01 22:30 ` J. Bruce Fields
2009-06-02 15:00 ` Chuck Lever
2009-06-02 17:27 ` Trond Myklebust
[not found] ` <1243963631.4868.124.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-06-02 18:15 ` Chuck Lever
2009-06-03 16:22 ` Carlos Carvalho
2009-06-03 17:10 ` Trond Myklebust
[not found] ` <OFB53BFCCB.0CEC7A7E-ON852575C <1244138698.5203.59.camel@heimdal.trondhjem.org>
2009-06-03 21:28 ` Dean Hildebrand
2009-06-04 2:16 ` Carlos Carvalho
2009-06-04 17:42 ` Brian R Cowan
2009-06-04 18:04 ` Trond Myklebust
2009-06-04 20:43 ` Link performance over NFS degraded in RHEL5. -- was : " Brian R Cowan
2009-06-04 20:57 ` Trond Myklebust
2009-06-04 21:30 ` Brian R Cowan
2009-06-04 21:48 ` Trond Myklebust
2009-06-04 21:07 ` Peter Staubach
2009-06-04 21:39 ` Brian R Cowan
2009-06-05 11:35 ` Steve Dickson
2009-06-05 12:46 ` Trond Myklebust
2009-06-05 13:03 ` Brian R Cowan
2009-06-05 13:05 ` Tom Talpey
[not found] ` <4A29144A.6030405@gmail.com>
2009-06-05 13:30 ` Steve Dickson
2009-06-05 13:52 ` Trond Myklebust
[not found] ` <1244209956.5410.33.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-06-05 13:57 ` Steve Dickson
[not found] ` <4A29243F.8080008-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
2009-06-05 16:05 ` J. Bruce Fields
2009-06-05 16:35 ` Trond Myklebust
[not found] ` <1244219715.5410.40.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-06-15 23:08 ` J. Bruce Fields
2009-06-16 0:21 ` NeilBrown
[not found] ` <99d4545537613ce76040d3655b78bdb7.squirrel-eq65iwfR9nKIECXXMXunQA@public.gmane.org>
2009-06-16 0:33 ` J. Bruce Fields
2009-06-16 0:50 ` NeilBrown
[not found] ` <02ada87c636e1088e9365a3cbea301e7.squirrel-eq65iwfR9nKIECXXMXunQA@public.gmane.org>
2009-06-16 0:55 ` J. Bruce Fields
2009-06-17 16:54 ` J. Bruce Fields
2009-06-17 16:59 ` [PATCH 1/3] nfsd: track last inode only in use_wgather case J. Bruce Fields
2009-06-17 16:59 ` [PATCH 2/3] nfsd: Pull write-gathering code out of nfsd_vfs_write J. Bruce Fields
2009-06-17 16:59 ` [PATCH 3/3] nfsd: minor nfsd_vfs_write cleanup J. Bruce Fields
2009-06-16 0:32 ` Link performance over NFS degraded in RHEL5. -- was : Read/Write NFS I/O performance degraded by FLUSH_STABLE page flushing Trond Myklebust
[not found] ` <1245112324.7470.7.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-06-16 2:02 ` J. Bruce Fields
[not found] ` <4A291D83.1000508@RedHat.com>
2009-06-05 13:50 ` Tom Talpey
2009-06-05 13:54 ` Trond Myklebust
2009-06-05 13:58 ` Tom Talpey
2009-06-05 13:56 ` Brian R Cowan
2009-06-24 19:54 ` [PATCH] read-modify-write page updating Peter Staubach
2009-06-25 17:13 ` Trond Myklebust
[not found] ` <1245950029.4913.17.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-07-09 13:59 ` Peter Staubach
2009-07-09 14:12 ` Peter Staubach [this message]
2009-07-09 15:39 ` [PATCH v2] " Trond Myklebust
[not found] ` <1247153972.5766.15.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-07-10 15:57 ` Peter Staubach
2009-07-10 17:22 ` J. Bruce Fields
2009-08-04 17:52 ` [PATCH v3] " Peter Staubach
2009-08-05 0:50 ` Trond Myklebust
2009-05-29 17:48 ` Read/Write NFS I/O performance degraded by FLUSH_STABLE page flushing Peter Staubach
2009-05-29 18:21 ` Trond Myklebust
2009-05-29 17:01 ` Chuck Lever
2009-05-29 17:38 ` Brian R Cowan
2009-05-29 17:42 ` Trond Myklebust
[not found] ` <1243618968.7155.60.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-29 17:47 ` Chuck Lever
2009-05-29 18:15 ` Trond Myklebust
2009-05-29 17:51 ` Peter Staubach
2009-05-29 18:25 ` Brian R Cowan
2009-05-29 18:43 ` Trond Myklebust
2009-05-29 17:55 ` Brian R Cowan
2009-05-29 18:07 ` Trond Myklebust
[not found] ` <1243620455.7155.80.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-29 18:18 ` Brian R Cowan
2009-05-29 18:29 ` Trond Myklebust
[not found] ` <1243621769.7155.97.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-29 20:09 ` Brian R Cowan
2009-05-29 20:21 ` Trond Myklebust
[not found] ` <1243628519.7155.150.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-29 21:55 ` Brian R Cowan
2009-05-29 22:03 ` Trond Myklebust
[not found] ` <OFBB9B2C07.CC3D028B-ON852575C5. <1243634634.7155.160.camel@heimdal.trondhjem.org>
[not found] ` <1243634634.7155.160.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-29 22:20 ` Brian R Cowan
2009-05-29 22:36 ` Trond Myklebust
[not found] ` <OF061E0258.9581352B-ON852575C <1243636593.7155.188.camel@heimdal.trondhjem.org>
[not found] ` <1243636593.7155.188.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-05-29 23:02 ` Brian R Cowan
2009-05-29 23:13 ` Trond Myklebust
2009-05-29 17:57 ` Trond Myklebust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A55FAC8.5040309@redhat.com \
--to=staubach@redhat.com \
--cc=brcowan@us.ibm.com \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@fys.uio.no \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox