From: Olaf Kirch <okir@suse.de>
To: nfs@lists.sourceforge.net
Subject: [PATCH] Fix nfsd rewrite performance
Date: Mon, 1 Aug 2005 13:39:54 +0200
Message-ID: <20050801113954.GA8698@suse.de>
Since no-one commented on the previous versions of the patch, I'm
submitting it almost unchanged now. This patch is relative to 2.6.12-rc3,
and has undergone some iozone testing, which showed write and rewrite
performance coming out almost the same.
------------------------------------------------------------------
Subject: NFS: Fix rewrite performance

This patch fixes an nfsd performance issue with rewrite. Most of the time,
the iovecs passed to nfsd_vfs_write are unaligned. Because the default writev
implementation simply calls write() on each chunk in the iovec, the
unaligned chunks dirty partial blocks, triggering a read-modify-write
cycle for each block.

The short-term fix is to make sure nfsd aligns the data properly.
The long-term fix would be to make the VFS smarter about writev requests.

Signed-off-by: Olaf Kirch <okir@suse.de>
Index: linux-2.6.12.new/fs/nfsd/vfs.c
===================================================================
--- linux-2.6.12.new.orig/fs/nfsd/vfs.c
+++ linux-2.6.12.new/fs/nfsd/vfs.c
@@ -874,6 +874,46 @@ out:
 	return err;
 }
 
+/*
+ * Helper function to page-align the write payload.
+ */
+static inline int
+nfsd_page_align_payload(struct kvec *vec, int vlen)
+{
+	unsigned char	*this_page, *prev_page;
+	int		i, chunk0, chunk1;
+
+	/* The following checks are just paranoia */
+	if (vlen < 2)
+		return 0;
+
+	if (vec[0].iov_len + vec[vlen-1].iov_len != PAGE_CACHE_SIZE)
+		return 0;
+	for (i = 1; i < vlen - 1; ++i) {
+		if (vec[i].iov_len != PAGE_CACHE_SIZE)
+			return 0;
+	}
+
+	chunk0 = vec[0].iov_len;
+	chunk1 = PAGE_CACHE_SIZE - chunk0;
+
+	this_page = (unsigned char *) vec[vlen-1].iov_base;
+	for (i = vlen-1; i; --i) {
+		prev_page = (unsigned char *) vec[i-1].iov_base;
+
+		/* Push trailing partial page so it's
+		 * aligned with the end of the page, then
+		 * pull up the missing chunk from the previous
+		 * page */
+		memmove(this_page + chunk0, this_page, chunk1);
+		memcpy(this_page, prev_page + chunk1, chunk0);
+		vec[i].iov_len = PAGE_CACHE_SIZE;
+		this_page = prev_page;
+	}
+
+	return 1;
+}
+
 static inline int
 nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
 				loff_t offset, struct kvec *vec, int vlen,
@@ -917,6 +957,17 @@ nfsd_vfs_write(struct svc_rqst *rqstp, s
 	if (stable && !EX_WGATHER(exp))
 		file->f_flags |= O_SYNC;
 
+	/* Hack: if we're rewriting the file, make sure
+	 * we align the iovec properly to avoid costly
+	 * read-modify-write operations on the block devices.
+	 * This hack can go away once we have generic_file_writev.
+	 */
+	if ((offset < inode->i_size)
+	 && (cnt % PAGE_CACHE_SIZE) == 0
+	 && vec->iov_len != PAGE_CACHE_SIZE
+	 && nfsd_page_align_payload(vec, vlen))
+		vec++, vlen--;
+
 	/* Write the data. */
 	oldfs = get_fs(); set_fs(KERNEL_DS);
 	err = vfs_writev(file, (struct iovec __user *)vec, vlen, &offset);
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
okir@suse.de | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax