From: Olaf Kirch <okir@suse.de>
To: nfs@lists.sourceforge.net
Subject: [PATCH] Fix nfsd rewrite performance
Date: Mon, 1 Aug 2005 13:39:54 +0200
Message-ID: <20050801113954.GA8698@suse.de>

Since no-one commented on the previous versions of the patch, I'm
submitting it almost unchanged now. This patch is relative to 2.6.12-rc3,
and has undergone some iozone testing, which showed write and rewrite
performance coming out almost the same.

------------------------------------------------------------------
Subject: NFS: Fix rewrite performance

This patch fixes an nfsd performance issue with rewrites. Most of the time,
the iovecs passed to nfsd_vfs_write are not page-aligned. As the default
writev implementation just calls write() on each chunk in the iovec, a
rewrite dirties partial blocks, triggering a read-modify-write cycle for
each of them.
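
To illustrate the effect, here is a small user-space sketch (not part of
the patch) that counts how many per-segment writes would begin or end in
the middle of a page. SKETCH_PAGE_SIZE and sketch_count_partial_writes are
made-up names, and 4096 is only an assumed page size.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Assumption: stand-in for the kernel's PAGE_CACHE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Count the segments that start or end in the middle of a page when each
 * segment is written by its own write() call, beginning at file offset
 * `offset`. Only iov_len is looked at; iov_base is never dereferenced.
 */
static int sketch_count_partial_writes(const struct iovec *vec, int vlen,
				       size_t offset)
{
	int partial = 0;

	for (int i = 0; i < vlen; ++i) {
		size_t start = offset;
		size_t end = offset + vec[i].iov_len;

		if (start % SKETCH_PAGE_SIZE || end % SKETCH_PAGE_SIZE)
			partial++;
		offset = end;
	}
	return partial;
}
```

For a two-page rewrite at offset 0 that arrives as 100 + 4096 + 3996 bytes,
this reports 3 partial writes; for the same data as 4096 + 4096 it reports 0.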

The short-term fix is to make sure nfsd aligns the data properly.
The long-term fix would be to make the VFS smarter about writev requests.
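
The alignment idea can be tried outside the kernel with the rough sketch
below. It is not the patch code: it uses struct iovec instead of struct
kvec, SKETCH_PAGE_SIZE stands in for PAGE_CACHE_SIZE, and sketch_page_align
and sketch_demo are made-up names. It assumes vec[0] holds its chunk0 bytes
starting at iov_base, and that every later buffer has room for a full page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Assumption: stand-in for the kernel's PAGE_CACHE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Expected layout:
 *   vec[0]         chunk0 bytes (fragment following the RPC header)
 *   vec[1..vlen-2] full pages
 *   vec[vlen-1]    chunk1 = SKETCH_PAGE_SIZE - chunk0 bytes
 * Returns 1 if the payload now sits in vec[1..vlen-1] as full pages,
 * 0 if the layout did not match (nothing is modified in that case).
 */
static int sketch_page_align(struct iovec *vec, int vlen)
{
	size_t chunk0, chunk1;
	int i;

	if (vlen < 2)
		return 0;
	if (vec[0].iov_len + vec[vlen - 1].iov_len != SKETCH_PAGE_SIZE)
		return 0;
	for (i = 1; i < vlen - 1; ++i)
		if (vec[i].iov_len != SKETCH_PAGE_SIZE)
			return 0;

	chunk0 = vec[0].iov_len;
	chunk1 = SKETCH_PAGE_SIZE - chunk0;

	/* Work backwards so no byte is overwritten before it is copied. */
	for (i = vlen - 1; i > 0; --i) {
		unsigned char *cur = vec[i].iov_base;
		const unsigned char *prev = vec[i - 1].iov_base;
		/* A full page keeps its last chunk0 bytes at offset chunk1;
		 * vec[0] is assumed to keep them at offset 0. */
		const unsigned char *src = (i == 1) ? prev : prev + chunk1;

		memmove(cur + chunk0, cur, chunk1);	/* push data to page end */
		memcpy(cur, src, chunk0);		/* pull up missing chunk */
		vec[i].iov_len = SKETCH_PAGE_SIZE;
	}
	return 1;
}

/* Self-check: 100-byte head fragment, one full page, 3996-byte tail. */
static int sketch_demo(void)
{
	static unsigned char payload[2 * SKETCH_PAGE_SIZE];
	static unsigned char head[100];
	static unsigned char p1[SKETCH_PAGE_SIZE], p2[SKETCH_PAGE_SIZE];
	struct iovec vec[3];
	size_t k;
	int ok;

	for (k = 0; k < sizeof(payload); ++k)
		payload[k] = (unsigned char)(k % 251);
	memcpy(head, payload, sizeof(head));
	memcpy(p1, payload + sizeof(head), SKETCH_PAGE_SIZE);
	memcpy(p2, payload + sizeof(head) + SKETCH_PAGE_SIZE,
	       SKETCH_PAGE_SIZE - sizeof(head));

	vec[0].iov_base = head; vec[0].iov_len = sizeof(head);
	vec[1].iov_base = p1;   vec[1].iov_len = SKETCH_PAGE_SIZE;
	vec[2].iov_base = p2;   vec[2].iov_len = SKETCH_PAGE_SIZE - sizeof(head);

	ok = sketch_page_align(vec, 3);
	assert(ok == 1);

	/* After alignment, p1 and p2 should hold payload pages 0 and 1. */
	return ok && memcmp(p1, payload, SKETCH_PAGE_SIZE) == 0
		  && memcmp(p2, payload + SKETCH_PAGE_SIZE,
			    SKETCH_PAGE_SIZE) == 0;
}
```

A caller would then skip the first element, as the patch does with
vec++, vlen--, and pass only full pages to the write path.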

Signed-off-by: Olaf Kirch <okir@suse.de>

Index: linux-2.6.12.new/fs/nfsd/vfs.c
===================================================================
--- linux-2.6.12.new.orig/fs/nfsd/vfs.c
+++ linux-2.6.12.new/fs/nfsd/vfs.c
@@ -874,6 +874,46 @@ out:
 	return err;
 }
 
+/*
+ * Helper function to page-align the write payload.
+ */
+static inline int
+nfsd_page_align_payload(struct kvec *vec, int vlen)
+{
+	unsigned char *this_page, *prev_page;
+	int i, chunk0, chunk1;
+
+	/* The following checks are just paranoia */
+	if (vlen < 2)
+		return 0;
+
+	if (vec[0].iov_len + vec[vlen-1].iov_len != PAGE_CACHE_SIZE)
+		return 0;
+	for (i = 1; i < vlen - 1; ++i) {
+		if (vec[i].iov_len != PAGE_CACHE_SIZE)
+			return 0;
+	}
+
+	chunk0 = vec[0].iov_len;
+	chunk1 = PAGE_CACHE_SIZE - chunk0;
+
+	this_page = (unsigned char *) vec[vlen-1].iov_base;
+	for (i = vlen-1; i; --i) {
+		prev_page = (unsigned char *) vec[i-1].iov_base;
+
+		/* Push the first chunk1 bytes to the end of this
+		 * page, then pull up the missing chunk0 bytes from
+		 * the previous buffer. vec[0] holds only chunk0
+		 * bytes, starting at its iov_base. */
+		memmove(this_page + chunk0, this_page, chunk1);
+		memcpy(this_page, prev_page + (i > 1 ? chunk1 : 0), chunk0);
+		vec[i].iov_len = PAGE_CACHE_SIZE;
+		this_page = prev_page;
+	}
+
+	return 1;
+}
+
 static inline int
 nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
 				loff_t offset, struct kvec *vec, int vlen,
@@ -917,6 +957,17 @@ nfsd_vfs_write(struct svc_rqst *rqstp, s
 	if (stable && !EX_WGATHER(exp))
 		file->f_flags |= O_SYNC;
 
+	/* Hack: if we're rewriting the file, make sure
+	 * we align the iovec properly to avoid costly
+	 * read-modify-write operations on the block devices.
+	 * This hack can go away once we have generic_file_writev.
+	 */
+	if ((offset < inode->i_size)
+	 && (cnt % PAGE_CACHE_SIZE) == 0
+	 && vec->iov_len != PAGE_CACHE_SIZE
+	 && nfsd_page_align_payload(vec, vlen))
+		vec++, vlen--;
+
 	/* Write the data. */
 	oldfs = get_fs(); set_fs(KERNEL_DS);
 	err = vfs_writev(file, (struct iovec __user *)vec, vlen, &offset);
-- 
Olaf Kirch   |  --- o --- Nous sommes du soleil we love when we play
okir@suse.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax


