linux-nfs.vger.kernel.org archive mirror
* [PATCH] BZ#694309: nfs: use unstable writes for groups of small DIO writes
@ 2011-04-14 12:43 Jeff Layton
  2011-04-15  4:13 ` Christoph Hellwig
From: Jeff Layton @ 2011-04-14 12:43 UTC (permalink / raw)
  To: Trond.Myklebust; +Cc: linux-nfs, pbadari, chuck.lever

Currently, the client uses FILE_SYNC whenever an O_DIRECT write is no
larger than the wsize. This is a problem, though, if we have a bunch of
small iovecs batched up in a single writev call. The client will iterate
over them and do a separate FILE_SYNC WRITE for each.

Instead, change the code to do unstable writes when we'll need to do
multiple WRITE RPCs in order to satisfy the request. While we're at
it, optimize away the allocation of commit_data when we aren't going
to use it anyway.
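
As a rough illustration of the policy described above, the decision can
be written as a standalone function. This is only a sketch, not the
kernel code: pick_stable_how() and its have_commit_data flag are
hypothetical names, the enum mirrors the NFSv3 stable_how values, and it
assumes the write defaults to unstable unless forced to FILE_SYNC.

```c
#include <stddef.h>

/* Stability levels, mirroring the NFSv3 stable_how values. */
enum stable_how { UNSTABLE = 0, DATA_SYNC = 1, FILE_SYNC = 2 };

/*
 * Hypothetical helper: choose the stability level for a direct write.
 *
 * count            total bytes in the request
 * wsize            maximum payload of one WRITE RPC
 * nr_segs          number of iovec segments in the request
 * have_commit_data nonzero if commit data could be allocated
 */
static enum stable_how
pick_stable_how(size_t count, size_t wsize, unsigned long nr_segs,
		int have_commit_data)
{
	/* More than one WRITE RPC will be needed: batch and COMMIT later. */
	if ((count > wsize || nr_segs > 1) && have_commit_data)
		return UNSTABLE;

	/* Single RPC, or no commit data available: skip the COMMIT. */
	return FILE_SYNC;
}
```

For example, with a hypothetical wsize of 1MB, a writev of 256 4k
segments has count == wsize, so the old test would have picked
FILE_SYNC for every segment, while nr_segs > 1 now selects unstable
writes followed by a commit.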

I tested this with a program that allocates 256 page-sized and aligned
chunks of data into an array of iovecs, opens a file with O_DIRECT, and
then passes that into a writev call 128 times. Without this patch, it
took 5m16s to run on my (admittedly crappy) test rig. With this patch,
it finished in 7.5s.
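
A test along the lines described above might look roughly like the
following. This is a reconstruction from the description, not the
program that was actually run: alloc_iovecs(), dio_writev_test(), the
fill pattern, and the open flags other than O_DIRECT are all
assumptions.

```c
#define _GNU_SOURCE		/* exposes O_DIRECT with glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define NR_IOV		256	/* iovecs per writev call */
#define ITERATIONS	128	/* number of writev calls */

/* Fill iov[0..n-1] with page-aligned, page-sized buffers. */
static int alloc_iovecs(struct iovec *iov, int n, size_t pagesz)
{
	assert(iov != NULL && n > 0);

	for (int i = 0; i < n; i++) {
		void *buf;

		if (posix_memalign(&buf, pagesz, pagesz) != 0) {
			/* Undo the allocations made so far. */
			while (i-- > 0)
				free(iov[i].iov_base);
			return -1;
		}
		memset(buf, 'a', pagesz);	/* arbitrary fill pattern */
		iov[i].iov_base = buf;
		iov[i].iov_len = pagesz;
	}
	return 0;
}

/* Issue ITERATIONS writev calls of NR_IOV pages each; 0 on success. */
static int dio_writev_test(const char *path)
{
	struct iovec iov[NR_IOV];
	long ps = sysconf(_SC_PAGESIZE);
	int fd, ret = -1;

	if (ps <= 0 || alloc_iovecs(iov, NR_IOV, (size_t)ps) != 0)
		return -1;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
	if (fd >= 0) {
		ret = 0;
		for (int i = 0; i < ITERATIONS; i++) {
			ssize_t n = writev(fd, iov, NR_IOV);

			/* Treat errors and short writes as failure. */
			if (n != (ssize_t)(NR_IOV * (size_t)ps)) {
				ret = -1;
				break;
			}
		}
		close(fd);
	}

	for (int i = 0; i < NR_IOV; i++)
		free(iov[i].iov_base);
	return ret;
}
```

Calling dio_writev_test() from main() with a path on an NFS mount and
running the result under time(1) should give a comparable measurement,
though the absolute numbers will depend on the server and the mount
options.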

Trond, would it be reasonable to take this patch as a stopgap measure
until your overhaul of the O_DIRECT code is finished?

Reported-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfs/direct.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 8eea253..9fc3430 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -871,9 +871,18 @@ static ssize_t nfs_direct_write(struct kiocb *iocb, const struct iovec *iov,
 	dreq = nfs_direct_req_alloc();
 	if (!dreq)
 		goto out;
-	nfs_alloc_commit_data(dreq);
 
-	if (dreq->commit_data == NULL || count <= wsize)
+	if (count > wsize || nr_segs > 1)
+		nfs_alloc_commit_data(dreq);
+	else
+		dreq->commit_data = NULL;
+
+	/*
+	 * If we couldn't allocate commit data, or we'll just be doing a
+	 * single write, then make this a NFS_FILE_SYNC write and do away
+	 * with the commit.
+	 */
+	if (dreq->commit_data == NULL)
 		sync = NFS_FILE_SYNC;
 
 	dreq->inode = inode;
-- 
1.7.1



* Re: [PATCH] BZ#694309: nfs: use unstable writes for groups of small DIO writes
  2011-04-14 12:43 [PATCH] BZ#694309: nfs: use unstable writes for groups of small DIO writes Jeff Layton
@ 2011-04-15  4:13 ` Christoph Hellwig
From: Christoph Hellwig @ 2011-04-15  4:13 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond.Myklebust, linux-nfs, pbadari, chuck.lever

On Thu, Apr 14, 2011 at 08:43:28AM -0400, Jeff Layton wrote:
> Currently, the client uses FILE_SYNC whenever an O_DIRECT write is no
> larger than the wsize. This is a problem, though, if we have a bunch of
> small iovecs batched up in a single writev call. The client will iterate
> over them and do a separate FILE_SYNC WRITE for each.
> 
> Instead, change the code to do unstable writes when we'll need to do
> multiple WRITE RPCs in order to satisfy the request. While we're at
> it, optimize away the allocation of commit_data when we aren't going
> to use it anyway.
> 
> I tested this with a program that allocates 256 page-sized and aligned
> chunks of data into an array of iovecs, opens a file with O_DIRECT, and
> then passes that into a writev call 128 times. Without this patch, it
> took 5m16s to run on my (admittedly crappy) test rig. With this patch,
> it finished in 7.5s.
> 
> Trond, would it be reasonable to take this patch as a stopgap measure
> until your overhaul of the O_DIRECT code is finished?

To me your patch looks like a good quick fix for this issue.  I'm
not actually sure what Trond's re-architecture is supposed to look like,
given that pagecache writeback and DIO writes are pretty fundamentally
differently driven, but I can't imagine a design that wouldn't allow for
a similar quirk on when to use stable writes and when not.

