From: Olaf Kirch <okir@suse.de>
To: nfs@lists.sourceforge.net
Subject: Re: nfsd write throughput
Date: Tue, 3 Aug 2004 12:32:14 +0200
Message-ID: <20040803103213.GE21365@suse.de>
In-Reply-To: <20040803075506.GL5581@sgi.com>

Hi folks,

I've been looking at the problem from a different angle...

Theory:
	The main bottleneck is that we spend a long time in commit(),
	blocking other WRITE calls from making any progress (thereby
	stalling all NFS clients). The reason is that we take inode->i_sem
	in nfsd_sync, but the writev() code wants to grab the same
	semaphore.
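
To make the theory concrete, here is a toy model (not kernel code) of one lock shared by a commit and incoming writes. The function name and the 400 ms hold time are illustrative assumptions; the only point is that a write arriving while the lock is held waits for the rest of the hold.

```python
def write_wait_ms(arrival_ms, commit_start_ms, commit_hold_ms):
    """Time a WRITE arriving at arrival_ms waits for a lock held by a commit.

    Toy model: one commit holds the lock over
    [commit_start_ms, commit_start_ms + commit_hold_ms).
    """
    commit_end_ms = commit_start_ms + commit_hold_ms
    if commit_start_ms <= arrival_ms < commit_end_ms:
        return commit_end_ms - arrival_ms
    return 0

# Assumed scenario: the commit takes the lock at t=0 and holds it for 400 ms.
waits = [write_wait_ms(t, 0, 400) for t in (10, 100, 399, 400)]
print(waits)  # [390, 300, 1, 0]
```

Every write that arrives inside the hold window is delayed until the commit finishes, no matter how cheap the write itself is.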

Circumstantial Evidence:

I've been doing some tests with the latencies of WRITE and COMMIT, using a
single stream write. The average time we spend in nfsd_write is minuscule;
usually it's less than 2 milliseconds. However, when a commit comes in,
we take a hit there as well - something around 500 ms for reiser, and
400 ms for ext3. Syncing to reiser frequently takes up to 1.2 seconds,
while the 400 ms for ext3 is pretty constant.
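
For anyone who wants a rough userspace analogue of this kind of measurement, the sketch below times individual write() calls and one final fsync() on a temporary file. It is not the instrumentation I used inside nfsd; the function name, buffer sizes and the temp file are assumptions, and the numbers will depend entirely on the local filesystem.

```python
import os
import tempfile
import time

def time_write_and_fsync(nbytes=1 << 20, chunk=32 * 1024):
    """Return (avg_write_ms, fsync_ms) for one sequential stream into a temp file."""
    buf = b"\0" * chunk
    fd, path = tempfile.mkstemp()
    try:
        write_ms = []
        for _ in range(nbytes // chunk):
            t0 = time.perf_counter()
            os.write(fd, buf)          # normally lands in the page cache only
            write_ms.append((time.perf_counter() - t0) * 1e3)
        t0 = time.perf_counter()
        os.fsync(fd)                   # forces the dirty data out to disk
        fsync_ms = (time.perf_counter() - t0) * 1e3
    finally:
        os.close(fd)
        os.unlink(path)
    return sum(write_ms) / len(write_ms), fsync_ms

avg_write_ms, fsync_ms = time_write_and_fsync()
print(f"avg write {avg_write_ms:.3f} ms, fsync {fsync_ms:.3f} ms")
```

On most setups the per-write cost should be small compared to the single fsync, which is the same shape as the WRITE/COMMIT split above.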

Right now, nfsd_sync calls

	filemap_fdatawrite
	filp->f_op->fsync
	filemap_fdatawait

all under the i_sem. However, it seems we don't need the i_sem for
the filemap_* functions (is that valid? At least sync_page_range doesn't
take it). So I changed the code to make it grab i_sem only for the fsync
call, but unfortunately, that doesn't seem to make much of a difference,
as I found out. Most of the time taken by a commit is spent in fsync
(the delta between the fsync latency and the overall commit latency is
usually less than 5 ms, i.e. ~1%).
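
A back-of-the-envelope version of that argument, as a small sketch. The split of 3 ms / 395 ms / 2 ms across the three calls is an assumption chosen to match a 400 ms commit with a 5 ms delta; only the total and the delta come from my measurements.

```python
def i_sem_hold_ms(fdatawrite_ms, fsync_ms, fdatawait_ms, lock_fsync_only):
    """Time i_sem is held during one commit under the two locking schemes."""
    if lock_fsync_only:
        return fsync_ms
    return fdatawrite_ms + fsync_ms + fdatawait_ms

# Assumed split: 400 ms commit, of which 5 ms is spent outside fsync.
before = i_sem_hold_ms(3, 395, 2, lock_fsync_only=False)
after = i_sem_hold_ms(3, 395, 2, lock_fsync_only=True)
saving = (before - after) / before
print(before, after, f"{saving:.2%}")  # 400 395 1.25%
```

As long as fsync dominates, narrowing the lock to the fsync call alone cannot buy more than that last percent or so.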

I also changed nfsd_sync to call filemap_fdatawrite_range instead of
filemap_fdatawrite, but that doesn't make a noticeable difference either.

I then re-enabled my flushfast hack, and the commit latencies went
down to 30 ms on ext3, with the occasional spike of 300 ms. On reiser,
the commit latency went down to something like 50 ms on average.

(The reiserfs rewrite case was fairly bad, however. Rewrite over NFS
on top of reiser is fairly slow to begin with, much slower than write;
and the gain from the flushfast patch is minimal - but that's a different
story)


Conclusion:

So this at least supports my theory that the commits are throttling the
writes quite a bit. For the sake of completeness, I did some more iozone
measurements, and on write/rewrite the performance gain is about 50%
on both reiser and ext3, for a single client. I would think for several
clients writing concurrently, the gain should be even more pronounced,
but I haven't run these tests yet.

I'm wondering what could happen if we change nfsd_sync to not take the
i_sem at all... I'll talk to a few VFS folks around here and try to find out.


PS:

Another thing I noticed was that the commit calls sent by the Linux
client (2.6.5) are not evenly distributed over time. Much of the time,
the client will call COMMIT 4-6 times a second, and then all of a sudden
I see 30-80 calls a second several times in a row.
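
If someone wants to check this on their own traces, bucketing the COMMIT timestamps per second is enough to see the bursts. A minimal sketch follows; the timestamps are made up for illustration and the function name is mine.

```python
from collections import Counter

def commits_per_second(timestamps):
    """Count events per whole second; timestamps are non-negative seconds."""
    return Counter(int(t) for t in timestamps)

# Hypothetical trace: 5 COMMITs spread over second 0, then 40 in second 1.
ts = [i * 0.2 for i in range(5)] + [1 + i * 0.025 for i in range(40)]
rate = commits_per_second(ts)
print(sorted(rate.items()))  # [(0, 5), (1, 40)]
```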

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+ 


