From: Olaf Kirch <okir@suse.de>
To: nfs@lists.sourceforge.net
Subject: Re: nfsd write throughput
Date: Tue, 3 Aug 2004 12:32:14 +0200
Message-ID: <20040803103213.GE21365@suse.de>
In-Reply-To: <20040803075506.GL5581@sgi.com>
Hi folks,
I've been looking at the problem from a different angle...
Theory:
The main bottleneck is that we spend a long time in commit(),
blocking other WRITE calls from making any progress (thereby
stalling all NFS clients). The reason is that we take inode->i_sem
in nfsd_sync, but the writev() code wants to grab the same
semaphore.
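To make the theory concrete, here is a small worked example in C. It is a
toy model, not nfsd code: a single exclusive lock stands in for
inode->i_sem, and the function names and the arrival time are made up for
illustration. The 400 ms and 2 ms figures are the ballpark latencies
reported further down in this message.

```c
#include <assert.h>

/*
 * Toy model (hypothetical, not kernel code): when does a WRITE finish if it
 * needs a lock that a COMMIT holds until lock_free_at_ms?
 * All times are milliseconds on the same clock.
 */
long write_done_at_ms(long arrival_ms, long service_ms, long lock_free_at_ms)
{
    assert(service_ms >= 0);
    /* The write cannot start before it arrives or before the lock is free. */
    long start_ms = arrival_ms > lock_free_at_ms ? arrival_ms : lock_free_at_ms;
    return start_ms + service_ms;
}

/* Latency seen by the client: completion time minus arrival time. */
long write_latency_ms(long arrival_ms, long service_ms, long lock_free_at_ms)
{
    return write_done_at_ms(arrival_ms, service_ms, lock_free_at_ms) - arrival_ms;
}
```

In this model, a 2 ms write that arrives 10 ms into a commit holding the
lock until 400 ms sees a latency of 392 ms (write_latency_ms(10, 2, 400)),
against 2 ms when the lock is free (write_latency_ms(10, 2, 0)).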
Circumstantial Evidence:
I've been doing some tests with the latencies of WRITE and COMMIT, using a
single stream write. The average time we spend in nfsd_write is minuscule,
usually less than 2 milliseconds. However, when a commit comes in,
we take a hit there as well - something around 500 ms for reiser, and
400 ms for ext3. Syncing to reiser frequently takes up to 1.2 seconds,
while the 400 ms for ext3 is pretty constant.
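The message does not say how these latencies were taken. Purely as an
illustration, the sketch below shows one common way to time a single call
in userspace with a monotonic clock. The helper names elapsed_ms and
time_call_ms are hypothetical, and this is not the instrumentation used
for the numbers above.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Milliseconds between two clock readings (end should not precede start). */
double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    double sec  = (double)(end->tv_sec - start->tv_sec);
    double nsec = (double)(end->tv_nsec - start->tv_nsec);
    return sec * 1000.0 + nsec / 1e6;
}

/* Run fn(arg) once and return how long it took, in milliseconds. */
double time_call_ms(void (*fn)(void *), void *arg)
{
    struct timespec t0, t1;
    int rc;

    assert(fn != NULL);
    rc = clock_gettime(CLOCK_MONOTONIC, &t0);
    assert(rc == 0);
    fn(arg);
    rc = clock_gettime(CLOCK_MONOTONIC, &t1);
    assert(rc == 0);
    (void)rc;  /* silence unused-variable warnings when NDEBUG is set */
    return elapsed_ms(&t0, &t1);
}
```

CLOCK_MONOTONIC is used rather than wall-clock time so that clock
adjustments during a long sync do not distort the measurement.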
Right now, nfsd_sync calls
	filemap_fdatawrite
	filp->f_op->fsync
	filemap_fdatawait
all under the i_sem. However, it seems we don't need the i_sem for
the filemap_* functions (is that valid? At least sync_page_range
doesn't take it there). So I changed the code to make it grab i_sem only for the fsync
call, but unfortunately, that doesn't seem to make much of a difference,
as I found out. Most of the time taken by a commit is spent in fsync
(the delta between the fsync latency and the overall commit latency is
usually less than 5 ms, i.e. ~1%).
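The difference between the two lock scopes can be sketched in userspace C.
Everything here is a stand-in: struct toy_inode, toy_down/toy_up and
toy_step are hypothetical names, a boolean models holding i_sem, and each
step merely records whether the semaphore was held when it ran. This is
not the actual nfsd_sync change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; sem_held models holding inode->i_sem. */
struct toy_inode {
    bool sem_held;        /* true while the toy semaphore is held */
    int steps_total;      /* sync steps executed so far */
    int steps_under_sem;  /* of those, how many ran with the semaphore held */
};

static void toy_down(struct toy_inode *ino)
{
    assert(!ino->sem_held);   /* no recursion in this toy model */
    ino->sem_held = true;
}

static void toy_up(struct toy_inode *ino)
{
    assert(ino->sem_held);
    ino->sem_held = false;
}

/* Placeholder for filemap_fdatawrite, ->fsync and filemap_fdatawait alike. */
static void toy_step(struct toy_inode *ino)
{
    ino->steps_total++;
    if (ino->sem_held)
        ino->steps_under_sem++;
}

/* Ordering described as "right now": all three steps under the semaphore. */
void toy_sync_wide(struct toy_inode *ino)
{
    toy_down(ino);
    toy_step(ino);  /* filemap_fdatawrite */
    toy_step(ino);  /* filp->f_op->fsync  */
    toy_step(ino);  /* filemap_fdatawait  */
    toy_up(ino);
}

/* Experimental ordering: only the fsync step holds the semaphore. */
void toy_sync_narrow(struct toy_inode *ino)
{
    toy_step(ino);  /* filemap_fdatawrite, unlocked */
    toy_down(ino);
    toy_step(ino);  /* filp->f_op->fsync, locked */
    toy_up(ino);
    toy_step(ino);  /* filemap_fdatawait, unlocked */
}
```

The narrow variant runs one step out of three under the semaphore instead
of all three. As noted above, that alone buys little when the fsync step
accounts for nearly all of the commit latency.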
I also changed nfsd_sync to call filemap_fdatawrite_range instead of
filemap_fdatawrite, but that doesn't make a noticeable difference either.
I then re-enabled my flushfast hack, and the commit latencies went
down to 30 ms on ext3, with the occasional spike of 300 ms. On reiser,
the commit latency went down to something like 50 ms on average.
(The reiserfs rewrite case was fairly bad, however. Rewrite over NFS
on top of reiser is fairly slow to begin with, much slower than write;
and the gain from the flushfast patch is minimal - but that's a different
story.)
Conclusion:
So this at least supports my theory that the commits are throttling the
writes quite a bit. For the sake of completeness, I did some more iozone
measurements, and on write/rewrite the performance gain is about 50%
on both reiser and ext3, for a single client. I would think for several
clients writing concurrently, the gain should be even more pronounced,
but I haven't run these tests yet.
I'm wondering what could happen if we change nfsd_sync to not take the
i_sem at all... I'll talk to a few VFS folks around here and try to find out.
PS:
Another thing I noticed was that the commit calls sent by the Linux
client (2.6.5) are not evenly distributed over time. Much of the time,
the client will call COMMIT 4-6 times a second, and then all of a sudden
I see 30-80 calls a second several times in a row.
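One way to look at such bursts is to bucket COMMIT arrival times by
second. The sketch below assumes a trace of timestamps in milliseconds
since the start of the capture; that format and the function names
commits_per_second and busiest_second are assumptions for illustration,
not how the figures above were obtained.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count events per one-second bucket. ts_ms holds event times in
 * milliseconds since the start of the trace; events outside
 * [0, nbuckets seconds) are ignored. Returns the number counted.
 */
size_t commits_per_second(const long *ts_ms, size_t n,
                          unsigned *buckets, size_t nbuckets)
{
    size_t counted = 0;

    assert(buckets != NULL || nbuckets == 0);
    for (size_t i = 0; i < nbuckets; i++)
        buckets[i] = 0;
    for (size_t i = 0; i < n; i++) {
        if (ts_ms[i] < 0)
            continue;
        size_t sec = (size_t)(ts_ms[i] / 1000);
        if (sec >= nbuckets)
            continue;
        buckets[sec]++;
        counted++;
    }
    return counted;
}

/* Index of the busiest second (the first one on ties). */
size_t busiest_second(const unsigned *buckets, size_t nbuckets)
{
    size_t best = 0;

    assert(nbuckets > 0);
    for (size_t i = 1; i < nbuckets; i++)
        if (buckets[i] > buckets[best])
            best = i;
    return best;
}
```

A run of buckets holding 30 or more calls next to buckets holding 4 to 6
would show up directly in the resulting array.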
Olaf
--
Olaf Kirch | The Hardware Gods hate me.
okir@suse.de |
---------------+