From: Jim Callahan <callahan@temerity.us>
To: linux-nfs@vger.kernel.org
Subject: Best A->B large file copy performance
Date: Thu, 12 Mar 2009 17:00:59 -0400
Message-ID: <49B9780B.2020609@temerity.us>
I'm trying to determine the best way to have a single NFS client
copy a large number (100-1000) of fairly large (1-50M) files from one
location on a file server to another location on the same file server.
There seem to be several API layers which influence this:
1. Number of OS level processes performing the copy in parallel.
2. Record size used by the C-library read()/write() calls from these
processes.
3. NFS client rsize/wsize settings.
4. Ethernet MTU size.
5. Bandwidth of the ethernet network and switches.
So far we've played around with larger MTU and rsize/wsize settings
without seeing a huge difference. Since we have been using "cp" for
the copies, we've not tweaked the record size (2) at all at this point.
My suspicion is that we should be carefully coordinating the sizes
specified for layers 2, 3 and 4. Perhaps we should be using "dd"
instead of "cp" so we can control the record size being used. Since
the number of permutations of these three settings is large, I was
hoping that I might get some advice from this list about a range of
values we should be investigating, and about any unpleasant
interactions between these levels of settings we should be aware of,
to narrow our search. Also, if there are other major factors outside
those listed, I'd appreciate being pointed in the right direction.
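
To make the record-size experiment in (2) concrete, here is a minimal
sketch of a dd-style copy where the read()/write() size is an explicit
parameter. It assumes Python 3 is available on the client; the function
names copy_with_record_size and time_copy are made up for illustration
and are not part of any existing tool.

```python
import os
import time


def copy_with_record_size(src, dst, record_size):
    """Copy src to dst using read()/write() calls of at most record_size bytes.

    Returns the number of bytes copied.
    """
    total = 0
    fd_in = os.open(src, os.O_RDONLY)
    try:
        fd_out = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                buf = os.read(fd_in, record_size)
                if not buf:
                    break  # end of file
                view = memoryview(buf)
                # write() may accept fewer bytes than requested, so loop
                # until the whole record has been handed to the kernel.
                while len(view) > 0:
                    written = os.write(fd_out, view)
                    view = view[written:]
                total += len(buf)
            # Flush dirty pages so a timing includes the actual write-out
            # rather than just filling the client's page cache.
            os.fsync(fd_out)
        finally:
            os.close(fd_out)
    finally:
        os.close(fd_in)
    return total


def time_copy(src, dst, record_size):
    """Return (bytes_copied, elapsed_seconds) for a single copy."""
    start = time.perf_counter()
    nbytes = copy_with_record_size(src, dst, record_size)
    return nbytes, time.perf_counter() - start
```

Calling time_copy() on the same source file with a handful of record
sizes (say 4K up to 1M, values chosen arbitrarily here) for each
rsize/wsize and MTU combination would give one number per permutation.
Repeated reads of the same source may be served from the client's cache,
so a different source file per run is probably needed for fair numbers.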
---
While I'm on the subject, has there been any discussion about adding an
NFS request that would allow copying files from one location to another
on the same NFS server without requiring a round trip to a client? It's
not at all uncommon to need to move data around in this manner and it
seems a huge waste of bandwidth to have to send all this data from the
server to the client just to have the client send the data back
unaltered to a different location. Such a COPY request would be high
level along the lines of RENAME and each server vendor could optimize
this for their particular hardware architecture. For our particular
application, having such a request would make a huge difference in
performance.
--
Jim Callahan - President - Temerity Software <www.temerity.us>