From: Ben Greear <greearb@candelatech.com>
To: Andy Adamson <andros@netapp.com>
Cc: quanli gui <gqlxj1987@gmail.com>,
	Trond Myklebust <Trond.Myklebust@netapp.com>,
	Benny Halevy <bhalevy@tonian.com>,
	linux-nfs@vger.kernel.org, "Mueller,
	Brian" <bmueller@panasas.com>
Subject: Re: [nfsv4]nfs client bug
Date: Thu, 30 Jun 2011 09:57:56 -0700
Message-ID: <4E0CAB14.6070206@candelatech.com>
In-Reply-To: <7CEE6045-810F-4381-AC81-7275F2F31A88@netapp.com>

On 06/30/2011 09:26 AM, Andy Adamson wrote:
>
> On Jun 30, 2011, at 11:52 AM, quanli gui wrote:
>
>> Thanks for your tips. I will try to test by using the tips.
>>
>> But I do have a question about NFSv4 performance, specifically about the
>> NFSv4 client code: the performance I measured is slow. Do you have any
>> NFSv4 performance test results?
>
>
> I'm just beginning to test an NFSv4.0 Linux client against a Linux server.  Both are Fedora 13 with the 3.0-rc1 kernel and 10G interfaces.
>
> I'm getting ~5 Gb/sec with iperf and ~3.5 Gb/sec for NFSv4.0 READs using iozone. Much more testing/tuning to do.
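
For anyone trying to reproduce that kind of comparison, a rough sketch of the two
measurements (hostnames, sizes, and stream counts are illustrative, not the exact
invocations used above):

# Raw TCP baseline with iperf (iperf2 syntax)
server% iperf -s
client% iperf -c server -t 30 -P 4        # 4 parallel streams for 30 seconds

# NFS READ throughput with iozone over the mount
# -i 0 = write (creates the file), -i 1 = read, -r = record size, -s = file size
client% iozone -i 0 -i 1 -r 128k -s 4g -f /mnt/nfs/iozone.tmp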

We've nearly saturated two 10G links (about 17 Gbps total) using older kernels
(around 2.6.34) with Linux clients and Linux servers.  We use a RAM-backed filesystem
on the server side to make sure disk access isn't the bottleneck, and fast 10G NICs
with TCP offload enabled (Intel 82599 on a 5 GT/s PCIe bus).
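
For reference, a RAM-backed export like that needs nothing exotic.  A minimal sketch
(paths, sizes, and export options are illustrative; older kernels may also want an
fsid=0 pseudo-root for NFSv4):

# Server: back the export with tmpfs so disk speed is out of the picture
mkdir -p /export/ramdisk
mount -t tmpfs -o size=16g tmpfs /export/ramdisk
echo '/export/ramdisk *(rw,async,no_root_squash)' >> /etc/exports
exportfs -ra

# Client: mount it over NFSv4
mount -t nfs4 -o rsize=131072,wsize=131072 server:/export/ramdisk /mnt/nfs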

We haven't benchmarked this particular setup lately...

Thanks,
Ben

>
> -->Andy
>>
>> On Thu, Jun 30, 2011 at 10:24 PM, Trond Myklebust
>> <Trond.Myklebust@netapp.com>  wrote:
>>> On Thu, 2011-06-30 at 09:36 -0400, Andy Adamson wrote:
>>>> On Jun 29, 2011, at 10:32 PM, quanli gui wrote:
>>>>
>>>>> When I use iperf from one client to the 4 data servers (DSes), the network
>>>>> throughput is 890 MB/s, which confirms the 10GbE path is indeed non-blocking.
>>>>>
>>>>> a. About block size: I use bs=1M with dd.
>>>>> b. We do use TCP (doesn't NFSv4 use TCP by default?)
>>>>> c. What are jumbo frames, and how do I set the MTU?
>>>>>
>>>>> Brian, do you have some more tips?
>>>>
>>>> 1) Set the MTU on both the client's and the server's 10G interfaces. Sometimes 9000 is too high; my setup uses 8000.
>>>> To set the MTU on interface eth0:
>>>>
>>>> % ifconfig eth0 mtu 9000
>>>>
>>>> iperf will report the MTU of the full path between client and server - use it to verify the MTU of the connection.
>>>>
>>>> 2) Increase the # of rpc_slots on the client.
>>>> % echo 128 > /proc/sys/sunrpc/tcp_slot_table_entries
>>>>
>>>> 3) Increase the # of server threads
>>>>
>>>> % echo 128 > /proc/fs/nfsd/threads
>>>> % service nfs restart
>>>>
>>>> 4) Ensure the TCP buffers on both the client and the server are large enough for the TCP window.
>>>> Calculate the required buffer size by pinging the server from the client with MTU-sized packets and multiplying the round-trip time by the interface capacity:
>>>>
>>>> % ping -s 9000 server        (say the average round-trip time is 108 ms)
>>>>
>>>> 10 Gbit/sec = 1,250,000,000 bytes/sec; 1,250,000,000 bytes/sec * 0.108 sec = 135,000,000 bytes
>>>>
>>>> Use this number to set the following:
>>>> sysctl -w net.core.rmem_max=135000000
>>>> sysctl -w net.core.wmem_max=135000000
>>>> sysctl -w net.ipv4.tcp_rmem="<first number unchanged> <second number unchanged> 135000000"
>>>> sysctl -w net.ipv4.tcp_wmem="<first number unchanged> <second number unchanged> 135000000"
>>>> (see the combined sketch after step 5 below)
>>>>
>>>> 5) mount with rsize=131072,wsize=131072
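
Pulling steps 2-5 together, a rough client-side sketch (the buffer size is the
bandwidth * RTT figure from step 4; substitute your own measured RTT, and the first
two tcp_rmem/tcp_wmem values are read back from the kernel rather than guessed):

#!/bin/sh
# Client-side tuning sketch for a 10G link; all values are illustrative.
BUF=135000000                      # bandwidth * RTT from the ping measurement above

# Step 2: allow more outstanding RPCs per TCP connection
echo 128 > /proc/sys/sunrpc/tcp_slot_table_entries

# Step 4: raise only the maximum (third) value; keep the kernel's current min/default
read RMIN RDEF _ < /proc/sys/net/ipv4/tcp_rmem
read WMIN WDEF _ < /proc/sys/net/ipv4/tcp_wmem
sysctl -w net.core.rmem_max=$BUF
sysctl -w net.core.wmem_max=$BUF
sysctl -w net.ipv4.tcp_rmem="$RMIN $RDEF $BUF"
sysctl -w net.ipv4.tcp_wmem="$WMIN $WDEF $BUF"

# Step 5: mount with large read/write sizes
mount -t nfs4 -o rsize=131072,wsize=131072 server:/export /mnt/nfs

# Step 3 runs on the server side:
#   echo 128 > /proc/fs/nfsd/threads
#   service nfs restart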
>>>
>>> 6) Note that NFS always guarantees that the file is _on_disk_ after
>>> close(), so if you are using 'dd' to test, then you should be using the
>>> 'conv=fsync' flag (i.e. 'dd if=/dev/zero of=test count=20k conv=fsync')
>>> in order to obtain a fair comparison between the NFS and local disk
>>> performance. Otherwise, you are comparing NFS and local _pagecache_
>>> performance.
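
Concretely, a fair pair of runs might look like this (the mount point, local path,
and block size are placeholders; the key is using conv=fsync on both sides):

# Over NFS: conv=fsync makes dd's timing include the final commit to the server
dd if=/dev/zero of=/mnt/nfs/test bs=1M count=20k conv=fsync

# Local disk: conv=fsync flushes the page cache before dd exits,
# so you time the disk rather than memory
dd if=/dev/zero of=/local/scratch/test bs=1M count=20k conv=fsync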
>>>
>>> Trond
>>> --
>>> Trond Myklebust
>>> Linux NFS client maintainer
>>>
>>> NetApp
>>> Trond.Myklebust@netapp.com
>>> www.netapp.com
>>>
>>>
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

