linux-fsdevel.vger.kernel.org archive mirror
From: Dean Hildebrand <seattleplus@gmail.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
	John Stoffel <john@stoffel.org>,
	Dave Chinner <david@fromorbit.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] nfs: use 4*rsize readahead size
Date: Wed, 14 Apr 2010 14:22:43 -0700
Message-ID: <4BC63223.8050604@gmail.com>
In-Reply-To: <20100303032724.GA9979@localhost>

You cannot simply update the Linux system TCP parameters and expect NFS
to perform well over the WAN.  The NFS server does not use the system
TCP parameters.  This is a long-standing issue.  A patch was originally
added in 2.6.30 that enabled NFS to use Linux TCP buffer autotuning,
which would resolve the issue, but a regression was reported
(http://thread.gmane.org/gmane.linux.kernel/826598 ) and so the patch
was removed.
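
To illustrate why the system-wide settings may not reach a given
socket, here is a small user-space sketch in Python.  It is only an
analogy: the in-kernel NFS server code is not shown here and may
differ.  It relies on documented Linux socket behaviour: once a buffer
size is set explicitly with SO_RCVBUF, that socket is no longer
autotuned, and the request is capped by net.core.rmem_max rather than
taken from net.ipv4.tcp_rmem.  The function name and the 4 MiB request
are made up for the example.

```python
import socket


def requested_vs_effective(requested_bytes):
    """Return (default, effective) SO_RCVBUF sizes for a fresh TCP socket.

    Hypothetical helper for illustration only. On Linux the kernel
    doubles the requested value for bookkeeping and caps the request at
    net.core.rmem_max; other systems may behave differently.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Size the kernel picked on its own (from tcp_rmem on Linux).
        default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        # Explicit request: from here on the size is fixed by the caller.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return default, effective


if __name__ == "__main__":
    default, effective = requested_vs_effective(4 * 1024 * 1024)
    print("default:", default, "effective after setsockopt:", effective)
```

If the effective value comes back far below the request, the cap
(net.core.rmem_max on Linux) is the limit in play, not tcp_rmem.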

Maybe it's time to rethink allowing users to manually set Linux NFS
server TCP buffer sizes?  Years have passed on this subject and people
are still waiting.  Good performance over the WAN will require manually
setting TCP buffer sizes.  As mentioned in the regression thread,
autotuning can reduce performance by up to 10%.  Here is a patch
(slightly outdated) that creates 2 sysctls that allow users to manually
set NFS TCP buffer sizes.  The first link also has a fair amount of
background information on the subject.
http://www.spinics.net/lists/linux-nfs/msg01338.html
http://www.spinics.net/lists/linux-nfs/msg01339.html
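
For a rough sense of how large such buffers would need to be, here is a
back-of-the-envelope sketch of the bandwidth-delay product, using the
numbers from the quoted thread below (10GigE, ~312ms ping, ~10MB/s
measured).  It is an illustration only and not part of the patches
linked above.  The function names are made up, MB is taken as 10^6
bytes, and the 400ms round trip assumes that the 200ms netem delay
applied on both sides adds up in each direction.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bytes that must be in flight to keep a link of this speed busy."""
    # bits/s * ms -> bytes: divide by 8 (bits per byte) and 1000 (ms per s).
    return bandwidth_bits_per_s * rtt_ms // 8000


def implied_window_bytes(throughput_bytes_per_s, rtt_ms):
    """Window that a single TCP stream is effectively using."""
    # A stream moves at most one window of data per round trip.
    return throughput_bytes_per_s * rtt_ms // 1000


if __name__ == "__main__":
    # 10 Gbit/s at ~312 ms round trip needs ~390 MB in flight.
    print(bdp_bytes(10_000_000_000, 312))
    # ~10 MB/s at an assumed 400 ms round trip implies a ~4 MB window.
    print(implied_window_bytes(10_000_000, 400))
```

On those assumptions, the measured ~10MB/s corresponds to roughly 4 MB
of data in flight, far less than the sysctl limits in the quoted
settings would allow.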

Dean


Wu Fengguang wrote:
> On Wed, Mar 03, 2010 at 02:42:19AM +0800, Trond Myklebust wrote:
>   
>> On Tue, 2010-03-02 at 12:33 -0500, John Stoffel wrote: 
>>     
>>>>>>>> "Trond" == Trond Myklebust <Trond.Myklebust@netapp.com> writes:
>>>>>>>>                 
>>> Trond> On Tue, 2010-03-02 at 11:10 +0800, Wu Fengguang wrote: 
>>>       
>>>>> Dave,
>>>>>
>>>>> Here is one more test on a big ext4 disk file:
>>>>>
>>>>> 16k	39.7 MB/s
>>>>> 32k	54.3 MB/s
>>>>> 64k	63.6 MB/s
>>>>> 128k	72.6 MB/s
>>>>> 256k	71.7 MB/s
>>>>> rsize ==> 512k  71.7 MB/s
>>>>> 1024k	72.2 MB/s
>>>>> 2048k	71.0 MB/s
>>>>> 4096k	73.0 MB/s
>>>>> 8192k	74.3 MB/s
>>>>> 16384k	74.5 MB/s
>>>>>
>>>>> It shows that >=128k client side readahead is enough for single disk
>>>>> case :) As for RAID configurations, I guess big server side readahead
>>>>> should be enough.
>>>>>           
>>> Trond> There are lots of people who would like to use NFS on their
>>> Trond> company WAN, where you typically have high bandwidths (up to
>>> Trond> 10GigE), but often a high latency too (due to geographical
>>> Trond> dispersion).  My ping latency from here to a typical server in
>>> Trond> NetApp's Bangalore office is ~ 312ms. I read your test results
>>> Trond> with 10ms delays, but have you tested with higher than that?
>>>
>>> If you have that high a latency, the low level TCP protocol is going
>>> to kill your performance before you get to the NFS level.  You really
>>> need to open up the TCP window size at that point.  And it only gets
>>> worse as the bandwidth goes up too.  
>>>       
>> Yes. You need to open the TCP window in addition to reading ahead
>> aggressively.
>>     
>
> I only get ~10MB/s throughput with following settings.
>
> # huge NFS ra size
> echo 89512 > /sys/devices/virtual/bdi/0:15/read_ahead_kb        
>
> # on both sides
> /sbin/tc qdisc add dev eth0 root netem delay 200ms              
>
> net.core.rmem_max = 873800000
> net.core.wmem_max = 655360000
> net.ipv4.tcp_rmem = 8192 87380000 873800000
> net.ipv4.tcp_wmem = 4096 65536000 655360000
>
> Did I miss something?
>
> Thanks,
> Fengguang



Thread overview: 21+ messages
2010-02-24  2:41 [RFC] nfs: use 2*rsize readahead size Wu Fengguang
2010-02-24  3:29 ` Dave Chinner
2010-02-24  4:18   ` Wu Fengguang
2010-02-24  5:22     ` Dave Chinner
2010-02-24  6:12       ` Wu Fengguang
2010-02-24  7:39         ` Dave Chinner
2010-02-26  7:49           ` [RFC] nfs: use 4*rsize " Wu Fengguang
2010-03-02  3:10             ` Wu Fengguang
2010-03-02 14:19               ` Trond Myklebust
2010-03-02 17:33                 ` John Stoffel
     [not found]                   ` <19341.19446.356359.99958-HgN6juyGXH5AfugRpC6u6w@public.gmane.org>
2010-03-02 18:42                     ` Trond Myklebust
2010-03-03  3:27                       ` Wu Fengguang
2010-04-14 21:22                         ` Dean Hildebrand [this message]
2010-03-02 20:14               ` Bret Towe
2010-03-03  1:43                 ` Wu Fengguang
     [not found]       ` <20100224052215.GH16175-CJ6yYqJ1V6CgjvmRZuSThA@public.gmane.org>
2010-02-24 11:18         ` [RFC] nfs: use 2*rsize " Akshat Aranya
2010-02-25 12:37           ` Wu Fengguang
2010-02-24  4:24   ` Dave Chinner
2010-02-24  4:33     ` Wu Fengguang
     [not found]     ` <20100224042414.GG16175-CJ6yYqJ1V6CgjvmRZuSThA@public.gmane.org>
2010-02-24  4:43       ` Wu Fengguang
2010-02-24  5:24         ` Dave Chinner
