From: Wu Fengguang <fengguang.wu@intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] nfs: use 2*rsize readahead size
Date: Wed, 24 Feb 2010 12:33:24 +0800
Message-ID: <20100224043324.GA31913@localhost>
In-Reply-To: <20100224042414.GG16175@discord.disaster>

On Wed, Feb 24, 2010 at 12:24:14PM +0800, Dave Chinner wrote:
> On Wed, Feb 24, 2010 at 02:29:34PM +1100, Dave Chinner wrote:
> > On Wed, Feb 24, 2010 at 10:41:01AM +0800, Wu Fengguang wrote:
> > > With default rsize=512k and NFS_MAX_READAHEAD=15, the current NFS
> > > readahead size 512k*15=7680k is larger than necessary for typical
> > > clients.
> > > 
> > > On an e1000e--e1000e connection, I got the following numbers:
> > > 
> > > 	readahead size		throughput
> > > 		   16k           35.5 MB/s
> > > 		   32k           54.3 MB/s
> > > 		   64k           64.1 MB/s
> > > 		  128k           70.5 MB/s
> > > 		  256k           74.6 MB/s
> > > rsize ==>	  512k           77.4 MB/s
> > > 		 1024k           85.5 MB/s
> > > 		 2048k           86.8 MB/s
> > > 		 4096k           87.9 MB/s
> > > 		 8192k           89.0 MB/s
> > > 		16384k           87.7 MB/s
> > > 
> > > So it seems that readahead_size=2*rsize (i.e. keeping two RPC requests
> > > in flight) already gets close to full NFS bandwidth.
> > > 
> > > The test script is:
> > > 
> > > #!/bin/sh
> > > 
> > > file=/mnt/sparse
> > > BDI=0:15
> > > 
> > > for rasize in 16 32 64 128 256 512 1024 2048 4096 8192 16384
> > > do
> > > 	echo 3 > /proc/sys/vm/drop_caches
> > > 	echo $rasize > /sys/devices/virtual/bdi/$BDI/read_ahead_kb
> > > 	echo readahead_size=${rasize}k
> > > 	dd if=$file of=/dev/null bs=4k count=1024000
> > > done
> > 
> > That's doing a cached read out of the server cache, right? You
> > might find the results are different if the server has to read the
> > file from disk. I would expect reads from the server cache not
> > to require much readahead as there is no IO latency on the server
> > side for the readahead to hide....
> 
> FWIW, if you mount the client with "-o rsize=32k" or the server only
> supports rsize <= 32k then this will probably hurt throughput a lot
> because then readahead will be capped at 64k instead of 480k....

That's why I take the max of 2*rsize and the system default readahead
size (which will be enlarged to 512K):

-       server->backing_dev_info.ra_pages = server->rpages * NFS_MAX_READAHEAD;
+       server->backing_dev_info.ra_pages = max_t(unsigned long,
+                                             default_backing_dev_info.ra_pages,
+                                             2 * server->rpages);
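
To make the effect concrete, here is a rough sketch of the arithmetic in
shell, working in KB rather than pages. It assumes the system default
readahead has already been raised to 512K. The function and variable names
are made up for illustration and are not part of the patch.

```shell
#!/bin/sh
# Sketch only: compare the old and proposed NFS readahead sizes, in KB.
# Assumption: the system default readahead is already 512K.
# old_ra_kb/new_ra_kb are hypothetical helpers, not kernel functions.

NFS_MAX_READAHEAD=15
DEFAULT_RA_KB=512

# old behaviour: rsize * NFS_MAX_READAHEAD
old_ra_kb() {
	echo $(( $1 * NFS_MAX_READAHEAD ))
}

# proposed behaviour: max(system default readahead, 2 * rsize)
new_ra_kb() {
	ra=$(( 2 * $1 ))
	if [ "$ra" -lt "$DEFAULT_RA_KB" ]; then
		ra=$DEFAULT_RA_KB
	fi
	echo "$ra"
}

for rsize in 32 512
do
	echo "rsize=${rsize}k old=$(old_ra_kb "$rsize")k new=$(new_ra_kb "$rsize")k"
done
```

Under that assumption, rsize=32k goes from 480k to 512k rather than down
to 64k, and rsize=512k goes from 7680k to 1024k.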

Thanks,
Fengguang
