From: Wu Fengguang <fengguang.wu@intel.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: [RFC] nfs: use 2*rsize readahead size
Date: Wed, 24 Feb 2010 10:41:01 +0800
Message-ID: <20100224024100.GA17048@localhost>

With the default rsize=512k and NFS_MAX_READAHEAD=15, the current NFS
readahead size of 512k*15=7680k is larger than necessary for typical
clients.

On an e1000e--e1000e connection, I got the following numbers:

	readahead size		throughput
		   16k		 35.5 MB/s
		   32k		 54.3 MB/s
		   64k		 64.1 MB/s
		  128k		 70.5 MB/s
		  256k		 74.6 MB/s
	rsize ==> 512k		 77.4 MB/s
		 1024k		 85.5 MB/s
		 2048k		 86.8 MB/s
		 4096k		 87.9 MB/s
		 8192k		 89.0 MB/s
		16384k		 87.7 MB/s

So it seems that readahead_size=2*rsize (i.e. keeping two RPC requests in
flight) already gets close to full NFS bandwidth: 85.5 MB/s at 1024k versus
the 89.0 MB/s peak at 8192k.
The test script is:
	#!/bin/sh

	file=/mnt/sparse
	BDI=0:15

	for rasize in 16 32 64 128 256 512 1024 2048 4096 8192 16384
	do
		# drop the page cache so every run reads over the wire
		echo 3 > /proc/sys/vm/drop_caches
		# set the readahead window (in KB) for this BDI
		echo $rasize > /sys/devices/virtual/bdi/$BDI/read_ahead_kb
		echo readahead_size=${rasize}k
		dd if=$file of=/dev/null bs=4k count=1024000
	done

CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
fs/nfs/client.c | 4 +++-
fs/nfs/internal.h | 8 --------
2 files changed, 3 insertions(+), 9 deletions(-)
--- linux.orig/fs/nfs/client.c 2010-02-23 11:15:44.000000000 +0800
+++ linux/fs/nfs/client.c 2010-02-24 10:16:00.000000000 +0800
@@ -889,7 +889,9 @@ static void nfs_server_set_fsinfo(struct
server->rpages = (server->rsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
server->backing_dev_info.name = "nfs";
- server->backing_dev_info.ra_pages = server->rpages * NFS_MAX_READAHEAD;
+ server->backing_dev_info.ra_pages = max_t(unsigned long,
+ default_backing_dev_info.ra_pages,
+ 2 * server->rpages);
server->backing_dev_info.capabilities |= BDI_CAP_ACCT_UNSTABLE;
if (server->wsize > max_rpc_payload)
--- linux.orig/fs/nfs/internal.h 2010-02-23 11:15:44.000000000 +0800
+++ linux/fs/nfs/internal.h 2010-02-23 13:26:00.000000000 +0800
@@ -10,14 +10,6 @@
struct nfs_string;
-/* Maximum number of readahead requests
- * FIXME: this should really be a sysctl so that users may tune it to suit
- * their needs. People that do NFS over a slow network, might for
- * instance want to reduce it to something closer to 1 for improved
- * interactive response.
- */
-#define NFS_MAX_READAHEAD (RPC_DEF_SLOT_TABLE - 1)
-
/*
* Determine if sessions are in use.
*/