From: Wu Fengguang <fengguang.wu@intel.com>
To: Nikanth Karthikesan <knikanth@suse.de>
Cc: Dave Chinner <david@fromorbit.com>,
	Ankit Jain <radical@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Subject: Re: [PATCH v2] Make VM_MAX_READAHEAD a kernel parameter
Date: Sun, 21 Feb 2010 22:26:00 +0800
Message-ID: <20100221142600.GA10036@localhost>
In-Reply-To: <201002151006.37294.knikanth@suse.de>

Nikanth,

> > > +	readahead=	Default readahead value for block devices.
> > > +
> > 
> > I think the description should define the units (kb) and valid value
> > ranges e.g. page size to something not excessive - say 65536kb.  The
> > above description is, IMO, useless without refering to the source to
> > find out this information....
> > 
> 
> The parameter can be specified with/without any suffix(k/m/g) that memparse() 
> helper function can accept. So it can take 1M, 1024k, 1050620. I checked other 
> parameters that use memparse() to get similar values and they didn't document 
> it. May be this should be described here.

I hope this makes things clearer for users:

+       readahead=nn[KM]
+                       Default max readahead size for block devices.
+                       Range: 0; 4k - 128m

> > And readahead_kb needs to be validated against the range of
> > valid values here.
> > 
> 
> I didn't want to impose artificial restrictions. I think Wu's patch set would 
> be adding some restrictions, like minimum readahead. He could fix it when he 
> modifies the patch to include in his patch set.

OK, I imposed a larger upper bound: 128MB.
Values from 1 up to PAGE_CACHE_SIZE - 1 (1-4095 with 4KB pages) are rejected,
mainly to catch "readahead=128" where the user really means 128 _KB_ of
readahead.
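
For anyone who wants to try the accepted range outside the kernel, here is
a rough userspace sketch of the same checks. It is only an illustration,
not part of the patch: parse_size() is a hypothetical stand-in for
memparse() that handles just the k/m/g suffixes, readahead_pages() and
RA_MAX_BYTES are names made up for this example, and the page size is
passed in explicitly instead of using PAGE_CACHE_SIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define RA_MAX_BYTES (128UL << 20)	/* same 128MB cap as the patch */

/*
 * Hypothetical userspace stand-in for the kernel's memparse(): parse a
 * number with an optional k/m/g suffix (either case) and leave *end
 * pointing just past what was consumed.
 */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'g': case 'G':
		v <<= 10;	/* fall through */
	case 'm': case 'M':
		v <<= 10;	/* fall through */
	case 'k': case 'K':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Mirror the checks in the patch's readahead() handler.
 * Returns 0 and stores the page count in *pages, or -EINVAL on bad input.
 */
static int readahead_pages(const char *arg, unsigned long page_size,
			   unsigned long *pages)
{
	unsigned long long bytes;
	char *end;

	assert(page_size > 0);
	if (!arg)
		return -EINVAL;
	bytes = parse_size(arg, &end);
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */

	if (bytes) {
		if (bytes < page_size)	/* e.g. "128": suffix forgotten? */
			return -EINVAL;
		if (bytes > RA_MAX_BYTES)
			bytes = RA_MAX_BYTES;	/* clamp rather than reject */
	}
	*pages = (unsigned long)(bytes / page_size);
	return 0;
}
```

With a 4KB page size, "128k" gives 32 pages, "128" is rejected with
-EINVAL, "0" is accepted and gives 0 pages, and "1g" is clamped to 128MB,
i.e. 32768 pages.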

Christian, with this patch and further patches that scale down the
readahead size for small memory and small devices, I guess it's no longer
necessary to introduce a CONFIG_READAHEAD_SIZE?

Thanks,
Fengguang
---
make default readahead size a kernel parameter

From: Nikanth Karthikesan <knikanth@suse.de>

Add a new kernel parameter "readahead", which is used instead of the
value of VM_MAX_READAHEAD. If the parameter is not specified, the default
of 128KB is used.

CC: Ankit Jain <radical@gmail.com>
CC: Dave Chinner <david@fromorbit.com>
CC: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 Documentation/kernel-parameters.txt |    4 ++++
 block/blk-core.c                    |    3 +--
 fs/fuse/inode.c                     |    2 +-
 include/linux/mm.h                  |    2 ++
 mm/readahead.c                      |   26 ++++++++++++++++++++++++++
 5 files changed, 34 insertions(+), 3 deletions(-)

--- linux.orig/Documentation/kernel-parameters.txt	2010-02-21 22:09:41.000000000 +0800
+++ linux/Documentation/kernel-parameters.txt	2010-02-21 22:11:08.000000000 +0800
@@ -2174,6 +2174,10 @@ and is between 256 and 4096 characters. 
 			Run specified binary instead of /init from the ramdisk,
 			used for early userspace startup. See initrd.
 
+	readahead=nn[KM]
+			Default max readahead size for block devices.
+			Range: 0; 4k - 128m
+
 	reboot=		[BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode
 			Format: <reboot_mode>[,<reboot_mode2>[,...]]
 			See arch/*/kernel/reboot.c or arch/*/kernel/process.c
--- linux.orig/block/blk-core.c	2010-02-21 22:09:41.000000000 +0800
+++ linux/block/blk-core.c	2010-02-21 22:09:42.000000000 +0800
@@ -498,8 +498,7 @@ struct request_queue *blk_alloc_queue_no
 
 	q->backing_dev_info.unplug_io_fn = blk_backing_dev_unplug;
 	q->backing_dev_info.unplug_io_data = q;
-	q->backing_dev_info.ra_pages =
-			(VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
+	q->backing_dev_info.ra_pages = max_readahead_pages;
 	q->backing_dev_info.state = 0;
 	q->backing_dev_info.capabilities = BDI_CAP_MAP_COPY;
 	q->backing_dev_info.name = "block";
--- linux.orig/fs/fuse/inode.c	2010-02-21 22:09:41.000000000 +0800
+++ linux/fs/fuse/inode.c	2010-02-21 22:09:42.000000000 +0800
@@ -870,7 +870,7 @@ static int fuse_bdi_init(struct fuse_con
 	int err;
 
 	fc->bdi.name = "fuse";
-	fc->bdi.ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
+	fc->bdi.ra_pages = max_readahead_pages;
 	fc->bdi.unplug_io_fn = default_unplug_io_fn;
 	/* fuse does it's own writeback accounting */
 	fc->bdi.capabilities = BDI_CAP_NO_ACCT_WB;
--- linux.orig/include/linux/mm.h	2010-02-21 22:09:41.000000000 +0800
+++ linux/include/linux/mm.h	2010-02-21 22:09:42.000000000 +0800
@@ -1187,6 +1187,8 @@ void task_dirty_inc(struct task_struct *
 #define VM_MAX_READAHEAD	128	/* kbytes */
 #define VM_MIN_READAHEAD	16	/* kbytes (includes current page) */
 
+extern unsigned long max_readahead_pages;
+
 int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
 			pgoff_t offset, unsigned long nr_to_read);
 
--- linux.orig/mm/readahead.c	2010-02-21 22:09:41.000000000 +0800
+++ linux/mm/readahead.c	2010-02-21 22:13:44.000000000 +0800
@@ -19,6 +19,32 @@
 #include <linux/pagevec.h>
 #include <linux/pagemap.h>
 
+unsigned long max_readahead_pages = VM_MAX_READAHEAD * 1024 / PAGE_CACHE_SIZE;
+
+static int __init readahead(char *str)
+{
+	unsigned long bytes;
+
+	if (!str)
+		return -EINVAL;
+	bytes = memparse(str, &str);
+	if (*str != '\0')
+		return -EINVAL;
+
+	if (bytes) {
+		if (bytes < PAGE_CACHE_SIZE)	/* missed 'k'/'m' suffixes? */
+			return -EINVAL;
+		if (bytes > 128 << 20)		/* limit to 128MB */
+			bytes = 128 << 20;
+	}
+
+	max_readahead_pages = bytes / PAGE_CACHE_SIZE;
+	default_backing_dev_info.ra_pages = max_readahead_pages;
+	return 0;
+}
+
+early_param("readahead", readahead);
+
 /*
  * Initialise a struct file's readahead state.  Assumes that the caller has
  * memset *ra to zero.
