public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
@ 2005-12-01 13:17 Dirk Henning Gerdes
  2005-12-01 13:29 ` Arjan van de Ven
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Dirk Henning Gerdes @ 2005-12-01 13:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: LKML

Hi Jens!

For doing benchmarks on the I/O-Schedulers, I thought it would be very
useful to disable the pagecache.

I didn't want to make it too complicated, so I just mark pages as
not-uptodate, which forces them to be read again. Another reason was that
I wanted to keep the conditions as close to reality as possible.

Further, I thought it would be useful if you could turn the pagecache on
and off without rebooting the system.

I implemented a proc-fs entry "/proc/benchmark/pagecache" for this.

Perhaps this patch will be useful to anyone else who wants to do some
benchmarks on block-layer stuff.
And if not, I would appreciate it if you could have a look at it.

Signed-off-by: Dirk Gerdes <mail@dirk-gerdes.de>





* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-01 13:17 [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks Dirk Henning Gerdes
@ 2005-12-01 13:29 ` Arjan van de Ven
  2005-12-01 13:43   ` Dirk Henning Gerdes
  2005-12-01 14:36 ` Jens Axboe
  2005-12-02  1:25 ` Andrew Morton
  2 siblings, 1 reply; 17+ messages in thread
From: Arjan van de Ven @ 2005-12-01 13:29 UTC (permalink / raw)
  To: Dirk Henning Gerdes; +Cc: Jens Axboe, LKML

On Thu, 2005-12-01 at 14:17 +0100, Dirk Henning Gerdes wrote:
> Hi Jens!
> 
> For doing benchmarks on the I/O-Schedulers, I thought it would be very
> useful to disable the pagecache.


for benchmarks this is not enough though, you also need to clean the
inode and dentry caches, as well as any filesystem specific caches
(might be buffer cache)..... 
at which point it's probably nicer to just fake a limited umount since
that has to do all of that anyway



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-01 13:29 ` Arjan van de Ven
@ 2005-12-01 13:43   ` Dirk Henning Gerdes
  0 siblings, 0 replies; 17+ messages in thread
From: Dirk Henning Gerdes @ 2005-12-01 13:43 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Jens Axboe, LKML

I probably should have mentioned what my benchmark looks like:

I have written a little C program that opens several files for reading
and writing.
The dentry cache only plays a role the first time the files are opened.
I'm not quite sure about the inode cache.
I check whether the page has buffers and mark them as not uptodate, too,
so the buffer cache is disabled as well.
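
For illustration only, the read side of such a benchmark could be
sketched as below. This is not the program described above: the
function name timed_read, the 64 KiB buffer size and the use of
CLOCK_MONOTONIC are all assumptions made for the sketch.

```c
#define _XOPEN_SOURCE 600	/* for clock_gettime() */
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a whole file in 64 KiB chunks and return
 * the elapsed wall-clock time in seconds, or -1.0 on error.
 */
double timed_read(const char *path)
{
	static char buf[64 * 1024];
	struct timespec t0, t1;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1.0;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* discard the data; only the I/O time matters */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);

	if (n < 0)
		return -1.0;
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling it once with a warm cache and once after the cache has been
emptied gives the two numbers to compare.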

I'm using ext2/ext3. I don't think they use any additional caches.

But anyway: could you explain your fake-umount idea a little more?

On Thursday, 2005-12-01 at 14:29 +0100, Arjan van de Ven wrote:
> On Thu, 2005-12-01 at 14:17 +0100, Dirk Henning Gerdes wrote:
> > Hi Jens!
> > 
> > For doing benchmarks on the I/O-Schedulers, I thought it would be very
> > useful to disable the pagecache.
> 
> 
> for benchmarks this is not enough though, you also need to clean the
> inode and dentry caches, as well as any filesystem specific caches
> (might be buffer cache)..... 
> at which point it's probably nicer to just fake a limited umount since
> that has to do all of that anyway
> 
-- 
Dirk Henning Gerdes
Bönnersdyk 47
47803 Krefeld

Tel:  02151-755745
      0174-7776640
Mail: mail@dirk-gerdes.de



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-01 13:17 [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks Dirk Henning Gerdes
  2005-12-01 13:29 ` Arjan van de Ven
@ 2005-12-01 14:36 ` Jens Axboe
  2005-12-02  1:25 ` Andrew Morton
  2 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2005-12-01 14:36 UTC (permalink / raw)
  To: Dirk Henning Gerdes; +Cc: LKML

On Thu, Dec 01 2005, Dirk Henning Gerdes wrote:
> Hi Jens!
> 
> For doing benchmarks on the I/O-Schedulers, I thought it would be very
> useful to disable the pagecache.
> 
> I didn't want to make it too complicated, so I just mark pages as
> not-uptodate, which forces them to be read again. Another reason was that
> I wanted to keep the conditions as close to reality as possible.
> 
> Further, I thought it would be useful if you could turn the pagecache on
> and off without rebooting the system.
> 
> I implemented a proc-fs entry "/proc/benchmark/pagecache" for this.
> 
> Perhaps this patch will be useful to anyone else who wants to do some
> benchmarks on block-layer stuff.
> And if not, I would appreciate it if you could have a look at it.

This is rather odd, if you ask me; I don't like it. If you are doing
serious benchmarking, you do it on a separate disk / file system which
you can just umount/mount before starting over. Or you reboot the
machine in between.

-- 
Jens Axboe



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
       [not found] <5f08L-Um-413@gated-at.bofh.it>
@ 2005-12-01 22:48 ` Bodo Eggert
       [not found] ` <5f7UE-3FH-13@gated-at.bofh.it>
  1 sibling, 0 replies; 17+ messages in thread
From: Bodo Eggert @ 2005-12-01 22:48 UTC (permalink / raw)
  To: Dirk Henning Gerdes, Jens Axboe, LKML

Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:

> For doing benchmarks on the I/O-Schedulers, I thought it would be very
> useful to disable the pagecache.
> 
> I didn't want to make it too complicated, so I just mark pages as
> not-uptodate, which forces them to be read again. Another reason was that
> I wanted to keep the conditions as close to reality as possible.
> 
> Further, I thought it would be useful if you could turn the pagecache on
> and off without rebooting the system.
> 
> I implemented a proc-fs entry "/proc/benchmark/pagecache" for this.

1) This mail is the only documentation on how to operate your patch.
   How do you expect your users to find out how to operate the switch?
   (I assume it's really a switch; a toggle would be insane.)

   Since it's very short and only for a special purpose, documenting it
   in Kconfig might be enough.

2) You're separating your patches by file, not by function. Ungood.

3) Your patches introduce a lot of whitespace.
-- 
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.


* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-01 13:17 [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks Dirk Henning Gerdes
  2005-12-01 13:29 ` Arjan van de Ven
  2005-12-01 14:36 ` Jens Axboe
@ 2005-12-02  1:25 ` Andrew Morton
  2005-12-02  1:34   ` Jeff Garzik
                     ` (4 more replies)
  2 siblings, 5 replies; 17+ messages in thread
From: Andrew Morton @ 2005-12-02  1:25 UTC (permalink / raw)
  To: Dirk Henning Gerdes; +Cc: axboe, linux-kernel

Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:
>
>  For doing benchmarks on the I/O-Schedulers, I thought it would be very
>  useful to disable the pagecache.

That's an FAQ.   Something like this?


From: Andrew Morton <akpm@osdl.org>

Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the kernel
to discard as much pagecache and reclaimable slab objects as it can.

It won't drop dirty data, so the user should run `sync' first.

Caveats:

a) Holds inode_lock for exorbitant amounts of time.

b) Needs to be taught about NUMA nodes: propagate these all the way through
   so the discarding can be controlled on a per-node basis.

c) The pagecache shrinking and slab shrinking should probably have separate
   controls.


Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 fs/Makefile            |    2 -
 fs/drop-pagecache.c    |   62 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mm.h     |    5 +++
 include/linux/sysctl.h |    1 
 kernel/sysctl.c        |    9 +++++++
 mm/truncate.c          |    1 
 mm/vmscan.c            |    3 --
 7 files changed, 79 insertions(+), 4 deletions(-)

diff -puN /dev/null fs/drop-pagecache.c
--- /dev/null	2003-09-15 06:40:47.000000000 -0700
+++ devel-akpm/fs/drop-pagecache.c	2005-12-01 17:20:55.000000000 -0800
@@ -0,0 +1,62 @@
+/*
+ * Implement the manual drop-all-pagecache function
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/fs.h>
+#include <linux/writeback.h>
+#include <linux/sysctl.h>
+#include <linux/gfp.h>
+
+static void drop_pagecache_sb(struct super_block *sb)
+{
+	struct inode *inode;
+
+	spin_lock(&inode_lock);
+	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+		if (inode->i_state & (I_FREEING|I_WILL_FREE))
+			continue;
+		invalidate_inode_pages(inode->i_mapping);
+	}
+	spin_unlock(&inode_lock);
+}
+
+static void drop_pagecache(void)
+{
+	struct super_block *sb;
+
+	spin_lock(&sb_lock);
+restart:
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+		down_read(&sb->s_umount);
+		if (sb->s_root)
+			drop_pagecache_sb(sb);
+		up_read(&sb->s_umount);
+		spin_lock(&sb_lock);
+		if (__put_super_and_need_restart(sb))
+			goto restart;
+	}
+	spin_unlock(&sb_lock);
+	printk("shrunk pagecache\n");
+}
+
+static void drop_slab(void)
+{
+	int nr_objects;
+
+	do {
+		nr_objects = shrink_slab(1000, GFP_KERNEL, 1000);
+		printk("shrunk %d cache objects\n", nr_objects);
+	} while (nr_objects > 10);
+}
+
+int drop_pagecache_sysctl_handler(ctl_table *table, int write,
+	struct file *file, void __user *buffer, size_t *length, loff_t *ppos)
+{
+	drop_pagecache();
+	drop_slab();
+	return 0;
+}
diff -puN fs/Makefile~drop-pagecache fs/Makefile
--- devel/fs/Makefile~drop-pagecache	2005-12-01 16:41:22.000000000 -0800
+++ devel-akpm/fs/Makefile	2005-12-01 16:41:22.000000000 -0800
@@ -10,7 +10,7 @@ obj-y :=	open.o read_write.o file_table.
 		ioctl.o readdir.o select.o fifo.o locks.o dcache.o inode.o \
 		attr.o bad_inode.o file.o filesystems.o namespace.o aio.o \
 		seq_file.o xattr.o libfs.o fs-writeback.o mpage.o direct-io.o \
-		ioprio.o pnode.o
+		ioprio.o pnode.o drop-pagecache.o
 
 obj-$(CONFIG_INOTIFY)		+= inotify.o
 obj-$(CONFIG_EPOLL)		+= eventpoll.o
diff -puN include/linux/mm.h~drop-pagecache include/linux/mm.h
--- devel/include/linux/mm.h~drop-pagecache	2005-12-01 16:41:22.000000000 -0800
+++ devel-akpm/include/linux/mm.h	2005-12-01 17:01:57.000000000 -0800
@@ -1078,5 +1078,10 @@ int in_gate_area_no_task(unsigned long a
 /* /proc/<pid>/oom_adj set to -17 protects from the oom-killer */
 #define OOM_DISABLE -17
 
+int drop_pagecache_sysctl_handler(struct ctl_table *, int, struct file *,
+					void __user *, size_t *, loff_t *);
+int shrink_slab(unsigned long scanned, gfp_t gfp_mask,
+			unsigned long lru_pages);
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff -puN include/linux/sysctl.h~drop-pagecache include/linux/sysctl.h
--- devel/include/linux/sysctl.h~drop-pagecache	2005-12-01 16:41:22.000000000 -0800
+++ devel-akpm/include/linux/sysctl.h	2005-12-01 16:41:22.000000000 -0800
@@ -182,6 +182,7 @@ enum
 	VM_LEGACY_VA_LAYOUT=27, /* legacy/compatibility virtual address space layout */
 	VM_SWAP_TOKEN_TIMEOUT=28, /* default time for token time out */
 	VM_SWAP_PREFETCH=29,	/* int: amount to swap prefetch */
+	VM_DROP_PAGECACHE=30,	/* int: nuke lots of pagecache */
 };
 
 
diff -puN kernel/sysctl.c~drop-pagecache kernel/sysctl.c
--- devel/kernel/sysctl.c~drop-pagecache	2005-12-01 16:41:22.000000000 -0800
+++ devel-akpm/kernel/sysctl.c	2005-12-01 16:41:22.000000000 -0800
@@ -783,6 +783,15 @@ static ctl_table vm_table[] = {
 		.strategy	= &sysctl_intvec,
 	},
 	{
+		.ctl_name	= VM_DROP_PAGECACHE,
+		.procname	= "drop-pagecache",
+		.data		= NULL,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= drop_pagecache_sysctl_handler,
+		.strategy	= &sysctl_intvec,
+	},
+	{
 		.ctl_name	= VM_MIN_FREE_KBYTES,
 		.procname	= "min_free_kbytes",
 		.data		= &min_free_kbytes,
diff -puN mm/truncate.c~drop-pagecache mm/truncate.c
--- devel/mm/truncate.c~drop-pagecache	2005-12-01 16:49:06.000000000 -0800
+++ devel-akpm/mm/truncate.c	2005-12-01 16:49:13.000000000 -0800
@@ -256,7 +256,6 @@ unlock:
 				break;
 		}
 		pagevec_release(&pvec);
-		cond_resched();
 	}
 	return ret;
 }
diff -puN mm/vmscan.c~drop-pagecache mm/vmscan.c
--- devel/mm/vmscan.c~drop-pagecache	2005-12-01 16:58:30.000000000 -0800
+++ devel-akpm/mm/vmscan.c	2005-12-01 17:00:39.000000000 -0800
@@ -181,8 +181,7 @@ EXPORT_SYMBOL(remove_shrinker);
  *
  * Returns the number of slab objects which we shrunk.
  */
-static int shrink_slab(unsigned long scanned, gfp_t gfp_mask,
-			unsigned long lru_pages)
+int shrink_slab(unsigned long scanned, gfp_t gfp_mask, unsigned long lru_pages)
 {
 	struct shrinker *shrinker;
 	int ret = 0;
_
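
For illustration, a userspace caller of this trigger might look like
the sketch below. The function name trigger_drop is an assumption; the
path is passed in as a parameter so the function can also be tried
against an ordinary file. Since the handler above ignores the value
written, the "1\n" is arbitrary.

```c
#define _XOPEN_SOURCE 600	/* for sync() */
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush dirty data, then write to the trigger file.
 * Returns 0 on success, -1 on failure.
 */
int trigger_drop(const char *path)
{
	ssize_t n;
	int fd;

	sync();		/* dirty pages are not dropped, so write them back first */

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	n = write(fd, "1\n", 2);	/* any write invokes the handler */
	if (n != 2) {
		close(fd);
		return -1;
	}
	return close(fd) < 0 ? -1 : 0;
}
```

On a kernel carrying this patch, root would call
trigger_drop("/proc/sys/vm/drop-pagecache").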



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-02  1:25 ` Andrew Morton
@ 2005-12-02  1:34   ` Jeff Garzik
  2005-12-02 19:19     ` Badari Pulavarty
  2005-12-02 19:17   ` Badari Pulavarty
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Jeff Garzik @ 2005-12-02  1:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dirk Henning Gerdes, axboe, linux-kernel

On Thu, Dec 01, 2005 at 05:25:20PM -0800, Andrew Morton wrote:
> Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:
> >
> >  For doing benchmarks on the I/O-Schedulers, I thought it would be very
> >  useful to disable the pagecache.
> 
> That's an FAQ.   Something like this?
> 
> 
> From: Andrew Morton <akpm@osdl.org>
> 
> Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the kernel
> to discard as much pagecache and reclaimable slab objects as it can.
> 
> It won't drop dirty data, so the user should run `sync' first.
> 
> Caveats:
> 
> a) Holds inode_lock for exorbitant amounts of time.
> 
> b) Needs to be taught about NUMA nodes: propagate these all the way through
>    so the discarding can be controlled on a per-node basis.
> 
> c) The pagecache shrinking and slab shrinking should probably have separate
>    controls.
> 
> 
> Signed-off-by: Andrew Morton <akpm@osdl.org>

ACK, I've wanted something like this for a while.

I really think it should be a config option, though, to discourage
people from building with it :)

	Jeff





* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-02  1:25 ` Andrew Morton
  2005-12-02  1:34   ` Jeff Garzik
@ 2005-12-02 19:17   ` Badari Pulavarty
  2005-12-02 21:24   ` Badari Pulavarty
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Badari Pulavarty @ 2005-12-02 19:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dirk Henning Gerdes, axboe, lkml

On Thu, 2005-12-01 at 17:25 -0800, Andrew Morton wrote:
> Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:
> >
> >  For doing benchmarks on the I/O-Schedulers, I thought it would be very
> >  useful to disable the pagecache.
> 
> That's an FAQ.   Something like this?
> 
> 
> From: Andrew Morton <akpm@osdl.org>
> 
> Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the kernel
> to discard as much pagecache and reclaimable slab objects as it can.
> 
> It won't drop dirty data, so the user should run `sync' first.
> 
> Caveats:
> 
> a) Holds inode_lock for exorbitant amounts of time.
> 
> b) Needs to be taught about NUMA nodes: propagate these all the way through
>    so the discarding can be controlled on a per-node basis.
> 
> c) The pagecache shrinking and slab shrinking should probably have separate
>    controls.
> 
> 
> Signed-off-by: Andrew Morton <akpm@osdl.org>

Yep. This is what I wanted also :) This is similar in functionality to
the "cfree" module someone wrote a while ago.

Cool, this will get some of the database folks off my back for a
while :)


Thanks,
Badari



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-02  1:34   ` Jeff Garzik
@ 2005-12-02 19:19     ` Badari Pulavarty
  0 siblings, 0 replies; 17+ messages in thread
From: Badari Pulavarty @ 2005-12-02 19:19 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Andrew Morton, Dirk Henning Gerdes, axboe, lkml

On Thu, 2005-12-01 at 20:34 -0500, Jeff Garzik wrote:
> On Thu, Dec 01, 2005 at 05:25:20PM -0800, Andrew Morton wrote:
> > Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:
> > >
> > >  For doing benchmarks on the I/O-Schedulers, I thought it would be very
> > >  useful to disable the pagecache.
> > 
> > That's an FAQ.   Something like this?
> > 
> > 
> > From: Andrew Morton <akpm@osdl.org>
> > 
> > Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the kernel
> > to discard as much pagecache and reclaimable slab objects as it can.
> > 
> > It won't drop dirty data, so the user should run `sync' first.
> > 
> > Caveats:
> > 
> > a) Holds inode_lock for exorbitant amounts of time.
> > 
> > b) Needs to be taught about NUMA nodes: propagate these all the way through
> >    so the discarding can be controlled on a per-node basis.
> > 
> > c) The pagecache shrinking and slab shrinking should probably have separate
> >    controls.
> > 
> > 
> > Signed-off-by: Andrew Morton <akpm@osdl.org>
> 
> ACK, I've wanted something like this for a while.
> 
> I really think it should be a config option, though, to discourage
> people from building with it :)

Why? Since it's controlled through /proc, if someone echoes stuff into
it, they might get crappy performance (as with other /proc tunables).
Isn't that expected?

Thanks,
Badari



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-02  1:25 ` Andrew Morton
  2005-12-02  1:34   ` Jeff Garzik
  2005-12-02 19:17   ` Badari Pulavarty
@ 2005-12-02 21:24   ` Badari Pulavarty
  2005-12-02 21:44     ` Andrew Morton
  2005-12-05  2:13   ` Rob Landley
  2005-12-05 16:54   ` Badari Pulavarty
  4 siblings, 1 reply; 17+ messages in thread
From: Badari Pulavarty @ 2005-12-02 21:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dirk Henning Gerdes, axboe, lkml

On Thu, 2005-12-01 at 17:25 -0800, Andrew Morton wrote:
> Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:
> >
> >  For doing benchmarks on the I/O-Schedulers, I thought it would be very
> >  useful to disable the pagecache.
> 
> That's an FAQ.   Something like this?
> 
> 
> From: Andrew Morton <akpm@osdl.org>
> 
> Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the kernel
> to discard as much pagecache and reclaimable slab objects as it can.
> 
> It won't drop dirty data, so the user should run `sync' first.
> 
> Caveats:
> 
> a) Holds inode_lock for exorbitant amounts of time.
> 
> b) Needs to be taught about NUMA nodes: propagate these all the way through
>    so the discarding can be controlled on a per-node basis.
> 
> c) The pagecache shrinking and slab shrinking should probably have separate
>    controls.
> 
> 
> Signed-off-by: Andrew Morton <akpm@osdl.org>

Wondering, if this shrinks shared memory pages (since they are backed by
tmpfs) ? (which is not what I want).

Thanks,
Badari



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-02 21:24   ` Badari Pulavarty
@ 2005-12-02 21:44     ` Andrew Morton
  2005-12-02 22:33       ` Badari Pulavarty
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2005-12-02 21:44 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: mail, axboe, linux-kernel

Badari Pulavarty <pbadari@us.ibm.com> wrote:
>
> Wondering, if this shrinks shared memory pages (since they are backed by
> tmpfs) ? (which is not what I want).

It'll reclaim unused pagecache pages.  What effect that has on
idioticfs^Wtmpfs pages depends on the state of the pages.  If they're
attached to tmpfs inodes then they won't be reclaimed because they have no
backing store.  If they're attached to swapcache then they won't be
reclaimed because they have no superblock.

So I guess you got lucky.


* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-02 21:44     ` Andrew Morton
@ 2005-12-02 22:33       ` Badari Pulavarty
  0 siblings, 0 replies; 17+ messages in thread
From: Badari Pulavarty @ 2005-12-02 22:33 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mail, axboe, lkml

On Fri, 2005-12-02 at 13:44 -0800, Andrew Morton wrote:
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
> >
> > Wondering, if this shrinks shared memory pages (since they are backed by
> > tmpfs) ? (which is not what I want).
> 
> It'll reclaim unused pagecache pages.  What effect that has on
> idioticfs^Wtmpfs pages depends on the state of the pages.  If they're
> attached to tmpfs inodes then they won't be reclaimed because they have no
> backing store.  If they're attached to swapcache then they won't be
> reclaimed because they have no superblock.
> 
> So I guess you got lucky.

Wow!! Thank you. It's not that often I get lucky :)

Thanks,
Badari



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
       [not found] ` <5f7UE-3FH-13@gated-at.bofh.it>
@ 2005-12-03  2:05   ` Bodo Eggert
  0 siblings, 0 replies; 17+ messages in thread
From: Bodo Eggert @ 2005-12-03  2:05 UTC (permalink / raw)
  To: Andrew Morton, Dirk Henning Gerdes, axboe, linux-kernel

Andrew Morton <akpm@osdl.org> wrote:

> +             .procname       = "drop-pagecache",
> +             .data           = NULL,
> +             .maxlen         = sizeof(int),
> +             .mode           = 0644,

1) Shouldn't this be a trigger? Reading from a trigger doesn't make sense,
   especially when triggering on a read.  .mode = 0200?


2) If you pass an int, wouldn't a bitmask selecting what to free make sense?
   (of course -1 == everything)
   Maybe this would be overengineered.
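
The bitmask idea in 2) can be sketched in plain C as follows. The flag
names and values (DROP_PAGECACHE, DROP_SLAB, DROP_ALL) and the helper
drop_mask are hypothetical and appear nowhere in the posted patch; the
sketch only shows how a written integer could select what gets freed.

```c
/* Hypothetical flag bits; neither names nor values come from the patch. */
#define DROP_PAGECACHE	0x1	/* discard clean pagecache pages */
#define DROP_SLAB	0x2	/* shrink reclaimable slab objects */
#define DROP_ALL	(DROP_PAGECACHE | DROP_SLAB)

/*
 * Map the written value to the set of actions to perform.
 * -1 selects everything; unknown bits are ignored.
 */
unsigned int drop_mask(int value)
{
	if (value == -1)
		return DROP_ALL;
	return (unsigned int)value & DROP_ALL;
}
```

A handler built this way would call drop_pagecache() only when
DROP_PAGECACHE is set in the mask and drop_slab() only when DROP_SLAB
is set.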

-- 
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.


* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-02  1:25 ` Andrew Morton
                     ` (2 preceding siblings ...)
  2005-12-02 21:24   ` Badari Pulavarty
@ 2005-12-05  2:13   ` Rob Landley
  2005-12-05 16:20     ` Lee Revell
  2005-12-05 16:54   ` Badari Pulavarty
  4 siblings, 1 reply; 17+ messages in thread
From: Rob Landley @ 2005-12-05  2:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dirk Henning Gerdes, axboe, linux-kernel

On Thursday 01 December 2005 19:25, Andrew Morton wrote:
> Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:
> >  For doing benchmarks on the I/O-Schedulers, I thought it would be very
> >  useful to disable the pagecache.
>
> That's an FAQ.   Something like this?
>
>
> From: Andrew Morton <akpm@osdl.org>
>
> Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the
> kernel to discard as much pagecache and reclaimable slab objects as it can.
>
> It won't drop dirty data, so the user should run `sync' first.

This is deeply, deeply cool.

> Caveats:
>
> a) Holds inode_lock for exorbitant amounts of time.

Voluntary preemption point, maybe?

> b) Needs to be taught about NUMA nodes: propagate these all the way through
>    so the discarding can be controlled on a per-node basis.
>
> c) The pagecache shrinking and slab shrinking should probably have separate
>    controls.

It could care about _what_ you write to it, maybe?  (The first byte, 
anyway...)

>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
>  fs/Makefile            |    2 -
>  fs/drop-pagecache.c    |   62 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/mm.h     |    5 +++
>  include/linux/sysctl.h |    1
>  kernel/sysctl.c        |    9 +++++++
>  mm/truncate.c          |    1
>  mm/vmscan.c            |    3 --
>  7 files changed, 79 insertions(+), 4 deletions(-)
>
> diff -puN /dev/null fs/drop-pagecache.c
> --- /dev/null 2003-09-15 06:40:47.000000000 -0700
> +++ devel-akpm/fs/drop-pagecache.c 2005-12-01 17:20:55.000000000 -0800
> @@ -0,0 +1,62 @@
> +/*
> + * Implement the manual drop-all-pagecache function
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/fs.h>
> +#include <linux/writeback.h>
> +#include <linux/sysctl.h>
> +#include <linux/gfp.h>
> +
> +static void drop_pagecache_sb(struct super_block *sb)
> +{
> + struct inode *inode;
> +
> + spin_lock(&inode_lock);
> + list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
> +  if (inode->i_state & (I_FREEING|I_WILL_FREE))
> +   continue;
> +  invalidate_inode_pages(inode->i_mapping);
> + }
> + spin_unlock(&inode_lock);
> +}
> +
> +static void drop_pagecache(void)
> +{
> + struct super_block *sb;
> +
> + spin_lock(&sb_lock);
> +restart:
> + list_for_each_entry(sb, &super_blocks, s_list) {
> +  sb->s_count++;
> +  spin_unlock(&sb_lock);
> +  down_read(&sb->s_umount);
> +  if (sb->s_root)
> +   drop_pagecache_sb(sb);
> +  up_read(&sb->s_umount);
> +  spin_lock(&sb_lock);
> +  if (__put_super_and_need_restart(sb))
> +   goto restart;
> + }
> + spin_unlock(&sb_lock);
> + printk("shrunk pagecache\n");
> +}
> +
> +static void drop_slab(void)
> +{
> + int nr_objects;
> +
> + do {
> +  nr_objects = shrink_slab(1000, GFP_KERNEL, 1000);
> +  printk("shrunk %d cache objects\n", nr_objects);
> + } while (nr_objects > 10);
> +}
> +
> +int drop_pagecache_sysctl_handler(ctl_table *table, int write,
> + struct file *file, void __user *buffer, size_t *length, loff_t *ppos)
> +{
> + drop_pagecache();
> + drop_slab();
> + return 0;
> +}
> diff -puN fs/Makefile~drop-pagecache fs/Makefile
> --- devel/fs/Makefile~drop-pagecache 2005-12-01 16:41:22.000000000 -0800
> +++ devel-akpm/fs/Makefile 2005-12-01 16:41:22.000000000 -0800
> @@ -10,7 +10,7 @@ obj-y := open.o read_write.o file_table.
>    ioctl.o readdir.o select.o fifo.o locks.o dcache.o inode.o \
>    attr.o bad_inode.o file.o filesystems.o namespace.o aio.o \
>    seq_file.o xattr.o libfs.o fs-writeback.o mpage.o direct-io.o \
> -  ioprio.o pnode.o
> +  ioprio.o pnode.o drop-pagecache.o
>
>  obj-$(CONFIG_INOTIFY)  += inotify.o
>  obj-$(CONFIG_EPOLL)  += eventpoll.o
> diff -puN include/linux/mm.h~drop-pagecache include/linux/mm.h
> --- devel/include/linux/mm.h~drop-pagecache 2005-12-01 16:41:22.000000000 -0800
> +++ devel-akpm/include/linux/mm.h 2005-12-01 17:01:57.000000000 -0800
> @@ -1078,5 +1078,10 @@ int in_gate_area_no_task(unsigned long a
>  /* /proc/<pid>/oom_adj set to -17 protects from the oom-killer */
>  #define OOM_DISABLE -17
>
> +int drop_pagecache_sysctl_handler(struct ctl_table *, int, struct file *,
> +     void __user *, size_t *, loff_t *);
> +int shrink_slab(unsigned long scanned, gfp_t gfp_mask,
> +   unsigned long lru_pages);
> +
>  #endif /* __KERNEL__ */
>  #endif /* _LINUX_MM_H */
> diff -puN include/linux/sysctl.h~drop-pagecache include/linux/sysctl.h
> --- devel/include/linux/sysctl.h~drop-pagecache 2005-12-01 16:41:22.000000000 -0800
> +++ devel-akpm/include/linux/sysctl.h 2005-12-01 16:41:22.000000000 -0800
> @@ -182,6 +182,7 @@ enum
>   VM_LEGACY_VA_LAYOUT=27, /* legacy/compatibility virtual address space layout */
>   VM_SWAP_TOKEN_TIMEOUT=28, /* default time for token time out */
>   VM_SWAP_PREFETCH=29, /* int: amount to swap prefetch */
> + VM_DROP_PAGECACHE=30, /* int: nuke lots of pagecache */
>  };
>
>
> diff -puN kernel/sysctl.c~drop-pagecache kernel/sysctl.c
> --- devel/kernel/sysctl.c~drop-pagecache 2005-12-01 16:41:22.000000000 -0800
> +++ devel-akpm/kernel/sysctl.c 2005-12-01 16:41:22.000000000 -0800
> @@ -783,6 +783,15 @@ static ctl_table vm_table[] = {
>    .strategy = &sysctl_intvec,
>   },
>   {
> +  .ctl_name = VM_DROP_PAGECACHE,
> +  .procname = "drop-pagecache",
> +  .data  = NULL,
> +  .maxlen  = sizeof(int),
> +  .mode  = 0644,

So what _does_ it do when you read from it?

> +  .proc_handler = drop_pagecache_sysctl_handler,
> +  .strategy = &sysctl_intvec,
> + },
> + {
>    .ctl_name = VM_MIN_FREE_KBYTES,
>    .procname = "min_free_kbytes",
>    .data  = &min_free_kbytes,
> diff -puN mm/truncate.c~drop-pagecache mm/truncate.c
> --- devel/mm/truncate.c~drop-pagecache 2005-12-01 16:49:06.000000000 -0800
> +++ devel-akpm/mm/truncate.c 2005-12-01 16:49:13.000000000 -0800
> @@ -256,7 +256,6 @@ unlock:
>      break;
>    }
>    pagevec_release(&pvec);
> -  cond_resched();

Why drop that line?  (I don't follow...)

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.


* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-05  2:13   ` Rob Landley
@ 2005-12-05 16:20     ` Lee Revell
  2005-12-05 17:28       ` Rob Landley
  0 siblings, 1 reply; 17+ messages in thread
From: Lee Revell @ 2005-12-05 16:20 UTC (permalink / raw)
  To: Rob Landley; +Cc: Andrew Morton, Dirk Henning Gerdes, axboe, linux-kernel

On Sun, 2005-12-04 at 20:13 -0600, Rob Landley wrote:
> > Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the
> > kernel to discard as much pagecache and reclaimable slab objects as it can.
> >
> > It won't drop dirty data, so the user should run `sync' first.
> 
> This is deeply, deeply cool.
> 
> > Caveats:
> >
> > a) Holds inode_lock for exorbitant amounts of time.
> 
> Voluntary preemption point, maybe?

I think it's a bad idea; that would just encourage people to use this for
anything other than debugging.  If you care about latency, don't discard
the page cache.

The GNOME people have been asking for this for a while: to improve
startup times, they would like a way to simulate a cold start
without rebooting.

Lee



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-02  1:25 ` Andrew Morton
                     ` (3 preceding siblings ...)
  2005-12-05  2:13   ` Rob Landley
@ 2005-12-05 16:54   ` Badari Pulavarty
  4 siblings, 0 replies; 17+ messages in thread
From: Badari Pulavarty @ 2005-12-05 16:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dirk Henning Gerdes, axboe, lkml

On Thu, 2005-12-01 at 17:25 -0800, Andrew Morton wrote:
> Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:
> >
> >  For doing benchmarks on the I/O-Schedulers, I thought it would be very
> >  useful to disable the pagecache.
> 
> That's an FAQ.   Something like this?
> 
> 
> From: Andrew Morton <akpm@osdl.org>
> 
> Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the kernel
> to discard as much pagecache and reclaimable slab objects as it can.
> 
> It won't drop dirty data, so the user should run `sync' first.

BTW, a while ago I tried doing a similar thing from user space
using POSIX_FADV_DONTNEED on a file. While it worked great to
get rid of the pagecache pages for a few files, I had to
run this on each and every file in the filesystem, so it ended
up bloating the inode and dentry slabs :( I really wanted to find out
which files are actually cached in the pagecache to run this on.

Thanks,
Badari
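
For reference, the per-file approach described above can be sketched
with posix_fadvise() as follows. The helper name drop_file_cache is an
assumption, and this is not the code that was actually used; it only
shows the call sequence for a single file.

```c
#define _XOPEN_SOURCE 600	/* for posix_fadvise() */
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to drop the cached pages of one
 * file. Returns 0 on success, -1 on failure.
 */
int drop_file_cache(const char *path)
{
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* Dirty pages are not discarded, so write them back first. */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}

	/* offset 0 with len 0 means "from the start to the end of the file" */
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	close(fd);
	return err ? -1 : 0;
}
```

Every file has to be opened for this, which is what pulls its inode
and dentry into the slab caches when it is run over a whole filesystem.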



* Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
  2005-12-05 16:20     ` Lee Revell
@ 2005-12-05 17:28       ` Rob Landley
  0 siblings, 0 replies; 17+ messages in thread
From: Rob Landley @ 2005-12-05 17:28 UTC (permalink / raw)
  To: Lee Revell; +Cc: Andrew Morton, Dirk Henning Gerdes, axboe, linux-kernel

On Monday 05 December 2005 10:20, Lee Revell wrote:

> > > Caveats:
> > >
> > > a) Holds inode_lock for exorbitant amounts of time.
> >
> > Voluntary preemption point, maybe?
>
> I think it's a bad idea; that would just encourage people to use this for
> anything other than debugging.  If you care about latency, don't discard
> the page cache.
>
> The GNOME people have been asking for this for a while: to improve
> startup times, they would like a way to simulate a cold start
> without rebooting.

I was thinking that virtual environments (namely, User Mode Linux) could use 
this in conjunction with sys_punch to free up memory back to the host system.

> Lee

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.
