public inbox for linux-kernel@vger.kernel.org
From: Rob Landley <rob@landley.net>
To: Andrew Morton <akpm@osdl.org>
Cc: Dirk Henning Gerdes <mail@dirk-gerdes.de>,
	axboe@suse.de, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] linux-2.6-block: deactivating pagecache for benchmarks
Date: Sun, 4 Dec 2005 20:13:12 -0600
Message-ID: <200512042013.13214.rob@landley.net>
In-Reply-To: <20051201172520.7095e524.akpm@osdl.org>

On Thursday 01 December 2005 19:25, Andrew Morton wrote:
> Dirk Henning Gerdes <mail@dirk-gerdes.de> wrote:
> >  For doing benchmarks on the I/O-Schedulers, I thought it would be very
> >  useful to disable the pagecache.
>
> That's an FAQ.   Something like this?
>
>
> From: Andrew Morton <akpm@osdl.org>
>
> Add /proc/sys/vm/drop-pagecache.  When written to, this will cause the
> kernel to discard as much pagecache and reclaimable slab objects as it can.
>
> It won't drop dirty data, so the user should run `sync' first.

This is deeply, deeply cool.

> Caveats:
>
> a) Holds inode_lock for exorbitant amounts of time.

Voluntary preemption point, maybe?

> b) Needs to be taught about NUMA nodes: propagate these all the way through
>    so the discarding can be controlled on a per-node basis.
>
> c) The pagecache shrinking and slab shrinking should probably have separate
>    controls.

It could care about _what_ you write to it, maybe?  (The first byte, 
anyway...)
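
Something like the following, as a rough userspace sketch of the idea.
The function name, the DROP_* flags and the '1'/'2'/'3' encoding are all
made up for illustration; none of them are in the patch.

```c
#include <stddef.h>

/* Hypothetical flag names; the quoted patch defines no such constants. */
enum {
	DROP_NOTHING   = 0,
	DROP_PAGECACHE = 1 << 0,
	DROP_SLAB      = 1 << 1,
};

/*
 * Map the first byte written to the sysctl onto a set of DROP_* flags.
 * '1' selects pagecache, '2' selects slab, '3' selects both.  An empty
 * write, a NULL buffer, or any other first byte selects nothing.
 * Everything after the first byte (e.g. echo's trailing newline) is
 * ignored.
 */
static int drop_mode_from_write(const char *buf, size_t len)
{
	if (buf == NULL || len == 0)
		return DROP_NOTHING;

	switch (buf[0]) {
	case '1':
		return DROP_PAGECACHE;
	case '2':
		return DROP_SLAB;
	case '3':
		return DROP_PAGECACHE | DROP_SLAB;
	default:
		return DROP_NOTHING;
	}
}
```

In the real handler the first byte would presumably have to be fetched
from the __user buffer first, and the result would decide whether
drop_pagecache(), drop_slab(), or both get called.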

>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
>  fs/Makefile            |    2 -
>  fs/drop-pagecache.c    |   62 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/mm.h     |    5 +++
>  include/linux/sysctl.h |    1
>  kernel/sysctl.c        |    9 +++++++
>  mm/truncate.c          |    1
>  mm/vmscan.c            |    3 --
>  7 files changed, 79 insertions(+), 4 deletions(-)
>
> diff -puN /dev/null fs/drop-pagecache.c
> --- /dev/null 2003-09-15 06:40:47.000000000 -0700
> +++ devel-akpm/fs/drop-pagecache.c 2005-12-01 17:20:55.000000000 -0800
> @@ -0,0 +1,62 @@
> +/*
> + * Implement the manual drop-all-pagecache function
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/fs.h>
> +#include <linux/writeback.h>
> +#include <linux/sysctl.h>
> +#include <linux/gfp.h>
> +
> +static void drop_pagecache_sb(struct super_block *sb)
> +{
> + struct inode *inode;
> +
> + spin_lock(&inode_lock);
> + list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
> +  if (inode->i_state & (I_FREEING|I_WILL_FREE))
> +   continue;
> +  invalidate_inode_pages(inode->i_mapping);
> + }
> + spin_unlock(&inode_lock);
> +}
> +
> +static void drop_pagecache(void)
> +{
> + struct super_block *sb;
> +
> + spin_lock(&sb_lock);
> +restart:
> + list_for_each_entry(sb, &super_blocks, s_list) {
> +  sb->s_count++;
> +  spin_unlock(&sb_lock);
> +  down_read(&sb->s_umount);
> +  if (sb->s_root)
> +   drop_pagecache_sb(sb);
> +  up_read(&sb->s_umount);
> +  spin_lock(&sb_lock);
> +  if (__put_super_and_need_restart(sb))
> +   goto restart;
> + }
> + spin_unlock(&sb_lock);
> + printk("shrunk pagecache\n");
> +}
> +
> +static void drop_slab(void)
> +{
> + int nr_objects;
> +
> + do {
> +  nr_objects = shrink_slab(1000, GFP_KERNEL, 1000);
> +  printk("shrunk %d cache objects\n", nr_objects);
> + } while (nr_objects > 10);
> +}
> +
> +int drop_pagecache_sysctl_handler(ctl_table *table, int write,
> + struct file *file, void __user *buffer, size_t *length, loff_t *ppos)
> +{
> + drop_pagecache();
> + drop_slab();
> + return 0;
> +}
> diff -puN fs/Makefile~drop-pagecache fs/Makefile
> --- devel/fs/Makefile~drop-pagecache 2005-12-01 16:41:22.000000000 -0800
> +++ devel-akpm/fs/Makefile 2005-12-01 16:41:22.000000000 -0800
> @@ -10,7 +10,7 @@ obj-y := open.o read_write.o file_table.
>    ioctl.o readdir.o select.o fifo.o locks.o dcache.o inode.o \
>    attr.o bad_inode.o file.o filesystems.o namespace.o aio.o \
>    seq_file.o xattr.o libfs.o fs-writeback.o mpage.o direct-io.o \
> -  ioprio.o pnode.o
> +  ioprio.o pnode.o drop-pagecache.o
>
>  obj-$(CONFIG_INOTIFY)  += inotify.o
>  obj-$(CONFIG_EPOLL)  += eventpoll.o
> diff -puN include/linux/mm.h~drop-pagecache include/linux/mm.h
> --- devel/include/linux/mm.h~drop-pagecache 2005-12-01 16:41:22.000000000 -0800
> +++ devel-akpm/include/linux/mm.h 2005-12-01 17:01:57.000000000 -0800
> @@ -1078,5 +1078,10 @@ int in_gate_area_no_task(unsigned long a
>  /* /proc/<pid>/oom_adj set to -17 protects from the oom-killer */
>  #define OOM_DISABLE -17
>
> +int drop_pagecache_sysctl_handler(struct ctl_table *, int, struct file *,
> +     void __user *, size_t *, loff_t *);
> +int shrink_slab(unsigned long scanned, gfp_t gfp_mask,
> +   unsigned long lru_pages);
> +
>  #endif /* __KERNEL__ */
>  #endif /* _LINUX_MM_H */
> diff -puN include/linux/sysctl.h~drop-pagecache include/linux/sysctl.h
> --- devel/include/linux/sysctl.h~drop-pagecache 2005-12-01 16:41:22.000000000 -0800
> +++ devel-akpm/include/linux/sysctl.h 2005-12-01 16:41:22.000000000 -0800
> @@ -182,6 +182,7 @@ enum
>   VM_LEGACY_VA_LAYOUT=27, /* legacy/compatibility virtual address space layout */
>   VM_SWAP_TOKEN_TIMEOUT=28, /* default time for token time out */
>   VM_SWAP_PREFETCH=29, /* int: amount to swap prefetch */
> + VM_DROP_PAGECACHE=30, /* int: nuke lots of pagecache */
>  };
>
>
> diff -puN kernel/sysctl.c~drop-pagecache kernel/sysctl.c
> --- devel/kernel/sysctl.c~drop-pagecache 2005-12-01 16:41:22.000000000 -0800
> +++ devel-akpm/kernel/sysctl.c 2005-12-01 16:41:22.000000000 -0800
> @@ -783,6 +783,15 @@ static ctl_table vm_table[] = {
>    .strategy = &sysctl_intvec,
>   },
>   {
> +  .ctl_name = VM_DROP_PAGECACHE,
> +  .procname = "drop-pagecache",
> +  .data  = NULL,
> +  .maxlen  = sizeof(int),
> +  .mode  = 0644,

So what _does_ it do when you read from it?

> +  .proc_handler = drop_pagecache_sysctl_handler,
> +  .strategy = &sysctl_intvec,
> + },
> + {
>    .ctl_name = VM_MIN_FREE_KBYTES,
>    .procname = "min_free_kbytes",
>    .data  = &min_free_kbytes,
> diff -puN mm/truncate.c~drop-pagecache mm/truncate.c
> --- devel/mm/truncate.c~drop-pagecache 2005-12-01 16:49:06.000000000 -0800
> +++ devel-akpm/mm/truncate.c 2005-12-01 16:49:13.000000000 -0800
> @@ -256,7 +256,6 @@ unlock:
>      break;
>    }
>    pagevec_release(&pvec);
> -  cond_resched();

Why drop that line?  (I don't follow...)

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.
