From: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>
To: Andrea Righi <andrea-oIIqvOZpAevzfdHfmsDf5w@public.gmane.org>
Cc: Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [RFC] [PATCH] drop_pagecache syscall
Date: Wed, 27 Apr 2011 10:14:53 +1000
Message-ID: <20110427001453.GD12436@dastard>
In-Reply-To: <1303853727-21444-1-git-send-email-andrea-oIIqvOZpAevzfdHfmsDf5w@public.gmane.org>

On Tue, Apr 26, 2011 at 11:35:27PM +0200, Andrea Righi wrote:
> Introduce sys_drop_pagecache() system call to drop the page cache pages of
> a single filesystem.
> 
> This new system call takes a file descriptor as argument and drops only
> the page cache pages of the file system it references.
> 
> At the moment it is possible to drop page cache pages via
> /proc/sys/vm/drop_pagecache or via posix_fadvise(POSIX_FADV_DONTNEED).
> 
> The first method drops the whole page cache while the second can be used
> to drop page cache pages of a single file descriptor. But there's not a
> simple way to drop all the pages of a filesystem (we could scan all the
> file descriptors and use posix_fadvise(), but this solution doesn't scale
> very well in some cases).

Why not just add a new posix_fadvise() command? e.g.
POSIX_FADV_DONTNEED_FS. Simpler than adding a new syscall...
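
For comparison, this is roughly what the existing per-file interface
looks like from userspace. It is only a minimal sketch:
drop_file_cache() is a made-up helper name, and POSIX_FADV_DONTNEED_FS
is just the suggestion above, not a flag that exists in any kernel, so
it is not used here.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to drop the cached pages of a
 * single file. Returns 0 on success or a positive errno value.
 * Dirty pages are not dropped; they would need to be written back first.
 */
int drop_file_cache(const char *path)
{
	int fd = open(path, O_RDONLY);
	int err;

	if (fd < 0)
		return errno;

	/* offset 0 with len 0 covers the whole file */
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

	close(fd);
	return err;	/* posix_fadvise() returns the error number directly */
}
```

Calling drop_file_cache("/path/file") covers one file only; a
filesystem-wide variant would presumably reuse the same call with a
different advice value.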

> This functionality can be used by all the applications that want to have a
> better control over the page cache management (for example to immediately drop
> pages that for sure will not be reused in the near future, without calling
> posix_fadvise() for all the files they've touched), or to provide a more fine
> grained debugging feature usable by the filesystem benchmarks.
> 
> The system call does not require root privileges and it can be called by any
> unprivileged application. For example, we can write a userspace tool to run
> something like this:
> 
>   $ drop-pagecache /path/file_or_dir

That's a potential DoS vector, I think. Drop the pagecache in a tight
loop on the root fs of a busy server and watch it crawl...

> +/*
> + * Drop page cache of a single superblock
> + */
> +SYSCALL_DEFINE1(drop_pagecache, int, fd)
> +{
> +	struct file *file;
> +	struct super_block *sb;
> +	int fput_needed;
> +
> +	file = fget_light(fd, &fput_needed);
> +	if (!file)
> +		return -EBADF;
> +	sb = file->f_dentry->d_sb;
> +
> +	down_read(&sb->s_umount);
> +	drop_pagecache_sb(sb, NULL);
> +	up_read(&sb->s_umount);
> +
> +	fput_light(file, fput_needed);
> +	return 0;

You're holding an open reference to a file/dir on the fs so it can't
be unmounted from under you. Hence I don't think you need the
s_umount locking.

Cheers,

Dave.
-- 
Dave Chinner
david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org
