From: Nick Piggin <npiggin@kernel.dk>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Nick Bowler <nbowler@elliptictech.com>,
	Simon Kirby <sim@hostway.ca>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-nfs@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Nick Piggin <npiggin@kernel.dk>,
	Rik van Riel <riel@redhat.com>, Christoph Hellwig <hch@lst.de>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH v4 2/3] Call the filesystem back whenever a page is removed from the page cache
Date: Tue, 7 Dec 2010 18:08:27 +1100
Message-ID: <20101207070827.GC3307@amd>
In-Reply-To: <1291654748-8921-3-git-send-email-Trond.Myklebust@netapp.com>

On Mon, Dec 06, 2010 at 11:59:07AM -0500, Trond Myklebust wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> 
> NFS needs to be able to release objects that are stored in the page
> cache once the page itself is no longer visible from the page cache.
> 
> This patch adds a callback to the address space operations that allows
> filesystems to perform page cleanups once the page has been removed
> from the page cache.
> 
> Original patch by: Linus Torvalds <torvalds@linux-foundation.org>
> [trondmy: cover the cases of invalidate_inode_pages2() and
>           truncate_inode_pages()]
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
>  Documentation/filesystems/Locking |    7 ++++++-
>  Documentation/filesystems/vfs.txt |    7 +++++++
>  include/linux/fs.h                |    1 +
>  mm/filemap.c                      |    5 +++++
>  mm/truncate.c                     |    4 ++++
>  mm/vmscan.c                       |    7 +++++++
>  6 files changed, 30 insertions(+), 1 deletions(-)
> 
> diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
> index a91f308..b6426f1 100644
> --- a/Documentation/filesystems/Locking
> +++ b/Documentation/filesystems/Locking
> @@ -173,12 +173,13 @@ prototypes:
>  	sector_t (*bmap)(struct address_space *, sector_t);
>  	int (*invalidatepage) (struct page *, unsigned long);
>  	int (*releasepage) (struct page *, int);
> +	void (*freepage)(struct page *);
>  	int (*direct_IO)(int, struct kiocb *, const struct iovec *iov,
>  			loff_t offset, unsigned long nr_segs);
>  	int (*launder_page) (struct page *);
>  
>  locking rules:
> -	All except set_page_dirty may block
> +	All except set_page_dirty and freepage may block
>  
>  			BKL	PageLocked(page)	i_mutex
>  writepage:		no	yes, unlocks (see below)
> @@ -193,6 +194,7 @@ perform_write:		no	n/a			yes
>  bmap:			no
>  invalidatepage:		no	yes
>  releasepage:		no	yes
> +freepage:		no	yes
>  direct_IO:		no
>  launder_page:		no	yes
>  
> @@ -288,6 +290,9 @@ buffers from the page in preparation for freeing it.  It returns zero to
>  indicate that the buffers are (or may be) freeable.  If ->releasepage is zero,
>  the kernel assumes that the fs has no private interest in the buffers.
>  
> +	->freepage() is called when the kernel is done dropping the page
> +from the page cache.
> +
>  	->launder_page() may be called prior to releasing a page if
>  it is still found to be dirty. It returns zero if the page was successfully
>  cleaned, or an error value if not. Note that in order to prevent the page
> diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> index ed7e5ef..3b14a55 100644
> --- a/Documentation/filesystems/vfs.txt
> +++ b/Documentation/filesystems/vfs.txt
> @@ -534,6 +534,7 @@ struct address_space_operations {
>  	sector_t (*bmap)(struct address_space *, sector_t);
>  	int (*invalidatepage) (struct page *, unsigned long);
>  	int (*releasepage) (struct page *, int);
> +	void (*freepage)(struct page *);
>  	ssize_t (*direct_IO)(int, struct kiocb *, const struct iovec *iov,
>  			loff_t offset, unsigned long nr_segs);
>  	struct page* (*get_xip_page)(struct address_space *, sector_t,
> @@ -679,6 +680,12 @@ struct address_space_operations {
>          need to ensure this.  Possibly it can clear the PageUptodate
>          bit if it cannot free private data yet.
>  
> +  freepage: freepage is called once the page is no longer visible in
> +        the page cache in order to allow the cleanup of any private
> +	data. Since it may be called by the memory reclaimer, it
> +	should not assume that the original address_space mapping still
> +	exists, and it should not block.

Of course we still have bugs in this regard, because we lack inode RCU
and filesystem deregistration RCU. Once those are implemented for RCU
path-walk, this section should be updated, and we'll have to look at
using RCU to protect the final mapping manipulations after a page is
removed from pagecache.

I'll help work on that after RCU inodes / filesystems are merged.
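
To illustrate the constraints in the vfs.txt text quoted above, here is
a minimal userspace sketch of a callback that cleans up per-page private
data without looking at page->mapping and without blocking. The struct
layout, the private_data field, and the example_* names are hypothetical
stand-ins for illustration, not kernel definitions, and free() stands in
for whatever non-sleeping release a real filesystem would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct address_space;

/* Hypothetical userspace stand-in for struct page. */
struct page {
	struct address_space *mapping;	/* may be stale by the time freepage runs */
	void *private_data;		/* hypothetical per-page filesystem data */
};

/* Counts private objects that are currently attached to pages. */
static int example_live_objects;

/* Attach private data to a page, as a filesystem might when filling it. */
static void *example_attach(struct page *page, size_t size)
{
	page->private_data = malloc(size);
	if (page->private_data)
		example_live_objects++;
	return page->private_data;
}

/*
 * Sketch of a ->freepage() style callback: it touches only the page
 * itself, never dereferences page->mapping, and makes no call that
 * could sleep.
 */
static void example_freepage(struct page *page)
{
	if (!page->private_data)
		return;
	free(page->private_data);
	page->private_data = NULL;
	example_live_objects--;
}
```

Calling example_freepage() on a page whose mapping pointer has already
been cleared works the same way, which is the property the documentation
asks for.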

> +
>    direct_IO: called by the generic read/write routines to perform
>          direct_IO - that is IO requests which bypass the page cache
>          and transfer data directly between the storage and the
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c9e06cc..090f0ea 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -602,6 +602,7 @@ struct address_space_operations {
>  	sector_t (*bmap)(struct address_space *, sector_t);
>  	void (*invalidatepage) (struct page *, unsigned long);
>  	int (*releasepage) (struct page *, gfp_t);
> +	void (*freepage)(struct page *);
>  	ssize_t (*direct_IO)(int, struct kiocb *, const struct iovec *iov,
>  			loff_t offset, unsigned long nr_segs);
>  	int (*get_xip_mem)(struct address_space *, pgoff_t, int,
> diff --git a/mm/filemap.c b/mm/filemap.c
> index ea89840..6b9aee2 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -143,13 +143,18 @@ void __remove_from_page_cache(struct page *page)
>  void remove_from_page_cache(struct page *page)
>  {
>  	struct address_space *mapping = page->mapping;
> +	void (*freepage)(struct page *);
>  
>  	BUG_ON(!PageLocked(page));
>  
> +	freepage = mapping->a_ops->freepage;
>  	spin_lock_irq(&mapping->tree_lock);
>  	__remove_from_page_cache(page);
>  	spin_unlock_irq(&mapping->tree_lock);
>  	mem_cgroup_uncharge_cache_page(page);
> +
> +	if (freepage)
> +		freepage(page);
>  }
>  EXPORT_SYMBOL(remove_from_page_cache);
>  
> diff --git a/mm/truncate.c b/mm/truncate.c
> index ba887bf..3c2d5dd 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -390,6 +390,10 @@ invalidate_complete_page2(struct address_space *mapping, struct page *page)
>  	__remove_from_page_cache(page);
>  	spin_unlock_irq(&mapping->tree_lock);
>  	mem_cgroup_uncharge_cache_page(page);
> +
> +	if (mapping->a_ops->freepage)
> +		mapping->a_ops->freepage(page);
> +
>  	page_cache_release(page);	/* pagecache ref */
>  	return 1;
>  failed:

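The generic parts of the code look OK to me, but why does the order in
which you load the freepage function pointer differ between the two
sites? remove_from_page_cache() reads mapping->a_ops->freepage before
taking tree_lock and removing the page, whereas
invalidate_complete_page2() reads it only after the page has been
removed.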
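
To make the question concrete, here is a userspace sketch of the two
orderings side by side. The types are cut-down mocks rather than the
kernel definitions, and mock_remove(), the freed counter, and
count_freepage() are hypothetical helpers. The assert() in the second
function is only a stand-in for whatever check the real code does.

```c
#include <assert.h>
#include <stddef.h>

struct page;

/* Cut-down mocks of the kernel structures; only the fields used here. */
struct address_space_operations {
	void (*freepage)(struct page *);
};

struct address_space {
	const struct address_space_operations *a_ops;
};

struct page {
	struct address_space *mapping;
	int freed;	/* hypothetical: number of freepage calls seen */
};

/* Stand-in for __remove_from_page_cache(): detach the page. */
static void mock_remove(struct page *page)
{
	page->mapping = NULL;
}

/* Ordering as in remove_from_page_cache(): load the pointer first. */
static void remove_load_early(struct page *page)
{
	struct address_space *mapping = page->mapping;
	void (*freepage)(struct page *);

	freepage = mapping->a_ops->freepage;
	mock_remove(page);
	if (freepage)
		freepage(page);
}

/* Ordering as in invalidate_complete_page2(): dereference afterwards. */
static void remove_load_late(struct address_space *mapping, struct page *page)
{
	assert(page->mapping == mapping);
	mock_remove(page);
	if (mapping->a_ops->freepage)
		mapping->a_ops->freepage(page);
}

/* Hypothetical callback that just records that it ran. */
static void count_freepage(struct page *page)
{
	page->freed++;
}
```

In this mock both variants call the callback exactly once. The only
difference is whether mapping->a_ops is dereferenced before or after the
page has been detached, which is what I am asking about.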

