From: Andrew Morton <akpm@linux-foundation.org>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Andi Kleen <ak@linux.intel.com>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Dave Chinner <david@fromorbit.com>, Ning Qu <quning@gmail.com>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCHv3 2/2] mm: implement ->map_pages for page cache
Date: Thu, 27 Feb 2014 13:47:11 -0800
Message-ID: <20140227134711.329eb3c385098c8bce37c8d1@linux-foundation.org>
In-Reply-To: <1393530827-25450-3-git-send-email-kirill.shutemov@linux.intel.com>

On Thu, 27 Feb 2014 21:53:47 +0200 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:

> filemap_map_pages() is generic implementation of ->map_pages() for
> filesystems who uses page cache.
> 
> It should be safe to use filemap_map_pages() for ->map_pages() if
> filesystem use filemap_fault() for ->fault().
> 
> ...
>
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1818,6 +1818,7 @@ extern void truncate_inode_pages_range(struct address_space *,
>  
>  /* generic vm_area_ops exported for stackable file systems */
>  extern int filemap_fault(struct vm_area_struct *, struct vm_fault *);
> +extern void filemap_map_pages(struct vm_area_struct *vma, struct vm_fault *vmf);
>  extern int filemap_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
>  
>  /* mm/page-writeback.c */
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 7a13f6ac5421..1bc12a96060d 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -33,6 +33,7 @@
>  #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
>  #include <linux/memcontrol.h>
>  #include <linux/cleancache.h>
> +#include <linux/rmap.h>
>  #include "internal.h"
>  
>  #define CREATE_TRACE_POINTS
> @@ -1726,6 +1727,76 @@ page_not_uptodate:
>  }
>  EXPORT_SYMBOL(filemap_fault);
>  
> +void filemap_map_pages(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> +	struct radix_tree_iter iter;
> +	void **slot;
> +	struct file *file = vma->vm_file;
> +	struct address_space *mapping = file->f_mapping;
> +	loff_t size;
> +	struct page *page;
> +	unsigned long address = (unsigned long) vmf->virtual_address;
> +	unsigned long addr;
> +	pte_t *pte;
> +
> +	rcu_read_lock();
> +	radix_tree_for_each_slot(slot, &mapping->page_tree, &iter, vmf->pgoff) {
> +		if (iter.index > vmf->max_pgoff)
> +			break;
> +repeat:
> +		page = radix_tree_deref_slot(slot);
> +		if (radix_tree_exception(page)) {
> +			if (radix_tree_deref_retry(page))
> +				break;
> +			else
> +				goto next;
> +		}
> +
> +		if (!page_cache_get_speculative(page))
> +			goto repeat;
> +
> +		/* Has the page moved? */
> +		if (unlikely(page != *slot)) {
> +			page_cache_release(page);
> +			goto repeat;
> +		}
> +
> +		if (!PageUptodate(page) ||
> +				PageReadahead(page) ||
> +				PageHWPoison(page))
> +			goto skip;
> +		if (!trylock_page(page))
> +			goto skip;
> +
> +		if (page->mapping != mapping || !PageUptodate(page))
> +			goto unlock;
> +
> +		size = i_size_read(mapping->host) + PAGE_CACHE_SIZE - 1;

Could perhaps use round_up here.
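
For illustration, here is a minimal userspace sketch of why the two forms agree. The `sketch_` names and the 4 KiB page size are assumptions made for this example; the macro only imitates the kernel's round_up(), which likewise requires a power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins (assumed values); the kernel has its own definitions. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Imitation of round_up(): y must be a power of two. */
#define sketch_round_up(x, y) ((((x) - 1) | ((y) - 1)) + 1)

/* Open-coded form, as in the quoted hunk: add (page size - 1), then shift. */
static uint64_t sketch_pages_open_coded(uint64_t i_size)
{
	return (i_size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Same page count, with the rounding step named explicitly. */
static uint64_t sketch_pages_round_up(uint64_t i_size)
{
	return sketch_round_up(i_size, SKETCH_PAGE_SIZE) >> SKETCH_PAGE_SHIFT;
}

/* Compare both forms for every size in the first three pages. */
static void sketch_check_equivalence(void)
{
	uint64_t i_size;

	for (i_size = 0; i_size <= 3 * SKETCH_PAGE_SIZE; i_size++)
		assert(sketch_pages_open_coded(i_size) ==
		       sketch_pages_round_up(i_size));
}
```

In the patch this would presumably amount to assigning round_up(i_size_read(mapping->host), PAGE_CACHE_SIZE) to size; the behaviour is unchanged and only the intent becomes easier to read.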

> +		if (page->index >= size	>> PAGE_CACHE_SHIFT)
> +			goto unlock;
> +		pte = vmf->pte + page->index - vmf->pgoff;
> +		if (!pte_none(*pte))
> +			goto unlock;
> +
> +		if (file->f_ra.mmap_miss > 0)
> +			file->f_ra.mmap_miss--;

I'm wondering about this.  We treat every speculative faultahead as a
hit, whether or not userspace will actually touch that page.

What's the effect of this?  To cause the amount of physical readahead
to increase?  But if userspace is in fact touching the file in a sparse
random fashion, that is exactly the wrong thing to do?
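
To make the concern concrete, here is a toy userspace model of the counter. It is not the kernel's readahead logic: the `sketch_` names, the threshold of 100 and the cap of ten times that are assumptions for illustration. Each simulated fault counts one miss, and each page mapped by faultaround counts one hit, as in the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed threshold above which read-around is treated as not worthwhile. */
#define SKETCH_LOTSAMISS 100

struct sketch_ra {
	unsigned int mmap_miss;
};

/* A fault that had to go to disk: count a miss, with an assumed cap. */
static void sketch_cache_miss(struct sketch_ra *ra)
{
	assert(ra != NULL);
	if (ra->mmap_miss < SKETCH_LOTSAMISS * 10)
		ra->mmap_miss++;
}

/* As in the quoted hunk: every speculatively mapped page counts as a hit. */
static void sketch_map_pages(struct sketch_ra *ra, unsigned int nr_mapped)
{
	assert(ra != NULL);
	while (nr_mapped--)
		if (ra->mmap_miss > 0)
			ra->mmap_miss--;
}

/* Read-around stays on while the miss counter is at or below the threshold. */
static int sketch_readaround_enabled(const struct sketch_ra *ra)
{
	return ra->mmap_miss <= SKETCH_LOTSAMISS;
}

/* Run nr_faults sparse faults, each followed by mapped_per_fault hits. */
static struct sketch_ra sketch_simulate(unsigned int nr_faults,
					unsigned int mapped_per_fault)
{
	struct sketch_ra ra = { 0 };
	unsigned int i;

	for (i = 0; i < nr_faults; i++) {
		sketch_cache_miss(&ra);
		sketch_map_pages(&ra, mapped_per_fault);
	}
	return ra;
}
```

In this model, 200 sparse faults with no faultaround push the counter past the threshold, whereas mapping even one neighbouring page per fault holds it at zero, so read-around never backs off regardless of whether userspace touches those pages.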

> +		addr = address + (page->index - vmf->pgoff) * PAGE_SIZE;
> +		do_set_pte(vma, addr, page, pte, false, false);
> +		unlock_page(page);
> +		goto next;
> +unlock:
> +		unlock_page(page);
> +skip:
> +		page_cache_release(page);
> +next:
> +		if (page->index == vmf->max_pgoff)
> +			break;
> +	}
> +	rcu_read_unlock();
> +}
> +EXPORT_SYMBOL(filemap_map_pages);
> +

