From: Mel Gorman <mel@csn.ul.ie>
To: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Jeff Chua <jeff.chua.linux@gmail.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 4/8] mm: FOLL_DUMP replace FOLL_ANON
Date: Wed, 9 Sep 2009 12:14:38 +0100
Message-ID: <20090909111438.GF24614@csn.ul.ie>
In-Reply-To: <Pine.LNX.4.64.0909072233240.15430@sister.anvils>

On Mon, Sep 07, 2009 at 10:35:32PM +0100, Hugh Dickins wrote:
> The "FOLL_ANON optimization" and its use_zero_page() test have caused
> confusion and bugs: why does it test VM_SHARED? for the very good but
> unsatisfying reason that VMware crashed without. As we look to maybe
> reinstating anonymous use of the ZERO_PAGE, we need to sort this out.
>
> Easily done: it's silly for __get_user_pages() and follow_page() to
> be guessing whether it's safe to assume that they're being used for
> a coredump (which can take a shortcut snapshot where other uses must
> handle a fault) - just tell them with GUP_FLAGS_DUMP and FOLL_DUMP.
>
> get_dump_page() doesn't even want a ZERO_PAGE: an error suits fine.
>
> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Acked-by: Mel Gorman <mel@csn.ul.ie>
> ---
>
>  include/linux/mm.h |    2 +-
>  mm/internal.h      |    1 +
>  mm/memory.c        |   43 ++++++++++++-------------------------------
>  3 files changed, 14 insertions(+), 32 deletions(-)
>
> --- mm3/include/linux/mm.h 2009-09-07 13:16:32.000000000 +0100
> +++ mm4/include/linux/mm.h 2009-09-07 13:16:39.000000000 +0100
> @@ -1247,7 +1247,7 @@ struct page *follow_page(struct vm_area_
>  #define FOLL_WRITE	0x01	/* check pte is writable */
>  #define FOLL_TOUCH	0x02	/* mark page accessed */
>  #define FOLL_GET	0x04	/* do get_page on page */
> -#define FOLL_ANON	0x08	/* give ZERO_PAGE if no pgtable */
> +#define FOLL_DUMP	0x08	/* give error on hole if it would be zero */
>  
>  typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
>  			void *data);
> --- mm3/mm/internal.h 2009-09-07 13:16:22.000000000 +0100
> +++ mm4/mm/internal.h 2009-09-07 13:16:39.000000000 +0100
> @@ -252,6 +252,7 @@ static inline void mminit_validate_memmo
>  
>  #define GUP_FLAGS_WRITE	0x01
>  #define GUP_FLAGS_FORCE	0x02
> +#define GUP_FLAGS_DUMP	0x04
>  
>  int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
>  		     unsigned long start, int len, int flags,
> --- mm3/mm/memory.c 2009-09-07 13:16:32.000000000 +0100
> +++ mm4/mm/memory.c 2009-09-07 13:16:39.000000000 +0100
> @@ -1174,41 +1174,22 @@ no_page:
>  	pte_unmap_unlock(ptep, ptl);
>  	if (!pte_none(pte))
>  		return page;
> -	/* Fall through to ZERO_PAGE handling */
> +
>  no_page_table:
>  	/*
>  	 * When core dumping an enormous anonymous area that nobody
> -	 * has touched so far, we don't want to allocate page tables.
> +	 * has touched so far, we don't want to allocate unnecessary pages or
> +	 * page tables.  Return error instead of NULL to skip handle_mm_fault,
> +	 * then get_dump_page() will return NULL to leave a hole in the dump.
> +	 * But we can only make this optimization where a hole would surely
> +	 * be zero-filled if handle_mm_fault() actually did handle it.
>  	 */
> -	if (flags & FOLL_ANON) {
> -		page = ZERO_PAGE(0);
> -		if (flags & FOLL_GET)
> -			get_page(page);
> -		BUG_ON(flags & FOLL_WRITE);
> -	}
> +	if ((flags & FOLL_DUMP) &&
> +	    (!vma->vm_ops || !vma->vm_ops->fault))
> +		return ERR_PTR(-EFAULT);
>  	return page;
>  }
>  
> -/* Can we do the FOLL_ANON optimization? */
> -static inline int use_zero_page(struct vm_area_struct *vma)
> -{
> -	/*
> -	 * We don't want to optimize FOLL_ANON for make_pages_present()
> -	 * when it tries to page in a VM_LOCKED region. As to VM_SHARED,
> -	 * we want to get the page from the page tables to make sure
> -	 * that we serialize and update with any other user of that
> -	 * mapping.
> -	 */
> -	if (vma->vm_flags & (VM_LOCKED | VM_SHARED))
> -		return 0;
> -	/*
> -	 * And if we have a fault routine, it's not an anonymous region.
> -	 */
> -	return !vma->vm_ops || !vma->vm_ops->fault;
> -}
> -
> -
> -
>  int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
>  		     unsigned long start, int nr_pages, int flags,
>  		     struct page **pages, struct vm_area_struct **vmas)
> @@ -1288,8 +1269,8 @@ int __get_user_pages(struct task_struct
>  		foll_flags = FOLL_TOUCH;
>  		if (pages)
>  			foll_flags |= FOLL_GET;
> -		if (!write && use_zero_page(vma))
> -			foll_flags |= FOLL_ANON;
> +		if (flags & GUP_FLAGS_DUMP)
> +			foll_flags |= FOLL_DUMP;
>  
>  		do {
>  			struct page *page;
> @@ -1447,7 +1428,7 @@ struct page *get_dump_page(unsigned long
>  	struct page *page;
>  
>  	if (__get_user_pages(current, current->mm, addr, 1,
> -			GUP_FLAGS_FORCE, &page, &vma) < 1)
> +			GUP_FLAGS_FORCE | GUP_FLAGS_DUMP, &page, &vma) < 1)
>  		return NULL;
>  	if (page == ZERO_PAGE(0)) {
>  		page_cache_release(page);
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
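
The decision the patch puts in follow_page() and __get_user_pages() can be
modelled in userspace. What follows is only a minimal sketch under stated
assumptions: every MODEL_* constant, struct model_vma, struct model_vm_ops
and the model_* functions are stand-ins invented for illustration, not
kernel definitions, and the return value 0 stands in for the NULL page that
makes the real caller go on to handle_mm_fault().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative stand-ins only; values mirror the patch but are not kernel headers. */
#define MODEL_FOLL_TOUCH	0x02
#define MODEL_FOLL_GET		0x04
#define MODEL_FOLL_DUMP		0x08
#define MODEL_GUP_FLAGS_FORCE	0x02
#define MODEL_GUP_FLAGS_DUMP	0x04

struct model_vm_ops {
	int (*fault)(void);	/* non-NULL means the mapping has a fault handler */
};

struct model_vma {
	const struct model_vm_ops *vm_ops;
};

/* Hypothetical fault handler, used only to build a file-backed-like vma. */
static int model_dummy_fault(void)
{
	return 0;
}

/* Translate caller-level GUP flags into follow_page()-level FOLL flags. */
static unsigned int model_gup_to_foll(unsigned int gup_flags, int want_pages)
{
	unsigned int foll_flags = MODEL_FOLL_TOUCH;

	if (want_pages)
		foll_flags |= MODEL_FOLL_GET;
	if (gup_flags & MODEL_GUP_FLAGS_DUMP)
		foll_flags |= MODEL_FOLL_DUMP;
	return foll_flags;
}

/*
 * Models the no_page_table path: returns -EFAULT when a dump may skip the
 * hole (no fault handler, so the hole would be zero-filled), otherwise 0
 * to mean "no page here, the caller must fault it in".
 */
static int model_follow_hole(const struct model_vma *vma,
			     unsigned int foll_flags)
{
	assert(vma != NULL);

	if ((foll_flags & MODEL_FOLL_DUMP) &&
	    (!vma->vm_ops || !vma->vm_ops->fault))
		return -EFAULT;
	return 0;
}
```

In this model only a dump caller looking at a vma without a fault handler
gets the error, which corresponds to get_dump_page() returning NULL and
leaving a hole in the dump; every other combination falls back to faulting.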
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab