From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
Zwane Mwaikambo <zwane@arm.linux.org.uk>,
"Theodore Ts'o" <tytso@mit.edu>,
Randy Dunlap <rdunlap@xenotime.net>,
Dave Jones <davej@redhat.com>,
Chuck Wolber <chuckw@quantumlinux.com>,
Chris Wedgwood <reviews@ml.cw.f00f.org>,
Michael Krufky <mkrufky@linuxtv.org>,
Chuck Ebbert <cebbert@redhat.com>,
Domenico Andreoli <cavokz@gmail.com>, Willy Tarreau <w@1wt.eu>,
Rodrigo Rubira Branco <rbranco@la.checkpoint.com>,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk, Oleg Nesterov <oleg@tv-sign.ru>,
Nick Piggin <npiggin@suse.de>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Hugh Dickins <hugh@veritas.com>, Ingo Molnar <mingo@elte.hu>,
Roland McGrath <roland@redhat.com>
Subject: [patch 02/10] Reinstate ZERO_PAGE optimization in get_user_pages() and fix XIP
Date: Mon, 23 Jun 2008 16:05:25 -0700 [thread overview]
Message-ID: <20080623230525.GI29853@suse.de> (raw)
In-Reply-To: <20080623230417.GA29853@suse.de>
[-- Attachment #1: reinstate-zero_page-optimization-in-get_user_pages-and-fix-xip.patch --]
[-- Type: text/plain, Size: 4316 bytes --]
2.6.25.9-stable review patch. If anyone has any objections, please let
us know.
------------------
From: Linus Torvalds <torvalds@linux-foundation.org>
commit 89f5b7da2a6bad2e84670422ab8192382a5aeb9f upstream
KAMEZAWA Hiroyuki and Oleg Nesterov point out that since the commit
557ed1fa2620dc119adb86b34c614e152a629a80 ("remove ZERO_PAGE") removed
the ZERO_PAGE from the VM mappings, any users of get_user_pages() will
generally now populate the VM with real empty pages needlessly.
We used to get the ZERO_PAGE when we did the "handle_mm_fault()", but
since fault handling no longer uses ZERO_PAGE for new anonymous pages,
we now need to handle that special case in follow_page() instead.
In particular, the removal of ZERO_PAGE effectively removed the core
file writing optimization where we would skip writing pages that had not
been populated at all, and increased memory pressure a lot by allocating
all those useless newly zeroed pages.
This reinstates the optimization by making the unmapped PTE case the
same as for a non-existent page table, which already did this correctly.
While at it, this also fixes the XIP case for follow_page(), where the
caller could not differentiate between the case of a page that simply
could not be used (because it had no "struct page" associated with it)
and a page that just wasn't mapped.
We do that by simply returning an error pointer for pages that could not
be turned into a "struct page *". The error is arbitrarily picked to be
EFAULT, since that was what get_user_pages() already used for the
equivalent IO-mapped page case.
[ Also removed an impossible test for pte_offset_map_lock() failing:
that's not how that function works ]
Acked-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
arch/powerpc/kernel/vdso.c | 2 +-
mm/memory.c | 17 +++++++++++++----
mm/migrate.c | 10 ++++++++++
3 files changed, 24 insertions(+), 5 deletions(-)
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -141,7 +141,7 @@ static void dump_one_vdso_page(struct pa
printk("kpg: %p (c:%d,f:%08lx)", __va(page_to_pfn(pg) << PAGE_SHIFT),
page_count(pg),
pg->flags);
- if (upg/* && pg != upg*/) {
+ if (upg && !IS_ERR(upg) /* && pg != upg*/) {
printk(" upg: %p (c:%d,f:%08lx)", __va(page_to_pfn(upg)
<< PAGE_SHIFT),
page_count(upg),
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -943,17 +943,15 @@ struct page *follow_page(struct vm_area_
}
ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
- if (!ptep)
- goto out;
pte = *ptep;
if (!pte_present(pte))
- goto unlock;
+ goto no_page;
if ((flags & FOLL_WRITE) && !pte_write(pte))
goto unlock;
page = vm_normal_page(vma, address, pte);
if (unlikely(!page))
- goto unlock;
+ goto bad_page;
if (flags & FOLL_GET)
get_page(page);
@@ -968,6 +966,15 @@ unlock:
out:
return page;
+bad_page:
+ pte_unmap_unlock(ptep, ptl);
+ return ERR_PTR(-EFAULT);
+
+no_page:
+ pte_unmap_unlock(ptep, ptl);
+ if (!pte_none(pte))
+ return page;
+ /* Fall through to ZERO_PAGE handling */
no_page_table:
/*
* When core dumping an enormous anonymous area that nobody
@@ -1104,6 +1111,8 @@ int get_user_pages(struct task_struct *t
cond_resched();
}
+ if (IS_ERR(page))
+ return i ? i : PTR_ERR(page);
if (pages) {
pages[i] = page;
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -858,6 +858,11 @@ static int do_move_pages(struct mm_struc
goto set_status;
page = follow_page(vma, pp->addr, FOLL_GET);
+
+ err = PTR_ERR(page);
+ if (IS_ERR(page))
+ goto set_status;
+
err = -ENOENT;
if (!page)
goto set_status;
@@ -921,6 +926,11 @@ static int do_pages_stat(struct mm_struc
goto set_status;
page = follow_page(vma, pm->addr, 0);
+
+ err = PTR_ERR(page);
+ if (IS_ERR(page))
+ goto set_status;
+
err = -ENOENT;
/* Use PageReserved to check for zero page */
if (!page || PageReserved(page))
--
next prev parent reply other threads:[~2008-06-23 23:10 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20080623225737.837265824@mini.kroah.org>
2008-06-23 23:04 ` [patch 00/10] 2.6.28.9-rc2 review Greg KH
2008-06-23 23:04 ` [patch 08/10] hwmon: (lm85) Fix function RANGE_TO_REG() Greg KH
2008-06-23 23:04 ` [patch 09/10] hwmon: (adt7473) Initialize max_duty_at_overheat before use Greg KH
2008-06-23 23:04 ` [patch 10/10] Fix ZERO_PAGE breakage with vmware Greg KH
2008-06-23 23:28 ` Linus Torvalds
2008-06-24 6:04 ` Greg KH
2008-06-23 23:04 ` [patch 05/10] x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits Greg KH
2008-06-23 23:04 ` [patch 06/10] Add return value to reserve_bootmem_node() Greg KH
2008-06-24 11:06 ` Adrian Bunk
2008-06-24 21:07 ` Greg KH
2008-06-23 23:05 ` [patch 07/10] watchdog: hpwdt: fix use of inline assembly Greg KH
2008-06-23 23:05 ` [patch 01/10] atl1: relax eeprom mac address error check Greg KH
2008-06-23 23:05 ` Greg KH [this message]
2008-06-23 23:05 ` [patch 03/10] sctp: Make sure N * sizeof(union sctp_addr) does not overflow Greg KH
2008-06-23 23:05 ` [patch 04/10] x86: use BOOTMEM_EXCLUSIVE on 32-bit Greg KH
2008-06-23 23:22 ` [patch 00/10] 2.6.28.9-rc2 review Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080623230525.GI29853@suse.de \
--to=gregkh@suse.de \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=cavokz@gmail.com \
--cc=cebbert@redhat.com \
--cc=chuckw@quantumlinux.com \
--cc=davej@redhat.com \
--cc=hugh@veritas.com \
--cc=jmforbes@linuxtx.org \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=mkrufky@linuxtv.org \
--cc=npiggin@suse.de \
--cc=oleg@tv-sign.ru \
--cc=rbranco@la.checkpoint.com \
--cc=rdunlap@xenotime.net \
--cc=reviews@ml.cw.f00f.org \
--cc=roland@redhat.com \
--cc=stable@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=tytso@mit.edu \
--cc=w@1wt.eu \
--cc=zwane@arm.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.