From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Rik van Riel <riel@redhat.com>, Michal Hocko <mhocko@suse.cz>,
	HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [ 09/42] hugetlbfs: add swap entry check in follow_hugetlb_page()
Date: Tue, 23 Apr 2013 14:52:07 -0700
Message-ID: <20130423215206.505930786@linuxfoundation.org>
In-Reply-To: <20130423215205.523980967@linuxfoundation.org>

3.8-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

commit 9cc3a5bd40067b9a0fbd49199d0780463fc2140f upstream.

After applying the previous patch "hugetlbfs: stop setting VM_DONTDUMP in
initializing vma(VM_HUGETLB)" to re-enable hugepage coredump, if a memory
error happens on a hugepage and the affected processes try to access the
hugepage with the error, we hit
VM_BUG_ON(atomic_read(&page->_count) <= 0) in get_page().

The bug occurs because the coredump-related code doesn't recognise the
"hugepage hwpoison entry" that replaces a pmd entry when a memory error
occurs on a hugepage.

In other words, a hugepage hwpoison entry stores physical address
information in a different bit layout than a pmd entry does, so
follow_hugetlb_page(), which is called from get_dump_page(), returns the
wrong page for a given address.
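
As a rough illustration of the layout mismatch, here is a userspace toy
model. The bit positions and every toy_* name are invented for this
sketch and do not match any real architecture or kernel helper. It only
shows that decoding a swap-style entry with the present-entry decoder
yields a different page frame number than the one that was stored.

```c
#include <assert.h>
#include <stdint.h>

/* Toy layouts, invented for illustration; real architectures differ. */
#define TOY_PRESENT      0x1ULL  /* bit 0: entry maps a page */
#define TOY_PFN_SHIFT    12      /* present entry: pfn in bits 12 and up */
#define TOY_TYPE_SHIFT   1       /* swap-style entry: type in bits 1..5 */
#define TOY_TYPE_MASK    0x1fULL
#define TOY_OFFSET_SHIFT 6       /* swap-style entry: offset from bit 6 up */

/* Build a present entry that maps the given pfn. */
static uint64_t toy_mk_present(uint64_t pfn)
{
	return (pfn << TOY_PFN_SHIFT) | TOY_PRESENT;
}

/* Build a swap-style entry (e.g. hwpoison) carrying the pfn as its offset. */
static uint64_t toy_mk_swap(uint64_t type, uint64_t pfn)
{
	assert(type <= TOY_TYPE_MASK);
	return (pfn << TOY_OFFSET_SHIFT) | (type << TOY_TYPE_SHIFT);
}

/* Non-empty and not present: treat it as a swap-style entry. */
static int toy_is_swap(uint64_t entry)
{
	return entry != 0 && !(entry & TOY_PRESENT);
}

/* Decode a pfn assuming the present-entry layout. */
static uint64_t toy_present_pfn(uint64_t entry)
{
	return entry >> TOY_PFN_SHIFT;
}

/* Decode a pfn assuming the swap-style layout. */
static uint64_t toy_swap_pfn(uint64_t entry)
{
	return entry >> TOY_OFFSET_SHIFT;
}
```

In this model, toy_present_pfn(toy_mk_swap(3, 0x1234)) does not return
0x1234, which is why a caller has to test for a swap-style entry before
treating the entry as a mapping.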

The expected behavior is:

  absent   is_swap_pte   FOLL_DUMP   Expected behavior
  -------------------------------------------------------------------
   true     false         false       hugetlb_fault
   false    true          false       hugetlb_fault
   false    false         false       return page
   true     false         true        skip page (to avoid allocation)
   false    true          true        hugetlb_fault
   false    false         true        return page

With this patch, we call hugetlb_fault() and take the proper action (wait
for migration entries, fail with VM_FAULT_HWPOISON_LARGE for hwpoisoned
entries), and as a result we can dump all hugepages except the hwpoisoned
ones.
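
The table above can be written as a small decision function. This is a
userspace sketch of the table only, not the kernel code: the enum, the
function name and the boolean parameters are hypothetical, and the
FOLL_WRITE condition that the real check also tests is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three outcomes listed in the table. */
enum dump_action {
	ACT_HUGETLB_FAULT,
	ACT_RETURN_PAGE,
	ACT_SKIP_PAGE,
};

/*
 * absent:  the huge pte is none (nothing is mapped)
 * is_swap: the huge pte holds a swap-style entry (migration or hwpoison)
 * dump:    the caller passed FOLL_DUMP
 */
static enum dump_action expected_action(bool absent, bool is_swap, bool dump)
{
	/* An entry cannot be both absent and a swap-style entry. */
	assert(!(absent && is_swap));

	/* Swap-style entries always go through the fault handler. */
	if (is_swap)
		return ACT_HUGETLB_FAULT;

	/* Absent entries fault, except when dumping (avoids allocation). */
	if (absent)
		return dump ? ACT_SKIP_PAGE : ACT_HUGETLB_FAULT;

	/* A present mapping can be returned directly. */
	return ACT_RETURN_PAGE;
}
```

Only the "absent" case depends on FOLL_DUMP; a swap-style entry takes the
fault path regardless of the flag.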

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/hugetlb.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2965,7 +2965,17 @@ int follow_hugetlb_page(struct mm_struct
 			break;
 		}
 
-		if (absent ||
+		/*
+		 * We need call hugetlb_fault for both hugepages under migration
+		 * (in which case hugetlb_fault waits for the migration,) and
+		 * hwpoisoned hugepages (in which case we need to prevent the
+		 * caller from accessing to them.) In order to do this, we use
+		 * here is_swap_pte instead of is_hugetlb_entry_migration and
+		 * is_hugetlb_entry_hwpoisoned. This is because it simply covers
+		 * both cases, and because we can't follow correct pages
+		 * directly from any kind of swap entries.
+		 */
+		if (absent || is_swap_pte(huge_ptep_get(pte)) ||
 		    ((flags & FOLL_WRITE) && !pte_write(huge_ptep_get(pte)))) {
 			int ret;
 
