From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Rik van Riel <riel@redhat.com>, Michal Hocko <mhocko@suse.cz>,
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [ 04/26] hugetlbfs: add swap entry check in follow_hugetlb_page()
Date: Tue, 23 Apr 2013 14:53:44 -0700
Message-ID: <20130423215333.802657218@linuxfoundation.org>
In-Reply-To: <20130423215333.344045754@linuxfoundation.org>
3.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
commit 9cc3a5bd40067b9a0fbd49199d0780463fc2140f upstream.
After applying the previous patch "hugetlbfs: stop setting VM_DONTDUMP in
initializing vma(VM_HUGETLB)" to reenable hugepage coredump, if a memory
error happens on a hugepage and the affected processes try to access the
error hugepage, we hit VM_BUG_ON(atomic_read(&page->_count) <= 0) in
get_page().
The reason for this bug is that coredump-related code doesn't recognise
the "hugepage hwpoison entry" that replaces a pmd entry when a memory
error occurs on a hugepage.
In other words, a hugepage hwpoison entry and a pmd entry store the
physical address information in different bit layouts, so
follow_hugetlb_page(), which is called from get_dump_page(), returns the
wrong page for a given address.
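The mismatch can be illustrated with a small user-space sketch. The bit
layout below is a toy one invented for illustration (every toy_* name and
TOY_* constant is hypothetical and matches no real architecture); only the
idea is taken from the description above: a non-empty, non-present entry
keeps the page frame number in a different position than a present entry,
so decoding it as if it were present yields an unrelated page.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy entry layout (an assumption for illustration only):
 *   present entry:    bit 0 = 1, page frame number in bits 12 and up
 *   swap-style entry: bit 0 = 0, type in bits 1..5, offset in bits 6 and up
 * A hwpoison-style entry is modelled as a swap-style entry whose offset
 * field carries the page frame number.
 */
#define TOY_PRESENT        0x1ULL
#define TOY_PFN_SHIFT      12
#define TOY_TYPE_SHIFT     1
#define TOY_OFFSET_SHIFT   6
#define TOY_TYPE_HWPOISON  0x1eULL	/* arbitrary type value */

uint64_t toy_mk_present(uint64_t pfn)
{
	return (pfn << TOY_PFN_SHIFT) | TOY_PRESENT;
}

uint64_t toy_mk_hwpoison(uint64_t pfn)
{
	return (pfn << TOY_OFFSET_SHIFT) |
	       (TOY_TYPE_HWPOISON << TOY_TYPE_SHIFT);
}

int toy_is_none(uint64_t entry)
{
	return entry == 0;
}

int toy_is_present(uint64_t entry)
{
	return (entry & TOY_PRESENT) != 0;
}

/* Same shape as a swap-entry test: neither empty nor present. */
int toy_is_swap(uint64_t entry)
{
	return !toy_is_none(entry) && !toy_is_present(entry);
}

/* Decode assuming the entry is present. */
uint64_t toy_present_pfn(uint64_t entry)
{
	return entry >> TOY_PFN_SHIFT;
}

/* Decode assuming the entry is swap-style. */
uint64_t toy_swap_offset(uint64_t entry)
{
	return entry >> TOY_OFFSET_SHIFT;
}
```

In this model, decoding a hwpoison-style entry with toy_present_pfn()
gives a frame number unrelated to the one that was stored, which is why a
swap-style check has to come before any attempt to follow the entry.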
The expected behavior is like this:
  absent   is_swap_pte   FOLL_DUMP   Expected behavior
  -------------------------------------------------------------------
  true     false         false       hugetlb_fault
  false    true          false       hugetlb_fault
  false    false         false       return page
  true     false         true        skip page (to avoid allocation)
  false    true          true        hugetlb_fault
  false    false         true        return page
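The table can also be read as a small decision function. The following is
a user-space sketch of the table only, not the code in mm/hugetlb.c: the
enum names and decide() are hypothetical, and the real function takes
further conditions into account (for example FOLL_WRITE on a read-only
entry, which also goes through hugetlb_fault).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; "absent" and "swap" never hold at the same time. */
enum entry_state { ENTRY_ABSENT, ENTRY_SWAP, ENTRY_PRESENT };
enum action { ACT_HUGETLB_FAULT, ACT_SKIP_PAGE, ACT_RETURN_PAGE };

enum action decide(enum entry_state state, bool foll_dump)
{
	switch (state) {
	case ENTRY_ABSENT:
		/* A dump must not allocate pages just to write zeroes. */
		return foll_dump ? ACT_SKIP_PAGE : ACT_HUGETLB_FAULT;
	case ENTRY_SWAP:
		/* Migration or hwpoison entry: let the fault path decide. */
		return ACT_HUGETLB_FAULT;
	case ENTRY_PRESENT:
	default:
		return ACT_RETURN_PAGE;
	}
}

/* Checks every row of the table above. */
void check_table(void)
{
	assert(decide(ENTRY_ABSENT, false) == ACT_HUGETLB_FAULT);
	assert(decide(ENTRY_SWAP, false) == ACT_HUGETLB_FAULT);
	assert(decide(ENTRY_PRESENT, false) == ACT_RETURN_PAGE);
	assert(decide(ENTRY_ABSENT, true) == ACT_SKIP_PAGE);
	assert(decide(ENTRY_SWAP, true) == ACT_HUGETLB_FAULT);
	assert(decide(ENTRY_PRESENT, true) == ACT_RETURN_PAGE);
}
```

The swap row is the one this patch adds: it takes the fault path whether
or not FOLL_DUMP is set.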
With this patch, we can call hugetlb_fault() and take the proper action
(wait for migration entries, or fail with VM_FAULT_HWPOISON_LARGE for
hwpoisoned entries), and as a result we can dump all hugepages except
the hwpoisoned ones.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/hugetlb.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2906,7 +2906,17 @@ int follow_hugetlb_page(struct mm_struct
 			break;
 		}
 
-		if (absent ||
+		/*
+		 * We need call hugetlb_fault for both hugepages under migration
+		 * (in which case hugetlb_fault waits for the migration,) and
+		 * hwpoisoned hugepages (in which case we need to prevent the
+		 * caller from accessing to them.) In order to do this, we use
+		 * here is_swap_pte instead of is_hugetlb_entry_migration and
+		 * is_hugetlb_entry_hwpoisoned. This is because it simply covers
+		 * both cases, and because we can't follow correct pages
+		 * directly from any kind of swap entries.
+		 */
+		if (absent || is_swap_pte(huge_ptep_get(pte)) ||
 		    ((flags & FOLL_WRITE) && !pte_write(huge_ptep_get(pte)))) {
 			int ret;
 