* [PATCH v2 1/2] hugetlbfs: stop setting VM_DONTDUMP in initializing vma(VM_HUGETLB)
2013-04-01 17:21 [PATCH v2 0/2] fix hugepage coredump Naoya Horiguchi
@ 2013-04-01 17:21 ` Naoya Horiguchi
[not found] ` <CABOkKT0uceznvR0bKx79GB5HSEbWA2vp0G5dAjg6V23O3anS7w@mail.gmail.com>
2013-04-01 17:21 ` [PATCH v2 2/2] hugetlbfs: add swap entry check in follow_hugetlb_page() Naoya Horiguchi
2013-04-02 5:34 ` [PATCH v2 0/2] fix hugepage coredump Konstantin Khlebnikov
2 siblings, 1 reply; 5+ messages in thread
From: Naoya Horiguchi @ 2013-04-01 17:21 UTC (permalink / raw)
To: Andrew Morton
Cc: Mel Gorman, Hugh Dickins, Rik van Riel, KOSAKI Motohiro,
Konstantin Khlebnikov, Michal Hocko, linux-mm, linux-kernel,
stable
Currently we fail to include any hugepage data in the coredump,
because VM_DONTDUMP is set on hugetlbfs's vma. This behavior was recently
introduced by commit 314e51b98 "mm: kill vma flag VM_RESERVED and
mm->reserved_vm counter". This looks to me like a serious regression,
so let's fix it.
ChangeLog v2:
- add 'return 0' in hugepage memory check
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org
---
fs/binfmt_elf.c | 1 +
fs/hugetlbfs/inode.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git v3.9-rc3.orig/fs/binfmt_elf.c v3.9-rc3/fs/binfmt_elf.c
index 3939829..86af964 100644
--- v3.9-rc3.orig/fs/binfmt_elf.c
+++ v3.9-rc3/fs/binfmt_elf.c
@@ -1137,6 +1137,7 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
 			goto whole;
 		if (!(vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_PRIVATE))
 			goto whole;
+		return 0;
 	}
 
 	/* Do not dump I/O mapped devices or special mappings */
diff --git v3.9-rc3.orig/fs/hugetlbfs/inode.c v3.9-rc3/fs/hugetlbfs/inode.c
index 84e3d85..523464e 100644
--- v3.9-rc3.orig/fs/hugetlbfs/inode.c
+++ v3.9-rc3/fs/hugetlbfs/inode.c
@@ -110,7 +110,7 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	 * way when do_mmap_pgoff unwinds (may be important on powerpc
 	 * and ia64).
 	 */
-	vma->vm_flags |= VM_HUGETLB | VM_DONTEXPAND | VM_DONTDUMP;
+	vma->vm_flags |= VM_HUGETLB | VM_DONTEXPAND;
 	vma->vm_ops = &hugetlb_vm_ops;
 
 	if (vma->vm_pgoff & (~huge_page_mask(h) >> PAGE_SHIFT))
--
1.7.11.7
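As a rough userspace illustration of the decision the patch above changes (not kernel code), the sketch below models only the hugetlb part of the dump-size check. All names and bit values (`MODEL_*`, `model_dump_decision`) are hypothetical, and the placement of the DONTDUMP test ahead of the hugetlb test is an assumption about the surrounding function, since the diff shows only the hugetlb block.

```c
#include <stdbool.h>

/* Illustrative bit values for this model only, not the kernel's definitions. */
#define MODEL_VM_SHARED    0x1u
#define MODEL_VM_HUGETLB   0x2u
#define MODEL_VM_DONTDUMP  0x4u

#define MODEL_FILTER_HUGETLB_PRIVATE 0x1u
#define MODEL_FILTER_HUGETLB_SHARED  0x2u

enum model_decision {
	MODEL_DUMP_NONE,     /* nothing from this mapping is dumped */
	MODEL_DUMP_WHOLE,    /* the whole mapping is dumped */
	MODEL_OTHER_CHECKS   /* left to checks outside this sketch */
};

static enum model_decision model_dump_decision(unsigned int vm_flags,
					       unsigned int filter)
{
	/* Assumed ordering: a DONTDUMP mapping is rejected before anything else. */
	if (vm_flags & MODEL_VM_DONTDUMP)
		return MODEL_DUMP_NONE;

	if (vm_flags & MODEL_VM_HUGETLB) {
		bool shared = (vm_flags & MODEL_VM_SHARED) != 0;

		if (shared && (filter & MODEL_FILTER_HUGETLB_SHARED))
			return MODEL_DUMP_WHOLE;
		if (!shared && (filter & MODEL_FILTER_HUGETLB_PRIVATE))
			return MODEL_DUMP_WHOLE;
		/* Mirrors the added 'return 0': the filter alone decides for hugetlb. */
		return MODEL_DUMP_NONE;
	}

	return MODEL_OTHER_CHECKS;
}
```

In this model, a private hugetlb mapping that still carries the DONTDUMP bit is never dumped regardless of the filter, while the same mapping without that bit is dumped whenever the private-hugetlb filter bit is set.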
* [PATCH v2 2/2] hugetlbfs: add swap entry check in follow_hugetlb_page()
From: Naoya Horiguchi @ 2013-04-01 17:21 UTC (permalink / raw)
To: Andrew Morton
Cc: Mel Gorman, Hugh Dickins, Rik van Riel, KOSAKI Motohiro,
Konstantin Khlebnikov, Michal Hocko, linux-mm, linux-kernel,
stable
After applying the previous patch "hugetlbfs: stop setting VM_DONTDUMP in
initializing vma(VM_HUGETLB)" to re-enable hugepage coredump, if a memory
error happens on a hugepage and the affected processes try to access
the hugepage with the error, we hit VM_BUG_ON(atomic_read(&page->_count) <= 0)
in get_page().
The reason for this bug is that coredump-related code doesn't recognise the
"hugepage hwpoison entry" with which a pmd entry is replaced when a memory
error occurs on a hugepage.
In other words, physical address information is stored in a different bit
layout in a hugepage hwpoison entry than in a pmd entry, so
follow_hugetlb_page(), which is called from get_dump_page(), returns the
wrong page for a given address.
We need to filter out only the hwpoison hugepages so that data on healthy
hugepages still ends up in the coredump. So this patch makes
follow_hugetlb_page() skip getting the page when a pmd is in a
swap-entry-like format.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: stable@vger.kernel.org
---
mm/hugetlb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git v3.9-rc3.orig/mm/hugetlb.c v3.9-rc3/mm/hugetlb.c
index 0d1705b..8462e2c 100644
--- v3.9-rc3.orig/mm/hugetlb.c
+++ v3.9-rc3/mm/hugetlb.c
@@ -2968,7 +2968,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * first, for the page indexing below to work.
 		 */
 		pte = huge_pte_offset(mm, vaddr & huge_page_mask(h));
-		absent = !pte || huge_pte_none(huge_ptep_get(pte));
+		absent = !pte || huge_pte_none(huge_ptep_get(pte)) ||
+			 is_swap_pte(huge_ptep_get(pte));
 
 		/*
 		 * When coredumping, it suits get_dump_page if we just return
--
1.7.11.7
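To make the changed condition concrete, here is a small userspace sketch of the "absent" test before and after the patch above. It is a model under stated assumptions, not the kernel's page-table code: the entry type, the single present bit, and every `model_*` name are hypothetical, and a "swap-like" entry is modelled simply as a non-zero value with the present bit clear.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout: bit 0 means "present"; other bits are payload. */
#define MODEL_PRESENT 0x1ull

typedef uint64_t model_pte_t;

/* An all-zero entry stands for "no mapping at all". */
static bool model_pte_none(model_pte_t e)
{
	return e == 0;
}

/* Non-zero but not present: stands in for a swap-like entry such as hwpoison. */
static bool model_is_swap_like(model_pte_t e)
{
	return e != 0 && !(e & MODEL_PRESENT);
}

/* Condition before the patch: only a missing or empty entry counts as absent. */
static bool model_absent_old(const model_pte_t *p)
{
	return !p || model_pte_none(*p);
}

/* Condition after the patch: swap-like entries count as absent too. */
static bool model_absent_new(const model_pte_t *p)
{
	return !p || model_pte_none(*p) || model_is_swap_like(*p);
}
```

With the old condition, a swap-like entry is treated as if it mapped a page, so its payload bits would be misread as a page frame. With the new condition it is skipped, while present entries are still followed.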
* Re: [PATCH v2 0/2] fix hugepage coredump
From: Konstantin Khlebnikov @ 2013-04-02 5:34 UTC (permalink / raw)
To: Naoya Horiguchi
Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
KOSAKI Motohiro, Michal Hocko, linux-mm, linux-kernel
Naoya Horiguchi wrote:
> Hi,
>
> Here is 2nd version of hugepage coredump fix.
> See individual patches for more details.
>
> Thanks,
> Naoya Horiguchi
ACK to both patches.
The VM_* bits cleanup patchset was merged into v3.7, so only the two most recent stable kernels need this fix.