public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 0/2] fix hugepage coredump
@ 2013-04-01 17:21 Naoya Horiguchi
  2013-04-01 17:21 ` [PATCH v2 1/2] hugetlbfs: stop setting VM_DONTDUMP in initializing vma(VM_HUGETLB) Naoya Horiguchi
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Naoya Horiguchi @ 2013-04-01 17:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, KOSAKI Motohiro,
	Konstantin Khlebnikov, Michal Hocko, linux-mm, linux-kernel

Hi,

Here is the 2nd version of the hugepage coredump fix.
See individual patches for more details.

Thanks,
Naoya Horiguchi


* [PATCH v2 1/2] hugetlbfs: stop setting VM_DONTDUMP in initializing vma(VM_HUGETLB)
  2013-04-01 17:21 [PATCH v2 0/2] fix hugepage coredump Naoya Horiguchi
@ 2013-04-01 17:21 ` Naoya Horiguchi
       [not found]   ` <CABOkKT0uceznvR0bKx79GB5HSEbWA2vp0G5dAjg6V23O3anS7w@mail.gmail.com>
  2013-04-01 17:21 ` [PATCH v2 2/2] hugetlbfs: add swap entry check in follow_hugetlb_page() Naoya Horiguchi
  2013-04-02  5:34 ` [PATCH v2 0/2] fix hugepage coredump Konstantin Khlebnikov
  2 siblings, 1 reply; 5+ messages in thread
From: Naoya Horiguchi @ 2013-04-01 17:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, KOSAKI Motohiro,
	Konstantin Khlebnikov, Michal Hocko, linux-mm, linux-kernel,
	stable

Currently we fail to include any hugepage data in coredumps, because
VM_DONTDUMP is set on hugetlbfs vmas. This behavior was recently
introduced by commit 314e51b98 "mm: kill vma flag VM_RESERVED and
mm->reserved_vm counter". This looks to me like a serious regression,
so let's fix it.

ChangeLog v2:
 - add 'return 0' in hugepage memory check

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org
---
 fs/binfmt_elf.c      | 1 +
 fs/hugetlbfs/inode.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git v3.9-rc3.orig/fs/binfmt_elf.c v3.9-rc3/fs/binfmt_elf.c
index 3939829..86af964 100644
--- v3.9-rc3.orig/fs/binfmt_elf.c
+++ v3.9-rc3/fs/binfmt_elf.c
@@ -1137,6 +1137,7 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
 			goto whole;
 		if (!(vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_PRIVATE))
 			goto whole;
+		return 0;
 	}
 
 	/* Do not dump I/O mapped devices or special mappings */
diff --git v3.9-rc3.orig/fs/hugetlbfs/inode.c v3.9-rc3/fs/hugetlbfs/inode.c
index 84e3d85..523464e 100644
--- v3.9-rc3.orig/fs/hugetlbfs/inode.c
+++ v3.9-rc3/fs/hugetlbfs/inode.c
@@ -110,7 +110,7 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	 * way when do_mmap_pgoff unwinds (may be important on powerpc
 	 * and ia64).
 	 */
-	vma->vm_flags |= VM_HUGETLB | VM_DONTEXPAND | VM_DONTDUMP;
+	vma->vm_flags |= VM_HUGETLB | VM_DONTEXPAND;
 	vma->vm_ops = &hugetlb_vm_ops;
 
 	if (vma->vm_pgoff & (~huge_page_mask(h) >> PAGE_SHIFT))
-- 
1.7.11.7



* [PATCH v2 2/2] hugetlbfs: add swap entry check in follow_hugetlb_page()
  2013-04-01 17:21 [PATCH v2 0/2] fix hugepage coredump Naoya Horiguchi
  2013-04-01 17:21 ` [PATCH v2 1/2] hugetlbfs: stop setting VM_DONTDUMP in initializing vma(VM_HUGETLB) Naoya Horiguchi
@ 2013-04-01 17:21 ` Naoya Horiguchi
  2013-04-02  5:34 ` [PATCH v2 0/2] fix hugepage coredump Konstantin Khlebnikov
  2 siblings, 0 replies; 5+ messages in thread
From: Naoya Horiguchi @ 2013-04-01 17:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, KOSAKI Motohiro,
	Konstantin Khlebnikov, Michal Hocko, linux-mm, linux-kernel,
	stable

After applying the previous patch "hugetlbfs: stop setting VM_DONTDUMP in
initializing vma(VM_HUGETLB)" to re-enable hugepage coredump, if a memory
error happens on a hugepage and the affected processes try to access
the poisoned hugepage, we hit VM_BUG_ON(atomic_read(&page->_count) <= 0)
in get_page().

The reason for this bug is that the coredump-related code doesn't recognise
the "hugepage hwpoison entry" that replaces a pmd entry when a memory
error occurs on a hugepage.
In other words, a hugepage hwpoison entry and a pmd entry store physical
address information in different bit layouts, so follow_hugetlb_page(),
which is called from get_dump_page(), returns the wrong page for a given
address.

We need to filter out only the hwpoison hugepages, so that the coredump
still contains data from healthy hugepages. So this patch makes
follow_hugetlb_page() skip getting the page when the pmd is in a
swap-entry-like format.
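
The effect of the extra condition can be sketched in userspace. This is only
a toy model, not kernel code: toy_pte_t, TOY_PRESENT and the toy_* helpers
are hypothetical stand-ins, and the "swap-like" test is a simplified version
of what is_swap_pte() checks (an entry that is neither empty nor present).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy page-table entry: NOT the kernel's pte_t, just a 64-bit word. */
typedef uint64_t toy_pte_t;

/* Hypothetical "present" bit; real layouts are architecture specific. */
#define TOY_PRESENT 0x1ULL

bool toy_pte_none(toy_pte_t pte)
{
	return pte == 0;
}

bool toy_pte_present(toy_pte_t pte)
{
	return (pte & TOY_PRESENT) != 0;
}

/* Simplified swap-like test: the entry holds something, but not a mapping. */
bool toy_is_swap_pte(toy_pte_t pte)
{
	return !toy_pte_none(pte) && !toy_pte_present(pte);
}

/* Shape of the "absent" check before this patch. */
bool absent_old(const toy_pte_t *ptep)
{
	return !ptep || toy_pte_none(*ptep);
}

/* Shape of the "absent" check after this patch. */
bool absent_new(const toy_pte_t *ptep)
{
	return !ptep || toy_pte_none(*ptep) || toy_is_swap_pte(*ptep);
}
```

With an entry such as 0x7e00 (non-zero, present bit clear) standing in for a
hwpoison entry, absent_old() reports the page as usable while absent_new()
reports it as absent, which is the case the patch wants to filter out.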

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: stable@vger.kernel.org
---
 mm/hugetlb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git v3.9-rc3.orig/mm/hugetlb.c v3.9-rc3/mm/hugetlb.c
index 0d1705b..8462e2c 100644
--- v3.9-rc3.orig/mm/hugetlb.c
+++ v3.9-rc3/mm/hugetlb.c
@@ -2968,7 +2968,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * first, for the page indexing below to work.
 		 */
 		pte = huge_pte_offset(mm, vaddr & huge_page_mask(h));
-		absent = !pte || huge_pte_none(huge_ptep_get(pte));
+		absent = !pte || huge_pte_none(huge_ptep_get(pte)) ||
+			is_swap_pte(huge_ptep_get(pte));
 
 		/*
 		 * When coredumping, it suits get_dump_page if we just return
-- 
1.7.11.7



* Re: [PATCH v2 0/2] fix hugepage coredump
  2013-04-01 17:21 [PATCH v2 0/2] fix hugepage coredump Naoya Horiguchi
  2013-04-01 17:21 ` [PATCH v2 1/2] hugetlbfs: stop setting VM_DONTDUMP in initializing vma(VM_HUGETLB) Naoya Horiguchi
  2013-04-01 17:21 ` [PATCH v2 2/2] hugetlbfs: add swap entry check in follow_hugetlb_page() Naoya Horiguchi
@ 2013-04-02  5:34 ` Konstantin Khlebnikov
  2 siblings, 0 replies; 5+ messages in thread
From: Konstantin Khlebnikov @ 2013-04-02  5:34 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	KOSAKI Motohiro, Michal Hocko, linux-mm, linux-kernel

Naoya Horiguchi wrote:
> Hi,
>
> Here is 2nd version of hugepage coredump fix.
> See individual patches for more details.
>
> Thanks,
> Naoya Horiguchi

ACK to both patches


The VM_* bits cleanup patchset was merged into v3.7, so only the two most recent stable kernels need this fix.



* Re: [PATCH v2 1/2] hugetlbfs: stop setting VM_DONTDUMP in initializing vma(VM_HUGETLB)
       [not found]   ` <CABOkKT0uceznvR0bKx79GB5HSEbWA2vp0G5dAjg6V23O3anS7w@mail.gmail.com>
@ 2013-04-02 14:07     ` Naoya Horiguchi
  0 siblings, 0 replies; 5+ messages in thread
From: Naoya Horiguchi @ 2013-04-02 14:07 UTC (permalink / raw)
  To: HATAYAMA Daisuke
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	KOSAKI Motohiro, Konstantin Khlebnikov, Michal Hocko, linux-mm,
	linux-kernel, stable

On Tue, Apr 02, 2013 at 08:32:33PM +0900, HATAYAMA Daisuke wrote:
> 2013/4/2 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> 
> > Currently we fail to include any data on hugepages into coredump,
> > because VM_DONTDUMP is set on hugetlbfs's vma. This behavior was recently
> > introduced by commit 314e51b98 "mm: kill vma flag VM_RESERVED and
> > mm->reserved_vm counter". This looks to me a serious regression,
> > so let's fix it.
> >
> > ChangeLog v2:
> >  - add 'return 0' in hugepage memory check
> >
> <cut>
> 
> > @@ -1137,6 +1137,7 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
> >                         goto whole;
> >                 if (!(vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_PRIVATE))
> >                         goto whole;
> > +               return 0;
> >         }
> >
> 
> You should split this part into another patch. This fix is orthogonal to
> the bug this patch tries to fix.

Fair enough, thanks.

> The bug you're implicitly fixing here is that the filtering behaviour
> doesn't follow the description in Documentation/filesystems/proc.txt:
> 
>   Note bit 0-4 doesn't effect any hugetlb memory. hugetlb memory are only
>   effected by bit 5-6.
> 
> Right?

Right. Without this return, we fall through to the subsequent flag checks
for bits 0-4 even for vma(VM_HUGETLB).
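
The fall-through can be illustrated with a small userspace model. This is a
sketch under assumptions, not the code in fs/binfmt_elf.c: struct toy_vma,
the TOY_VM_* values and toy_dump_vma() are hypothetical, and only a few of
the later checks are modelled. The bit positions follow the coredump_filter
description in proc.txt (bits 0-4 for ordinary memory, bits 5-6 for hugetlb).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* coredump_filter bit positions as described in proc.txt. */
enum {
	BIT_ANON_PRIVATE    = 0,
	BIT_ANON_SHARED     = 1,
	BIT_MAPPED_PRIVATE  = 2,
	BIT_MAPPED_SHARED   = 3,
	BIT_ELF_HEADERS     = 4,
	BIT_HUGETLB_PRIVATE = 5,
	BIT_HUGETLB_SHARED  = 6,
};

/* Hypothetical stand-ins for kernel vm_flags; the values are arbitrary. */
#define TOY_VM_SHARED  0x1UL
#define TOY_VM_HUGETLB 0x2UL

struct toy_vma {
	unsigned long flags;
	bool file_unlinked;	/* stands in for i_nlink == 0 */
	bool has_anon;		/* stands in for vma->anon_vma != NULL */
};

#define FILTER(bit)	((filter >> (bit)) & 1UL)

/*
 * Returns true if the vma would be dumped. early_return selects between
 * the behaviour with (true) and without (false) the added "return 0".
 */
bool toy_dump_vma(const struct toy_vma *vma, unsigned long filter,
		  bool early_return)
{
	assert(vma != NULL);

	/* Hugetlb memory check: only bits 5-6 should matter here. */
	if (vma->flags & TOY_VM_HUGETLB) {
		if ((vma->flags & TOY_VM_SHARED) && FILTER(BIT_HUGETLB_SHARED))
			return true;
		if (!(vma->flags & TOY_VM_SHARED) && FILTER(BIT_HUGETLB_PRIVATE))
			return true;
		if (early_return)
			return false;	/* models the "return 0" added in v2 */
	}

	/* Later checks, driven by bits 0-4 (partially modelled). */
	if (vma->flags & TOY_VM_SHARED) {
		if (vma->file_unlinked ? FILTER(BIT_ANON_SHARED)
				       : FILTER(BIT_MAPPED_SHARED))
			return true;
		return false;
	}
	return vma->has_anon && FILTER(BIT_ANON_PRIVATE);
}
```

For a shared hugetlb vma with only bit 3 set in the filter, the model dumps
the vma when early_return is false and skips it when early_return is true,
which matches the documented rule that only bits 5-6 affect hugetlb memory.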

Thanks,
Naoya Horiguchi

