From: Minchan Kim <minchan@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Minchan Kim <minchan@kernel.org>,
Jason Evans <je@fb.com>,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
Will Deacon <will.deacon@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
"[4.5+]" <stable@vger.kernel.org>,
Andreas Schwab <schwab@suse.de>
Subject: [PATCH] mm: pmd dirty emulation in page fault handler
Date: Thu, 22 Dec 2016 08:48:21 +0900 [thread overview]
Message-ID: <1482364101-16204-1-git-send-email-minchan@kernel.org> (raw)
Andreas reported [1] made a test in jemalloc hang in THP mode in arm64.
http://lkml.kernel.org/r/mvmmvfy37g1.fsf@hawking.suse.de
The problem is page fault handler supports only accessed flag emulation
for THP page of SW-dirty/accessed architecture.
This patch enables dirty-bit emulation for those architectures.
Without it, MADV_FREE makes application hang by repeated fault forever.
[1] mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called
Cc: Jason Evans <je@fb.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: <stable@vger.kernel.org> [4.5+]
Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
Reported-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/huge_memory.c | 6 ++++--
mm/memory.c | 18 ++++++++++--------
2 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 10eedbf..29ec8a4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -883,15 +883,17 @@ void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd)
{
pmd_t entry;
unsigned long haddr;
+ bool write = vmf->flags & FAULT_FLAG_WRITE;
vmf->ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd);
if (unlikely(!pmd_same(*vmf->pmd, orig_pmd)))
goto unlock;
entry = pmd_mkyoung(orig_pmd);
+ if (write)
+ entry = pmd_mkdirty(entry);
haddr = vmf->address & HPAGE_PMD_MASK;
- if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry,
- vmf->flags & FAULT_FLAG_WRITE))
+ if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, write))
update_mmu_cache_pmd(vmf->vma, vmf->address, vmf->pmd);
unlock:
diff --git a/mm/memory.c b/mm/memory.c
index 36c774f..7408ddc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3637,18 +3637,20 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
return do_huge_pmd_numa_page(&vmf, orig_pmd);
- if ((vmf.flags & FAULT_FLAG_WRITE) &&
- !pmd_write(orig_pmd)) {
- ret = wp_huge_pmd(&vmf, orig_pmd);
- if (!(ret & VM_FAULT_FALLBACK))
+ if (vmf.flags & FAULT_FLAG_WRITE) {
+ if (!pmd_write(orig_pmd)) {
+ ret = wp_huge_pmd(&vmf, orig_pmd);
+ if (ret == VM_FAULT_FALLBACK)
+ goto pte_fault;
return ret;
- } else {
- huge_pmd_set_accessed(&vmf, orig_pmd);
- return 0;
+ }
}
+
+ huge_pmd_set_accessed(&vmf, orig_pmd);
+ return 0;
}
}
-
+pte_fault:
return handle_pte_fault(&vmf);
}
--
2.7.4
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Minchan Kim <minchan@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Minchan Kim <minchan@kernel.org>,
Jason Evans <je@fb.com>,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
Will Deacon <will.deacon@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
"[4.5+]" <stable@vger.kernel.org>,
Andreas Schwab <schwab@suse.de>
Subject: [PATCH] mm: pmd dirty emulation in page fault handler
Date: Thu, 22 Dec 2016 08:48:21 +0900 [thread overview]
Message-ID: <1482364101-16204-1-git-send-email-minchan@kernel.org> (raw)
Message-ID: <20161221234821.0Gj-Lh1KdDJEdE8z-ucNnf5o1xqWJa_ZDVsPZa4Oqwc@z> (raw)
Andreas reported [1] made a test in jemalloc hang in THP mode in arm64.
http://lkml.kernel.org/r/mvmmvfy37g1.fsf@hawking.suse.de
The problem is page fault handler supports only accessed flag emulation
for THP page of SW-dirty/accessed architecture.
This patch enables dirty-bit emulation for those architectures.
Without it, MADV_FREE makes application hang by repeated fault forever.
[1] mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called
Cc: Jason Evans <je@fb.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: <stable@vger.kernel.org> [4.5+]
Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
Reported-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/huge_memory.c | 6 ++++--
mm/memory.c | 18 ++++++++++--------
2 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 10eedbf..29ec8a4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -883,15 +883,17 @@ void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd)
{
pmd_t entry;
unsigned long haddr;
+ bool write = vmf->flags & FAULT_FLAG_WRITE;
vmf->ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd);
if (unlikely(!pmd_same(*vmf->pmd, orig_pmd)))
goto unlock;
entry = pmd_mkyoung(orig_pmd);
+ if (write)
+ entry = pmd_mkdirty(entry);
haddr = vmf->address & HPAGE_PMD_MASK;
- if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry,
- vmf->flags & FAULT_FLAG_WRITE))
+ if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, write))
update_mmu_cache_pmd(vmf->vma, vmf->address, vmf->pmd);
unlock:
diff --git a/mm/memory.c b/mm/memory.c
index 36c774f..7408ddc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3637,18 +3637,20 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
return do_huge_pmd_numa_page(&vmf, orig_pmd);
- if ((vmf.flags & FAULT_FLAG_WRITE) &&
- !pmd_write(orig_pmd)) {
- ret = wp_huge_pmd(&vmf, orig_pmd);
- if (!(ret & VM_FAULT_FALLBACK))
+ if (vmf.flags & FAULT_FLAG_WRITE) {
+ if (!pmd_write(orig_pmd)) {
+ ret = wp_huge_pmd(&vmf, orig_pmd);
+ if (ret == VM_FAULT_FALLBACK)
+ goto pte_fault;
return ret;
- } else {
- huge_pmd_set_accessed(&vmf, orig_pmd);
- return 0;
+ }
}
+
+ huge_pmd_set_accessed(&vmf, orig_pmd);
+ return 0;
}
}
-
+pte_fault:
return handle_pte_fault(&vmf);
}
--
2.7.4
WARNING: multiple messages have this Message-ID (diff)
From: minchan@kernel.org (Minchan Kim)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] mm: pmd dirty emulation in page fault handler
Date: Thu, 22 Dec 2016 08:48:21 +0900 [thread overview]
Message-ID: <1482364101-16204-1-git-send-email-minchan@kernel.org> (raw)
Andreas reported [1] made a test in jemalloc hang in THP mode in arm64.
http://lkml.kernel.org/r/mvmmvfy37g1.fsf at hawking.suse.de
The problem is page fault handler supports only accessed flag emulation
for THP page of SW-dirty/accessed architecture.
This patch enables dirty-bit emulation for those architectures.
Without it, MADV_FREE makes application hang by repeated fault forever.
[1] mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called
Cc: Jason Evans <je@fb.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arch at vger.kernel.org
Cc: linux-arm-kernel at lists.infradead.org
Cc: <stable@vger.kernel.org> [4.5+]
Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
Reported-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/huge_memory.c | 6 ++++--
mm/memory.c | 18 ++++++++++--------
2 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 10eedbf..29ec8a4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -883,15 +883,17 @@ void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd)
{
pmd_t entry;
unsigned long haddr;
+ bool write = vmf->flags & FAULT_FLAG_WRITE;
vmf->ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd);
if (unlikely(!pmd_same(*vmf->pmd, orig_pmd)))
goto unlock;
entry = pmd_mkyoung(orig_pmd);
+ if (write)
+ entry = pmd_mkdirty(entry);
haddr = vmf->address & HPAGE_PMD_MASK;
- if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry,
- vmf->flags & FAULT_FLAG_WRITE))
+ if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, write))
update_mmu_cache_pmd(vmf->vma, vmf->address, vmf->pmd);
unlock:
diff --git a/mm/memory.c b/mm/memory.c
index 36c774f..7408ddc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3637,18 +3637,20 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
return do_huge_pmd_numa_page(&vmf, orig_pmd);
- if ((vmf.flags & FAULT_FLAG_WRITE) &&
- !pmd_write(orig_pmd)) {
- ret = wp_huge_pmd(&vmf, orig_pmd);
- if (!(ret & VM_FAULT_FALLBACK))
+ if (vmf.flags & FAULT_FLAG_WRITE) {
+ if (!pmd_write(orig_pmd)) {
+ ret = wp_huge_pmd(&vmf, orig_pmd);
+ if (ret == VM_FAULT_FALLBACK)
+ goto pte_fault;
return ret;
- } else {
- huge_pmd_set_accessed(&vmf, orig_pmd);
- return 0;
+ }
}
+
+ huge_pmd_set_accessed(&vmf, orig_pmd);
+ return 0;
}
}
-
+pte_fault:
return handle_pte_fault(&vmf);
}
--
2.7.4
next reply other threads:[~2016-12-21 23:48 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-21 23:48 Minchan Kim [this message]
2016-12-21 23:48 ` [PATCH] mm: pmd dirty emulation in page fault handler Minchan Kim
2016-12-21 23:48 ` Minchan Kim
2016-12-22 8:17 ` Kirill A. Shutemov
2016-12-22 8:17 ` Kirill A. Shutemov
2016-12-22 8:17 ` Kirill A. Shutemov
2016-12-22 8:17 ` Kirill A. Shutemov
2016-12-22 14:52 ` Minchan Kim
2016-12-22 14:52 ` Minchan Kim
2016-12-22 14:52 ` Minchan Kim
2016-12-22 14:52 ` Minchan Kim
2016-12-22 18:35 ` Kirill A. Shutemov
2016-12-22 18:35 ` Kirill A. Shutemov
2016-12-22 18:35 ` Kirill A. Shutemov
2016-12-22 22:12 ` Andreas Schwab
2016-12-22 22:12 ` Andreas Schwab
2016-12-22 22:12 ` Andreas Schwab
2016-12-22 22:12 ` Andreas Schwab
2016-12-23 9:17 ` Michal Hocko
2016-12-23 9:17 ` Michal Hocko
2016-12-23 9:17 ` Michal Hocko
2016-12-23 9:53 ` Minchan Kim
2016-12-23 9:53 ` Minchan Kim
2016-12-23 9:53 ` Minchan Kim
2016-12-23 9:53 ` Minchan Kim
2016-12-23 11:54 ` Michal Hocko
2016-12-23 11:54 ` Michal Hocko
2016-12-23 11:54 ` Michal Hocko
2016-12-23 14:01 ` Minchan Kim
2016-12-23 14:01 ` Minchan Kim
2016-12-23 14:01 ` Minchan Kim
2016-12-23 14:01 ` Minchan Kim
2016-12-23 14:53 ` Michal Hocko
2016-12-23 14:53 ` Michal Hocko
2016-12-23 14:53 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1482364101-16204-1-git-send-email-minchan@kernel.org \
--to=minchan@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=catalin.marinas@arm.com \
--cc=je@fb.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-mm@kvack.org \
--cc=schwab@suse.de \
--cc=stable@vger.kernel.org \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.