* [PATCH v2 0/2] Change PAT to support mremap use-cases
@ 2015-12-23 0:54 Toshi Kani
2015-12-23 0:54 ` [PATCH v2 1/2] x86/mm/pat: Add untrack_pfn_moved for mremap Toshi Kani
2015-12-23 0:54 ` [PATCH v2 2/2] x86/mm/pat: Change free_memtype() to support shrinking case Toshi Kani
0 siblings, 2 replies; 3+ messages in thread
From: Toshi Kani @ 2015-12-23 0:54 UTC (permalink / raw)
To: tglx, mingo, hpa, bp; +Cc: stsp, x86, linux-mm, linux-kernel
This patch-set fixes two issues found in PAT when mremap() is used on
a VM_PFNMAP range.
Patch 1/2 fixes an issue for mremap() with MREMAP_FIXED, which moves
a pfnmap from old vma to new vma. untrack_pfn_moved() is added to
clear VM_PAT from old vma.
Patch 2/2 fixes an issue for mremap() with a shrinking map size.
The free_memtype() path is changed to support this mremap case in
addition to the regular munmap case.
Note, using mremap() with an expanded map size is not a valid case
since VM_DONTEXPAND is set along with VM_PFNMAP.
v2:
- Add an explicit call in the mremap code which clears the PAT flag.
(Thomas Gleixner)
- Add comment to explain how memtype_rb_match() is used for shrinking
case. (Thomas Gleixner)
---
Toshi Kani (2):
1/2 x86/mm/pat: Add untrack_pfn_moved for mremap
2/2 x86/mm/pat: Change free_memtype() to support shrinking case
---
arch/x86/mm/pat.c | 12 +++++++++-
arch/x86/mm/pat_rbtree.c | 52 +++++++++++++++++++++++++++++++++++--------
include/asm-generic/pgtable.h | 10 ++++++++-
mm/mremap.c | 4 ++++
4 files changed, 67 insertions(+), 11 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH v2 1/2] x86/mm/pat: Add untrack_pfn_moved for mremap
2015-12-23 0:54 [PATCH v2 0/2] Change PAT to support mremap use-cases Toshi Kani
@ 2015-12-23 0:54 ` Toshi Kani
2015-12-23 0:54 ` [PATCH v2 2/2] x86/mm/pat: Change free_memtype() to support shrinking case Toshi Kani
1 sibling, 0 replies; 3+ messages in thread
From: Toshi Kani @ 2015-12-23 0:54 UTC (permalink / raw)
To: tglx, mingo, hpa, bp
Cc: stsp, x86, linux-mm, linux-kernel, Borislav Petkov, Toshi Kani
mremap() with MREMAP_FIXED on a VM_PFNMAP range causes the following
WARN_ON_ONCE() message in untrack_pfn().
WARNING: CPU: 1 PID: 3493 at arch/x86/mm/pat.c:985 untrack_pfn+0xbd/0xd0()
Call Trace:
[<ffffffff817729ea>] dump_stack+0x45/0x57
[<ffffffff8109e4b6>] warn_slowpath_common+0x86/0xc0
[<ffffffff8109e5ea>] warn_slowpath_null+0x1a/0x20
[<ffffffff8106a88d>] untrack_pfn+0xbd/0xd0
[<ffffffff811d2d5e>] unmap_single_vma+0x80e/0x860
[<ffffffff811d3725>] unmap_vmas+0x55/0xb0
[<ffffffff811d916c>] unmap_region+0xac/0x120
[<ffffffff811db86a>] do_munmap+0x28a/0x460
[<ffffffff811dec33>] move_vma+0x1b3/0x2e0
[<ffffffff811df113>] SyS_mremap+0x3b3/0x510
[<ffffffff817793ee>] entry_SYSCALL_64_fastpath+0x12/0x71
MREMAP_FIXED moves a pfnmap from old vma to new vma. untrack_pfn() is
called with the old vma after its pfnmap page table has been removed,
which causes follow_phys() to fail. The new vma has a new pfnmap to
the same pfn & cache type with VM_PAT set. Therefore, we only need to
clear VM_PAT from the old vma in this case.
Add untrack_pfn_moved(), which clears VM_PAT from a given old vma.
move_vma() is changed to call this function with the old vma when
VM_PFNMAP is set. move_vma() then calls do_munmap(), and untrack_pfn()
is a no-op since VM_PAT is cleared.
Link: http://lkml.kernel.org/r/<1446072663.20657.150.camel@hpe.com>
Reported-by: Stas Sergeev <stsp@list.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
---
arch/x86/mm/pat.c | 10 ++++++++++
include/asm-generic/pgtable.h | 10 +++++++++-
mm/mremap.c | 4 ++++
3 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 188e3e0..1aca073 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -992,6 +992,16 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
vma->vm_flags &= ~VM_PAT;
}
+/*
+ * untrack_pfn_moved is called, while mremapping a pfnmap for a new region,
+ * with the old vma after its pfnmap page table has been removed. The new
+ * vma has a new pfnmap to the same pfn & cache type with VM_PAT set.
+ */
+void untrack_pfn_moved(struct vm_area_struct *vma)
+{
+ vma->vm_flags &= ~VM_PAT;
+}
+
pgprot_t pgprot_writecombine(pgprot_t prot)
{
return __pgprot(pgprot_val(prot) |
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 14b0ff32..3a6803c 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -569,7 +569,7 @@ static inline int track_pfn_copy(struct vm_area_struct *vma)
}
/*
- * untrack_pfn_vma is called while unmapping a pfnmap for a region.
+ * untrack_pfn is called while unmapping a pfnmap for a region.
* untrack can be called for a specific region indicated by pfn and size or
* can be for the entire vma (in which case pfn, size are zero).
*/
@@ -577,6 +577,13 @@ static inline void untrack_pfn(struct vm_area_struct *vma,
unsigned long pfn, unsigned long size)
{
}
+
+/*
+ * untrack_pfn_moved is called while mremapping a pfnmap for a new region.
+ */
+static inline void untrack_pfn_moved(struct vm_area_struct *vma)
+{
+}
#else
extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
unsigned long pfn, unsigned long addr,
@@ -586,6 +593,7 @@ extern int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
extern int track_pfn_copy(struct vm_area_struct *vma);
extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
unsigned long size);
+extern void untrack_pfn_moved(struct vm_area_struct *vma);
#endif
#ifdef __HAVE_COLOR_ZERO_PAGE
diff --git a/mm/mremap.c b/mm/mremap.c
index c25bc62..de824e7 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -319,6 +319,10 @@ static unsigned long move_vma(struct vm_area_struct *vma,
hiwater_vm = mm->hiwater_vm;
vm_stat_account(mm, vma->vm_flags, vma->vm_file, new_len>>PAGE_SHIFT);
+ /* Tell pfnmap has moved from this vma */
+ if (unlikely(vma->vm_flags & VM_PFNMAP))
+ untrack_pfn_moved(vma);
+
if (do_munmap(mm, old_addr, old_len) < 0) {
/* OOM: unable to split vma, just get accounts right */
vm_unacct_memory(excess >> PAGE_SHIFT);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH v2 2/2] x86/mm/pat: Change free_memtype() to support shrinking case
2015-12-23 0:54 [PATCH v2 0/2] Change PAT to support mremap use-cases Toshi Kani
2015-12-23 0:54 ` [PATCH v2 1/2] x86/mm/pat: Add untrack_pfn_moved for mremap Toshi Kani
@ 2015-12-23 0:54 ` Toshi Kani
1 sibling, 0 replies; 3+ messages in thread
From: Toshi Kani @ 2015-12-23 0:54 UTC (permalink / raw)
To: tglx, mingo, hpa, bp
Cc: stsp, x86, linux-mm, linux-kernel, Borislav Petkov, Toshi Kani
Using mremap() to shrink the map size of a VM_PFNMAP range causes
the following error message, and leaves the pfn range allocated.
x86/PAT: test:3493 freeing invalid memtype [mem 0x483200000-0x4863fffff]
This is because rbt_memtype_erase(), called from free_memtype()
with spin_lock held, only supports to free a whole memtype node in
memtype_rbroot. Therefore, this patch changes rbt_memtype_erase()
to support a request that shrinks the size of a memtype node for
mremap().
memtype_rb_exact_match() is renamed to memtype_rb_match(), and
is enhanced to support EXACT_MATCH and END_MATH in @match_type.
Since the memtype_rbroot tree allows overlapping ranges,
rbt_memtype_erase() checks with EXACT_MATCH first, i.e. free
a whole node for the munmap case. If no such entry is found,
it then checks with END_MATCH, i.e. shrink the size of a node
from the end for the mremap case.
On the mremap case, rbt_memtype_erase() proceeds in two steps,
1) remove the node, and then 2) insert the updated node. This
allows proper update of augmented values, subtree_max_end, in
the tree.
Link: http://lkml.kernel.org/r/<1446072663.20657.150.camel@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
---
arch/x86/mm/pat.c | 2 +-
arch/x86/mm/pat_rbtree.c | 52 ++++++++++++++++++++++++++++++++++++++--------
2 files changed, 44 insertions(+), 10 deletions(-)
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 1aca073..031782e 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -586,7 +586,7 @@ int free_memtype(u64 start, u64 end)
entry = rbt_memtype_erase(start, end);
spin_unlock(&memtype_lock);
- if (!entry) {
+ if (IS_ERR(entry)) {
pr_info("x86/PAT: %s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n",
current->comm, current->pid, start, end - 1);
return -EINVAL;
diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index 6393108..2f77022 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -98,8 +98,13 @@ static struct memtype *memtype_rb_lowest_match(struct rb_root *root,
return last_lower; /* Returns NULL if there is no overlap */
}
-static struct memtype *memtype_rb_exact_match(struct rb_root *root,
- u64 start, u64 end)
+enum {
+ MEMTYPE_EXACT_MATCH = 0,
+ MEMTYPE_END_MATCH = 1
+};
+
+static struct memtype *memtype_rb_match(struct rb_root *root,
+ u64 start, u64 end, int match_type)
{
struct memtype *match;
@@ -107,7 +112,12 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root,
while (match != NULL && match->start < end) {
struct rb_node *node;
- if (match->start == start && match->end == end)
+ if ((match_type == MEMTYPE_EXACT_MATCH) &&
+ (match->start == start) && (match->end == end))
+ return match;
+
+ if ((match_type == MEMTYPE_END_MATCH) &&
+ (match->start < start) && (match->end == end))
return match;
node = rb_next(&match->rb);
@@ -117,7 +127,7 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root,
match = NULL;
}
- return NULL; /* Returns NULL if there is no exact match */
+ return NULL; /* Returns NULL if there is no match */
}
static int memtype_rb_check_conflict(struct rb_root *root,
@@ -210,12 +220,36 @@ struct memtype *rbt_memtype_erase(u64 start, u64 end)
{
struct memtype *data;
- data = memtype_rb_exact_match(&memtype_rbroot, start, end);
- if (!data)
- goto out;
+ /*
+ * Since the memtype_rbroot tree allows overlapping ranges,
+ * rbt_memtype_erase() checks with EXACT_MATCH first, i.e. free
+ * a whole node for the munmap case. If no such entry is found,
+ * it then checks with END_MATCH, i.e. shrink the size of a node
+ * from the end for the mremap case.
+ */
+ data = memtype_rb_match(&memtype_rbroot, start, end,
+ MEMTYPE_EXACT_MATCH);
+ if (!data) {
+ data = memtype_rb_match(&memtype_rbroot, start, end,
+ MEMTYPE_END_MATCH);
+ if (!data)
+ return ERR_PTR(-EINVAL);
+ }
+
+ if (data->start == start) {
+ /* munmap: erase this node */
+ rb_erase_augmented(&data->rb, &memtype_rbroot,
+ &memtype_rb_augment_cb);
+ } else {
+ /* mremap: update the end value of this node */
+ rb_erase_augmented(&data->rb, &memtype_rbroot,
+ &memtype_rb_augment_cb);
+ data->end = start;
+ data->subtree_max_end = data->end;
+ memtype_rb_insert(&memtype_rbroot, data);
+ return NULL;
+ }
- rb_erase_augmented(&data->rb, &memtype_rbroot, &memtype_rb_augment_cb);
-out:
return data;
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2015-12-23 0:54 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-23 0:54 [PATCH v2 0/2] Change PAT to support mremap use-cases Toshi Kani
2015-12-23 0:54 ` [PATCH v2 1/2] x86/mm/pat: Add untrack_pfn_moved for mremap Toshi Kani
2015-12-23 0:54 ` [PATCH v2 2/2] x86/mm/pat: Change free_memtype() to support shrinking case Toshi Kani
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).