* mm: NULL ptr deref in migrate_page_move_mapping
@ 2014-09-22 15:30 Sasha Levin
2014-09-22 23:04 ` Hugh Dickins
0 siblings, 1 reply; 5+ messages in thread
From: Sasha Levin @ 2014-09-22 15:30 UTC (permalink / raw)
To: linux-mm@kvack.org
Cc: Andrew Morton, Konstantin Khlebnikov, Johannes Weiner,
Hugh Dickins, Naoya Horiguchi, Mel Gorman
Hi all,
While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel, I've stumbled on the following spew:
[ 5028.149987] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 5028.151644] IP: migrate_page_move_mapping (mm/migrate.c:358)
[ 5028.152912] PGD 562e72067 PUD 549f93067 PMD 0
[ 5028.158109] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 5028.159419] Dumping ftrace buffer:
[ 5028.159555] (ftrace buffer empty)
[ 5028.159555] Modules linked in:
[ 5028.159555] CPU: 14 PID: 19517 Comm: trinity-main Not tainted 3.17.0-rc5-next-20140919-sasha-00031-gc150a84 #1208
[ 5028.159555] task: ffff8802ca820000 ti: ffff880526dc8000 task.ti: ffff880526dc8000
[ 5028.159555] RIP: migrate_page_move_mapping (mm/migrate.c:358)
[ 5028.159555] RSP: 0000:ffff880526dcb708 EFLAGS: 00010002
[ 5028.159555] RAX: 0000000000000001 RBX: ffffea0015926e00 RCX: 0000000000000001
[ 5028.159555] RDX: 0000000000000001 RSI: ffff880047f570d0 RDI: 0000000000000082
[ 5028.159555] RBP: ffff880526dcb758 R08: 0000000000000038 R09: 0000000000000001
[ 5028.159555] R10: 0000000000000038 R11: 00000000000003be R12: ffff880047f570a0
[ 5028.159555] R13: 0000000000000002 R14: ffffea0014790fc0 R15: 0000000000000000
[ 5028.159555] FS: 00007f93eff4a700(0000) GS:ffff880488800000(0000) knlGS:0000000000000000
[ 5028.159555] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 5028.159555] CR2: 0000000000000000 CR3: 0000000525f41000 CR4: 00000000000006a0
[ 5028.159555] DR0: 00000000006f0000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5028.159555] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 5028.159555] Stack:
[ 5028.159555] ffff880526dcb718 0000000000000000 0000000026dcb748 ffff880047f570b8
[ 5028.159555] ffffea0014790fc0 ffffea0014790fc0 ffffea0015926e00 0000000000000001
[ 5028.159555] ffff880047f570a0 ffffea0014790fc0 ffff880526dcb788 ffffffffa530bcfb
[ 5028.159555] Call Trace:
[ 5028.159555] migrate_page (mm/migrate.c:601)
[ 5028.159555] move_to_new_page (mm/migrate.c:775)
[ 5028.159555] ? try_to_unmap (mm/rmap.c:1527)
[ 5028.159555] ? try_to_unmap_nonlinear (mm/rmap.c:1124)
[ 5028.159555] ? invalid_migration_vma (mm/rmap.c:1483)
[ 5028.159555] ? page_remove_rmap (mm/rmap.c:1391)
[ 5028.159555] ? __put_anon_vma (mm/rmap.c:448)
[ 5028.159555] migrate_pages (mm/migrate.c:904 mm/migrate.c:941 mm/migrate.c:1122)
[ 5028.159555] ? isolate_freepages_block (mm/compaction.c:918)
[ 5028.159555] ? arch_local_save_flags (./arch/x86/include/asm/paravirt.h:819)
[ 5028.159555] compact_zone (mm/compaction.c:1209)
[ 5028.159555] compact_zone_order (mm/compaction.c:1256)
[ 5028.159555] try_to_compact_pages (mm/compaction.c:1323)
[ 5028.159555] __alloc_pages_direct_compact (./arch/x86/include/asm/current.h:14 mm/page_alloc.c:2313)
[ 5028.159555] __alloc_pages_nodemask (mm/page_alloc.c:2653 mm/page_alloc.c:2838)
[ 5028.159555] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 5028.159555] alloc_pages_vma (include/linux/mempolicy.h:76 include/linux/mempolicy.h:81 mm/mempolicy.c:2036)
[ 5028.159555] ? do_huge_pmd_wp_page (mm/huge_memory.c:774 mm/huge_memory.c:1123)
[ 5028.159555] do_huge_pmd_wp_page (mm/huge_memory.c:774 mm/huge_memory.c:1123)
[ 5028.159555] ? __mem_cgroup_count_vm_event (include/linux/rcupdate.h:423 include/linux/rcupdate.h:918 mm/memcontrol.c:1306)
[ 5028.159555] ? __mem_cgroup_count_vm_event (mm/memcontrol.c:1307)
[ 5028.159555] ? __mem_cgroup_count_vm_event (mm/memcontrol.c:1287)
[ 5028.159555] handle_mm_fault (mm/memory.c:3312 mm/memory.c:3370)
[ 5028.159555] ? __lock_is_held (kernel/locking/lockdep.c:3518)
[ 5028.159555] __do_page_fault (arch/x86/mm/fault.c:1249)
[ 5028.159555] ? vtime_account_user (kernel/sched/cputime.c:691)
[ 5028.159555] ? preempt_count_sub (kernel/sched/core.c:2634)
[ 5028.159555] ? context_tracking_user_exit (kernel/context_tracking.c:184)
[ 5028.159555] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 5028.159555] trace_do_page_fault (arch/x86/mm/fault.c:1332 include/linux/jump_label.h:114 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1333)
[ 5028.159555] do_async_page_fault (arch/x86/kernel/kvm.c:280)
[ 5028.159555] async_page_fault (arch/x86/kernel/entry_64.S:1301)
[ 5028.159555] Code: 0f 85 00 02 00 00 4c 89 f0 8b 40 1c 41 39 c5 0f 85 31 01 00 00 e8 2c 03 ed ff 85 c0 74 0d 80 3d a4 0d cc 05 00 0f 84 8b 01 00 00 <4d> 3b 37 0f 85 12 01 00 00 44 89 e8 31 d2 f0 41 0f b1 56 1c 41
All code
========
0: 0f 85 00 02 00 00 jne 0x206
6: 4c 89 f0 mov %r14,%rax
9: 8b 40 1c mov 0x1c(%rax),%eax
c: 41 39 c5 cmp %eax,%r13d
f: 0f 85 31 01 00 00 jne 0x146
15: e8 2c 03 ed ff callq 0xffffffffffed0346
1a: 85 c0 test %eax,%eax
1c: 74 0d je 0x2b
1e: 80 3d a4 0d cc 05 00 cmpb $0x0,0x5cc0da4(%rip) # 0x5cc0dc9
25: 0f 84 8b 01 00 00 je 0x1b6
2b:* 4d 3b 37 cmp (%r15),%r14 <-- trapping instruction
2e: 0f 85 12 01 00 00 jne 0x146
34: 44 89 e8 mov %r13d,%eax
37: 31 d2 xor %edx,%edx
39: f0 41 0f b1 56 1c lock cmpxchg %edx,0x1c(%r14)
3f: 41 rex.B
...
Code starting with the faulting instruction
===========================================
0: 4d 3b 37 cmp (%r15),%r14
3: 0f 85 12 01 00 00 jne 0x11b
9: 44 89 e8 mov %r13d,%eax
c: 31 d2 xor %edx,%edx
e: f0 41 0f b1 56 1c lock cmpxchg %edx,0x1c(%r14)
14: 41 rex.B
...
[ 5028.159555] RIP migrate_page_move_mapping (mm/migrate.c:358)
[ 5028.159555] RSP <ffff880526dcb708>
[ 5028.159555] CR2: 0000000000000000
Codewise, it seems pretty straightforward:
int migrate_page_move_mapping(struct address_space *mapping,
struct page *newpage, struct page *page,
struct buffer_head *head, enum migrate_mode mode,
int extra_count)
{
int expected_count = 1 + extra_count;
void **pslot;
if (!mapping) {
/* Anonymous page without mapping */
if (page_count(page) != expected_count)
return -EAGAIN;
return MIGRATEPAGE_SUCCESS;
}
spin_lock_irq(&mapping->tree_lock);
pslot = radix_tree_lookup_slot(&mapping->page_tree,
page_index(page)); <==== Returned NULL
expected_count += 1 + page_has_private(page);
if (page_count(page) != expected_count ||
radix_tree_deref_slot_protected(pslot, &mapping->tree_lock) != page) { <==== Dereferenced that NULL
spin_unlock_irq(&mapping->tree_lock);
return -EAGAIN;
}
I don't think it's just a missing '!= NULL' check but I'm not sure what went wrong.
Thanks,
Sasha
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: mm: NULL ptr deref in migrate_page_move_mapping
2014-09-22 15:30 mm: NULL ptr deref in migrate_page_move_mapping Sasha Levin
@ 2014-09-22 23:04 ` Hugh Dickins
2014-09-22 23:49 ` Sasha Levin
2014-09-27 3:02 ` Sasha Levin
0 siblings, 2 replies; 5+ messages in thread
From: Hugh Dickins @ 2014-09-22 23:04 UTC (permalink / raw)
To: Sasha Levin
Cc: linux-mm@kvack.org, Andrew Morton, Konstantin Khlebnikov,
Johannes Weiner, Hugh Dickins, Naoya Horiguchi, Mel Gorman
On Mon, 22 Sep 2014, Sasha Levin wrote:
>
> int migrate_page_move_mapping(struct address_space *mapping,
> struct page *newpage, struct page *page,
> struct buffer_head *head, enum migrate_mode mode,
> int extra_count)
> {
> int expected_count = 1 + extra_count;
> void **pslot;
>
> if (!mapping) {
> /* Anonymous page without mapping */
> if (page_count(page) != expected_count)
> return -EAGAIN;
> return MIGRATEPAGE_SUCCESS;
> }
>
> spin_lock_irq(&mapping->tree_lock);
>
> pslot = radix_tree_lookup_slot(&mapping->page_tree,
> page_index(page)); <==== Returned NULL
>
> expected_count += 1 + page_has_private(page);
> if (page_count(page) != expected_count ||
> radix_tree_deref_slot_protected(pslot, &mapping->tree_lock) != page) { <==== Dereferenced that NULL
> spin_unlock_irq(&mapping->tree_lock);
> return -EAGAIN;
> }
>
> I don't think it's just a missing '!= NULL' check
I agree: we have had this page locked since before the
mapping = page_mapping(page), so it ought to be in its radix_tree.
Though if we believe that argument, then am I not implying that the
"radix_blah() != page" check is redundant? Hmm, perhaps someone can
see why it is needed, in which case that might give a hint on the crash.
But my suspicion is that it's just for safety: it corresponds to the
original "*radix_pointer != page" check in the first mm/migrate.c in
2.6.17, which may be there just so as not to rely so heavily on mm
locking protocols enforced elsewhere.
> but I'm not sure what went wrong.
Most likely would be a zeroing of the radix_tree node, just as you
were experiencing zeroing of other mm structures in earlier weeks.
Not that I've got any suggestions on where to take it from there.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: mm: NULL ptr deref in migrate_page_move_mapping
2014-09-22 23:04 ` Hugh Dickins
@ 2014-09-22 23:49 ` Sasha Levin
2014-09-27 3:02 ` Sasha Levin
1 sibling, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2014-09-22 23:49 UTC (permalink / raw)
To: Hugh Dickins
Cc: linux-mm@kvack.org, Andrew Morton, Konstantin Khlebnikov,
Johannes Weiner, Naoya Horiguchi, Mel Gorman
Hi Hugh,
On 09/22/2014 07:04 PM, Hugh Dickins wrote:
>> but I'm not sure what went wrong.
> Most likely would be a zeroing of the radix_tree node, just as you
> were experiencing zeroing of other mm structures in earlier weeks.
>
> Not that I've got any suggestions on where to take it from there.
I've actually mailed this one because I thought that the root reason
isn't the same as the corruption before.
Previously, the radix tree itself would get corrupted, so we'd deref
a NULL ptr inside the radix tree functions. In this case, it seems
that if it was indeed a corruption, it affected the objects that are
held in the tree rather than the tree itself.
If you suspect it's a corruption specific to the tree, how about:
diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 33170db..9dc19d9 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -27,6 +27,8 @@
#include <linux/kernel.h>
#include <linux/rcupdate.h>
+#define RADIX_POISON 0x8BADBEEF
+
/*
* An indirect pointer (root->rnode pointing to a radix_tree_node, rather
* than a data item) is signalled by the low bit set in the root->rnode
@@ -85,6 +87,7 @@ static inline int radix_tree_is_indirect_ptr(void *ptr)
#define RADIX_TREE_COUNT_MASK ((1UL << RADIX_TREE_COUNT_SHIFT) - 1)
struct radix_tree_node {
+ unsigned int poison_start;
unsigned int path; /* Offset in parent & height from the bottom */
unsigned int count;
union {
@@ -101,19 +104,24 @@ struct radix_tree_node {
struct list_head private_list;
void __rcu *slots[RADIX_TREE_MAP_SIZE];
unsigned long tags[RADIX_TREE_MAX_TAGS][RADIX_TREE_TAG_LONGS];
+ unsigned int poison_end;
};
/* root tags are stored in gfp_mask, shifted by __GFP_BITS_SHIFT */
struct radix_tree_root {
+ unsigned int poison_start;
unsigned int height;
gfp_t gfp_mask;
struct radix_tree_node __rcu *rnode;
+ unsigned int poison_end;
};
#define RADIX_TREE_INIT(mask) { \
.height = 0, \
.gfp_mask = (mask), \
.rnode = NULL, \
+ .poison_start = RADIX_POISON, \
+ .poison_end = RADIX_POISON, \
}
#define RADIX_TREE(name, mask) \
@@ -124,6 +132,8 @@ do { \
(root)->height = 0; \
(root)->gfp_mask = (mask); \
(root)->rnode = NULL; \
+ (root)->poison_start = RADIX_POISON; \
+ (root)->poison_end = RADIX_POISON; \
} while (0)
/**
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 3291a8e..5ef8f52 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -230,7 +230,7 @@ static void radix_tree_node_rcu_free(struct rcu_head *head)
node->slots[0] = NULL;
node->count = 0;
-
+ BUG_ON(node->poison_start != RADIX_POISON || node->poison_end != RADIX_POISON);
kmem_cache_free(radix_tree_node_cachep, node);
}
@@ -460,6 +460,7 @@ int radix_tree_insert(struct radix_tree_root *root,
node->count++;
BUG_ON(tag_get(node, 0, index & RADIX_TREE_MAP_MASK));
BUG_ON(tag_get(node, 1, index & RADIX_TREE_MAP_MASK));
+ BUG_ON(node->poison_start != RADIX_POISON || node->poison_end != RADIX_POISON);
} else {
BUG_ON(root_tag_get(root, 0));
BUG_ON(root_tag_get(root, 1));
@@ -489,11 +490,11 @@ void *__radix_tree_lookup(struct radix_tree_root *root, unsigned long index,
struct radix_tree_node *node, *parent;
unsigned int height, shift;
void **slot;
-
+ BUG_ON(root->poison_start != RADIX_POISON || root->poison_end != RADIX_POISON);
node = rcu_dereference_raw(root->rnode);
if (node == NULL)
return NULL;
-
+ BUG_ON(node->poison_start != RADIX_POISON || node->poison_end != RADIX_POISON);
if (!radix_tree_is_indirect_ptr(node)) {
if (index > 0)
return NULL;
@@ -518,7 +519,7 @@ void *__radix_tree_lookup(struct radix_tree_root *root, unsigned long index,
node = rcu_dereference_raw(*slot);
if (node == NULL)
return NULL;
-
+ BUG_ON(node->poison_start != RADIX_POISON || node->poison_end != RADIX_POISON);
shift -= RADIX_TREE_MAP_SHIFT;
height--;
} while (height > 0);
I'll run with it for the night.
Thanks,
Sasha
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: mm: NULL ptr deref in migrate_page_move_mapping
2014-09-22 23:04 ` Hugh Dickins
2014-09-22 23:49 ` Sasha Levin
@ 2014-09-27 3:02 ` Sasha Levin
2014-09-27 7:01 ` Andrey Ryabinin
1 sibling, 1 reply; 5+ messages in thread
From: Sasha Levin @ 2014-09-27 3:02 UTC (permalink / raw)
To: Hugh Dickins
Cc: linux-mm@kvack.org, Andrew Morton, Konstantin Khlebnikov,
Johannes Weiner, Naoya Horiguchi, Mel Gorman, Andrey Ryabinin
On 09/22/2014 07:04 PM, Hugh Dickins wrote:
>> but I'm not sure what went wrong.
> Most likely would be a zeroing of the radix_tree node, just as you
> were experiencing zeroing of other mm structures in earlier weeks.
>
> Not that I've got any suggestions on where to take it from there.
I've added poisoning to a few mm related structures, and managed to
confirm that the issue here is indeed corruption rather than something
specific with the given structures.
Right now I'm looking into making KASan (Cc Andrey) to mark the poison
bytes somehow so it would trigger an error on access, that way we'll
know what's corruption them.
Andrey, since it takes a while to trigger this corruption, could you
confirm that if I kasan_poison_shadow() a few bytes I will get a KASan
report on any read/write to them?
Thanks,
Sasha
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: mm: NULL ptr deref in migrate_page_move_mapping
2014-09-27 3:02 ` Sasha Levin
@ 2014-09-27 7:01 ` Andrey Ryabinin
0 siblings, 0 replies; 5+ messages in thread
From: Andrey Ryabinin @ 2014-09-27 7:01 UTC (permalink / raw)
To: Sasha Levin
Cc: Hugh Dickins, linux-mm@kvack.org, Andrew Morton,
Konstantin Khlebnikov, Johannes Weiner, Naoya Horiguchi,
Mel Gorman, Andrey Ryabinin
2014-09-27 7:02 GMT+04:00 Sasha Levin <sasha.levin@oracle.com>:
> On 09/22/2014 07:04 PM, Hugh Dickins wrote:
>>> but I'm not sure what went wrong.
>> Most likely would be a zeroing of the radix_tree node, just as you
>> were experiencing zeroing of other mm structures in earlier weeks.
>>
>> Not that I've got any suggestions on where to take it from there.
>
> I've added poisoning to a few mm related structures, and managed to
> confirm that the issue here is indeed corruption rather than something
> specific with the given structures.
>
> Right now I'm looking into making KASan (Cc Andrey) to mark the poison
> bytes somehow so it would trigger an error on access, that way we'll
> know what's corruption them.
>
> Andrey, since it takes a while to trigger this corruption, could you
> confirm that if I kasan_poison_shadow() a few bytes I will get a KASan
> report on any read/write to them?
>
That's right. Note that poison value has to be negative.
Address and size of poisoned area has to be aligned to 8 bytes.
--
Best regards,
Andrey Ryabinin
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-09-27 7:01 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-22 15:30 mm: NULL ptr deref in migrate_page_move_mapping Sasha Levin
2014-09-22 23:04 ` Hugh Dickins
2014-09-22 23:49 ` Sasha Levin
2014-09-27 3:02 ` Sasha Levin
2014-09-27 7:01 ` Andrey Ryabinin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).