mm: derefing NULL vma->vm_mm when unmapping

* mm: derefing NULL vma->vm_mm when unmapping
@ 2014-06-30 13:49 Sasha Levin
  2014-06-30 22:07 ` Andrew Morton
From: Sasha Levin @ 2014-06-30 13:49 UTC
  To: linux-mm@kvack.org; +Cc: Andrew Morton, Dave Jones, LKML

Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel I've stumbled on the following spew:

[  761.704089] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  761.704089] IP: mm_find_pmd (mm/rmap.c:570)
[  761.704089] PGD 51223067 PUD 50a09067 PMD 0
[  761.704089] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[  761.704089] Dumping ftrace buffer:
[  761.704089]    (ftrace buffer empty)
[  761.704089] Modules linked in:
[  761.704089] CPU: 4 PID: 20723 Comm: trinity-c131 Tainted: G        W      3.16.0-rc3-next-20140630-sasha-00023-g44434d4-dirty #756
[  761.704089] task: ffff88004e3c0000 ti: ffff88004e0b8000 task.ti: ffff88004e0b8000
[  761.704089] RIP: mm_find_pmd (mm/rmap.c:570)
[  761.704089] RSP: 0000:ffff88004e0bbaa8  EFLAGS: 00010246
[  761.704089] RAX: 0000000000000000 RBX: 0000000000a65000 RCX: ffff88004e0bbb30
[  761.704089] RDX: 0000000000000000 RSI: 0000000000a65000 RDI: ffff880000146000
[  761.704089] RBP: ffff88004e0bbaa8 R08: 0000000000000000 R09: 0000000000000000
[  761.704089] R10: ffff88004e3c0000 R11: 0000000000000000 R12: ffffea000d766e00
[  761.704089] R13: ffff88004e0bbb30 R14: ffff880000146000 R15: 0000000000000000
[  761.704089] FS:  00007f0293c61700(0000) GS:ffff880144e00000(0000) knlGS:0000000000000000
[  761.704089] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  761.704089] CR2: 0000000000000000 CR3: 000000004e3be000 CR4: 00000000000006a0
[  761.704089] Stack:
[  761.704089]  ffff88004e0bbae8 ffffffff9c2d0815 800000035d9b8805 ffff880000146000
[  761.704089]  ffffea000d766e00 ffff88000b4c4e58 ffff880034d7d200 0000000000000302
[  761.704089]  ffff88004e0bbb68 ffffffff9c2d1491 ffff88004e0bbb28 ffffffff9f57c58a
[  761.704089] Call Trace:
[  761.704089] __page_check_address (mm/rmap.c:618)
[  761.704089] try_to_unmap_one (mm/rmap.c:1133)
[  761.704089] ? down_read (kernel/locking/rwsem.c:45 (discriminator 2))
[  761.704089] ? page_lock_anon_vma_read (./arch/x86/include/asm/atomic.h:118 mm/rmap.c:491)
[  761.704089] ? page_lock_anon_vma_read (mm/rmap.c:448)
[  761.704089] rmap_walk (mm/rmap.c:1634 mm/rmap.c:1705)
[  761.704089] try_to_unmap (mm/rmap.c:1527)
[  761.704089] ? page_remove_rmap (mm/rmap.c:1124)
[  761.704089] ? invalid_migration_vma (mm/rmap.c:1483)
[  761.704089] ? try_to_unmap_one (mm/rmap.c:1391)
[  761.704089] ? anon_vma_prepare (mm/rmap.c:448)
[  761.704089] ? invalid_mkclean_vma (mm/rmap.c:1478)
[  761.704089] ? page_get_anon_vma (mm/rmap.c:405)
[  761.704089] migrate_pages (mm/migrate.c:912 mm/migrate.c:955 mm/migrate.c:1142)
[  761.704089] ? perf_trace_mm_numa_migrate_ratelimit (mm/migrate.c:1590)
[  761.704089] migrate_misplaced_page (mm/migrate.c:1750)
[  761.704089] __handle_mm_fault (mm/memory.c:3162 mm/memory.c:3212 mm/memory.c:3322)
[  761.704089] handle_mm_fault (include/linux/memcontrol.h:124 mm/memory.c:3348)
[  761.704089] ? __do_page_fault (arch/x86/mm/fault.c:1163)
[  761.704089] __do_page_fault (arch/x86/mm/fault.c:1230)
[  761.704089] ? vtime_account_user (kernel/sched/cputime.c:687)
[  761.704089] ? get_parent_ip (kernel/sched/core.c:2550)
[  761.704089] ? context_tracking_user_exit (include/linux/vtime.h:89 include/linux/jump_label.h:115 include/trace/events/context_tracking.h:47 kernel/context_tracking.c:180)
[  761.704089] ? preempt_count_sub (kernel/sched/core.c:2606)
[  761.704089] ? context_tracking_user_exit (kernel/context_tracking.c:184)
[  761.704089] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[  761.704089] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
[  761.704089] trace_do_page_fault (arch/x86/mm/fault.c:1313 include/linux/jump_label.h:115 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1314)
[  761.704089] do_async_page_fault (arch/x86/kernel/kvm.c:264)
[  761.704089] async_page_fault (arch/x86/kernel/entry_64.S:1322)
[ 761.704089] Code: 00 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 0f 1f 44 00 00 66 66 66 66 90 55 48 89 f2 48 8b 47 40 48 c1 ea 27 48 89 e5 81 e2 ff 01 00 00 <48> 8b 3c d0 40 f6 c7 01 75 0c 31 f6 e9 af 00 00 00 0f 1f 44 00
All code
========
   0:	00 48 8b             	add    %cl,-0x75(%rax)
   3:	5d                   	pop    %rbp
   4:	f0 4c 8b 65 f8       	lock mov -0x8(%rbp),%r12
   9:	c9                   	leaveq
   a:	c3                   	retq
   b:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
  11:	66 66 66 66 90       	data32 data32 data32 xchg %ax,%ax
  16:	55                   	push   %rbp
  17:	48 89 f2             	mov    %rsi,%rdx
  1a:	48 8b 47 40          	mov    0x40(%rdi),%rax
  1e:	48 c1 ea 27          	shr    $0x27,%rdx
  22:	48 89 e5             	mov    %rsp,%rbp
  25:	81 e2 ff 01 00 00    	and    $0x1ff,%edx
  2b:*	48 8b 3c d0          	mov    (%rax,%rdx,8),%rdi		<-- trapping instruction
  2f:	40 f6 c7 01          	test   $0x1,%dil
  33:	75 0c                	jne    0x41
  35:	31 f6                	xor    %esi,%esi
  37:	e9 af 00 00 00       	jmpq   0xeb
  3c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)

Code starting with the faulting instruction
===========================================
   0:	48 8b 3c d0          	mov    (%rax,%rdx,8),%rdi
   4:	40 f6 c7 01          	test   $0x1,%dil
   8:	75 0c                	jne    0x16
   a:	31 f6                	xor    %esi,%esi
   c:	e9 af 00 00 00       	jmpq   0xc0
  11:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
[  761.704089] RIP mm_find_pmd (mm/rmap.c:570)
[  761.704089]  RSP <ffff88004e0bbaa8>
[  761.704089] CR2: 0000000000000000

As I didn't see any code changes around that part, I'm thinking that it's a locking
issue that got messed up somewhere rather than a missing '!= NULL' check.


Thanks,
Sasha

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: mm: derefing NULL vma->vm_mm when unmapping
  2014-06-30 13:49 mm: derefing NULL vma->vm_mm when unmapping Sasha Levin
@ 2014-06-30 22:07 ` Andrew Morton
  2014-07-01  0:55   ` Hugh Dickins
From: Andrew Morton @ 2014-06-30 22:07 UTC
  To: Sasha Levin; +Cc: linux-mm@kvack.org, Dave Jones, LKML

On Mon, 30 Jun 2014 09:49:57 -0400 Sasha Levin <levinsasha928@gmail.com> wrote:

> Hi all,
> 
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following spew:
> 
> [  761.704089] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [  761.704089] IP: mm_find_pmd (mm/rmap.c:570)

Does this mean it oopsed in mm_find_pmd()'s call to pgd_offset()?

> [  761.704089] PGD 51223067 PUD 50a09067 PMD 0
> [  761.704089] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [  761.704089] Dumping ftrace buffer:
> [  761.704089]    (ftrace buffer empty)
> [  761.704089] Modules linked in:
> [  761.704089] CPU: 4 PID: 20723 Comm: trinity-c131 Tainted: G        W      3.16.0-rc3-next-20140630-sasha-00023-g44434d4-dirty #756
> [  761.704089] task: ffff88004e3c0000 ti: ffff88004e0b8000 task.ti: ffff88004e0b8000
> [  761.704089] RIP: mm_find_pmd (mm/rmap.c:570)
> [  761.704089] RSP: 0000:ffff88004e0bbaa8  EFLAGS: 00010246
> [  761.704089] RAX: 0000000000000000 RBX: 0000000000a65000 RCX: ffff88004e0bbb30
> [  761.704089] RDX: 0000000000000000 RSI: 0000000000a65000 RDI: ffff880000146000
> [  761.704089] RBP: ffff88004e0bbaa8 R08: 0000000000000000 R09: 0000000000000000
> [  761.704089] R10: ffff88004e3c0000 R11: 0000000000000000 R12: ffffea000d766e00
> [  761.704089] R13: ffff88004e0bbb30 R14: ffff880000146000 R15: 0000000000000000
> [  761.704089] FS:  00007f0293c61700(0000) GS:ffff880144e00000(0000) knlGS:0000000000000000
> [  761.704089] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  761.704089] CR2: 0000000000000000 CR3: 000000004e3be000 CR4: 00000000000006a0
> [  761.704089] Stack:
> [  761.704089]  ffff88004e0bbae8 ffffffff9c2d0815 800000035d9b8805 ffff880000146000
> [  761.704089]  ffffea000d766e00 ffff88000b4c4e58 ffff880034d7d200 0000000000000302
> [  761.704089]  ffff88004e0bbb68 ffffffff9c2d1491 ffff88004e0bbb28 ffffffff9f57c58a
> [  761.704089] Call Trace:
> [  761.704089] __page_check_address (mm/rmap.c:618)
> [  761.704089] try_to_unmap_one (mm/rmap.c:1133)
> [  761.704089] ? down_read (kernel/locking/rwsem.c:45 (discriminator 2))
> [  761.704089] ? page_lock_anon_vma_read (./arch/x86/include/asm/atomic.h:118 mm/rmap.c:491)
> [  761.704089] ? page_lock_anon_vma_read (mm/rmap.c:448)
> [  761.704089] rmap_walk (mm/rmap.c:1634 mm/rmap.c:1705)
> [  761.704089] try_to_unmap (mm/rmap.c:1527)
> [  761.704089] ? page_remove_rmap (mm/rmap.c:1124)
> [  761.704089] ? invalid_migration_vma (mm/rmap.c:1483)
> [  761.704089] ? try_to_unmap_one (mm/rmap.c:1391)
> [  761.704089] ? anon_vma_prepare (mm/rmap.c:448)
> [  761.704089] ? invalid_mkclean_vma (mm/rmap.c:1478)
> [  761.704089] ? page_get_anon_vma (mm/rmap.c:405)
> [  761.704089] migrate_pages (mm/migrate.c:912 mm/migrate.c:955 mm/migrate.c:1142)
> [  761.704089] ? perf_trace_mm_numa_migrate_ratelimit (mm/migrate.c:1590)
> [  761.704089] migrate_misplaced_page (mm/migrate.c:1750)
> [  761.704089] __handle_mm_fault (mm/memory.c:3162 mm/memory.c:3212 mm/memory.c:3322)
> [  761.704089] handle_mm_fault (include/linux/memcontrol.h:124 mm/memory.c:3348)
> [  761.704089] ? __do_page_fault (arch/x86/mm/fault.c:1163)
> [  761.704089] __do_page_fault (arch/x86/mm/fault.c:1230)
> [  761.704089] ? vtime_account_user (kernel/sched/cputime.c:687)
> [  761.704089] ? get_parent_ip (kernel/sched/core.c:2550)
> [  761.704089] ? context_tracking_user_exit (include/linux/vtime.h:89 include/linux/jump_label.h:115 include/trace/events/context_tracking.h:47 kernel/context_tracking.c:180)
> [  761.704089] ? preempt_count_sub (kernel/sched/core.c:2606)
> [  761.704089] ? context_tracking_user_exit (kernel/context_tracking.c:184)
> [  761.704089] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> [  761.704089] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
> [  761.704089] trace_do_page_fault (arch/x86/mm/fault.c:1313 include/linux/jump_label.h:115 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1314)
> [  761.704089] do_async_page_fault (arch/x86/kernel/kvm.c:264)
> [  761.704089] async_page_fault (arch/x86/kernel/entry_64.S:1322)
> [ 761.704089] Code: 00 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 0f 1f 44 00 00 66 66 66 66 90 55 48 89 f2 48 8b 47 40 48 c1 ea 27 48 89 e5 81 e2 ff 01 00 00 <48> 8b 3c d0 40 f6 c7 01 75 0c 31 f6 e9 af 00 00 00 0f 1f 44 00
> All code
> ========
>    0:	00 48 8b             	add    %cl,-0x75(%rax)
>    3:	5d                   	pop    %rbp
>    4:	f0 4c 8b 65 f8       	lock mov -0x8(%rbp),%r12
>    9:	c9                   	leaveq
>    a:	c3                   	retq
>    b:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
>   11:	66 66 66 66 90       	data32 data32 data32 xchg %ax,%ax
>   16:	55                   	push   %rbp
>   17:	48 89 f2             	mov    %rsi,%rdx
>   1a:	48 8b 47 40          	mov    0x40(%rdi),%rax

0x40 is mm_struct.pgd

>   1e:	48 c1 ea 27          	shr    $0x27,%rdx
>   22:	48 89 e5             	mov    %rsp,%rbp
>   25:	81 e2 ff 01 00 00    	and    $0x1ff,%edx
>   2b:*	48 8b 3c d0          	mov    (%rax,%rdx,8),%rdi		<-- trapping instruction

So we seem to have mm->pgd == NULL?

dump_pagetable() was able to locate the pgd OK when it printed "PGD
51223067 PUD 50a09067 PMD 0", but it plucks the pgd out of the physical
pagetables, not out of the mm_struct.

Dunno.  You're under KVM and tracing is enabled, yes?  I don't
immediately see how that would affect it.
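
The two annotations above can be checked by hand. What follows is a minimal
sketch in plain C (not kernel code), assuming x86-64 4-level paging: the
"shr $0x27" / "and $0x1ff" pair is the pgd index computation, and the
trapping load reads that slot of the table whose base is in RAX.
pgd_index(), PGDIR_SHIFT and PTRS_PER_PGD borrow the kernel's names;
pgd_slot_address() is a hypothetical helper introduced only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 4-level paging: shr $0x27 is a shift by 39, and $0x1ff keeps 9 bits. */
#define PGDIR_SHIFT   39
#define PTRS_PER_PGD  512

/* The scaled index ",8" in the trapping mov assumes 8-byte table entries. */
static_assert(sizeof(uint64_t) == 8, "pgd entries are 8 bytes on x86-64");

/* Index of the top-level (PGD) slot that covers a virtual address. */
static inline uint64_t pgd_index(uint64_t address)
{
    return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

/*
 * Hypothetical helper: the address that "mov (%rax,%rdx,8),%rdi" reads,
 * i.e. the pgd base (RAX) plus 8 bytes per slot times the index (RDX).
 */
static inline uint64_t pgd_slot_address(uint64_t pgd_base, uint64_t address)
{
    return pgd_base + pgd_index(address) * sizeof(uint64_t);
}
```

Plugging in the register values from the oops (RSI = 0xa65000 as the
address, RAX = 0 as the pgd base), pgd_index(0xa65000) is 0 and
pgd_slot_address(0, 0xa65000) is 0, which agrees with RDX = 0 and CR2 = 0
in the dump.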

>   2f:	40 f6 c7 01          	test   $0x1,%dil
>   33:	75 0c                	jne    0x41
>   35:	31 f6                	xor    %esi,%esi
>   37:	e9 af 00 00 00       	jmpq   0xeb
>   3c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	48 8b 3c d0          	mov    (%rax,%rdx,8),%rdi
>    4:	40 f6 c7 01          	test   $0x1,%dil
>    8:	75 0c                	jne    0x16
>    a:	31 f6                	xor    %esi,%esi
>    c:	e9 af 00 00 00       	jmpq   0xc0
>   11:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
> [  761.704089] RIP mm_find_pmd (mm/rmap.c:570)
> [  761.704089]  RSP <ffff88004e0bbaa8>
> [  761.704089] CR2: 0000000000000000
> 
> As I didn't see any code changes around that part, I'm thinking that it's a locking
> issue that got messed up somewhere rather than a missing '!= NULL' check.



* Re: mm: derefing NULL vma->vm_mm when unmapping
  2014-06-30 22:07 ` Andrew Morton
@ 2014-07-01  0:55   ` Hugh Dickins
  2014-07-05 14:41     ` Sasha Levin
From: Hugh Dickins @ 2014-07-01  0:55 UTC
  To: Andrew Morton
  Cc: Sasha Levin, linux-mm@kvack.org, Dave Jones, Andi Kleen, LKML

On Mon, 30 Jun 2014, Andrew Morton wrote:
> On Mon, 30 Jun 2014 09:49:57 -0400 Sasha Levin <levinsasha928@gmail.com> wrote:
> > Hi all,
> > 
> > While fuzzing with trinity inside a KVM tools guest running the latest -next
> > kernel I've stumbled on the following spew:
> > 
> > [  761.704089] BUG: unable to handle kernel NULL pointer dereference at           (null)
> > [  761.704089] IP: mm_find_pmd (mm/rmap.c:570)
> 
> Does this mean it oopsed in mm_find_pmd()'s call to pgd_offset()?
> 
> > [  761.704089] PGD 51223067 PUD 50a09067 PMD 0
> > [  761.704089] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > [  761.704089] Dumping ftrace buffer:
> > [  761.704089]    (ftrace buffer empty)
> > [  761.704089] Modules linked in:
> > [  761.704089] CPU: 4 PID: 20723 Comm: trinity-c131 Tainted: G        W      3.16.0-rc3-next-20140630-sasha-00023-g44434d4-dirty #756
> > [  761.704089] task: ffff88004e3c0000 ti: ffff88004e0b8000 task.ti: ffff88004e0b8000
> > [  761.704089] RIP: mm_find_pmd (mm/rmap.c:570)
> > [  761.704089] RSP: 0000:ffff88004e0bbaa8  EFLAGS: 00010246
> > [  761.704089] RAX: 0000000000000000 RBX: 0000000000a65000 RCX: ffff88004e0bbb30
> > [  761.704089] RDX: 0000000000000000 RSI: 0000000000a65000 RDI: ffff880000146000
> > [  761.704089] RBP: ffff88004e0bbaa8 R08: 0000000000000000 R09: 0000000000000000
> > [  761.704089] R10: ffff88004e3c0000 R11: 0000000000000000 R12: ffffea000d766e00
> > [  761.704089] R13: ffff88004e0bbb30 R14: ffff880000146000 R15: 0000000000000000
> > [  761.704089] FS:  00007f0293c61700(0000) GS:ffff880144e00000(0000) knlGS:0000000000000000
> > [  761.704089] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [  761.704089] CR2: 0000000000000000 CR3: 000000004e3be000 CR4: 00000000000006a0
> > [  761.704089] Stack:
> > [  761.704089]  ffff88004e0bbae8 ffffffff9c2d0815 800000035d9b8805 ffff880000146000
> > [  761.704089]  ffffea000d766e00 ffff88000b4c4e58 ffff880034d7d200 0000000000000302
> > [  761.704089]  ffff88004e0bbb68 ffffffff9c2d1491 ffff88004e0bbb28 ffffffff9f57c58a
> > [  761.704089] Call Trace:
> > [  761.704089] __page_check_address (mm/rmap.c:618)
> > [  761.704089] try_to_unmap_one (mm/rmap.c:1133)
> > [  761.704089] ? down_read (kernel/locking/rwsem.c:45 (discriminator 2))
> > [  761.704089] ? page_lock_anon_vma_read (./arch/x86/include/asm/atomic.h:118 mm/rmap.c:491)
> > [  761.704089] ? page_lock_anon_vma_read (mm/rmap.c:448)
> > [  761.704089] rmap_walk (mm/rmap.c:1634 mm/rmap.c:1705)
> > [  761.704089] try_to_unmap (mm/rmap.c:1527)
> > [  761.704089] ? page_remove_rmap (mm/rmap.c:1124)
> > [  761.704089] ? invalid_migration_vma (mm/rmap.c:1483)
> > [  761.704089] ? try_to_unmap_one (mm/rmap.c:1391)
> > [  761.704089] ? anon_vma_prepare (mm/rmap.c:448)
> > [  761.704089] ? invalid_mkclean_vma (mm/rmap.c:1478)
> > [  761.704089] ? page_get_anon_vma (mm/rmap.c:405)
> > [  761.704089] migrate_pages (mm/migrate.c:912 mm/migrate.c:955 mm/migrate.c:1142)
> > [  761.704089] ? perf_trace_mm_numa_migrate_ratelimit (mm/migrate.c:1590)
> > [  761.704089] migrate_misplaced_page (mm/migrate.c:1750)
> > [  761.704089] __handle_mm_fault (mm/memory.c:3162 mm/memory.c:3212 mm/memory.c:3322)
> > [  761.704089] handle_mm_fault (include/linux/memcontrol.h:124 mm/memory.c:3348)
> > [  761.704089] ? __do_page_fault (arch/x86/mm/fault.c:1163)
> > [  761.704089] __do_page_fault (arch/x86/mm/fault.c:1230)
> > [  761.704089] ? vtime_account_user (kernel/sched/cputime.c:687)
> > [  761.704089] ? get_parent_ip (kernel/sched/core.c:2550)
> > [  761.704089] ? context_tracking_user_exit (include/linux/vtime.h:89 include/linux/jump_label.h:115 include/trace/events/context_tracking.h:47 kernel/context_tracking.c:180)
> > [  761.704089] ? preempt_count_sub (kernel/sched/core.c:2606)
> > [  761.704089] ? context_tracking_user_exit (kernel/context_tracking.c:184)
> > [  761.704089] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> > [  761.704089] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
> > [  761.704089] trace_do_page_fault (arch/x86/mm/fault.c:1313 include/linux/jump_label.h:115 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1314)
> > [  761.704089] do_async_page_fault (arch/x86/kernel/kvm.c:264)
> > [  761.704089] async_page_fault (arch/x86/kernel/entry_64.S:1322)
> > [ 761.704089] Code: 00 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 0f 1f 44 00 00 66 66 66 66 90 55 48 89 f2 48 8b 47 40 48 c1 ea 27 48 89 e5 81 e2 ff 01 00 00 <48> 8b 3c d0 40 f6 c7 01 75 0c 31 f6 e9 af 00 00 00 0f 1f 44 00
> > All code
> > ========
> >    0:	00 48 8b             	add    %cl,-0x75(%rax)
> >    3:	5d                   	pop    %rbp
> >    4:	f0 4c 8b 65 f8       	lock mov -0x8(%rbp),%r12
> >    9:	c9                   	leaveq
> >    a:	c3                   	retq
> >    b:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
> >   11:	66 66 66 66 90       	data32 data32 data32 xchg %ax,%ax
> >   16:	55                   	push   %rbp
> >   17:	48 89 f2             	mov    %rsi,%rdx
> >   1a:	48 8b 47 40          	mov    0x40(%rdi),%rax
> 
> 0x40 is mm_struct.pgd
> 
> >   1e:	48 c1 ea 27          	shr    $0x27,%rdx
> >   22:	48 89 e5             	mov    %rsp,%rbp
> >   25:	81 e2 ff 01 00 00    	and    $0x1ff,%edx
> >   2b:*	48 8b 3c d0          	mov    (%rax,%rdx,8),%rdi		<-- trapping instruction
> 
> So we seem to have mm->pgd == NULL?

Yes.

> 
> dump_pagetable() was able to locate the pgd OK when it printed "PGD
> 51223067 PUD 50a09067 PMD 0", but it plucks the pgd out of the physical
> pagetables, not out of the mm_struct.

Two different mms, I think.  dump_pagetable() is reporting on the
current mm which experienced the oops on NULL pointer.  Whereas the
mm->pgd which is NULL is for one of those mms which rmap_walk is visiting.
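
To make the distinction concrete, here is a rough sketch using heavily
reduced stand-ins for the kernel structures. struct mm_sketch,
struct vma_sketch and first_vma_with_null_pgd() are all invented for this
illustration and are not kernel API. The point is only that the faulting
task's own page tables can be intact while one of the mms reached through
a VMA during the walk has a NULL pgd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: only the fields relevant to this oops are kept. */
struct mm_sketch {
    uint64_t *pgd;              /* top-level page table of this address space */
};

struct vma_sketch {
    struct mm_sketch *vm_mm;    /* address space this mapping belongs to */
};

/*
 * Walk the VMAs that map a page, as an rmap walk would, and report the
 * first one whose mm has no pgd. Returns its index, or -1 if every mm
 * visited looks sane.
 */
static int first_vma_with_null_pgd(const struct vma_sketch *vmas, size_t n)
{
    assert(vmas != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (vmas[i].vm_mm != NULL && vmas[i].vm_mm->pgd == NULL)
            return (int)i;
    }
    return -1;
}
```

In this model the mm of the task that took the fault would be one entry
with a valid pgd (which is what the "PGD ... PUD ... PMD" line reflects),
and the crash corresponds to the walk reaching a different entry whose
pgd is NULL.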

> 
> Dunno.  You're under KVM and tracing is enabled, yes?  I don't
> immediately see how that would affect it.

I am beginning to wonder whether some of Sasha's reports are
actually problems with KVM, which I cannot help with at all.
It does add another dimension of doubt.  Or with DEBUG_PAGEALLOC.

I took a quick look, but had no more ideas on this crash than on many
of his other recent ones.  Or is there something very (but very
rarely) wrong with the rmap walk and its trees these days?

> 
> >   2f:	40 f6 c7 01          	test   $0x1,%dil
> >   33:	75 0c                	jne    0x41
> >   35:	31 f6                	xor    %esi,%esi
> >   37:	e9 af 00 00 00       	jmpq   0xeb
> >   3c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
> > 
> > Code starting with the faulting instruction
> > ===========================================
> >    0:	48 8b 3c d0          	mov    (%rax,%rdx,8),%rdi
> >    4:	40 f6 c7 01          	test   $0x1,%dil
> >    8:	75 0c                	jne    0x16
> >    a:	31 f6                	xor    %esi,%esi
> >    c:	e9 af 00 00 00       	jmpq   0xc0
> >   11:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)

Entirely off-topic: I love scripts/decodecode (thank you Andi!),
but has anyone ever seen any point at all to the "Code starting with
the faulting instruction" section, repeat of what's already shown?
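
For reference, the "2b:*" marker in the first listing follows directly
from the position of the <48> byte in the "Code:" line. The sketch below
is not taken from scripts/decodecode; trapping_offset() is a hypothetical
helper that counts the opcode bytes in front of the marker. It expects
only the hex dump, without the "Code:" prefix.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Count the opcode bytes that precede the <xx> marker in the hex dump
 * part of an oops "Code:" line. Returns -1 if no marker is present.
 */
static int trapping_offset(const char *hex)
{
    assert(hex != NULL);

    int bytes = 0;
    for (const char *p = hex; *p != '\0'; p++) {
        if (*p == '<')
            return bytes;
        /* Each byte is printed as two hex digits; count it at the first. */
        if (isxdigit((unsigned char)p[0]) && isxdigit((unsigned char)p[1])) {
            bytes++;
            p++;    /* skip the second digit of this byte */
        }
    }
    return -1;
}
```

Fed the hex dump from the oops above, it returns 43, which is 0x2b, the
offset at which the disassembly places the trapping mov.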

Hugh


* Re: mm: derefing NULL vma->vm_mm when unmapping
  2014-07-01  0:55   ` Hugh Dickins
@ 2014-07-05 14:41     ` Sasha Levin
From: Sasha Levin @ 2014-07-05 14:41 UTC
  To: Hugh Dickins, Andrew Morton
  Cc: linux-mm@kvack.org, Dave Jones, Andi Kleen, LKML

On 06/30/2014 08:55 PM, Hugh Dickins wrote:
> On Mon, 30 Jun 2014, Andrew Morton wrote:
>> On Mon, 30 Jun 2014 09:49:57 -0400 Sasha Levin <levinsasha928@gmail.com> wrote:
>>
>> Dunno.  You're under KVM and tracing is enabled, yes?  I don't
>> immediately see how that would affect it.
> 
> I am beginning to wonder whether some of Sasha's reports are
> actually problems with KVM, which I cannot help with at all.
> It does add another dimension of doubt.  Or with DEBUG_PAGEALLOC.

The good news is that Oracle is being pretty cool and giving me some
more machines to fuzz on, so soon I'll be fuzzing on physical
hardware as well; that should tell us which issues are KVM-specific.

> I took a quick look, but had no more ideas on this crash than on many
> of his other recent ones.  Or is there something very (but very
> rarely) wrong with the rmap walk and its trees these days?

It seems I'm hitting page table corruptions here and there, but I'm not
sure whether they're related to the report above.

[ 5753.537772] trinity-c43: Corrupted page table at address 7fc9a9fa2000
[ 5753.538893] PGD 3c2508067 PUD 3bbd58067 PMD 2f3b6a067 PTE ffff8800000b0235
[ 5753.540105] Bad pagetable: 0009 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 5753.540105] Dumping ftrace buffer:
[ 5753.542307]    (ftrace buffer empty)
[ 5753.542307] Modules linked in:
[ 5753.542307] CPU: 14 PID: 19432 Comm: trinity-c43 Not tainted 3.16.0-rc3-next-20140703-sasha-00024-g2ad7668-dirty #763
[ 5753.542307] task: ffff880161590000 ti: ffff880168c28000 task.ti: ffff880168c28000
[ 5753.542307] RIP: copy_user_generic_unrolled (arch/x86/lib/copy_user_64.S:166)
[ 5753.542307] RSP: 0018:ffff880168c2bf30  EFLAGS: 00010202
[ 5753.542307] RAX: ffff880168c28000 RBX: 00007fc9a9fa2000 RCX: 0000000000000002
[ 5753.542307] RDX: 0000000000000000 RSI: 00007fc9a9fa2000 RDI: ffff880168c2bf48
[ 5753.542307] RBP: ffff880168c2bf78 R08: 00000000001a7d9e R09: 0000000000000000
[ 5753.542307] R10: 0000000000000000 R11: 0000000000000001 R12: 00007fc9a9fa2008
[ 5753.542307] R13: 00007fc9aa16e6a8 R14: 0000000000000000 R15: 00000000000000a4
[ 5753.542307] FS:  00007fc9aa16e700(0000) GS:ffff88036ae00000(0000) knlGS:0000000000000000
[ 5753.542307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5753.542307] CR2: 00007fc9a9fa2000 CR3: 000000015157a000 CR4: 00000000000006a0
[ 5753.542307] Stack:
[ 5753.542307]  ffffffff9216ffa1 00007fc9aa16e6a8 0000000000000000 00007fc9a9a1f000
[ 5753.542307]  ffffffff954d6ef0 00000000000000a4 0000000000000000 00000000000000a4
[ 5753.542307]  00007fc9a9a1f000 00007fc9a9a1f000 ffffffff954d6f53 0000000000000246
[ 5753.542307] Call Trace:
[ 5753.542307] ? SyS_settimeofday (kernel/time.c:196 kernel/time.c:189)
[ 5753.542307] ? tracesys (arch/x86/kernel/entry_64.S:531)
[ 5753.542307] tracesys (arch/x86/kernel/entry_64.S:542)
[ 5753.542307] Code: 30 4c 8b 5e 38 4c 89 47 20 4c 89 4f 28 4c 89 57 30 4c 89 5f 38 48 8d 76 40 48 8d 7f 40 ff c9 75 b6 89 d1 83 e2 07 c1 e9 03 74 12 <4c> 8b 06 4c 89 07 48 8d 76 08 48 8d 7f 08 ff c9 75 ee 21 d2 74
All code
========
   0:	30 4c 8b 5e          	xor    %cl,0x5e(%rbx,%rcx,4)
   4:	38 4c 89 47          	cmp    %cl,0x47(%rcx,%rcx,4)
   8:	20 4c 89 4f          	and    %cl,0x4f(%rcx,%rcx,4)
   c:	28 4c 89 57          	sub    %cl,0x57(%rcx,%rcx,4)
  10:	30 4c 89 5f          	xor    %cl,0x5f(%rcx,%rcx,4)
  14:	38 48 8d             	cmp    %cl,-0x73(%rax)
  17:	76 40                	jbe    0x59
  19:	48 8d 7f 40          	lea    0x40(%rdi),%rdi
  1d:	ff c9                	dec    %ecx
  1f:	75 b6                	jne    0xffffffffffffffd7
  21:	89 d1                	mov    %edx,%ecx
  23:	83 e2 07             	and    $0x7,%edx
  26:	c1 e9 03             	shr    $0x3,%ecx
  29:	74 12                	je     0x3d
  2b:*	4c 8b 06             	mov    (%rsi),%r8		<-- trapping instruction
  2e:	4c 89 07             	mov    %r8,(%rdi)
  31:	48 8d 76 08          	lea    0x8(%rsi),%rsi
  35:	48 8d 7f 08          	lea    0x8(%rdi),%rdi
  39:	ff c9                	dec    %ecx
  3b:	75 ee                	jne    0x2b
  3d:	21 d2                	and    %edx,%edx
  3f:	74 00                	je     0x41

Code starting with the faulting instruction
===========================================
   0:	4c 8b 06             	mov    (%rsi),%r8
   3:	4c 89 07             	mov    %r8,(%rdi)
   6:	48 8d 76 08          	lea    0x8(%rsi),%rsi
   a:	48 8d 7f 08          	lea    0x8(%rdi),%rdi
   e:	ff c9                	dec    %ecx
  10:	75 ee                	jne    0x0
  12:	21 d2                	and    %edx,%edx
  14:	74 00                	je     0x16
[ 5753.570683] RIP copy_user_generic_unrolled (arch/x86/lib/copy_user_64.S:166)
[ 5753.570683]  RSP <ffff880168c2bf30>
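
As a reading aid for the "Bad pagetable: 0009" line and the PTE value
above, here is a small sketch that decodes the x86 page-fault error code
and checks a page table entry for reserved physical-address bits. The
PF_* names mirror the flags used in arch/x86/mm/fault.c;
pf_reserved_bit_fault() and pte_reserved_bits() are hypothetical helpers.
The physical address width (MAXPHYADDR) of the test machine is not in the
log, so any value passed for it is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits, named after the kernel's PF_* flags. */
enum pf_error_code {
    PF_PROT  = 1 << 0,  /* 0: page not present, 1: protection violation */
    PF_WRITE = 1 << 1,  /* 0: read access,      1: write access */
    PF_USER  = 1 << 2,  /* 0: kernel mode,      1: user mode */
    PF_RSVD  = 1 << 3,  /* 1: reserved bit set in a paging-structure entry */
    PF_INSTR = 1 << 4,  /* 1: fault was an instruction fetch */
};

/* True if the hardware reported a reserved-bit violation. */
static bool pf_reserved_bit_fault(unsigned int error_code)
{
    return (error_code & PF_RSVD) != 0;
}

/*
 * Hypothetical helper: return the bits of a page table entry that fall
 * in the reserved range [maxphyaddr, 51]. maxphyaddr is the CPU's
 * physical address width and has to be supplied by the caller.
 */
static uint64_t pte_reserved_bits(uint64_t pte, unsigned int maxphyaddr)
{
    assert(maxphyaddr <= 52);

    uint64_t below_52  = (1ULL << 52) - 1;          /* bits 0..51 */
    uint64_t below_max = (1ULL << maxphyaddr) - 1;  /* bits 0..maxphyaddr-1 */

    return pte & below_52 & ~below_max;
}
```

Error code 0009 has PF_PROT and PF_RSVD set and the write, user and
instruction-fetch bits clear: a kernel-mode read that hit an entry with a
reserved bit set. Assuming a 46-bit physical address width, the reported
PTE ffff8800000b0235 has bits 47 through 51 set in the reserved range,
while the PMD entry 2f3b6a067 has none. That is consistent with the PTE
slot holding something that looks more like a kernel pointer than a frame
number plus flags, though the log alone does not show how it got there.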


Thanks,
Sasha

