* mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
@ 2013-06-13 11:48 richard -rw- weinberger
2013-06-13 12:02 ` Michal Hocko
0 siblings, 1 reply; 10+ messages in thread
From: richard -rw- weinberger @ 2013-06-13 11:48 UTC (permalink / raw)
To: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, Michal Hocko, hannes
Hi!
While playing with user namespaces my kernel crashed under heavy load.
Kernel is 3.9.0 plus some trivial patches.
[35355.882105] BUG: unable to handle kernel NULL pointer dereference
at 00000000000001a8
[35355.883056] IP: [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
[35355.883056] PGD 0
[35355.883056] Oops: 0000 [#1] SMP
[35355.883056] CPU 3
[35355.883056] Pid: 477, comm: kswapd0 Not tainted 3.9.0+ #12 Bochs Bochs
[35355.883056] RIP: 0010:[<ffffffff811297d9>] [<ffffffff811297d9>]
mem_cgroup_page_lruvec+0x79/0x90
[35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
[35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
[35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
[35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
[35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
[35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
[35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
knlGS:0000000000000000
[35355.883056] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[35355.883056] CR2: 00000000000001a8 CR3: 0000000001a0b000 CR4: 00000000000006e0
[35355.883056] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[35355.883056] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[35355.883056] Process kswapd0 (pid: 477, threadinfo ffff88003d522000,
task ffff88003db6dc40)
[35355.883056] Stack:
[35355.883056] 0000000000000014 ffffea0000a14000 ffff88003d523b28
ffffffff810ea4c5
[35355.883056] ffff88003d523b90 ffff88003fffa9c0 ffff88003d523b98
0000000000000020
[35355.883056] ffff88003fffa600 0000000200000003 ffff88003fffa600
ffff88003d523b98
[35355.883056] Call Trace:
[35355.883056] [<ffffffff810ea4c5>] move_active_pages_to_lru+0x65/0x190
[35355.883056] [<ffffffff810ec4e7>] shrink_active_list+0x297/0x380
[35355.883056] [<ffffffff810ebff6>] ? shrink_inactive_list+0x1a6/0x400
[35355.883056] [<ffffffff810ec815>] shrink_lruvec+0x245/0x4b0
[35355.883056] [<ffffffff810ecae6>] shrink_zone+0x66/0x180
[35355.883056] [<ffffffff810edcb4>] balance_pgdat+0x474/0x5b0
[35355.883056] [<ffffffff810edf58>] kswapd+0x168/0x440
[35355.883056] [<ffffffff8105d310>] ? abort_exclusive_wait+0xb0/0xb0
[35355.883056] [<ffffffff810eddf0>] ? balance_pgdat+0x5b0/0x5b0
[35355.883056] [<ffffffff8105c5fb>] kthread+0xbb/0xc0
[35355.883056] [<ffffffff8105c540>] ? __kthread_unpark+0x50/0x50
[35355.883056] [<ffffffff81748eac>] ret_from_fork+0x7c/0xb0
[35355.883056] [<ffffffff8105c540>] ? __kthread_unpark+0x50/0x50
[35355.883056] Code: 89 50 08 48 89 d1 0f 1f 40 00 49 8b 04 24 48 89
c2 48 c1 e8 38 83 e0 03 48 c1 ea 3a 48 69 c0 38 01 00 00 48 03 84 d1
e0 02 00 00 <48> 3b 58 70 75 0a 48 8b 5d f0 4c 8b 65 f8 c9 c3 48 89 58
70 eb
[35355.883056] RIP [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
[35355.883056] RSP <ffff88003d523aa8>
[35355.883056] CR2: 00000000000001a8
[35355.883056] ---[ end trace 2c9b8eec517f960d ]---
--
Thanks,
//richard
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 11:48 mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8 richard -rw- weinberger
@ 2013-06-13 12:02 ` Michal Hocko
2013-06-13 12:06 ` Richard Weinberger
0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 12:02 UTC (permalink / raw)
To: richard -rw- weinberger
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
On Thu 13-06-13 13:48:27, richard -rw- weinberger wrote:
> Hi!
>
> While playing with user namespaces my kernel crashed under heavy load.
> Kernel is 3.9.0 plus some trivial patches.
Could you post disassembly for mem_cgroup_page_lruvec?
> [35355.882105] BUG: unable to handle kernel NULL pointer dereference
> at 00000000000001a8
> [35355.883056] IP: [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
> [35355.883056] PGD 0
> [35355.883056] Oops: 0000 [#1] SMP
> [35355.883056] CPU 3
> [35355.883056] Pid: 477, comm: kswapd0 Not tainted 3.9.0+ #12 Bochs Bochs
> [35355.883056] RIP: 0010:[<ffffffff811297d9>] [<ffffffff811297d9>]
> mem_cgroup_page_lruvec+0x79/0x90
> [35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> [35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
> knlGS:0000000000000000
> [35355.883056] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [35355.883056] CR2: 00000000000001a8 CR3: 0000000001a0b000 CR4: 00000000000006e0
> [35355.883056] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [35355.883056] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [35355.883056] Process kswapd0 (pid: 477, threadinfo ffff88003d522000,
> task ffff88003db6dc40)
> [35355.883056] Stack:
> [35355.883056] 0000000000000014 ffffea0000a14000 ffff88003d523b28
> ffffffff810ea4c5
> [35355.883056] ffff88003d523b90 ffff88003fffa9c0 ffff88003d523b98
> 0000000000000020
> [35355.883056] ffff88003fffa600 0000000200000003 ffff88003fffa600
> ffff88003d523b98
> [35355.883056] Call Trace:
> [35355.883056] [<ffffffff810ea4c5>] move_active_pages_to_lru+0x65/0x190
> [35355.883056] [<ffffffff810ec4e7>] shrink_active_list+0x297/0x380
> [35355.883056] [<ffffffff810ebff6>] ? shrink_inactive_list+0x1a6/0x400
> [35355.883056] [<ffffffff810ec815>] shrink_lruvec+0x245/0x4b0
> [35355.883056] [<ffffffff810ecae6>] shrink_zone+0x66/0x180
> [35355.883056] [<ffffffff810edcb4>] balance_pgdat+0x474/0x5b0
> [35355.883056] [<ffffffff810edf58>] kswapd+0x168/0x440
> [35355.883056] [<ffffffff8105d310>] ? abort_exclusive_wait+0xb0/0xb0
> [35355.883056] [<ffffffff810eddf0>] ? balance_pgdat+0x5b0/0x5b0
> [35355.883056] [<ffffffff8105c5fb>] kthread+0xbb/0xc0
> [35355.883056] [<ffffffff8105c540>] ? __kthread_unpark+0x50/0x50
> [35355.883056] [<ffffffff81748eac>] ret_from_fork+0x7c/0xb0
> [35355.883056] [<ffffffff8105c540>] ? __kthread_unpark+0x50/0x50
> [35355.883056] Code: 89 50 08 48 89 d1 0f 1f 40 00 49 8b 04 24 48 89
> c2 48 c1 e8 38 83 e0 03 48 c1 ea 3a 48 69 c0 38 01 00 00 48 03 84 d1
> e0 02 00 00 <48> 3b 58 70 75 0a 48 8b 5d f0 4c 8b 65 f8 c9 c3 48 89 58
> 70 eb
> [35355.883056] RIP [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
> [35355.883056] RSP <ffff88003d523aa8>
> [35355.883056] CR2: 00000000000001a8
> [35355.883056] ---[ end trace 2c9b8eec517f960d ]---
>
>
> --
> Thanks,
> //richard
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 12:02 ` Michal Hocko
@ 2013-06-13 12:06 ` Richard Weinberger
2013-06-13 13:29 ` Michal Hocko
0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2013-06-13 12:06 UTC (permalink / raw)
To: Michal Hocko
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
Am 13.06.2013 14:02, schrieb Michal Hocko:
> On Thu 13-06-13 13:48:27, richard -rw- weinberger wrote:
>> Hi!
>>
>> While playing with user namespaces my kernel crashed under heavy load.
>> Kernel is 3.9.0 plus some trivial patches.
>
> Could you post disassembly for mem_cgroup_page_lruvec?
Sure!
00000000000035e0 <mem_cgroup_page_lruvec>:
35e0: 55 push %rbp
35e1: 48 8d 86 c8 03 00 00 lea 0x3c8(%rsi),%rax
35e8: 48 89 e5 mov %rsp,%rbp
35eb: 48 83 ec 10 sub $0x10,%rsp
35ef: 48 89 5d f0 mov %rbx,-0x10(%rbp)
35f3: 48 89 f3 mov %rsi,%rbx
35f6: 8b 35 00 00 00 00 mov 0x0(%rip),%esi # 35fc <mem_cgroup_page_lruvec+0x1c>
35fc: 4c 89 65 f8 mov %r12,-0x8(%rbp)
3600: 85 f6 test %esi,%esi
3602: 75 55 jne 3659 <mem_cgroup_page_lruvec+0x79>
3604: 49 89 fc mov %rdi,%r12
3607: e8 00 00 00 00 callq 360c <mem_cgroup_page_lruvec+0x2c>
360c: 49 8b 14 24 mov (%r12),%rdx
3610: 48 8b 48 08 mov 0x8(%rax),%rcx
3614: 83 e2 20 and $0x20,%edx
3617: 75 1f jne 3638 <mem_cgroup_page_lruvec+0x58>
3619: 48 8b 10 mov (%rax),%rdx
361c: 83 e2 02 and $0x2,%edx
361f: 75 17 jne 3638 <mem_cgroup_page_lruvec+0x58>
3621: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 3628 <mem_cgroup_page_lruvec+0x48>
3628: 48 39 d1 cmp %rdx,%rcx
362b: 74 0b je 3638 <mem_cgroup_page_lruvec+0x58>
362d: 48 89 50 08 mov %rdx,0x8(%rax)
3631: 48 89 d1 mov %rdx,%rcx
3634: 0f 1f 40 00 nopl 0x0(%rax)
3638: 49 8b 04 24 mov (%r12),%rax
363c: 48 89 c2 mov %rax,%rdx
363f: 48 c1 e8 38 shr $0x38,%rax
3643: 83 e0 03 and $0x3,%eax
3646: 48 c1 ea 3a shr $0x3a,%rdx
364a: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
3651: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
3658: 00
3659: 48 3b 58 70 cmp 0x70(%rax),%rbx
365d: 75 0a jne 3669 <mem_cgroup_page_lruvec+0x89>
365f: 48 8b 5d f0 mov -0x10(%rbp),%rbx
3663: 4c 8b 65 f8 mov -0x8(%rbp),%r12
3667: c9 leaveq
3668: c3 retq
3669: 48 89 58 70 mov %rbx,0x70(%rax)
366d: eb f0 jmp 365f <mem_cgroup_page_lruvec+0x7f>
366f: 90 nop
FWIW the ./scripts/decodecode output:
All code
========
0: 89 50 08 mov %edx,0x8(%rax)
3: 48 89 d1 mov %rdx,%rcx
6: 0f 1f 40 00 nopl 0x0(%rax)
a: 49 8b 04 24 mov (%r12),%rax
e: 48 89 c2 mov %rax,%rdx
11: 48 c1 e8 38 shr $0x38,%rax
15: 83 e0 03 and $0x3,%eax
18: 48 c1 ea 3a shr $0x3a,%rdx
1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
2a: 00
2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
2f: 75 0a jne 0x3b
31: 48 8b 5d f0 mov -0x10(%rbp),%rbx
35: 4c 8b 65 f8 mov -0x8(%rbp),%r12
39: c9 leaveq
3a: c3 retq
3b: 48 89 58 70 mov %rbx,0x70(%rax)
3f: eb .byte 0xeb
Code starting with the faulting instruction
===========================================
0: 48 3b 58 70 cmp 0x70(%rax),%rbx
4: 75 0a jne 0x10
6: 48 8b 5d f0 mov -0x10(%rbp),%rbx
a: 4c 8b 65 f8 mov -0x8(%rbp),%r12
e: c9 leaveq
f: c3 retq
10: 48 89 58 70 mov %rbx,0x70(%rax)
14: eb .byte 0xeb
Thanks,
//richard
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 12:06 ` Richard Weinberger
@ 2013-06-13 13:29 ` Michal Hocko
2013-06-13 13:32 ` Michal Hocko
0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 13:29 UTC (permalink / raw)
To: Richard Weinberger
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
[...]
> All code
> ========
> 0: 89 50 08 mov %edx,0x8(%rax)
> 3: 48 89 d1 mov %rdx,%rcx
> 6: 0f 1f 40 00 nopl 0x0(%rax)
> a: 49 8b 04 24 mov (%r12),%rax
> e: 48 89 c2 mov %rax,%rdx
> 11: 48 c1 e8 38 shr $0x38,%rax
> 15: 83 e0 03 and $0x3,%eax
nid = page_to_nid
> 18: 48 c1 ea 3a shr $0x3a,%rdx
zid = page_zonenum
> 1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
> 23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
&memcg->nodeinfo[nid]->zoneinfo[zid]
> 2a: 00
> 2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
OK, so this maps to:
if (unlikely(lruvec->zone != zone)) <<<
lruvec->zone = zone;
> [35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> [35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
RAX (lruvec) is obviously incorrect and doesn't make any sense. RAX should
contain an address at an offset from ffff88003e04a800, but there is 0x138
there instead.
Is this easily reproducible? Could you configure kdump?
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 13:29 ` Michal Hocko
@ 2013-06-13 13:32 ` Michal Hocko
2013-06-13 13:34 ` Richard Weinberger
0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 13:32 UTC (permalink / raw)
To: Richard Weinberger
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
Ohh, and could you post the config please? Sorry, I should have asked
earlier.
On Thu 13-06-13 15:29:08, Michal Hocko wrote:
>
> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
> [...]
> > All code
> > ========
> > 0: 89 50 08 mov %edx,0x8(%rax)
> > 3: 48 89 d1 mov %rdx,%rcx
> > 6: 0f 1f 40 00 nopl 0x0(%rax)
> > a: 49 8b 04 24 mov (%r12),%rax
> > e: 48 89 c2 mov %rax,%rdx
> > 11: 48 c1 e8 38 shr $0x38,%rax
> > 15: 83 e0 03 and $0x3,%eax
> nid = page_to_nid
> > 18: 48 c1 ea 3a shr $0x3a,%rdx
> zid = page_zonenum
>
> > 1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
> > 23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
> &memcg->nodeinfo[nid]->zoneinfo[zid]
>
> > 2a: 00
> > 2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
>
> OK, so this maps to:
> if (unlikely(lruvec->zone != zone)) <<<
> lruvec->zone = zone;
>
> > [35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
> > [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> > [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> > [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> > [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> > [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> > [35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
>
> RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
> contain an address at an offset from ffff88003e04a800 But there is 0x138 there
> instead.
>
> Is this easily reproducible? Could you configure kdump.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 13:32 ` Michal Hocko
@ 2013-06-13 13:34 ` Richard Weinberger
2013-06-13 14:39 ` Michal Hocko
0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2013-06-13 13:34 UTC (permalink / raw)
To: Michal Hocko
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
[-- Attachment #1: Type: text/plain, Size: 2050 bytes --]
Am 13.06.2013 15:32, schrieb Michal Hocko:
> Ohh and could you post the config please? Sorry should have asked
> earlier.
See attachment.
> On Thu 13-06-13 15:29:08, Michal Hocko wrote:
>>
>> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
>> [...]
>>> All code
>>> ========
>>> 0: 89 50 08 mov %edx,0x8(%rax)
>>> 3: 48 89 d1 mov %rdx,%rcx
>>> 6: 0f 1f 40 00 nopl 0x0(%rax)
>>> a: 49 8b 04 24 mov (%r12),%rax
>>> e: 48 89 c2 mov %rax,%rdx
>>> 11: 48 c1 e8 38 shr $0x38,%rax
>>> 15: 83 e0 03 and $0x3,%eax
>> nid = page_to_nid
>>> 18: 48 c1 ea 3a shr $0x3a,%rdx
>> zid = page_zonenum
>>
>>> 1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
>>> 23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
>> &memcg->nodeinfo[nid]->zoneinfo[zid]
>>
>>> 2a: 00
>>> 2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
>>
>> OK, so this maps to:
>> if (unlikely(lruvec->zone != zone)) <<<
>> lruvec->zone = zone;
>>
>>> [35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
>>> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
>>> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
>>> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
>>> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
>>> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
>>> [35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
>>
>> RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
>> contain an address at an offset from ffff88003e04a800 But there is 0x138 there
>> instead.
>>
>> Is this easily reproducible? Could you configure kdump.
Not really. So far it happened only once...
Thanks,
//richard
[-- Attachment #2: .config --]
[-- Type: application/x-config, Size: 65446 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 13:34 ` Richard Weinberger
@ 2013-06-13 14:39 ` Michal Hocko
2013-06-13 14:45 ` Richard Weinberger
0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 14:39 UTC (permalink / raw)
To: Richard Weinberger
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
> Am 13.06.2013 15:32, schrieb Michal Hocko:
> >Ohh and could you post the config please? Sorry should have asked
> >earlier.
>
> See attachment.
Nothing unusual there. Could you enable CONFIG_DEBUG_VM? Maybe it will
help to catch the problem earlier.
> >On Thu 13-06-13 15:29:08, Michal Hocko wrote:
> >>
> >>On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
> >>[...]
> >>>All code
> >>>========
> >>> 0: 89 50 08 mov %edx,0x8(%rax)
> >>> 3: 48 89 d1 mov %rdx,%rcx
> >>> 6: 0f 1f 40 00 nopl 0x0(%rax)
> >>> a: 49 8b 04 24 mov (%r12),%rax
> >>> e: 48 89 c2 mov %rax,%rdx
> >>> 11: 48 c1 e8 38 shr $0x38,%rax
> >>> 15: 83 e0 03 and $0x3,%eax
> >> nid = page_to_nid
> >>> 18: 48 c1 ea 3a shr $0x3a,%rdx
> >> zid = page_zonenum
Ohh, I am wrong here. rdx should be nid and eax the zid.
> >>
> >>> 1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
> >>> 23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
> >> &memcg->nodeinfo[nid]->zoneinfo[zid]
> >>
> >>> 2a: 00
> >>> 2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
> >>
> >>OK, so this maps to:
> >> if (unlikely(lruvec->zone != zone)) <<<
> >> lruvec->zone = zone;
> >>
> >>>[35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
> >>>[35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> >>>[35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> >>>[35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> >>>[35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> >>>[35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> >>>[35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
> >>
> >>RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
> >>contain an address at an offset from ffff88003e04a800 But there is 0x138 there
> >>instead.
Hmm, looking at the registers again, RDX, which should be the nid, seems
to be quite big. It says this is node 32. Does the machine really have
that many NUMA nodes?
Also I think the trapping instruction was one instruction above:
IP: [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
0x000000000004fb09 <+121>: add 0x2e0(%rcx,%rdx,8),%rax
0x000000000004fb11 <+129>: cmp 0x70(%rax),%rbx
rather than the cmp marked above. This would explain why rax is 0x138:
that corresponds to zid=1, and 0x138 is the offset of the
mem_cgroup_per_zone for that zone within mem_cgroup_per_node. This would
mean that the struct page contains a weird node id.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 14:39 ` Michal Hocko
@ 2013-06-13 14:45 ` Richard Weinberger
2013-06-13 14:57 ` Richard Weinberger
0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2013-06-13 14:45 UTC (permalink / raw)
To: Michal Hocko
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
Am 13.06.2013 16:39, schrieb Michal Hocko:
> On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
>> Am 13.06.2013 15:32, schrieb Michal Hocko:
>>> Ohh and could you post the config please? Sorry should have asked
>>> earlier.
>>
>> See attachment.
>
> Nothing unusual there. Could you enable CONFIG_DEBUG_VM? Maybe it will
> help to catch the problem earlier.
OK
>>> On Thu 13-06-13 15:29:08, Michal Hocko wrote:
>>>>
>>>> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
>>>> [...]
>>>>> All code
>>>>> ========
>>>>> 0: 89 50 08 mov %edx,0x8(%rax)
>>>>> 3: 48 89 d1 mov %rdx,%rcx
>>>>> 6: 0f 1f 40 00 nopl 0x0(%rax)
>>>>> a: 49 8b 04 24 mov (%r12),%rax
>>>>> e: 48 89 c2 mov %rax,%rdx
>>>>> 11: 48 c1 e8 38 shr $0x38,%rax
>>>>> 15: 83 e0 03 and $0x3,%eax
>>>> nid = page_to_nid
>>>>> 18: 48 c1 ea 3a shr $0x3a,%rdx
>>>> zid = page_zonenum
>
> Ohh, I am wrong here. rdx should be nid and eax the zid.
>
>>>>
>>>>> 1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
>>>>> 23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
>>>> &memcg->nodeinfo[nid]->zoneinfo[zid]
>>>>
>>>>> 2a: 00
>>>>> 2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
>>>>
>>>> OK, so this maps to:
>>>> if (unlikely(lruvec->zone != zone)) <<<
>>>> lruvec->zone = zone;
>>>>
>>>>> [35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
>>>>> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
>>>>> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
>>>>> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
>>>>> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
>>>>> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
>>>>> [35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
>>>>
>>>> RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
>>>> contain an address at an offset from ffff88003e04a800 But there is 0x138 there
>>>> instead.
>
> Hmm, now that I am looking at the registers again. RDX which should be
> nid seems to be quite big. It says this is node 32. Does the machine
> have really so many NUMA nodes?
No. It's a KVM guest with two CPUs. Nothing special.
qemu command line:
qemu-kvm -m 1G -drive file=lxc_host.qcow2,if=virtio -nographic -kernel linux/arch/x86/boot/bzImage -append console=ttyS0 root=/dev/vda2 -net user,hostfwd=tcp::5555-:22 -net nic,model=e1000 -smp 4
Thanks,
//richard
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 14:45 ` Richard Weinberger
@ 2013-06-13 14:57 ` Richard Weinberger
2013-06-13 15:19 ` Michal Hocko
0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2013-06-13 14:57 UTC (permalink / raw)
To: Michal Hocko
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
Am 13.06.2013 16:45, schrieb Richard Weinberger:
> Am 13.06.2013 16:39, schrieb Michal Hocko:
>> On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
>>> Am 13.06.2013 15:32, schrieb Michal Hocko:
>>>> Ohh and could you post the config please? Sorry should have asked
>>>> earlier.
>>>
>>> See attachment.
>>
>> Nothing unusual there. Could you enable CONFIG_DEBUG_VM? Maybe it will
>> help to catch the problem earlier.
>
> OK
>
>>>> On Thu 13-06-13 15:29:08, Michal Hocko wrote:
>>>>>
>>>>> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
>>>>> [...]
>>>>>> All code
>>>>>> ========
>>>>>> 0: 89 50 08 mov %edx,0x8(%rax)
>>>>>> 3: 48 89 d1 mov %rdx,%rcx
>>>>>> 6: 0f 1f 40 00 nopl 0x0(%rax)
>>>>>> a: 49 8b 04 24 mov (%r12),%rax
>>>>>> e: 48 89 c2 mov %rax,%rdx
>>>>>> 11: 48 c1 e8 38 shr $0x38,%rax
>>>>>> 15: 83 e0 03 and $0x3,%eax
>>>>> nid = page_to_nid
>>>>>> 18: 48 c1 ea 3a shr $0x3a,%rdx
>>>>> zid = page_zonenum
>>
>> Ohh, I am wrong here. rdx should be nid and eax the zid.
>>
>>>>>
>>>>>> 1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
>>>>>> 23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
>>>>> &memcg->nodeinfo[nid]->zoneinfo[zid]
>>>>>
>>>>>> 2a: 00
>>>>>> 2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
>>>>>
>>>>> OK, so this maps to:
>>>>> if (unlikely(lruvec->zone != zone)) <<<
>>>>> lruvec->zone = zone;
>>>>>
>>>>>> [35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
>>>>>> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
>>>>>> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
>>>>>> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
>>>>>> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
>>>>>> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
>>>>>> [35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
>>>>>
>>>>> RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
>>>>> contain an address at an offset from ffff88003e04a800 But there is 0x138 there
>>>>> instead.
>>
>> Hmm, now that I am looking at the registers again. RDX which should be
>> nid seems to be quite big. It says this is node 32. Does the machine
>> have really so many NUMA nodes?
>
> No. It's a KVM guest with two CPUs. Nothing special.
> qemu command line:
> qemu-kvm -m 1G -drive file=lxc_host.qcow2,if=virtio -nographic -kernel linux/arch/x86/boot/bzImage -append console=ttyS0 root=/dev/vda2 -net user,hostfwd=tcp::5555-:22 -net
> nic,model=e1000 -smp 4
Errr, I meant four CPUs. :)
Thanks,
//richard
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
2013-06-13 14:57 ` Richard Weinberger
@ 2013-06-13 15:19 ` Michal Hocko
0 siblings, 0 replies; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 15:19 UTC (permalink / raw)
To: Richard Weinberger
Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes
On Thu 13-06-13 16:57:23, Richard Weinberger wrote:
> Am 13.06.2013 16:45, schrieb Richard Weinberger:
> >Am 13.06.2013 16:39, schrieb Michal Hocko:
> >>On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
> >>>Am 13.06.2013 15:32, schrieb Michal Hocko:
> >>>>Ohh and could you post the config please? Sorry should have asked
> >>>>earlier.
> >>>
> >>>See attachment.
> >>
> >>Nothing unusual there. Could you enable CONFIG_DEBUG_VM? Maybe it will
> >>help to catch the problem earlier.
> >
> >OK
> >
> >>>>On Thu 13-06-13 15:29:08, Michal Hocko wrote:
> >>>>>
> >>>>>On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
> >>>>>[...]
> >>>>>>All code
> >>>>>>========
> >>>>>> 0: 89 50 08 mov %edx,0x8(%rax)
> >>>>>> 3: 48 89 d1 mov %rdx,%rcx
> >>>>>> 6: 0f 1f 40 00 nopl 0x0(%rax)
> >>>>>> a: 49 8b 04 24 mov (%r12),%rax
> >>>>>> e: 48 89 c2 mov %rax,%rdx
> >>>>>> 11: 48 c1 e8 38 shr $0x38,%rax
> >>>>>> 15: 83 e0 03 and $0x3,%eax
> >>>>> nid = page_to_nid
> >>>>>> 18: 48 c1 ea 3a shr $0x3a,%rdx
> >>>>> zid = page_zonenum
> >>
> >>Ohh, I am wrong here. rdx should be nid and eax the zid.
> >>
> >>>>>
> >>>>>> 1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
> >>>>>> 23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
> >>>>> &memcg->nodeinfo[nid]->zoneinfo[zid]
> >>>>>
> >>>>>> 2a: 00
> >>>>>> 2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
> >>>>>
> >>>>>OK, so this maps to:
> >>>>> if (unlikely(lruvec->zone != zone)) <<<
> >>>>> lruvec->zone = zone;
> >>>>>
> >>>>>>[35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
> >>>>>>[35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> >>>>>>[35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> >>>>>>[35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> >>>>>>[35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> >>>>>>[35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> >>>>>>[35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
> >>>>>
> >>>>>RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
> >>>>>contain an address at an offset from ffff88003e04a800 But there is 0x138 there
> >>>>>instead.
> >>
> >>Hmm, now that I am looking at the registers again. RDX which should be
> >>nid seems to be quite big. It says this is node 32. Does the machine
> >>have really so many NUMA nodes?
> >
> >No. It's a KVM guest with two CPUs. Nothing special.
> >qemu command line:
> >qemu-kvm -m 1G -drive file=lxc_host.qcow2,if=virtio -nographic -kernel linux/arch/x86/boot/bzImage -append console=ttyS0 root=/dev/vda2 -net user,hostfwd=tcp::5555-:22 -net
> >nic,model=e1000 -smp 4
OK, then something probably overwrites page->flags. I would be more
inclined to blame some other code ;)
Maybe DEBUG_VM will start shouting earlier.
> Errr, I meant four CPUs. :)
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 10+ messages in thread