mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
@ 2013-06-13 11:48 richard -rw- weinberger
  2013-06-13 12:02 ` Michal Hocko
  0 siblings, 1 reply; 10+ messages in thread
From: richard -rw- weinberger @ 2013-06-13 11:48 UTC (permalink / raw)
  To: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, Michal Hocko, hannes

Hi!

While playing with user namespaces my kernel crashed under heavy load.
Kernel is 3.9.0 plus some trivial patches.

[35355.882105] BUG: unable to handle kernel NULL pointer dereference
at 00000000000001a8
[35355.883056] IP: [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
[35355.883056] PGD 0
[35355.883056] Oops: 0000 [#1] SMP
[35355.883056] CPU 3
[35355.883056] Pid: 477, comm: kswapd0 Not tainted 3.9.0+ #12 Bochs Bochs
[35355.883056] RIP: 0010:[<ffffffff811297d9>]  [<ffffffff811297d9>]
mem_cgroup_page_lruvec+0x79/0x90
[35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
[35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
[35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
[35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
[35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
[35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
[35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
knlGS:0000000000000000
[35355.883056] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[35355.883056] CR2: 00000000000001a8 CR3: 0000000001a0b000 CR4: 00000000000006e0
[35355.883056] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[35355.883056] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[35355.883056] Process kswapd0 (pid: 477, threadinfo ffff88003d522000,
task ffff88003db6dc40)
[35355.883056] Stack:
[35355.883056]  0000000000000014 ffffea0000a14000 ffff88003d523b28
ffffffff810ea4c5
[35355.883056]  ffff88003d523b90 ffff88003fffa9c0 ffff88003d523b98
0000000000000020
[35355.883056]  ffff88003fffa600 0000000200000003 ffff88003fffa600
ffff88003d523b98
[35355.883056] Call Trace:
[35355.883056]  [<ffffffff810ea4c5>] move_active_pages_to_lru+0x65/0x190
[35355.883056]  [<ffffffff810ec4e7>] shrink_active_list+0x297/0x380
[35355.883056]  [<ffffffff810ebff6>] ? shrink_inactive_list+0x1a6/0x400
[35355.883056]  [<ffffffff810ec815>] shrink_lruvec+0x245/0x4b0
[35355.883056]  [<ffffffff810ecae6>] shrink_zone+0x66/0x180
[35355.883056]  [<ffffffff810edcb4>] balance_pgdat+0x474/0x5b0
[35355.883056]  [<ffffffff810edf58>] kswapd+0x168/0x440
[35355.883056]  [<ffffffff8105d310>] ? abort_exclusive_wait+0xb0/0xb0
[35355.883056]  [<ffffffff810eddf0>] ? balance_pgdat+0x5b0/0x5b0
[35355.883056]  [<ffffffff8105c5fb>] kthread+0xbb/0xc0
[35355.883056]  [<ffffffff8105c540>] ? __kthread_unpark+0x50/0x50
[35355.883056]  [<ffffffff81748eac>] ret_from_fork+0x7c/0xb0
[35355.883056]  [<ffffffff8105c540>] ? __kthread_unpark+0x50/0x50
[35355.883056] Code: 89 50 08 48 89 d1 0f 1f 40 00 49 8b 04 24 48 89
c2 48 c1 e8 38 83 e0 03 48 c1 ea 3a 48 69 c0 38 01 00 00 48 03 84 d1
e0 02 00 00 <48> 3b 58 70 75 0a 48 8b 5d f0 4c 8b 65 f8 c9 c3 48 89 58
70 eb
[35355.883056] RIP  [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
[35355.883056]  RSP <ffff88003d523aa8>
[35355.883056] CR2: 00000000000001a8
[35355.883056] ---[ end trace 2c9b8eec517f960d ]---


--
Thanks,
//richard

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 11:48 mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8 richard -rw- weinberger
@ 2013-06-13 12:02 ` Michal Hocko
  2013-06-13 12:06   ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 12:02 UTC (permalink / raw)
  To: richard -rw- weinberger
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes

On Thu 13-06-13 13:48:27, richard -rw- weinberger wrote:
> Hi!
> 
> While playing with user namespaces my kernel crashed under heavy load.
> Kernel is 3.9.0 plus some trivial patches.

Could you post disassembly for mem_cgroup_page_lruvec?

> [35355.882105] BUG: unable to handle kernel NULL pointer dereference
> at 00000000000001a8
> [35355.883056] IP: [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
> [35355.883056] PGD 0
> [35355.883056] Oops: 0000 [#1] SMP
> [35355.883056] CPU 3
> [35355.883056] Pid: 477, comm: kswapd0 Not tainted 3.9.0+ #12 Bochs Bochs
> [35355.883056] RIP: 0010:[<ffffffff811297d9>]  [<ffffffff811297d9>]
> mem_cgroup_page_lruvec+0x79/0x90
> [35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> [35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
> knlGS:0000000000000000
> [35355.883056] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [35355.883056] CR2: 00000000000001a8 CR3: 0000000001a0b000 CR4: 00000000000006e0
> [35355.883056] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [35355.883056] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [35355.883056] Process kswapd0 (pid: 477, threadinfo ffff88003d522000,
> task ffff88003db6dc40)
> [35355.883056] Stack:
> [35355.883056]  0000000000000014 ffffea0000a14000 ffff88003d523b28
> ffffffff810ea4c5
> [35355.883056]  ffff88003d523b90 ffff88003fffa9c0 ffff88003d523b98
> 0000000000000020
> [35355.883056]  ffff88003fffa600 0000000200000003 ffff88003fffa600
> ffff88003d523b98
> [35355.883056] Call Trace:
> [35355.883056]  [<ffffffff810ea4c5>] move_active_pages_to_lru+0x65/0x190
> [35355.883056]  [<ffffffff810ec4e7>] shrink_active_list+0x297/0x380
> [35355.883056]  [<ffffffff810ebff6>] ? shrink_inactive_list+0x1a6/0x400
> [35355.883056]  [<ffffffff810ec815>] shrink_lruvec+0x245/0x4b0
> [35355.883056]  [<ffffffff810ecae6>] shrink_zone+0x66/0x180
> [35355.883056]  [<ffffffff810edcb4>] balance_pgdat+0x474/0x5b0
> [35355.883056]  [<ffffffff810edf58>] kswapd+0x168/0x440
> [35355.883056]  [<ffffffff8105d310>] ? abort_exclusive_wait+0xb0/0xb0
> [35355.883056]  [<ffffffff810eddf0>] ? balance_pgdat+0x5b0/0x5b0
> [35355.883056]  [<ffffffff8105c5fb>] kthread+0xbb/0xc0
> [35355.883056]  [<ffffffff8105c540>] ? __kthread_unpark+0x50/0x50
> [35355.883056]  [<ffffffff81748eac>] ret_from_fork+0x7c/0xb0
> [35355.883056]  [<ffffffff8105c540>] ? __kthread_unpark+0x50/0x50
> [35355.883056] Code: 89 50 08 48 89 d1 0f 1f 40 00 49 8b 04 24 48 89
> c2 48 c1 e8 38 83 e0 03 48 c1 ea 3a 48 69 c0 38 01 00 00 48 03 84 d1
> e0 02 00 00 <48> 3b 58 70 75 0a 48 8b 5d f0 4c 8b 65 f8 c9 c3 48 89 58
> 70 eb
> [35355.883056] RIP  [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90
> [35355.883056]  RSP <ffff88003d523aa8>
> [35355.883056] CR2: 00000000000001a8
> [35355.883056] ---[ end trace 2c9b8eec517f960d ]---
> 
> 
> --
> Thanks,
> //richard

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 12:02 ` Michal Hocko
@ 2013-06-13 12:06   ` Richard Weinberger
  2013-06-13 13:29     ` Michal Hocko
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2013-06-13 12:06 UTC (permalink / raw)
  To: Michal Hocko
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes

Am 13.06.2013 14:02, schrieb Michal Hocko:
> On Thu 13-06-13 13:48:27, richard -rw- weinberger wrote:
>> Hi!
>>
>> While playing with user namespaces my kernel crashed under heavy load.
>> Kernel is 3.9.0 plus some trivial patches.
>
> Could you post disassembly for mem_cgroup_page_lruvec?

Sure!

00000000000035e0 <mem_cgroup_page_lruvec>:
     35e0:       55                      push   %rbp
     35e1:       48 8d 86 c8 03 00 00    lea    0x3c8(%rsi),%rax
     35e8:       48 89 e5                mov    %rsp,%rbp
     35eb:       48 83 ec 10             sub    $0x10,%rsp
     35ef:       48 89 5d f0             mov    %rbx,-0x10(%rbp)
     35f3:       48 89 f3                mov    %rsi,%rbx
     35f6:       8b 35 00 00 00 00       mov    0x0(%rip),%esi        # 35fc <mem_cgroup_page_lruvec+0x1c>
     35fc:       4c 89 65 f8             mov    %r12,-0x8(%rbp)
     3600:       85 f6                   test   %esi,%esi
     3602:       75 55                   jne    3659 <mem_cgroup_page_lruvec+0x79>
     3604:       49 89 fc                mov    %rdi,%r12
     3607:       e8 00 00 00 00          callq  360c <mem_cgroup_page_lruvec+0x2c>
     360c:       49 8b 14 24             mov    (%r12),%rdx
     3610:       48 8b 48 08             mov    0x8(%rax),%rcx
     3614:       83 e2 20                and    $0x20,%edx
     3617:       75 1f                   jne    3638 <mem_cgroup_page_lruvec+0x58>
     3619:       48 8b 10                mov    (%rax),%rdx
     361c:       83 e2 02                and    $0x2,%edx
     361f:       75 17                   jne    3638 <mem_cgroup_page_lruvec+0x58>
     3621:       48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 3628 <mem_cgroup_page_lruvec+0x48>
     3628:       48 39 d1                cmp    %rdx,%rcx
     362b:       74 0b                   je     3638 <mem_cgroup_page_lruvec+0x58>
     362d:       48 89 50 08             mov    %rdx,0x8(%rax)
     3631:       48 89 d1                mov    %rdx,%rcx
     3634:       0f 1f 40 00             nopl   0x0(%rax)
     3638:       49 8b 04 24             mov    (%r12),%rax
     363c:       48 89 c2                mov    %rax,%rdx
     363f:       48 c1 e8 38             shr    $0x38,%rax
     3643:       83 e0 03                and    $0x3,%eax
     3646:       48 c1 ea 3a             shr    $0x3a,%rdx
     364a:       48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
     3651:       48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
     3658:       00
     3659:       48 3b 58 70             cmp    0x70(%rax),%rbx
     365d:       75 0a                   jne    3669 <mem_cgroup_page_lruvec+0x89>
     365f:       48 8b 5d f0             mov    -0x10(%rbp),%rbx
     3663:       4c 8b 65 f8             mov    -0x8(%rbp),%r12
     3667:       c9                      leaveq
     3668:       c3                      retq
     3669:       48 89 58 70             mov    %rbx,0x70(%rax)
     366d:       eb f0                   jmp    365f <mem_cgroup_page_lruvec+0x7f>
     366f:       90                      nop

FWIW the ./scripts/decodecode output:

All code
========
    0:   89 50 08                mov    %edx,0x8(%rax)
    3:   48 89 d1                mov    %rdx,%rcx
    6:   0f 1f 40 00             nopl   0x0(%rax)
    a:   49 8b 04 24             mov    (%r12),%rax
    e:   48 89 c2                mov    %rax,%rdx
   11:   48 c1 e8 38             shr    $0x38,%rax
   15:   83 e0 03                and    $0x3,%eax
   18:   48 c1 ea 3a             shr    $0x3a,%rdx
   1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
   23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
   2a:   00
   2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction
   2f:   75 0a                   jne    0x3b
   31:   48 8b 5d f0             mov    -0x10(%rbp),%rbx
   35:   4c 8b 65 f8             mov    -0x8(%rbp),%r12
   39:   c9                      leaveq
   3a:   c3                      retq
   3b:   48 89 58 70             mov    %rbx,0x70(%rax)
   3f:   eb                      .byte 0xeb

Code starting with the faulting instruction
===========================================
    0:   48 3b 58 70             cmp    0x70(%rax),%rbx
    4:   75 0a                   jne    0x10
    6:   48 8b 5d f0             mov    -0x10(%rbp),%rbx
    a:   4c 8b 65 f8             mov    -0x8(%rbp),%r12
    e:   c9                      leaveq
    f:   c3                      retq
   10:   48 89 58 70             mov    %rbx,0x70(%rax)
   14:   eb                      .byte 0xeb


Thanks,
//richard

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 12:06   ` Richard Weinberger
@ 2013-06-13 13:29     ` Michal Hocko
  2013-06-13 13:32       ` Michal Hocko
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 13:29 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes


On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
[...]
> All code
> ========
>    0:   89 50 08                mov    %edx,0x8(%rax)
>    3:   48 89 d1                mov    %rdx,%rcx
>    6:   0f 1f 40 00             nopl   0x0(%rax)
>    a:   49 8b 04 24             mov    (%r12),%rax
>    e:   48 89 c2                mov    %rax,%rdx
>   11:   48 c1 e8 38             shr    $0x38,%rax
>   15:   83 e0 03                and    $0x3,%eax
					nid = page_to_nid
>   18:   48 c1 ea 3a             shr    $0x3a,%rdx
					zid = page_zonenum

>   1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
>   23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
					&memcg->nodeinfo[nid]->zoneinfo[zid]

>   2a:   00
>   2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction

OK, so this maps to:
        if (unlikely(lruvec->zone != zone)) <<<
                lruvec->zone = zone;

> [35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> [35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)

RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
contain an address at an offset from ffff88003e04a800 But there is 0x138 there
instead.

Is this easily reproducible? Could you configure kdump.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 13:29     ` Michal Hocko
@ 2013-06-13 13:32       ` Michal Hocko
  2013-06-13 13:34         ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 13:32 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes

Ohh and could you post the config please? Sorry should have asked
earlier.

On Thu 13-06-13 15:29:08, Michal Hocko wrote:
> 
> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
> [...]
> > All code
> > ========
> >    0:   89 50 08                mov    %edx,0x8(%rax)
> >    3:   48 89 d1                mov    %rdx,%rcx
> >    6:   0f 1f 40 00             nopl   0x0(%rax)
> >    a:   49 8b 04 24             mov    (%r12),%rax
> >    e:   48 89 c2                mov    %rax,%rdx
> >   11:   48 c1 e8 38             shr    $0x38,%rax
> >   15:   83 e0 03                and    $0x3,%eax
> 					nid = page_to_nid
> >   18:   48 c1 ea 3a             shr    $0x3a,%rdx
> 					zid = page_zonenum
> 
> >   1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
> >   23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
> 					&memcg->nodeinfo[nid]->zoneinfo[zid]
> 
> >   2a:   00
> >   2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction
> 
> OK, so this maps to:
>         if (unlikely(lruvec->zone != zone)) <<<
>                 lruvec->zone = zone;
> 
> > [35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
> > [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> > [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> > [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> > [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> > [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> > [35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
> 
> RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
> contain an address at an offset from ffff88003e04a800 But there is 0x138 there
> instead.
> 
> Is this easily reproducible? Could you configure kdump.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 13:32       ` Michal Hocko
@ 2013-06-13 13:34         ` Richard Weinberger
  2013-06-13 14:39           ` Michal Hocko
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2013-06-13 13:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes

[-- Attachment #1: Type: text/plain, Size: 2050 bytes --]

Am 13.06.2013 15:32, schrieb Michal Hocko:
> Ohh and could you post the config please? Sorry should have asked
> earlier.

See attachment.

> On Thu 13-06-13 15:29:08, Michal Hocko wrote:
>>
>> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
>> [...]
>>> All code
>>> ========
>>>     0:   89 50 08                mov    %edx,0x8(%rax)
>>>     3:   48 89 d1                mov    %rdx,%rcx
>>>     6:   0f 1f 40 00             nopl   0x0(%rax)
>>>     a:   49 8b 04 24             mov    (%r12),%rax
>>>     e:   48 89 c2                mov    %rax,%rdx
>>>    11:   48 c1 e8 38             shr    $0x38,%rax
>>>    15:   83 e0 03                and    $0x3,%eax
>> 					nid = page_to_nid
>>>    18:   48 c1 ea 3a             shr    $0x3a,%rdx
>> 					zid = page_zonenum
>>
>>>    1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
>>>    23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
>> 					&memcg->nodeinfo[nid]->zoneinfo[zid]
>>
>>>    2a:   00
>>>    2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction
>>
>> OK, so this maps to:
>>          if (unlikely(lruvec->zone != zone)) <<<
>>                  lruvec->zone = zone;
>>
>>> [35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
>>> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
>>> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
>>> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
>>> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
>>> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
>>> [35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
>>
>> RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
>> contain an address at an offset from ffff88003e04a800 But there is 0x138 there
>> instead.
>>
>> Is this easily reproducible? Could you configure kdump.

Not really. So far it happened only once...

Thanks,
//richard


[-- Attachment #2: .config --]
[-- Type: application/x-config, Size: 65446 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 13:34         ` Richard Weinberger
@ 2013-06-13 14:39           ` Michal Hocko
  2013-06-13 14:45             ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 14:39 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes

On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
> Am 13.06.2013 15:32, schrieb Michal Hocko:
> >Ohh and could you post the config please? Sorry should have asked
> >earlier.
> 
> See attachment.

Nothing unusual there. Could you enable CONFIG_DEBUG_VM maybe it will
help too catch the problem earlier.

> >On Thu 13-06-13 15:29:08, Michal Hocko wrote:
> >>
> >>On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
> >>[...]
> >>>All code
> >>>========
> >>>    0:   89 50 08                mov    %edx,0x8(%rax)
> >>>    3:   48 89 d1                mov    %rdx,%rcx
> >>>    6:   0f 1f 40 00             nopl   0x0(%rax)
> >>>    a:   49 8b 04 24             mov    (%r12),%rax
> >>>    e:   48 89 c2                mov    %rax,%rdx
> >>>   11:   48 c1 e8 38             shr    $0x38,%rax
> >>>   15:   83 e0 03                and    $0x3,%eax
> >>					nid = page_to_nid
> >>>   18:   48 c1 ea 3a             shr    $0x3a,%rdx
> >>					zid = page_zonenum

Ohh, I am wrong here. rdx should be nid and eax the zid.

> >>
> >>>   1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
> >>>   23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
> >>					&memcg->nodeinfo[nid]->zoneinfo[zid]
> >>
> >>>   2a:   00
> >>>   2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction
> >>
> >>OK, so this maps to:
> >>         if (unlikely(lruvec->zone != zone)) <<<
> >>                 lruvec->zone = zone;
> >>
> >>>[35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
> >>>[35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> >>>[35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> >>>[35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> >>>[35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> >>>[35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> >>>[35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
> >>
> >>RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
> >>contain an address at an offset from ffff88003e04a800 But there is 0x138 there
> >>instead.

Hmm, now that I am looking at the registers again. RDX which should be
nid seems to be quite big. It says this is node 32. Does the machine
have really so many NUMA nodes?
Also I think the trapping instruction was one instruction above:
IP: [<ffffffff811297d9>] mem_cgroup_page_lruvec+0x79/0x90

0x000000000004fb09 <+121>:   add    0x2e0(%rcx,%rdx,8),%rax
0x000000000004fb11 <+129>:   cmp    0x70(%rax),%rbx

rather than cmp marked above. This would explain why rax is 138 because
that would point the zid=1 and 138 is offset of mem_cgroup_per_zone
within mem_cgroup_per_node for that zone. This would mean that the
struct page contains a weird node id.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 14:39           ` Michal Hocko
@ 2013-06-13 14:45             ` Richard Weinberger
  2013-06-13 14:57               ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2013-06-13 14:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes

Am 13.06.2013 16:39, schrieb Michal Hocko:
> On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
>> Am 13.06.2013 15:32, schrieb Michal Hocko:
>>> Ohh and could you post the config please? Sorry should have asked
>>> earlier.
>>
>> See attachment.
>
> Nothing unusual there. Could you enable CONFIG_DEBUG_VM maybe it will
> help too catch the problem earlier.

OK

>>> On Thu 13-06-13 15:29:08, Michal Hocko wrote:
>>>>
>>>> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
>>>> [...]
>>>>> All code
>>>>> ========
>>>>>     0:   89 50 08                mov    %edx,0x8(%rax)
>>>>>     3:   48 89 d1                mov    %rdx,%rcx
>>>>>     6:   0f 1f 40 00             nopl   0x0(%rax)
>>>>>     a:   49 8b 04 24             mov    (%r12),%rax
>>>>>     e:   48 89 c2                mov    %rax,%rdx
>>>>>    11:   48 c1 e8 38             shr    $0x38,%rax
>>>>>    15:   83 e0 03                and    $0x3,%eax
>>>> 					nid = page_to_nid
>>>>>    18:   48 c1 ea 3a             shr    $0x3a,%rdx
>>>> 					zid = page_zonenum
>
> Ohh, I am wrong here. rdx should be nid and eax the zid.
>
>>>>
>>>>>    1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
>>>>>    23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
>>>> 					&memcg->nodeinfo[nid]->zoneinfo[zid]
>>>>
>>>>>    2a:   00
>>>>>    2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction
>>>>
>>>> OK, so this maps to:
>>>>          if (unlikely(lruvec->zone != zone)) <<<
>>>>                  lruvec->zone = zone;
>>>>
>>>>> [35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
>>>>> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
>>>>> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
>>>>> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
>>>>> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
>>>>> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
>>>>> [35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
>>>>
>>>> RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
>>>> contain an address at an offset from ffff88003e04a800 But there is 0x138 there
>>>> instead.
>
> Hmm, now that I am looking at the registers again. RDX which should be
> nid seems to be quite big. It says this is node 32. Does the machine
> have really so many NUMA nodes?

No. It's a KVM guest with two CPUs. Nothing special.
qemu command line:
qemu-kvm -m 1G -drive file=lxc_host.qcow2,if=virtio -nographic -kernel linux/arch/x86/boot/bzImage -append console=ttyS0 root=/dev/vda2 -net user,hostfwd=tcp::5555-:22 -net 
nic,model=e1000 -smp 4

Thanks,
//richard

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 14:45             ` Richard Weinberger
@ 2013-06-13 14:57               ` Richard Weinberger
  2013-06-13 15:19                 ` Michal Hocko
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2013-06-13 14:57 UTC (permalink / raw)
  To: Michal Hocko
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes

Am 13.06.2013 16:45, schrieb Richard Weinberger:
> Am 13.06.2013 16:39, schrieb Michal Hocko:
>> On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
>>> Am 13.06.2013 15:32, schrieb Michal Hocko:
>>>> Ohh and could you post the config please? Sorry should have asked
>>>> earlier.
>>>
>>> See attachment.
>>
>> Nothing unusual there. Could you enable CONFIG_DEBUG_VM maybe it will
>> help too catch the problem earlier.
>
> OK
>
>>>> On Thu 13-06-13 15:29:08, Michal Hocko wrote:
>>>>>
>>>>> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
>>>>> [...]
>>>>>> All code
>>>>>> ========
>>>>>>     0:   89 50 08                mov    %edx,0x8(%rax)
>>>>>>     3:   48 89 d1                mov    %rdx,%rcx
>>>>>>     6:   0f 1f 40 00             nopl   0x0(%rax)
>>>>>>     a:   49 8b 04 24             mov    (%r12),%rax
>>>>>>     e:   48 89 c2                mov    %rax,%rdx
>>>>>>    11:   48 c1 e8 38             shr    $0x38,%rax
>>>>>>    15:   83 e0 03                and    $0x3,%eax
>>>>>                     nid = page_to_nid
>>>>>>    18:   48 c1 ea 3a             shr    $0x3a,%rdx
>>>>>                     zid = page_zonenum
>>
>> Ohh, I am wrong here. rdx should be nid and eax the zid.
>>
>>>>>
>>>>>>    1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
>>>>>>    23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
>>>>>                     &memcg->nodeinfo[nid]->zoneinfo[zid]
>>>>>
>>>>>>    2a:   00
>>>>>>    2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction
>>>>>
>>>>> OK, so this maps to:
>>>>>          if (unlikely(lruvec->zone != zone)) <<<
>>>>>                  lruvec->zone = zone;
>>>>>
>>>>>> [35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
>>>>>> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
>>>>>> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
>>>>>> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
>>>>>> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
>>>>>> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
>>>>>> [35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
>>>>>
>>>>> RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
>>>>> contain an address at an offset from ffff88003e04a800 But there is 0x138 there
>>>>> instead.
>>
>> Hmm, now that I am looking at the registers again. RDX which should be
>> nid seems to be quite big. It says this is node 32. Does the machine
>> have really so many NUMA nodes?
>
> No. It's a KVM guest with two CPUs. Nothing special.
> qemu command line:
> qemu-kvm -m 1G -drive file=lxc_host.qcow2,if=virtio -nographic -kernel linux/arch/x86/boot/bzImage -append console=ttyS0 root=/dev/vda2 -net user,hostfwd=tcp::5555-:22 -net
> nic,model=e1000 -smp 4

Errr, I meant four CPUs. :)

Thanks,
//richard

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
  2013-06-13 14:57               ` Richard Weinberger
@ 2013-06-13 15:19                 ` Michal Hocko
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Hocko @ 2013-06-13 15:19 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: LKML, linux-mm@kvack.org, cgroups mailinglist,
	kamezawa.hiroyu@jp.fujitsu.com, bsingharora, hannes

On Thu 13-06-13 16:57:23, Richard Weinberger wrote:
> Am 13.06.2013 16:45, schrieb Richard Weinberger:
> >Am 13.06.2013 16:39, schrieb Michal Hocko:
> >>On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
> >>>Am 13.06.2013 15:32, schrieb Michal Hocko:
> >>>>Ohh and could you post the config please? Sorry should have asked
> >>>>earlier.
> >>>
> >>>See attachment.
> >>
> >>Nothing unusual there. Could you enable CONFIG_DEBUG_VM maybe it will
> >>help too catch the problem earlier.
> >
> >OK
> >
> >>>>On Thu 13-06-13 15:29:08, Michal Hocko wrote:
> >>>>>
> >>>>>On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
> >>>>>[...]
> >>>>>>All code
> >>>>>>========
> >>>>>>    0:   89 50 08                mov    %edx,0x8(%rax)
> >>>>>>    3:   48 89 d1                mov    %rdx,%rcx
> >>>>>>    6:   0f 1f 40 00             nopl   0x0(%rax)
> >>>>>>    a:   49 8b 04 24             mov    (%r12),%rax
> >>>>>>    e:   48 89 c2                mov    %rax,%rdx
> >>>>>>   11:   48 c1 e8 38             shr    $0x38,%rax
> >>>>>>   15:   83 e0 03                and    $0x3,%eax
> >>>>>                    nid = page_to_nid
> >>>>>>   18:   48 c1 ea 3a             shr    $0x3a,%rdx
> >>>>>                    zid = page_zonenum
> >>
> >>Ohh, I am wrong here. rdx should be nid and eax the zid.
> >>
> >>>>>
> >>>>>>   1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
> >>>>>>   23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
> >>>>>                    &memcg->nodeinfo[nid]->zoneinfo[zid]
> >>>>>
> >>>>>>   2a:   00
> >>>>>>   2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction
> >>>>>
> >>>>>OK, so this maps to:
> >>>>>         if (unlikely(lruvec->zone != zone)) <<<
> >>>>>                 lruvec->zone = zone;
> >>>>>
> >>>>>>[35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
> >>>>>>[35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> >>>>>>[35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> >>>>>>[35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> >>>>>>[35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> >>>>>>[35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> >>>>>>[35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
> >>>>>
> >>>>>RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
> >>>>>contain an address at an offset from ffff88003e04a800 But there is 0x138 there
> >>>>>instead.
> >>
> >>Hmm, now that I am looking at the registers again. RDX which should be
> >>nid seems to be quite big. It says this is node 32. Does the machine
> >>have really so many NUMA nodes?
> >
> >No. It's a KVM guest with two CPUs. Nothing special.
> >qemu command line:
> >qemu-kvm -m 1G -drive file=lxc_host.qcow2,if=virtio -nographic -kernel linux/arch/x86/boot/bzImage -append console=ttyS0 root=/dev/vda2 -net user,hostfwd=tcp::5555-:22 -net
> >nic,model=e1000 -smp 4

OK, then something probably overwrites page->flags. I would be more
inclined to blame some other code ;)
Maybe DEBUG_VM will start shouting earlier
 
> Errr, I meant four CPUs. :)

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-06-13 15:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-13 11:48 mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8 richard -rw- weinberger
2013-06-13 12:02 ` Michal Hocko
2013-06-13 12:06   ` Richard Weinberger
2013-06-13 13:29     ` Michal Hocko
2013-06-13 13:32       ` Michal Hocko
2013-06-13 13:34         ` Richard Weinberger
2013-06-13 14:39           ` Michal Hocko
2013-06-13 14:45             ` Richard Weinberger
2013-06-13 14:57               ` Richard Weinberger
2013-06-13 15:19                 ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).