netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: gaofeng@cn.fujitsu.com, eric.dumazet@gmail.com,
	davem@davemloft.net, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, mark.d.rustad@intel.com,
	john.r.fastabend@intel.com, lizefan@huawei.com
Subject: Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
Date: Fri, 20 Jul 2012 16:48:22 +0530	[thread overview]
Message-ID: <50093E7E.6050203@linux.vnet.ibm.com> (raw)
In-Reply-To: <20120720110016.GA22367@hmsreliant.think-freely.org>

On 07/20/2012 04:30 PM, Neil Horman wrote:
> On Fri, Jul 20, 2012 at 03:34:47PM +0530, Srivatsa S. Bhat wrote:
>> On 07/19/2012 10:14 PM, Neil Horman wrote:
>>> On Thu, Jul 19, 2012 at 09:57:37PM +0530, Srivatsa S. Bhat wrote:
>>>> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
>>>> netprio cgroup), boot fails with the following NULL pointer dereference:
>>>>
>>>> Initializing cgroup subsys devices
>>>> Initializing cgroup subsys freezer
>>>> Initializing cgroup subsys net_cls
>>>> Initializing cgroup subsys blkio
>>>> Initializing cgroup subsys perf_event
>>>> Initializing cgroup subsys net_prio
>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
>>>> IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>>> PGD 0
>>>> Oops: 0000 [#1] SMP
>>>> CPU 0
>>>> Modules linked in:
>>>>
>>>> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
>>>> RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>>> RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
>>>> RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
>>>> RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
>>>> RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
>>>> R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
>>>> FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>> CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
>>>> Stack:
>>>>  ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
>>>>  ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
>>>>  0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
>>>> Call Trace:
>>>>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
>>>>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
>>>>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
>>>>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
>>>>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
>>>>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
>>>> Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
>>>> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>>>  RSP <ffffffff81a01ea8>
>>>> CR2: 0000000000000698
>>>> ---[ end trace a7919e7f17c0a725 ]---
>>>> Kernel panic - not syncing: Attempted to kill the idle task!
>>>>
>>>> The code corresponds to:
>>>>
>>>> update_netdev_tables():
>>>>         for_each_netdev(&init_net, dev) {
>>>>                 map = rtnl_dereference(dev->priomap);  <---- HERE
>>>>
>>>>
>>>> The list head is initialized in netdev_init(), which is called much
>>>> later than cgrp_create(). So the problem is that we are calling
>>>> update_netdev_tables() way too early (in cgrp_create()), which will
>>>> end up traversing the not-yet-circular linked list. So at some point,
>>>> the dev pointer will become NULL and hence dev->priomap becomes an
>>>> invalid access.
>>>>
>>>> To fix this, just remove the update_netdev_tables() function entirely,
>>>> since it appears that write_update_netdev_table() will handle things
>>>> just fine.
>>>>
>>>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>>>> ---
>>>>
>>>> Requesting a thorough review of this patch, since I am not sure whether
>>>> removing update_netdev_tables() is perfectly OK and whether that is the
>>>> right thing to do.
>>>>
>>> We could do this I suppose, but this has already been fixed by
>>> 734b65417b24d6eea3e3d7457e1f11493890ee1d
>>
>> Oh good! But don't you think that my patch looks cleaner than that fix?
>> (Of course, provided that my patch is correct!)
>>
>> Anyway, I'm happy to see that the boot failure is fixed. But if anyone feels
>> that the approach of removing the update_netdev_tables() function is correct
>> and better, I'll be happy to provide a patch on top of the boot-fix that
>> went upstream.
>>
> We're almost at the end of a release.  The fix that went in has been tesetd and
> fixes the specific problem that was reported, with almost zero likelyhood of
> causing other regressions. While this fix looks like it might be preferable,
> this isn't a time to go doing something like this without alot more testing, as
> it may cause unforseen problems.
> 

Oh definitely! I didn't mean to suggest doing these changes right away.
It can surely wait.. :)

> Theres also a larger issue of initalization order that I'll be looking at in the
> next few weeks.  Based on the outcome of that I may roll this change in.
> 

Sure, thanks!

Regards,
Srivatsa S. Bhat

  reply	other threads:[~2012-07-20 11:19 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-19 16:27 [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list Srivatsa S. Bhat
2012-07-19 16:44 ` Neil Horman
2012-07-20 10:04   ` Srivatsa S. Bhat
2012-07-20 11:00     ` Neil Horman
2012-07-20 11:18       ` Srivatsa S. Bhat [this message]
2012-07-23  1:15 ` Gao feng
2012-07-23 11:40   ` Neil Horman
2012-09-10  9:29     ` Srivatsa S. Bhat
2012-09-10 13:16       ` Neil Horman
2012-09-11 11:21         ` Srivatsa S. Bhat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50093E7E.6050203@linux.vnet.ibm.com \
    --to=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=gaofeng@cn.fujitsu.com \
    --cc=john.r.fastabend@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=mark.d.rustad@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).