From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752853Ab2GWBOt (ORCPT ); Sun, 22 Jul 2012 21:14:49 -0400
Received: from cn.fujitsu.com ([222.73.24.84]:12786 "EHLO song.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org with ESMTP id S1752618Ab2GWBOr convert rfc822-to-8bit (ORCPT ); Sun, 22 Jul 2012 21:14:47 -0400
X-IronPort-AV: E=Sophos;i="4.77,635,1336320000"; d="scan'208";a="5455260"
Message-ID: <500CA599.6030907@cn.fujitsu.com>
Date: Mon, 23 Jul 2012 09:15:05 +0800
From: Gao feng
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1
MIME-Version: 1.0
To: "Srivatsa S. Bhat" , eric.dumazet@gmail.com, davem@davemloft.net
CC: nhorman@tuxdriver.com, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, mark.d.rustad@intel.com, john.r.fastabend@intel.com, lizefan@huawei.com
Subject: Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
References: <20120719162532.23505.85946.stgit@srivatsabhat.in.ibm.com>
In-Reply-To: <20120719162532.23505.85946.stgit@srivatsabhat.in.ibm.com>
X-MIMETrack: Itemize by SMTP Server on mailserver/fnst(Release 8.5.3|September 15, 2011) at 2012/07/23 09:15:20, Serialize by Router on mailserver/fnst(Release 8.5.3|September 15, 2011) at 2012/07/23 09:15:22
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 2012-07-20 00:27, Srivatsa S. Bhat wrote:
> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
> netprio cgroup), boot fails with the following NULL pointer dereference:
>
> Initializing cgroup subsys devices
> Initializing cgroup subsys freezer
> Initializing cgroup subsys net_cls
> Initializing cgroup subsys blkio
> Initializing cgroup subsys perf_event
> Initializing cgroup subsys net_prio
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
> IP: [] cgrp_create+0xf6/0x190
> PGD 0
> Oops: 0000 [#1] SMP
> CPU 0
> Modules linked in:
>
> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
> RIP: 0010:[] [] cgrp_create+0xf6/0x190
> RSP: 0000:ffffffff81a01ea8 EFLAGS: 00010213
> RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
> RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
> R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
> R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
> FS: 0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
> Stack:
> ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
> ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
> 0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
> Call Trace:
> [] cgroup_init_subsys+0x83/0x169
> [] cgroup_init+0x36/0x119
> [] start_kernel+0x3ba/0x3ef
> [] ? kernel_init+0x27b/0x27b
> [] x86_64_start_reservations+0x131/0x136
> [] x86_64_start_kernel+0x103/0x112
> Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
> RIP [] cgrp_create+0xf6/0x190
> RSP
> CR2: 0000000000000698
> ---[ end trace a7919e7f17c0a725 ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
>
> The code corresponds to:
>
> update_netdev_tables():
>     for_each_netdev(&init_net, dev) {
>         map = rtnl_dereference(dev->priomap); <---- HERE
>
>
> The list head is initialized in netdev_init(), which is called much
> later than cgrp_create(). So the problem is that we are calling
> update_netdev_tables() way too early (in cgrp_create()), which will
> end up traversing the not-yet-circular linked list. So at some point,
> the dev pointer will become NULL and hence dev->priomap becomes an
> invalid access.
>
> To fix this, just remove the update_netdev_tables() function entirely,
> since it appears that write_update_netdev_table() will handle things
> just fine.

The reason I added update_netdev_tables() to cgrp_create() was to avoid
additional bounds checks when accessing dev->priomap.priomap.

Eric, can we revert commit 91c68ce2b26319248a32d7baa1226f819d283758 now?
I think it is safe enough to access priomap without a bounds check.

Thanks
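
For reference, the failure mode described in the quoted report (walking a list whose head has not yet been made circular) can be reproduced in userspace. The following is only a minimal sketch under stated assumptions: struct fake_dev, count_devs() and the small list helpers are hypothetical stand-ins written for illustration, not the kernel's net_device or list.h code. With a zeroed head, head->next is NULL, so a kernel-style loop that converts the link back to its containing structure ends up with a small negative pointer. That appears consistent with the oops above, where RAX is 0, RBX is ffffffffffffff10, and the faulting address in CR2 is 0x698.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical device record; only the list linkage matters here. */
struct fake_dev {
    int priomap_len;
    struct list_head dev_list;
};

/* Make the head point at itself, i.e. an empty circular list. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Append an entry before the head; the head must be initialized. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    assert(head->next != NULL && head->prev != NULL);
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/*
 * Count the entries on the list. Returns -1 if the head was never
 * initialized. An unguarded walk would follow the NULL next pointer
 * instead, which is the situation described in the report.
 */
static int count_devs(const struct list_head *head)
{
    const struct list_head *pos;
    int n = 0;

    if (head->next == NULL)
        return -1;
    for (pos = head->next; pos != head; pos = pos->next)
        n++;
    return n;
}
```

Calling count_devs() on a zeroed head returns -1, on a freshly initialized head returns 0, and after two list_add_tail() calls returns 2. The guard is only there to make the sketch safe to run; it is not a suggestion for how the kernel should handle the ordering problem.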