From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755144Ab2DGBMa (ORCPT ); Fri, 6 Apr 2012 21:12:30 -0400 Received: from out07.mta.xmission.com ([166.70.13.237]:36242 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753101Ab2DGBM1 convert rfc822-to-8bit (ORCPT ); Fri, 6 Apr 2012 21:12:27 -0400 From: ebiederm@xmission.com (Eric W. Biederman) To: Sasha Levin Cc: davem , Eric Dumazet , Eric Van Hensbergen , Dave Jones , linux-kernel , netdev References: <1333664446.3538.12.camel@lappy> Date: Fri, 06 Apr 2012 18:16:10 -0700 In-Reply-To: (Sasha Levin's message of "Fri, 6 Apr 2012 11:04:25 +0200") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT X-XM-SPF: eid=;;;mid=;;;hst=in02.mta.xmission.com;;;ip=98.207.153.68;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19lWuZE6HYisn9HprL0DIX6vkVnvNwk+x0= X-SA-Exim-Connect-IP: 98.207.153.68 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Report: * 1.5 TR_Symld_Words too many words that have symbols inside * 0.0 T_TM2_M_HEADER_IN_MSG BODY: T_TM2_M_HEADER_IN_MSG * -3.0 BAYES_00 BODY: Bayes spam probability is 0 to 1% * [score: 0.0000] * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa03 1397; Body=1 Fuz1=1 Fuz2=1] * 0.4 FVGT_m_MULTI_ODD Contains multiple odd letter combinations * 0.0 T_TooManySym_01 4+ unique symbols in subject * 0.1 XMSolicitRefs_0 Weightloss drug * 0.4 UNTRUSTED_Relay Comes from a non-trusted relay X-Spam-DCC: XMission; sa03 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: ;Sasha Levin X-Spam-Relay-Country: ** Subject: Re: net: kernel BUG() in net/netns/generic.h:45 X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Fri, 06 Aug 2010 16:31:04 -0600) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Sasha Levin writes: > On Fri, Apr 6, 2012 at 1:53 AM, Eric W. Biederman wrote: >> Sasha Levin writes: >> >>> Hi all, >>> >>> When an initialization of a network namespace in setup_net() fails, we >>> try to undo everything by executing each of the exit callbacks of every >>> namespace in the network. >>> >>> The problem is, it might be possible that the net_generic array wasn't >>> initialized before we fail and try to undo everything. At that point, >>> some of the networks assume that since we're already calling the exit >>> callback, the net_generic structure is initialized and we hit the BUG() >>> in net/netns/generic.h:45 . >>> >>> I'm not quite sure whether the right fix from the following three >>> options is, and would be happy to figure it out before fixing it: >>> >>>  1. Don't assume net_generic was initialized in the exit callback, which >>> is a bit problematic since we can't query that nicely anyway (a >>> sub-option here would be adding an API to query whether the net_generic >>> structure is initialized. >>> >>>  2. Remove the BUG(), switch it to a WARN() and let each subsystem >>> handle the case of NULL on it's own. While it sounds a bit wrong, it's >>> worth mentioning that that BUG() was initially added in an attempt to >>> fix an issue in CAIF, which was fixed in a completely different way >>> afterwards, so it's not strictly necessary here. >>> >>>  3. Only call the exit callback for subsystems we have called the init >>> callback for. >> >> Your option 3 only calling the exit callbacks for subsystems we have >> initialized should be what is implemented.  Certainly it looks like we >> are attempting to only call the exit callbacks for code whose init >> callback has succeeded. >> >> What problem are you seeing? >> >> This smells suspiciously like a problem we had a little while ago caif >> was registering as a pernet device instead of a pernet subsystem, >> and because of that we had packets flying around after it had been >> unregistered and was trying access it's net_generic data. > > It looks different from the caif problem a bit, here is a sample stacktrace: > > [ 163.733755] ------------[ cut here ]------------ > [ 163.734501] kernel BUG at include/net/netns/generic.h:45! > [ 163.734501] invalid opcode: 0000 [#1] PREEMPT SMP > [ 163.734501] CPU 2 > [ 163.734501] Pid: 19145, comm: trinity Tainted: G W > 3.4.0-rc1-next-20120405-sasha-dirty #57 > [ 163.734501] RIP: 0010:[] [] > phonet_pernet+0x182/0x1a0 > [ 163.734501] RSP: 0018:ffff8800674d5ca8 EFLAGS: 00010246 > [ 163.734501] RAX: 000000003fffffff RBX: 0000000000000000 RCX: ffff8800678c88d8 > [ 163.734501] RDX: 00000000003f4000 RSI: ffff8800678c8910 RDI: 0000000000000282 > [ 163.734501] RBP: ffff8800674d5cc8 R08: 0000000000000000 R09: 0000000000000000 > [ 163.734501] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880068bec920 > [ 163.734501] R13: ffffffff836b90c0 R14: 0000000000000000 R15: 0000000000000000 > [ 163.734501] FS: 00007f055e8de700(0000) GS:ffff88007d000000(0000) > knlGS:0000000000000000 > [ 163.734501] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 163.734501] CR2: 00007f055e6bb518 CR3: 0000000070c16000 CR4: 00000000000406e0 > [ 163.734501] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 163.734501] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 163.734501] Process trinity (pid: 19145, threadinfo > ffff8800674d4000, task ffff8800678c8000) > [ 163.734501] Stack: > [ 163.734501] ffffffff824d5f00 ffffffff810e2ec1 ffff880067ae0000 > 00000000ffffffd4 > [ 163.734501] ffff8800674d5cf8 ffffffff824d667a ffff880067ae0000 > 00000000ffffffd4 > [ 163.734501] ffffffff836b90c0 0000000000000000 ffff8800674d5d18 > ffffffff824d707d > [ 163.734501] Call Trace: > [ 163.734501] [] ? phonet_pernet+0x20/0x1a0 > [ 163.734501] [] ? get_parent_ip+0x11/0x50 > [ 163.734501] [] phonet_device_destroy+0x1a/0x100 > [ 163.734501] [] phonet_device_notify+0x3d/0x50 > [ 163.734501] [] notifier_call_chain+0xee/0x130 > [ 163.734501] [] raw_notifier_call_chain+0x11/0x20 > [ 163.734501] [] call_netdevice_notifiers+0x52/0x60 > [ 163.734501] [] rollback_registered_many+0x185/0x270 > [ 163.734501] [] unregister_netdevice_many+0x14/0x60 > [ 163.734501] [] ipip_exit_net+0x1b3/0x1d0 > [ 163.734501] [] ? ipip_rcv+0x420/0x420 > [ 163.734501] [] ops_exit_list+0x35/0x70 > [ 163.734501] [] setup_net+0xab/0xe0 > [ 163.734501] [] copy_net_ns+0x76/0x100 > [ 163.734501] [] create_new_namespaces+0xfb/0x190 > [ 163.734501] [] unshare_nsproxy_namespaces+0x61/0x80 > [ 163.734501] [] sys_unshare+0xff/0x290 > [ 163.734501] [] ? trace_hardirqs_on_thunk+0x3a/0x3f > [ 163.734501] [] system_call_fastpath+0x16/0x1b > [ 163.734501] Code: e0 c3 fe 66 0f 1f 44 00 00 48 c7 c2 40 60 4d 82 > be 01 00 00 00 48 c7 c7 80 d1 23 83 e8 48 2a c4 fe e8 73 06 c8 fe 48 > 85 db 75 0e <0f> 0b 0f 1f 40 00 eb fe 66 0f 1f 44 00 00 48 83 c4 10 48 > 89 d8 > [ 163.734501] RIP [] phonet_pernet+0x182/0x1a0 > [ 163.734501] RSP > [ 163.861289] ---[ end trace fb5615826c548066 ]--- > > It would appear that there's at least a one-off problem with the > reverse iteration. I don't see anything pointing to reverse iteration being the problem, or even the call graph order. What I see is phonet registering a netdevice notifier. Then I see phonet register pernet_device operations. Then I see phonet responding to all network devices no matter what the network namespace is, and looking in the phonet net_generic data for them. Looking phonet does not implement any network devices phonet just has per network device state. Which means phonet should be using register_pernet_subsys and this roughly is the same issue we saw with caif. Confusion of when you should use register_pernet_subsys and register_pernet_device. Issues that I see. - There is a bunch of weird and stuff going on in phonet_net_exit to handle the module unregister case where we don't synthesize network device removal events. ( A generic bug we can fix). - phonet should be using register_pernet_subsys instead of register_pernet_device. Patches to follow in a minute. Is there any chance you can reproduce this so you can verify the problems don't continue to reproduce? Eric