From mboxrd@z Thu Jan 1 00:00:00 1970
From: Flavio Leitner
Subject: Re: [PATCH] igmp: fix ip_mc_sf_allow race
Date: Mon, 4 Jan 2010 09:29:58 -0200
Message-ID: <20100104112957.GA2573@sysclose.org>
References: <1262183005-28406-1-git-send-email-fleitner@redhat.com> <20100103.215441.43026709.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netdev@vger.kernel.org
To: David Miller
Return-path:
Received: from hapkido.dreamhost.com ([66.33.216.122]:57726 "EHLO hapkido.dreamhost.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750738Ab0ADLam (ORCPT ); Mon, 4 Jan 2010 06:30:42 -0500
Received: from homiemail-a17.g.dreamhost.com (caiajhbdcagg.dreamhost.com [208.97.132.66]) by hapkido.dreamhost.com (Postfix) with ESMTP id 0E81717DA03 for ; Mon, 4 Jan 2010 03:30:40 -0800 (PST)
Content-Disposition: inline
In-Reply-To: <20100103.215441.43026709.davem@davemloft.net>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Sun, Jan 03, 2010 at 09:54:41PM -0800, David Miller wrote:
> From: Flavio Leitner
> Date: Wed, 30 Dec 2009 12:23:25 -0200
>
> > Almost all igmp functions accessing inet->mc_list are protected by
> > rtnl_lock(), but there is one exception which is ip_mc_sf_allow(),
> > so there is a chance of either ip_mc_drop_socket or ip_mc_leave_group
> > remove an entry while ip_mc_sf_allow is running causing a crash.
> >
> > Signed-off-by: Flavio Leitner
>
> Have you triggered this in practice or is this due purely
> to code inspection?

I had to modify the code to reproduce it, introducing a delay in
ip_mc_sf_allow(), but a customer is able to reproduce it when
avahi-daemon runs at boot time.

CPU: Intel(R) Xeon(R) CPU X5570 @ 2.93GHz stepping 05
BUG: unable to handle kernel paging request at virtual address 005e0005
 printing eip:
c05f2194
*pde = 00000000
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:7f/0000:7f:06.0/irq
Modules linked in: nfs lockd fscache nfs_acl autofs4 hidp l2cap bluetooth sunrpc 8021q ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi ac parport_pc lp parport sg e1000(U) pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod hfcldd(FU) sd_mod scsi_mod hfcldd_conf(U) hraslog_link(U) ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0060:[]    Tainted: GF     VLI
EFLAGS: 00210202   (2.6.18-128.el5PAE #1)
EIP is at ip_mc_sf_allow+0x20/0x79
eax: 005e0001   ebx: f6cb9100   ecx: 00000008   edx: fb0000e0
esi: 5acb10ac   edi: f63414e9   ebp: f7ae3200   esp: c0732ea4
ds: 007b   es: 007b   ss: 0068
Process xlinpack_xeon32 (pid: 5194, ti=c0732000 task=cfd2daa0 task.ti=f5d30000)
Stack: f6cb9108 f6cb9100 c05ea1eb 00000008 d03e9034 5acb10ac fb0000e0 e9140000
       00000008 e91414e9 f7ae3200 c06ab4a8 00000000 00000000 c05ce1d5 f7ae3200
       00000000 f7ae3200 d03e9020 c05ce042 f7ae3200 c07d6988 c06ab560 00000008
Call Trace:
 [] udp_rcv+0x1f4/0x514
 [] ip_local_deliver+0x159/0x204
 [] ip_rcv+0x46f/0x4a9
 [] netif_receive_skb+0x30c/0x330
 [] e1000_clean_rx_irq+0xf0/0x3e0 [e1000]
 [] e1000_clean_rx_irq+0x0/0x3e0 [e1000]
 [] e1000_clean+0xf4/0x340 [e1000]
 [] do_IRQ+0xb5/0xc3
 [] net_rx_action+0x92/0x175
 [] __do_softirq+0x87/0x114
 [] do_softirq+0x52/0x9c
 [] apic_timer_interrupt+0x1f/0x24
 =======================
Code: 81 c4 8c 00 00 00 5b 5e 5f 5d c3 56 89 ce 53 89 c3 8b 4c 24 0c 89 d0 25 f0 00 00 00 3d e0 00 00 00 75 59 8b 83 84 01 00 00 eb 0c <39> 50 04 75 05 39 48 0c 74 08 8b 00 85 c0 75 f0 eb 3f 8b 50 14
EIP: [] ip_mc_sf_allow+0x20/0x79 SS:ESP 0068:c0732ea4
 <0>Kernel panic - not syncing: Fatal exception in interrupt

--- snip ---

> That new synchronize_rcu() is very expensive and will decrease
> the rate at which groups can be joined and left, _especially_
> on high cpu count machines.

Well, I tried read-write locking, but packet reception then became
slower while another task was joining and leaving multicast groups.
Then I tried call_rcu() to avoid the cost you describe, but when the
reproducer is stopped, sk_free() warns with "optmem leakage.." because
the RCU callback has not run yet.

> I do not think it is therefore a suitable problem to this
> race, if it does in fact exist.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Flavio
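
For illustration only, the call_rcu() trade-off described above can be
modelled in plain userspace C. This is a single-threaded sketch, not
kernel code and not the patch under discussion: it only shows the
ordering problem, and does not reproduce the race or real RCU. It
assumes the warning comes from option-memory accounting that is released
only when the entry is actually freed. All names in it (mc_entry,
sock_model, mc_join, mc_leave_deferred, run_deferred, sock_would_warn,
demo) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for one entry on inet->mc_list. */
struct mc_entry {
    struct mc_entry *next;        /* next entry on the socket's list */
    unsigned int group;           /* multicast group address */
    int ifindex;                  /* interface index */
    struct mc_entry *defer_next;  /* link on the deferred-free queue */
};

/* Hypothetical model of the per-socket state involved. */
struct sock_model {
    struct mc_entry *mc_list;     /* models inet->mc_list */
    long omem_alloc;              /* models option-memory accounting */
    struct mc_entry *deferred;    /* unlinked, but not yet freed */
};

/* Join: allocate an entry, charge it to the socket, push it on the list. */
static int mc_join(struct sock_model *sk, unsigned int group, int ifindex)
{
    struct mc_entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->group = group;
    e->ifindex = ifindex;
    e->defer_next = NULL;
    e->next = sk->mc_list;
    sk->mc_list = e;
    sk->omem_alloc += (long)sizeof(*e);
    return 0;
}

/* Leave: unlink now, free later (the call_rcu()-like variant). */
static int mc_leave_deferred(struct sock_model *sk, unsigned int group,
                             int ifindex)
{
    struct mc_entry **pp;

    for (pp = &sk->mc_list; *pp; pp = &(*pp)->next) {
        struct mc_entry *e = *pp;

        if (e->group == group && e->ifindex == ifindex) {
            *pp = e->next;                /* e->next stays intact for readers */
            e->defer_next = sk->deferred;
            sk->deferred = e;
            return 0;
        }
    }
    return -1;
}

/* Models the end of a grace period: free and uncharge queued entries. */
static int run_deferred(struct sock_model *sk)
{
    int freed = 0;

    while (sk->deferred) {
        struct mc_entry *e = sk->deferred;

        sk->deferred = e->defer_next;
        sk->omem_alloc -= (long)sizeof(*e);
        assert(sk->omem_alloc >= 0);
        free(e);
        freed++;
    }
    return freed;
}

/* Models the teardown check: non-zero accounting would trigger a warning. */
static int sock_would_warn(const struct sock_model *sk)
{
    return sk->omem_alloc != 0;
}

/* A reader stands on an entry while it is removed. Returns 1 on success. */
static int demo(void)
{
    struct sock_model sk = { NULL, 0, NULL };
    struct mc_entry *reader;
    int ok;

    if (mc_join(&sk, 0xe0000001u, 2) || mc_join(&sk, 0xe00000fbu, 2))
        return 0;
    reader = sk.mc_list;                      /* reader is on the head entry */
    mc_leave_deferred(&sk, 0xe00000fbu, 2);   /* writer removes that entry */
    ok = reader->group == 0xe00000fbu         /* entry memory is still valid */
         && reader->next == sk.mc_list;       /* the walk can continue */
    mc_leave_deferred(&sk, 0xe0000001u, 2);
    ok = ok && sock_would_warn(&sk);          /* list empty, still charged */
    ok = ok && run_deferred(&sk) == 2;        /* "grace period" ends */
    return ok && !sock_would_warn(&sk);       /* accounting is back to zero */
}
```

In this model the reader never touches freed memory, which is the point
of deferring the free, but a teardown check that runs before
run_deferred() still sees a non-zero charge. That matches the symptom
reported above when the reproducer is stopped.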