From mboxrd@z Thu Jan 1 00:00:00 1970
From: Patrick McHardy
Subject: Re: Deadlocks
Date: Sun, 13 Jun 2004 21:58:29 +0200
Sender: netfilter-devel-admin@lists.netfilter.org
Message-ID: <1087156709.11287.8.camel@ws>
References: <20040609180909.GA11445@linuxace.com>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Cc: netfilter-devel@lists.netfilter.org
To: Phil Oester
In-Reply-To: <20040609180909.GA11445@linuxace.com>
Errors-To: netfilter-devel-admin@lists.netfilter.org
List-Id: netfilter-devel.vger.kernel.org

On Wed, 2004-06-09 at 20:09, Phil Oester wrote:
> For the past 3 months I've been experiencing deadlocks on some heavily
> used gateway/firewall boxes which started after upgrading from 2.4.20.
>
> I can confirm that moving back to 2.4.20 stops the hangs, moving to 2.4.21
> (or any kernel after that) makes them return. I am in the process of testing
> out each individual 2.4.21-pre to find out where exactly the problem is.
>
> In the interim, I've collected some SysRq output which may help in the
> analysis. Below are two separate lockups on a 2.6.6 kernel. Anyone have
> any bright ideas?

This looks like the problem I described a couple of months ago:

http://lists.netfilter.org/pipermail/netfilter-devel/2003-November/013130.html

I went through the 2.4.21 patch, but couldn't find anything that looks
related to this. The patch attached to the email above should apply to
something around 2.4.23. Please also enable CONFIG_NETFILTER_DEBUG, so
we can see where exactly the problem occurs.

Regards
Patrick

>
> Phil Oester
>
>
> Lockup #1:
> Pid: 0, comm: swapper
> EIP: 0060:[] CPU: 1
> EIP is at __write_lock_failed+0xf/0x20
> EFLAGS: 00000287 Not tainted (2.6.6)
> EAX: c0283360 EBX: ffffffff ECX: 7d9d14aa EDX: ee83c1e0
> ESI: f454b910 EDI: ffffffff EBP: 0000007d DS: 007b ES: 007b
> CR0: 8005003b CR2: 08076ac4 CR3: 37b34000 CR4: 00000690
> Call Trace:
> [] .text.lock.ip_conntrack_core+0x7d/0xd5
> [] do_bindings+0x8d/0x260
> [] try_rfc959+0x25/0x30
> [] help+0x2f7/0x430
> [] try_rfc959+0x0/0x30
> [] tcp_packet+0xd1/0x160
> [] ip_conntrack_in+0x100/0x220
> [] nf_iterate+0x72/0xb0
> [] ip_rcv_finish+0x0/0x245
> [] nf_hook_slow+0x78/0x110
> [] ip_rcv_finish+0x0/0x245
> [] ip_rcv+0x3c1/0x480
> [] ip_rcv_finish+0x0/0x245
> [] alloc_skb+0x32/0xd0
> [] netif_receive_skb+0x162/0x190
> [] e1000_clean_rx_irq+0x399/0x410
> [] e1000_clean+0x34/0xb0
> [] net_rx_action+0x7f/0x110
> [] __do_softirq+0xb4/0xc0
> [] do_softirq+0x4c/0x60
> =======================
> [] do_IRQ+0x145/0x180
> [] common_interrupt+0x18/0x20
> [] default_idle+0x0/0x40
> [] default_idle+0x2c/0x40
> [] cpu_idle+0x3b/0x50
> [] __call_console_drivers+0x57/0x60
> [] call_console_drivers+0x7f/0x100
>
>
> Lockup #2:
> Pid: 0, comm: swapper
> EIP: 0060:[] CPU: 0
> EIP is at .text.lock.ip_nat_ftp+0x19/0x29
> EFLAGS: 00000286 Not tainted (2.6.6)
> EAX: 00000001 EBX: c0306000 ECX: d31c3034 EDX: eaeb8ac0
> ESI: 00000019 EDI: eaeb8a48 EBP: c0306d24 DS: 007b ES: 007b
> CR0: 8005003b CR2: 4024f0ec CR3: 31515000 CR4: 00000690
> Call Trace:
> [] tcp_exp_matches_pkt+0x32/0x79
> [] do_bindings+0x34f/0x570
> [] ip_nat_fn+0x77/0x310
> [] nf_iterate+0x6e/0xc0
> [] ip_finish_output2+0x0/0x1cb
> [] nf_hook_slow+0x86/0x150
> [] ip_finish_output2+0x0/0x1cb
> [] ip_finish_output+0x43/0x50
> [] ip_finish_output2+0x0/0x1cb
> [] ip_forward_finish+0x2c/0x50
> [] nf_hook_slow+0xda/0x150
> [] ip_forward_finish+0x0/0x50
> [] ip_forward+0x137/0x1d0
> [] ip_forward_finish+0x0/0x50
> [] ip_rcv_finish+0x1e8/0x25d
> [] nf_iterate+0x6e/0xc0
> [] ip_rcv_finish+0x0/0x25d
> [] nf_hook_slow+0xda/0x150
> [] ip_rcv_finish+0x0/0x25d
> [] ip_rcv+0x18d/0x240
> [] ip_rcv_finish+0x0/0x25d
> [] netif_receive_skb+0x174/0x1a0
> [] e1000_clean_rx_irq+0x3d8/0x490
> [] e1000_clean+0x3c/0xb0
> [] net_rx_action+0x90/0x130
> [] __do_softirq+0xb4/0xc0
> [] do_softirq+0x4f/0x60
> =======================
> [] do_IRQ+0x1a9/0x260
> [] smp_apic_timer_interrupt+0xcc/0x130
> [] common_interrupt+0x18/0x20
> [] default_idle+0x0/0x40
> [] default_idle+0x2f/0x40
> [] cpu_idle+0x3b/0x50
> [] unknown_bootoption+0x0/0x120
> [] start_kernel+0x173/0x1c0
> [] unknown_bootoption+0x0/0x120
>
>