From mboxrd@z Thu Jan 1 00:00:00 1970
From: Phil Oester
Subject: Deadlocks
Date: Wed, 9 Jun 2004 11:09:09 -0700
Sender: netfilter-devel-admin@lists.netfilter.org
Message-ID: <20040609180909.GA11445@linuxace.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
To: netfilter-devel@lists.netfilter.org
Content-Disposition: inline
Errors-To: netfilter-devel-admin@lists.netfilter.org
List-Id: netfilter-devel.vger.kernel.org

For the past 3 months I've been experiencing deadlocks on some heavily
used gateway/firewall boxes which started after upgrading from 2.4.20.
I can confirm that moving back to 2.4.20 stops the hangs, moving to 2.4.21
(or any kernel after that) makes them return.

I am in the process of testing out each individual 2.4.21-pre to find out
where exactly the problem is. In the interim, I've collected some SysRq
output which may help in the analysis. Below are two separate lockups on a
2.6.6 kernel.

Anyone have any bright ideas?

Phil Oester

Lockup #1:

Pid: 0, comm: swapper
EIP: 0060:[] CPU: 1
EIP is at __write_lock_failed+0xf/0x20
 EFLAGS: 00000287    Not tainted  (2.6.6)
EAX: c0283360 EBX: ffffffff ECX: 7d9d14aa EDX: ee83c1e0
ESI: f454b910 EDI: ffffffff EBP: 0000007d DS: 007b ES: 007b
CR0: 8005003b CR2: 08076ac4 CR3: 37b34000 CR4: 00000690
Call Trace:
 [] .text.lock.ip_conntrack_core+0x7d/0xd5
 [] do_bindings+0x8d/0x260
 [] try_rfc959+0x25/0x30
 [] help+0x2f7/0x430
 [] try_rfc959+0x0/0x30
 [] tcp_packet+0xd1/0x160
 [] ip_conntrack_in+0x100/0x220
 [] nf_iterate+0x72/0xb0
 [] ip_rcv_finish+0x0/0x245
 [] nf_hook_slow+0x78/0x110
 [] ip_rcv_finish+0x0/0x245
 [] ip_rcv+0x3c1/0x480
 [] ip_rcv_finish+0x0/0x245
 [] alloc_skb+0x32/0xd0
 [] netif_receive_skb+0x162/0x190
 [] e1000_clean_rx_irq+0x399/0x410
 [] e1000_clean+0x34/0xb0
 [] net_rx_action+0x7f/0x110
 [] __do_softirq+0xb4/0xc0
 [] do_softirq+0x4c/0x60
 =======================
 [] do_IRQ+0x145/0x180
 [] common_interrupt+0x18/0x20
 [] default_idle+0x0/0x40
 [] default_idle+0x2c/0x40
 [] cpu_idle+0x3b/0x50
 [] __call_console_drivers+0x57/0x60
 [] call_console_drivers+0x7f/0x100

Lockup #2:

Pid: 0, comm: swapper
EIP: 0060:[] CPU: 0
EIP is at .text.lock.ip_nat_ftp+0x19/0x29
 EFLAGS: 00000286    Not tainted  (2.6.6)
EAX: 00000001 EBX: c0306000 ECX: d31c3034 EDX: eaeb8ac0
ESI: 00000019 EDI: eaeb8a48 EBP: c0306d24 DS: 007b ES: 007b
CR0: 8005003b CR2: 4024f0ec CR3: 31515000 CR4: 00000690
Call Trace:
 [] tcp_exp_matches_pkt+0x32/0x79
 [] do_bindings+0x34f/0x570
 [] ip_nat_fn+0x77/0x310
 [] nf_iterate+0x6e/0xc0
 [] ip_finish_output2+0x0/0x1cb
 [] nf_hook_slow+0x86/0x150
 [] ip_finish_output2+0x0/0x1cb
 [] ip_finish_output+0x43/0x50
 [] ip_finish_output2+0x0/0x1cb
 [] ip_forward_finish+0x2c/0x50
 [] nf_hook_slow+0xda/0x150
 [] ip_forward_finish+0x0/0x50
 [] ip_forward+0x137/0x1d0
 [] ip_forward_finish+0x0/0x50
 [] ip_rcv_finish+0x1e8/0x25d
 [] nf_iterate+0x6e/0xc0
 [] ip_rcv_finish+0x0/0x25d
 [] nf_hook_slow+0xda/0x150
 [] ip_rcv_finish+0x0/0x25d
 [] ip_rcv+0x18d/0x240
 [] ip_rcv_finish+0x0/0x25d
 [] netif_receive_skb+0x174/0x1a0
 [] e1000_clean_rx_irq+0x3d8/0x490
 [] e1000_clean+0x3c/0xb0
 [] net_rx_action+0x90/0x130
 [] __do_softirq+0xb4/0xc0
 [] do_softirq+0x4f/0x60
 =======================
 [] do_IRQ+0x1a9/0x260
 [] smp_apic_timer_interrupt+0xcc/0x130
 [] common_interrupt+0x18/0x20
 [] default_idle+0x0/0x40
 [] default_idle+0x2f/0x40
 [] cpu_idle+0x3b/0x50
 [] unknown_bootoption+0x0/0x120
 [] start_kernel+0x173/0x1c0
 [] unknown_bootoption+0x0/0x120