From mboxrd@z Thu Jan 1 00:00:00 1970
From: "=?ISO-8859-1?Q?Ilpo_J=E4rvinen?="
Subject: Re: [Bugme-new] [Bug 10767] New: Seg Fault Instead of Swapping
Date: Fri, 23 May 2008 21:01:28 +0300 (EEST)
Message-ID:
References: <20080521105747.215be17d.akpm@linux-foundation.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: Andrew Morton, Netdev, bugme-daemon@bugzilla.kernel.org
To: Brian Vowell
Return-path:
Received: from courier.cs.helsinki.fi ([128.214.9.1]:50207 "EHLO mail.cs.helsinki.fi" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755289AbYEWSBa (ORCPT ); Fri, 23 May 2008 14:01:30 -0400
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On Fri, 23 May 2008, Brian Vowell wrote:

> I applied the ipv4 patch. Here are two traces that just showed up:
>
> ------------[ cut here ]------------
> WARNING: at net/ipv4/tcp_input.c:3297 tcp_ack+0xd58/0xeba()
> Modules linked in: dm_mirror dm_multipath dm_mod cfi_cmdset_0002 cfi_util
> button jedec_probe cfi_probe gen_probe ck804xrom i2c_nforce2 mtd chipreg
> map_funcs k8temp sg hwmon pcspkr i2c_core serio_raw pata_amd cciss
> ata_generic pata_acpi sata_nv libata sd_mod scsi_mod raid456 async_xor
> async_memcpy async_tx xor raid1 uhci_hcd ohci_hcd ssb ehci_hcd [last
> unloaded: scsi_wait_scan]
> Pid: 0, comm: swapper Not tainted 2.6.25.4 #1
>
> Call Trace:
> [] warn_on_slowpath+0x51/0x63
> [] dev_queue_xmit+0x25b/0x284
> [] ip_queue_xmit+0x2cc/0x31f
> [] sk_stream_alloc_skb+0x2f/0xd2
> [] getnstimeofday+0x2f/0x83
> [] tcp_transmit_skb+0x775/0x7a4
> [] tcp_ack+0xd58/0xeba
> [] tcp_rcv_established+0x7b9/0x8ce
> [] tcp_v4_do_rcv+0x2c2/0x48b
> [] __qdisc_run+0xf6/0x1c8
> [] __inet_lookup_established+0xdf/0x17b
> [] tcp_v4_rcv+0x6a3/0x700
> [] ip_local_deliver+0xd4/0x18e
> [] ip_rcv+0x4fb/0x53a
> [] netif_receive_skb+0x351/0x372
> [] tg3_poll+0x588/0x7df
> [] net_rx_action+0xb6/0x1bf
> [] __do_softirq+0x65/0xce
> [] call_softirq+0x1c/0x28
> [] do_softirq+0x2c/0x68
> [] irq_exit+0x3f/0x83
> [] do_IRQ+0x13e/0x15f
> [] ret_from_intr+0x0/0xa
> [] default_idle+0x0/0x55
> [] lapic_next_event+0x0/0xa
> [] default_idle+0x0/0x55
> [] default_idle+0x31/0x55
> [] default_idle+0x2c/0x55
> [] default_idle+0x0/0x55
> [] cpu_idle+0x77/0x9a
>
> ---[ end trace aae73dd976dfcd54 ]---
> TCP wq(s) S <
> TCP wq(h) ++h-----+-----+---<
> l0 s1 f2 p18 seq: wq1288576867, su1288576867 hs1288579763 sn1288602883

Aha, it seems you got TCP into the invalid state which I recently added
a check for :-). I'll try to figure out how that could still happen (I
think I already tried to find such a code path earlier, but it seems
that there's still something I've overlooked).

Though this is not directly going to cause the 2539 WARNING, it could,
after some other (probably rare) condition, possibly lead to that as
well if this invariant is assumed to hold while doing some state
manipulation elsewhere in TCP (though I think that's not too likely).

In case you don't see them too often, you can well continue with the
patch in order to find the cause for 2539 as well (and just ignore
those occasional net/ipv4/tcp_input.c:3297 ones for now).

Thanks,

--
 i.