From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kevin Krsulich
Subject: x86_64 domU panics on net_rx_action
Date: Sat, 28 Mar 2009 14:55:00 -0400
Message-ID: <49CE7284.9070109@mit.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Hi,

I've been seeing frequent PVM domU crashes under xen-3.3.0 on x86_64
when the domUs are seeing a lot of network traffic. Interestingly, the
crashes do not necessarily occur in the domU that is seeing the traffic.
The domUs and dom0 are 2.6.18-xen-r12 on Gentoo.

This seems similar to the issue mentioned here
http://lists.xensource.com/archives/html/xen-devel/2005-10/msg01425.html
and followed up here
http://lists.xensource.com/archives/html/xen-devel/2005-11/msg00056.html

I've opened a bug
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1433
with a handful of dumps and the output of 'xm log' during a few crash
events.
I'll tag on an example here:

Unable to handle kernel paging request at 00000000014f8258 RIP:
 [] net_rx_action+0x9f/0x1a6
PGD 4681067 PUD 4685067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.18-xen-r12 #1
RIP: e030:[] [] net_rx_action+0x9f/0x1a6
RSP: e02b:ffffffff8055bea0 EFLAGS: 00010206
RAX: 0000000000000000 RBX: 00000000014f8100 RCX: ffff8800014279e0
RDX: ffffffffff5fd000 RSI: 0000000000000000 RDI: ffffffff80477740
RBP: ffff8800014279e0 R08: 0000000000000000 R09: 00000000000008f8
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800014279c0
R13: 000000010002c87f R14: 0000000000000000 R15: ffffffffff5fd000
FS: 00002ade752923e0(0000) GS:ffffffff80513000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff80526000, task ffffffff8046ffe0)
Stack: ffff8800014279c0 ffffffff803a98f7 0000012b00000000 0000000000000001
 ffffffff805131b0 000000000000000a 0000000000000000 ffffffff8022eda4
 000000000000010b 0000000000000001 ffffffff8055bf18 000000000000000c
Call Trace:
 [] kfree_skbmem+0x9/0x75
 [] __do_softirq+0x83/0x117
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x6a/0xed
 [] do_IRQ+0xec/0xf5
 [] evtchn_do_upcall+0x13c/0x1fb
 [] do_hypervisor_callback+0x1e/0x2c
 [] hypercall_page+0x3aa/0x1000
 [] hypercall_page+0x3aa/0x1000
 [] raw_safe_halt+0x96/0xaa
 [] xen_idle+0x6d/0x7f
 [] cpu_idle+0xab/0xce
 [] start_kernel+0x24b/0x250
 [] _sinittext+0x1e5/0x1eb

Code: 83 bb 58 01 00 00 00 7e 12 48 8d 74 24 14 48 89 df ff 93 50
RIP [] net_rx_action+0x9f/0x1a6 RSP
CR2: 00000000014f8258
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!

If it's helpful, the network device being used is a Broadcom NetXtreme
BCM5704 using the tg3 driver compiled into the dom0 kernel.

Thanks and let me know if there's any more information I can provide or
somewhere else I should be directing this issue.

Kevin
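
For anyone reading the dump above, here is a minimal sketch of one sanity check that can be done on the numbers alone. It assumes (this is not stated in the report) that the Code: bytes begin at the faulting RIP, and it decodes only the first seven bytes as an x86-64 "cmp r/m32, imm8" with a 32-bit displacement. The helper name is hypothetical. If the assumption holds, the displacement added to RBX should reproduce the CR2 fault address, which would suggest the pointer held in RBX is the bad value rather than the offset.

```python
# Values copied from the oops text; nothing here is measured or new data.
RBX = 0x00000000014F8100
CR2 = 0x00000000014F8258
# Assumption: these bytes start at the faulting instruction.
CODE = bytes.fromhex("83 bb 58 01 00 00 00")


def decode_cmp_imm8_disp32(code):
    """Decode only the form 83 /7 ib with mod=10 (disp32) and no SIB byte.

    Hypothetical helper for illustration; it is not a general disassembler.
    Returns (base register index, displacement, immediate).
    """
    opcode, modrm = code[0], code[1]
    mod = modrm >> 6          # addressing mode, 2 means [reg + disp32]
    reg = (modrm >> 3) & 7    # opcode extension, 7 selects CMP for 0x83
    rm = modrm & 7            # base register, 3 is RBX without a REX.B prefix
    if opcode != 0x83 or reg != 7 or mod != 2 or rm == 4:
        raise ValueError("not the cmp r/m32, imm8 form handled by this sketch")
    disp = int.from_bytes(code[2:6], "little", signed=True)
    imm = int.from_bytes(code[6:7], "little", signed=True)
    return rm, disp, imm


rm, disp, imm = decode_cmp_imm8_disp32(CODE)
fault = RBX + disp
print(f"base register index {rm}, displacement {disp:#x}, immediate {imm}")
print(f"RBX + displacement = {fault:#x}, CR2 = {CR2:#x}, match = {fault == CR2}")
```

The check only relates the registers to the fault address. It says nothing about why RBX would hold such a value, so the full dumps in the bug report remain the place to look.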