From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Christopher S. Aker"
Subject: Re: netback Oops then xenwatch stuck in D state
Date: Wed, 13 Feb 2013 15:12:44 -0500
Message-ID: <511BF3BC.2020800@theshore.net>
References: <510C3AA3.2090508@theshore.net> <50E3A390-C52B-476A-8B20-BADBA42F3775@theshore.net> <51181924.4050500@theshore.net> <1360583103.16636.29.camel@zion.uk.xensource.com> <1360663133.20449.123.camel@zakaz.uk.xensource.com> <511AFFC9.3050404@theshore.net> <1360759637.16636.85.camel@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1360759637.16636.85.camel@zion.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu
Cc: Ian Campbell, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 2/13/13 7:47 AM, Wei Liu wrote:
> On Wed, 2013-02-13 at 02:51 +0000, Christopher S. Aker wrote:
>> On 2/12/13 4:58 AM, Ian Campbell wrote:
>>> Have you applied the XSA-39 fixes to this kernel?
>>
>> Yes! When I rebuilt with Wei's suggested patch for my original
>> netback/xenwatch problem I also brought us up to date with XSA patches.
> Good to have more context.

We have found a way to reproduce a very similar BUG by keeping a guest's
network I/O busy and then running "ifconfig down" on its vif from the
host. It results in the dump below. This only works if the guest is
doing network I/O.

We can reproduce this regardless of whether dom0 is patched with XSA-39,
and it is triggerable at least as far back as a 3.2.6 dom0.

Procedure: Launch a guest, run iperf [in TCP mode] between it and another
box on the network, then bring down its vif on the host.

root@dom0:~# ifconfig vif14.0 down     <-- insta-boom
br0: port 3(vif14.0) entered disabled state
unable to handle kernel NULL pointer dereference at 00000000000008b8
IP: [] xen_spin_lock_flags+0x3a/0x80
PGD 0
Oops: 0002 [#1] SMP
Modules linked in: ebt_comment ebt_arp ebt_set ebt_limit ebt_ip6 ebt_ip ip_set_hash_net ip_set xt_physdev iptable_filter ip_tables ebtable_nat xen_gntdev bonding ebtable_filter igb
CPU 1
Pid: 0, comm: swapper/1 Not tainted 3.7.6-1-x86_64 #1 Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F
RIP: e030:[] [] xen_spin_lock_flags+0x3a/0x80
RSP: e02b:ffff880141843d60 EFLAGS: 00010006
RAX: 0000000000000400 RBX: 00000000000008b8 RCX: 0000000000012739
RDX: 0000000000000001 RSI: 0000000000000222 RDI: 00000000000008b8
RBP: ffff880141843d80 R08: 0000000000000144 R09: 0000000000000003
R10: 0000000000000003 R11: dead000000200200 R12: 0000000000000001
R13: 0000000000000200 R14: 0000000000000400 R15: ffff8800216ba700
FS: 00007f03fa88a700(0000) GS:ffff880141840000(0000) knlGS:0000000000000000
CS: e033 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000000008b8 CR3: 0000000001c0b000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/1 (pid: 0, threadinfo ffff880101138000, task ffff8801011049c0)
Stack:
 0000000000000222 00000000000008b8 ffff8800216ba700 ffff8800216ba7d8
 ffff880141843da0 ffffffff817605da 0000000000000000 00000000000008b8
 ffff880141843de0 ffffffff8154446f ffff88014184e5b8 ffff88014184e578
Call Trace:
 [] _raw_spin_lock_irqsave+0x2a/0x40
 [] xen_netbk_schedule_xenvif+0x8f/0x100
 [] ? xen_netbk_check_rx_xenvif+0x60/0x60
 [] xen_netbk_check_rx_xenvif+0x25/0x60
 [] tx_credit_callback+0x49/0x50
 [] call_timer_fn+0x44/0x120
 [] run_timer_softirq+0x241/0x2b0
 [] ? xen_netbk_check_rx_xenvif+0x60/0x60
 [] __do_softirq+0xcf/0x250
 [] ? handle_percpu_irq+0x43/0x60
 [] call_softirq+0x1c/0x30
 [] do_softirq+0x65/0xa0
 [] irq_exit+0xbd/0xe0
 [] xen_evtchn_do_upcall+0x2f/0x40
 [] xen_do_hypervisor_callback+0x1e/0x30
 [] ? xen_hypercall_sched_op+0xa/0x20
 [] ? xen_hypercall_sched_op+0xa/0x20
 [] ? xen_safe_halt+0x10/0x20
 [] ? default_idle+0x58/0x1b0
 [] ? cpu_idle+0x88/0xd0
 [] ? cpu_bringup_and_idle+0xe/0x10
Code: 24 08 4c 89 6c 24 10 4c 89 74 24 18 49 89 f5 48 89 fb 41 81 e5 00 02 00 00 41 bc 01 00 00 00 41 be 00 04 00 00 44 89 f0 44 89 e2 <86> 13 84 d2 74 0b f3 90 80 3b 00 74 f3 ff c8 75 f5 84 d2 75 15
RIP [] xen_spin_lock_flags+0x3a/0x80
 RSP
CR2: 00000000000008b8
---[ end trace 337eb85a44e2f0ef ]---
Kernel panic - not syncing: Fatal exception in interrupt
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.

-Chris
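
For anyone reading the dump above, here is a rough sketch of how the "Oops: 0002" value and the CR2 address can be decoded. The bit layout is the standard x86 page-fault error code. The helper names (decode_pf_error, looks_like_null_offset) and the 4096-byte first-page threshold are illustrative assumptions, not anything taken from the kernel source.

```python
# Decode the x86 page-fault error code that the kernel prints (in hex)
# as "Oops: 0002". Helper names here are hypothetical, for illustration.

def decode_pf_error(code: int) -> dict:
    """Split a page-fault error code into its documented low bits."""
    return {
        "present": bool(code & 0x01),            # 0 = page not present, 1 = protection fault
        "write": bool(code & 0x02),              # 0 = read access, 1 = write access
        "user": bool(code & 0x04),               # 0 = kernel mode, 1 = user mode
        "reserved_bit": bool(code & 0x08),       # reserved bit set in a paging entry
        "instruction_fetch": bool(code & 0x10),  # fault on an instruction fetch
    }

def looks_like_null_offset(addr: int, page_size: int = 4096) -> bool:
    """True when the address lies in the first page, which usually means
    a NULL base pointer plus a small field offset (assumed threshold)."""
    return 0 <= addr < page_size

oops_code = int("0002", 16)   # from the "Oops: 0002 [#1] SMP" line
cr2 = 0x8b8                   # faulting address, from the CR2 line

info = decode_pf_error(oops_code)
print(info)
print(hex(cr2), looks_like_null_offset(cr2))
```

With the values from this dump, that reads as a kernel-mode write to a non-present page at 0x8b8. RDI and RBX hold the same value, which is consistent with the spinlock address having been computed from a NULL structure pointer plus a 0x8b8 field offset. That is only a reading of the registers; which structure it is has not been confirmed here.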