From: Philipp Hahn <hahn@univention.de>
To: Wei Liu <wei.liu2@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
Erik Damrose <Damrose@univention.de>,
Ian Campbell <ian.campbell@citrix.com>,
Zoltan Kiss <zoltan.kiss@citrix.com>
Subject: Re: RFH: Kernel OOPS in xen_netbk_rx_action / xenvif_gop_skb
Date: Wed, 18 Jun 2014 18:48:31 +0200
Message-ID: <53A1C2DF.10407@univention.de>
In-Reply-To: <53923CD0.7010001@univention.de>
[-- Attachment #1: Type: text/plain, Size: 5763 bytes --]
Hello,
We are now more or less able to reproduce the OOPS within one hour by
repeatedly shutting down the VM and rebooting it:
> [32918.795695] XXXlan0: port 3(vif18.0) entered disabled state
> [32918.798732] BUG: unable to handle kernel paging request at ffffc90010da2188
> [32918.798823] IP: [<ffffffffa04287dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
> [32918.798911] PGD 95822067 PUD 95823067 PMD 94f47067 PTE 0
> [32918.798974] Oops: 0000 [#1] SMP
> [32918.799023] Modules linked in: xt_physdev xen_blkback xen_netback ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables xen_gntdev nfsv3 nfsv4 rpcsec_gss_krb5 nfsd nfs_acl auth_rpcgss oid_registry nfs fscache dns_resolver lockd sunrpc fuse loop xen_blkfront xen_evtchn blktap quota_v2 quota_tree xenfs xen_privcmd coretemp crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw snd_pcm gf128mul snd_timer glue_helper snd aes_x86_64 soundcore snd_page_alloc microcode tpm_tis tpm tpm_bios pcspkr lpc_ich mfd_core acpi_power_meter i7core_edac mperf serio_raw i2c_i801 evdev edac_core processor ioatdma thermal_sys ext4 jbd2 crc16 bonding bridge stp llc dm_snapshot dm_mirror dm_region_hash dm_log dm_mod sd_mod crc_t10dif hid_generic usbhid hid mptsas mptscsih mptbase scsi_transport_sas ehci_pci button uhci_hcd ehci_hcd usbcore usb_common igb dca i2c_algo_bit i2c_core ptp pps_core
> [32918.799958] CPU: 0 PID: 6450 Comm: netback/0 Not tainted 3.10.0-ucs58-amd64 #1 Debian 3.10.11-1.58.201405060908
> [32918.800050] Hardware name: FUJITSU PRIMERGY BX920 S2/D3030, BIOS 080015 Rev.3D94.3030 10/09/2012
> [32918.800137] task: ffff880093864880 ti: ffff88009266c000 task.ti: ffff88009266c000
> [32918.800220] RIP: e030:[<ffffffffa04287dc>] [<ffffffffa04287dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
> [32918.800314] RSP: e02b:ffff88009266dce8 EFLAGS: 00010212
> [32918.800364] RAX: ffffc9001082dac0 RBX: ffff880004d86ac0 RCX: ffffc90010da2000
> [32918.800419] RDX: 0000000000000031 RSI: 0000000000000000 RDI: ffff880004bdd280
> [32918.800474] RBP: ffff8800932db800 R08: 0000000000000000 R09: ffff8800952f3800
> [32918.800529] R10: 0000000000007ff0 R11: ffff88009c611380 R12: ffff8800932db800
> [32918.800584] R13: ffff88009266dd58 R14: ffffc90010821000 R15: 0000000000000000
> [32918.800642] FS: 00007f2f8fdcd700(0000) GS:ffff88009c600000(0000) knlGS:0000000000000000
> [32918.800728] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [32918.800778] CR2: ffffc90010da2188 CR3: 0000000093eb0000 CR4: 0000000000002660
> [32918.800834] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [32918.800889] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [32918.800943] Stack:
> [32918.800981] ffff880093864c60 000000008106d2af ffff88009c613ec0 ffff88009c613ec0
> [32918.801077] 0000000093864880 ffffc90010828ac0 ffffc90010821020 000000009c613ec0
> [32918.801173] 0000000000000000 0000000000000001 ffffc90010828ac0 ffffc9001082dac0
> [32918.801269] Call Trace:
> [32918.801314] [<ffffffff813ca32d>] ? _raw_spin_lock_irqsave+0x11/0x2f
> [32918.801368] [<ffffffffa042a033>] ? xen_netbk_kthread+0x174/0x841 [xen_netback]
> [32918.801454] [<ffffffff8105d373>] ? wake_up_bit+0x20/0x20
> [32918.801504] [<ffffffffa0429ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback]
> [32918.801590] [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
> [32918.801645] [<ffffffffa0429ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback]
> [32918.801730] [<ffffffff8105ce1e>] ? kthread+0xab/0xb3
> [32918.801781] [<ffffffff81003638>] ? xen_end_context_switch+0xe/0x1c
> [32918.801834] [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
> [32918.801890] [<ffffffff813cfbfc>] ? ret_from_fork+0x7c/0xb0
> [32918.801941] [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
> [32918.801995] Code: 8b b3 d0 00 00 00 48 8b bb d8 00 00 00 0f b7 74 37 02 89 70 08 eb 07 c7 40 08 00 00 00 00 89 d2 c7 40 04 00 00 00 00 48 83 c2 08 <0f> b7 34 d1 89 30 c7 44 24 60 00 00 00 00 8b 44 d1 04 89 44 24
> [32918.802400] RIP [<ffffffffa04287dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
> [32918.802486] RSP <ffff88009266dce8>
> [32918.802529] CR2: ffffc90010da2188
> [32918.802859] ---[ end trace baf81e34c52eb41c ]---
(gdb) list *(xen_netbk_rx_action+0x18b)
0xffffffffa04287dc is in xen_netbk_rx_action
(/var/build/temp/tmp.hW3dNilayw/pbuilder/linux-3.10.11/drivers/net/xen-netback/netback.c:611).
606 meta->gso_size = skb_shinfo(skb)->gso_size;
607 else
608 meta->gso_size = 0;
609
610 meta->size = 0;
611 meta->id = req->id;
612 npo->copy_off = 0;
613 npo->copy_gref = req->gref;
614
615 data = skb->data;
After more debugging today I think something like this happens:
1. The VM is receiving packets through bonding + bridge + netback +
netfront.
2. For some unknown reason at least one packet remains in the rx queue
and is not delivered to the domU immediately by netback.
3. The VM finishes shutting down.
4. The shared ring between dom0 and domU is freed.
5. xen-netback then continues processing the pending requests and tries
to put the packet into the already released shared ring.
From reading the attached disassembly, I guess that
AX = &meta
CX = &rx->sring
DX =~ rx.req_cons
CR2 = &req->id
where
CX + DX * sizeof(union xen_netif_rx_{request,response}) = CR2, with sizeof(...) = 8
Any additional ideas or insights are appreciated.
FYI: The host has only a single CPU and is running >=2 VMs so far.
>> There's one more patch that you can pick up from 3.10.y tree. I doubt it
>> will make much difference though.
Which patch are you referring to?
Sincerely
Philipp
[-- Attachment #2: xen-netback.s --]
[-- Type: text/plain, Size: 5282 bytes --]
drivers/net/xen-netback/netback.c:582
* frontend-side LRO).
*/
static int netbk_gop_skb(struct sk_buff *skb,
struct netrx_pending_operations *npo)
{
struct xenvif *vif = netdev_priv(skb->dev);
721: 48 81 c5 00 08 00 00 add $0x800,%rbp
drivers/net/xen-netback/netback.c:594
int old_meta_prod;
old_meta_prod = npo->meta_prod;
/* Set up a GSO prefix descriptor, if necessary */
if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
728: 66 83 7c 02 02 00 cmpw $0x0,0x2(%rdx,%rax,1)
72e: 74 53 je 783 <xen_netbk_rx_action+0x132>
730: f6 45 60 04 testb $0x4,0x60(%rbp)
734: 74 4d je 783 <xen_netbk_rx_action+0x132>
drivers/net/xen-netback/netback.c:595
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
736: 8b 55 4c mov 0x4c(%rbp),%edx
739: 48 8b 75 58 mov 0x58(%rbp),%rsi
73d: 8b 7d 50 mov 0x50(%rbp),%edi
740: 8d 42 01 lea 0x1(%rdx),%eax
743: ff cf dec %edi
745: 89 45 4c mov %eax,0x4c(%rbp)
drivers/net/xen-netback/netback.c:596
meta = npo->meta + npo->meta_prod++;
748: 8b 4c 24 48 mov 0x48(%rsp),%ecx
drivers/net/xen-netback/netback.c:599
meta->gso_size = skb_shinfo(skb)->gso_size;
meta->size = 0;
meta->id = req->id;
74c: 21 fa and %edi,%edx
drivers/net/xen-netback/netback.c:596
old_meta_prod = npo->meta_prod;
/* Set up a GSO prefix descriptor, if necessary */
if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
74e: 89 c8 mov %ecx,%eax
750: ff c1 inc %ecx
752: 89 4c 24 48 mov %ecx,0x48(%rsp)
drivers/net/xen-netback/netback.c:597
meta->gso_size = skb_shinfo(skb)->gso_size;
756: 8b 8b d0 00 00 00 mov 0xd0(%rbx),%ecx
75c: 4c 8b 83 d8 00 00 00 mov 0xd8(%rbx),%r8
drivers/net/xen-netback/netback.c:596
old_meta_prod = npo->meta_prod;
/* Set up a GSO prefix descriptor, if necessary */
if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
763: 48 6b c0 0c imul $0xc,%rax,%rax
767: 48 03 44 24 58 add 0x58(%rsp),%rax
drivers/net/xen-netback/netback.c:597
meta->gso_size = skb_shinfo(skb)->gso_size;
76c: 41 0f b7 4c 08 02 movzwl 0x2(%r8,%rcx,1),%ecx
drivers/net/xen-netback/netback.c:598
meta->size = 0;
772: c7 40 04 00 00 00 00 movl $0x0,0x4(%rax)
drivers/net/xen-netback/netback.c:597
/* Set up a GSO prefix descriptor, if necessary */
if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
meta->gso_size = skb_shinfo(skb)->gso_size;
779: 89 48 08 mov %ecx,0x8(%rax)
drivers/net/xen-netback/netback.c:599
meta->size = 0;
meta->id = req->id;
77c: 0f b7 54 d6 40 movzwl 0x40(%rsi,%rdx,8),%edx
781: 89 10 mov %edx,(%rax)
drivers/net/xen-netback/netback.c:602
}
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
783: 8b 55 50 mov 0x50(%rbp),%edx
786: 8b 45 4c mov 0x4c(%rbp),%eax
789: 48 8b 4d 58 mov 0x58(%rbp),%rcx
78d: ff ca dec %edx
78f: 21 c2 and %eax,%edx
791: ff c0 inc %eax
793: 89 45 4c mov %eax,0x4c(%rbp)
drivers/net/xen-netback/netback.c:603
meta = npo->meta + npo->meta_prod++;
796: 8b 74 24 48 mov 0x48(%rsp),%esi
79a: 89 f0 mov %esi,%eax
79c: ff c6 inc %esi
79e: 48 6b c0 0c imul $0xc,%rax,%rax
7a2: 89 74 24 48 mov %esi,0x48(%rsp)
7a6: 48 03 44 24 58 add 0x58(%rsp),%rax
drivers/net/xen-netback/netback.c:605
if (!vif->gso_prefix)
7ab: f6 45 60 04 testb $0x4,0x60(%rbp)
7af: 75 17 jne 7c8 <xen_netbk_rx_action+0x177>
drivers/net/xen-netback/netback.c:606
meta->gso_size = skb_shinfo(skb)->gso_size;
7b1: 8b b3 d0 00 00 00 mov 0xd0(%rbx),%esi
7b7: 48 8b bb d8 00 00 00 mov 0xd8(%rbx),%rdi
7be: 0f b7 74 37 02 movzwl 0x2(%rdi,%rsi,1),%esi
7c3: 89 70 08 mov %esi,0x8(%rax)
7c6: eb 07 jmp 7cf <xen_netbk_rx_action+0x17e>
drivers/net/xen-netback/netback.c:608
else
meta->gso_size = 0;
7c8: c7 40 08 00 00 00 00 movl $0x0,0x8(%rax)
drivers/net/xen-netback/netback.c:611
meta->size = 0;
meta->id = req->id;
7cf: 89 d2 mov %edx,%edx
drivers/net/xen-netback/netback.c:610
if (!vif->gso_prefix)
meta->gso_size = skb_shinfo(skb)->gso_size;
else
meta->gso_size = 0;
meta->size = 0;
7d1: c7 40 04 00 00 00 00 movl $0x0,0x4(%rax)
drivers/net/xen-netback/netback.c:611
meta->id = req->id;
7d8: 48 83 c2 08 add $0x8,%rdx
7dc: 0f b7 34 d1 movzwl (%rcx,%rdx,8),%esi
7e0: 89 30 mov %esi,(%rax)