From: David Vrabel <david.vrabel@citrix.com>
To: Ed Swierk <eswierk@skyportsystems.com>, xen-devel@lists.xensource.com
Subject: Re: Intermittent xenvif_disconnect() hang on domU destroy
Date: Thu, 2 Jun 2016 10:51:46 +0100
Message-ID: <575001B2.1020707@citrix.com>
In-Reply-To: <1464828206-60442-1-git-send-email-eswierk@skyportsystems.com>
On 02/06/16 01:43, Ed Swierk wrote:
> I'm seeing the xenwatch kernel thread hang intermittently when
> destroying a domU on recent stable xen 4.5, with Linux 4.4.11 + grsec
> dom0.
>
> The domU is created with a virtual network interface connected to a
> physical interface (ixgbevf) via an openvswitch virtual switch.
>
> Everything works fine until the domain is destroyed. Once in a while,
> a few seconds after the domain goes away, xenwatch hangs in
> xenvif_disconnect(), calling kthread_stop() on a dealloc task.
>
> I added a warning to xenvif_dealloc_kthread_should_stop() when
> kthread_should_stop() is true and queue->inflight_packets > 0,
> printing inflight_packets as well as stats.tx_zerocopy_*. Each time
> the hang occurs, inflight_packets == 1 and tx_zerocopy_sent ==
> tx_zerocopy_success + tx_zerocopy_fail + 1.
>
> I also added a warning to xenvif_skb_zerocopy_complete() when
> queue->task is null. If I manually bring down the physical interface
> to which the vif was connected (ifconfig down), this somehow causes
> the last in-flight packet to be transmitted, and everything is
> unblocked.
>
[...]
>
> It's not clear to me whether the problem lies in netback, ixgbevf, or
> somewhere in between. Is the root cause ixgbevf hanging onto a skb for
> so long, and doing nothing with it until I bring the interface down,
> or is that a symptom of some other problem? Or is netback supposed to
> somehow flush in-flight transmit packets before it gets as far as
> xenvif_disconnect()? Or should it forget about the in-flight packets
> since the interface is disappearing anyway?
>
> Any clues would be appreciated.

netback can't flush in-flight packets because the network stack doesn't
provide a mechanism for this. It can't forget about in-flight packets
because they have foreign (grant mapped) pages which must be unmapped
and the requests completed before the device can be CLOSED.

You need to investigate what is holding on to this skb.

David
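
For readers following the thread, the constraint described above can be
illustrated with a small userspace model. This is only a sketch, not the
netback code: the struct, the model_* function names and the
stop_requested flag (standing in for kthread_should_stop()) are
hypothetical. Only the counter names are taken from the report. It
assumes the dealloc task may exit only once a stop has been requested
and no zerocopy skb is still outstanding, which would explain why a
single skb held elsewhere blocks kthread_stop() indefinitely.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only. Counter names follow the report; everything
 * else here is a hypothetical stand-in for the real driver state.
 */
struct model_queue {
    int inflight_packets;              /* skbs sent but not yet completed */
    unsigned long tx_zerocopy_sent;
    unsigned long tx_zerocopy_success;
    unsigned long tx_zerocopy_fail;
    bool stop_requested;               /* stands in for kthread_should_stop() */
};

/* A zerocopy skb with grant-mapped pages is handed to the stack. */
static void model_send(struct model_queue *q)
{
    q->inflight_packets++;
    q->tx_zerocopy_sent++;
}

/* Whoever holds the skb finally releases it; only now can the
 * foreign pages be unmapped and the request completed.
 */
static void model_complete(struct model_queue *q, bool success)
{
    assert(q->inflight_packets > 0);
    q->inflight_packets--;
    if (success)
        q->tx_zerocopy_success++;
    else
        q->tx_zerocopy_fail++;
}

/* The dealloc task may exit only when a stop was requested AND
 * nothing is in flight, so one stuck skb keeps it alive.
 */
static bool model_dealloc_should_stop(const struct model_queue *q)
{
    return q->stop_requested && q->inflight_packets == 0;
}
```

In this model, sending three packets, completing two and then requesting
a stop reproduces the reported state: inflight_packets == 1 and
tx_zerocopy_sent == tx_zerocopy_success + tx_zerocopy_fail + 1, with
model_dealloc_should_stop() returning false until the last completion
arrives.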