From: jianhai luan <jianhai.luan@oracle.com>
To: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
xen-devel@lists.xenproject.org, netdev@vger.kernel.org,
ANNIE LI <annie.li@oracle.com>
Subject: Re: DomU's network interface will hung when Dom0 running 32bit
Date: Tue, 15 Oct 2013 19:26:31 +0800
Message-ID: <525D2667.6040102@oracle.com>
In-Reply-To: <20131015100624.GB29436@zion.uk.xensource.com>
On 2013-10-15 18:06, Wei Liu wrote:
> On Tue, Oct 15, 2013 at 05:34:57PM +0800, jianhai luan wrote:
>> On 2013-10-15 16:43, Ian Campbell wrote:
>>> On Tue, 2013-10-15 at 10:44 +0800, jianhai luan wrote:
>>>> On 2013-10-14 19:19, Wei Liu wrote:
>>>>> On Sat, Oct 12, 2013 at 04:53:18PM +0800, jianhai luan wrote:
>>>>>> Hi Ian,
>>>>>> I recently hit an issue where DomU's network interface hangs, and
>>>>>> have been working on it since then. I find that a DomU network
>>>>>> interface which sends few packets will hang if Dom0 is running
>>>>>> 32-bit and DomU's uptime is very long. I think that a jiffies
>>>>>> overflow bug exists in the function tx_credit_exceeded().
>>>>>> I know the inline function time_after_eq(a,b) handles jiffies
>>>>>> overflow, but the function has one limit: a should be less than
>>>>>> (b + MAX_SIGNAL_LONG). If a is larger than that value,
>>>>>> time_after_eq will return false. MAX_SIGNAL_LONG should be
>>>>>> 0x7fffffff on a 32-bit machine.
>>>>>> If DomU's network interface sends few packets (<0.5k/s if
>>>>>> jiffies=250 and credit_bytes=ULONG_MAX), jiffies will go beyond
>>>>>> (credit_timeout.expires + MAX_SIGNAL_LONG) and time_after_eq(now,
>>>>>> next_credit) will fail (it should be true). So a timer is set
>>>>>> which will not trigger for a long time, and later processing is
>>>>>> aborted while timer_pending(&vif->credit_timeout) is true. The
>>>>>> result is that DomU's network interface hangs for a long time
>>>>>> (> 40 days).
>>>>>> Please consider the scenario below:
>>>>>> Condition:
>>>>>> Dom0 running 32-bit and HZ = 1000
>>>>>> vif->credit_timeout.expires = 0xffffffff, vif->remaining_credit
>>>>>> = 0xffffffff, vif->credit_usec=0 jiffies=0
>>>>>> vif receives few packets (DomU sends few packets). If the rate
>>>>>> is less than 2K/s, consuming 4G (0xffffffff) will take 582.55
>>>>>> hours. jiffies will be larger than 0x7fffffff. Say jiffies =
>>>>>> 0x800000ff; time_after_eq(0x800000ff, 0xffffffff) will fail, and
>>>>>> a timer whose expiry is 0xffffffff will be pending in the system.
>>>>>> So the interface will hang until jiffies counts up to 0xffffffff
>>>>>> (which will take a very long time).
>>>>> If I'm not mistaken you meant time_after_eq(now, next_credit) in
>>>>> netback. How does next_credit become 0xffffffff?
>>>> I only assumed the value 0xffffffff; the exact value of next_credit
>>>> isn't the point. If the delta between now and next_credit is larger
>>>> than ULONG_MAX/2, time_after_eq will judge wrongly.
>>> So it sounds like we need a timer which is independent of the traffic
>>> being sent to keep credit_timeout.expires rolling over.
>>>
>>> Can you propose a patch?
>> Because credit_timeout.expires is always after jiffies, I detect
>> values outside the range of time_after_eq() with time_before(now,
>> vif->credit_timeout.expires). Please check the patch.
> I don't think this really fixes the issue for you. You still have a
> chance that now wraps around and falls between expires and next_credit.
> In that case it's stalled again.
If time_before(now, vif->credit_timeout.expires) is true, the time has
wrapped and we do the operation. Otherwise, when
time_before(now, vif->credit_timeout.expires) is not true,
now - vif->credit_timeout.expires should be less than ULONG_MAX/2.
Because next_credit is larger than vif->credit_timeout.expires
(next_credit = vif->credit_timeout.expires +
msecs_to_jiffies(vif->credit_usec/1000)), the delta between now and
next_credit should be within the range of time_after_eq(). So
time_after_eq() judges correctly.
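To make the argument above concrete, here is a small user-space sketch.
It is not the patch itself. It assumes a 32-bit unsigned long, modeled
with uint32_t, and re-implements the comparisons by hand instead of
using <linux/jiffies.h>. The names time_after_eq32(), time_before32()
and credit_window_passed() are made up for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space model of time_after_eq(a, b) for a 32-bit unsigned long:
 * the unsigned difference is reinterpreted as signed, so the result is
 * only meaningful while a and b are less than 2^31 ticks apart.
 */
static bool time_after_eq32(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Model of time_before(a, b): true when a is "earlier" than b. */
static bool time_before32(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical helper showing the check described in this mail.
 * If now appears to be before expires, now has moved more than half
 * the counter range past expires, so the credit window is treated as
 * passed. Otherwise now is within half the range of expires, and the
 * normal time_after_eq() test against next_credit is used.
 */
static bool credit_window_passed(uint32_t now, uint32_t expires,
				 uint32_t next_credit)
{
	if (time_before32(now, expires))
		return true;
	return time_after_eq32(now, next_credit);
}
```

With expires = next_credit = 0xffffffff (credit_usec = 0) and
now = 0x800000ff, time_after_eq32(now, next_credit) is false, while
credit_window_passed(now, expires, next_credit) is true.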
Jason
>
> Wei.