From: jianhai luan <jianhai.luan@oracle.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>,
xen-devel@lists.xenproject.org, netdev@vger.kernel.org,
ANNIE LI <annie.li@oracle.com>
Subject: Re: DomU's network interface will hung when Dom0 running 32bit
Date: Tue, 15 Oct 2013 17:34:57 +0800
Message-ID: <525D0C41.2080407@oracle.com>
In-Reply-To: <1381826609.24708.135.camel@kazak.uk.xensource.com>
[-- Attachment #1: Type: text/plain, Size: 2838 bytes --]
On 2013-10-15 16:43, Ian Campbell wrote:
> On Tue, 2013-10-15 at 10:44 +0800, jianhai luan wrote:
>> On 2013-10-14 19:19, Wei Liu wrote:
>>> On Sat, Oct 12, 2013 at 04:53:18PM +0800, jianhai luan wrote:
>>>> Hi Ian,
>>>> I recently hit an issue where DomU's network interface hangs, and I
>>>> have been working on it since. I find that a DomU network interface
>>>> which sends little traffic will hang if Dom0 is running a 32-bit
>>>> kernel and DomU's uptime is very long. I think there is a jiffies
>>>> overflow bug in the function tx_credit_exceeded().
>>>> I know the macro time_after_eq(a, b) handles jiffies overflow, but
>>>> it has one limit: a must be less than (b + MAX_SIGNED_LONG). If a is
>>>> larger than that value, time_after_eq() returns false.
>>>> MAX_SIGNED_LONG (i.e. LONG_MAX) is 0x7fffffff on a 32-bit machine.
>>>> If DomU's network interface sends little traffic (<0.5k/s if
>>>> HZ=250 and credit_bytes=ULONG_MAX), jiffies will go beyond
>>>> (credit_timeout.expires + MAX_SIGNED_LONG) and time_after_eq(now,
>>>> next_credit) will return false (it should be true). So a timer is
>>>> armed which will not fire for a long time, and later processing is
>>>> aborted while timer_pending(&vif->credit_timeout) is true. The
>>>> result is that DomU's network interface hangs for a long time
>>>> (> 40 days).
>>>> Please consider the scenario below:
>>>> Condition:
>>>> Dom0 running 32-bit and HZ = 1000
>>>> vif->credit_timeout.expires = 0xffffffff, vif->remaining_credit
>>>> = 0xffffffff, vif->credit_usec = 0, jiffies = 0
>>>> The vif receives little traffic (DomU sends little traffic). If
>>>> the rate is less than 2K/s, consuming 4G (0xffffffff) takes 582.55
>>>> hours, and jiffies will grow larger than 0x7fffffff. Say jiffies =
>>>> 0x800000ff: time_after_eq(0x800000ff, 0xffffffff) returns false, and
>>>> a timer whose expiry is 0xffffffff is queued in the system. So
>>>> the interface hangs until jiffies counts up to 0xffffffff again
>>>> (which takes a very long time).
>>> If I'm not mistaken you meant time_after_eq(now, next_credit) in
>>> netback. How does next_credit become 0xffffffff?
>> I only assumed the value 0xffffffff as an example; the exact value of
>> next_credit isn't the point. If the delta between now and next_credit
>> is larger than LONG_MAX, time_after_eq() gives the wrong result.
> So it sounds like we need a timer which is independent of the traffic
> being sent to keep credit_timeout.expires rolling over.
>
> Can you propose a patch?
Because credit_timeout.expires should never be ahead of jiffies at this
point, I detect a delta beyond the range of time_after_eq() with
time_before(now, vif->credit_timeout.expires). Please check the attached patch.
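For anyone who wants to see the comparison misbehave without a 32-bit Dom0, here is a minimal user-space sketch. It assumes jiffies is a 32-bit counter and re-implements the two comparisons on fixed-width types; the names model_time_after_eq(), model_time_before(), can_replenish_old(), can_replenish_patched() and replay_scenario() are made up for this illustration and are not kernel symbols. It only models the condition in the patch, not the rest of tx_credit_exceeded().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: models a 32-bit kernel, where jiffies is 32 bits wide and
 * the comparison is done on the signed difference of the two values. */
bool model_time_after_eq(uint32_t a, uint32_t b)
{
    return (int32_t)(uint32_t)(a - b) >= 0;
}

bool model_time_before(uint32_t a, uint32_t b)
{
    return (int32_t)(uint32_t)(a - b) < 0;
}

/* Condition used before the patch. */
bool can_replenish_old(uint32_t now, uint32_t next_credit)
{
    return model_time_after_eq(now, next_credit);
}

/* Condition proposed in the patch: also replenish when expires appears
 * to lie in the future, which is taken as a sign of wrap-around. */
bool can_replenish_patched(uint32_t now, uint32_t next_credit,
                           uint32_t expires)
{
    return model_time_after_eq(now, next_credit) ||
           model_time_before(now, expires);
}

/* Replays the numbers from the report: expires = next_credit =
 * 0xffffffff and jiffies has advanced to 0x800000ff. */
void replay_scenario(void)
{
    uint32_t expires = 0xffffffffu;
    uint32_t next_credit = expires; /* credit_usec = 0 */
    uint32_t now = 0x800000ffu;

    /* The delta is 0x80000100, which is negative as a signed 32-bit value. */
    assert(!can_replenish_old(now, next_credit));
    assert(can_replenish_patched(now, next_credit, expires));
}
```

Calling replay_scenario() from a small main() should pass both assertions: the old condition refuses to replenish, the patched one does.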
>
> Ian.
>
>>> Wei.
>>>
>>>> If some error exist in above explain, please help me point it out.
>>>>
>>>> Thanks,
>>>> Jason
>
[-- Attachment #2: 0001-Process-the-wrong-judge-of-time_after_eq.patch --]
[-- Type: text/plain; charset=gb18030; name="0001-Process-the-wrong-judge-of-time_after_eq.patch", Size: 1206 bytes --]
From f08c584ca1f393f6559b58b6b4c9e259c313259e Mon Sep 17 00:00:00 2001
From: Jason Luan <jianhai.luan@oracle.com>
Date: Tue, 15 Oct 2013 17:07:49 +0800
Subject: [PATCH] Process the wrong judge of time_after_eq().
If netfront sends little traffic, the delta between now and next_credit
can go beyond the range that time_after_eq() handles, and the macro
gives the wrong result. Because expires should never be ahead of
jiffies at this point, detect that case with
time_before(now, vif->credit_timeout.expires).
Signed-off-by: Jason Luan <jianhai.luan@oracle.com>
---
drivers/net/xen-netback/netback.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f3e591c..8036ce6 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1195,7 +1195,8 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
 		return true;
 
 	/* Passed the point where we can replenish credit? */
-	if (time_after_eq(now, next_credit)) {
+	if (time_after_eq(now, next_credit) ||
+	    unlikely(time_before(now, vif->credit_timeout.expires))) {
 		vif->credit_timeout.expires = now;
 		tx_add_credit(vif);
 	}
--
1.7.6.5
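The traffic rates quoted in the original report (<0.5k/s at HZ=250, <2K/s at HZ=1000) can be cross-checked with a little arithmetic. The sketch below is only an illustration: stall_rate_threshold() and check_reported_rates() are hypothetical helpers, and it assumes 32-bit jiffies and a credit of at most 0xffffffff bytes. Draining the credit at rate r takes credit/r seconds, i.e. credit * HZ / r jiffies, and the comparison breaks once that exceeds 2^31.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest whole-number rate (bytes/s) at which
 * draining credit_bytes still takes more than 2^31 jiffies at tick
 * rate hz. Assumes 32-bit jiffies and credit_bytes <= 0xffffffff, so
 * the 64-bit product cannot overflow. */
uint64_t stall_rate_threshold(uint64_t credit_bytes, uint32_t hz)
{
    const uint64_t half_range = UINT64_C(1) << 31; /* LONG_MAX + 1 on 32-bit */

    return credit_bytes * hz / half_range;
}

/* Cross-checks the two rates mentioned in the report. */
void check_reported_rates(void)
{
    /* HZ = 250: just under 500 bytes/s, i.e. roughly 0.5k/s. */
    assert(stall_rate_threshold(0xffffffffu, 250) == 499);
    /* HZ = 1000: just under 2000 bytes/s, i.e. roughly 2K/s. */
    assert(stall_rate_threshold(0xffffffffu, 1000) == 1999);
}
```

Both results agree with the figures in the report to within rounding.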