netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: jianhai luan <jianhai.luan@oracle.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>,
	xen-devel@lists.xenproject.org, netdev@vger.kernel.org,
	ANNIE LI <annie.li@oracle.com>
Subject: Re: DomU's network interface will hung when Dom0 running 32bit
Date: Tue, 15 Oct 2013 17:34:57 +0800	[thread overview]
Message-ID: <525D0C41.2080407@oracle.com> (raw)
In-Reply-To: <1381826609.24708.135.camel@kazak.uk.xensource.com>

[-- Attachment #1: Type: text/plain, Size: 2838 bytes --]


On 2013-10-15 16:43, Ian Campbell wrote:
> On Tue, 2013-10-15 at 10:44 +0800, jianhai luan wrote:
>> On 2013-10-14 19:19, Wei Liu wrote:
>>> On Sat, Oct 12, 2013 at 04:53:18PM +0800, jianhai luan wrote:
>>>> Hi Ian,
>>>>     I meet the DomU's network interface hung issue recently, and have
>>>> been working on the issue from that time. I find that DomU's network
>>>> interface, which send lesser package, will hung if Dom0 running
>>>> 32bit and DomU's up-time is very long.  I think that one jiffies
>>>> overflow bug exist in the function tx_credit_exceeded().
>>>>     I know the inline function time_after_eq(a,b) will process jiffies
>>>> overflow, but the function have one limit a should little that (b +
>>>> MAX_SIGNAL_LONG). If a large than the value, time_after_eq will
>>>> return false. The MAX_SINGNAL_LONG should be 0x7fffffff at 32-bit
>>>> machine.
>>>>     If DomU's network interface send lesser package (<0.5k/s if
>>>> jiffies=250 and credit_bytes=ULONG_MAX), jiffies will beyond out
>>>> (credit_timeout.expires + MAX_SIGNAL_LONG) and time_after_eq(now,
>>>> next_credit) will failure (should be true). So one timer which will
>>>> not be trigger in short time, and later process will be aborted when
>>>> timer_pending(&vif->credit_timeout) is true. The result will be
>>>> DomU's network interface will be hung in long time (> 40days).
>>>>     Please think about the below scenario:
>>>>     Condition:
>>>>       Dom0 running 32-bit and HZ = 1000
>>>>       vif->credit_timeout->expire = 0xffffffff, vif->remaining_credit
>>>> = 0xffffffff, vif->credit_usec=0 jiffies=0
>>>>       vif receive lesser package (DomU send lesser package). If the
>>>> value is litter than 2K/s, consume 4G(0xffffffff) will need 582.55
>>>> hours. jiffies will large than 0x7ffffff. we guess jiffies =
>>>> 0x800000ff, time_after_eq(0x800000ff, 0xffffffff) will failure, and
>>>> one time which expire is 0xfffffff will be pended into system. So
>>>> the interface will hung until jiffies recount 0xffffffff (that will
>>>> need very long time).
>>> If I'm not mistaken you meant time_after_eq(now, next_credit) in
>>> netback. How does next_credit become 0xffffffff?
>> I only assume the value is 0xfffffff, and the value of next_credit
>> isn't  point. If the delta between now and next_credit larger than
>> ULONG_MAX, time_after_eq will do wrong judge.
> So it sounds like we need a timer which is independent of the traffic
> being sent to keep credit_timeout.expires rolling over.
>
> Can you propose a patch?

Because credit_timeout.expire always after jiffies, i judge the value 
over the range of time_after_eq() by time_before(now, 
vif->credit_timeout.expires). please check the patch.
>
> Ian.
>
>>> Wei.
>>>
>>>>     If some error exist in above explain, please help me point it out.
>>>>
>>>> Thanks,
>>>> Jason
>


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Process-the-wrong-judge-of-time_after_eq.patch --]
[-- Type: text/plain; charset=gb18030; name="0001-Process-the-wrong-judge-of-time_after_eq.patch", Size: 1206 bytes --]

From f08c584ca1f393f6559b58b6b4c9e259c313259e Mon Sep 17 00:00:00 2001
From: Jason Luan <jianhai.luan@oracle.com>
Date: Tue, 15 Oct 2013 17:07:49 +0800
Subject: [PATCH] Process the wrong judge of time_after_eq().

If netfront send lesser package, the delta between now and next_credit will be out range of time_after_qe() and the function will do wrong judge. Because the expires always after jiffies, we judge the condition by time_before(now, vif->credit_timeout.expires).

Signed-off-by: Jason Luan <jianhai.luan@oracle.com>
---
 drivers/net/xen-netback/netback.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f3e591c..8036ce6 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1195,7 +1195,8 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
 		return true;
 
 	/* Passed the point where we can replenish credit? */
-	if (time_after_eq(now, next_credit)) {
+	if (time_after_eq(now, next_credit) ||
+		unlikely(time_before(now, vif->credit_timeout.expires))) {
 		vif->credit_timeout.expires = now;
 		tx_add_credit(vif);
 	}
-- 
1.7.6.5


  reply	other threads:[~2013-10-15  9:35 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-12  8:53 DomU's network interface will hung when Dom0 running 32bit jianhai luan
2013-10-14 11:19 ` Wei Liu
2013-10-15  2:44   ` jianhai luan
2013-10-15  8:43     ` Ian Campbell
2013-10-15  9:34       ` jianhai luan [this message]
2013-10-15 10:06         ` Wei Liu
2013-10-15 11:26           ` jianhai luan
2013-10-15 12:58             ` Wei Liu
2013-10-15 14:29               ` jianhai luan
2013-10-15 14:49                 ` Wei Liu
2013-10-15 14:50                   ` Ian Campbell
2013-10-15 15:19                     ` jianhai luan
2013-10-15 16:03                       ` Wei Liu
2013-10-15 16:23                         ` jianhai luan
2013-10-16  0:15                           ` jianhai luan
2013-10-16  7:35               ` jianhai luan
2013-10-16  9:39                 ` [Xen-devel] " annie li
2013-10-16 13:08                   ` jianhai luan
2013-10-16 13:47                     ` Wei Liu
2013-10-16 15:04                       ` jianhai luan
2013-10-16 15:17                         ` Wei Liu
2013-10-16 16:11                           ` David Vrabel
2013-10-16 16:44                             ` jianhai luan
2013-10-16 15:26                         ` annie li
2013-10-16  7:10       ` annie li
2013-10-16  8:46         ` Ian Campbell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=525D0C41.2080407@oracle.com \
    --to=jianhai.luan@oracle.com \
    --cc=Ian.Campbell@citrix.com \
    --cc=annie.li@oracle.com \
    --cc=netdev@vger.kernel.org \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).