From: jianhai luan <jianhai.luan@oracle.com>
To: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
	xen-devel@lists.xenproject.org, netdev@vger.kernel.org,
	ANNIE LI <annie.li@oracle.com>
Subject: Re: DomU's network interface will hung when Dom0 running 32bit
Date: Tue, 15 Oct 2013 19:26:31 +0800
Message-ID: <525D2667.6040102@oracle.com>
In-Reply-To: <20131015100624.GB29436@zion.uk.xensource.com>


On 2013-10-15 18:06, Wei Liu wrote:
> On Tue, Oct 15, 2013 at 05:34:57PM +0800, jianhai luan wrote:
>> On 2013-10-15 16:43, Ian Campbell wrote:
>>> On Tue, 2013-10-15 at 10:44 +0800, jianhai luan wrote:
>>>> On 2013-10-14 19:19, Wei Liu wrote:
>>>>> On Sat, Oct 12, 2013 at 04:53:18PM +0800, jianhai luan wrote:
>>>>>> Hi Ian,
>>>>>>     I recently hit an issue where a DomU's network interface hangs,
>>>>>> and I have been working on it since. I find that a DomU network
>>>>>> interface that sends little traffic will hang if Dom0 is running
>>>>>> 32-bit and the DomU's uptime is very long. I think there is a
>>>>>> jiffies overflow bug in the function tx_credit_exceeded().
>>>>>>     I know the macro time_after_eq(a,b) handles jiffies
>>>>>> overflow, but it has one limit: a must be less than (b +
>>>>>> MAX_SIGNAL_LONG). If a is larger than that value, time_after_eq
>>>>>> will return false. MAX_SIGNAL_LONG (the largest signed long) is
>>>>>> 0x7fffffff on a 32-bit machine.
>>>>>>     If the DomU's network interface sends little traffic (<0.5k/s
>>>>>> if HZ=250 and credit_bytes=ULONG_MAX), jiffies will go beyond
>>>>>> (credit_timeout.expires + MAX_SIGNAL_LONG) and time_after_eq(now,
>>>>>> next_credit) will return false (it should be true). So a timer is
>>>>>> armed that will not fire for a long time, and later processing is
>>>>>> aborted while timer_pending(&vif->credit_timeout) is true. The
>>>>>> result is that the DomU's network interface hangs for a long time
>>>>>> (> 40 days).
>>>>>>     Please consider the scenario below:
>>>>>>     Condition:
>>>>>>       Dom0 running 32-bit and HZ = 1000
>>>>>>       vif->credit_timeout.expires = 0xffffffff,
>>>>>> vif->remaining_credit = 0xffffffff, vif->credit_usec = 0,
>>>>>> jiffies = 0
>>>>>>       The vif receives little traffic (the DomU sends little). If
>>>>>> the rate is less than 2K/s, consuming 4G (0xffffffff) takes more
>>>>>> than 582.55 hours. jiffies will then be larger than 0x7fffffff.
>>>>>> Say jiffies = 0x800000ff: time_after_eq(0x800000ff, 0xffffffff)
>>>>>> returns false, and a timer whose expiry is 0xffffffff is queued.
>>>>>> So the interface hangs until jiffies counts up to 0xffffffff again
>>>>>> (which takes a very long time).
>>>>> If I'm not mistaken you meant time_after_eq(now, next_credit) in
>>>>> netback. How does next_credit become 0xffffffff?
>>>> I only assumed the value 0xffffffff as an example; the exact value
>>>> of next_credit isn't the point. If the delta between now and
>>>> next_credit is larger than ULONG_MAX/2, time_after_eq will give the
>>>> wrong answer.
>>> So it sounds like we need a timer which is independent of the traffic
>>> being sent to keep credit_timeout.expires rolling over.
>>>
>>> Can you propose a patch?
>> Because credit_timeout.expires is always after jiffies, I detect a
>> value outside the range of time_after_eq() with time_before(now,
>> vif->credit_timeout.expires). Please check the patch.
> I don't think this really fixes the issue for you. You still have a
> chance that now wraps around and falls between expires and next_credit.
> In that case it's stalled again.
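
For reference, the failing comparison from the scenario quoted above can be reproduced in user space. This is a minimal sketch, not the kernel code: time_after_eq32() and time_before32() are hypothetical 32-bit stand-ins for the jiffies macros, with uint32_t assumed to model unsigned long on a 32-bit Dom0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical 32-bit stand-ins for time_after_eq()/time_before().
 * Assumption: uint32_t models unsigned long on a 32-bit Dom0, and the
 * cast to int32_t wraps as two's complement (true on common compilers).
 */
static inline bool time_after_eq32(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) >= 0;
}

static inline bool time_before32(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) < 0;
}

/* Checks the values from the scenario: next_credit = 0xffffffff. */
static inline void check_scenario(void)
{
	const uint32_t next_credit = 0xffffffffu;

	/* Less than 2^31 ticks after next_credit: reported as "after". */
	assert(time_after_eq32(0x7ffffffeu, next_credit));
	/* 0x80000100 ticks later: the signed difference turns negative. */
	assert(!time_after_eq32(0x800000ffu, next_credit));
	/* The same instant now looks as if it were before next_credit. */
	assert(time_before32(0x800000ffu, next_credit));
}
```

Calling check_scenario() from a small main() should pass silently; the second assertion is the case that leaves the timer pending.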

If time_before(now, vif->credit_timeout.expires) is true, time has
wrapped and we do the operation. Otherwise, when time_before(now,
vif->credit_timeout.expires) is not true, now -
vif->credit_timeout.expires should be less than ULONG_MAX/2. Because
next_credit is larger than vif->credit_timeout.expires (next_credit =
vif->credit_timeout.expires +
msecs_to_jiffies(vif->credit_usec/1000)), the delta between now and
next_credit should be within the range of time_after_eq(). So
time_after_eq() judges correctly.
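
The argument above can be sketched as a small user-space check. This is only an illustration under assumptions, not the patch itself: may_replenish(), time_after_eq32() and time_before32() are hypothetical names, and uint32_t stands in for a 32-bit unsigned long.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit stand-ins for the jiffies comparison macros. */
static inline bool time_after_eq32(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) >= 0;
}

static inline bool time_before32(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) < 0;
}

/*
 * Hypothetical helper: credit may be replenished if now has reached
 * next_credit, or if now appears to be before expires, which is taken
 * as a sign that jiffies has left the range of time_after_eq().
 */
static inline bool may_replenish(uint32_t now, uint32_t expires,
				 uint32_t credit_jiffies)
{
	uint32_t next_credit = expires + credit_jiffies; /* wraps mod 2^32 */

	return time_before32(now, expires) ||
	       time_after_eq32(now, next_credit);
}

/* Exercises both branches plus the ordinary "not yet" case. */
static inline void check_guard(void)
{
	/* Out of range: caught by the time_before32() term. */
	assert(may_replenish(0x800000ffu, 0xffffffffu, 0u));
	/* In range and past next_credit: caught by time_after_eq32(). */
	assert(may_replenish(1100u, 1000u, 100u));
	/* Between expires and next_credit: no replenish yet. */
	assert(!may_replenish(1050u, 1000u, 100u));
}
```

In this sketch a now that lands between expires and next_credit is treated as "not yet", whether or not jiffies has gone all the way around in the meantime.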

Jason
>
> Wei.
