netdev.vger.kernel.org archive mirror
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Holger Hoffstaette <holger.hoffstaette@googlemail.com>
Cc: netdev@vger.kernel.org
Subject: Re: Network hangs with 2.6.30.5
Date: Thu, 03 Sep 2009 21:27:08 +0200
Message-ID: <4AA0188C.20107@gmail.com>
In-Reply-To: <pan.2009.09.03.19.20.44.736875@googlemail.com>

Holger Hoffstaette wrote:
> Problem found! At least for me..
> 
> On Thu, 03 Sep 2009 07:46:10 +0000, Jarek Poplawski wrote:
> 
>> On 01-09-2009 17:32, Holger Hoffstaette wrote:
>>> On Tue, 01 Sep 2009 16:17:08 +0200, Holger Hoffstaette wrote:
>>>
>>> [network regressions in .30]
>>>
>>>> I do have an older Intel Gbit card identified thusly: 00:0b.0 Ethernet
>>>> controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev
>>>> 04)
>>>>
>>>> and enabled all sorts of offloading:
>>>>
>>>> $ ethtool -k eth0
>>>> Offload parameters for eth0:
>>>> rx-checksumming: on
>>>> tx-checksumming: on
>>>> scatter-gather: on
>>>> tcp segmentation offload: on
>>>> udp fragmentation offload: off
>>>> generic segmentation offload: on
>>>>
>>>> Maybe that is the culprit, as Eric Dumazet suspected in his mail..I
>>>> will try the latest .30 stable again without that, but in any case
>>>> something is indeed very broken in there.
>>> So I just tried .30.5 again. Indeed the offloading seems to play a role:
>>> with everything enabled I cannot even reliably ssh into the machine
>>> (only "sometimes"?); however without any offloading things get "a bit
>>> better" and squid even serves up some pages..for a while. Then it seems
>>> to hang, swallow requests or not finish them. The tested sites reliably
>>> work for the Windows client when it bypasses squid, as does DNS (also
>>> served from the box). It *seems* to affect incoming traffic more than
>>> outgoing - e.g. mail or news polling seemed to kick off and finish just
>>> fine. Rebooting back into .29 fixes everything. Last time I tried
>>> .31rc-something (4 IIRC) it exhibited the same problems.
>>>
>>> I'm open to suggestions and willing to help fix this but need this
>>> machine for actual work. :/
>> It seems you and Clifford use e1000, so it would be interesting to find
>> out if it matters. Does your friend with working .30 use another card? If
>> you can't try with another NIC, we could probably try to revert most of
>> the driver's changes after .29 (except maybe 3) to check this driver only.
>>
>> Clifford, if it still doesn't work for you, could you try 2.6.29?
> 
> I got the git .30.y stable tree and reverted various e1000 commits that
> seemed to coincide with the various .30-rc releases but nothing helped.
> Also no relation to offloads etc.
> 
> However I did notice that the "stuck squid" problem seemed to magically
> fix itself after a few seconds - then hang again, fix itself after
> timeouts etc. So I suspected something TCP related and BINGO!
> 
> Turns out I had both tcp_tw_recycle and tcp_tw_reuse set to 1 for reasons
> I don't want to explain. :)
> 
> I can now arbitrarily fix the hanging behaviour by setting
> tcp_tw_recycle to 0, and cause hangs by setting it to 1 again. For obvious
> reasons this seems to affect squid more than other tasks with more long-lived
> connections. What is the right behaviour? Beats me.
> 
> tcp_tw_reuse does not appear to play a role, so the real culprit at least
> in my case seems to be tcp_tw_recycle. In previous releases this (and
> tw_reuse) was necessary for various server tasks.
> 
> Nevertheless, something has changed between .29 and .30 that "broke" the
> previous behaviour. Whether this is progress or a regression I cannot
> say. Maybe someone else has an idea?
> 

Well... not yet :)

We can probably reproduce this problem with any NIC...
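
For a reproduction attempt, the toggle described above could be scripted roughly as follows. This is a minimal sketch, assuming a root shell and a kernel that still exposes /proc/sys/net/ipv4/tcp_tw_recycle. The helper name set_tw_recycle is hypothetical, and its optional second argument exists only so it can be tried against a scratch file first:

```shell
#!/bin/sh
# Hypothetical helper: write 0 or 1 to the tcp_tw_recycle sysctl file
# and print the value that is in effect afterwards.
set_tw_recycle() {
    value=$1
    file=${2:-/proc/sys/net/ipv4/tcp_tw_recycle}

    # Only 0 and 1 are meaningful for this switch.
    case $value in
        0|1) ;;
        *) echo "usage: set_tw_recycle 0|1 [file]" >&2; return 2 ;;
    esac

    # Fails without root, or when the kernel does not have this sysctl.
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi

    printf '%s\n' "$value" > "$file" && cat "$file"
}

# Usage (as root):
#   set_tw_recycle 1    # hangs are reported to appear
#   set_tw_recycle 0    # hangs are reported to go away
```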

Could you send the output of this command from the 'buggy' setup?

$ grep . /proc/sys/net/ipv4/*
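
To compare a hanging and a working configuration, the same grep can be captured once per setting and the two files diffed. A rough sketch, where snapshot_sysctls is a hypothetical helper name and the /tmp file names are placeholders:

```shell
#!/bin/sh
# Hypothetical helper: print every readable value in a sysctl directory
# as "path:value" lines, sorted so two snapshots can be diffed.
snapshot_sysctls() {
    dir=${1:-/proc/sys/net/ipv4}
    # -s silences errors for subdirectories and unreadable entries;
    # the extra /dev/null argument forces the file-name prefix even
    # when only one file matches.
    grep -s . "$dir"/* /dev/null | LC_ALL=C sort
}

# Usage: take one snapshot per setting, then compare them.
#   snapshot_sysctls > /tmp/ipv4-recycle-on.txt
#   snapshot_sysctls > /tmp/ipv4-recycle-off.txt
#   diff /tmp/ipv4-recycle-on.txt /tmp/ipv4-recycle-off.txt
```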


When you say squid is stuck, does it mean it doesn't accept new connections?

It could help to strace it and check what it is doing.
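
Something along these lines would attach strace to the running squid and log its system calls with timestamps. It is only a sketch: trace_squid, the DRY_RUN variable and the default output path are made up for illustration, and it assumes pidof finds exactly one squid process to attach to:

```shell
#!/bin/sh
# Hypothetical helper: attach strace to a PID, follow child processes (-f),
# timestamp each call (-tt), and write the trace to a file (-o).
# With DRY_RUN set, only print the command that would be run.
trace_squid() {
    pid=$1
    out=${2:-/tmp/squid.strace}
    set -- strace -f -tt -p "$pid" -o "$out"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage (as root, while squid appears stuck; stop with Ctrl-C):
#   trace_squid "$(pidof squid)"
```

If the trace shows squid sitting in its event loop with no accept() calls returning while clients are retrying, that would suggest the incoming connections are being dropped before they reach the application.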
