From: Hans Nieser <hnsr@xs4all.nl>
To: Francois Romieu <romieu@fr.zoreil.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Mass udp flow reboot linux with RealTek RTL-8169 Gigabit
Date: Fri, 25 Feb 2011 14:45:42 +0100
Message-ID: <1298641542.18103.48.camel@krikkit>
In-Reply-To: <1298485893.31256.40.camel@krikkit>

On Wed, 2011-02-23 at 19:31 +0100, Hans Nieser wrote:
> On Wed, 2011-02-23 at 13:21 +0100, Hans Nieser wrote:

> > On Wed, 2011-02-23 at 10:55 +0100, Francois Romieu wrote:
> > You may enable PCIEASPM_DEBUG, force 'pcie_aspm=off' and switch from
> > SLUB to SLAB but it's a bit cargo-cultish.

Sadly, this seemed to have no effect.

> Ok, I just tried 2.6.34, and after over 5 hours of running my script,
> the system is still up and running, with only 24 'link up' messages on
> dmesg, and having transferred 2.1TiB of data (1428042421 rx_packets, 45
> rx_missed). So I'm going to assume the problem isn't present with this
> kernel and try a bisect between it and 2.6.35

After spending the entire day yesterday and this morning bisecting this,
I haven't gotten anywhere :/ I ended up at an unrelated commit as the
first known bad commit (84d4db0e22965334ae8272f324d31fb4657465aa); I
think I may have marked a bad commit as good. To properly bisect this
issue I would probably need to test each commit for several hours across
multiple reboots, but that would take too much time. I have at least
been able to establish that, after v2.6.34, the following commits are
bad:

c222fb2efaf1a421f5bf74403df40a9384ccf516
4a973f2495fba8775d1c408b3ee7f2c19b19f13f
84d4db0e22965334ae8272f324d31fb4657465aa
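
To put a rough number on the risk of marking a bad commit as good: if a
bad kernel crashes during a single test run with probability p, and runs
are independent, it survives n runs with probability (1 - p)^n. The
sketch below is illustrative only; the function names, the independence
assumption and any value of p are assumptions, not measurements from
this report.

```python
import math


def false_good_probability(p_fail: float, trials: int) -> float:
    """Chance that a bad commit survives every trial and gets marked good.

    Assumes each trial fails independently with probability p_fail.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if trials < 0:
        raise ValueError("trials must be non-negative")
    return (1.0 - p_fail) ** trials


def trials_needed(p_fail: float, max_false_good: float) -> int:
    """Smallest number of clean trials before calling a commit good.

    Solves (1 - p_fail) ** n <= max_false_good for the smallest integer n.
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < max_false_good < 1.0:
        raise ValueError("max_false_good must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1.0 - p_fail))
```

For example, if a bad kernel only crashes in half of the test runs, a
single clean run says very little, and trials_needed(0.5, 0.05) gives the
number of clean runs needed before the chance of a wrong "good" drops
below 5%.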

After that I tried various other things (on 2.6.38-rc6+) and made some
interesting and confusing discoveries:

- Setting pci=nomsi causes instant reboot when I start my test script

- Enabling only one CPU core in the BIOS seems to solve the whole lockup
  problem: I have not been able to reproduce it after a few hours of
  testing (nor on 2.6.35). (Normally on 2.6.38-rc6 it would crash in
  just a few seconds.)

  Additionally, when I force wget to use IPv4 with only one core
  enabled, I'm suddenly getting a solid 112MB/s instead of the lousy
  9-12MB/s I have been getting since 2.6.36 - but only when using one
  core. With all 4 cores enabled, performance is bad again even when
  forcing wget to use IPv4..

  Using only one CPU core also reduces the 'link up' messages a lot; I
  only got a couple instead of hundreds/thousands.

- Enabling the Tickless System (NO_HZ) kernel option seems to make the
  lockup occur less frequently (though it still happens) and produces far
  fewer 'link up' messages, but it also causes an occasional "NOHZ:
  local_softirq_pending 08" to appear in dmesg.

- Enabling HyperThreading in the BIOS (I disable it by default due to an
  issue with VirtualBox) makes performance even worse: just 2-3MB/s
  instead of 9-12MB/s.
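
For comparing the 'link up' counts between configurations, a small
helper along these lines could tally them from saved dmesg output. This
is a minimal sketch: the function name is made up, the default interface
name "eth0" is an assumption, and it only assumes that each relevant log
line contains the interface name and the text "link up".

```python
def count_link_up(dmesg_text: str, iface: str = "eth0") -> int:
    """Count log lines that mention iface together with 'link up'."""
    count = 0
    for line in dmesg_text.splitlines():
        # Plain substring matching, so the exact driver prefix does not matter.
        if iface in line and "link up" in line:
            count += 1
    return count
```

Feeding it the text captured from dmesg after each test run gives one
number per configuration (one core, all cores, NO_HZ on or off) that can
be compared directly.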


I've also attempted to bisect the issue I have been having with slow
transfer speed (I don't know if it's related to the hang, but I figure if
the hang ever gets fixed, this will have to be fixed as well to make
r8169 usable for me), which started somewhere between v2.6.35 (good) and
v2.6.36 (bad). Unfortunately this too ended up at a seemingly unrelated
commit:

af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867 - clockevents: Remove the per
cpu tick skew

Just for kicks I attempted to revert this change on 2.6.38-rc6+, which
seemed to reduce the frequency of 'link up' messages, but I noticed no
other real change.
