From: Jarek Poplawski <jarkao2@gmail.com>
To: Tuomas Jormola <tj@solitudo.net>
Cc: netdev@vger.kernel.org
Subject: Re: PROBLEM: A set of networking related oopses
Date: Thu, 24 Apr 2008 22:01:00 +0200
Message-ID: <20080424200100.GA2900@ami.dom.local>
In-Reply-To: <20080424142559.GA25023@solitudo.net>

On Thu, Apr 24, 2008 at 05:25:59PM +0300, Tuomas Jormola wrote:
> Hi again,
> 
> On Sun, Mar 09, 2008 at 06:31:22PM +0100, Jarek Poplawski wrote:
> > On Sun, Mar 09, 2008 at 06:58:47PM +0200, Tuomas Jormola wrote:
> > ...
> > > there be new oopses, I will replace the old card with a newer Intel 
> > > gigabit card that I have laying around, and put it in a different PCI 
> > > slot.
> > 
> > The link I gave you described similar problem just with e1000.
> > The next message after this thread looks alike (e1000 driver).
> > So, you shouldn't hurry with this change. Just set this affinity
> > for both cards and check if it's respected.
> I've now run my system about a month with the following configuration. I
> replaced the very old e100 card with a newer e1000 PCI card and set
> affinity so that interrupts for the IRQs of both e1000e and e1000 cards
> are handled by a single CPU, and this is working very well.
> 
> (17:15:13)(tj@shakti)(~)$ grep eth /proc/interrupts 
>  18:   88113407       3780   IO-APIC-fasteoi   uhci_hcd:usb1, uhci_hcd:usb6, eth0
> 217:    9710797       4297   PCI-MSI-edge      eth1
> 
> (This is after about 8 days of uptime, the affinity was set
> automatically in a local init script)
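
To check whether the affinity is respected without eyeballing the
counters, something like the rough sketch below could be used. It only
assumes the /proc/interrupts layout shown in your output (one count
column per CPU, two here); parse_interrupts() and busiest_cpu_share()
are names made up for this example, not an existing tool.

```python
def parse_interrupts(text, n_cpus=2):
    """Map IRQ label -> (per-CPU counts, description) for /proc/interrupts text.

    n_cpus is an assumption: the number of count columns (two in the
    output quoted above).
    """
    rows = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # header line ("CPU0 CPU1") or blank line
        irq, rest = line.split(":", 1)
        fields = rest.split()
        # Skip rows that do not carry one numeric counter per CPU.
        if len(fields) < n_cpus or not all(f.isdigit() for f in fields[:n_cpus]):
            continue
        counts = [int(f) for f in fields[:n_cpus]]
        rows[irq.strip()] = (counts, " ".join(fields[n_cpus:]))
    return rows


def busiest_cpu_share(counts):
    """Fraction of an IRQ's interrupts handled by its busiest CPU."""
    total = sum(counts)
    return max(counts) / total if total else 0.0


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        for irq, (counts, desc) in parse_interrupts(f.read()).items():
            if "eth" in desc:
                print(irq, counts, round(busiest_cpu_share(counts), 4))
```

A share close to 1.0 for an IRQ means nearly all of its interrupts
landed on one CPU, which is what your numbers for irq 18 and irq 217
show.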

BTW, you could also check whether setting affinity to different
processors works for you, i.e. irq 18 to cpu1 and irq 217 to cpu 2
(as described in the link mentioned earlier).
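
For illustration, here is a minimal sketch of that split. It assumes
"cpu1" and "cpu 2" above mean the first and second processors, which
the kernel numbers CPU0 and CPU1, and that the box has fewer than 32
CPUs so a single hex word is enough for /proc/irq/<N>/smp_affinity.
cpu_mask() and set_irq_affinity() are hypothetical helper names, and
the proc_root parameter exists only so the code can be tried against
a scratch directory.

```python
import os


def cpu_mask(cpu):
    """Hex bitmask with only bit `cpu` set: CPU0 -> "1", CPU1 -> "2"."""
    if not 0 <= cpu < 32:
        raise ValueError("this sketch only handles CPU numbers 0..31")
    return format(1 << cpu, "x")


def set_irq_affinity(irq, cpu, proc_root="/proc"):
    """Write a single-CPU mask to <proc_root>/irq/<irq>/smp_affinity."""
    path = os.path.join(proc_root, "irq", str(irq), "smp_affinity")
    with open(path, "w") as f:
        f.write(cpu_mask(cpu) + "\n")
    return path


# Example (needs root on a real system):
#   set_irq_affinity(18, 0)    # eth0's IRQ -> CPU0
#   set_irq_affinity(217, 1)   # eth1's IRQ -> CPU1
```

The kernel may still refuse or ignore the mask for some interrupt
controllers, so it is worth re-reading /proc/interrupts after a while.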

> And with this, I've gotten rid of the OOPSes I had earlier. But is this
> really a feasible long term solution to the problem? I.e. if you're
> getting networking related OOPSes with SMP kernel on a box with two or
> more CPUs, the first thing you should do is to switch off the interrupt
> handling load balancing between the CPUs by issuing some obscure statement
> on the command line? I don't think that's very friendly advice for so
> called regular users... There's no way to work around it on the kernel
> side?

It looks like there are still attempts to fix this issue. Here is a
link to an interesting thread on this subject:

http://groups.google.com/group/linux.kernel/browse_thread/thread/6079876757758daa/43d38042acd9fb73?lnk=raot

Probably regular users shouldn't have such problems if they use
friendly distros.

> Also after installing the e1000 card, I've gotten a few of these dumps
> (see attachments) from the e1000 driver (during about a month, a dozen
> incidents, sometimes there might be 3 incidents a day, sometimes it
> takes a week when everything's normal).

Alas, I'm not an e1000 expert (this balancing advice is rather a general
issue). I've seen similar Tx hang reports, but it seems there could be
various reasons. Probably some of these could be fixed in current
kernels - did you try 2.6.25 BTW? Here is a case where turning off TSO
helped with something similar:

http://bugzilla.kernel.org/show_bug.cgi?id=9808

So, if you still have these problems with current kernels and you are
willing to help debug this, you should probably report it in
bugzilla too.

Regards,
Jarek P.

