* Lost packets - strange problem
From: Martín Ferrari @ 2006-03-27 20:33 UTC
To: netfilter
(Cross-posted to the linux-net mailing list)
Hi!
I'm having a very strange problem. I have already tested a *lot* of
things before asking, and I still have no clue what's happening.
I have 6 Linux boxes acting as firewalls/routers. They have been using
similar configurations and netfilter rules for 4 years, since I
installed the first of them. Some of them route more than 10 Mbps
between interfaces, with 50000+ connections tracked by netfilter, traffic
shaping, NAT, and so on, and they don't even blink.
BUT two of them have started giving me headaches. They don't have the
highest usage, yet they lose up to 80% of packets (on any interface);
sometimes ksoftirqd eats all the CPU and you cannot even connect to the
boxes. This did not happen from the very first day, and it does not
happen all the time!
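When softirq processing saturates a CPU, the per-CPU counters in
/proc/net/softnet_stat may show whether the kernel is dropping packets
before netfilter ever sees them. Below is a minimal Python sketch for
reading that file. It assumes the usual layout of one line per CPU with
hexadecimal columns, where the first three are packets processed,
packets dropped (backlog queue full), and time_squeeze (the receive
softirq ran out of budget). The function names and the sample line are
made up for illustration.

```python
def parse_softnet_stat(text):
    """Parse the contents of /proc/net/softnet_stat.

    Returns one dict per line (one line per CPU). Only the first three
    hexadecimal columns are decoded; later columns vary by kernel version.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        processed, dropped, time_squeeze = (int(f, 16) for f in fields[:3])
        rows.append({
            "row": len(rows),              # line index, normally the CPU number
            "processed": processed,        # packets handled by the softirq
            "dropped": dropped,            # dropped because the backlog was full
            "time_squeeze": time_squeeze,  # softirq ran out of budget or time
        })
    return rows


def read_softnet_stat(path="/proc/net/softnet_stat"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_softnet_stat(handle.read())


# Hypothetical sample line, not taken from the affected boxes.
SAMPLE = "0000a3f1 00000002 00000010 00000000 00000000 00000000 00000000 00000000 00000000\n"
sample_rows = parse_softnet_stat(SAMPLE)
```

Calling read_softnet_stat() twice, a minute apart, and comparing the
"dropped" and "time_squeeze" values would show whether they grow while
the packet loss is happening.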
The NICs are mostly 3c905* (a mix of variants), plus some e100 and 3c940
(sk98lin). The troublesome computers have 3c905 and 3c940, but I cannot
find any pattern in the hardware.
Also, the error count is 0 on the Internet-facing interface of the host
which fails the most.
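A zero error count does not necessarily mean zero drops: /proc/net/dev
keeps "errs", "drop" and "fifo" as separate per-interface counters. The
Python sketch below splits them out so they can be compared side by
side. It assumes the standard two-line header followed by 16 numeric
columns per interface; the function name and the sample numbers are
invented for illustration.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")


def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {iface: {"rx": ..., "tx": ...}}."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not an interface line
        values = [int(v) for v in rest.split()]
        if len(values) < 16:
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:16])),
        }
    return stats


# Hypothetical sample, not taken from the affected boxes.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 1234 10 0 0 0 0 0 0 1234 10 0 0 0 0 0 0
  eth0:5678 20 0 2 1 0 0 0 4321 15 0 3 0 0 0 0
"""
sample_stats = parse_net_dev(SAMPLE)
```

On a live box the input would be open("/proc/net/dev").read(). In the
sample above, eth0 reports no receive errors but does report drops, which
is the kind of case a glance at the error column alone would miss.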
I tried rewriting the rules, turning off traffic shaping, changing
NICs, then changing ALL the hardware (they have some very nice and fast
hardware now). I even migrated from Debian woody with 2.4.x kernels to
Debian sarge with 2.6.8 kernels, and the problem is still the same. I
don't really know what to do.
I suspect that this could be triggered by some Internet DoS attack, but
I haven't found anything special (I have already solved the recursion
problem with DNS servers). All 6 servers receive loads of dumb attacks
all the time.
Any help would be greatly appreciated!
--
Martín Ferrari
* Re: Lost packets - strange problem
From: Johnny Casey @ 2006-04-03 9:45 UTC
To: Martín Ferrari; +Cc: netfilter
Martín Ferrari wrote:
> (x-posted in linux-net mailing list)
>
> Hi!
>
> I'm having a very strange problem. I have already tested a *lot* of
> things before asking, and I still have no clue of what's happening.
>
> I have 6 linux boxes acting as firewalls/routers. They have been using
> similar configurations and netfilter rules for 4 years, when I
> installed the first of these. Some of them route more than 10 Mbps
> between interfaces, 50000+ connections tracked with netfilter, traffic
> shaping, NAT, and stuff, and they don't even blink.
>
> BUT, two of them started giving headaches, they don't have the highest
> usage, but they lose packets (in any interface) up to 80%, sometimes
> softirqd eats all the cpu, and you cannot even connect to the boxes.
> This does not happen from the very first day, and not all the time!
>
> The NICs are mostly 3c905*(a mix of them), also some e100 and 3c940
> (sk98lin). The troublesome computers have 3c905 and 3c940, but I do
> not find any pattern on hardware.
I think the 3c940s are the problem. I have a desktop box which works
for a while, and then the interface degrades for no apparent reason. No
errors appear in the log or in ifconfig. Bringing down the interface and
removing the module sometimes works, but not reliably. Sometimes I just
reboot. This started happening around kernel 2.6.14-2.6.15 or so.
Maybe we can track it down?
The hard part to test is that it takes a while before the problem starts.
> Also, the error count is 0 in the internet interface of the host which
> fails the most.
Same here.
...
> Any help would be greatly appreciated!
>
> --
> Martín Ferrari
Maybe we can try narrowing the kernel search. Unfortunately I'm also
using the Promise-SATA-PATA git from jgarzik...
HTH,
Johnny