From: "Shawn Wright" <swright@sls.bc.ca>
To: Daniel Chemko <dchemko@smgtec.com>, netfilter@lists.netfilter.org
Subject: RE: Using old CPU for 100s of clients
Date: Fri, 03 Dec 2004 13:57:10 -0800
Message-ID: <41B070B6.23668.B47D46A8@localhost>
In-Reply-To: <7C9884991ADAE0479C14F10C858BCDF591E3A1@alderaan.smgtec.com>
On 3 Dec 2004 at 12:22, Daniel Chemko wrote:
> The Speed problems may not be isolated to your CPU. You'll want to make
> sure your conntrack table isn't getting full, and that conntracks are
> safely getting expired from your system. Are you using a custom kernel,
> or a stock distro one?
Thanks for the reply. I didn't give many details because I've already beaten
this to death on the Shorewall list before coming here (I know, I should
have started here). It is a custom kernel, as none of the recent stock kernels
will boot on this machine - APIC must be disabled (it's an old DEC
Prioris). I have tried 2.4.22, two different Mandrake releases, along with a
plain 2.4.28 from kernel.org. It is possible that I've messed up somehow,
so I plan on taking a stock 2.4.22-37mdk kernel that currently runs well on
a P3/667 and recompiling it, making no changes except for CPU support and
APIC. This might help isolate the problem.
> Just for fun, could you forward me the following:
>
> # cat /proc/loadavg
Load average *never* goes above 0.3, currently all zeros...
I don't believe the system CPU% factors into the loadavg though?
> # free
             total       used       free     shared    buffers     cached
Mem:        223208     219472       3736          0          0     127028
-/+ buffers/cache:      92444     130764
Swap:       409616          0     409616
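
For anyone reading the numbers above: the "-/+ buffers/cache" row is derived from the Mem: row. A minimal Python sketch of that arithmetic, using the figures from this output (the function name is made up for illustration):

```python
def buffers_cache_row(used, free, buffers, cached):
    """Return (used, free) as shown on the '-/+ buffers/cache' line of older free(1)."""
    reclaimable = buffers + cached
    # Memory held by buffers and page cache can be given back under pressure,
    # so it is subtracted from "used" and added to "free".
    return used - reclaimable, free + reclaimable


# Figures in kB, taken from the Mem: row above.
app_used, app_free = buffers_cache_row(used=219472, free=3736, buffers=0, cached=127028)
print(app_used, app_free)  # 92444 130764
```

So only about 90 MB is really in use, with nothing in swap, which suggests memory pressure is probably not the cause here.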
> # iostat 20 2 (sysstat package is nice for accounting)
don't have this installed, although I plan to...
> # top (grab the CPU lines, over time is best)
top will show up to ~13% system CPU during a load test when I pass
1000+ kB/s across the 10Mb link. Otherwise, it is rarely over 5% system.
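
To put that load test in perspective, here is a rough sketch of the link utilisation it implies. It assumes 1 kB = 1000 bytes and 10Mb = 10,000,000 bits/s, and ignores framing overhead; the function name is illustrative only:

```python
def link_utilisation(kbytes_per_s, link_mbit_per_s, bytes_per_kb=1000):
    """Fraction of link capacity consumed by a given payload rate."""
    bits_per_s = kbytes_per_s * bytes_per_kb * 8
    return bits_per_s / (link_mbit_per_s * 1_000_000)


# 1000 kB/s across a 10 Mbit/s link.
print(f"{link_utilisation(1000, 10):.0%}")  # 80%
```

In other words, the link is close to saturated while system CPU stays near 13%.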
> # cat /proc/slabinfo
I've looked at this also - our peak conntrack count is around 4000, and max is
set to 16K. I've also tried it at 64K, and set the hashsize to 64K when loading
the ip_conntrack module, just for fun; it made no difference.
> # cat /proc/net/ip_conntrack | wc -l
Usually around 1500, but I have seen 4000 peak.
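
A small sketch of how the table occupancy could be checked against the limit. It assumes "16K" means 16384 and that each line of /proc/net/ip_conntrack is one tracked connection; the function names are hypothetical:

```python
TABLE = "/proc/net/ip_conntrack"
LIMIT = "/proc/sys/net/ipv4/netfilter/ip_conntrack_max"


def conntrack_usage(entries, max_entries):
    """Fraction of the conntrack table in use."""
    if max_entries <= 0:
        raise ValueError("max_entries must be positive")
    return entries / max_entries


def read_live_usage(table=TABLE, limit=LIMIT):
    """Count tracked connections (one per line) and compare with the configured max."""
    with open(table) as f:
        entries = sum(1 for _ in f)
    with open(limit) as f:
        max_entries = int(f.read().strip())
    return entries, max_entries, conntrack_usage(entries, max_entries)


# Peak figure quoted above against a 16K limit (assumed to mean 16384).
print(f"{conntrack_usage(4000, 16384):.1%}")  # 24.4%

# On the firewall itself (needs read access to the /proc files):
# print(read_live_usage())
```

Even at the observed peak the table is under a quarter full, so a full conntrack table looks unlikely.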
> # hdparm /dev/<your disk(s)>
This is from the "bad" machine. All machines use a 3940 PCI SCSI with
aic7xxx driver, and one or more Seagate Cheetah 10K 9GB drives.
/dev/sda:
 readonly = 0 (off)
 geometry = 1106/255/63, sectors = 17783240, start = 0
> # cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
Tried 16k and 64k...
> # netstat -i
This is from current live firewall (the good one). The bad one has been
rebooted since the last time I tried it live, so no data.
Kernel Interface table
Iface   MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0 58662850      1      0      0 74520718      3      0      0 BMRU
eth1   1500   0 74674696      0      0      0 57280898      0      0      0 BMRU
lo    16436   0    89156      0      0      0    89156      0      0      0 LRU
> # mii-tool
I've used this exhaustively to check the NICs are set up right. The outside
NIC goes to a Cat1900 forced 10FD, and they are notoriously bad at
playing nice with NICs. No errors though as you can see above on eth1.
The inside link is 100Mb FD to a Cat 3500, and again no errors. Current
NICs are one Intel E100B (eepro100 driver), and a D-Link DFE500TX (tulip
driver). I have tried all combinations of e100/eepro100/tulip with half a
dozen different NICs, with no change in symptoms.
I should mention that we can reproduce the problem within a few minutes
of hitting random web sites, waiting for one to "hang". We've eliminated
our DNS and proxy as sources of the problem - it occurs when bypassing
the proxy and NATing through the firewall. We have tried 3 different DNS
servers, and squid reports avg DNS times of < 100ms. We're talking delays
of up to 20sec before getting data from a website, and even timeouts. A
second visit to the same site, on different pages, is quick. To duplicate it
we need to hit random sites, but can do so within a few minutes, even when
network load is low.
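
One way to narrow down where a "hang" sits is to time the DNS lookup, the TCP connect, and the wait for the first byte separately from a client behind the firewall. The sketch below is only an illustration under assumptions: plain HTTP on port 80, IPv4, and made-up function names.

```python
import socket
import time


def time_fetch(host, port=80, path="/", timeout=30.0):
    """Return seconds spent in DNS lookup, TCP connect, and waiting for the first byte."""
    t0 = time.monotonic()
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, socket.AF_INET, socket.SOCK_STREAM)[0]
    t1 = time.monotonic()
    with socket.socket(family, socktype, proto) as s:
        s.settimeout(timeout)  # raises socket.timeout if a phase stalls too long
        s.connect(addr)
        t2 = time.monotonic()
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        s.sendall(request.encode("ascii"))
        s.recv(1)  # block until the server sends something back
        t3 = time.monotonic()
    return {"dns": t1 - t0, "connect": t2 - t1, "first_byte": t3 - t2}


def slowest_phase(timings):
    """Name of the phase that took the longest."""
    return max(timings, key=timings.get)


# Example use against a site that has not been visited recently
# (example.com is a placeholder):
# timings = time_fetch("example.com")
# print(timings, "slowest:", slowest_phase(timings))
```

If "connect" dominates on first visits, that would point at the connection setup path (SYN or SYN-ACK being delayed or dropped) rather than at DNS or the web server, though that is a guess until measured.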
> wow.. there are a lot of areas to look into.. Anyways, hope to find
> something.
So do I...
> Good ol' BC boy!
Nice to hear from someone nearby! :-)
Thanks!
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Shawn Wright, I.T. Manager
Shawnigan Lake School
http://www.sls.bc.ca
swright@sls.bc.ca