netdev.vger.kernel.org archive mirror
* Strange CPU load when flushing route cache (kernel 2.6.31.6)
@ 2009-11-23  9:58 Jesper Dangaard Brouer
  2009-11-23 10:29 ` Eric Dumazet
  2009-11-23 15:07 ` robert
  0 siblings, 2 replies; 9+ messages in thread
From: Jesper Dangaard Brouer @ 2009-11-23  9:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Network Hackers, Robert Olsson

[-- Attachment #1: Type: text/plain, Size: 1255 bytes --]

Hi Eric and netdev,

I have observed a strange route cache behaviour when I upgraded some
of my production Linux routers (1Gbit/s tg3) to kernel 2.6.31.6 (from
kernel 2.6.25.7).

Every time the route cache is flushed I get a CPU spike (in softirq)
with a tail.  I have attached some graphs that illustrate the issue
(hope vger.kernel.org will allow these attachments...)


I have done some tuning of the route cache:

 # From /etc/sysctl.conf
 #
 # Adjusting the route cache flush interval
 net/ipv4/route/secret_interval = 1200

 # Limiting the route cache size
 # ip_dst_cache slab objects is 256 bytes.
 # 2000000 * 256 bytes = 512 MB
 net/ipv4/route/max_size = 2000000

Boot parameters: "rhash_entries=262143 vmalloc=256M"

The rhash_entries parameter sets the route cache hash size.  The vmalloc
setting is needed because I have _very_ large iptables rulesets (and am
running a 32-bit kernel, due to old hardware).
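
As a rough check of the numbers above, the arithmetic can be sketched in a few lines of Python. The function names are hypothetical, slab overhead is ignored, and the bucket count is taken exactly as given on the boot line (the kernel may round it, which is an assumption not confirmed in this thread).

```python
# Figures copied from the sysctl.conf fragment and boot parameters above.
MAX_SIZE = 2_000_000      # net/ipv4/route/max_size
ENTRY_BYTES = 256         # ip_dst_cache slab object size quoted above
RHASH_ENTRIES = 262_143   # rhash_entries= boot parameter, used as given
SECRET_INTERVAL = 1200    # net/ipv4/route/secret_interval, in seconds


def cache_bytes(entries=MAX_SIZE, entry_bytes=ENTRY_BYTES):
    """Memory held by a completely full route cache (slab overhead ignored)."""
    return entries * entry_bytes


def avg_chain_length(entries=MAX_SIZE, buckets=RHASH_ENTRIES):
    """Average number of entries per hash bucket when the cache is full."""
    return entries / buckets


def flushes_per_day(interval_s=SECRET_INTERVAL):
    """How many periodic flushes happen in 24 hours."""
    return 86_400 / interval_s


print(f"full cache     : {cache_bytes() / 1e6:.0f} MB")
print(f"avg chain      : {avg_chain_length():.2f} entries per bucket")
print(f"flushes per day: {flushes_per_day():.0f}")
```

With these settings a full cache averages between seven and eight entries per bucket, and the cache is flushed 72 times a day.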

Any thoughts on how to avoid these CPU spikes?
Or where the issue occurs in the code?

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer

[-- Attachment #2: CPU_usage.png --]
[-- Type: image/png, Size: 16710 bytes --]

[-- Attachment #3: CPU_usage_softirq.png --]
[-- Type: image/png, Size: 15344 bytes --]

[-- Attachment #4: PPS_eth1-rx.png --]
[-- Type: image/png, Size: 10930 bytes --]

[-- Attachment #5: route_cache.png --]
[-- Type: image/png, Size: 17499 bytes --]

[-- Attachment #6: softnet_time_squeeze.png --]
[-- Type: image/png, Size: 15145 bytes --]


* Re: Strange CPU load when flushing route cache (kernel 2.6.31.6)
  2009-11-23  9:58 Strange CPU load when flushing route cache (kernel 2.6.31.6) Jesper Dangaard Brouer
@ 2009-11-23 10:29 ` Eric Dumazet
  2009-11-23 12:28   ` Jesper Dangaard Brouer
  2009-11-23 15:07 ` robert
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2009-11-23 10:29 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Linux Kernel Network Hackers, Robert Olsson

Jesper Dangaard Brouer wrote:
> Hi Eric and netdev,
> 
> I have observed a strange route cache behaviour when I upgraded some
> of my production Linux routers (1Gbit/s tg3) to kernel 2.6.31.6 (from
> kernel 2.6.25.7).
> 
> Every time the route cache is flushed I get a CPU spike (in softirq)
> with a tail.  I have attached some graphs that illustrate the issue
> (hope vger.kernel.org will allow these attachments...)
> 
> 
> I have done some tuning of the route cache:
> 
>  # From /etc/sysctl.conf
>  #
>  # Adjusting the route cache flush interval
>  net/ipv4/route/secret_interval = 1200
> 
>  # Limiting the route cache size
>  # ip_dst_cache slab objects is 256 bytes.
>  # 2000000 * 256 bytes = 512 MB
>  net/ipv4/route/max_size = 2000000
> 
> Boot parameters: "rhash_entries=262143 vmalloc=256M"
> 
> The rhash_entries parameter sets the route cache hash size.  The vmalloc
> setting is needed because I have _very_ large iptables rulesets (and am
> running a 32-bit kernel, due to old hardware).
> 
> Any thoughts on how to avoid these CPU spikes?
> Or where the issue occurs in the code?
> 

Sure, after a flush, we have to rebuild the cache, so extra work is expected.

(We receive a packet, notice the cached entry is obsolete, free it, allocate a new one
and insert it into the cache)

If you don't want these spikes, just don't flush the cache :)

Do you run a 2G/2G User/Kernel split kernel ?



* Re: Strange CPU load when flushing route cache (kernel 2.6.31.6)
  2009-11-23 10:29 ` Eric Dumazet
@ 2009-11-23 12:28   ` Jesper Dangaard Brouer
  2009-11-23 13:25     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Jesper Dangaard Brouer @ 2009-11-23 12:28 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Network Hackers, Robert Olsson

On Mon, 2009-11-23 at 11:29 +0100, Eric Dumazet wrote:
> Jesper Dangaard Brouer wrote:
> > Hi Eric and netdev,
> > 
> > I have observed a strange route cache behaviour when I upgraded some
> > of my production Linux routers (1Gbit/s tg3) to kernel 2.6.31.6 (from
> > kernel 2.6.25.7).
> > 
> > Every time the route cache is flushed I get a CPU spike (in softirq)
> > with a tail.  I have attached some graphs that illustrate the issue
> > (hope vger.kernel.org will allow these attachments...)
> > 
> > 
> > I have done some tuning of the route cache:
> > 
> >  # From /etc/sysctl.conf
> >  #
> >  # Adjusting the route cache flush interval
> >  net/ipv4/route/secret_interval = 1200
> > 
> >  # Limiting the route cache size
> >  # ip_dst_cache slab objects is 256 bytes.
> >  # 2000000 * 256 bytes = 512 MB
> >  net/ipv4/route/max_size = 2000000
> > 
> > Boot parameters: "rhash_entries=262143 vmalloc=256M"
> > 
> > The rhash_entries parameter sets the route cache hash size.  The vmalloc
> > setting is needed because I have _very_ large iptables rulesets (and am
> > running a 32-bit kernel, due to old hardware).
> > 
> > Any thoughts on how to avoid these CPU spikes?
> > Or where the issue occurs in the code?
> > 
> 
> Sure, after a flush, we have to rebuild the cache, so extra work is expected.

But the old 2.6.25.7 does NOT show this behavior... That is the real
issue...

> (We receive a packet, notice the cached entry is obsolete, free it, allocate a new one
> and insert it into the cache)
> 
> If you don't want these spikes, just don't flush the cache :)

I did the cache flushing due to some historical issues, that I think you
did a fix for... Guess I can drop the flushing and see if the garbage
collection can keep up...

> Do you run a 2G/2G User/Kernel split kernel ?

Not sure, how do I check?

I do use a 32-bit kernel (because the production machines run an old
32-bit Slackware OS install and some of the machines cannot run 64-bit).

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: Strange CPU load when flushing route cache (kernel 2.6.31.6)
  2009-11-23 12:28   ` Jesper Dangaard Brouer
@ 2009-11-23 13:25     ` Eric Dumazet
  2009-11-23 13:48       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2009-11-23 13:25 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Linux Kernel Network Hackers, Robert Olsson

Jesper Dangaard Brouer wrote:
> On Mon, 2009-11-23 at 11:29 +0100, Eric Dumazet wrote:

>> Sure, after a flush, we have to rebuild the cache, so extra work is expected.
> 
> But the old 2.6.25.7 does NOT show this behavior... That is the real
> issue...

Previous kernels were crashing, because flush was immediate and not deferred
as today.

During flush, we were dropping enormous amounts of packets.

Now, it's possible to have setups with equilibrium and no packet loss,
because we smoothly invalidate cache entries.
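
The difference between an immediate and a deferred flush can be illustrated with a toy model. This is only a sketch of the idea described above, not the kernel's code: the class name, the generation counter, and the `resolve` callback are all hypothetical. A flush merely marks every entry as stale, and each stale entry is freed and rebuilt when the next packet needing it arrives.

```python
class GenerationCache:
    """Toy cache whose flush() is O(1): it only bumps a generation counter."""

    def __init__(self, resolve):
        self._resolve = resolve   # slow path, e.g. a full routing table lookup
        self._entries = {}        # key -> (generation, value)
        self._generation = 0
        self.rebuilds = 0         # number of times the slow path ran

    def flush(self):
        # Deferred flush: nothing is freed here, entries just become stale.
        self._generation += 1

    def lookup(self, key):
        hit = self._entries.get(key)
        if hit is not None and hit[0] == self._generation:
            return hit[1]         # fresh entry, fast path
        # Missing or obsolete: free the old entry, allocate and insert a new one.
        self._entries.pop(key, None)
        value = self._resolve(key)
        self.rebuilds += 1
        self._entries[key] = (self._generation, value)
        return value


cache = GenerationCache(lambda dst: f"nexthop-for-{dst}")
cache.lookup("192.0.2.1")   # miss: slow path
cache.lookup("192.0.2.1")   # hit: fast path
cache.flush()               # cheap, no entries touched
cache.lookup("192.0.2.1")   # stale: rebuilt on demand
print(cache.rebuilds)
```

In this model the cost of a flush is spread over the packets that follow it, which is consistent with a CPU spike with a tail rather than a burst of drops at flush time.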


> I did the cache flushing due to some historical issues, that I think you
> did a fix for... Guess I can drop the flushing and see if the garbage
> collection can keep up...

Yes it can. Unless your route cache settings are not optimal.

> 
>> Do you run a 2G/2G User/Kernel split kernel ?
> 
> Not sure, how do I check?

grep LowTotal /proc/meminfo

or

dmesg | grep LOWMEM
913MB LOWMEM available.   (standard 3G/1G User/Kernel split)
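
For monitoring scripts, the same check can be done programmatically. A minimal sketch, assuming the usual "Name:   value kB" layout of /proc/meminfo; the function name is hypothetical, and the sample text below stands in for the real file so the example runs anywhere.

```python
import re
from typing import Optional


def low_total_kb(meminfo_text: str) -> Optional[int]:
    """Return the LowTotal value in kB, or None if the field is absent."""
    match = re.search(r"^LowTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


# Sample text in /proc/meminfo layout; the values are made up for illustration.
sample = "MemTotal:        2074000 kB\nLowTotal:         890000 kB\n"
kb = low_total_kb(sample)
if kb is None:
    print("no LowTotal field found")
else:
    print(f"LowTotal: {kb} kB (about {kb / 1024:.0f} MiB)")
```

On a live router the argument would be the contents of /proc/meminfo, for example `open("/proc/meminfo").read()`.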




* Re: Strange CPU load when flushing route cache (kernel 2.6.31.6)
  2009-11-23 13:25     ` Eric Dumazet
@ 2009-11-23 13:48       ` Jesper Dangaard Brouer
  2009-11-23 14:03         ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Jesper Dangaard Brouer @ 2009-11-23 13:48 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesper Dangaard Brouer, Linux Kernel Network Hackers,
	Robert Olsson

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1870 bytes --]

On Mon, 23 Nov 2009, Eric Dumazet wrote:

> Jesper Dangaard Brouer wrote:
>> On Mon, 2009-11-23 at 11:29 +0100, Eric Dumazet wrote:
>
>>> Sure, after a flush, we have to rebuild the cache, so extra work is expected.
>>
>> But the old 2.6.25.7 does NOT show this behavior... That is the real
>> issue...
>
> Previous kernels were crashing, because flush was immediate and not deferred
> as today.
>
> During flush, we were dropping enormous amounts of packets.

Ahh... Now I remember that was why I was flushing the cache so often.  If
I flushed the route cache before it got too big, then packet drops were
not a problem.


> Now, it's possible to have setups with equilibrium and no packet loss,
> because we smoothly invalidate cache entries.

Which is a good thing :-)


>> I did the cache flushing due to some historical issues, that I think you
>> did a fix for... Guess I can drop the flushing and see if the garbage
>> collection can keep up...
>
> Yes it can. Unless your route cache settings are not optimal.

I'll adjust my flushing interval, or disable it and monitor it.


>>> Do you run a 2G/2G User/Kernel split kernel ?
>>
>> Not sure, how do I check?
>
> grep LowTotal /proc/meminfo

Yes, guess I'm using User/Kernel split.

grep LowTotal /proc/meminfo
LowTotal:         747080 kB

What does that mean?  Is it bad? What should I run on a 32-bit 
system/kernel?


Can you recommend any other /proc/sys/ tuning options?

Does my kernel boot option rhash_entries=262143 make sense anymore?
Or do we adjust the hash bucket size dynamically these days?

Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------


* Re: Strange CPU load when flushing route cache (kernel 2.6.31.6)
  2009-11-23 13:48       ` Jesper Dangaard Brouer
@ 2009-11-23 14:03         ` Eric Dumazet
  2009-11-26 10:51           ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2009-11-23 14:03 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Jesper Dangaard Brouer, Linux Kernel Network Hackers,
	Robert Olsson


> grep LowTotal /proc/meminfo
> LowTotal:         747080 kB
> 
> What does that mean?  Is it bad? What should I run on a 32-bit
> system/kernel?

If you have more than 1GB of physical RAM and use your machine as a router, you might
compile a kernel with a 2GB/2GB User/Kernel split, to get twice the available RAM for
the kernel and more IP route entries (if needed)

make menuconfig
--> Processor type and features
  --> Memory split
    --> 2G/2G
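
To see why the split matters for this setup, the figures quoted in the thread can be combined in a short calculation. This is only a sketch: the function names are hypothetical, slab overhead is ignored, and the 50% budget in the second function is an arbitrary illustration rather than a recommendation.

```python
LOW_TOTAL_KB = 747_080   # LowTotal reported earlier in the thread
MAX_SIZE = 2_000_000     # net/ipv4/route/max_size from the first message
ENTRY_BYTES = 256        # ip_dst_cache object size from the first message


def lowmem_fraction(entries, entry_bytes, low_total_kb):
    """Share of low memory a completely full route cache would occupy."""
    return (entries * entry_bytes) / (low_total_kb * 1024)


def entries_that_fit(low_total_kb, entry_bytes, budget):
    """Entries that fit if the cache may use `budget` (0..1) of low memory."""
    return int(low_total_kb * 1024 * budget) // entry_bytes


share = lowmem_fraction(MAX_SIZE, ENTRY_BYTES, LOW_TOTAL_KB)
print(f"full cache uses {share:.0%} of LowTotal")
print(f"entries within a 50% budget: {entries_that_fit(LOW_TOTAL_KB, ENTRY_BYTES, 0.5)}")
```

With the reported LowTotal, a full cache of 2000000 entries would take roughly two thirds of low memory, which also has to hold everything else the kernel allocates there.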

> Can you recommend any other /proc/sys/ tuning options?

Really hard to say without exact context of use :)

> 
> Does my kernel boot option rhash_entries=262143 make sense anymore?
> Or do we adjust the hash bucket size dynamically these days?
> 

It's not dynamic yet.



* Strange CPU load when flushing route cache (kernel 2.6.31.6)
  2009-11-23  9:58 Strange CPU load when flushing route cache (kernel 2.6.31.6) Jesper Dangaard Brouer
  2009-11-23 10:29 ` Eric Dumazet
@ 2009-11-23 15:07 ` robert
  1 sibling, 0 replies; 9+ messages in thread
From: robert @ 2009-11-23 15:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Eric Dumazet, Linux Kernel Network Hackers, Robert Olsson


Jesper Dangaard Brouer writes:

 > I have observed a strange route cache behaviour when I upgraded some
 > of my production Linux routers (1Gbit/s tg3) to kernel 2.6.31.6 (from
 > kernel 2.6.25.7).
 > 
 > Every time the route cache is flushed I get a CPU spike (in softirq)
 > with a tail.  I have attached some graphs that illustrate the issue
 > (hope vger.kernel.org will allow these attachments...)


 Nice plots. Yes, I had the same problem for a long time. Packets were dropped on
 even moderately loaded machines and the network managers were complaining.

 Also there are some router benchmarks (RFC??) that estimate the forwarding
 performance as the level at which the first packet drop occurs. One can of course
 discuss such tests, but they're there...

 IMO it is best to have the GC "inlined" with the creation of new flows and avoid
 periodic tasks in this respect.

 Also I tried with something I called "active" garbage collection. The idea 
 was to get hints from TCP-FIN etc when to remove stale entries to take a 
 more pro-active approach. I think this was mentioned in the TRASH-paper. 

 If you only do routing you might try to disable the route cache.

 Cheers
					--ro


* Re: Strange CPU load when flushing route cache (kernel 2.6.31.6)
  2009-11-23 14:03         ` Eric Dumazet
@ 2009-11-26 10:51           ` Jesper Dangaard Brouer
  2009-11-26 11:05             ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Jesper Dangaard Brouer @ 2009-11-26 10:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesper Dangaard Brouer, Linux Kernel Network Hackers,
	Robert Olsson

On Mon, 2009-11-23 at 15:03 +0100, Eric Dumazet wrote:
> > grep LowTotal /proc/meminfo
> > LowTotal:         747080 kB
> > 
> > What does that mean?  Is it bad? What should I run on a 32-bit
> > system/kernel?
> 
> If you have more than 1GB of physical RAM and use your machine as a router, you might
> compile a kernel with a 2GB/2GB User/Kernel split, to get twice the available RAM for
> the kernel and more IP route entries (if needed)

Can I still use this option if the machine "only" has 2G of physical
RAM?  Most of my production machines have 2G RAM.

Just to verify: This is not an issue on 64-bit kernels right?

> make menuconfig
> --> Processor type and features
>   --> Memory split
>     --> 2G/2G

For completeness' sake, the Memory split option depends on "if EMBEDDED"

So I also needed to enable:

 make menuconfig
  --> General setup
     --> Configure standard kernel features (for small systems) 


> > Can you recommend any other /proc/sys/ tuning options?
> 
> Really hard to say without exact context of use :)

The machines do Internet traffic routing, but with VERY large
iptables rulesets, e.g. on one production machine I have 18409 chains
and 62916 iptables rules. (I did most of the scalability patches to
iptables userspace to make this work...). And the machines also use a
very large HTB tree.  Basically I do, per customer, Access Control,
Bandwidth limiting and Personal firewall.

Also note that I'm reserving extra vmalloc memory as iptables uses
vmalloc'ed memory...

> > Does my kernel boot option rhash_entries=262143 make sense anymore?
> > Or do we adjust the hash bucket size dynamically these days?
> > 
> 
> It's not dynamic yet.

Okay, I'll keep the boot parameter then... or is it possible to resize the
hash at runtime?


-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: Strange CPU load when flushing route cache (kernel 2.6.31.6)
  2009-11-26 10:51           ` Jesper Dangaard Brouer
@ 2009-11-26 11:05             ` Eric Dumazet
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2009-11-26 11:05 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Jesper Dangaard Brouer, Linux Kernel Network Hackers,
	Robert Olsson

Jesper Dangaard Brouer wrote:
> On Mon, 2009-11-23 at 15:03 +0100, Eric Dumazet wrote:
>>> grep LowTotal /proc/meminfo
>>> LowTotal:         747080 kB
>>>
>>> What does that mean?  Is it bad? What should I run on a 32-bit
>>> system/kernel?
>> If you have more than 1GB of physical RAM and use your machine as a router, you might
>> compile a kernel with a 2GB/2GB User/Kernel split, to get twice the available RAM for
>> the kernel and more IP route entries (if needed)
> 
> Can I still use this option if the machine "only" has 2G of physical
> RAM?  Most of my production machines have 2G RAM.
> 
> Just to verify: This is not an issue on 64-bit kernels right?


If your kernel is 32-bit and you have 2GB of RAM, then selecting a 2G/2G split
allows your kernel to use more RAM (the part it can address directly is called LOWMEM).

Yes, this is not an issue on 64-bit kernels: all RAM is LOWMEM :)


> 
>> make menuconfig
>> --> Processor type and features
>>   --> Memory split
>>     --> 2G/2G
> 
> For completeness' sake, the Memory split option depends on "if EMBEDDED"
> 
> So I also needed to enable:
> 
>  make menuconfig
>   --> General setup
>      --> Configure standard kernel features (for small systems) 
> 
> 
>>> Can you recommend any other /proc/sys/ tuning options?
>> Really hard to say without exact context of use :)
> 
> The machines do Internet traffic routing, but with VERY large
> iptables rulesets, e.g. on one production machine I have 18409 chains
> and 62916 iptables rules. (I did most of the scalability patches to
> iptables userspace to make this work...). And the machines also use a
> very large HTB tree.  Basically I do, per customer, Access Control,
> Bandwidth limiting and Personal firewall.
> 
> Also note that I'm reserving extra vmalloc memory as iptables uses
> vmalloc'ed memory...


> 
> Okay, I'll keep the boot parameter then... or is it possible to resize the
> hash at runtime?

Nope, we use RCU lookups, so a resize would be complex.

