2.6.0-test11: dst_cache_overflow causing unresponsive box

netdev.vger.kernel.org archive mirror

* 2.6.0-test11: dst_cache_overflow causing unresponsive box
@ 2003-12-02  1:32 Francois Baligant
  2003-12-02 10:44 ` Robert Olsson
  0 siblings, 1 reply; 6+ messages in thread
From: Francois Baligant @ 2003-12-02  1:32 UTC
  To: netdev

We have a problem with a box running 2.6.0-test11-mjb1 and supporting around 90k simultaneous TCP connections. After a few hours/days of running,
when lots of clients connect/disconnect, the console will start to display:

dst cache overflow
NET: 1860 messages suppressed.
dst cache overflow
NET: 1858 messages suppressed.


From there, the box is completely unresponsive, apparently spending all its CPU trying to shrink the routing cache. The only solution is a reboot.

Current sysctl:
net.ipv4.route.max_size = 655360 # I know we shouldn't raise it that high, but it's the only cure for now; it lasts a bit longer like this
net.ipv4.route.gc_min_interval = 2
net.ipv4.route.gc_interval = 10
net.ipv4.route.gc_timeout = 30

rtstat:
 size   IN: hit     tot    mc no_rt bcast madst masrc  OUT: hit     tot     mc GC: tot ignored goal_miss ovrf HASH: in_search out_search
139566     12393     123     0     0     0     0     0       184      21      0     143     142         0    0           26039        375
138876     13080     136     0     0     0     0     0       159      19      0     155     154         0    0           27153        277
139006     12317     125     0     0     0     0     0       180      28      0     153     153         0    0           25810        377
139138     13799     140     0     0     0     0     0       159      16      0     156     156         0    0           28375        331
139275     11610     128     0     0     0     0     0       177      27      0     154     153         0    0           23977        343
139383     12679     124     0     0     0     0     0       173      17      0     141     140         0    0           26717        398
139256     11946     135     0     0     0     0     0       166      17      0     152     151         0    0           24874        304
139353     11646     109     0     0     0     0     0       174      14      0     122     122         0    0           24165        320
138257     12702     116     0     0     0     0     0       180      16      0     131     130         0    0           26324        358
138369     12897     115     0     0     0     0     0       166      20      0     134     134         0    0           26819        339
138553     11309     133     0     0     0     0     0       158      33      0     165     165         0    0           21270        389
138172     17232     182     0     0     0     0     0       125      44      0     225     225         0    0           29702        375
138420     17407     182     0     0     0     0     0       165      73      0     254     253         0    0           29946        548
138833     17052     257     0     0     0     0     0       195     126      0     382     381         0    0           29715        812
139051     16606     224     0     0     0     0     0       238      97      0     320     319         0    0           28559        721
139217     18115     176     0     0     0     0     0       268      51      0     224     224         0    0           32983        527
139326     17531     178     0     0     0     0     0       291      44      0     220     220         0    0           33320        445
139422     15244     140     0     0     0     0     0       357      20      0     160     160         0    0           29934        415
139548     13123     142     0     0     0     0     0       281      12      0     154     154         0    0           26430        351
139684     13290     142     0     0     0     0     0       235      10      0     152     151         0    0           27341        309
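
The rtstat rows above are easier to compare once the columns are named. The following is a minimal sketch, not part of the original report, that parses one row and derives two ratios. It assumes that "tot" under IN counts lookups that missed the cache and took the slow path, and that "ignored" under GC counts GC calls that returned without scanning; both readings are assumptions, and the key prefixes (in_, out_, gc_, hash_) are introduced here only to keep the names unique.

```python
# Column names follow the rtstat header quoted above. The prefixes are
# illustrative additions so that "hit", "tot" and "mc" stay distinct.
RTSTAT_COLUMNS = [
    "size",
    "in_hit", "in_tot", "in_mc", "in_no_rt", "in_bcast", "in_madst", "in_masrc",
    "out_hit", "out_tot", "out_mc",
    "gc_tot", "gc_ignored", "gc_goal_miss", "gc_ovrf",
    "hash_in_search", "hash_out_search",
]


def parse_rtstat_row(line):
    """Turn one whitespace-separated rtstat data row into a dict of ints."""
    values = [int(v) for v in line.split()]
    if len(values) != len(RTSTAT_COLUMNS):
        raise ValueError(
            f"expected {len(RTSTAT_COLUMNS)} fields, got {len(values)}"
        )
    return dict(zip(RTSTAT_COLUMNS, values))


def ratios(row):
    """Derive the input hit ratio and the share of GC calls that were ignored.

    Assumes in_tot counts cache misses, so lookups = in_hit + in_tot.
    """
    in_lookups = row["in_hit"] + row["in_tot"]
    return {
        "in_hit_ratio": row["in_hit"] / in_lookups if in_lookups else 0.0,
        "gc_ignored_ratio": (
            row["gc_ignored"] / row["gc_tot"] if row["gc_tot"] else 0.0
        ),
    }


# First data row from the report above.
FIRST_ROW = "139566 12393 123 0 0 0 0 0 184 21 0 143 142 0 0 26039 375"
row = parse_rtstat_row(FIRST_ROW)
stats = ratios(row)
print(row["size"], stats)
```

Under those assumptions, nearly all input lookups hit the cache and nearly all GC invocations are ignored, while the cache size stays close to 139k entries.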

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
142340 142296  99%    0.38K  14234       10     56936K ip_dst_cache
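
As a consistency check on the slab line above, the arithmetic can be sketched as follows. This is only an illustration; it assumes the "K" suffix means KiB, and the function and key names are hypothetical.

```python
def slab_summary(objs, active, slabs, objs_per_slab, cache_kib):
    """Cross-check the figures reported for one slab cache."""
    return {
        # Total object slots across all slabs.
        "capacity": slabs * objs_per_slab,
        # Memory per slab, in KiB (assuming the K suffix means KiB).
        "kib_per_slab": cache_kib / slabs,
        # Effective memory per object slot, including slab overhead.
        "kib_per_object": cache_kib / objs,
        # Share of object slots currently in use.
        "active_fraction": active / objs,
    }


# Figures copied from the ip_dst_cache line above.
summary = slab_summary(
    objs=142340, active=142296, slabs=14234, objs_per_slab=10, cache_kib=56936
)
print(summary)
```

The numbers are self-consistent: 14234 slabs of 10 objects give 142340 slots, each slab takes 4K, and almost every slot is active, so the cache holds roughly 56 MB of routing cache entries.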

Are we tuning the rt_cache the wrong way?

regards,
Francois

Francois Baligant - http://www.pingouin.be
Change the numbers, change your Life!

* 2.6.0-test11: dst_cache_overflow causing unresponsive box
  2003-12-02  1:32 2.6.0-test11: dst_cache_overflow causing unresponsive box Francois Baligant
@ 2003-12-02 10:44 ` Robert Olsson
  2003-12-02 11:26   ` David S. Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Olsson @ 2003-12-02 10:44 UTC
  To: Francois Baligant; +Cc: netdev


Francois Baligant writes:
 > We have a problem with a box running 2.6.0-test11-mjb1 and supporting around 90k simultaneous TCP connections. After a few hours/days of running,
 > when lots of clients connect/disconnect, the console will start to display:
 > 
 > dst cache overflow
 > NET: 1860 messages suppressed.
 > 
 > From there, the box is completely unresponsive, apparently spending all its CPU trying to shrink the routing cache. The only solution is a reboot.

 > Current sysctl:
 > net.ipv4.route.max_size = 655360 # I know we shouldn't raise it that high, but it's the only cure for now; it lasts a bit longer like this

 >  size   IN: hit     tot    mc no_rt bcast madst masrc  OUT: hit     tot     mc GC: tot ignored goal_miss ovrf HASH: in_search out_search
 > 139566     12393     123     0     0     0     0     0       184      21      0     143     142         0    0           26039        375


 >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
 > 142340 142296  99%    0.38K  14234       10     56936K ip_dst_cache
 > 
 > Are we tuning the rt_cache the wrong way?

No experience with 90k TCP flows, but it seems GC is not able to free some
of the dst entries for some reason. This will slowly kill your box with
the symptoms you describe. We have to ask the TCP experts for timer
settings to avoid pending sessions etc. Also check slab for any other
objects growing, as dst cache overflow is most likely a secondary effect
in your case. rtstat looks sane except for the high number of dst entries.
Tuning is another story.

Cheers.
						--ro

* Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box
  2003-12-02 10:44 ` Robert Olsson
@ 2003-12-02 11:26   ` David S. Miller
  2003-12-02 12:47     ` jamal
                       ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: David S. Miller @ 2003-12-02 11:26 UTC
  To: Robert Olsson; +Cc: francois, netdev

On Tue, 2 Dec 2003 11:44:31 +0100
Robert Olsson <Robert.Olsson@data.slu.se> wrote:

> No experience with 90k TCP flows, but it seems GC is not able to free some
> of the dst entries for some reason. This will slowly kill your box with
> the symptoms you describe. We have to ask the TCP experts for timer
> settings to avoid pending sessions etc. Also check slab for any other
> objects growing, as dst cache overflow is most likely a secondary effect
> in your case. rtstat looks sane except for the high number of dst entries.
> Tuning is another story.

Let us assume, for the sake of back of the envelope calculations, that
all 90k TCP connections speak to unique destinations.  Let us further
assume that all of them have at least one packet in flight.

This means the routing cache must be able to hold at least 90k entries.
All of these routing cache entries will be referenced by the packets
in the TCP retransmission queues of all the sockets, and thus the
entries are unreclaimable.

You are setting net.ipv4.route.max_size to 655360 which should be more
than enough.  But you also have to make the net.ipv4.route.gc_thresh
more reasonable as well, perhaps 90K as a test.

If net.ipv4.route.gc_thresh is lower than 90K and my assertions above
hold, then the kernel will try to garbage collect too early, all the
routing cache entries will be in use and therefore uncollectable,
and you'll get the message you're seeing.

Try to pump up gc_thresh and see if that helps.
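
The back-of-the-envelope argument above can be written down as a toy model. This is a sketch of the reasoning in this message only, not of the kernel's actual GC logic; the function name, the verdict strings, and the low gc_thresh value of 32768 are hypothetical, since the thread does not state the current gc_thresh.

```python
def assess_route_cache(gc_thresh, max_size, pinned_entries):
    """Classify a tuning under the toy model from the message above.

    pinned_entries is the number of cache entries assumed to be held
    by in-flight packets and therefore unreclaimable.
    """
    if pinned_entries > max_size:
        return "cache cannot hold all in-use entries"
    if pinned_entries > gc_thresh:
        return "GC starts while the entries it sees are still in use"
    return "GC starts only above the in-use entries"


PINNED = 90_000      # one entry per connection, as assumed in the message
MAX_SIZE = 655_360   # net.ipv4.route.max_size from the original report

# 32768 is a hypothetical low threshold; 90000 is the suggested test value.
for gc_thresh in (32_768, 90_000):
    print(gc_thresh, assess_route_cache(gc_thresh, MAX_SIZE, PINNED))
```

In this model only the relation between gc_thresh and the number of pinned entries matters; raising max_size alone does not change the verdict.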

* Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box
  2003-12-02 11:26   ` David S. Miller
@ 2003-12-02 12:47     ` jamal
  2003-12-02 17:56     ` Robert Olsson
  2003-12-03  0:15     ` Francois Baligant
  2 siblings, 0 replies; 6+ messages in thread
From: jamal @ 2003-12-02 12:47 UTC
  To: David S. Miller; +Cc: Robert Olsson, francois, netdev


With that many flows the neighbor cache gc may also be killing him.

cheers,
jamal

* Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box
  2003-12-02 11:26   ` David S. Miller
  2003-12-02 12:47     ` jamal
@ 2003-12-02 17:56     ` Robert Olsson
  2003-12-03  0:15     ` Francois Baligant
  2 siblings, 0 replies; 6+ messages in thread
From: Robert Olsson @ 2003-12-02 17:56 UTC
  To: David S. Miller; +Cc: Robert Olsson, francois, netdev



David S. Miller writes:

 > Let us assume, for the sake of back of the envelope calculations, that
 > all 90k TCP connections speak to unique destinations.  Let us further
 > assume that all of them have at least one packet in flight.
 > 
 > This means the routing cache must be able to hold at least 90k entries.
 > All of these routing cache entries will be referenced by the packets
 > in the TCP retransmission queues of all the sockets, and thus the
 > entries are unreclaimable.
 > 
 > You are setting net.ipv4.route.max_size to 655360 which should be more
 > than enough.  But you also have to make the net.ipv4.route.gc_thresh
 > more reasonable as well, perhaps 90K as a test.
 > 
 > If net.ipv4.route.gc_thresh is lower than 90K and my assertions above
 > hold, then the kernel will try to garbage collect too early, all the
 > routing cache entries will be in use and therefore uncollectable,
 > and you'll get the message you're seeing.
 > 
 > Try to pump up gc_thresh and see if that helps.

 Yes, that is better tuning, as gc_thresh and max_size are in better balance.
 But max_size is the same, so I guess we will still accumulate unreclaimable
 entries until we see dst overflow. The long time before the overflow
 ("hours to days") is suspicious. We have to ask whether this has ever
 worked before.
 
 I guess the number of hash buckets should be increased for systems like this.
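
To see why the bucket count matters, here is a rough sketch of the arithmetic. The bucket counts below are hypothetical, because the thread does not state the actual hash table size, and reading in_search as the number of chain entries examined during input lookups is an assumption.

```python
def avg_chain_length(entries, buckets):
    """Average entries per hash bucket, assuming an even spread."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    return entries / buckets


def searches_per_lookup(in_search, in_hit, in_tot):
    """Chain entries examined per input lookup (hits plus slow-path lookups)."""
    lookups = in_hit + in_tot
    return in_search / lookups if lookups else 0.0


ENTRIES = 139_566  # cache size from the first rtstat row in the report

# Hypothetical bucket counts, only to show how the chain length scales.
for buckets in (16_384, 65_536, 262_144):
    print(buckets, round(avg_chain_length(ENTRIES, buckets), 2))

# Figures from the first rtstat row: in_search, IN hit, IN tot.
print(round(searches_per_lookup(26_039, 12_393, 123), 2))
```

With a fixed number of entries, each fourfold increase in buckets cuts the average chain by the same factor, which is the effect being suggested here.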

 
 Cheers.
						--ro

* Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box
  2003-12-02 11:26   ` David S. Miller
  2003-12-02 12:47     ` jamal
  2003-12-02 17:56     ` Robert Olsson
@ 2003-12-03  0:15     ` Francois Baligant
  2 siblings, 0 replies; 6+ messages in thread
From: Francois Baligant @ 2003-12-03  0:15 UTC
  Cc: netdev

Thanks all for your suggestions.

Actually, I have noticed that with 90k established TCP sessions, I have
around double that number of entries in the routing cache. For each TCP
session there is an inbound and an outbound cache entry like this:

39.125.111.131  81.64.64.96     39.125.111.129         1500 0          0 eth0
81.64.64.96   39.125.111.131  39.125.111.131  l         0 0          0 lo

This system has accepted that many sessions before when running 2.4, and
this problem surfaced with 2.6. Now, I can't be sure that the traffic
patterns are exactly the same, so I'm not drawing conclusions about 2.6.

I will try to raise gc_thresh and keep you informed.

Thanks,
Francois

----- Original Message ----- 
From: "David S. Miller" <davem@redhat.com>
To: "Robert Olsson" <Robert.Olsson@data.slu.se>
Cc: <francois@baligant.net>; <netdev@oss.sgi.com>
Sent: Tuesday, December 02, 2003 12:26 PM
Subject: Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box


> On Tue, 2 Dec 2003 11:44:31 +0100
> Robert Olsson <Robert.Olsson@data.slu.se> wrote:
>
> > No experience with 90k TCP flows, but it seems GC is not able to free
> > some of the dst entries for some reason. This will slowly kill your box
> > with the symptoms you describe. We have to ask the TCP experts for timer
> > settings to avoid pending sessions etc. Also check slab for any other
> > objects growing, as dst cache overflow is most likely a secondary effect
> > in your case. rtstat looks sane except for the high number of dst
> > entries. Tuning is another story.
>
> Let us assume, for the sake of back of the envelope calculations, that
> all 90k TCP connections speak to unique destinations.  Let us further
> assume that all of them have at least one packet in flight.
>
> This means the routing cache must be able to hold at least 90k entries.
> All of these routing cache entries will be referenced by the packets
> in the TCP retransmission queues of all the sockets, and thus the
> entries are unreclaimable.
>
> You are setting net.ipv4.route.max_size to 655360 which should be more
> than enough.  But you also have to make the net.ipv4.route.gc_thresh
> more reasonable as well, perhaps 90K as a test.
>
> If net.ipv4.route.gc_thresh is lower than 90K and my assertions above
> hold, then the kernel will try to garbage collect too early, all the
> routing cache entries will be in use and therefore uncollectable,
> and you'll get the message you're seeing.
>
> Try to pump up gc_thresh and see if that helps.
>
