netdev.vger.kernel.org archive mirror
* Performance problem with route cache ?
From: dada1 @ 2003-11-20 10:55 UTC
  To: netdev

Hi all

I'm profiling a dual-Athlon machine with oprofile, running a linux-2.6.0-test8
kernel (with a copybreak patch for the Intel e1000 driver, because the machine
receives a lot of small (< 30 bytes) messages).

It seems the route cache may/should be tuned, but I don't know how.

The machine receives about 10000 packets/second, from a lot of different IP
addresses.

Cpu type: Athlon
Cpu speed was (MHz estimation) : 1991.43
Counter 0 counted CPU_CLK_UNHALTED events (Cycles outside of halt state)
with a unit mask of 0x00 (No unit mask) count 50000
vma      samples  %           symbol name
a0251fd0 6477022  15.6789     rt_garbage_collect
a028b6a0 2692339  6.51731     ipt_do_table
a0254560 1779167  4.30681     ip_route_input
a0251c40 1544801  3.73948     rt_may_expire
a0273370 1119668  2.71037     tcp_v4_rcv
a01f4350 1084883  2.62616     e1000_clean_rx_irq
a013dd70 1048705  2.53859     free_block
a013e120 844766   2.04492     kfree
a0269cc0 600011   1.45244     tcp_rcv_established
a0111c20 540780   1.30906     mark_offset_tsc
a0235d40 536856   1.29956     skb_release_data
a023a9a0 521576   1.26257     net_rx_action
a0259dc0 515115   1.24693     ip_queue_xmit
a023f990 498217   1.20603     dst_alloc
a0260190 487681   1.18052     tcp_sendmsg
a0231fe0 477588   1.15609     sockfd_lookup
a0118c30 459348   1.11194     schedule
a0261bd0 454353   1.09985     tcp_recvmsg
a01070c0 442280   1.07062     default_idle

# grep ip_dst_cache /proc/slabinfo
ip_dst_cache      253051 324840    320   12    1 : tunables   54   27    8 : slabdata  27070  27070      0

# cat /proc/net/sockstat
sockets: used 246459
TCP: inuse 245501 orphan 591 tw 5639 alloc 247035 mem 74061
UDP: inuse 6
RAW: inuse 0
FRAG: inuse 3 memory 1164

#grep "routing cache hash table " /var/log/syslog
kernel: IP: routing cache hash table of 32768 buckets, 256Kbytes

# cat /proc/meminfo
MemTotal:      3371408 kB
MemFree:        847716 kB
Buffers:         70352 kB
Cached:          91080 kB
SwapCached:          0 kB
Active:        1162512 kB
Inactive:        27360 kB
HighTotal:     1900544 kB
HighFree:       161920 kB
LowTotal:      1470864 kB
LowFree:        685796 kB
SwapTotal:      506008 kB
SwapFree:       506008 kB
Dirty:            2092 kB
Writeback:           0 kB
Mapped:        1038884 kB
Slab:           704636 kB
Committed_AS:  1549704 kB
PageTables:       1412 kB
VmallocTotal:    49144 kB
VmallocUsed:      2380 kB
VmallocChunk:    45896 kB
HugePages_Total:   150
HugePages_Free:     48
Hugepagesize:     4096 kB


Does anybody have a hint as to what I should change to lower the CPU time
spent in kernel land a bit?

Thanks

Eric Dumazet


* Performance problem with route cache ?
From: Robert Olsson @ 2003-11-20 12:37 UTC
  To: dada1; +Cc: netdev



dada1 writes:
 > Hi all
 > 
 > I'm doing some oprofile on a bi athlon, linux-2.6.0-test8 kernel, (with a
 > copybreak intel e1000 patch, because the machine receives a lot of small (<
 > 30 bytes) messages)

 Does the copybreak help? I tested copybreak a long time ago and saw no gain
 from it. Do you have a pointer to the patch?

 > It seems the route cache may/should be tuned, but I dont know how.
 
 Tuning is not an easy task, and you will not necessarily gain anything.
 It depends entirely on your input traffic, but it seems your system has a
 lot of dst entries, which would cause a lot of linear searching. The number
 of packets per flow and the number of parallel flows should be of interest
 to you. rtstat can give you a better feeling for the incoming routing load.
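
 As a rough illustration of why many dst entries imply long linear searches,
 the figures reported in the first message (253051 active ip_dst_cache
 objects, 32768 hash buckets) can be turned into an average chain length.
 This is a back-of-the-envelope sketch that assumes entries are spread evenly
 over the buckets; the variable names are illustrative only.

```python
# Figures taken from the /proc/slabinfo and syslog output quoted in the thread.
active_dst_entries = 253051   # active objects in ip_dst_cache
hash_buckets = 32768          # "routing cache hash table of 32768 buckets"

# Assumption: entries are distributed uniformly across the hash buckets.
avg_chain_length = active_dst_entries / hash_buckets

print(f"average entries per bucket: {avg_chain_length:.2f}")
```

 Under that assumption each bucket holds between seven and eight entries on
 average, so a lookup may have to walk several entries before it finds a
 match or gives up.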
 
 Is 10kpkts/s the max performance?

 Cheers.
						--ro


* Re: Performance problem with route cache ?
From: dada1 @ 2003-11-21  7:53 UTC
  To: Robert Olsson; +Cc: netdev, scott.feldman


From: "Robert Olsson" <Robert.Olsson@data.slu.se>
>  Does the copybreak help? I tested copybreak long time ago and saw no gain
>  from it. Do you have pointer to the patch?

This is a patch kindly given to me by Scott Feldman, that just copies the skb
to a new, shorter one.
In my case this is indeed useful, because the server receives a lot of
small TCP messages from a lot of clients.
Instead of using 2K or 4K buffers to store the data in the socket, we end up
using 64-byte buffers... quite a huge difference indeed.
In case of a network spike, the machine no longer consumes 200 MB of
lowmem, and it stays alive.
In a router context, copybreak should not be used, because the extra copy is
CPU intensive, and the data should not live long enough in memory.
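
The memory effect described above can be sketched with a small model. This is
not the actual e1000 patch, only an illustration of the idea: frames at or
below a threshold are copied into a right-sized buffer, larger ones keep the
full receive buffer. The threshold of 256 bytes, the 60-byte frame, the backlog
of 100000 frames and the power-of-two rounding are all assumptions made for
the example, and per-skb overhead is ignored.

```python
def size_class(nbytes, smallest=32):
    """Round up to the next power-of-two size class (like the size-N slab caches)."""
    size = smallest
    while size < nbytes:
        size *= 2
    return size


def rx_buffer_bytes(frame_len, copybreak, full_buffer=2048):
    """Bytes held per queued frame.

    A frame no longer than `copybreak` is copied into a right-sized buffer;
    anything larger keeps the original full-size receive buffer.
    """
    if frame_len <= copybreak:
        return size_class(frame_len)
    return full_buffer


queued_frames = 100_000   # hypothetical backlog during a traffic spike
frame_len = 60            # hypothetical short frame, in bytes

with_copybreak = queued_frames * rx_buffer_bytes(frame_len, copybreak=256)
without_copybreak = queued_frames * rx_buffer_bytes(frame_len, copybreak=0)

print(f"with copybreak:    {with_copybreak / 1e6:.1f} MB")
print(f"without copybreak: {without_copybreak / 1e6:.1f} MB")
```

With these assumed numbers the data held in socket queues shrinks by a factor
of 32 (2048 / 64), which is the kind of difference described above.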

>
>  > It seems the route cache may/should be tuned, but I dont know how.
>
>  Tuning is not an easy task and not necessarily you will gain anything,
>  It fully depends on your input traffic but it seems your system has a
>  lot of dst entries. Which would cause a lot linear search. Packets per
>  flow and no of parallel flows should be of interest for you. rtstat can
>  give you a better feeling of incoming routing load.
>
>  Is 10kpkts/s the max performance?

Well, the machine is not network bound... We don't want to increase the
traffic.
The machine just receives a lot of frames from a lot of different IP
addresses. That is not a DoS attack.

Some rtstat samples :

rtstat -i 1
  size   IN: hit     tot    mc no_rt bcast madst masrc  OUT: hit     tot     mc GC: tot ignored goal_miss ovrf HASH: in_search out_search
279536      1490   11719     0     0     0     0     0       231     843      0   12562   12560         2    0           97100       6853
282928      1732    9705     0     0     0     0     0       275     800      0   10505   10503         2    0           80811       6774
278436      1211   10609     0     0     0     0     0       166     551      0   11160   11158         2    0           87831       4550
280942      1195    8996     0     0     0     0     0       299    1128      0   10124   10122         2    0           75815       9523
281774      1994   12201     0     0     0     0     0       295    1070      0   13271   13269         2    0          102299       8864
274220      1514   11692     0     0     0     0     0       252     929      0   12621   12619         2    0           96551       7743
274248      1271   10805     0     0     0     0     0       273    1094      0   11899   11897         2    0           88822       9013
278905      1157    9942     0     0     0     0     0        85     308      0   10250   10248         2    0           82651       2569
283423      1509    8471     0     0     0     0     0       176     631      0    9102    9100         2    0           71805       5286
288620      1530    7685     0     0     0     0     0        74     299      0    7984    7982         2    0           66499       2505
293264      1666    7432     0     0     0     0     0       174     552      0    7984    7982         2    0           65642       4785
296831      1840    7570     0     0     0     0     0       129     371      0    7941    7939         2    0           67687       3257
297902      2214    9863     0     0     0     0     0       346    1192      0   11055   11053         2    0           88987      10566
300225      2192    8136     0     0     0     0     0       308     877      0    9013    9011         2    0           74096       8029
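
For readers who want to work with these samples, here is a minimal sketch that
splits one rtstat row into named fields and derives two figures from it. The
field names are labels chosen for this example (rtstat itself only prints the
column headers shown above). It also assumes that "tot" counts lookups that
missed the cache and that "in_search" counts hash-chain entries examined.

```python
# Labels for the 17 rtstat columns, in output order (names are illustrative).
FIELDS = [
    "size",
    "in_hit", "in_tot", "in_mc", "in_no_rt", "in_bcast", "in_madst", "in_masrc",
    "out_hit", "out_tot", "out_mc",
    "gc_tot", "gc_ignored", "gc_goal_miss", "gc_ovrf",
    "in_search", "out_search",
]


def parse_rtstat_row(line):
    """Turn one whitespace-separated rtstat sample into a dict of integers."""
    values = [int(v) for v in line.split()]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


# First sample from the thread, with the wrapped lines joined.
first_sample = "279536 1490 11719 0 0 0 0 0 231 843 0 12562 12560 2 0 97100 6853"
row = parse_rtstat_row(first_sample)

# Assumption: hits and misses are disjoint, so their sum is the lookup count.
in_lookups = row["in_hit"] + row["in_tot"]
search_per_lookup = row["in_search"] / in_lookups

# In this sample the GC call count equals the number of new entries created.
gc_calls_per_new_entry = row["gc_tot"] / (row["in_tot"] + row["out_tot"])

print(f"chain entries examined per input lookup: {search_per_lookup:.2f}")
print(f"GC calls per new cache entry: {gc_calls_per_new_entry:.2f}")
```

In every sample above, the GC "tot" column equals IN "tot" plus OUT "tot",
which is consistent with rt_garbage_collect showing up at the top of the
oprofile output.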

size-32              931   1008     32  112    1 : tunables  120   60    8 : slabdata      9      9      0
size-64            18477  51566     64   59    1 : tunables  120   60    8 : slabdata    874    874    480
size-128            2013   2040    128   30    1 : tunables  120   60    8 : slabdata     68     68      0
size-256              90     90    256   15    1 : tunables  120   60    8 : slabdata      6      6      0
size-512           17569  37680    512    8    1 : tunables   54   27    8 : slabdata   4710   4710     27
size-1024            348    348   1024    4    1 : tunables   54   27    8 : slabdata     87     87     54
size-2048          13414  17306   2048    2    1 : tunables   24   12    8 : slabdata   8653   8653     96
size-4096           3970   3970   4096    1    1 : tunables   24   12    8 : slabdata   3970   3970      0
size-8192            139    139   8192    1    2 : tunables    8    4    0 : slabdata    139    139      0
ip_dst_cache      280165 386208    320   12    1 : tunables   54   27    8 : slabdata  32184  32184      0
skbuff_head_cache  35060 110200    192   20    1 : tunables  120   60    8 : slabdata   5510   5510     60


Thanks
Eric Dumazet


* Re: Performance problem with route cache ?
From: Robert Olsson @ 2003-11-21  9:37 UTC
  To: dada1; +Cc: Robert Olsson, netdev, scott.feldman


dada1 writes:

 > Instead of using 2K or 4K buffers to store the data into socket, we end up
 > using 64 bytes size buffers... quite a huge difference indeed.
 > In case of a spike in network, the machine no longer consume 200 MB of
 > lowmem and stay alive.

 Well, I have no experience here... but I think e1000 was using 4k buffers
 for 1500 MTU packets, at least before.

 > In a router context, copybreak should not be used, because the extra copy is
 > CPU intensive, and the data should not live long enough on memory.

 It's easy to try again. I'm playing with the Realtek r8169 driver and am now
 refilling the skbs one by one as they are passed up the stack, rather than
 doing the RX-batch refill. I need to collect some data...
 
 > The machine just receives a lot of frames from a lot of different IP. That
 > is not a DOS attack.

 > rtstat -i 1
 >   size   IN: hit     tot    mc no_rt bcast madst masrc  OUT: hit     tot     mc GC: tot ignored goal_miss ovrf HASH: in_search out_search
 > 279536      1490   11719     0     0     0     0     0       231     843      0   12562   12560         2    0

 Well, not a DoS? What is this? You have almost 10 times more new entries
 per second than hits in your already huge dst cache (hit/total).

 Compare with a production router for tens of thousands of users at 55 kpps.

size   IN: hit     tot    mc no_rt bcast madst masrc  OUT: hit     tot     mc GC: tot ignored goal_miss ovrf HASH: in_search out_search
21797     54543     529     0    49     0     0     0       137      12      0       0       0         0    0           28139         59
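
To put the two samples side by side, the input-path hit ratio can be computed
from the "IN: hit" and "tot" columns. This is a small sketch that assumes "tot"
counts lookups that missed the cache and had to create a new entry; the
function name is illustrative.

```python
def hit_ratio(hit, tot):
    """Fraction of input lookups served from the route cache.

    Assumption: `tot` counts lookups that missed the cache (slow path),
    so hits and misses together make up all lookups.
    """
    return hit / (hit + tot)


# IN: hit / tot from the first rtstat sample of the server in this thread.
server = hit_ratio(hit=1490, tot=11719)

# IN: hit / tot from the production router sample.
router = hit_ratio(hit=54543, tot=529)

print(f"server hit ratio: {server:.1%}")
print(f"router hit ratio: {router:.1%}")
```

The gap between the two ratios is the point of the comparison: the router
serves nearly all lookups from its cache, while the server creates a new
entry for most incoming packets.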

Cheers.
							--ro
