From: "Paweł Staszewski" <pstaszewski@itcare.pl>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linux Network Development list <netdev@vger.kernel.org>
Subject: Re: Linux Route Cache performance tests
Date: Sun, 06 Nov 2011 20:20:38 +0100
Message-ID: <4EB6DE06.7050009@itcare.pl>
In-Reply-To: <1320605326.6506.27.camel@edumazet-laptop>
On 2011-11-06 19:48, Eric Dumazet wrote:
> On Sunday, 06 November 2011 at 19:28 +0100, Paweł Staszewski wrote:
>> On 2011-11-06 18:29, Eric Dumazet wrote:
>>> On Sunday, 06 November 2011 at 16:57 +0100, Paweł Staszewski wrote:
>>>> Hello
>>>>
>>>>
>>>>
>>>> I ran some networking performance tests on Linux 3.1.
>>>>
>>>> Configuration:
>>>>
>>>> Linux (pktgen) ----> Linux (router) ----> Linux (Sink)
>>>>
>>>> pktgen config:
>>>> clone_skb 32
>>>> pkt_size 64
>>>> delay 0
>>>>
>>>> pgset "flag IPDST_RND"
>>>> pgset "dst_min 10.0.0.0"
>>>> pgset "dst_max 10.18.255.255"
>>>> pgset "config 1"
>>>> pgset "flows 256"
>>>> pgset "flowlen 8"
>>>>
>>>> TX performance for this host:
>>>> eth0: RX: 0.00 P/s TX: 12346107.73 P/s TOTAL: 12346107.73 P/s
>>>>
>>>> On Linux (router):
>>>> grep . /proc/sys/net/ipv4/route/*
>>>> /proc/sys/net/ipv4/route/error_burst:500
>>>> /proc/sys/net/ipv4/route/error_cost:100
>>>> grep: /proc/sys/net/ipv4/route/flush: Permission denied
>>>> /proc/sys/net/ipv4/route/gc_elasticity:4
>>>> /proc/sys/net/ipv4/route/gc_interval:60
>>>> /proc/sys/net/ipv4/route/gc_min_interval:0
>>>> /proc/sys/net/ipv4/route/gc_min_interval_ms:500
>>>> /proc/sys/net/ipv4/route/gc_thresh:2000000
>>>> /proc/sys/net/ipv4/route/gc_timeout:60
>>>> /proc/sys/net/ipv4/route/max_size:8388608
>>>> /proc/sys/net/ipv4/route/min_adv_mss:256
>>>> /proc/sys/net/ipv4/route/min_pmtu:552
>>>> /proc/sys/net/ipv4/route/mtu_expires:600
>>>> /proc/sys/net/ipv4/route/redirect_load:2
>>>> /proc/sys/net/ipv4/route/redirect_number:9
>>>> /proc/sys/net/ipv4/route/redirect_silence:2048
>>>>
>>>> For the first 30 seconds, maybe more, the router forwards ~5 Mpps to the
>>>> Linux (Sink).
>>>> Some stats for these first 30 seconds are in the attached image:
>>>>
>>>> http://imageshack.us/photo/my-images/684/test1ih.png/
>>>>
>>>> Top left - Linux pktgen
>>>> Bottom left - Linux router (htop)
>>>> Top right - Linux router (bwm-ng, showing pps)
>>>> Bottom right - Linux router (lnstat)
>>>>
>>>>
>>>> All is good, with performance at 5 Mpps, until the Linux router reaches
>>>> ~1M route cache entries,
>>>> which you can see in the next attached image:
>>>>
>>>> http://imageshack.us/photo/my-images/24/test2id.png/
>>>>
>>>> Forwarding performance drops from 5 Mpps to 1.8 Mpps,
>>>> and after 3-4 minutes it settles at 0.7 Mpps.
>>>>
>>>>
>>>> After flushing the route cache, performance increases from 0.7 Mpps to 6 Mpps,
>>>> which you can see in the next attached image:
>>>>
>>>> http://imageshack.us/photo/my-images/197/test3r.png/
>>>>
>>>> Is it possible to turn off the route cache, to see what performance
>>>> would be without caching?
>>>>
>>> Route cache cannot handle DDOS situation, since it will be filled,
>>> unless you have a lot of memory.
>> Hmm,
>> but what counts as a DDOS situation for the route cache? New entries per
>> second? The total number of entries, 1.2M in my tests?
>> Look, sometimes in a normal scenario you can hit
>> 1245072 route cache entries.
>> This is normal for BGP configurations.
>>
> Then figure out the right tunables for your machine?
>
> It's not a laptop or average server setup, so you need to allow your
> kernel to consume a fair amount of memory for the route cache.
Yes, these parameters were deliberately not tuned :)
I wanted to see where the route cache performance limit is,
because there were no optimal parameters for this test :)
No matter what I tuned, the outcome is always the same: performance drops.
With tuned parameters I can reach about the same as with the route cache
turned off when running these tests.
So yes, tuned performance is better:
performance drops from 5 Mpps to 0.7 Mpps without tuning sysctl,
and from 5 Mpps to 3.7 Mpps with tuned sysctl, so a little less than with
the route cache turned off.
So the point of this test was to figure out how many route cache entries
Linux can handle without dropping performance.
> Or accept low performance :(
Never :)
>> The performance of the route cache is OK up to the point where we reach
>> more than 1245072 entries.
>> The router starts forwarding packets at 5 Mpps and ends at about
>> 0.7 Mpps once more than 1245072 entries are reached.
>> For my scenario,
>> random IP generation starts at 10.0.0.0 and ends at 10.18.255.255;
>> this is 1170450 random IPs.
>>
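For reference, the size of the destination range used by pktgen can be checked with a few lines of Python. This is only a sketch: the helper name range_size is made up for illustration, and it counts every address in the inclusive range, not the number of distinct addresses a random generator actually hits during a run.

```python
import ipaddress


def range_size(first: str, last: str) -> int:
    """Number of IPv4 addresses in the inclusive range [first, last]."""
    lo = int(ipaddress.IPv4Address(first))
    hi = int(ipaddress.IPv4Address(last))
    return hi - lo + 1


# dst_min / dst_max from the pktgen config quoted earlier
print(range_size("10.0.0.0", "10.18.255.255"))  # 19 * 65536 = 1245184
```

The full range holds 1245184 addresses, which is close to the ~1245072 route cache entries seen on the router.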
> I have no problem with 4 million entries in route cache, with full
> performance, not 80%.
>
>
> You currently have one hash table with 524288 entries
> (before you changed /proc/sys/net/ipv4/route/gc_thresh)
>
> It's not optimal for your workload, because you have many slots with 4
> chained items, performance sucks.
>
> You have to boot your machine with "rhash_entries=2097152", so that
> average chain length is less than 1
>
> Your problem is then solved :
>
> # grep . /proc/sys/net/ipv4/route/*
> /proc/sys/net/ipv4/route/error_burst:5000
> /proc/sys/net/ipv4/route/error_cost:1000
> /proc/sys/net/ipv4/route/gc_elasticity:8
> /proc/sys/net/ipv4/route/gc_min_interval:0
> /proc/sys/net/ipv4/route/gc_min_interval_ms:500
> /proc/sys/net/ipv4/route/gc_thresh:2097152
> /proc/sys/net/ipv4/route/gc_timeout:300
> /proc/sys/net/ipv4/route/max_size:33554432
> /proc/sys/net/ipv4/route/min_adv_mss:256
> /proc/sys/net/ipv4/route/min_pmtu:552
> /proc/sys/net/ipv4/route/mtu_expires:600
> /proc/sys/net/ipv4/route/redirect_load:20
> /proc/sys/net/ipv4/route/redirect_number:9
> /proc/sys/net/ipv4/route/redirect_silence:20480
>
>
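The chain length argument above can be checked with simple arithmetic. The sketch below assumes entries are spread evenly over the hash slots (an idealisation, real distributions are lumpier) and uses the figures from this thread; the function name average_chain_length is only illustrative.

```python
def average_chain_length(entries: int, buckets: int) -> float:
    """Mean number of chained items per hash slot, assuming uniform hashing."""
    return entries / buckets


# Entry counts from this thread: the observed ~1245072 entries and the
# gc_thresh of 2000000 set on the router.
for entries in (1245072, 2000000):
    # Hash table sizes: the current 524288 slots vs rhash_entries=2097152.
    for buckets in (524288, 2097152):
        avg = average_chain_length(entries, buckets)
        print(f"{entries} entries / {buckets} slots = {avg:.2f} per slot")
```

With 524288 slots the average is about 2.4 items per slot at 1245072 entries and about 3.8 at gc_thresh 2000000, while 2097152 slots keeps both cases below 1.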