From: Robert Olsson <Robert.Olsson@data.slu.se>
To: kuznet@ms2.inr.ac.ru
Cc: davem@redhat.com, hadi@cyberus.ca, netdev@oss.sgi.com,
jensl@robur.slu.se
Subject: route cache GC monitoring
Date: Fri, 26 Apr 2002 14:55:00 +0200
Message-ID: <15561.20004.964497.762468@robur.slu.se>
[-- Attachment #1: message body text --]
[-- Type: text/plain, Size: 584 bytes --]
Hello!
We would like to propose four new stats counters to monitor the garbage
collection process of the route cache. They should be useful for tuning and
debugging installations that handle a large number of flows.

Tuning itself is another matter, but we have played with max_size and
gc_thresh; there are more tuning knobs. The number of buckets in the hash
table is something to tune as well, but we have not experimented with that
here.

Anyway, for most installations the "default" settings do a very good job,
but high numbers in the new GC counters should be taken as a warning.
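(Side note, not part of the patch: both knobs live under
/proc/sys/net/ipv4/route/. A minimal C sketch for setting them at runtime is
below; the values are just the ones from the "oscillating around gc_thresh"
run further down, not a recommendation, and echoing the numbers into the
files from a shell works just as well.)

/* Sketch: set the two route cache tuning knobs mentioned above.
 * Equivalent to echoing the values into the proc files. */
#include <stdio.h>

static int set_knob(const char *path, unsigned int val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fprintf(f, "%u\n", val);
	return fclose(f);
}

int main(void)
{
	/* Example values taken from one of the test runs below. */
	set_knob("/proc/sys/net/ipv4/route/gc_thresh", 47000);
	set_knob("/proc/sys/net/ipv4/route/max_size", 94000);
	return 0;
}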
Kernel patch.
[-- Attachment #2: GC.pat --]
[-- Type: application/octet-stream, Size: 2246 bytes --]
--- linux/include/net/route.h.orig	Mon Feb 25 20:38:13 2002
+++ linux/include/net/route.h	Fri Apr 19 15:50:20 2002
@@ -110,6 +110,10 @@
 	unsigned int out_hit;
 	unsigned int out_slow_tot;
 	unsigned int out_slow_mc;
+	unsigned int gc_total;
+	unsigned int gc_ignored;
+	unsigned int gc_goal_miss;
+	unsigned int gc_dst_overflow;
 } ____cacheline_aligned_in_smp;
 
 extern struct ip_rt_acct *ip_rt_acct;
--- linux/net/ipv4/route.c.orig	Thu Apr 18 07:43:44 2002
+++ linux/net/ipv4/route.c	Fri Apr 26 11:27:16 2002
@@ -286,7 +286,7 @@
 	for (lcpu = 0; lcpu < smp_num_cpus; lcpu++) {
 		i = cpu_logical_map(lcpu);
 
-		len += sprintf(buffer+len, "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x\n",
+		len += sprintf(buffer+len, "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x \n",
 			       dst_entries,
 			       rt_cache_stat[i].in_hit,
 			       rt_cache_stat[i].in_slow_tot,
@@ -298,7 +298,13 @@
 			       rt_cache_stat[i].out_hit,
 			       rt_cache_stat[i].out_slow_tot,
-			       rt_cache_stat[i].out_slow_mc
+			       rt_cache_stat[i].out_slow_mc,
+
+			       rt_cache_stat[i].gc_total,
+			       rt_cache_stat[i].gc_ignored,
+			       rt_cache_stat[i].gc_goal_miss,
+			       rt_cache_stat[i].gc_dst_overflow
+
 			       );
 	}
 
 	len -= offset;
@@ -499,9 +505,14 @@
 	 * Garbage collection is pretty expensive,
 	 * do not make it too frequently.
 	 */
+
+	rt_cache_stat[smp_processor_id()].gc_total++;
+
 	if (now - last_gc < ip_rt_gc_min_interval &&
-	    atomic_read(&ipv4_dst_ops.entries) < ip_rt_max_size)
+	    atomic_read(&ipv4_dst_ops.entries) < ip_rt_max_size) {
+		rt_cache_stat[smp_processor_id()].gc_ignored++;
 		goto out;
+	}
 
 	/* Calculate number of entries, which we want to expire now. */
 	goal = atomic_read(&ipv4_dst_ops.entries) -
@@ -567,6 +578,8 @@
 		     We will not spin here for long time in any case.
 		 */
 
+		rt_cache_stat[smp_processor_id()].gc_goal_miss++;
+
 		if (expire == 0)
 			break;
 
@@ -584,6 +597,7 @@
 		goto out;
 	if (net_ratelimit())
 		printk(KERN_WARNING "dst cache overflow\n");
+	rt_cache_stat[smp_processor_id()].gc_dst_overflow++;
 	return 1;
 
 work_done:
[-- Attachment #3: message body text --]
[-- Type: text/plain, Size: 17 bytes --]
Use/Experiment.
[-- Attachment #4: usage --]
[-- Type: application/octet-stream, Size: 7665 bytes --]
Julian Anastasov's test program, using 80000 different source addresses:
testlvs x.y.z:80 -udp -srcnum 80000 -packets 0
tot       == GC: total calls per second
ignored   == GC: calls ignored per second
goal_miss == GC: goal misses per second
ovrf      == GC: dst cache overflows per second
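The counters in the kernel are cumulative; the per-second figures in the
tables below come from sampling them once per second and printing the
deltas. A minimal sketch of such a sampler follows. It assumes the patched
kernel exposes one line of 15 hex fields per CPU through a proc file; the
name /proc/net/rt_cache_stat and the absence of a header line are
assumptions here, so adjust for your kernel.

/* Sketch of a per-second sampler for the new GC counters.
 * Assumptions (not verified against this patch): the proc file is
 * /proc/net/rt_cache_stat and it contains only the hex lines produced
 * by the sprintf() above, i.e. 15 fields per CPU and no header line. */
#include <stdio.h>
#include <unistd.h>

struct gc_sample {
	unsigned long tot, ignored, goal_miss, ovrf;
};

static int read_gc(struct gc_sample *s)
{
	FILE *f = fopen("/proc/net/rt_cache_stat", "r");
	unsigned int v[15];

	if (!f)
		return -1;
	s->tot = s->ignored = s->goal_miss = s->ovrf = 0;
	/* One line per CPU: field 0 is dst_entries, 1-10 are the old
	 * counters, 11-14 are gc_total, gc_ignored, gc_goal_miss and
	 * gc_dst_overflow.  Sum the GC fields over all CPUs. */
	while (fscanf(f, "%x %x %x %x %x %x %x %x %x %x %x %x %x %x %x",
		      &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6],
		      &v[7], &v[8], &v[9], &v[10], &v[11], &v[12],
		      &v[13], &v[14]) == 15) {
		s->tot       += v[11];
		s->ignored   += v[12];
		s->goal_miss += v[13];
		s->ovrf      += v[14];
	}
	fclose(f);
	return 0;
}

int main(void)
{
	struct gc_sample prev, cur;

	if (read_gc(&prev))
		return 1;
	for (;;) {
		sleep(1);
		if (read_gc(&cur))
			return 1;
		printf("GC: tot %lu ignored %lu goal_miss %lu ovrf %lu\n",
		       cur.tot - prev.tot, cur.ignored - prev.ignored,
		       cur.goal_miss - prev.goal_miss,
		       cur.ovrf - prev.ovrf);
		prev = cur;
	}
	return 0;
}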
Not loaded:
----------
max_size 65536
gc_thresh 4096
size IN: hit tot mc no_rt bcast madst masrc OUT: hit tot mc GC: tot ignored goal_miss ovrf
11 134 1159 4 0 15 0 0 146 854 0 0 0 0 0
11 0 1 0 0 0 0 0 0 0 0 0 0 0 0
11 0 1 0 0 0 0 0 0 0 0 0 0 0 0
11 0 3 0 0 0 0 0 0 0 0 0 0 0 0
Cache too small:
---------------
max_size 2000
gc_thresh 1000
size IN: hit tot mc no_rt bcast madst masrc OUT: hit tot mc GC: tot ignored goal_miss ovrf
2000 541 1326737 4 0 39 0 0 542 1321662 0 1286563 1009673 3946 3946
2000 0 876 0 0 0 0 0 0 808 0 1724 18 67 67
2000 0 877 0 0 0 0 0 0 810 0 1719 10 65 65
2000 0 876 0 0 0 0 0 0 809 0 1723 17 66 66
2000 0 876 0 0 0 0 0 0 808 0 1723 7 67 67
2000 0 875 0 0 0 0 0 0 807 0 1722 11 67 67
2000 0 875 0 0 0 0 0 0 806 0 1713 6 69 69
2000 0 875 0 0 0 0 0 0 807 0 1720 13 68 68
2000 0 875 0 0 0 0 0 0 810 0 1727 10 65 65
Cache still "too" small but no dst overflow:
--------------------------------------------
max_size 8000
gc_thresh 4000
size IN: hit tot mc no_rt bcast madst masrc OUT: hit tot mc GC: tot ignored goal_miss ovrf
3408 551 1374530 4 0 39 0 0 558 1366058 0 1377074 1010314 7313 7313
4743 0 673 0 0 0 0 0 0 671 0 743 742 0 0
5880 0 569 0 0 0 0 0 0 568 0 1137 1137 0 0
7610 0 861 0 0 0 0 0 0 861 0 1722 1722 0 0
7348 0 872 0 0 0 0 0 0 871 0 1742 1741 0 0
7563 0 872 0 0 0 0 0 0 870 0 1740 1738 0 0
7963 0 872 0 0 0 0 0 0 871 0 1742 1718 0 0
7976 0 873 0 0 0 0 0 0 871 0 1742 1704 0 0
7956 0 872 0 0 0 0 0 0 871 0 1742 1708 0 0
7970 0 876 0 0 0 0 0 0 875 0 1750 1712 0 0
More reasonable settings (cache size oscillating around gc_thresh):
--------------------------------------------------------------------
max_size 94000
gc_thresh 47000
size IN: hit tot mc no_rt bcast madst masrc OUT: hit tot mc GC: tot ignored goal_miss ovrf
42490 559 1442677 6 0 46 0 0 574 1434138 0 1445795 1077777 7313 7313
43864 0 686 0 0 0 0 0 0 685 0 0 0 0 0
45224 0 676 0 0 0 0 0 0 675 0 0 0 0 0
46616 0 691 0 0 0 0 0 0 691 0 0 0 0 0
40518 0 684 0 0 0 0 0 0 684 0 660 659 0 0
41948 0 711 0 0 0 0 0 0 711 0 0 0 0 0
43698 3 871 0 0 0 0 0 7 871 0 0 0 0 0
45448 2 872 0 0 0 0 0 3 872 0 0 0 0 0
47198 0 871 0 0 0 0 0 0 871 0 197 197 0 0
40980 0 872 0 0 0 0 0 0 872 0 1207 1206 0 0
42534 0 774 0 0 0 0 0 0 773 0 0 0 0 0
44062 0 764 0 0 0 0 0 0 764 0 0 0 0 0
45260 0 596 0 0 0 0 0 0 595 0 0 0 0 0
46508 0 621 0 0 0 0 0 0 620 0 0 0 0 0
40081 0 561 0 0 0 0 0 0 560 0 227 226 0 0
41835 0 874 0 0 0 0 0 0 873 0 0 0 0 0
43581 0 870 0 0 0 0 0 0 869 0 0 0 0 0
45330 0 872 0 0 1 0 0 0 870 0 0 0 0 0
47082 0 872 0 0 0 0 0 0 872 0 81 81 0 0
40910 0 871 0 0 0 0 0 0 871 0 1031 1030 0 0
Optimum (?) setting for this load:
----------------------------------
max_size 100000
gc_thresh 50000
size IN: hit tot mc no_rt bcast madst masrc OUT: hit tot mc GC: tot ignored goal_miss ovrf
42271 589 1529963 6 0 52 0 0 610 1521341 0 1456394 1088362 7313 7313
44001 0 866 0 0 0 0 0 0 865 0 0 0 0 0
45743 0 872 0 0 0 0 0 0 871 0 0 0 0 0
47491 0 875 0 0 0 0 0 0 874 0 0 0 0 0
49241 0 875 0 0 0 0 0 0 875 0 0 0 0 0
42326 0 860 0 0 0 0 0 0 860 0 1 0 0 0
44075 0 876 0 0 1 0 0 0 874 0 0 0 0 0
45825 0 876 0 0 0 0 0 0 875 0 0 0 0 0
47574 0 876 1 0 0 0 0 0 874 0 0 0 0 0
49316 0 872 0 0 0 0 0 0 871 0 0 0 0 0
42373 0 876 0 0 0 0 0 0 875 0 8 7 0 0
44121 0 875 0 0 0 0 0 0 874 0 0 0 0 0
45873 0 875 0 0 0 0 0 0 875 0 0 0 0 0
47623 0 875 0 0 0 0 0 0 875 0 0 0 0 0
49371 0 875 0 0 0 0 0 0 874 0 0 0 0 0
[-- Attachment #5: message body text --]
[-- Type: text/plain, Size: 68 bytes --]
Cheers.
Robert Olsson, Jens Laas