* [PATCH] Reduce netfilter memory use on MP systems
@ 2005-02-04 14:09 Andi Kleen
2005-02-04 17:34 ` Martin Josefsson
2005-02-07 18:31 ` Harald Welte
0 siblings, 2 replies; 6+ messages in thread
From: Andi Kleen @ 2005-02-04 14:09 UTC (permalink / raw)
To: netdev; +Cc: netfilter-devel
On kernels compiled with a big NR_CPUS netfilter rules would
eat a lot of memory because all counters would be duplicated
for all NR_CPUS CPUs. With NR_CPUS=256 this would add up
to many MBs of memory.
This patch only allocates enough memory for the possible CPUs,
which is usually a much smaller number than NR_CPUS.
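To make the saving concrete, here is a minimal userspace sketch (not kernel code) of the size arithmetic. The helper names smp_align() and table_copies_bytes() are hypothetical stand-ins, the cache line size is passed in as a parameter, and the fixed struct ipt_table_info header is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's SMP_ALIGN(): round a size up to
 * a multiple of the cache line size, which must be a power of two. */
static size_t smp_align(size_t size, size_t cache_line)
{
    assert(cache_line != 0 && (cache_line & (cache_line - 1)) == 0);
    return (size + cache_line - 1) & ~(cache_line - 1);
}

/* Bytes needed for ncpus cache-line-aligned copies of one rule blob.
 * The fixed table header is deliberately left out of this sketch. */
static size_t table_copies_bytes(size_t size, size_t cache_line,
                                 unsigned int ncpus)
{
    assert(ncpus > 0);
    return smp_align(size, cache_line) * ncpus;
}
```

For an illustrative 100000-byte rule blob with 128-byte cache lines, table_copies_bytes(100000, 128, 256) is 25624576 bytes, while table_copies_bytes(100000, 128, 2) is 200192 bytes.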
This allows loading of bigger rule sets on 64bit systems.
There is still a limit because someone else broke vmalloc to have a 64MB
limit on 64bit systems for single allocations, 128MB on 32bit.
It allocates an array of pages with kmalloc and kmalloc has a 128K limit.
To be fixed with a separate patch.
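The limit described above follows from simple arithmetic, sketched below in userspace C. The function name vmalloc_single_limit() is hypothetical, and the 4 KiB page size used in the note that follows is an assumption that the message does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Largest single vmalloc() if the array of page pointers must fit into
 * one kmalloc() of kmalloc_max bytes: one pointer per page. */
static uint64_t vmalloc_single_limit(uint64_t kmalloc_max,
                                     uint64_t ptr_size,
                                     uint64_t page_size)
{
    assert(ptr_size > 0);
    return (kmalloc_max / ptr_size) * page_size;
}
```

With a 128 KiB kmalloc cap and 4 KiB pages, 8-byte pointers give 64 MiB and 4-byte pointers give 128 MiB, which is why the 64bit limit comes out lower.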
64bit systems were hurt worst because they tend to have big NR_CPUS
and the counters need more memory there, and the vmalloc limit is lower.
But it will raise the limits even on 32bit.
And in general it saves a lot of memory.
Tested only on a small dual CPU box.
Signed-off-by: Andi Kleen <ak@suse.de>
diff -u linux/net/ipv4/netfilter/ip_tables.c-o linux/net/ipv4/netfilter/ip_tables.c
--- linux/net/ipv4/netfilter/ip_tables.c-o 2005-02-04 09:40:12.000000000 +0100
+++ linux/net/ipv4/netfilter/ip_tables.c 2005-02-04 14:26:56.000000000 +0100
@@ -923,7 +923,7 @@
}
/* And one copy for every other CPU */
- for (i = 1; i < NR_CPUS; i++) {
+ for (i = 1; i < num_possible_cpus(); i++) {
memcpy(newinfo->entries + SMP_ALIGN(newinfo->size)*i,
newinfo->entries,
SMP_ALIGN(newinfo->size));
@@ -945,7 +945,7 @@
struct ipt_entry *table_base;
unsigned int i;
- for (i = 0; i < NR_CPUS; i++) {
+ for (i = 0; i < num_possible_cpus(); i++) {
table_base =
(void *)newinfo->entries
+ TABLE_OFFSET(newinfo, i);
@@ -992,7 +992,7 @@
unsigned int cpu;
unsigned int i;
- for (cpu = 0; cpu < NR_CPUS; cpu++) {
+ for (cpu = 0; cpu < num_possible_cpus(); cpu++) {
i = 0;
IPT_ENTRY_ITERATE(t->entries + TABLE_OFFSET(t, cpu),
t->size,
@@ -1130,7 +1130,7 @@
return -ENOMEM;
newinfo = vmalloc(sizeof(struct ipt_table_info)
- + SMP_ALIGN(tmp.size) * NR_CPUS);
+ + SMP_ALIGN(tmp.size) * num_possible_cpus());
if (!newinfo)
return -ENOMEM;
@@ -1460,7 +1460,7 @@
= { 0, 0, 0, { 0 }, { 0 }, { } };
newinfo = vmalloc(sizeof(struct ipt_table_info)
- + SMP_ALIGN(repl->size) * NR_CPUS);
+ + SMP_ALIGN(repl->size) * num_possible_cpus());
if (!newinfo)
return -ENOMEM;
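The layout these hunks touch can be sketched in userspace C as follows. This is an illustration only, not the kernel implementation: smp_align(), table_offset() and replicate_table() are hypothetical names, CACHE_LINE is an assumed value, and calloc() stands in for vmalloc(). Like the patched loops, the sketch assumes CPU ids run densely from 0 to n-1; whether that holds on every platform is not addressed in the thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE 128 /* assumed cache line size, for illustration */

/* Round size up to a multiple of CACHE_LINE. */
static size_t smp_align(size_t size)
{
    return (size + CACHE_LINE - 1) & ~((size_t)CACHE_LINE - 1);
}

/* Byte offset of the private copy belonging to a given CPU. */
static size_t table_offset(size_t size, unsigned int cpu)
{
    return smp_align(size) * cpu;
}

/* Allocate ncpus aligned copies and replicate copy 0 into the others.
 * Returns NULL on allocation failure; the caller frees the result. */
static unsigned char *replicate_table(const unsigned char *rules,
                                      size_t size, unsigned int ncpus)
{
    assert(ncpus > 0);
    size_t stride = smp_align(size);
    unsigned char *entries = calloc(ncpus, stride);

    if (!entries)
        return NULL;
    memcpy(entries, rules, size);
    /* And one copy for every other CPU */
    for (unsigned int i = 1; i < ncpus; i++)
        memcpy(entries + table_offset(size, i), entries, stride);
    return entries;
}
```

The loop bound is the only thing the patch changes: the stride per copy stays the same, but fewer copies are allocated and filled.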
* Re: [PATCH] Reduce netfilter memory use on MP systems
2005-02-04 14:09 [PATCH] Reduce netfilter memory use on MP systems Andi Kleen
@ 2005-02-04 17:34 ` Martin Josefsson
2005-02-04 17:51 ` Andi Kleen
2005-02-07 18:31 ` Harald Welte
1 sibling, 1 reply; 6+ messages in thread
From: Martin Josefsson @ 2005-02-04 17:34 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netdev, Netfilter-devel, Andi Kleen
[-- Attachment #1: Type: text/plain, Size: 1180 bytes --]
On Fri, 2005-02-04 at 15:09 +0100, Andi Kleen wrote:
> On kernels compiled with a big NR_CPUS netfilter rules would
> eat a lot of memory because all counters would be duplicated
> for all NR_CPUS CPUs. With NR_CPUS=256 this would add up
> to many MBs of memory.
>
> This patch only allocates enough memory for the possible CPUs,
> which is usually a much smaller number than NR_CPUS.
>
> This allows loading of bigger rule sets on 64bit systems.
> There is still a limit because someone else broke vmalloc to have a 64MB
> limit on 64bit systems for single allocations, 128MB on 32bit.
> It allocates an array of pages with kmalloc and kmalloc has a 128K limit.
> To be fixed with a separate patch.
>
> 64bit systems were hurt worst because they tend to have big NR_CPUS
> and the counters need more memory there, and the vmalloc limit is lower.
> But it will raise the limits even on 32bit.
>
> And in general it saves a lot of memory.
Patrick, could you apply and submit this patch to Davem? Or if Davem
applies it himself. It's pretty obvious and would help small SMP
machines with distribution kernels and/or strange admins.
--
/Martin
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH] Reduce netfilter memory use on MP systems
2005-02-04 17:34 ` Martin Josefsson
@ 2005-02-04 17:51 ` Andi Kleen
2005-02-04 18:13 ` Patrick McHardy
0 siblings, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2005-02-04 17:51 UTC (permalink / raw)
To: Martin Josefsson; +Cc: netdev, Netfilter-devel, Patrick McHardy, Andi Kleen
> > And in general it saves a lot of memory.
>
> Patrick, could you apply and submit this patch to Davem? Or if Davem
> applies it himself. It's pretty obvious and would help small SMP
> machines with distribution kernels and/or strange admins.
The main motivation is actually not to save the memory (that's just
a useful side effect), but increase the max limit on 64bit systems.
Fixing it fully will require fixing vmalloc of course, but it already
helps. Without it you can't get more than ~3800 rules
on a 64bit system with NR_CPUS==128 and 128 byte cache lines.
-Andi
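The rule-count ceiling mentioned above can be approximated with the following userspace sketch. The function names are hypothetical, the table header and final cache-line padding are ignored, and the average rule size is a free parameter: the thread does not give one, so any value passed in is back-derived rather than taken from the kernel structures.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes available to each per-CPU copy when all copies must fit into a
 * single allocation of alloc_limit bytes. */
static uint64_t per_cpu_budget(uint64_t alloc_limit, unsigned int ncpus)
{
    assert(ncpus > 0);
    return alloc_limit / ncpus;
}

/* Rough number of rules that fit, given an average rule size in bytes. */
static uint64_t max_rules(uint64_t alloc_limit, unsigned int ncpus,
                          uint64_t avg_rule_bytes)
{
    assert(avg_rule_bytes > 0);
    return per_cpu_budget(alloc_limit, ncpus) / avg_rule_bytes;
}
```

With a 64 MiB allocation limit and 128 copies, each copy gets 524288 bytes. An assumed average of 136 bytes per rule then yields 3855 rules, in the same range as the ~3800 quoted above; with only 2 copies the same assumption gives 246723.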
* Re: [PATCH] Reduce netfilter memory use on MP systems
2005-02-04 17:51 ` Andi Kleen
@ 2005-02-04 18:13 ` Patrick McHardy
0 siblings, 0 replies; 6+ messages in thread
From: Patrick McHardy @ 2005-02-04 18:13 UTC (permalink / raw)
To: Andi Kleen; +Cc: netdev, Netfilter-devel, Martin Josefsson
[-- Attachment #1: Type: text/plain, Size: 505 bytes --]
Andi Kleen wrote:
>The main motivation is actually not to save the memory (that's just
>a useful side effect), but increase the max limit on 64bit systems.
>Fixing it fully will require fixing vmalloc of course, but it already
>helps. Without it you can't get more than ~3800 rules
>on a 64bit system with NR_CPUS==128 and 128 byte cache lines.
>
Thanks Andi, I've added the patch to my 2.6.12 tree. I've also
made the same change in arp_tables, ip6_tables and ebtables
for consistency.
Regards
Patrick
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 5343 bytes --]
===== net/bridge/netfilter/ebtables.c 1.17 vs edited =====
--- 1.17/net/bridge/netfilter/ebtables.c 2004-11-24 08:46:46 +01:00
+++ edited/net/bridge/netfilter/ebtables.c 2005-02-04 19:03:01 +01:00
@@ -822,10 +822,10 @@
/* this will get free'd in do_replace()/ebt_register_table()
if an error occurs */
newinfo->chainstack = (struct ebt_chainstack **)
- vmalloc(NR_CPUS * sizeof(struct ebt_chainstack));
+ vmalloc(num_possible_cpus() * sizeof(struct ebt_chainstack));
if (!newinfo->chainstack)
return -ENOMEM;
- for (i = 0; i < NR_CPUS; i++) {
+ for (i = 0; i < num_possible_cpus(); i++) {
newinfo->chainstack[i] =
vmalloc(udc_cnt * sizeof(struct ebt_chainstack));
if (!newinfo->chainstack[i]) {
@@ -898,7 +898,7 @@
memcpy(counters, oldcounters,
sizeof(struct ebt_counter) * nentries);
/* add other counters to those of cpu 0 */
- for (cpu = 1; cpu < NR_CPUS; cpu++) {
+ for (cpu = 1; cpu < num_possible_cpus(); cpu++) {
counter_base = COUNTER_BASE(oldcounters, nentries, cpu);
for (i = 0; i < nentries; i++) {
counters[i].pcnt += counter_base[i].pcnt;
@@ -930,7 +930,7 @@
BUGPRINT("Entries_size never zero\n");
return -EINVAL;
}
- countersize = COUNTER_OFFSET(tmp.nentries) * NR_CPUS;
+ countersize = COUNTER_OFFSET(tmp.nentries) * num_possible_cpus();
newinfo = (struct ebt_table_info *)
vmalloc(sizeof(struct ebt_table_info) + countersize);
if (!newinfo)
@@ -1023,7 +1023,7 @@
vfree(table->entries);
if (table->chainstack) {
- for (i = 0; i < NR_CPUS; i++)
+ for (i = 0; i < num_possible_cpus(); i++)
vfree(table->chainstack[i]);
vfree(table->chainstack);
}
@@ -1043,7 +1043,7 @@
vfree(counterstmp);
/* can be initialized in translate_table() */
if (newinfo->chainstack) {
- for (i = 0; i < NR_CPUS; i++)
+ for (i = 0; i < num_possible_cpus(); i++)
vfree(newinfo->chainstack[i]);
vfree(newinfo->chainstack);
}
@@ -1137,7 +1137,7 @@
return -EINVAL;
}
- countersize = COUNTER_OFFSET(table->table->nentries) * NR_CPUS;
+ countersize = COUNTER_OFFSET(table->table->nentries) * num_possible_cpus();
newinfo = (struct ebt_table_info *)
vmalloc(sizeof(struct ebt_table_info) + countersize);
ret = -ENOMEM;
@@ -1191,7 +1191,7 @@
up(&ebt_mutex);
free_chainstack:
if (newinfo->chainstack) {
- for (i = 0; i < NR_CPUS; i++)
+ for (i = 0; i < num_possible_cpus(); i++)
vfree(newinfo->chainstack[i]);
vfree(newinfo->chainstack);
}
@@ -1215,7 +1215,7 @@
if (table->private->entries)
vfree(table->private->entries);
if (table->private->chainstack) {
- for (i = 0; i < NR_CPUS; i++)
+ for (i = 0; i < num_possible_cpus(); i++)
vfree(table->private->chainstack[i]);
vfree(table->private->chainstack);
}
===== net/ipv4/netfilter/arp_tables.c 1.23 vs edited =====
--- 1.23/net/ipv4/netfilter/arp_tables.c 2005-01-11 03:45:54 +01:00
+++ edited/net/ipv4/netfilter/arp_tables.c 2005-02-04 19:01:20 +01:00
@@ -717,7 +717,7 @@
}
/* And one copy for every other CPU */
- for (i = 1; i < NR_CPUS; i++) {
+ for (i = 1; i < num_possible_cpus(); i++) {
memcpy(newinfo->entries + SMP_ALIGN(newinfo->size)*i,
newinfo->entries,
SMP_ALIGN(newinfo->size));
@@ -768,7 +768,7 @@
unsigned int cpu;
unsigned int i;
- for (cpu = 0; cpu < NR_CPUS; cpu++) {
+ for (cpu = 0; cpu < num_possible_cpus(); cpu++) {
i = 0;
ARPT_ENTRY_ITERATE(t->entries + TABLE_OFFSET(t, cpu),
t->size,
@@ -886,7 +886,7 @@
return -ENOMEM;
newinfo = vmalloc(sizeof(struct arpt_table_info)
- + SMP_ALIGN(tmp.size) * NR_CPUS);
+ + SMP_ALIGN(tmp.size) * num_possible_cpus());
if (!newinfo)
return -ENOMEM;
@@ -1159,7 +1159,7 @@
= { 0, 0, 0, { 0 }, { 0 }, { } };
newinfo = vmalloc(sizeof(struct arpt_table_info)
- + SMP_ALIGN(repl->size) * NR_CPUS);
+ + SMP_ALIGN(repl->size) * num_possible_cpus());
if (!newinfo) {
ret = -ENOMEM;
return ret;
===== net/ipv6/netfilter/ip6_tables.c 1.39 vs edited =====
--- 1.39/net/ipv6/netfilter/ip6_tables.c 2005-01-11 03:45:54 +01:00
+++ edited/net/ipv6/netfilter/ip6_tables.c 2005-02-04 19:01:55 +01:00
@@ -952,7 +952,7 @@
}
/* And one copy for every other CPU */
- for (i = 1; i < NR_CPUS; i++) {
+ for (i = 1; i < num_possible_cpus(); i++) {
memcpy(newinfo->entries + SMP_ALIGN(newinfo->size)*i,
newinfo->entries,
SMP_ALIGN(newinfo->size));
@@ -974,7 +974,7 @@
struct ip6t_entry *table_base;
unsigned int i;
- for (i = 0; i < NR_CPUS; i++) {
+ for (i = 0; i < num_possible_cpus(); i++) {
table_base =
(void *)newinfo->entries
+ TABLE_OFFSET(newinfo, i);
@@ -1021,7 +1021,7 @@
unsigned int cpu;
unsigned int i;
- for (cpu = 0; cpu < NR_CPUS; cpu++) {
+ for (cpu = 0; cpu < num_possible_cpus(); cpu++) {
i = 0;
IP6T_ENTRY_ITERATE(t->entries + TABLE_OFFSET(t, cpu),
t->size,
@@ -1155,7 +1155,7 @@
return -ENOMEM;
newinfo = vmalloc(sizeof(struct ip6t_table_info)
- + SMP_ALIGN(tmp.size) * NR_CPUS);
+ + SMP_ALIGN(tmp.size) * num_possible_cpus());
if (!newinfo)
return -ENOMEM;
@@ -1469,7 +1469,7 @@
= { 0, 0, 0, { 0 }, { 0 }, { } };
newinfo = vmalloc(sizeof(struct ip6t_table_info)
- + SMP_ALIGN(repl->size) * NR_CPUS);
+ + SMP_ALIGN(repl->size) * num_possible_cpus());
if (!newinfo)
return -ENOMEM;
* Re: [PATCH] Reduce netfilter memory use on MP systems
2005-02-04 14:09 [PATCH] Reduce netfilter memory use on MP systems Andi Kleen
2005-02-04 17:34 ` Martin Josefsson
@ 2005-02-07 18:31 ` Harald Welte
2005-02-07 19:10 ` Andi Kleen
1 sibling, 1 reply; 6+ messages in thread
From: Harald Welte @ 2005-02-07 18:31 UTC (permalink / raw)
To: Andi Kleen; +Cc: netdev, netfilter-devel
[-- Attachment #1: Type: text/plain, Size: 920 bytes --]
On Fri, Feb 04, 2005 at 03:09:00PM +0100, Andi Kleen wrote:
>
> On kernels compiled with a big NR_CPUS netfilter rules would
> eat a lot of memory because all counters would be duplicated
> for all NR_CPUS CPUs. With NR_CPUS=256 this would add up
> to many MBs of memory.
Thanks, Andi. I think the NR_CPUS is actually a remnant of 2.3.x times
when we didn't have num_possible_cpus() yet.
Also wrt. your vmalloc issues, I think there are floating around some
patches which replace our vmalloc use by __get_free_pages() anyway.
--
- Harald Welte <laforge@netfilter.org> http://www.netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH] Reduce netfilter memory use on MP systems
2005-02-07 18:31 ` Harald Welte
@ 2005-02-07 19:10 ` Andi Kleen
0 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2005-02-07 19:10 UTC (permalink / raw)
To: Harald Welte, Andi Kleen, netdev, netfilter-devel
> Also wrt. your vmalloc issues, I think there are floating around some
> patches which replace our vmalloc use by __get_free_pages() anyway.
Don't think they're needed (unless you want them for other reasons).
The vmalloc limit is clearly a bug in itself and will surely be fixed.
-Andi