* limited number if iptable rules on 64bit hosts
  From: Olaf Hering @ 2005-02-02 13:38 UTC
  To: netdev

What buffer or sysctl value has to change to allow more than 3445 rules
like this (on a 64bit host with 64bit iptables)?

  iptables -A FORWARD -j ACCEPT

  setsockopt(3, SOL_IP, 0x40 /* IP_??? */,
  "filter\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 524368) =
  -1 ENOMEM (Cannot allocate memory)

I see this with 2.6.5 and 2.6.11.

* Re: limited number if iptable rules on 64bit hosts
  From: Olaf Hering @ 2005-02-02 22:25 UTC
  To: netdev

On Wed, Feb 02, Olaf Hering wrote:

> What buffer or sysctl value has to change to allow more than 3445 rules
> like this (on a 64bit host with 64bit iptables)?
>
> iptables -A FORWARD -j ACCEPT
>
> setsockopt(3, SOL_IP, 0x40 /* IP_??? */,
> "filter\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 524368) =
> -1 ENOMEM (Cannot allocate memory)

It triggers the first -ENOMEM in
net/ipv4/netfilter/ip_tables.c:do_replace

  sizeof(struct ipt_table_info)+SMP_ALIGN(tmp.size)*NR_CPUS == 67108992 bytes

  128+524288*128==67108992

  (sizeof(struct ipt_table_info) + (((tmp.size) + (1 << 7)-1) & ~((1 << 7)-1)) * 128)

Hmm, no braces missing.
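
The size computation in this message can be reproduced with a small C sketch. The helper names `smp_align` and `table_alloc_size` are illustrative only and are not kernel functions; the 128-byte header, the 128-byte alignment (`1 << 7`) and the 128 CPUs are the values quoted in the message, and 524272 is the `tmp.size` reported later in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to a multiple of align (a power of two), following the
 * expanded SMP_ALIGN expression quoted in the message. Illustrative helper. */
size_t smp_align(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Bytes requested for one table: a header plus one aligned copy of the
 * rule blob per CPU. Illustrative helper, not the kernel's code. */
size_t table_alloc_size(size_t header, size_t rules_size,
                        size_t align, size_t nr_cpus)
{
    return header + smp_align(rules_size, align) * nr_cpus;
}
```

Calling `table_alloc_size(128, 524272, 128, 128)` gives the 67108992 bytes from the message, which is 128 bytes more than 64 MiB (67108864).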

* Re: limited number if iptable rules on 64bit hosts
  From: Bill Rugolsky Jr. @ 2005-02-02 22:38 UTC
  To: Olaf Hering; +Cc: netdev

On Wed, Feb 02, 2005 at 11:25:16PM +0100, Olaf Hering wrote:

> It triggers the first -ENOMEM in
> net/ipv4/netfilter/ip_tables.c:do_replace
>
> sizeof(struct ipt_table_info)+SMP_ALIGN(tmp.size)*NR_CPUS == 67108992 bytes
>
> 128+524288*128==67108992
>
> (sizeof(struct ipt_table_info) + (((tmp.size) + (1 << 7)-1) & ~((1 << 7)-1)) * 128)
>
> Hmm, no braces missing.

I don't have time to look now [I'm running for the door],
but that's possibly the vmalloc() limit of 64M (67108864)?

Regards,

	Bill Rugolsky

* Re: limited number if iptable rules on 64bit hosts
  From: Olaf Hering @ 2005-02-02 22:52 UTC
  To: Bill Rugolsky Jr.; +Cc: netdev

On Wed, Feb 02, Bill Rugolsky Jr. wrote:

> I don't have time to look now [I'm running for the door],
> but that's possibly the vmalloc() limit of 64M (67108864)?

Maybe.
->size is a user-provided value; I haven't looked closely at the iptables
source. It seems we have to live with this limitation.

* Re: limited number if iptable rules on 64bit hosts
  From: Olaf Kirch @ 2005-02-03 11:19 UTC
  To: Olaf Hering; +Cc: Bill Rugolsky Jr., netdev

On Wed, Feb 02, 2005 at 11:52:58PM +0100, Olaf Hering wrote:

> > I don't have time to look now [I'm running for the door],
> > but that's possibly the vmalloc() limit of 64M (67108864)?
>
> Maybe.
> ->size is a user-provided value; I haven't looked closely at the iptables
> source. It seems we have to live with this limitation.

The problem is two-fold. netfilter tries to allocate some data per-CPU
and does

	vmalloc(sizeof(struct ipt_table_info)
		+ SMP_ALIGN(tmp.size) * NR_CPUS);

At 3445 rules, tmp.size is 524272 (why does it want that much memory? I
would expect the only data that's per-CPU is the packet and byte
counters).

In some of our kernel configurations, NR_CPUS is 128 or even more, and
we run into a vmalloc limit here. vmalloc wants to allocate an array of
struct page pointers with kmalloc, and on a 64bit platform this means
you're limited to 131072 / 8 = 16384 pages, or 67108864 bytes.

In the example Olaf H posted, tmp.size is first rounded up by SMP_ALIGN
to 524288, so we fail at 128 + 524288 * 128 = 67108992 bytes, i.e.
16385 pages.

So I guess it all boils down to why netfilter needs 150-odd bytes per
rule and CPU.

Olaf
-- 
Olaf Kirch     |  --- o --- Nous sommes du soleil we love when we play
okir@suse.de   |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
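
The page-count limit described in this message can be checked with a short C sketch. The function names are illustrative; the 131072-byte cap on the pointer array and the 8-byte pointer size come from the message, while the 4096-byte page size is an assumption implied by "16384 pages, or 67108864 bytes".

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages needed to back `bytes`, rounding up. */
size_t pages_needed(size_t bytes, size_t page_size)
{
    assert(page_size != 0);
    return (bytes + page_size - 1) / page_size;
}

/* Largest number of pages a single vmalloc() can cover when its array of
 * page pointers must fit in max_array_bytes. Illustrative helper. */
size_t max_vmalloc_pages(size_t max_array_bytes, size_t pointer_size)
{
    assert(pointer_size != 0);
    return max_array_bytes / pointer_size;
}
```

With these inputs the cap is 16384 pages, and the failing request of 67108992 bytes needs 16385 pages, one more than the pointer array can describe.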

* Re: limited number if iptable rules on 64bit hosts
  From: David S. Miller @ 2005-02-03 18:48 UTC
  To: Olaf Kirch; +Cc: olh, brugolsky, netdev

On Thu, 3 Feb 2005 12:19:39 +0100
Olaf Kirch <okir@suse.de> wrote:

> At 3445 rules, tmp.size is 524272 (why does it want that much memory? I
> would expect the only data that's per-CPU is the packet and byte
> counters).

The rule itself is replicated per-cpu as well to keep L2 cache
accesses local per cpu on SMP systems.

* Re: limited number if iptable rules on 64bit hosts
  From: Olaf Hering @ 2005-02-03 18:59 UTC
  To: David S. Miller; +Cc: Olaf Kirch, brugolsky, netdev

On Thu, Feb 03, David S. Miller wrote:

> On Thu, 3 Feb 2005 12:19:39 +0100
> Olaf Kirch <okir@suse.de> wrote:
>
> > At 3445 rules, tmp.size is 524272 (why does it want that much memory? I
> > would expect the only data that's per-CPU is the packet and byte
> > counters).
>
> The rule itself is replicated per-cpu as well to keep L2 cache
> accesses local per cpu on SMP systems.

Andy made this change, which helped on a dual box.

diff -u linux-2.6.5/net/ipv4/netfilter/ip_tables.c-o linux-2.6.5/net/ipv4/netfilter/ip_tables.c
--- linux-2.6.5/net/ipv4/netfilter/ip_tables.c-o	2005-02-03 08:06:33.000000000 +0100
+++ linux-2.6.5/net/ipv4/netfilter/ip_tables.c	2005-02-03 13:06:32.163182472 +0100
@@ -29,6 +29,12 @@
 
 #include <linux/netfilter_ipv4/ip_tables.h>
 
+#ifdef CONFIG_HOTPLUG_CPU
+#define NF_NR_CPUS NR_CPUS
+#else
+#define NF_NR_CPUS num_online_cpus()
+#endif
+
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Netfilter Core Team <coreteam@netfilter.org>");
 MODULE_DESCRIPTION("IPv4 packet filter");
@@ -860,7 +866,7 @@
 	}
 
 	/* And one copy for every other CPU */
-	for (i = 1; i < NR_CPUS; i++) {
+	for (i = 1; i < NF_NR_CPUS; i++) {
 		memcpy(newinfo->entries + SMP_ALIGN(newinfo->size)*i,
 		       newinfo->entries,
 		       SMP_ALIGN(newinfo->size));
@@ -882,7 +888,7 @@
 	struct ipt_entry *table_base;
 	unsigned int i;
 
-	for (i = 0; i < NR_CPUS; i++) {
+	for (i = 0; i < NF_NR_CPUS; i++) {
 		table_base =
 			(void *)newinfo->entries
 			+ TABLE_OFFSET(newinfo, i);
@@ -929,7 +935,7 @@
 	unsigned int cpu;
 	unsigned int i;
 
-	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+	for (cpu = 0; cpu < NF_NR_CPUS; cpu++) {
 		i = 0;
 		IPT_ENTRY_ITERATE(t->entries + TABLE_OFFSET(t, cpu),
 				  t->size,
@@ -1067,7 +1073,7 @@
 		return -ENOMEM;
 
 	newinfo = vmalloc(sizeof(struct ipt_table_info)
-			  + SMP_ALIGN(tmp.size) * NR_CPUS);
+			  + SMP_ALIGN(tmp.size) * NF_NR_CPUS);
 	if (!newinfo)
 		return -ENOMEM;
 
@@ -1380,7 +1386,7 @@
 		= { 0, 0, 0, { 0 }, { 0 }, { } };
 
 	newinfo = vmalloc(sizeof(struct ipt_table_info)
-			  + SMP_ALIGN(table->table->size) * NR_CPUS);
+			  + SMP_ALIGN(table->table->size) * NF_NR_CPUS);
 	if (!newinfo)
 		return -ENOMEM;
 
diff -u linux-2.6.5/mm/vmalloc.c-o linux-2.6.5/mm/vmalloc.c
--- linux-2.6.5/mm/vmalloc.c-o	2005-02-03 08:06:50.000000000 +0100
+++ linux-2.6.5/mm/vmalloc.c	2005-02-03 13:07:44.162236952 +0100
@@ -310,7 +310,10 @@
 			__free_page(area->pages[i]);
 		}
 
-		kfree(area->pages);
+		if (area->nr_pages * sizeof(struct page *) >= 4*PAGE_SIZE)
+			vfree(area->pages);
+		else
+			kfree(area->pages);
 	}
 
 	kfree(area);
@@ -414,7 +417,11 @@
 	array_size = (nr_pages * sizeof(struct page *));
 
 	area->nr_pages = nr_pages;
-	area->pages = pages = kmalloc(array_size, (gfp_mask & ~__GFP_HIGHMEM));
+
+	if (array_size >= 4*PAGE_SIZE)
+		area->pages = pages = __vmalloc(array_size, (gfp_mask & ~__GFP_HIGHMEM), PAGE_KERNEL);
+	else
+		area->pages = pages = kmalloc(array_size, (gfp_mask & ~__GFP_HIGHMEM));
 	if (!area->pages) {
 		remove_vm_area(area->addr);
 		kfree(area);

* Re: limited number if iptable rules on 64bit hosts
  From: David S. Miller @ 2005-02-03 19:00 UTC
  To: Olaf Hering; +Cc: okir, netdev, netfilter-devel, brugolsky

On Thu, 3 Feb 2005 19:59:28 +0100
Olaf Hering <olh@suse.de> wrote:

> On Thu, Feb 03, David S. Miller wrote:
>
> > The rule itself is replicated per-cpu as well to keep L2 cache
> > accesses local per cpu on SMP systems.
>
> Andy made this change, which helped on a dual box.

It might not help for Olaf's 128 cpu box though :-)

I think we should reconsider the idea of replicating the rule itself
per-cpu. Also, this thread should have begun with netfilter-devel at
least on the CC:, added.

* Re: limited number if iptable rules on 64bit hosts
  From: Bart De Schuymer @ 2005-02-03 19:33 UTC
  To: David S. Miller; +Cc: okir, netdev, netfilter-devel, brugolsky, Olaf Hering

On Thu, 03-02-2005 at 11:00 -0800, David S. Miller wrote:

> On Thu, 3 Feb 2005 19:59:28 +0100
> Olaf Hering <olh@suse.de> wrote:
>
> > On Thu, Feb 03, David S. Miller wrote:
> >
> > > The rule itself is replicated per-cpu as well to keep L2 cache
> > > accesses local per cpu on SMP systems.
> >
> > Andy made this change, which helped on a dual box.
>
> It might not help for Olaf's 128 cpu box though :-)
>
> I think we should reconsider the idea of replicating the rule itself
> per-cpu. Also, this thread should have begun with netfilter-devel at
> least on the CC:, added.

Note that ebtables only has per-cpu counters. I wonder what limits are
seen on such systems for ebtables.

cheers,
Bart

* Re: limited number if iptable rules on 64bit hosts
  From: Bill Rugolsky Jr. @ 2005-02-03 21:35 UTC
  To: David S. Miller; +Cc: okir, netdev, netfilter-devel, Olaf Hering

On Thu, Feb 03, 2005 at 11:00:49AM -0800, David S. Miller wrote:

> It might not help for Olaf's 128 cpu box though :-)
>
> I think we should reconsider the idea of replicating the rule itself
> per-cpu. Also, this thread should have begun with netfilter-devel at
> least on the CC:, added.

As Olaf Kirch pointed out, an entry is about 150 bytes, while the
counters are two 64-bit ints, and it looks like 'unsigned int comefrom'
is set as the chains are traversed [net/ipv4/netfilter/ip_tables.c]:

	/* Save old back ptr in next entry */
	struct ipt_entry *next
		= (void *)e + e->next_offset;
	next->comefrom
		= (void *)back - table_base;
	/* set back pointer to next entry */
	back = next;

That's 20-24 bytes of state per-entry per-cpu, for a factor of 6-7
savings, at the expense of hairing up the code slightly to do parallel
indexed access, Fortran style.

If I am understanding the mm code correctly, a single vmalloc()
allocation is currently limited to 64M on a 64-bit platform, but the
VMALLOC address range is much greater, so one might also prefer to do a
kmalloc()/vmalloc() per CPU, perhaps by creating
{vmalloc,vfree}_percpu() and using the percpu interfaces.

	Bill Rugolsky
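
The savings estimate in this message can be sketched in C by comparing the two layouts. Everything here is hypothetical and for illustration: `struct rule_percpu_state`, `replicated_bytes` and `split_bytes` are not kernel names, and the 152-byte entry size used in the note below is an assumed stand-in for the "about 150 bytes" figure (524272 / 3445 is roughly 152).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-rule, per-CPU mutable state: the two 64-bit counters
 * plus the comefrom back pointer. Typically 20 to 24 bytes with padding. */
struct rule_percpu_state {
    uint64_t packets;
    uint64_t bytes;
    unsigned int comefrom;
};

/* Current layout: every CPU holds a full copy of every rule. */
size_t replicated_bytes(size_t rules, size_t entry_size, size_t cpus)
{
    return rules * entry_size * cpus;
}

/* Proposed layout: one shared copy of the rules, plus a parallel per-CPU
 * array of mutable state indexed by rule number. */
size_t split_bytes(size_t rules, size_t entry_size, size_t cpus)
{
    return rules * entry_size
         + rules * sizeof(struct rule_percpu_state) * cpus;
}
```

For 3445 rules of 152 bytes on 128 CPUs, `replicated_bytes` returns 67025920, while `split_bytes` returns roughly a sixth of that when the state struct is 24 bytes, which is in line with the "factor of 6-7" estimate. The exact ratio depends on the platform's struct padding.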