From: Eric Dumazet <dada1@cosmosbay.com>
To: Stephen Hemminger <shemminger@vyatta.com>
Cc: David Miller <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: [PATCH 6/6] netfilter: convert x_tables to use RCU
Date: Sat, 31 Jan 2009 18:27:45 +0100
Message-ID: <49848A11.7020908@cosmosbay.com>
In-Reply-To: <20090130215729.658203821@vyatta.com>
Stephen Hemminger wrote:
> Replace existing reader/writer lock with Read-Copy-Update to
> eliminate the overhead of a read lock on each incoming packet.
> This should reduce the overhead of iptables especially on SMP
> systems.
>
> The previous code used a reader-writer lock for two purposes.
> The first was to ensure that the xt_table_info reference was not in
> process of being changed. Since xt_table_info is only freed via one
> routine, it was a direct conversion to RCU.
>
> The other use of the reader-writer lock was to block changes
> to counters while they were being read. This synchronization was
> fixed by the previous patch. But still need to make sure table info
> isn't going away.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>
>
> ---
> include/linux/netfilter/x_tables.h | 10 ++++++--
> net/ipv4/netfilter/arp_tables.c | 16 ++++++-------
> net/ipv4/netfilter/ip_tables.c | 27 +++++++++++-----------
> net/ipv6/netfilter/ip6_tables.c | 16 ++++++-------
> net/netfilter/x_tables.c | 45 ++++++++++++++++++++++++++-----------
> 5 files changed, 70 insertions(+), 44 deletions(-)
OK, I have one last comment, Stephen.
(I tested your patches on my dev machine and got nice results.)
In net/ipv4/netfilter/ip_tables.c,
get_counters() can be quite slow on large firewalls. It is
called from alloc_counters() with BH disabled, but also from __do_replace()
without BH disabled.
So get_counters() first runs set_entry_to_counter() over the current CPU's
entries, but it might be interrupted while doing so, and we get bad results...
The solution is to always use seqcount_t protection, with no BH disabling at all.
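
To make the seqcount idea concrete, here is a minimal userspace sketch of a sequence-counter read/retry loop, written with C11 atomics. It is not the kernel's seqcount_t API (the kernel analogues are read_seqcount_begin() and read_seqcount_retry()); the struct and function names below are hypothetical, and a single writer per counter is assumed, as with per-CPU data.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPU's counter pair, guarded by a
 * sequence number that is odd while an update is in progress. */
struct seq_counter {
	atomic_uint seq;
	_Atomic uint64_t bytes;
	_Atomic uint64_t packets;
};

/* Writer side (assumed: exactly one writer per counter). */
static void counter_add(struct seq_counter *c, uint64_t bytes, uint64_t packets)
{
	unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);

	/* Mark the update as in progress (seq becomes odd). */
	atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&c->bytes,
		atomic_load_explicit(&c->bytes, memory_order_relaxed) + bytes,
		memory_order_relaxed);
	atomic_store_explicit(&c->packets,
		atomic_load_explicit(&c->packets, memory_order_relaxed) + packets,
		memory_order_relaxed);

	/* Publish the update (seq becomes even again). */
	atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side: retry until a consistent (bytes, packets) pair is seen.
 * The reader never blocks the writer and needs no BH disabling. */
static void counter_read(struct seq_counter *c, uint64_t *bytes, uint64_t *packets)
{
	unsigned int s0, s1;

	do {
		s0 = atomic_load_explicit(&c->seq, memory_order_acquire);
		*bytes = atomic_load_explicit(&c->bytes, memory_order_relaxed);
		*packets = atomic_load_explicit(&c->packets, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&c->seq, memory_order_relaxed);
	} while ((s0 & 1) || s0 != s1);
}
```

The reader only guarantees that the two fields of one counter belong together; it says nothing about consistency across different counters or CPUs.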
Then, I wonder if all this is valid, since alloc_counters() might be expected
to take an atomic snapshot of all counters. This was possible because
of a write_lock_bh(), but does it make any sense to block all packets for
such a long time? If this is a strong requirement, I am afraid we have
to make other changes...
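
The accumulation scheme the patch below moves to (start from a zeroed buffer, then add every CPU's copy, with no special case for the current CPU) can be sketched in plain userspace C as follows. The struct, the flat array layout, and the function name are assumptions made for illustration; the kernel code walks rule entries with IPT_ENTRY_ITERATE instead.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-rule byte/packet counter pair. */
struct demo_counters {
	uint64_t bcnt;
	uint64_t pcnt;
};

/*
 * Sum every CPU's private counters into a freshly zeroed array.
 * percpu is laid out as percpu[cpu * nrules + rule] (an assumption of
 * this sketch). Returns NULL on allocation failure; the caller frees
 * the result.
 */
static struct demo_counters *
sum_counters(const struct demo_counters *percpu, unsigned int ncpus,
	     size_t nrules)
{
	/* calloc() plays the role of the zeroing allocation in the patch. */
	struct demo_counters *total = calloc(nrules, sizeof(*total));
	unsigned int cpu;
	size_t i;

	if (!total)
		return NULL;

	/* Every CPU is treated the same way: always add, never set. */
	for (cpu = 0; cpu < ncpus; cpu++) {
		for (i = 0; i < nrules; i++) {
			total[i].bcnt += percpu[cpu * nrules + i].bcnt;
			total[i].pcnt += percpu[cpu * nrules + i].pcnt;
		}
	}
	return total;
}
```

Because each CPU's copy is read at a different moment, the totals are not an atomic snapshot across CPUs, which is exactly the open question raised above.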
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
net/ipv4/netfilter/ip_tables.c | 25 +++----------------------
1 files changed, 3 insertions(+), 22 deletions(-)
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 7c50e2e..d208b25 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -900,25 +900,8 @@ get_counters(const struct xt_table_info *t,
{
unsigned int cpu;
unsigned int i;
- unsigned int curcpu;
-
- /* Instead of clearing (by a previous call to memset())
- * the counters and using adds, we set the counters
- * with data used by 'current' CPU
- * We dont care about preemption here.
- */
- curcpu = raw_smp_processor_id();
-
- i = 0;
- IPT_ENTRY_ITERATE(t->entries[curcpu],
- t->size,
- set_entry_to_counter,
- counters,
- &i);
for_each_possible_cpu(cpu) {
- if (cpu == curcpu)
- continue;
i = 0;
IPT_ENTRY_ITERATE(t->entries[cpu],
t->size,
@@ -939,15 +922,12 @@ static struct xt_counters * alloc_counters(struct xt_table *table)
(other than comefrom, which userspace doesn't care
about). */
countersize = sizeof(struct xt_counters) * private->number;
- counters = vmalloc_node(countersize, numa_node_id());
+ counters = __vmalloc(countersize, GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL);
if (counters == NULL)
return ERR_PTR(-ENOMEM);
- /* First, sum counters... */
- local_bh_disable();
get_counters(private, counters);
- local_bh_enable();
return counters;
}
@@ -1209,7 +1189,8 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
void *loc_cpu_old_entry;
ret = 0;
- counters = vmalloc(num_counters * sizeof(struct xt_counters));
+ counters = __vmalloc(num_counters * sizeof(struct xt_counters),
+ GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL);
if (!counters) {
ret = -ENOMEM;
goto out;