From: Eric Dumazet <dada1@cosmosbay.com>
To: Stephen Hemminger <shemminger@vyatta.com>
Cc: David Miller <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: [PATCH 5/5] netfilter: convert x_tables to use RCU
Date: Fri, 30 Jan 2009 07:53:09 +0100
Message-ID: <4982A3D5.3030701@cosmosbay.com>
In-Reply-To: <20090129151624.37dce05e@extreme>

Stephen Hemminger wrote:
> On Fri, 30 Jan 2009 00:04:16 +0100
> Eric Dumazet <dada1@cosmosbay.com> wrote:
> 
>> Stephen Hemminger wrote:
>>> Replace existing reader/writer lock with Read-Copy-Update to
>>> eliminate the overhead of a read lock on each incoming packet.
>>> This should reduce the overhead of iptables especially on SMP
>>> systems.
>>>
>>> The previous code used a reader-writer lock for two purposes.
>>> The first was to ensure that the xt_table_info reference was not in
>>> process of being changed. Since xt_table_info is only freed via one
>>> routine, it was a direct conversion to RCU.
>>>
>>> The other use of the reader-writer lock was to block changes
>>> to counters while they were being read. This synchronization was
>>> fixed by the previous patch, but we still need to make sure table info
>>> isn't going away.
>>>
>>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>>>
>>>
>>> ---
>>>  include/linux/netfilter/x_tables.h |   10 ++++++-
>>>  net/ipv4/netfilter/arp_tables.c    |   12 ++++-----
>>>  net/ipv4/netfilter/ip_tables.c     |   12 ++++-----
>>>  net/ipv6/netfilter/ip6_tables.c    |   12 ++++-----
>>>  net/netfilter/x_tables.c           |   48 ++++++++++++++++++++++++++-----------
>>>  5 files changed, 60 insertions(+), 34 deletions(-)
>>>
>>> --- a/include/linux/netfilter/x_tables.h	2009-01-28 22:04:39.316517913 -0800
>>> +++ b/include/linux/netfilter/x_tables.h	2009-01-28 22:14:54.648490491 -0800
>>> @@ -352,8 +352,8 @@ struct xt_table
>>>  	/* What hooks you will enter on */
>>>  	unsigned int valid_hooks;
>>>  
>>> -	/* Lock for the curtain */
>>> -	rwlock_t lock;
>>> +	/* Lock for curtain */
>>> +	spinlock_t lock;
>>>  
>>>  	/* Man behind the curtain... */
>>>  	struct xt_table_info *private;
>>> @@ -386,6 +386,12 @@ struct xt_table_info
>>>  	/* Secret compartment */
>>>  	seqcount_t *seq;
>>>  
>>> +	/* For the dustman... */
>>> +	union {
>>> +		struct rcu_head rcu;
>>> +		struct work_struct work;
>>> +	};
>>> +
>>>  	/* ipt_entry tables: one per CPU */
>>>  	/* Note : this field MUST be the last one, see XT_TABLE_INFO_SZ */
>>>  	char *entries[1];
>>> --- a/net/ipv4/netfilter/arp_tables.c	2009-01-28 22:13:16.423490077 -0800
>>> +++ b/net/ipv4/netfilter/arp_tables.c	2009-01-28 22:14:54.648490491 -0800
>>> @@ -238,8 +238,8 @@ unsigned int arpt_do_table(struct sk_buf
>>>  	indev = in ? in->name : nulldevname;
>>>  	outdev = out ? out->name : nulldevname;
>>>  
>>> -	read_lock_bh(&table->lock);
>>> -	private = table->private;
>>> +	rcu_read_lock_bh();
>>> +	private = rcu_dereference(table->private);
>>>  	table_base = (void *)private->entries[smp_processor_id()];
>>>  	seq = per_cpu_ptr(private->seq, smp_processor_id());
>>>  	e = get_entry(table_base, private->hook_entry[hook]);
>>> @@ -315,7 +315,7 @@ unsigned int arpt_do_table(struct sk_buf
>>>  			e = (void *)e + e->next_offset;
>>>  		}
>>>  	} while (!hotdrop);
>>> -	read_unlock_bh(&table->lock);
>>> +	rcu_read_unlock_bh();
>>>  
>>>  	if (hotdrop)
>>>  		return NF_DROP;
>>> @@ -1163,8 +1163,8 @@ static int do_add_counters(struct net *n
>>>  		goto free;
>>>  	}
>>>  
>>> -	write_lock_bh(&t->lock);
>>> -	private = t->private;
>>> +	rcu_read_lock_bh();
>>> +	private = rcu_dereference(t->private);
>>>  	if (private->number != num_counters) {
>>>  		ret = -EINVAL;
>>>  		goto unlock_up_free;
>>> @@ -1179,7 +1179,7 @@ static int do_add_counters(struct net *n
>>>  			   paddc,
>>>  			   &i);
>>>   unlock_up_free:
>>> -	write_unlock_bh(&t->lock);
>>> +	rcu_read_unlock_bh();
>>>  	xt_table_unlock(t);
>>>  	module_put(t->me);
>>>   free:
>>> --- a/net/ipv4/netfilter/ip_tables.c	2009-01-28 22:06:10.596739805 -0800
>>> +++ b/net/ipv4/netfilter/ip_tables.c	2009-01-28 22:14:54.648490491 -0800
>>> @@ -348,9 +348,9 @@ ipt_do_table(struct sk_buff *skb,
>>>  	mtpar.family  = tgpar.family = NFPROTO_IPV4;
>>>  	tgpar.hooknum = hook;
>>>  
>>> -	read_lock_bh(&table->lock);
>>> +	rcu_read_lock_bh();
>>>  	IP_NF_ASSERT(table->valid_hooks & (1 << hook));
>>> -	private = table->private;
>>> +	private = rcu_dereference(table->private);
>>>  	table_base = (void *)private->entries[smp_processor_id()];
>>>  	seq = per_cpu_ptr(private->seq, smp_processor_id());
>>>  	e = get_entry(table_base, private->hook_entry[hook]);
>>> @@ -449,7 +449,7 @@ ipt_do_table(struct sk_buff *skb,
>>>  		}
>>>  	} while (!hotdrop);
>>>  
>>> -	read_unlock_bh(&table->lock);
>>> +	rcu_read_unlock_bh();
>>>  
>>>  #ifdef DEBUG_ALLOW_ALL
>>>  	return NF_ACCEPT;
>>> @@ -1408,8 +1408,8 @@ do_add_counters(struct net *net, void __
>>>  		goto free;
>>>  	}
>>>  
>>> -	write_lock_bh(&t->lock);
>>> -	private = t->private;
>>> +	rcu_read_lock_bh();
>>> +	private = rcu_dereference(t->private);
>> I feel a little bit nervous seeing a write_lock_bh() changed to an rcu_read_lock()
> 
> In fact, it is only updating entries on the current CPU

Yes, as is done in ipt_do_table() ;)

The fact is that we need to tell other threads, running on other CPUs, that an
update of our entries is in progress.
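
To make the concern concrete, here is a minimal userspace sketch of the
sequence-counter idea, written with C11 atomics rather than kernel code. The
names demo_counters, demo_init, demo_add and demo_try_read are hypothetical
and do not appear in the patch. In the kernel the equivalent brackets would
presumably be write_seqcount_begin()/write_seqcount_end() on the writer side
and read_seqcount_begin()/read_seqcount_retry() on the reader side. The
writer is still local to one CPU, but it marks its update so that a reader
on another CPU can tell a stable pair of counters from a half-updated one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPU's copy of a rule's counters. */
struct demo_counters {
	atomic_uint seq;           /* even: stable, odd: update in progress */
	_Atomic uint64_t bytes;
	_Atomic uint64_t packets;
};

static void demo_init(struct demo_counters *c)
{
	atomic_init(&c->seq, 0);
	atomic_init(&c->bytes, 0);
	atomic_init(&c->packets, 0);
}

/* Writer: only the owning CPU calls this, so there is a single writer. */
static void demo_add(struct demo_counters *c, uint64_t bytes, uint64_t packets)
{
	unsigned int old = atomic_fetch_add(&c->seq, 1);   /* seq becomes odd */

	assert((old & 1) == 0);    /* no other update may be in flight */
	atomic_fetch_add(&c->bytes, bytes);
	atomic_fetch_add(&c->packets, packets);
	atomic_fetch_add(&c->seq, 1);                      /* seq even again */
}

/* Reader: returns 1 with a consistent snapshot, 0 if it must retry. */
static int demo_try_read(struct demo_counters *c, uint64_t *bytes,
			 uint64_t *packets)
{
	unsigned int start = atomic_load(&c->seq);

	if (start & 1)
		return 0;          /* writer is in the middle of an update */
	*bytes = atomic_load(&c->bytes);
	*packets = atomic_load(&c->packets);
	return atomic_load(&c->seq) == start;
}
```

A reader such as the iptables -L path would call demo_try_read() in a loop
until it returns 1. Without the two sequence increments in demo_add(), the
reader has no way to notice that bytes was updated but packets was not yet.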

Let me check whether your v4 and its xt_counters abstraction already solve this problem.

> 
>> Also, add_counter_to_entry() is not using seqcount protection, so another thread
>> doing an iptables -L in parallel with this thread may get corrupted counters.
> add_counter_to_entry is local to the current CPU.
> 
> 
>> (With write_lock_bh(), this corruption could not occur)
>>
>>
> --

