From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <dada1@cosmosbay.com>,
Stephen Hemminger <shemminger@vyatta.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
David Miller <davem@davemloft.net>,
jeff.chua.linux@gmail.com, paulus@samba.org, mingo@elte.hu,
laijs@cn.fujitsu.com, jengelh@medozas.de, r000n@r000n.net,
linux-kernel@vger.kernel.org, netfilter-devel@vger.kernel.org,
netdev@vger.kernel.org, benh@kernel.crashing.org
Subject: Re: [PATCH] netfilter: use per-cpu spinlock and RCU (v5)
Date: Thu, 16 Apr 2009 07:47:48 -0700 [thread overview]
Message-ID: <20090416144748.GB6924@linux.vnet.ibm.com> (raw)
In-Reply-To: <49E7384B.5020707@trash.net>
On Thu, Apr 16, 2009 at 03:53:15PM +0200, Patrick McHardy wrote:
> Eric Dumazet wrote:
>> Stephen Hemminger a écrit :
>>> This is an alternative version of ip/ip6/arp tables locking using
>>> per-cpu locks. This avoids the overhead of synchronize_net() during
>>> update but still removes the expensive rwlock in earlier versions.
>>>
>>> The idea for this came from an earlier version done by Eric Dumazet.
>>> Locking is done per-cpu, the fast path locks on the current cpu
>>> and updates counters. The slow case involves acquiring the locks on
>>> all cpu's. This version uses RCU for the table->base reference
>>> but per-cpu-lock for counters.
>
>> This version is a regression over 2.6.2[0-9], because of two points
>> 1) Much more atomic ops :
>> Because of additional
>>> + spin_lock(&__get_cpu_var(ip_tables_lock));
>>> ADD_COUNTER(e->counters, ntohs(ip->tot_len), 1);
>>> + spin_unlock(&__get_cpu_var(ip_tables_lock));
>> added on each counter updates.
>> On many setups, each packet coming in or out of the machine has
>> to update between 2 to 20 rule counters. So to avoid *one* atomic ops
>> of read_unlock(), this v4 version adds 2 to 20 atomic ops...
>
> I agree, this seems to be a step backwards.
>
>> I still not see the problem between the previous version (2.6.2[0-8]) that
>> had a central
>> rwlock, that hurted performance on SMP because of cache line ping pong,
>> and the solution
>> having one rwlock per cpu.
>> We wanted to reduce the cache line ping pong first. This *is* the hurting
>> point,
>> by an order of magnitude.
>
> Dave doesn't seem to like the rwlock approach.
Well, we don't really need an rwlock, especially given that we really
don't want two "readers" incrementing the same counter concurrently.
A safer approach would be to maintain a flag in the task structure
tracking which (if any) of the per-CPU locks you hold. Also maintain
a recursion-depth counter. If the flag says you don't already hold
the lock, set it and acquire the lock. Either way, increment the
recursion-depth counter:
if (current->netfilter_lock_held != cur_cpu) {
BUG_ON(current->netfilter_lock_held != CPU_NONE);
spin_lock(per_cpu(..., cur_cpu));
current->netfilter_lock_held = cur_cpu;
}
current->netfilter_lock_nesting++;
And reverse the process to unlock:
if (--current->netfilter_lock_nesting == 0) {
spin_unlock(per_cpu(..., cur_cpu));
current->netfilter_lock_held = CPU_NONE;
}
> I don't see a way to
> do anything asynchronously like call_rcu() to improve this, so to
> bring up one of Stephens suggestions again:
>
>>> > * use on_each_cpu() somehow to do grace periood?
>
> We could use this to replace the counters, presuming it is
> indeed faster than waiting for a RCU grace period.
One way to accomplish this is to take Mathieu Desnoyers's user-level
RCU implementation and drop it into the kernel, replacing the POSIX
signal handling with on_each_cpu(), smp_call_function(), or whatever.
>> 2) Second problem : Potential OOM
>> About freeing old rules with call_rcu() and/or schedule_work(), this is
>> going
>> to OOM pretty fast on small appliances with basic firewall setups loading
>> rules one by one, as done by original topic reporter.
>> We had reports from guys using linux with 4MB of available ram (French
>> provider free.fr on
>> their applicance box), and we had to use SLAB_DESTROY_BY_RCU thing on
>> conntrack
>> to avoid OOM for their setups. We dont want to use call_rcu() and queue
>> 100 or 200 vfree().
>
> Agreed.
This is not a real problem be handled by doing a synchronize_rcu() every
so often as noted in a prior email elsewhere in this thread:
call_rcu(...);
if (++count > 50) {
synchronize_rcu();
count = 0;
}
This choice of constant would reduce the grace-period pain to 2% of the
full effect, which should be acceptable, at least if I remember the
original problem report of 0.2 seconds growing to 6.0 seconds -- this
would give you:
(6.0-0.2)/50+0.2 = .316
I would argue that 100 milliseconds is an OK penalty for a deprecated
feature. But of course the per-CPU lock approach should avoid even that
penalty, albeit at some per-packet penalty. However, my guess is that
this per-packet penalty is not measureable at the system level.
And if the penalty of a single uncontended lock -is- measureable, I will
be very quick to offer my congratulations, at least once I get my jaw
off my keyboard. ;-)
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2009-04-16 14:47 UTC|newest]
Thread overview: 215+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <Pine.LNX.4.64.0904101656190.2093@boston.corp.fedex.com>
[not found] ` <20090410095246.4fdccb56@s6510>
2009-04-11 1:25 ` iptables very slow after commit784544739a25c30637397ace5489eeb6e15d7d49 David Miller
2009-04-11 1:39 ` iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49 Linus Torvalds
2009-04-11 4:15 ` Paul E. McKenney
2009-04-11 5:14 ` Jan Engelhardt
2009-04-11 5:42 ` Paul E. McKenney
2009-04-11 6:00 ` David Miller
2009-04-11 18:12 ` Kyle Moffett
2009-04-11 18:32 ` Arkadiusz Miskiewicz
2009-04-12 0:54 ` david
2009-04-12 5:05 ` Kyle Moffett
2009-04-12 12:30 ` Harald Welte
2009-04-12 16:38 ` Jan Engelhardt
2009-04-11 15:07 ` Stephen Hemminger
2009-04-11 16:05 ` Jeff Chua
2009-04-11 17:51 ` Linus Torvalds
2009-04-11 7:08 ` Ingo Molnar
2009-04-11 15:05 ` Stephen Hemminger
2009-04-11 17:48 ` Paul E. McKenney
2009-04-12 10:54 ` Ingo Molnar
2009-04-12 11:34 ` Paul Mackerras
2009-04-12 17:31 ` Paul E. McKenney
2009-04-13 1:13 ` David Miller
2009-04-13 4:04 ` Paul E. McKenney
2009-04-13 16:53 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU Stephen Hemminger
2009-04-13 17:40 ` Eric Dumazet
2009-04-13 18:11 ` Stephen Hemminger
2009-04-13 19:06 ` Martin Josefsson
2009-04-13 19:17 ` Linus Torvalds
2009-04-13 22:24 ` Andrew Morton
2009-04-13 23:20 ` Stephen Hemminger
2009-04-13 23:26 ` Andrew Morton
2009-04-13 23:37 ` Linus Torvalds
2009-04-13 23:52 ` Ingo Molnar
2009-04-14 12:27 ` Patrick McHardy
2009-04-14 14:23 ` Eric Dumazet
2009-04-14 14:45 ` Stephen Hemminger
2009-04-14 15:49 ` Eric Dumazet
2009-04-14 16:51 ` Jeff Chua
2009-04-14 18:17 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v2) Stephen Hemminger
2009-04-14 19:28 ` Eric Dumazet
2009-04-14 21:11 ` Stephen Hemminger
2009-04-14 21:13 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) Stephen Hemminger
2009-04-14 21:40 ` Eric Dumazet
2009-04-15 10:59 ` Patrick McHardy
2009-04-15 16:31 ` Stephen Hemminger
2009-04-15 20:55 ` Stephen Hemminger
2009-04-15 21:07 ` Eric Dumazet
2009-04-15 21:55 ` Jan Engelhardt
2009-04-16 12:12 ` Patrick McHardy
2009-04-16 12:24 ` Jan Engelhardt
2009-04-16 12:31 ` Patrick McHardy
2009-04-15 21:57 ` [PATCH] netfilter: use per-cpu rwlock rather than RCU (v4) Stephen Hemminger
2009-04-15 23:48 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) David Miller
2009-04-16 0:01 ` Stephen Hemminger
2009-04-16 0:05 ` David Miller
2009-04-16 12:28 ` Patrick McHardy
2009-04-16 0:10 ` Linus Torvalds
2009-04-16 0:45 ` [PATCH] netfilter: use per-cpu spinlock and RCU (v5) Stephen Hemminger
2009-04-16 5:01 ` Eric Dumazet
2009-04-16 13:53 ` Patrick McHardy
2009-04-16 14:47 ` Paul E. McKenney [this message]
2009-04-16 16:10 ` [PATCH] netfilter: use per-cpu recursive spinlock (v6) Eric Dumazet
2009-04-16 16:20 ` Eric Dumazet
2009-04-16 16:37 ` Linus Torvalds
2009-04-16 16:59 ` Patrick McHardy
2009-04-16 17:58 ` Paul E. McKenney
2009-04-16 18:41 ` Eric Dumazet
2009-04-16 20:49 ` [PATCH[] netfilter: use per-cpu reader-writer lock (v0.7) Stephen Hemminger
2009-04-16 21:02 ` Linus Torvalds
2009-04-16 23:04 ` Ingo Molnar
2009-04-17 0:13 ` [PATCH] netfilter: use per-cpu recursive spinlock (v6) Paul E. McKenney
2009-04-16 13:11 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) Patrick McHardy
2009-04-16 22:33 ` David Miller
2009-04-16 23:49 ` Paul E. McKenney
2009-04-16 23:52 ` [PATCH] netfilter: per-cpu spin-lock with recursion (v0.8) Stephen Hemminger
2009-04-17 0:15 ` Jeff Chua
2009-04-17 5:55 ` Peter Zijlstra
2009-04-17 6:03 ` Eric Dumazet
2009-04-17 6:14 ` Eric Dumazet
2009-04-17 17:08 ` Peter Zijlstra
2009-04-17 11:17 ` Patrick McHardy
2009-04-17 1:28 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) Paul E. McKenney
2009-04-17 2:19 ` Mathieu Desnoyers
2009-04-17 5:05 ` Paul E. McKenney
2009-04-17 5:44 ` Mathieu Desnoyers
2009-04-17 14:51 ` Paul E. McKenney
2009-04-17 4:50 ` Stephen Hemminger
2009-04-17 5:08 ` Paul E. McKenney
2009-04-17 5:16 ` Eric Dumazet
2009-04-17 5:40 ` Paul E. McKenney
2009-04-17 8:07 ` David Miller
2009-04-17 15:00 ` Paul E. McKenney
2009-04-17 17:22 ` Peter Zijlstra
2009-04-17 17:32 ` Linus Torvalds
2009-04-17 6:12 ` Peter Zijlstra
2009-04-17 16:33 ` Paul E. McKenney
2009-04-17 16:51 ` Peter Zijlstra
2009-04-17 21:29 ` Paul E. McKenney
2009-04-18 9:40 ` Evgeniy Polyakov
2009-04-18 14:14 ` Paul E. McKenney
2009-04-20 17:34 ` [PATCH] netfilter: use per-cpu recursive lock (v10) Stephen Hemminger
2009-04-20 18:21 ` Paul E. McKenney
2009-04-20 18:25 ` Eric Dumazet
2009-04-20 20:32 ` Stephen Hemminger
2009-04-20 20:42 ` Stephen Hemminger
2009-04-20 21:05 ` Paul E. McKenney
2009-04-20 21:23 ` Paul Mackerras
2009-04-20 21:58 ` Paul E. McKenney
2009-04-20 22:41 ` Paul Mackerras
2009-04-20 23:01 ` [PATCH] netfilter: use per-cpu recursive lock (v11) Stephen Hemminger
2009-04-21 3:41 ` Lai Jiangshan
2009-04-21 3:56 ` Eric Dumazet
2009-04-21 4:15 ` Stephen Hemminger
2009-04-21 5:22 ` Lai Jiangshan
2009-04-21 5:45 ` Stephen Hemminger
2009-04-21 6:52 ` Lai Jiangshan
2009-04-21 8:16 ` Evgeniy Polyakov
2009-04-21 8:42 ` Lai Jiangshan
2009-04-21 8:49 ` David Miller
2009-04-21 8:55 ` Eric Dumazet
2009-04-21 9:22 ` Evgeniy Polyakov
2009-04-21 9:34 ` Lai Jiangshan
2009-04-21 5:34 ` Lai Jiangshan
2009-04-21 4:59 ` Eric Dumazet
2009-04-21 16:37 ` Paul E. McKenney
2009-04-21 5:46 ` Lai Jiangshan
2009-04-21 16:13 ` Linus Torvalds
2009-04-21 16:43 ` Stephen Hemminger
2009-04-21 16:50 ` Linus Torvalds
2009-04-21 18:02 ` Ingo Molnar
2009-04-21 18:15 ` Stephen Hemminger
2009-04-21 19:10 ` Ingo Molnar
2009-04-21 19:46 ` Eric Dumazet
2009-04-22 7:35 ` Ingo Molnar
2009-04-22 8:53 ` Eric Dumazet
2009-04-22 10:13 ` Jarek Poplawski
2009-04-22 11:26 ` Ingo Molnar
2009-04-22 11:39 ` Jarek Poplawski
2009-04-22 11:18 ` Ingo Molnar
2009-04-22 15:19 ` Linus Torvalds
2009-04-22 16:57 ` Eric Dumazet
2009-04-22 17:18 ` Linus Torvalds
2009-04-22 20:46 ` Jarek Poplawski
2009-04-22 17:48 ` Ingo Molnar
2009-04-21 21:04 ` Stephen Hemminger
2009-04-22 8:00 ` Ingo Molnar
2009-04-21 19:39 ` Ingo Molnar
2009-04-21 21:39 ` [PATCH] netfilter: use per-cpu recursive lock (v13) Stephen Hemminger
2009-04-22 4:17 ` Paul E. McKenney
2009-04-22 14:57 ` Eric Dumazet
2009-04-22 15:32 ` Linus Torvalds
2009-04-24 4:09 ` [PATCH] netfilter: use per-CPU recursive lock {XIV} Stephen Hemminger
2009-04-24 4:58 ` Eric Dumazet
2009-04-24 15:33 ` Patrick McHardy
2009-04-24 16:18 ` Stephen Hemminger
2009-04-24 20:43 ` Jarek Poplawski
2009-04-25 20:30 ` [PATCH] netfilter: iptables no lockdep is needed Stephen Hemminger
2009-04-26 8:18 ` Jarek Poplawski
2009-04-26 18:24 ` [PATCH] netfilter: use per-CPU recursive lock {XV} Eric Dumazet
2009-04-26 18:56 ` Mathieu Desnoyers
2009-04-26 21:57 ` Stephen Hemminger
2009-04-26 22:32 ` Mathieu Desnoyers
2009-04-27 17:44 ` Peter Zijlstra
2009-04-27 18:30 ` [PATCH] netfilter: use per-CPU r**ursive " Stephen Hemminger
2009-04-27 18:54 ` Ingo Molnar
2009-04-27 19:06 ` Stephen Hemminger
2009-04-27 19:46 ` Linus Torvalds
2009-04-27 19:48 ` Linus Torvalds
2009-04-27 20:36 ` Evgeniy Polyakov
2009-04-27 20:58 ` Linus Torvalds
2009-04-27 21:40 ` Stephen Hemminger
2009-04-27 22:24 ` Linus Torvalds
2009-04-27 23:01 ` Linus Torvalds
2009-04-27 23:03 ` Linus Torvalds
2009-04-28 6:58 ` Eric Dumazet
2009-04-28 11:53 ` David Miller
2009-04-28 12:40 ` Ingo Molnar
2009-04-28 13:43 ` David Miller
2009-04-28 13:52 ` Mathieu Desnoyers
2009-04-28 14:37 ` David Miller
2009-04-28 14:49 ` Mathieu Desnoyers
2009-04-28 15:00 ` David Miller
2009-04-28 16:24 ` [PATCH] netfilter: revised locking for x_tables Stephen Hemminger
2009-04-28 16:50 ` Linus Torvalds
2009-04-28 16:55 ` Linus Torvalds
2009-04-29 5:37 ` David Miller
[not found] ` <20090428.223708.168741998.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2009-04-30 3:26 ` Jeff Chua
[not found] ` <b6a2187b0904292026k7d6107a7vcdc761d4149f40aa-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-04-30 3:31 ` David Miller
2009-05-01 8:38 ` [PATCH] netfilter: use likely() in xt_info_rdlock_bh() Eric Dumazet
2009-05-01 16:10 ` David Miller
2009-04-28 15:42 ` [PATCH] netfilter: use per-CPU r**ursive lock {XV} Paul E. McKenney
2009-04-28 17:35 ` Christoph Lameter
2009-04-28 15:09 ` Linus Torvalds
2009-04-27 23:32 ` Linus Torvalds
2009-04-28 7:41 ` Peter Zijlstra
2009-04-28 14:22 ` Paul E. McKenney
2009-04-28 7:42 ` Jan Engelhardt
2009-04-26 19:31 ` [PATCH] netfilter: use per-CPU recursive " Mathieu Desnoyers
2009-04-26 20:55 ` Eric Dumazet
2009-04-26 21:39 ` Mathieu Desnoyers
2009-04-21 18:34 ` [PATCH] netfilter: use per-cpu recursive lock (v11) Paul E. McKenney
2009-04-21 20:14 ` Linus Torvalds
2009-04-20 23:44 ` [PATCH] netfilter: use per-cpu recursive lock (v10) Paul E. McKenney
2009-04-16 0:02 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) Linus Torvalds
2009-04-16 6:26 ` Eric Dumazet
2009-04-16 14:33 ` Paul E. McKenney
2009-04-15 3:23 ` David Miller
2009-04-14 17:19 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU Stephen Hemminger
2009-04-11 15:50 ` iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49 Stephen Hemminger
2009-04-11 17:43 ` Paul E. McKenney
2009-04-11 18:57 ` Linus Torvalds
2009-04-12 0:34 ` Paul E. McKenney
2009-04-12 7:23 ` Evgeniy Polyakov
2009-04-12 16:06 ` Stephen Hemminger
2009-04-12 17:30 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090416144748.GB6924@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=benh@kernel.crashing.org \
--cc=dada1@cosmosbay.com \
--cc=davem@davemloft.net \
--cc=jeff.chua.linux@gmail.com \
--cc=jengelh@medozas.de \
--cc=kaber@trash.net \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=paulus@samba.org \
--cc=r000n@r000n.net \
--cc=shemminger@vyatta.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).