From: Jesper Dangaard Brouer <brouer@redhat.com>
To: netfilter-devel@vger.kernel.org,
Eric Dumazet <eric.dumazet@gmail.com>,
Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>,
netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Florian Westphal <fw@strlen.de>,
"Patrick McHardy" <kaber@trash.net>
Subject: [nf-next PATCH V2 0/5] netfilter: conntrack: optimization, remove central spinlock
Date: Fri, 28 Feb 2014 13:16:15 +0100
Message-ID: <20140228121529.20347.38300.stgit@dragon>
In-Reply-To: <20140227182220.25907.12758.stgit@dragon>
This patchset changes the conntrack locking and provides a huge
performance improvement.
This patchset is based upon Eric Dumazet's proposed patch:
http://thread.gmane.org/gmane.linux.network/268758/focus=47306
I have, in agreement with Eric Dumazet, taken over this patch (and
turned it into an entire patchset).
Primary focus is to remove the central spinlock nf_conntrack_lock.
This requires several steps to be achieved.
Patch01: Trivial cleanups
Patch02: Moves the "special" dying/unconfirmed/template lists to use a
per cpu spinlock.
Patch03: Prepares for patch04 by addressing a race condition. This is
a separate patch for the reviewers' sake.
Patch04: Separates expect locking from nf_conntrack_lock. The expect
list is small (default max 256), thus it just gets a single lock.
Patch05: Finally removes nf_conntrack_lock, and instead uses an
array of hashed spinlocks to protect insertions/deletions of
conntracks into the hash table, while still allowing dynamic
resizing of the hash table.
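To illustrate the idea behind patch05, here is a minimal userspace
sketch of an array of hashed locks protecting a hash table. It is not
the patch code: the names (toy_spinlock, bucket_locks, double_lock,
conn_insert) and the sizes NLOCKS/NBUCKETS are hypothetical, and a
C11 atomic_flag stands in for the kernel spinlock. It assumes each
connection is linked into two hash chains (one per direction), so an
insertion needs the locks for two buckets. Taking them in ascending
index order is one common way to avoid an ABBA deadlock; it is
presumably what the nf_conntrack_double_lock symbol in the perf
output further down refers to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NLOCKS   8u   /* hypothetical lock array size */
#define NBUCKETS 64u  /* hypothetical table size, a multiple of NLOCKS */

/* Toy test-and-set spinlock standing in for a kernel spinlock. */
struct toy_spinlock { atomic_flag held; };

struct node { struct node *next; };
struct conn { struct node dir[2]; };  /* one hash node per direction */

static struct toy_spinlock bucket_locks[NLOCKS];
static struct node *buckets[NBUCKETS];

static void locks_init(void)
{
	for (size_t i = 0; i < NLOCKS; i++)
		atomic_flag_clear(&bucket_locks[i].held);
}

static void toy_spin_lock(struct toy_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Every bucket is covered by exactly one lock; many buckets share a lock. */
static unsigned int lock_index(unsigned int bucket)
{
	return bucket % NLOCKS;
}

/* Take the locks for two buckets in ascending index order, so that two
 * threads locking the same pair in opposite argument order cannot deadlock. */
static void double_lock(unsigned int b1, unsigned int b2)
{
	unsigned int lo = lock_index(b1), hi = lock_index(b2);

	if (lo > hi) {
		unsigned int tmp = lo;
		lo = hi;
		hi = tmp;
	}
	toy_spin_lock(&bucket_locks[lo]);
	if (lo != hi)  /* both buckets may share a single lock */
		toy_spin_lock(&bucket_locks[hi]);
}

static void double_unlock(unsigned int b1, unsigned int b2)
{
	unsigned int l1 = lock_index(b1), l2 = lock_index(b2);

	toy_spin_unlock(&bucket_locks[l1]);
	if (l1 != l2)
		toy_spin_unlock(&bucket_locks[l2]);
}

/* Link both directions of a connection into their hash chains. */
static void conn_insert(struct conn *c, unsigned int b_orig, unsigned int b_reply)
{
	assert(b_orig < NBUCKETS && b_reply < NBUCKETS);
	double_lock(b_orig, b_reply);
	c->dir[0].next = buckets[b_orig];
	buckets[b_orig] = &c->dir[0];
	c->dir[1].next = buckets[b_reply];
	buckets[b_reply] = &c->dir[1];
	double_unlock(b_orig, b_reply);
}
```

A caller would run locks_init() once and then call conn_insert() from
any thread. Insertions that hash to different locks no longer
serialize on a single central lock. The sketch deliberately ignores
hash table resizing, which the patchset still has to support.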
Testing
-------
For expectations I've mostly tested the FTP helper module
(nf_conntrack_ftp), using the commands:
  for x in `seq 1 300`; do \
    echo $x; \
    echo -e "USER anonymous\nPASS nothing\nPASV" | nc 192.168.42.129 21; \
  done
  wget ftp://192.168.42.129/pub/delete.me.4k -O /dev/null
For overload/DoS testing, I've primarily done SYN-flood attack testing.
Results on a 24-core E5-2695v2 (ES) with a 10Gbit/s ixgbe NIC (traffic
generated with the tool trafgen):
Base kernel : New 810.405 conntrack/sec
Fixed kernel: New 2.233.876 conntrack/sec
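For reference, the relative gain implied by these two figures (reading
the dots as thousands separators) works out as:

```latex
\[
\frac{2\,233\,876}{810\,405} \approx 2.76
\]
```

That is roughly 2.8 times as many new conntrack entries per second
under the same SYN-flood.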
Note that other flood attacks (SYN+ACK or ACK) can easily be deflected using:
# iptables -A INPUT -m state --state INVALID -j DROP
# sysctl -w net/netfilter/nf_conntrack_tcp_loose=0
E.g. this machine can deflect 6.481.463 "invalid" conntrack/sec (from
an ACK-flood).
Perf data:
----------
The nf_conntrack_lock suffers from huge contention on current
generation servers (8 or more cores/threads). The data below was
collected under SYN-flooding (without a listening socket).
The lock contention is very visible in perf on a base kernel:
- 72.56% ksoftirqd/6 [kernel.kallsyms] [k] _raw_spin_lock_bh
   - _raw_spin_lock_bh
      + 25.33% init_conntrack
      + 24.86% nf_ct_delete_from_lists
      + 24.62% __nf_conntrack_confirm
      + 24.38% destroy_conntrack
      + 0.70% tcp_packet
+ 2.21% ksoftirqd/6 [kernel.kallsyms] [k] fib_table_lookup
+ 1.15% ksoftirqd/6 [kernel.kallsyms] [k] __slab_free
+ 0.77% ksoftirqd/6 [kernel.kallsyms] [k] inet_getpeer
+ 0.70% ksoftirqd/6 [nf_conntrack] [k] nf_ct_delete
+ 0.55% ksoftirqd/6 [ip_tables] [k] ipt_do_table
Perf after the patchset (SYN-flood attack):
+ 9.62% ksoftirqd/6 [kernel.kallsyms] [k] fib_table_lookup
+ 3.78% ksoftirqd/6 [kernel.kallsyms] [k] __slab_free
+ 2.71% ksoftirqd/6 [kernel.kallsyms] [k] inet_getpeer
+ 2.55% ksoftirqd/6 [kernel.kallsyms] [k] check_leaf
+ 2.38% ksoftirqd/6 [ip_tables] [k] ipt_do_table
+ 2.06% ksoftirqd/6 [kernel.kallsyms] [k] __slab_alloc
+ 1.94% ksoftirqd/6 [nf_conntrack] [k] __nf_conntrack_alloc
- 1.94% ksoftirqd/6 [kernel.kallsyms] [k] _raw_spin_lock
   - _raw_spin_lock
      + 90.32% nf_conntrack_double_lock
      + 3.61% get_partial_node
      + 1.81% nf_ct_delete_from_lists
      + 1.68% __nf_conntrack_confirm
      + 1.03% sch_direct_xmit
      + 0.52% scheduler_tick
+ 1.86% ksoftirqd/6 [kernel.kallsyms] [k] nf_iterate
+ 1.80% ksoftirqd/6 [nf_conntrack] [k] init_conntrack
+ 1.77% ksoftirqd/6 [kernel.kallsyms] [k] __neigh_event_send
- 1.70% ksoftirqd/6 [kernel.kallsyms] [k] _raw_spin_lock_bh
   - _raw_spin_lock_bh
      + 32.55% nf_ct_del_from_dying_or_unconfirmed_list
      + 25.33% init_conntrack
      + 19.88% tcp_packet
      + 17.97% nf_ct_delete_from_lists
      + 1.62% nf_conntrack_in
      + 1.33% ixgbe_poll
      + 0.74% destroy_conntrack
+ 1.64% ksoftirqd/6 [nf_conntrack] [k] hash_conntrack_raw
+ 1.58% ksoftirqd/6 [kernel.kallsyms] [k] __netif_receive_skb_core
+ 1.51% ksoftirqd/6 [nf_conntrack] [k] __nf_conntrack_find_get
+ 1.48% ksoftirqd/6 [kernel.kallsyms] [k] __cmpxchg_double_slab
+ 1.46% ksoftirqd/6 [nf_conntrack] [k] nf_conntrack_in
+ 1.45% ksoftirqd/6 [kernel.kallsyms] [k] __local_bh_enable_ip
---
Jesper Dangaard Brouer (5):
netfilter: conntrack: remove central spinlock nf_conntrack_lock
netfilter: conntrack: seperate expect locking from nf_conntrack_lock
netfilter: avoid race with exp->master ct
netfilter: conntrack: spinlock per cpu to protect special lists.
netfilter: trivial code cleanup and doc changes
include/net/netfilter/nf_conntrack.h | 11 +
include/net/netfilter/nf_conntrack_core.h | 9 +
include/net/netns/conntrack.h | 13 +
net/netfilter/nf_conntrack_core.c | 432 ++++++++++++++++++++---------
net/netfilter/nf_conntrack_expect.c | 34 ++
net/netfilter/nf_conntrack_h323_main.c | 4
net/netfilter/nf_conntrack_helper.c | 37 ++
net/netfilter/nf_conntrack_netlink.c | 128 +++++----
net/netfilter/nf_conntrack_sip.c | 8 -
9 files changed, 459 insertions(+), 217 deletions(-)
--