From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>,
Pablo Neira Ayuso <pablo@netfilter.org>,
"netfilter-devel@vger.kernel.org"
<netfilter-devel@vger.kernel.org>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: [net-next PATCH 0/5] netfilter: conntrack: optimization, remove central spinlock
Date: Thu, 27 Feb 2014 18:15:22 +0100
Message-ID: <20140227181522.09549165@redhat.com>
In-Reply-To: <20140227163950.24668.55934.stgit@dragon>
Hi Pablo,
This should obviously have been for nf-next, and I also forgot to cc
netfilter-devel@vger.kernel.org ... do you want me to repost?
--Jesper
On Thu, 27 Feb 2014 17:41:10 +0100 Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> This patchset changes the conntrack locking and provides a huge
> performance improvement.
>
> This patchset is based upon Eric Dumazet's proposed patch:
> http://thread.gmane.org/gmane.linux.network/268758/focus=47306
> I have, in agreement with Eric Dumazet, taken over this patch (and
> turned it into an entire patchset).
>
> The primary focus is to remove the central spinlock nf_conntrack_lock.
> This requires several steps to be achieved.
>
> Patch01: Trivial cleanups
>
> Patch02: Moves the "special" dying/unconfirmed/template lists to use a
> per cpu spinlock.
>
> Patch03: Prepares for patch04 by addressing a race
> condition. This is a separate patch for the reviewers' sake.
>
> Patch04: Separates expect locking from nf_conntrack_lock. The expect
> list is small (default max 256), thus it just gets a single lock.
>
> Patch05: Finally removes nf_conntrack_lock, and instead uses an
> array of hashed spinlocks to protect insertions/deletions of
> conntracks into the hash table, while still allowing dynamic
> resizing of the hash table.
>
>
> Testing
> -------
> For expectations I've mostly tested the FTP nf_conntrack_ftp
> helper module, using the commands:
>
> for x in `seq 1 300`; do \
> echo $x; \
> echo -e "USER anonymous\nPASS nothing\nPASV" | nc 192.168.42.129 21; \
> done
>
> wget ftp://192.168.42.129/pub/delete.me.4k -O /dev/null
>
> For overload/DoS testing, I've primarily done SYN-flood attack testing.
> Results on a 24-core E5-2695v2(ES) with 10Gbit/s ixgbe (with tool trafgen)
>
> Base kernel : New 810.405 conntrack/sec
> Fixed kernel: New 2.233.876 conntrack/sec
>
> Notice that other flood attacks (SYN+ACK or ACK) can easily be deflected using:
> # iptables -A INPUT -m state --state INVALID -j DROP
> # sysctl -w net/netfilter/nf_conntrack_tcp_loose=0
>
> E.g. this machine can deflect 6.481.463 "invalid" conntrack/sec (from
> an ACK-flood).
>
> Perf data:
> ----------
> The nf_conntrack_lock suffers from huge contention on current
> generation servers (8 or more cores/threads). Data from under
> SYN-flooding (without a listen socket):
>
> The lock contention is very "visible" in perf on a base kernel:
>
> - 72.56% ksoftirqd/6 [kernel.kallsyms] [k] _raw_spin_lock_bh
> - _raw_spin_lock_bh
> + 25.33% init_conntrack
> + 24.86% nf_ct_delete_from_lists
> + 24.62% __nf_conntrack_confirm
> + 24.38% destroy_conntrack
> + 0.70% tcp_packet
> + 2.21% ksoftirqd/6 [kernel.kallsyms] [k] fib_table_lookup
> + 1.15% ksoftirqd/6 [kernel.kallsyms] [k] __slab_free
> + 0.77% ksoftirqd/6 [kernel.kallsyms] [k] inet_getpeer
> + 0.70% ksoftirqd/6 [nf_conntrack] [k] nf_ct_delete
> + 0.55% ksoftirqd/6 [ip_tables] [k] ipt_do_table
>
> Perf after the patchset (SYN-flood attack):
>
> + 9.62% ksoftirqd/6 [kernel.kallsyms] [k] fib_table_lookup
> + 3.78% ksoftirqd/6 [kernel.kallsyms] [k] __slab_free
> + 2.71% ksoftirqd/6 [kernel.kallsyms] [k] inet_getpeer
> + 2.55% ksoftirqd/6 [kernel.kallsyms] [k] check_leaf
> + 2.38% ksoftirqd/6 [ip_tables] [k] ipt_do_table
> + 2.06% ksoftirqd/6 [kernel.kallsyms] [k] __slab_alloc
> + 1.94% ksoftirqd/6 [nf_conntrack] [k] __nf_conntrack_alloc
> - 1.94% ksoftirqd/6 [kernel.kallsyms] [k] _raw_spin_lock
> - _raw_spin_lock
> + 90.32% nf_conntrack_double_lock
> + 3.61% get_partial_node
> + 1.81% nf_ct_delete_from_lists
> + 1.68% __nf_conntrack_confirm
> + 1.03% sch_direct_xmit
> + 0.52% scheduler_tick
> + 1.86% ksoftirqd/6 [kernel.kallsyms] [k] nf_iterate
> + 1.80% ksoftirqd/6 [nf_conntrack] [k] init_conntrack
> + 1.77% ksoftirqd/6 [kernel.kallsyms] [k] __neigh_event_send
> - 1.70% ksoftirqd/6 [kernel.kallsyms] [k] _raw_spin_lock_bh
> - _raw_spin_lock_bh
> + 32.55% nf_ct_del_from_dying_or_unconfirmed_list
> + 25.33% init_conntrack
> + 19.88% tcp_packet
> + 17.97% nf_ct_delete_from_lists
> + 1.62% nf_conntrack_in
> + 1.33% ixgbe_poll
> + 0.74% destroy_conntrack
> + 1.64% ksoftirqd/6 [nf_conntrack] [k] hash_conntrack_raw
> + 1.58% ksoftirqd/6 [kernel.kallsyms] [k] __netif_receive_skb_core
> + 1.51% ksoftirqd/6 [nf_conntrack] [k] __nf_conntrack_find_get
> + 1.48% ksoftirqd/6 [kernel.kallsyms] [k] __cmpxchg_double_slab
> + 1.46% ksoftirqd/6 [nf_conntrack] [k] nf_conntrack_in
> + 1.45% ksoftirqd/6 [kernel.kallsyms] [k] __local_bh_enable_ip
>
>
> ---
>
> Jesper Dangaard Brouer (5):
> netfilter: conntrack: remove central spinlock nf_conntrack_lock
> netfilter: conntrack: seperate expect locking from nf_conntrack_lock
> netfilter: avoid race with exp->master ct
> netfilter: conntrack: spinlock per cpu to protect special lists.
> netfilter: trivial code cleanup and doc changes
>
>
> include/net/netfilter/nf_conntrack.h | 11 +
> include/net/netfilter/nf_conntrack_core.h | 9 +
> include/net/netns/conntrack.h | 13 +
> net/netfilter/nf_conntrack_core.c | 427 ++++++++++++++++++++---------
> net/netfilter/nf_conntrack_expect.c | 36 ++
> net/netfilter/nf_conntrack_h323_main.c | 4
> net/netfilter/nf_conntrack_helper.c | 37 ++-
> net/netfilter/nf_conntrack_netlink.c | 128 +++++----
> net/netfilter/nf_conntrack_sip.c | 8 -
> 9 files changed, 456 insertions(+), 217 deletions(-)
>
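The hashed-lock idea described for Patch05 above can be illustrated with a small user-space sketch. This is not code from the patchset: it uses pthread mutexes instead of kernel spinlocks, it ignores hash table resizing, and every name in it (NBUCKETS, NLOCKS, table_double_lock and so on) is hypothetical. It assumes each entry is linked into two buckets (one per direction), and it takes the two covering locks in ascending index order, which is the usual way to avoid an ABBA deadlock and is an assumption here rather than something stated in the cover letter.

```c
/* User-space sketch: a hash table protected by an array of hashed locks.
 * All identifiers are hypothetical; this is not the kernel implementation. */
#include <pthread.h>
#include <stddef.h>

#define NBUCKETS 4096u  /* number of hash buckets (assumed value) */
#define NLOCKS   1024u  /* number of locks; several buckets share one lock */

struct node {
    struct node *next;
};

/* Each entry is linked into two buckets, one per direction. */
struct entry {
    struct node dir[2];
};

static struct node *buckets[NBUCKETS];
static pthread_mutex_t locks[NLOCKS];

void table_init(void)
{
    for (size_t i = 0; i < NLOCKS; i++)
        pthread_mutex_init(&locks[i], NULL);
}

/* Take the locks covering buckets h1 and h2. Locks are always taken in
 * ascending index order, and only once if both buckets share a lock. */
void table_double_lock(unsigned int h1, unsigned int h2)
{
    unsigned int l1 = h1 % NLOCKS;
    unsigned int l2 = h2 % NLOCKS;

    if (l1 <= l2) {
        pthread_mutex_lock(&locks[l1]);
        if (l1 != l2)
            pthread_mutex_lock(&locks[l2]);
    } else {
        pthread_mutex_lock(&locks[l2]);
        pthread_mutex_lock(&locks[l1]);
    }
}

void table_double_unlock(unsigned int h1, unsigned int h2)
{
    unsigned int l1 = h1 % NLOCKS;
    unsigned int l2 = h2 % NLOCKS;

    pthread_mutex_unlock(&locks[l1]);
    if (l1 != l2)
        pthread_mutex_unlock(&locks[l2]);
}

/* Insert an entry at the head of both of its bucket chains. */
void table_insert(struct entry *e, unsigned int h1, unsigned int h2)
{
    h1 %= NBUCKETS;
    h2 %= NBUCKETS;

    table_double_lock(h1, h2);
    e->dir[0].next = buckets[h1];
    buckets[h1] = &e->dir[0];
    e->dir[1].next = buckets[h2];
    buckets[h2] = &e->dir[1];
    table_double_unlock(h1, h2);
}
```

With this layout, two CPUs inserting entries whose buckets map to different locks never contend with each other, which is the effect the perf output after the patchset suggests.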
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
Thread overview: 8+ messages
2014-02-27 16:41 [net-next PATCH 0/5] netfilter: conntrack: optimization, remove central spinlock Jesper Dangaard Brouer
2014-02-27 16:41 ` [net-next PATCH 1/5] netfilter: trivial code cleanup and doc changes Jesper Dangaard Brouer
2014-02-27 16:41 ` [net-next PATCH 2/5] netfilter: conntrack: spinlock per cpu to protect special lists Jesper Dangaard Brouer
2014-02-27 16:41 ` [net-next PATCH 3/5] netfilter: avoid race with exp->master ct Jesper Dangaard Brouer
2014-02-27 16:41 ` [net-next PATCH 4/5] netfilter: conntrack: seperate expect locking from nf_conntrack_lock Jesper Dangaard Brouer
2014-02-27 16:41 ` [net-next PATCH 5/5] netfilter: conntrack: remove central spinlock nf_conntrack_lock Jesper Dangaard Brouer
2014-02-27 17:15 ` Jesper Dangaard Brouer [this message]
2014-02-27 17:23 ` [net-next PATCH 0/5] netfilter: conntrack: optimization, remove central spinlock Pablo Neira Ayuso