From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	"netfilter-devel@vger.kernel.org"
	<netfilter-devel@vger.kernel.org>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: [net-next PATCH 0/5] netfilter: conntrack: optimization, remove central spinlock
Date: Thu, 27 Feb 2014 18:15:22 +0100
Message-ID: <20140227181522.09549165@redhat.com>
In-Reply-To: <20140227163950.24668.55934.stgit@dragon>


Hi Pablo,

This should obviously have been for nf-next, and I also forgot to cc
netfilter-devel@vger.kernel.org ... do you want me to repost?

--Jesper


On Thu, 27 Feb 2014 17:41:10 +0100 Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> This patchset changes the conntrack locking and provides a huge
> performance improvement.
> 
> This patchset is based upon Eric Dumazet's proposed patch:
>   http://thread.gmane.org/gmane.linux.network/268758/focus=47306
> In agreement with Eric Dumazet, I have taken over this patch (and
> turned it into an entire patchset).
> 
> Primary focus is to remove the central spinlock nf_conntrack_lock.
> This requires several steps to be achieved.
> 
> Patch01: Trivial cleanups
> 
> Patch02: Moves the "special" dying/unconfirmed/template lists to use a
>  per cpu spinlock.
> 
> Patch03: Prepares for patch04 by addressing a race condition. This
>  is done as a separate patch for the reviewers' sake.
> 
> Patch04: Separates expect locking from nf_conntrack_lock. The expect
>  list is small (default max 256), thus it just gets a single lock.
> 
> Patch05: Finally removes nf_conntrack_lock, and instead uses an
>  array of hashed spinlocks to protect insertions/deletions of
>  conntracks in the hash table, while still allowing dynamic
>  resizing of the hash table.
> 
> 
> Testing
> -------
> For expectations, I've mostly tested the nf_conntrack_ftp helper
> module, using these commands:
> 
>  for x in `seq 1 300`; do \
>    echo $x; \
>    echo -e "USER anonymous\nPASS nothing\nPASV" | nc 192.168.42.129 21; \
>  done
> 
>  wget ftp://192.168.42.129/pub/delete.me.4k -O /dev/null
> 
> For overload/DoS testing, I've primarily done SYN-flood attack testing.
> Results on a 24-core E5-2695v2(ES) with 10Gbit/s ixgbe (using the trafgen tool):
> 
>  Base kernel : New   810.405 conntrack/sec
>  Fixed kernel: New 2.233.876 conntrack/sec
> 
> Notice that other flood attacks (SYN+ACK or ACK) can easily be deflected using:
>  # iptables -A INPUT -m state --state INVALID -j DROP
>  # sysctl -w net/netfilter/nf_conntrack_tcp_loose=0
> 
> E.g. this machine can deflect 6.481.463 "invalid" conntrack/sec (from
> an ACK-flood).
> 
> Perf data:
> ----------
> The nf_conntrack_lock suffers from huge contention on current
> generation servers (8 or more cores/threads).  The data below was
> collected under SYN-flooding (without a listen socket).
> 
> Perf locking congestion is very "visible" on a base kernel:
> 
>     -  72.56%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
>        - _raw_spin_lock_bh
>           + 25.33% init_conntrack
>           + 24.86% nf_ct_delete_from_lists
>           + 24.62% __nf_conntrack_confirm
>           + 24.38% destroy_conntrack
>           + 0.70% tcp_packet
>     +   2.21%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
>     +   1.15%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
>     +   0.77%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
>     +   0.70%  ksoftirqd/6  [nf_conntrack]       [k] nf_ct_delete
>     +   0.55%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table
> 
> Perf after the patchset (SYN-flood attack):
> 
> +   9.62%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
> +   3.78%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
> +   2.71%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
> +   2.55%  ksoftirqd/6  [kernel.kallsyms]    [k] check_leaf
> +   2.38%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table
> +   2.06%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_alloc
> +   1.94%  ksoftirqd/6  [nf_conntrack]       [k] __nf_conntrack_alloc
> -   1.94%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock
>    - _raw_spin_lock
>       + 90.32% nf_conntrack_double_lock
>       + 3.61% get_partial_node
>       + 1.81% nf_ct_delete_from_lists
>       + 1.68% __nf_conntrack_confirm
>       + 1.03% sch_direct_xmit
>       + 0.52% scheduler_tick
> +   1.86%  ksoftirqd/6  [kernel.kallsyms]    [k] nf_iterate
> +   1.80%  ksoftirqd/6  [nf_conntrack]       [k] init_conntrack
> +   1.77%  ksoftirqd/6  [kernel.kallsyms]    [k] __neigh_event_send
> -   1.70%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
>    - _raw_spin_lock_bh
>       + 32.55% nf_ct_del_from_dying_or_unconfirmed_list
>       + 25.33% init_conntrack
>       + 19.88% tcp_packet
>       + 17.97% nf_ct_delete_from_lists
>       + 1.62% nf_conntrack_in
>       + 1.33% ixgbe_poll
>       + 0.74% destroy_conntrack
> +   1.64%  ksoftirqd/6  [nf_conntrack]       [k] hash_conntrack_raw
> +   1.58%  ksoftirqd/6  [kernel.kallsyms]    [k] __netif_receive_skb_core
> +   1.51%  ksoftirqd/6  [nf_conntrack]       [k] __nf_conntrack_find_get
> +   1.48%  ksoftirqd/6  [kernel.kallsyms]    [k] __cmpxchg_double_slab
> +   1.46%  ksoftirqd/6  [nf_conntrack]       [k] nf_conntrack_in
> +   1.45%  ksoftirqd/6  [kernel.kallsyms]    [k] __local_bh_enable_ip
> 
> 
> ---
> 
> Jesper Dangaard Brouer (5):
>       netfilter: conntrack: remove central spinlock nf_conntrack_lock
>       netfilter: conntrack: seperate expect locking from nf_conntrack_lock
>       netfilter: avoid race with exp->master ct
>       netfilter: conntrack: spinlock per cpu to protect special lists.
>       netfilter: trivial code cleanup and doc changes
> 
> 
>  include/net/netfilter/nf_conntrack.h      |   11 +
>  include/net/netfilter/nf_conntrack_core.h |    9 +
>  include/net/netns/conntrack.h             |   13 +
>  net/netfilter/nf_conntrack_core.c         |  427 ++++++++++++++++++++---------
>  net/netfilter/nf_conntrack_expect.c       |   36 ++
>  net/netfilter/nf_conntrack_h323_main.c    |    4 
>  net/netfilter/nf_conntrack_helper.c       |   37 ++-
>  net/netfilter/nf_conntrack_netlink.c      |  128 +++++----
>  net/netfilter/nf_conntrack_sip.c          |    8 -
>  9 files changed, 456 insertions(+), 217 deletions(-)
> 
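
As a rough illustration of the "array of hashed spinlocks" idea described
for Patch05 above, here is a minimal userspace sketch in C. It is not the
code from the patch: it uses pthread mutexes instead of kernel spinlocks,
all names (LOCK_COUNT, bucket_locks, lock_index, double_lock,
double_unlock) and the array size are hypothetical, and it leaves out the
handling of hash table resizing entirely. It only shows how a bucket hash
can be mapped onto a fixed array of locks, and how two such locks can be
taken in a fixed (index) order so that two threads locking the same pair
cannot deadlock against each other.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical number of locks; many hash buckets share one lock. */
#define LOCK_COUNT 1024u

static pthread_mutex_t bucket_locks[LOCK_COUNT];

/* Initialise every lock in the array once, before first use. */
static void bucket_locks_init(void)
{
	for (unsigned int i = 0; i < LOCK_COUNT; i++) {
		int rc = pthread_mutex_init(&bucket_locks[i], NULL);
		assert(rc == 0);
		(void)rc;
	}
}

/* Map a bucket hash onto the (much smaller) lock array. */
static unsigned int lock_index(uint32_t hash)
{
	return hash % LOCK_COUNT;
}

/*
 * Take the locks covering two buckets. The lower lock index is always
 * taken first, which avoids an ABBA deadlock between two threads that
 * lock the same pair in opposite argument order. If both hashes map to
 * the same lock, it is taken only once.
 */
static void double_lock(uint32_t h1, uint32_t h2)
{
	unsigned int a = lock_index(h1);
	unsigned int b = lock_index(h2);

	if (a > b) {
		unsigned int tmp = a;
		a = b;
		b = tmp;
	}
	pthread_mutex_lock(&bucket_locks[a]);
	if (a != b)
		pthread_mutex_lock(&bucket_locks[b]);
}

/* Release the locks taken by double_lock() for the same two hashes. */
static void double_unlock(uint32_t h1, uint32_t h2)
{
	unsigned int a = lock_index(h1);
	unsigned int b = lock_index(h2);

	pthread_mutex_unlock(&bucket_locks[a]);
	if (a != b)
		pthread_mutex_unlock(&bucket_locks[b]);
}
```

In this sketch a caller would run bucket_locks_init() once, and then
bracket each insertion or deletion that touches two buckets with
double_lock(h1, h2) and double_unlock(h1, h2), using the same two hash
values for both calls.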



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
