From: Jesper Dangaard Brouer <brouer@redhat.com>
To: netfilter-devel@vger.kernel.org,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>,
	netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Florian Westphal <fw@strlen.de>,
	"Patrick McHardy" <kaber@trash.net>
Subject: [nf-next PATCH V3 0/5] netfilter: conntrack: optimization, remove central spinlock
Date: Mon, 03 Mar 2014 14:44:16 +0100
Message-ID: <20140303134319.2615.9293.stgit@dragon>

This patchset changes the conntrack locking and provides a huge
performance improvement.

This patchset is based upon Eric Dumazet's proposed patch:
  http://thread.gmane.org/gmane.linux.network/268758/focus=47306
In agreement with Eric Dumazet, I have taken over this patch (and
turned it into an entire patchset).

The primary focus is to remove the central spinlock nf_conntrack_lock.
This requires several steps to be achieved.

Patch01: Trivial cleanups

Patch02: Moves the "special" dying/unconfirmed/template lists to use a
 per cpu spinlock.

Patch03: Prepares for patch04 by addressing a race condition. This
 is done as a separate patch for the reviewers' sake.

Patch04: Separates expect locking from nf_conntrack_lock. The expect
 list is small (default max 256), thus it just gets a single lock.

Patch05: Finally removes nf_conntrack_lock, and instead uses an
 array of hashed spinlocks to protect insertions/deletions of
 conntracks into the hash table, while still allowing dynamic
 resizing of the hash table.
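
To illustrate the idea behind patch05, here is a minimal userspace
sketch of an array of hashed locks. It is not the kernel code: it uses
pthread mutexes instead of spinlocks, and every name and size
(SKETCH_NUM_LOCKS, sketch_locks, sketch_lock_index, sketch_double_lock,
sketch_double_unlock) is hypothetical. It assumes an entry lives in two
hash buckets at once (a conntrack is hashed in both the original and
the reply direction), so both bucket locks must be held, and they are
taken in ascending index order to avoid an ABBA deadlock. Handling of
hash table resizing is deliberately left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical number of locks; an assumption for this sketch only. */
#define SKETCH_NUM_LOCKS 1024

pthread_mutex_t sketch_locks[SKETCH_NUM_LOCKS];

/* Initialize every lock in the array once, before any use. */
void sketch_locks_init(void)
{
	for (size_t i = 0; i < SKETCH_NUM_LOCKS; i++) {
		int rc = pthread_mutex_init(&sketch_locks[i], NULL);

		assert(rc == 0);
		(void)rc;
	}
}

/* Map a hash bucket number to the lock that protects it. */
unsigned int sketch_lock_index(unsigned int bucket)
{
	return bucket % SKETCH_NUM_LOCKS;
}

/*
 * Lock the two buckets an entry hashes to. The lower lock index is
 * always taken first, and if both buckets share a lock it is taken
 * only once.
 */
void sketch_double_lock(unsigned int b1, unsigned int b2)
{
	unsigned int l1 = sketch_lock_index(b1);
	unsigned int l2 = sketch_lock_index(b2);

	if (l1 > l2) {
		unsigned int tmp = l1;

		l1 = l2;
		l2 = tmp;
	}

	pthread_mutex_lock(&sketch_locks[l1]);
	if (l1 != l2)
		pthread_mutex_lock(&sketch_locks[l2]);
}

/* Release the lock(s) taken by sketch_double_lock() for the same buckets. */
void sketch_double_unlock(unsigned int b1, unsigned int b2)
{
	unsigned int l1 = sketch_lock_index(b1);
	unsigned int l2 = sketch_lock_index(b2);

	pthread_mutex_unlock(&sketch_locks[l1]);
	if (l1 != l2)
		pthread_mutex_unlock(&sketch_locks[l2]);
}
```

With this layout, two CPUs only contend when their entries map to the
same lock index, instead of every insert and delete serializing on one
global lock.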


Testing
-------
For expectations I've mostly tested the FTP helper module
(nf_conntrack_ftp), using the commands:

 for x in `seq 1 300`; do \
   echo $x; \
   echo -e "USER anonymous\nPASS nothing\nPASV" | nc 192.168.42.129 21; \
 done

 wget ftp://192.168.42.129/pub/delete.me.4k -O /dev/null

For overload/DoS testing, I've primarily done SYN-flood attack testing.
Results on a 24-core E5-2695v2(ES) with 10Gbit/s ixgbe (using the tool trafgen):

 Base kernel : New   810.405 conntrack/sec
 Fixed kernel: New 2.233.876 conntrack/sec

Notice that other flood attacks (SYN+ACK or ACK) can easily be deflected using:
 # iptables -A INPUT -m state --state INVALID -j DROP
 # sysctl -w net/netfilter/nf_conntrack_tcp_loose=0

E.g. this machine can deflect 6.481.463 "invalid" conntrack/sec (from
an ACK-flood).

Perf data:
----------
The nf_conntrack_lock suffers from huge contention on current
generation servers (8 or more cores/threads).  The data below was
collected under SYN-flooding (without a listen socket).

The lock contention is very "visible" in perf on a base kernel:

  -  72.56%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
     - _raw_spin_lock_bh
        + 25.33% init_conntrack
        + 24.86% nf_ct_delete_from_lists
        + 24.62% __nf_conntrack_confirm
        + 24.38% destroy_conntrack
        + 0.70% tcp_packet
  +   2.21%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
  +   1.15%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
  +   0.77%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
  +   0.70%  ksoftirqd/6  [nf_conntrack]       [k] nf_ct_delete
  +   0.55%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table

Perf after the patchset (SYN-flood attack):

 +   9.62%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
 +   3.78%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
 +   2.71%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
 +   2.55%  ksoftirqd/6  [kernel.kallsyms]    [k] check_leaf
 +   2.38%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table
 +   2.06%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_alloc
 +   1.94%  ksoftirqd/6  [nf_conntrack]       [k] __nf_conntrack_alloc
 -   1.94%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock
   - _raw_spin_lock
      + 90.32% nf_conntrack_double_lock
      + 3.61% get_partial_node
      + 1.81% nf_ct_delete_from_lists
      + 1.68% __nf_conntrack_confirm
      + 1.03% sch_direct_xmit
      + 0.52% scheduler_tick
 +   1.86%  ksoftirqd/6  [kernel.kallsyms]    [k] nf_iterate
 +   1.80%  ksoftirqd/6  [nf_conntrack]       [k] init_conntrack
 +   1.77%  ksoftirqd/6  [kernel.kallsyms]    [k] __neigh_event_send
 -   1.70%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
   - _raw_spin_lock_bh
      + 32.55% nf_ct_del_from_dying_or_unconfirmed_list
      + 25.33% init_conntrack
      + 19.88% tcp_packet
      + 17.97% nf_ct_delete_from_lists
      + 1.62% nf_conntrack_in
      + 1.33% ixgbe_poll
      + 0.74% destroy_conntrack
 +   1.64%  ksoftirqd/6  [nf_conntrack]       [k] hash_conntrack_raw
 +   1.58%  ksoftirqd/6  [kernel.kallsyms]    [k] __netif_receive_skb_core
 +   1.51%  ksoftirqd/6  [nf_conntrack]       [k] __nf_conntrack_find_get
 +   1.48%  ksoftirqd/6  [kernel.kallsyms]    [k] __cmpxchg_double_slab
 +   1.46%  ksoftirqd/6  [nf_conntrack]       [k] nf_conntrack_in
 +   1.45%  ksoftirqd/6  [kernel.kallsyms]    [k] __local_bh_enable_ip

---

Jesper Dangaard Brouer (5):
      netfilter: conntrack: remove central spinlock nf_conntrack_lock
      netfilter: conntrack: seperate expect locking from nf_conntrack_lock
      netfilter: avoid race with exp->master ct
      netfilter: conntrack: spinlock per cpu to protect special lists.
      netfilter: trivial code cleanup and doc changes


 include/net/netfilter/nf_conntrack.h      |   11 +
 include/net/netfilter/nf_conntrack_core.h |    9 +
 include/net/netns/conntrack.h             |   13 +
 net/netfilter/nf_conntrack_core.c         |  432 ++++++++++++++++++++---------
 net/netfilter/nf_conntrack_expect.c       |   36 ++
 net/netfilter/nf_conntrack_h323_main.c    |    4 
 net/netfilter/nf_conntrack_helper.c       |   41 ++-
 net/netfilter/nf_conntrack_netlink.c      |  128 +++++----
 net/netfilter/nf_conntrack_sip.c          |    8 -
 9 files changed, 461 insertions(+), 221 deletions(-)

-- 


Thread overview: 10+ messages
2014-03-03 13:44 Jesper Dangaard Brouer [this message]
2014-03-03 13:44 ` [nf-next PATCH V3 1/5] netfilter: trivial code cleanup and doc changes Jesper Dangaard Brouer
2014-03-03 13:45 ` [nf-next PATCH V3 2/5] netfilter: conntrack: spinlock per cpu to protect special lists Jesper Dangaard Brouer
2014-03-03 13:45 ` [nf-next PATCH V3 3/5] netfilter: avoid race with exp->master ct Jesper Dangaard Brouer
2014-03-03 13:46 ` [nf-next PATCH V3 4/5] netfilter: conntrack: seperate expect locking from nf_conntrack_lock Jesper Dangaard Brouer
2014-03-03 13:46 ` [nf-next PATCH V3 5/5] netfilter: conntrack: remove central spinlock nf_conntrack_lock Jesper Dangaard Brouer
2014-03-03 21:00 ` [nf-next PATCH V3 0/5] netfilter: conntrack: optimization, remove central spinlock David Miller
2014-03-07 10:45   ` Pablo Neira Ayuso
2014-03-06 22:53 ` Florian Westphal
2014-03-06 23:37   ` Eric Dumazet
