4.9 conntrack performance issues - Denys Fedoryshchenko


From: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
To: Guillaume Nault <g.nault@alphalink.fr>,
	Netfilter Devel <netfilter-devel@vger.kernel.org>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: 4.9 conntrack performance issues
Date: Sun, 15 Jan 2017 01:05:58 +0200
Message-ID: <1a71d807acf63135bb037c7144fcd8d9@nuclearcat.com>

Hi!

Sorry if I CC'd anyone by mistake; please let me know if I should
remove you.
I have been running 4.9 successfully on my NAT box for several days, and
the panic issue seems to have disappeared. But I have started to face
another issue: the conntrack garbage collector seems to be hogging one
of the CPUs.

Here is my data:
2xE5-2640 v3
396G ram
2x10G (bonding) with approx 14-15G load at peak time
It handled the load very well on 4.8 and below. It might still be fine,
but I suspect the queues that belong to the hogged CPU might experience
issues.
Is there anything that can be done to improve CPU load distribution or
reduce the single-core load?

net.netfilter.nf_conntrack_buckets = 65536
net.netfilter.nf_conntrack_checksum = 1
net.netfilter.nf_conntrack_count = 1236021
net.netfilter.nf_conntrack_events = 1
net.netfilter.nf_conntrack_expect_max = 1024
net.netfilter.nf_conntrack_generic_timeout = 600
net.netfilter.nf_conntrack_helper = 0
net.netfilter.nf_conntrack_icmp_timeout = 30
net.netfilter.nf_conntrack_log_invalid = 0
net.netfilter.nf_conntrack_max = 6553600
net.netfilter.nf_conntrack_tcp_be_liberal = 0
net.netfilter.nf_conntrack_tcp_loose = 0
net.netfilter.nf_conntrack_tcp_max_retrans = 3
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 10
net.netfilter.nf_conntrack_tcp_timeout_established = 600
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 20
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 20
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 60
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 10
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 20
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 20
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 30
net.netfilter.nf_conntrack_timestamp = 0
net.netfilter.nf_conntrack_udp_timeout = 30
net.netfilter.nf_conntrack_udp_timeout_stream = 180
net.nf_conntrack_max = 6553600
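
To put the values above in perspective, the ratio of tracked connections to hash buckets can be worked out directly. This is a minimal sketch of that arithmetic only: the variable names are illustrative, and it makes no claim about how gc_worker actually walks the table.

```python
# Values copied from the sysctl listing above (non-peak).
conntrack_count = 1_236_021    # net.netfilter.nf_conntrack_count
conntrack_buckets = 65_536     # net.netfilter.nf_conntrack_buckets
conntrack_max = 6_553_600      # net.netfilter.nf_conntrack_max

# Average number of tracked connections per hash bucket.
entries_per_bucket = conntrack_count / conntrack_buckets

# Fraction of the configured maximum currently in use.
table_fill = conntrack_count / conntrack_max

print(f"entries per bucket: {entries_per_bucket:.2f}")
print(f"table fill: {table_fill:.1%}")
```

With these non-peak numbers the table holds roughly 19 connections per bucket and sits at about 19% of nf_conntrack_max.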


These are non-peak values. As adjustments, I have set shorter-than-default
timeouts. Raising net.netfilter.nf_conntrack_buckets to a higher value
does not fix the issue.
I noticed that one of the CPUs is hogged (CPU 24 in this case):

Linux 4.9.2-build-0127 (NAT)	01/14/17	_x86_64_	(32 CPU)

23:01:54     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
23:02:04     all    0.09    0.00    1.60    0.01    0.00   28.28    0.00    0.00   70.01
23:02:04       0    0.11    0.00    0.00    0.00    0.00   32.38    0.00    0.00   67.51
23:02:04       1    0.12    0.00    0.12    0.00    0.00   29.91    0.00    0.00   69.86
23:02:04       2    0.23    0.00    0.11    0.00    0.00   29.57    0.00    0.00   70.09
23:02:04       3    0.11    0.00    0.11    0.11    0.00   28.80    0.00    0.00   70.86
23:02:04       4    0.23    0.00    0.11    0.11    0.00   31.41    0.00    0.00   68.14
23:02:04       5    0.11    0.00    0.00    0.00    0.00   29.28    0.00    0.00   70.61
23:02:04       6    0.11    0.00    0.11    0.00    0.00   31.81    0.00    0.00   67.96
23:02:04       7    0.11    0.00    0.11    0.00    0.00   32.69    0.00    0.00   67.08
23:02:04       8    0.00    0.00    0.23    0.00    0.00   42.12    0.00    0.00   57.64
23:02:04       9    0.11    0.00    0.00    0.00    0.00   30.86    0.00    0.00   69.02
23:02:04      10    0.11    0.00    0.11    0.00    0.00   30.93    0.00    0.00   68.84
23:02:04      11    0.00    0.00    0.11    0.00    0.00   32.73    0.00    0.00   67.16
23:02:04      12    0.11    0.00    0.11    0.00    0.00   29.85    0.00    0.00   69.92
23:02:04      13    0.00    0.00    0.00    0.00    0.00   30.96    0.00    0.00   69.04
23:02:04      14    0.00    0.00    0.00    0.00    0.00   30.09    0.00    0.00   69.91
23:02:04      15    0.00    0.00    0.11    0.00    0.00   30.63    0.00    0.00   69.26
23:02:04      16    0.11    0.00    0.00    0.00    0.00   25.88    0.00    0.00   74.01
23:02:04      17    0.11    0.00    0.00    0.00    0.00   22.82    0.00    0.00   77.07
23:02:04      18    0.11    0.00    0.00    0.00    0.00   23.75    0.00    0.00   76.14
23:02:04      19    0.11    0.00    0.11    0.00    0.00   24.86    0.00    0.00   74.92
23:02:04      20    0.11    0.00    0.11    0.11    0.00   24.48    0.00    0.00   75.19
23:02:04      21    0.22    0.00    0.11    0.00    0.00   23.43    0.00    0.00   76.24
23:02:04      22    0.11    0.00    0.11    0.00    0.00   25.46    0.00    0.00   74.32
23:02:04      23    0.00    0.00    0.11    0.00    0.00   25.47    0.00    0.00   74.41
23:02:04      24    0.00    0.00   45.06    0.00    0.00   42.18    0.00    0.00   12.76
23:02:04      25    0.11    0.00    0.11    0.11    0.00   25.22    0.00    0.00   74.46
23:02:04      26    0.11    0.00    0.00    0.11    0.00   23.39    0.00    0.00   76.39
23:02:04      27    0.22    0.00    0.11    0.00    0.00   23.83    0.00    0.00   75.85
23:02:04      28    0.11    0.00    0.11    0.00    0.00   24.10    0.00    0.00   75.68
23:02:04      29    0.11    0.00    0.11    0.00    0.00   23.80    0.00    0.00   75.98
23:02:04      30    0.11    0.00    0.11    0.00    0.00   23.45    0.00    0.00   76.33
23:02:04      31    0.11    0.00    0.11    0.00    0.00   20.37    0.00    0.00   79.42
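
Picking the hogged CPU out of a table like this can be automated. The following is a minimal sketch, assuming each per-CPU row has already been rejoined onto one line with eleven whitespace-separated fields; `parse_rows`, `COLUMNS` and `SAMPLE` are illustrative names, and only four of the rows above are included as sample input.

```python
# A few rows from the mpstat output above, each rejoined onto a single line.
SAMPLE = """\
23:02:04       8    0.00    0.00    0.23    0.00    0.00   42.12    0.00    0.00   57.64
23:02:04      23    0.00    0.00    0.11    0.00    0.00   25.47    0.00    0.00   74.41
23:02:04      24    0.00    0.00   45.06    0.00    0.00   42.18    0.00    0.00   12.76
23:02:04      31    0.11    0.00    0.11    0.00    0.00   20.37    0.00    0.00   79.42
"""

COLUMNS = ["usr", "nice", "sys", "iowait", "irq", "soft", "steal", "guest", "idle"]


def parse_rows(text):
    """Return {cpu_number: {column: percent}} for the per-CPU rows in text."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Expect: timestamp, CPU id, nine percentages. Skip headers and "all".
        if len(fields) != 11 or not fields[1].isdigit():
            continue
        rows[int(fields[1])] = dict(zip(COLUMNS, map(float, fields[2:])))
    return rows


rows = parse_rows(SAMPLE)

# The busiest CPU is the one with the lowest idle percentage.
busiest = min(rows, key=lambda cpu: rows[cpu]["idle"])
print(busiest, rows[busiest]["sys"], rows[busiest]["soft"])
```

On this sample the lowest-idle CPU is 24, and it is the only one with a large %sys share on top of its %soft load.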

And this is the output of ./perf top -C 24 -e cycles:

    PerfTop:     933 irqs/sec  kernel:100.0%  exact:  0.0% [1000Hz cycles],  (all, CPU: 24)
--------------------------------------------------------------------------------

     52.68%  [nf_conntrack]  [k] gc_worker
      3.88%  [ip_tables]     [k] ipt_do_table
      2.39%  [ixgbe]         [k] ixgbe_xmit_frame_ring
      2.29%  [kernel]        [k] _raw_spin_lock
      1.84%  [ixgbe]         [k] ixgbe_poll
      1.76%  [nf_conntrack]  [k] __nf_conntrack_find_get

perf report for this CPU (same event, cycles):
# Children      Self  Command       Shared Object           Symbol
# ........  ........  ............  ......................  ....................................................
#
     88.98%     0.00%  kworker/24:1  [kernel.kallsyms]       [k] process_one_work
             |
             ---process_one_work
                |
                |--54.65%--gc_worker
                |          |
                |           --3.58%--nf_ct_gc_expired
                |                     |
                |                     |--1.90%--nf_ct_delete
                |                     |          |
                |                     |           --1.31%--nf_ct_delete_from_lists
                |                     |
                |                      --1.61%--nf_conntrack_destroy
                |                                destroy_conntrack
                |                                |
                |                                 --1.53%--nf_conntrack_free
                |                                           |
                |                                           |--0.80%--kmem_cache_free
                |                                           |          |
                |                                           |           --0.51%--__slab_free.isra.12
                |                                           |
                |                                            --0.52%--__nf_ct_ext_destroy
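
The call graph above allows a rough split of the gc_worker time. This is only a sketch of the arithmetic, assuming the percentages are shares of all samples on CPU 24 and that nf_ct_gc_expired is the only significant callee (perf may hide smaller ones); the variable names are illustrative.

```python
# Percentages copied from the perf report above.
gc_worker_total = 54.65     # gc_worker including its callees
nf_ct_gc_expired = 3.58     # callee that tears down expired entries

# Samples attributed to gc_worker but not to nf_ct_gc_expired.
gc_worker_other = gc_worker_total - nf_ct_gc_expired

# The same quantity as a fraction of all gc_worker time.
share = gc_worker_other / gc_worker_total

print(f"{gc_worker_other:.2f}% of samples, {share:.0%} of gc_worker time")
```

Under those assumptions, about 51% of the samples on this CPU fall in gc_worker outside of nf_ct_gc_expired, which is close to the 52.68% self time that perf top shows for gc_worker.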
