From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Stefan Majer <stefan.majer@gmail.com>
Cc: netfilter@vger.kernel.org
Subject: Re: conntrackd high cpu usage
Date: Mon, 16 Jan 2012 23:58:10 +0100
Message-ID: <20120116225810.GC17879@1984>
In-Reply-To: <CADdPHGvZFsZa1hzwAsxFuvYr7dX=ATq64_Z2hF4zQNuUUB0pNQ@mail.gmail.com>

On Mon, Jan 16, 2012 at 08:53:23PM +0100, Stefan Majer wrote:
> Hi Pablo,
> 
> On Mon, Jan 16, 2012 at 12:28 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Hi Stefan,
> >
> > On Mon, Jan 09, 2012 at 07:49:55PM +0100, Stefan Majer wrote:
> >> Hi,
> >>
> >> we have 2 8core Xeon Boxes with 2 Intel X520 10GBit Adapter running
> >> rhel 6.1 as redundant firewall.
> >
> > Interesting setup. So far, the reports of conntrackd usage that
> > I've received are deployments with 1GBit NICs and smaller machines
> > (up to 2-4 cores).
> >
> >> On every node we have conntrackd installed in FTFW mode, and we
> >> synchronize all states.
> >> Synchronization is done over multicast on a dedicated vlan interface.
> >> The firewall itself has around 300 vlans active.
> >>
> >> Currently we see a steady ~400 new connections/sec with peaks at 800
> >> conn/sec.
> >
> > I've been able to reach up to 20000 sessions/sec with 6-year-old
> > hardware (dual core, 2.4GHz, 1Gbit links). I know people who
> > got better results on more modern hardware.
> 
> This would be sufficient for our use case but...
> 
> >
> > You may want to enable the reliable synchronization option in
> > conntrackd. With it, conntrackd starts dropping packets if the
> > synchronization does not happen timely.
> 
> This is probably not what we want, as this prevents a working state on
> the secondary machine at any time, right?

Reliable synchronization means that we drop network packets on the
primary if conntrackd cannot keep up, i.e. if the rate of state changes
per second is so high that conntrackd would otherwise start dropping
state-change events coming from the kernel.

See the NetlinkEventsReliable option.
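
For illustration, a minimal conntrackd.conf fragment that turns the option on might look like the sketch below. It assumes the option lives in the General section, as in the example configuration files shipped with conntrack-tools, and it leaves out everything else that section normally contains.

```
General {
	# Request reliable event delivery from the kernel. The trade-off
	# described above applies: packets may be dropped on the primary
	# when events cannot be delivered to conntrackd in time.
	NetlinkEventsReliable On
}
```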

> >> With this load the conntrackd consumes about 15 - 25 % CPU from one
> >> CPU on the active side and about 5% CPU usage on the passive side.
> >> Is this expected ?
> >
> > What tool are you using to obtain those measurements?
> 
> This was actually measured with top.
> 
> > top is fine for estimated load, but it's inaccurate.

sysstat is a simple tool and it's a bit better.

> > Still, full state synchronization is a resource consuming task
> 
> Is it possible to reduce the synchronization to specific state events,
> ESTABLISHED and NEW for example,
> without losing a working state on the secondary side?

Yes, please have a look at the conntrack-tools user manual.
See the CT target in iptables.

> >> This is our Testing environment, and we expect much higher (~10 - 20
> >> times) connection rates.
> >>
> >> This would not be possible with the current setup, as this would be
> >> cpu bound on the conntrackd, as this daemon is single threaded.
> >> Is there any way to make this process faster, eg. make the
> >> synchronization multi threaded ?
> >
> > There are several things that we can do to improve conntrackd performance
> > (from the development side):
> >
> > 1) port conntrackd to libmnl to use recvmmsg system call.
> > 2) implement netlink multi-queue, we discussed this during the
> > NFWS2010. The idea is to implement something similar to the existing
> > nfqueue multiqueue load balancing (see --queue-balance in iptables's
> > NFQUEUE). It's similar to multi-threading that you're proposing.
> > 3) implement batching for the commit operation.
> >
> > So far, nobody has shown interest in these tasks. Recent
> > enhancements for conntrackd have focused on adding new features.
> 
> This all sounds great, but I have no idea how much this would increase
> performance.
> We will first try to measure how many conn/sec we are able to
> synchronize in our current environment.

I don't have numbers because it's not implemented yet ;-), but I'm
sure this will boost performance considerably.

recvmmsg will reduce the huge number of recv system calls that
conntrackd makes under heavy load to receive state-change events
from kernel-space.
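
To make the idea concrete, here is a minimal sketch of batched receive with recvmmsg(2). It is not conntrackd or libmnl code: recv_batch(), BATCH and BUFSZ are hypothetical names and sizes chosen for illustration, and it works on any datagram-style socket rather than on a configured netlink socket.

```c
#define _GNU_SOURCE            /* recvmmsg() is a Linux-specific extension */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH 32               /* hypothetical: max messages per call */
#define BUFSZ 4096             /* hypothetical: per-message buffer size */

/* Receive up to BATCH datagrams from fd with a single system call.
 * Returns the number of datagrams read, or -1 on error (errno is set).
 * lens[i] holds the length of the i-th datagram stored in bufs[i]. */
int recv_batch(int fd, char bufs[BATCH][BUFSZ], unsigned int lens[BATCH])
{
	struct mmsghdr msgs[BATCH];
	struct iovec iov[BATCH];
	int i, n;

	assert(fd >= 0);

	/* One iovec and one message header per slot in the batch. */
	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < BATCH; i++) {
		iov[i].iov_base = bufs[i];
		iov[i].iov_len = BUFSZ;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/* MSG_DONTWAIT: take whatever is queued right now, do not block. */
	n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);
	for (i = 0; i < n; i++)
		lens[i] = msgs[i].msg_len;

	return n;
}
```

With a plain recv loop, a burst of 32 queued events costs 32 system calls; with the sketch above it costs one.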

The multiqueue approach will let it scale to a high number of
processors / cores.

The batching will allow us to reduce the time to inject the states
into the kernel.

> >> I already did some perf analysis, but it didn't shed much light.
> >
> > What tools are you using?
> 
> we were using perf record, see man 1 perf.
> 
> > I suggest you have a look at Willy Tarreau's tool (httpterm). You
> > may want to use my http client instead of inject32.
> >
> > http://1984.lsi.us.es/git/http-client-benchmark/
> 
> I will check both, but yours won't compile:
> 
> make
> gcc -g -c alarm.c -o alarm.o
> gcc -g -c client.c -o client.o
> client.c: In function ‘print_alarm_cb’:
> client.c:335:3: warning: format ‘%llu’ expects argument of type ‘long
> long unsigned int’, but argument 5 has type ‘uint64_t’ [-Wformat]
> client.c:335:3: warning: format ‘%u’ expects argument of type
> ‘unsigned int’, but argument 10 has type ‘__time_t’ [-Wformat]
> client.c:335:3: warning: format ‘%u’ expects argument of type
> ‘unsigned int’, but argument 11 has type ‘__suseconds_t’ [-Wformat]
> client.c: In function ‘main’:
> client.c:404:5: error: variable-sized object may not be initialized
> make: *** [all] Error 1

Interesting, I don't hit that problem here.

I have applied one fix to git. Let me know if it compiles now.
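
The code at client.c:335 and client.c:404 is not shown in this thread, so the following is only a generic illustration of the two gcc diagnostics quoted above, not the fix that went into git. The function names sum_counters() and format_stats() are hypothetical. The first shows that a variable-length array has to be declared without an initializer and cleared explicitly; the second shows portable printf formats for uint64_t and struct timeval fields.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical helper: sets the first of n counters and sums them all. */
int sum_counters(int n)
{
	/* int counters[n] = { 0 };  <- rejected by gcc with
	 * "variable-sized object may not be initialized" */
	int counters[n];                         /* no initializer on a VLA */
	int i, sum = 0;

	assert(n > 0);
	memset(counters, 0, sizeof(counters));   /* clear it explicitly */
	counters[0] = 1;
	for (i = 0; i < n; i++)
		sum += counters[i];
	return sum;
}

/* Hypothetical helper: formats a byte counter and an elapsed time.
 * PRIu64 expands to the right conversion for uint64_t on both 32-bit
 * and 64-bit builds; the timeval fields are cast to long because their
 * underlying types differ between platforms. */
int format_stats(char *buf, size_t len, uint64_t bytes,
		 const struct timeval *tv)
{
	return snprintf(buf, len, "%" PRIu64 " bytes in %ld.%06ld s",
			bytes, (long)tv->tv_sec, (long)tv->tv_usec);
}
```

The -Wformat warnings are not fatal; only the initializer on the variable-sized array stops the build.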

This tool is quite rudimentary and not documented, and I think I'm the
only one using it, for my benchmark evaluations. But it's very useful.