From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pablo Neira Ayuso
Subject: Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
Date: Fri, 08 Aug 2008 10:47:49 +0200
Message-ID: <489C0835.3090900@netfilter.org>
References: <488064DD.5080509@bock.nu> <488075F1.80901@bock.nu> <4880891C.4090004@netfilter.org> <4880A6BA.6030007@bock.nu>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4880A6BA.6030007@bock.nu>
Sender: netfilter-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"
To: Bernhard Bock
Cc: netfilter@vger.kernel.org

Hi Bernhard,

Bernhard Bock wrote:
> My next step is to run two firewalls in a cluster with conntrackd.
>
> The basic setup works like a charm. I have increased the HashSize
> parameter in conntrackd as well. It replicates the states to the backup
> firewall just fine.
>
> Unfortunately, failover works only in about 50% of all tests. There is
> no obvious pattern as to when these failures occur.
>
> We trigger the failover softly by advertising a higher priority on the
> backup firewall, not by switching off the primary one. If it goes well,
> we do not lose a single connection. If it doesn't go well, we basically
> lose all connections and the apachebench dies. There are hundreds of
> INVALID packets in the syslog, and also some NEW (not SYN). In this
> case, we also see lost packets in "multicast sequence tracking" in the
> conntrackd stats.

I think that I have reproduced your problem in my testbed. Say you have
two nodes: A and B. Initially, A is primary and B is backup.

1) You generate tons of HTTP traffic: A successfully replicates states
to B.

2) You trigger the fail-over: B becomes primary and A becomes backup. B
successfully recovers the connections. Moreover, if you run `conntrack
-L -p tcp' on A, you still see lots of entries.

3) Just a bit later, 30 seconds or so, you trigger the fail-over again,
this time from B to A.
In this case, A fails to recover the entries and you see tons of INVALID
messages.

The problem is the entries that are stuck in A (see step 2). Those stale
entries clash with the newly committed ones, and the TCP state tracking
code gets confused by the old state information.

This problem is fixed in the git repository. Now we purge the entries in
A 15 seconds after this node becomes backup; this delay is tunable via
PurgeTimeout. Thus, the old entries do not clash with the brand-new
ones.

Moreover, I have completely reworked the fail-over script; you can find
it under doc/ in the conntrack-tools git tree [1]. You may give it a
try. I expect to release a new version of conntrack-tools with these
updates soon. New (more complete) documentation is also on the way.

Please let me know how it goes.

[1] http://git.netfilter.org

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers
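
The double fail-over described above (A to B, then B back to A about 30 seconds later) can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not conntrackd or kernel code: the `Node` class, its methods, the flow names and the "old"/"new" state strings are all hypothetical, and a "clash" here simply means that a stale local entry disagrees with the replicated one. The only thing taken from the message is the idea of purging a node's entries a fixed time (PurgeTimeout, 15 seconds) after it becomes backup.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Toy stand-in for one firewall's conntrack table (hypothetical model)."""
    name: str
    table: Dict[str, str] = field(default_factory=dict)  # flow id -> state snapshot
    purge_at: Optional[float] = None  # time at which a scheduled purge fires

    def become_backup(self, now: float, purge_timeout: Optional[float]) -> None:
        # With no purge timeout, entries from the node's time as primary stay behind.
        self.purge_at = None if purge_timeout is None else now + purge_timeout

    def run_timers(self, now: float) -> None:
        # Fire the scheduled purge once its deadline has passed.
        if self.purge_at is not None and now >= self.purge_at:
            self.table.clear()
            self.purge_at = None

    def become_primary(self, replicated: Dict[str, str], now: float) -> List[str]:
        """Commit replicated states; return the flows that hit a stale entry."""
        self.run_timers(now)
        clashes = []
        for flow, state in replicated.items():
            if flow in self.table and self.table[flow] != state:
                clashes.append(flow)  # the stale entry stays and disagrees with the new state
            else:
                self.table[flow] = state
        return clashes


def double_failover(purge_timeout: Optional[float]) -> List[str]:
    """Fail over A -> B at t=0 and B -> A at t=30; return the clashes seen on A."""
    a, b = Node("A"), Node("B")
    a.table = {"flow0": "old", "flow1": "old", "flow2": "old"}

    # Step 2: B takes over with the replicated states; A becomes backup.
    b.become_primary(dict(a.table), now=0.0)
    a.become_backup(now=0.0, purge_timeout=purge_timeout)

    # Traffic keeps flowing through B, so its states move on.
    for flow in b.table:
        b.table[flow] = "new"

    # Step 3: about 30 seconds later, A takes over again.
    return a.become_primary(dict(b.table), now=30.0)


if __name__ == "__main__":
    print("without purge:", double_failover(None))
    print("with 15 s purge:", double_failover(15.0))
```

In this model, running without a purge reports every flow as clashing on the second fail-over, while a 15 second purge leaves A's table empty before it takes over again, so nothing clashes.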