From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pablo Neira Ayuso
Subject: conntrackd reports message before expected seq [was Re: [ANNOUNCE] libnetfilter_conntrack 0.0.98 release]
Date: Mon, 01 Dec 2008 20:50:24 +0100
Message-ID: <49344000.5060004@netfilter.org>
References: <49313A4F.6090904@netfilter.org> <20081130094646.GE9523@bla.fasel.org> <20081130100314.GF9523@bla.fasel.org>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20081130100314.GF9523@bla.fasel.org>
Sender: netfilter-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"; format="flowed"
To: netfilter@vger.kernel.org
Cc: Wolfram Schlich

Hi Wolfram,

Wolfram Schlich wrote:
> * Wolfram Schlich [2008-11-30 10:47]:
>> After upgrading to 0.0.98 and restarting conntrackd, I constantly
>> get such messages on the backup firewall, even after restarting
>> conntrackd on both firewalls once again:
>>
>> 2008-11-30 10:40:08 +01:00; hafw2; daemon.warning; conntrack-tools[29154]: Received seq=1228038103 before expected seq=1228039271
>> 2008-11-30 10:40:09 +01:00; hafw2; daemon.warning; conntrack-tools[29154]: Received seq=1228038104 before expected seq=1228039271
>> 2008-11-30 10:40:10 +01:00; hafw2; daemon.warning; conntrack-tools[29154]: Received seq=1228038105 before expected seq=1228039273
>> 2008-11-30 10:40:11 +01:00; hafw2; daemon.warning; conntrack-tools[29154]: Received seq=1228038106 before expected seq=1228039274
>>
>> The numbers look kinda confusing to me.
>>
>> What's wrong? :)
>
> Interesting... it went away after rebooting both machines at once.

There are two possible reasons for this:

* There is a bug in the hello'ing. Actually, there was one in 0.9.7 (a
race condition, not that easy to trigger), but it is fixed in 0.9.8.
When conntrackd starts on one node in ft-fw mode, it sets its hello flag
in every message until the other node replies with a hello back. This is
used to reset the sequence tracking.
If the node does not see any hello, it does not reset its sequence
tracking, and it reports a similar log message.

* This has happened to me once: you (or your script) have deleted the
/var/lock/conntrack.lock file of an existing conntrackd instance, and
then you launched conntrackd. At this moment you have two instances of
conntrackd running in ft-fw mode (but you did not notice), each sending
messages with its own sequence number. Then, the other node drops the
messages of one of the instances, as they are before the expected
sequence number.

I think your problem is the second one, as the expected sequence is
increasing (which means the node is accepting the messages from one
instance or from somewhere else). A bug in the hello'ing (as described
in the first point) would keep the expected sequence the same.

I'm not sure how to fix a situation in which the lock file is deleted
accidentally and two instances of conntrackd run at the same time in
ft-fw mode. Let me think about this; perhaps the init scripts can check
for this before relaunching conntrackd?

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers
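
The reasoning above (a lost hello keeps the expected sequence constant, while two instances make it increase) can be illustrated with a small Python sketch. This is not conntrackd's code: the `Message` and `Receiver` classes, the `hello` field, and the small sequence numbers are all hypothetical, and the sketch ignores sequence wraparound and retransmission. It only models a receiver that resets its tracking on a hello and drops anything before the expected sequence.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    seq: int
    hello: bool = False  # hypothetical flag: "reset your sequence tracking"


@dataclass
class Receiver:
    expected: int = 0
    warnings: list = field(default_factory=list)

    def receive(self, msg: Message) -> bool:
        """Return True if the message is accepted, False if it is dropped."""
        if msg.hello:
            # The peer has (re)started: adopt its sequence numbering.
            self.expected = msg.seq
        if msg.seq < self.expected:
            self.warnings.append(
                f"Received seq={msg.seq} before expected seq={self.expected}"
            )
            return False
        # Accept and move forward (gaps are tolerated in this sketch).
        self.expected = msg.seq + 1
        return True


def expected_values(receiver: Receiver) -> list:
    """Extract the 'expected seq' value from each recorded warning."""
    return [int(w.rsplit("=", 1)[1]) for w in receiver.warnings]


# Case 1: the hello was lost, so a restarted sender is never resynchronised.
lost_hello = Receiver(expected=1000)
for seq in (10, 11, 12):
    lost_hello.receive(Message(seq))

# Case 2: two sender instances with unrelated counters, interleaved.
two_senders = Receiver(expected=1000)
for seq_a, seq_b in zip((1000, 1001, 1002), (10, 11, 12)):
    two_senders.receive(Message(seq_a))  # accepted, advances expected
    two_senders.receive(Message(seq_b))  # dropped as "before expected"

print(expected_values(lost_hello))   # [1000, 1000, 1000]
print(expected_values(two_senders))  # [1001, 1002, 1003]
```

In the first case every warning shows the same expected value. In the second the expected value keeps growing between warnings, which is the pattern in the quoted log.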