conntrack performance test results in INVALID packets

Linux Netfilter discussions

* conntrack performance test results in INVALID packets
@ 2008-07-18  9:39 Bernhard Bock
  2008-07-18 10:13 ` Jan Engelhardt
  0 siblings, 1 reply; 25+ messages in thread
From: Bernhard Bock @ 2008-07-18  9:39 UTC (permalink / raw)
  To: netfilter

Hi,

I'm performance testing a firewall with netfilter connection tracking 
based on Fedora Core 9 and I'm having some problems.

Every now and then the firewall drops packets in the state "INVALID".

The test setup is as follows:
- I'm using plain HTTP as test traffic, nothing else.
- Client is an ApacheBench (ab) client.
- Server is Apache.
- HTTP connection keepalive with a maximum lifetime of
   30 seconds per TCP session.

With 100 parallel TCP connections, it works. With 1000 parallel TCP 
connections, I start seeing INVALID packets.
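
To narrow down where between 100 and 1000 connections the problem starts, the test can be stepped through several concurrency levels. The following is a minimal sketch, not the command actually used in this test: the URL, the request count, and the intermediate levels are placeholders, and it only prints the ab commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. URL and REQUESTS are placeholders, not values from this thread.
URL="${URL:-http://server.example/}"
REQUESTS="${REQUESTS:-100000}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually run ab

# Run (or just print) one ApacheBench pass at the given concurrency.
# -k enables HTTP keepalive, -c sets parallel connections, -n total requests.
run_ab() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "ab -k -c $1 -n $REQUESTS $URL"
    else
        ab -k -c "$1" -n "$REQUESTS" "$URL"
    fi
}

# Step between the working (100) and failing (1000) concurrency levels.
for c in 100 250 500 750 1000; do
    run_ab "$c"
done
```

Watching the firewall log during each pass should show the lowest level at which INVALID packets appear.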

Can somebody point me in a direction where to search for the root cause?

best regards
Bernhard


* Re: conntrack performance test results in INVALID packets
  2008-07-18  9:39 conntrack performance test results in INVALID packets Bernhard Bock
@ 2008-07-18 10:13 ` Jan Engelhardt
  2008-07-18 10:52   ` Bernhard Bock
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Engelhardt @ 2008-07-18 10:13 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter


On Friday 2008-07-18 11:39, Bernhard Bock wrote:
>
> With 100 parallel TCP connections, it works. With 1000 parallel TCP
> connections, I start seeing INVALID packets.
>
> Can somebody point me in a direction where to search for the root cause?

Vague guess..
You have too little memory and/or your connection table is full, hence
connections are dropped and future packets can't find their
original connection, resulting in INVALID. (Though I'd say they
should become NEW again)
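
One way to test the "table is full" guess is to compare the current number of tracked connections with the configured maximum. This is a minimal sketch that assumes the nf_conntrack sysctl files exist under /proc/sys/net/netfilter; kernels using the older ip_conntrack module expose them under different names.

```shell
#!/bin/sh
# Assumed location of the nf_conntrack sysctls; overridable for other layouts.
PROC_DIR="${PROC_DIR:-/proc/sys/net/netfilter}"

# Integer percentage of the table in use: usage_pct COUNT MAX
usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Only report when both files are readable on this machine.
if [ -r "$PROC_DIR/nf_conntrack_count" ] && [ -r "$PROC_DIR/nf_conntrack_max" ]; then
    count=$(cat "$PROC_DIR/nf_conntrack_count")
    max=$(cat "$PROC_DIR/nf_conntrack_max")
    if [ "$max" -gt 0 ]; then
        echo "conntrack entries: $count of $max ($(usage_pct "$count" "$max")%)"
    fi
fi
```

If the percentage approaches 100 while the test runs, the table size is the first thing to look at.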


* Re: conntrack performance test results in INVALID packets
  2008-07-18 10:13 ` Jan Engelhardt
@ 2008-07-18 10:52   ` Bernhard Bock
  2008-07-18 12:14     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 25+ messages in thread
From: Bernhard Bock @ 2008-07-18 10:52 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: netfilter

Jan,

Jan Engelhardt wrote:
> Vague guess..
> You have too little memory and/or your connection table is full, hence
> connections are dropped and future packets can't find their
> original connection, resulting in INVALID. (Though I'd say they
> should become NEW again)

Thanks for your answer. How can I check and/or increase the memory limit 
for the netfilter connection tracking?

The machine has 4G of RAM, so I guess the overall memory should not be a 
problem.

best regards
Bernhard


* Re: conntrack performance test results in INVALID packets
  2008-07-18 10:52   ` Bernhard Bock
@ 2008-07-18 12:14     ` Pablo Neira Ayuso
  2008-07-18 14:20       ` conntrackd failover works partially, was " Bernhard Bock
  0 siblings, 1 reply; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-07-18 12:14 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: Jan Engelhardt, netfilter

Bernhard Bock wrote:
> Jan,
> 
> Jan Engelhardt wrote:
>> Vague guess..
>> You have too little memory and/or your connection table is full, hence
>> connections are dropped and future packets can't find their
>> original connection, resulting in INVALID. (Though I'd say they
>> should become NEW again)
> 
> Thanks for your answer. How can I check and/or increase the memory limit
> for the netfilter connection tracking?
> 
> The machine has 4G of RAM, so I guess the overall memory should not be a
> problem.

This document is a nice kick off:

http://www.wallfire.org/misc/netfilter_conntrack_perf.txt

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers


* conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-07-18 12:14     ` Pablo Neira Ayuso
@ 2008-07-18 14:20       ` Bernhard Bock
  2008-07-21  0:37         ` Pablo Neira Ayuso
  2008-08-08  8:47         ` conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets Pablo Neira Ayuso
  0 siblings, 2 replies; 25+ messages in thread
From: Bernhard Bock @ 2008-07-18 14:20 UTC (permalink / raw)
  To: netfilter; +Cc: Pablo Neira Ayuso

Hi Pablo,

Pablo Neira Ayuso wrote:
> This document is a nice kick off:
> 
> http://www.wallfire.org/misc/netfilter_conntrack_perf.txt

Alright, I increased the nf_conntrack_buckets to 256k and it seems to 
have solved this problem. Thanks so far!
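
For readers who want to repeat this step, here is a minimal sketch of one way to set the bucket count at runtime. It assumes the nf_conntrack module exposes a writable hashsize parameter under /sys/module; that path is an assumption and not taken from this mail. Nothing is written unless APPLY=1 is set.

```shell
#!/bin/sh
# Assumption: the module parameter file below exists and is writable by root.
HASHSIZE_FILE="${HASHSIZE_FILE:-/sys/module/nf_conntrack/parameters/hashsize}"

# 256k buckets expressed as a plain number (262144).
buckets=$((256 * 1024))

# Write the bucket count to the given file and read it back.
set_hashsize() {
    echo "$1" > "$2" && cat "$2"
}

# Opt-in guard so that running the sketch does not change a live system.
if [ "${APPLY:-0}" = 1 ] && [ -w "$HASHSIZE_FILE" ]; then
    set_hashsize "$buckets" "$HASHSIZE_FILE"
fi
```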

My next step is to run two firewalls in a cluster with conntrackd.

The basic setup works like a charm. I have increased the HashSize 
parameter in conntrackd as well. It replicates the states to the backup 
firewall just fine.

Unfortunately, failover works only in about 50% of all tests. There is 
no obvious pattern as to when these failures occur.

We trigger the failover softly by advertising a higher priority on the 
backup firewall, not by switching off the primary one. If it goes well, 
we do not lose a single connection. If it doesn't go well, we basically
lose all connections and the apachebench dies. There are hundreds of
INVALID packets in the syslog, and also some NEW (not SYN). In this 
case, we also see lost packets in "multicast sequence tracking" in the 
conntrackd stats.

One more detail worth mentioning is that we in any case see many 
"connections destroyed failed" in conntrackd statistics, but it does not 
have any visible impact.

We use conntrackd version 0.9.6 included with Fedora 9 in Alarm mode. 
Below I have attached the relevant config file snippets.

Can you (again) give any helpful pointers where I can search?

best regards
Bernhard

------------------------------conntrackd.conf---------------------------------

Sync {
         Mode Alarm {
                 RefreshTime 15
                 CacheTimeout 180
                 CommitTimeout 180
         }
         Multicast {
                 IPv4_address 225.0.0.50
                 Interface bond2
                 Group 3780
         }
         Checksum on
         CacheWriteThrough On
}
General {
         HashSize 262144
         HashLimit 2097152
         LogFile /var/log/conntrackd.log
         Syslog on
         LockFile /var/lock/conntrack.lock
         UNIX {
                 Path /tmp/sync.sock
                 Backlog 20
         }
         SocketBufferSize 268435456
         SocketBufferSizeMaxGrown 1073741824
}

------------------------------keepalived.conf---------------------------------
notify_master /etc/keepalived/script_master.sh
notify_backup /etc/keepalived/script_backup.sh

vrrp_instance VI_1 {
     interface bond1
     state BACKUP
     garp_master_delay 0
     virtual_router_id 20
     priority 104
     advert_int 1
     preempt_delay 30
}

------------------------------script_master.sh---------------------------------
#!/bin/sh
/usr/bin/logger "getting master"
/usr/sbin/conntrackd -c
/usr/sbin/conntrackd -R
/usr/bin/logger "got master"

------------------------------script_backup.sh---------------------------------
#!/bin/sh
/usr/bin/logger "getting backup"
/usr/sbin/conntrackd -B
/usr/bin/logger "got backup"
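
A possible variant of the two notify scripts above, sketched as a single script that logs whether each conntrackd call succeeded. It uses only the conntrackd options already shown (-c, -R, -B). The function names and the CONNTRACKD and LOGGER override variables are made up for this sketch. Unlike script_master.sh, it stops after a failed -c instead of running -R anyway.

```shell
#!/bin/sh
# Sketch: same conntrackd calls as the scripts above, plus exit-status logging.
# CONNTRACKD and LOGGER are overridable so the logic can be dry-run.
CONNTRACKD="${CONNTRACKD:-/usr/sbin/conntrackd}"
LOGGER="${LOGGER:-/usr/bin/logger}"

# Run one conntrackd action and log whether it succeeded.
step() {
    if "$CONNTRACKD" "$1"; then
        "$LOGGER" "conntrackd $1 ok"
    else
        "$LOGGER" "conntrackd $1 FAILED"
        return 1
    fi
}

# Usage: transition master|backup
transition() {
    case "$1" in
        master) step -c && step -R ;;
        backup) step -B ;;
        *) echo "usage: transition master|backup" >&2; return 2 ;;
    esac
}

# Act only when a role is given on the command line.
if [ $# -gt 0 ]; then
    transition "$1"
fi
```

With the failures logged per step, a failed failover can be matched against the exact conntrackd call that went wrong.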


* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-07-18 14:20       ` conntrackd failover works partially, was " Bernhard Bock
@ 2008-07-21  0:37         ` Pablo Neira Ayuso
  2008-07-21 14:22           ` conntrackd failover works partially Bernhard Bock
  2008-08-08  8:47         ` conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets Pablo Neira Ayuso
  1 sibling, 1 reply; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-07-21  0:37 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Bernhard Bock wrote:
> My next step is to run two firewalls in a cluster with conntrackd.
> 
> The basic setup works like a charm. I have increased the HashSize
> parameter in conntrackd as well. It replicates the states to the backup
> firewall just fine.
> 
> Unfortunately, failover works only in about 50% of all tests. There is
> no obvious pattern as to when these failures occur.
> 
> We trigger the failover softly by advertising a higher priority on the
> backup firewall, not by switching off the primary one. If it goes well,
> we do not lose a single connection. If it doesn't go well, we basically
> lose all connections and the apachebench dies. There are hundreds of
> INVALID packets in the syslog, and also some NEW (not SYN). In this
> case, we also see lost packets in "multicast sequence tracking" in the
> conntrackd stats.

As you're using the Alarm mode, the time required to resynchronize the
backup and the master is RefreshTime (which is 15 seconds in your config
files). Are you perhaps triggering the fail-over before that amount of
time has passed?

BTW, you can use "conntrackd -i" and "conntrackd -e" to diagnose
problems; these commands dump the internal and external caches. The
internal cache contains the set of flows that this firewall replica is
filtering. The external cache contains the set of flows that the other
firewall replicas are filtering. Basically, you must find the same
set of flows in the master's internal-cache and the backup's
external-cache if everything goes fine.
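
The comparison described above can be sketched as follows, assuming the two dumps have been saved to files and that each dumped line carries src=, dst=, sport= and dport= fields. The function names and file names are made up for this sketch.

```shell
#!/bin/sh
# Reduce each dumped flow to "proto src dst sport dport" (first occurrence of
# each field only), so timers and flags do not cause spurious differences.
flow_keys() {
    awk '{
        src = dst = sp = dp = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^src=/   && src == "") src = $i
            if ($i ~ /^dst=/   && dst == "") dst = $i
            if ($i ~ /^sport=/ && sp  == "") sp  = $i
            if ($i ~ /^dport=/ && dp  == "") dp  = $i
        }
        if (src != "") print $1, src, dst, sp, dp
    }' "$1" | sort -u
}

# Print flows present in only one of the two dumps.
# Flows found only in the second file are indented by a tab (comm's format).
cache_diff() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    flow_keys "$1" > "$tmp_a"
    flow_keys "$2" > "$tmp_b"
    comm -3 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

For example, after saving "conntrackd -i" on the master to master-internal.txt and "conntrackd -e" on the backup to backup-external.txt, running cache_diff master-internal.txt backup-external.txt should print nothing when the caches agree.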

The lost packets reported by the sequence tracking can be reduced with a
clause introduced in 0.9.7 to increase the sender and the receiver
multicast socket buffers.

> One more detail worth mentioning is that we in any case see many
> "connections destroyed failed" in conntrackd statistics, but it does not
> have any visible impact.

This means that the kernel has told conntrackd to destroy a flow that
is supposed to be in its internal cache. However, conntrackd did not
find such a flow there.

> We use conntrackd version 0.9.6 included with Fedora 9 in Alarm mode.
> Below I have attached the relevant config files snippets.
> 
> Can you (again) give any helpful pointers where I can search?

Until we reach conntrack-tools-1.0, which I expect to reach soon since
most of the pending work is already done, I suggest you upgrade to the
latest release (as of now, 0.9.7). This release includes important
improvements, fixes and features. The alarm mode is a bit spammy, so I
also suggest you give the ft-fw and the notrack approaches a try.

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers


* Re: conntrackd failover works partially
  2008-07-21  0:37         ` Pablo Neira Ayuso
@ 2008-07-21 14:22           ` Bernhard Bock
  2008-07-23  8:51             ` Bernhard Bock
  2008-07-23 12:50             ` Pablo Neira Ayuso
  0 siblings, 2 replies; 25+ messages in thread
From: Bernhard Bock @ 2008-07-21 14:22 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Pablo,

Pablo Neira Ayuso wrote:
> As you're using the Alarm mode, the time required to resynchronize the
> backup and the master is RefreshTime (which is 15 seconds in your config
> files). Are you perhaps triggering the fail-over before that amount of
> time has passed?

No, I always waited longer. My keepalived has a pre-emption delay of 
30sec before becoming master, and I always did wait at least a minute or 
so before triggering a failback.

> Basically, you must find the same
> set of flows in the master's internal-cache and the backup's
> external-cache if everything goes fine.

That's exactly what I can observe. They are consistent when the failover 
goes fine, and they're not when I have INVALID packets.

I also see 'conntrack -E' working with 100 parallel TCP connections, and 
dying with "Operation failed: No buffer space available" with 1000 
connections. Maybe this is related?
As written in my last mail, I increased the SocketBufferSize to 256M and 
the SocketBufferSizeMaxGrown to 1024M in conntrackd.conf.

> Until we reach conntrack-tools-1.0, which I expect to reach soon since
> most of the pending work is already done, I suggest you upgrade to the
> latest release (as of now, 0.9.7). This release includes important
> improvements, fixes and features. The alarm mode is a bit spammy, so I
> also suggest you give the ft-fw and the notrack approaches a try.

Let me give you a short update after upgrading:

I upgraded to conntrack-tools 0.9.7, libnfnetlink 0.0.39 and
libnetfilter_conntrack 0.0.96. Basically, I took already available 
Fedora 10 source RPMs and compiled them for Fedora 9.

Without failover, it seems to work at first glance. In 'conntrackd
-s' I see plausible numbers of entries in internal and external caches. 
Unfortunately, it still breaks on many failovers with 1000 parallel TCP 
connections.

Now I get a lot of the following entries in syslog in addition to the 
INVALID packets:
conntrack-tools[21319]: cache_wt crt-upd: Invalid argument
conntrack-tools[21319]: cache_wt update:Invalid argument

After a failed failover, I have to flush the connection table and 
stop/restart both conntrackd processes in order to make it work again.

In FT-FW mode, the failover always fails, and it produces log entries like:

conntrack-tools[25448]: The other node says HELLO
conntrack-tools[25448]: sending bulk update
--- failover here ---
conntrack-tools[25515]: committing external cache
conntrack-tools[25515]: commit: Invalid or incomplete multibyte or wide character
conntrack-tools[25448]: cache_wt update:Invalid or incomplete multibyte or wide character
conntrack-tools[25515]: Committed 28224 new entries
conntrack-tools[25515]: 8 entries can't be committed
conntrack-tools[25448]: resync with master table
conntrack-tools[25448]: cache_wt update:Timer expired
conntrack-tools[25448]: cache_wt update:Timer expired

I haven't tried the notrack mode yet.

best regards
Bernhard


* Re: conntrackd failover works partially
  2008-07-21 14:22           ` conntrackd failover works partially Bernhard Bock
@ 2008-07-23  8:51             ` Bernhard Bock
  2008-07-23 12:50             ` Pablo Neira Ayuso
  1 sibling, 0 replies; 25+ messages in thread
From: Bernhard Bock @ 2008-07-23  8:51 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Hi,

Sorry for reposting this topic, but I'm really stuck and I can't find
anything helpful on the net.

Bernhard Bock wrote:
> Now I get a lot of the following entries in syslog in addition to the 
> INVALID packets:
> conntrack-tools[21319]: cache_wt crt-upd: Invalid argument
> conntrack-tools[21319]: cache_wt update:Invalid argument

Can somebody help me interpret these messages? What is going wrong here?

This happens after I do a graceful failover between two connection 
tracking firewalls synchronized by conntrackd 0.9.7.

best regards
Bernhard




* Re: conntrackd failover works partially
  2008-07-21 14:22           ` conntrackd failover works partially Bernhard Bock
  2008-07-23  8:51             ` Bernhard Bock
@ 2008-07-23 12:50             ` Pablo Neira Ayuso
  2008-07-23 15:20               ` Bernhard Bock
  1 sibling, 1 reply; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-07-23 12:50 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Bernhard Bock wrote:
> Pablo Neira Ayuso wrote:
>> As you're using the Alarm mode, the time required to resynchronize the
>> backup and the master is RefreshTime (which is 15 seconds in your config
>> files). Are you perhaps triggering the fail-over before that amount of
>> time has passed?
> 
> No, I always waited longer. My keepalived has a pre-emption delay of
> 30sec before becoming master, and I always did wait at least a minute or
> so before triggering a failback.

Right, I didn't look at the config files in depth.

>> Basically, you must find the same
>> set of flows in the master's internal-cache and the backup's
>> external-cache if everything goes fine.
> 
> That's exactly what I can observe. They are consistent when the failover
> goes fine, and they're not when I have INVALID packets.

Why did you set cache-write through on? You have a basic primary-backup
failover, right? Set it off, please.

> I also see 'conntrack -E' working with 100 parallel TCP connections, and
> dying with "Operation failed: No buffer space available" with 1000
> connections. Maybe this is related?

No, that's a different point. That's a bug in the CLI, I'll add a
parameter to increase the buffer size.

> As written in my last mail, I increased the SocketBufferSize to 256M and
> the SocketBufferSizeMaxGrown to 1024M in conntrackd.conf.

That's too much. Why did you set such a large buffer? Are you getting
any log messages that tell you to do so?

>> Until we reach conntrack-tools-1.0, which I expect to reach soon since
>> most of the pending work is already done, I suggest you upgrade to the
>> latest release (as of now, 0.9.7). This release includes important
>> improvements, fixes and features. The alarm mode is a bit spammy, so I
>> also suggest you give the ft-fw and the notrack approaches a try.
> 
> Let me give you a short update after upgrading:
> 
> I upgraded to conntrack-tools 0.9.7, libnfnetlink 0.0.39 and
> libnetfilter_conntrack 0.0.96. Basically, I took already available
> Fedora 10 source RPMs and compiled them for Fedora 9.
> 
> Without failover, it seems to work at first glance. In 'conntrackd
> -s' I see plausible numbers of entries in internal and external caches.
> Unfortunately, it still breaks on many failovers with 1000 parallel TCP
> connections.
> 
> Now I get a lot of the following entries in syslog in addition to the
> INVALID packets:
> conntrack-tools[21319]: cache_wt crt-upd: Invalid argument
> conntrack-tools[21319]: cache_wt update:Invalid argument

Please enable logging via /var/log/conntrackd.log. The syslog logging
does not include the information about the entry that has failed. I'll
fix this to make both logging approaches consistent.

> After a failed failover, I have to flush the connection table and
> stop/restart both conntrackd processes in order to make it work again.
> 
> 
> In FT-FW mode, the failover always fails, and it produces log entries like:

Please, too many issues at the same time. Let's try to get it working
without the cachewritethrough clause and then we'll get back to this, OK?

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers


* Re: conntrackd failover works partially
  2008-07-23 12:50             ` Pablo Neira Ayuso
@ 2008-07-23 15:20               ` Bernhard Bock
  0 siblings, 0 replies; 25+ messages in thread
From: Bernhard Bock @ 2008-07-23 15:20 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Hi Pablo,

Pablo Neira Ayuso wrote:
>>> Basically, you must find the same
>>> set of flows in the master's internal-cache and the backup's
>>> external-cache if everything goes fine.
>> That's exactly what I can observe. They are consistent when the failover
>> goes fine, and they're not when I have INVALID packets.
> 
> Why did you set cache-write through on? You have a basic primary-backup
> failover, right? Set it off, please.

Fine. I was just experimenting.


>> As written in my last mail, I increased the SocketBufferSize to 256M and
>> the SocketBufferSizeMaxGrown to 1024M in conntrackd.conf.
> 
> That's too much. Why did you set such a large buffer? Are you getting
> any log messages that tell you to do so?

No, I just wanted to make absolutely sure that a too small buffer cannot 
be the reason, and the machine has plenty of RAM. What is a sensible value?


>> Now I get a lot of the following entries in syslog in addition to the
>> INVALID packets:
>> conntrack-tools[21319]: cache_wt crt-upd: Invalid argument
>> conntrack-tools[21319]: cache_wt update:Invalid argument
> 
> Please enable logging via /var/log/conntrackd.log. The syslog logging
> does not include the information about the entry that has failed. I'll
> fix this to make both logging approaches consistent.

OK, here are some example entries from conntrackd.log:

[Tue Jul 22 10:05:58 2008] (pid=27666) [ERROR] cache_wt crt-upd: Invalid argument
Tue Jul 22 10:05:58 2008        tcp      6 120 SYN_SENT src=10.5.0.101 dst=10.6.6.102 sport=53000 dport=80 [UNREPLIED]
[Tue Jul 22 10:05:58 2008] (pid=27666) [ERROR] cache_wt update:Invalid argument
Tue Jul 22 10:05:58 2008        tcp      6 60 SYN_RECV src=10.5.0.101 dst=10.6.6.102 sport=53000 dport=80
[Tue Jul 22 10:05:58 2008] (pid=27666) [ERROR] cache_wt crt-upd: Invalid argument
Tue Jul 22 10:05:58 2008        tcp      6 120 SYN_SENT src=10.5.0.101 dst=10.6.6.102 sport=53074 dport=80 [UNREPLIED]
[Tue Jul 22 10:05:58 2008] (pid=27666) [ERROR] cache_wt update:Invalid argument
Tue Jul 22 10:05:58 2008        tcp      6 60 SYN_RECV src=10.5.0.101 dst=10.6.6.102 sport=53074 dport=80
[Tue Jul 22 10:05:58 2008] (pid=27666) [ERROR] cache_wt crt-upd: Invalid argument

This is all with cache-write through, so we can just skip it for the 
moment if you like.

Without cache-writethrough, I don't have the "cache_wt" message. 
Nevertheless, I get lots of INVALID messages and many dying TCP 
connections on failover, so there's no improvement in the result of
0.9.7 over 0.9.6. The lost packets in the multicast sequence tracking 
are gone, as you suggested.


>> In FT-FW mode, the failover always fails, and it produces log entries like:
> 
> Please, too many issues at the same time. Let's try to get it working
> without the cachewritethrough clause and then we'll get back to this, OK?

No problem, I was just testing FT-FW mode as you were proposing in your 
last mail. One correctly working mode is enough for me. ;-)

best regards
Bernhard


* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-07-18 14:20       ` conntrackd failover works partially, was " Bernhard Bock
  2008-07-21  0:37         ` Pablo Neira Ayuso
@ 2008-08-08  8:47         ` Pablo Neira Ayuso
  2008-08-08 12:58           ` Bernhard Bock
  2008-09-02  9:39           ` Bernhard Bock
  1 sibling, 2 replies; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-08-08  8:47 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Hi Bernhard,

Bernhard Bock wrote:
> My next step is to run two firewalls in a cluster with conntrackd.
> 
> The basic setup works like a charm. I have increased the HashSize
> parameter in conntrackd as well. It replicates the states to the backup
> firewall just fine.
> 
> Unfortunately, failover works only in about 50% of all tests. There is
> no obvious pattern as to when these failures occur.
> 
> We trigger the failover softly by advertising a higher priority on the
> backup firewall, not by switching off the primary one. If it goes well,
> we do not lose a single connection. If it doesn't go well, we basically
> lose all connections and the apachebench dies. There are hundreds of
> INVALID packets in the syslog, and also some NEW (not SYN). In this
> case, we also see lost packets in "multicast sequence tracking" in the
> conntrackd stats.

I think that I have reproduced your problem in my testbed. Say you have
two nodes: A and B. Initially, A is primary and B is backup.

1) you generate tons of http traffic: A successfully replicates states to B.
2) you trigger the fail-over: B becomes primary and A becomes backup. B
successfully recovers the connections. Moreover, if you do `conntrack -L
-p tcp' in A, you see lots of entries.
3) Just a bit later - 30 seconds later or so - you trigger the fail-over
again from B to A. In this case, A fails to recover the entries showing
tons of INVALID messages.

The problem is the entries that are stuck in A (see step 2). Those
former entries clash with newly committed entries and the TCP state
tracking code gets confused with old state information.
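
To see how many entries are left over on a node and in which TCP state, the output of `conntrack -L -p tcp' can be tallied. This is a minimal sketch; the function name is made up here, and it assumes the state is the fourth field of each line.

```shell
#!/bin/sh
# Count conntrack entries per TCP state. Reads `conntrack -L -p tcp` style
# lines on stdin; the state is assumed to be the fourth field.
count_states() {
    awk '$1 == "tcp" { n[$4]++ } END { for (s in n) print s, n[s] }' | sort
}
```

On the node that has just become backup, `conntrack -L -p tcp | count_states' would then show whether stale entries are still present before the next fail-over.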

This problem is fixed in the git repository. Now, we purge the entries
in A 15 seconds after this node becomes backup - this parameter is
tunable via PurgeTimeout. Thus, the old entries do not clash with the
brand-new ones.

Moreover, I have completely reworked the fail-over script, you can find
it under doc/ in the conntrack-tools git tree [1]. You may give it a
try. I expect to release a new version of the conntrack-tools with these
updates soon. New (more complete) documentation is also on the way.

Please, let me know how it goes.

[1] http://git.netfilter.org

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers


* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-08-08  8:47         ` conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets Pablo Neira Ayuso
@ 2008-08-08 12:58           ` Bernhard Bock
  2008-09-02  9:39           ` Bernhard Bock
  1 sibling, 0 replies; 25+ messages in thread
From: Bernhard Bock @ 2008-08-08 12:58 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Pablo Neira Ayuso wrote:
> I think that I have reproduced your problem in my testbed. 

Wow, thank you very much.

I'll be on holidays for the next three weeks beginning tomorrow, so 
it'll take some time until I'm able to test your patches. I will get
back to you as soon as I have been able to test the git version.

best regards
Bernhard


* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-08-08  8:47         ` conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets Pablo Neira Ayuso
  2008-08-08 12:58           ` Bernhard Bock
@ 2008-09-02  9:39           ` Bernhard Bock
  2008-09-02  9:56             ` Pablo Neira Ayuso
  1 sibling, 1 reply; 25+ messages in thread
From: Bernhard Bock @ 2008-09-02  9:39 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Hi Pablo,

I was now able to test your enhancements of conntrackd.

Pablo Neira Ayuso wrote:
> I think that I have reproduced your problem in my testbed. Say you have
> two nodes: A and B. Initially, A is primary and B is backup.
> 
> 1) you generate tons of http traffic: A successfully replicates states to B.
> 2) you trigger the fail-over: B becomes primary and A becomes backup. B
> successfully recovers the connections. Moreover, if you do `conntrack -L
> -p tcp' in A, you see lots of entries.
> 3) Just a bit later - 30 seconds later or so - you trigger the fail-over
> again from B to A. In this case, A fails to recover the entries showing
> tons of INVALID messages.

Well, not exactly. My problem occurs already in step 2.

Before starting the test, I stop conntrackd on both nodes, clear the
connections from the table with 'conntrack -F' and start conntrackd
again. Both nodes have empty connection tracking tables at this point in
time. Then I start the HTTP traffic and trigger the fail-over.

I see the INVALID messages on the first fail-over, as soon as I have
more than about 500 (with NAT) to 750 (without NAT) parallel TCP
sessions, built up and torn down rapidly.

I'm not sure where the bottleneck is in this case. CPU of the nodes and
bandwidth of the "node interconnect" (dedicated interfaces) are not busy
at all.

> This problem is fixed in the git repository. Now, we purge the entries
> in A once this node becomes backup after 15 seconds - this parameter is
> tunable via PurgeTimeout. Thus, the old entries do not clash with the
> brand-new ones.

I compiled the current version from git. Unfortunately, it does not
change the results for me.

best regards
Bernhard


* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-02  9:39           ` Bernhard Bock
@ 2008-09-02  9:56             ` Pablo Neira Ayuso
  2008-09-02 12:34               ` Bernhard Bock
  0 siblings, 1 reply; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-09-02  9:56 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Bernhard Bock wrote:
> Pablo Neira Ayuso wrote:
>> I think that I have reproduced your problem in my testbed. Say you have
>> two nodes: A and B. Initially, A is primary and B is backup.
>>
>> 1) you generate tons of http traffic: A successfully replicates states to B.
>> 2) you trigger the fail-over: B becomes primary and A becomes backup. B
>> successfully recovers the connections. Moreover, if you do `conntrack -L
>> -p tcp' in A, you see lots of entries.
>> 3) Just a bit later - 30 seconds later or so - you trigger the fail-over
>> again from B to A. In this case, A fails to recover the entries showing
>> tons of INVALID messages.
> 
> Well, not exactly. My problem occurs already in step 2.
> 
> Before starting the test, I stop conntrackd on both nodes, clear the
> connections from the table with 'conntrack -F' and start conntrackd
> again. Both nodes have empty connection tracking tables at this point in
> time. Then I start the HTTP traffic and trigger the fail-over.
> 
> I see the INVALID messages on the first fail-over, as soon as I have
> more than about 500 (with NAT) to 750 (without NAT) parallel TCP
> sessions, built up and torn down rapidly.

That's exactly the test that I do in my testbed and it works fine here,
the problem must be elsewhere. The following line should help to see how
the connection tracking is marking the traffic as invalid:

echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid

However, please see the comment below before doing this and repeating
the test.

> I'm not sure where the bottleneck is in this case. CPU of the nodes and
> bandwidth of the "node interconnect" (dedicated interfaces) are not busy
> at all.

Are you using a sane stateful rule-set similar to the one described on the
conntrack-tools website? What kernel version are you using? If your
kernel is < 2.6.22 you have to disable TCP window tracking on both nodes.

echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal
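
Kernels that use nf_conntrack also expose these settings under /proc/sys/net/netfilter/ with an nf_conntrack_ prefix. The following minimal sketch picks whichever name is present on the running kernel. The function name and the PROC_ROOT override are made up for this sketch.

```shell
#!/bin/sh
# Pick whichever name a kernel exposes for a conntrack sysctl.
# $1 is the setting without prefix, e.g. "log_invalid" or "tcp_be_liberal".
# PROC_ROOT is overridable for testing; /proc/sys is the real location.
PROC_ROOT="${PROC_ROOT:-/proc/sys}"

conntrack_sysctl_path() {
    for p in "$PROC_ROOT/net/netfilter/nf_conntrack_$1" \
             "$PROC_ROOT/net/ipv4/netfilter/ip_conntrack_$1"; do
        if [ -e "$p" ]; then
            echo "$p"
            return 0
        fi
    done
    # Neither name exists on this kernel.
    return 1
}
```

As root, the two settings could then be written with, for example, echo 255 > "$(conntrack_sysctl_path log_invalid)" and echo 1 > "$(conntrack_sysctl_path tcp_be_liberal)".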

>> This problem is fixed in the git repository. Now, we purge the entries
>> in A once this node becomes backup after 15 seconds - this parameter is
>> tunable via PurgeTimeout. Thus, the old entries do not clash with the
>> brand-new ones.
> 
> I compiled the current version from git. Unfortunately, it does not
> change the results for me.

There is a new script `primary-backup.sh' that replaces the old
script_master.sh and script_backup.sh. Although this is not directly
related, it would be worth using that instead, as it will be the standard
in the upcoming release.

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers


* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-02  9:56             ` Pablo Neira Ayuso
@ 2008-09-02 12:34               ` Bernhard Bock
  2008-09-02 12:48                 ` Pablo Neira Ayuso
  2008-09-04 11:40                 ` Pablo Neira Ayuso
  0 siblings, 2 replies; 25+ messages in thread
From: Bernhard Bock @ 2008-09-02 12:34 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Hi Pablo,

Pablo Neira Ayuso wrote:
> That's exactly the test that I do in my testbed and it works fine here,
> the problem must be elsewhere. The following line should help to see how
> the connection tracking is marking the traffic as invalid:
> 
> echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid
> 
> However, please see the comment below before doing this and repeating
> the test.

I didn't know one can increase the verbosity. Now I get some (more)
helpful logs.

kernel: nf_ct_tcp: invalid packet ignored IN= OUT= SRC=10.5.0.101
DST=10.6.6.102 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=9977 DF PROTO=TCP
SPT=39101 DPT=80 SEQ=3381624888 ACK=0 WINDOW=5840 RES=0x00 SYN URGP=0 OPT

Invalid syn packet? Hm. And then:

kernel: nf_ct_tcp: killing out of sync session IN= OUT= SRC=10.6.6.102
DST=10.5.0.101 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=80
DPT=41647 SEQ=3243074286 ACK=3280173284 WINDOW=5792 RES=0x00 ACK SYN
URGP=0 OPT


> Are you using a sane stateful rule-set similar to the one described on the
> conntrack-tools website? What kernel version are you using? If your
> kernel is < 2.6.22 you have to disable TCP window tracking on both nodes.
> 
> echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal


I'm using 2.6.25, from Fedora 9.
And I have set /proc/sys/net/netfilter/nf_conntrack_tcp_be_liberal to 1.
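
The thread mentions the same knob under two different /proc paths (the older ip_conntrack_* one and the newer nf_conntrack_* one). A minimal sketch of a helper that picks whichever path exists on the running kernel; the function names and the PROC_SYS override are hypothetical and only serve to make the sketch testable on a scratch directory:

```shell
#!/bin/sh
# Sketch only: locate a conntrack sysctl under either /proc layout seen in this thread.
# PROC_SYS is a hypothetical override (defaults to the real /proc/sys).
PROC_SYS="${PROC_SYS:-/proc/sys}"

# ct_sysctl_path SUFFIX
# Prints the first existing path for the knob, or returns 1 if none exists.
# SUFFIX is the part after the ip_conntrack_/nf_conntrack_ prefix,
# e.g. tcp_be_liberal or log_invalid.
ct_sysctl_path() {
    for candidate in \
        "$PROC_SYS/net/netfilter/nf_conntrack_$1" \
        "$PROC_SYS/net/ipv4/netfilter/ip_conntrack_$1"
    do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# ct_sysctl_set SUFFIX VALUE
# Writes VALUE to whichever of the two paths exists (needs root on a real system).
ct_sysctl_set() {
    path=$(ct_sysctl_path "$1") || return 1
    printf '%s\n' "$2" > "$path"
}
```

With this, `ct_sysctl_set tcp_be_liberal 1` and `ct_sysctl_set log_invalid 255` would cover both settings discussed above, regardless of which path the kernel exposes.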

Here are my rules:

-A FORWARD -m state --state INVALID -j LOG --log-prefix "Invalid:"
-A FORWARD -m state --state INVALID -j DROP
-A FORWARD -p tcp ! --syn -m state --state NEW -j LOG --log-prefix "New not syn:"
-A FORWARD -p tcp ! --syn -m state --state NEW -j DROP
-A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
-A FORWARD -m state --state NEW -m tcp -p tcp --syn -d 10.6.6.0/24 -j ACCEPT
-A FORWARD -m state --state NEW -m udp -p udp -s 10.5.0.0/24 -d 10.6.6.0/24 -j ACCEPT
-A FORWARD -j LOG --log-prefix "Packet dropped:"
-A FORWARD -j DROP


> There is a new script `primary-backup.sh' that replaces the old
> script_master.sh and script_backup.sh. Although this is not directly
> related, it would be worth using it instead, as it will be the standard
> in the upcoming release.

I'll replace it for future tests. For now, I don't want to mess around
at too many places at the same time unless it's related to the problem.

best regards
Bernhard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-02 12:34               ` Bernhard Bock
@ 2008-09-02 12:48                 ` Pablo Neira Ayuso
  2008-09-02 15:18                   ` Bernhard Bock
  2008-09-04 11:40                 ` Pablo Neira Ayuso
  1 sibling, 1 reply; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-09-02 12:48 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Bernhard Bock wrote:
> Pablo Neira Ayuso wrote:
>> That's exactly the test that I do in my testbed and it works fine here,
>> the problem must be elsewhere. The following line should help to see how
>> the connection tracking is marking the traffic as invalid:
>>
>> echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid
>>
>> However, please see the comment below before doing this and repeating
>> the test.
> 
> I didn't know one can increase the verbosity. Now I get some (more)
> helpful logs.
> 
> kernel: nf_ct_tcp: invalid packet ignored IN= OUT= SRC=10.5.0.101
> DST=10.6.6.102 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=9977 DF PROTO=TCP
> SPT=39101 DPT=80 SEQ=3381624888 ACK=0 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
> 
> Invalid syn packet? Hm. And then:
> 
> kernel: nf_ct_tcp: killing out of sync session IN= OUT= SRC=10.6.6.102
> DST=10.5.0.101 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=80
> DPT=41647 SEQ=3243074286 ACK=3280173284 WINDOW=5792 RES=0x00 ACK SYN
> URGP=0 OPT

I thought that your problem was that you cannot even recover the flows in
the first failover, but it seems to me that you have triggered several
fail-overs between the nodes. There's no way to hit this in a clean
session - i.e. with an empty connection tracking table. Please, see the
end of this email.

>> Are you using a sane stateful rule-set similar to the one described on the
>> conntrack-tools website? What kernel version are you using? If your
>> kernel is < 2.6.22 you have to disable TCP window tracking on both nodes.
>>
>> echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal
> 
> 
> I'm using 2.6.25, from Fedora 9.
> And I have set /proc/sys/net/netfilter/nf_conntrack_tcp_be_liberal to 1.
> 
> Here are my rules:
> 
> -A FORWARD -m state --state INVALID -j LOG --log-prefix "Invalid:"
> -A FORWARD -m state --state INVALID -j DROP
> -A FORWARD -p tcp ! --syn -m state --state NEW -j LOG --log-prefix "New not syn:"
> -A FORWARD -p tcp ! --syn -m state --state NEW -j DROP
> -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
> -A FORWARD -m state --state NEW -m tcp -p tcp --syn -d 10.6.6.0/24 -j ACCEPT
> -A FORWARD -m state --state NEW -m udp -p udp -s 10.5.0.0/24 -d 10.6.6.0/24 -j ACCEPT
> -A FORWARD -j LOG --log-prefix "Packet dropped:"
> -A FORWARD -j DROP

Looks sane.

>> There is a new script `primary-backup.sh' that replaces the old
>> script_master.sh and script_backup.sh. Although this is not directly
>> related, it would be worth using it instead, as it will be the standard
>> in the upcoming release.
> 
> I'll replace it for future tests. For now, I don't want to mess around
> at too many places at the same time unless it's related to the problem.

If you are triggering several fail-overs with an unclean session, the new
script should help. So please, give it a try. It will take you a couple
of minutes to get it working.

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-02 12:48                 ` Pablo Neira Ayuso
@ 2008-09-02 15:18                   ` Bernhard Bock
  2008-09-02 16:22                     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 25+ messages in thread
From: Bernhard Bock @ 2008-09-02 15:18 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Pablo,

Pablo Neira Ayuso wrote:
> I thought that your problem was that you cannot even recover the flows in
> the first failover, but it seems to me that you have triggered several
> fail-overs between the nodes. There's no way to hit this in a clean
> session - ie. empty connection tracking table. 

Well, there are several thousand connections established and torn down
on the primary node before the secondary node takes over, but as far as
I can tell there is no "bouncing" between the nodes. So, there's no
empty connection tracking table at failover time:

1. Stop conntrackd
2. Clear conntrack table
3. Restart Fedora iptables service (see below)
4. Start conntrackd
-> 0 connections
5. Start traffic
-> lots of connections
6. fail-over
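
Steps 1 to 4 above could be scripted roughly as follows. This is only a sketch under assumptions: the thread does not show how conntrackd is actually stopped and started on these nodes, so `conntrackd -k` / `conntrackd -d` and `service iptables restart` are guesses at the commands involved, and RUN is a hypothetical hook that allows a dry run:

```shell
#!/bin/sh
# Sketch only: bring a node back to a clean state before starting test traffic.
# RUN is a hypothetical hook: leave it empty to execute the commands (needs root),
# or set RUN=echo to just print what would be executed.
RUN="${RUN:-}"

reset_node() {
    $RUN conntrackd -k              # 1. stop the running conntrackd daemon
    $RUN conntrack -F               # 2. flush the kernel conntrack table
    $RUN service iptables restart   # 3. Fedora init script (reloads the modules)
    $RUN conntrackd -d              # 4. start conntrackd again in daemon mode
    $RUN conntrack -C               # print the entry count, expected to be 0 here
}
```

Running it once with `RUN=echo` first shows the exact command sequence without touching the firewall.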

> If you are triggering several fail-overs with an unclean session, the new
> script should help. So please, give it a try. It will take you a couple
> of minutes to get it working.

Your script makes things worse for me, as it drops a lot of traffic on
switchover.

In my setup, it helps a lot to let INVALID packets pass for a couple of
seconds after switchover and return to the “normal” policy only after
this time. I coded this into my keepalived scripts. During this time,
some state recovers and most of the sessions actually work afterwards.
With a “hard” failover, nearly all sessions get lost.
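
The keepalived script itself is not shown in the thread. A minimal sketch of what such a grace period could look like, assuming a rule inserted ahead of the INVALID LOG/DROP rules and removed again after a fixed delay; the function name, the IPTABLES hook and the 5-second default are hypothetical:

```shell
#!/bin/sh
# Sketch only: let INVALID packets pass for a short grace period after switchover.
# This weakens the stateful policy while the rule is in place.
# IPTABLES is a hypothetical hook (set IPTABLES=echo for a dry run).
IPTABLES="${IPTABLES:-iptables}"
GRACE_SECONDS="${GRACE_SECONDS:-5}"   # assumed value for "a couple of seconds"

allow_invalid_temporarily() {
    # Insert at position 1 so the rule is evaluated before the LOG/DROP rules.
    $IPTABLES -I FORWARD 1 -m state --state INVALID -j ACCEPT || return 1
    sleep "$GRACE_SECONDS"
    # Delete the same rule again to return to the normal policy.
    $IPTABLES -D FORWARD -m state --state INVALID -j ACCEPT
}
```

A keepalived notify script for the master transition could call `allow_invalid_temporarily` in the background so that the state transition itself is not delayed.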

One more thing I just noticed: It is not sufficient to clear the
conntrack table with 'conntrack -F'. I have to unload and reload the
iptables kernel modules to make it work again. This is done by the
Fedora init scripts for iptables. Without this, after a "broken"
fail-over, the machine keeps dropping some (few) packets even without
conntrackd and a second node involved. After reloading the modules,
everything's fine again. I guess this hints towards searching in the
kernel space and not in the conntrack-tools?!

Best regards
Bernhard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-02 15:18                   ` Bernhard Bock
@ 2008-09-02 16:22                     ` Pablo Neira Ayuso
  2008-09-02 16:55                       ` Bernhard Bock
  0 siblings, 1 reply; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-09-02 16:22 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Bernhard Bock wrote:
> Pablo Neira Ayuso wrote:
>> I thought that your problem was that you cannot even recover the flows in
>> the first failover, but it seems to me that you have triggered several
>> fail-overs between the nodes. There's no way to hit this in a clean
>> session - ie. empty connection tracking table. 
> 
> Well, there are several thousand connections established and torn down
> on the primary node before the secondary node takes over, but as far as
> I can tell there is no "bouncing" between the nodes. So, there's no
> empty connection tracking table at failover time:
> 
> 1. Stop conntrackd
> 2. Clear conntrack table
> 3. Restart Fedora iptables service (see below)
> 4. Start conntrackd
> -> 0 connections
> 5. Start traffic
> -> lots of connections
> 6. fail-over

OK

>> If you are triggering several fail-overs with an unclean session, the new
>> script should help. So please, give it a try. It will take you a couple
>> of minutes to get it working.
> 
> Your script makes things worse for me, as it drops a lot of traffic on
> switchover.

Hm, the new script does exactly the same thing when the node becomes
primary as script_master.sh used to do, so I cannot find a reason why the
new script would do worse.

> In my setup, it helps a lot to let INVALID packets pass for a couple of
> seconds after switchover and return to the “normal” policy only after
> this time. I coded this into my keepalived scripts. During this time,
> some state recovers and most of the sessions actually work afterwards.

This is a horrible workaround :(

> With a “hard” failover, nearly all sessions get lost.

During the fail-over, keepalived recovers the virtual IPs and conntrackd
commits the states into the kernel. The commit takes very little time, but
you can still lose some packets if the state is not yet present in the
kernel - thus, these packets are logged as invalid and dropped as we
don't find any matching state (with a sane stateful rule-set, of
course). *However*, the TCP sessions should recover as the peer or the
server retransmits the packet shortly, so I don't understand why you
lose nearly all the sessions.

Is the firewall sending RST packets to the peer/server to close
connections? If so, I remember a similar report with a RHEL kernel:

http://www.mail-archive.com/netfilter-failover@lists.netfilter.org/msg00065.html

> One more thing I just noticed: It is not sufficient to clear the
> conntrack table with 'conntrack -F'. I have to unload and reload the
> iptables kernel modules to make it work again. This is done by the
> Fedora init scripts for iptables. Without this, after a "broken"
> fail-over, the machine keeps dropping some (few) packets even without
> conntrackd and a second node involved. After reloading the modules,
> everything's fine again. I guess this hints towards searching in the
> kernel space and not in the conntrack-tools?!

conntrack -F should be enough, so there's something wrong in the kernel.
There were other issues related to NAT.

There are three patches that should hit -stable for 2.6.26 soon that are
not directly related but that are worth having:

http://marc.info/?l=netfilter-devel&m=121907870404717&w=2
http://marc.info/?l=netfilter-devel&m=121907870504722&w=2
http://marc.info/?l=netfilter-devel&m=121907870604726&w=2

There were other issues related to NAT, but they are fixed in 2.6.26.
However, I'm not sure whether Fedora ships a real 2.6.26 kernel.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/239215

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-02 16:22                     ` Pablo Neira Ayuso
@ 2008-09-02 16:55                       ` Bernhard Bock
  2008-09-03  9:13                         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 25+ messages in thread
From: Bernhard Bock @ 2008-09-02 16:55 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Pablo,

Pablo Neira Ayuso wrote:
> Hm, the new script does exactly the same thing when the node becomes
> primary as script_master.sh used to do, so I cannot find a reason why the
> new script would do worse.

It does worse than my own script (with the horrible workaround). It does
as well as your original script.


> During the fail-over, keepalived recovers the virtual IPs and conntrackd
> commits the states into the kernel. The commit takes very little time, but you
> can still lose some packets if the state is not yet present in the
> kernel - thus, these packets are logged as invalid and dropped as we
> don't find any matching state (with a sane stateful rule-set, of
> course). *However*, the TCP sessions should recover as the peer or the
> server retransmits the packet shortly, so I don't understand why you
> lose nearly all the sessions.

Agreed. My problem is, it doesn't recover. It keeps dropping packets as
long as the test runs (the test stops at some point in time with socket
timeouts).


> Is the firewall sending RST packets to the peer/server to close
> connections? If so, I remember a similar report with a RHEL kernel:

Will check tomorrow.


> conntrack -F should be enough, so there's something wrong in the kernel.
> There were other issues related to NAT.

This is happening entirely without NAT. And it is only appearing while
using conntrackd and doing a failover. With a standalone firewall, I
cannot reproduce this behavior. I haven't tested any other software
using netlink, like e.g. ULOG, though.

Best regards
Bernhard


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-02 16:55                       ` Bernhard Bock
@ 2008-09-03  9:13                         ` Pablo Neira Ayuso
  2008-09-03 11:26                           ` Bernhard Bock
  0 siblings, 1 reply; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-09-03  9:13 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Bernhard Bock wrote:
> Pablo Neira Ayuso wrote:
>> During the fail-over, keepalived recovers the virtual IPs and conntrackd
>> commits the states into the kernel. The commit takes very little time, but you
>> can still lose some packets if the state is not yet present in the
>> kernel - thus, these packets are logged as invalid and dropped as we
>> don't find any matching state (with a sane stateful rule-set, of
>> course). *However*, the TCP sessions should recover as the peer or the
>> server retransmits the packet shortly, so I don't understand why you
>> lose nearly all the sessions.
> 
> Agreed. My problem is, it doesn't recover. It keeps dropping packets as
> long as the test runs (the test stops at some point in time with socket
> timeouts).

Hm, I remember that the problem reported with the RHEL kernel was similar.
That user assured me that the state entries were successfully committed
- ie. he could verify that conntrack -L displays them - but the packets
were not matching the injected states, thus, leading to invalid logs and
drops. He ended up changing to Ubuntu. However, if it is a Fedora/RHEL
problem, it would be nice to know what's wrong with it.

>> Is the firewall sending RST packets to the peer/server to close
>> connections? If so, I remember a similar report with a RHEL kernel:
> 
> Will check tomorrow.

OK, I'll wait for your news.

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-03  9:13                         ` Pablo Neira Ayuso
@ 2008-09-03 11:26                           ` Bernhard Bock
  2008-09-04 12:29                             ` Pablo Neira Ayuso
  0 siblings, 1 reply; 25+ messages in thread
From: Bernhard Bock @ 2008-09-03 11:26 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Hi Pablo,

Pablo Neira Ayuso wrote:
>>> Is the firewall sending RST packets to the peer/server to close
>>> connections? If so, I remember a similar report with a RHEL kernel:

I do not see any RST packets, neither on server nor on client side.

I have done more tests this morning. Unfortunately, things are complicated:

I repeated a basic failover test lots of times while making 1.000.000
connections. This test with 1000 parallel connections breaks every time.
500 is OK every time.

The kernel only has problems with 1000 connections, and then only from
time to time. In most of the cases (I guess ca. 80% of all tests), I do
not need to unload/load the kernel modules, but only clear the conntrack
table to get it back up running. The other times I have to reload the
kernel modules in order to make the system work again. I cannot see any
pattern there.

Best regards
Bernhard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-03 11:26                           ` Bernhard Bock
@ 2008-09-04 12:29                             ` Pablo Neira Ayuso
  2008-09-04 13:27                               ` Bernhard Bock
  0 siblings, 1 reply; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-09-04 12:29 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Hi Bernhard,

Bernhard Bock wrote:
> Pablo Neira Ayuso wrote:
>>>> Is the firewall sending RST packets to the peer/server to close
>>>> connections? If so, I remember a similar report with a RHEL kernel:
> 
> I do not see any RST packets, neither on server nor on client side.

Fine.

> I have done more tests this morning. Unfortunately, things are complicated:
> 
> I repeated a basic failover test lots of times while making 1.000.000
> connections. This test with 1000 parallel connections breaks every time.
> 500 is OK every time.

In my testbed [1], with a vanilla linux kernel 2.6.26.3 with no patching
at all, using conntrack-tools current git-snapshot, on debian using `ab'
- apache benchmark tool - to generate 1.000.000 connections with 1000
parallel connections - with log_invalid set off - I don't see any packet
hitting the log-invalid rule.

I noticed that ab reports ~3000 connection failures when triggering a
couple of fail-overs - to do so, my script sets one of the links down and
brings it up again 5 seconds later. However, the number of failures is
similar without triggering the fail-over, it is ~1500. I guess that the
connections are timing out after several resends, but I notice nothing
abnormal since the packets don't hit the log-invalid rule.

Two comments that I have to make about my tests:

* I had to raise the default value of SocketBufferSize and
SocketBufferSizeMaxGrowth in conntrackd.conf to avoid netlink overflows
with such an amount of traffic. There are log messages in conntrackd.log
that warn about this issue. Also, you can notice this if you observe
that conntrackd hits 100% CPU consumption at some point - this happens
when netlink overflows.
* Also, I had to raise the default value of McastSndSocketBuffer and
McastRcvSocketBuffer since I was noticing lost packets via conntrackd -s
- see multicast sequence tracking. This happens when the link gets
pretty congested because of

With these tweaks the results were good, conntrackd was consuming about
the same percentage of CPU as ksoftirqd (~25% each via top, which is
not very reliable but it's OK for an estimation). It would be possible
to reduce this CPU consumption even more by means of the Filter clause -
eg. only replicate TCP ESTABLISHED states.

These tweaks affect the behaviour of conntrackd, so they are worth a
try.
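
For orientation, a sketch of where these four directives might sit in conntrackd.conf. The section placement follows the layout of the example configuration files of that era as far as I recall it, and every number is a placeholder rather than a recommendation, so treat both as assumptions and check them against the sample config shipped with the installed version:

```
# conntrackd.conf fragment (sketch; all values are placeholders)
Sync {
    Multicast {
        # ... existing address/group/interface settings stay as they are ...
        McastSndSocketBuffer 1249280
        McastRcvSocketBuffer 1249280
    }
}

General {
    # ... existing settings stay as they are ...
    SocketBufferSize 262142
    SocketBufferSizeMaxGrowth 655355
}
```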

> The kernel only has problems with 1000 connections, and then only from
> time to time. In most of the cases (I guess ca. 80% of all tests), I do
> not need to unload/load the kernel modules, but only clear the conntrack
> table to get it back up running. The other times I have to reload the
> kernel modules in order to make the system work again. I cannot see any
> pattern there.

Me neither, and I cannot reproduce the problems that you're reporting.
More questions to try to diagnose your problem:

1) Does /var/log/conntrackd.log - or syslog - tell anything relevant?
Are the entries being committed to kernel-space successfully?
2) Can you see the committed entries in the kernel via `conntrack -L'
after the fail-over?
3) Are you noticing any abnormal CPU consumption?

[1] http://conntrack-tools.netfilter.org/testcase.html

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-04 12:29                             ` Pablo Neira Ayuso
@ 2008-09-04 13:27                               ` Bernhard Bock
  2008-09-05 10:55                                 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 25+ messages in thread
From: Bernhard Bock @ 2008-09-04 13:27 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Hi Pablo,

Pablo Neira Ayuso wrote:
> * I had to raise the default value of SocketBufferSize and
> SocketBufferSizeMaxGrowth in conntrackd.conf to avoid netlink overflows
> with such an amount of traffic. There are log messages in conntrackd.log
> that warn about this issue. Also, you can notice this if you observe
> that conntrackd hits 100% CPU consumption at some point - this happens
> when netlink overflows.

We already raised these values in the past. There are no hints in the
log about overflows.


> * Also, I had to raise the default value of McastSndSocketBuffer and
> McastRcvSocketBuffer since I was noticing lost packets via conntrackd -s
> - see multicast sequence tracking. This happens when the link gets
> pretty congested because of

Since the upgrade to >0.9.6, there's no problem with multicast packets in
'conntrackd -s'. On the other hand, we have a dedicated 1 gigabit link
as cluster interconnect. I do not expect congestion there.


> With these tweaks the results were good, conntrackd was consuming about
> the same percentage of CPU as ksoftirqd (~25% each via top, which is
> not very reliable but it's OK for an estimation).

We have quad core machines, and CPU is idling a lot. 2 of the cores are
idle 100%, two are idle around 50%.


> 1) Does /var/log/conntrackd.log - or syslog - tell anything relevant?
> Are the entries being committed to kernel-space successfully?

According to both conntrackd.log and syslog, entries are being committed.
I see no relevant negative entries in either log (except of course the
INVALID packets).


> 2) Can you see the committed entries in the kernel via `conntrack -L'
> after the fail-over?

yes.


> 3) Are you noticing any abnormal CPU consumption?

no.


Best regards
Bernhard




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-04 13:27                               ` Bernhard Bock
@ 2008-09-05 10:55                                 ` Pablo Neira Ayuso
  0 siblings, 0 replies; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-09-05 10:55 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Bernhard Bock wrote:
>> 1) Does /var/log/conntrackd.log - or syslog - tell anything relevant?
>> Are the entries being committed to kernel-space successfully?
> 
> According to both conntrackd.log and syslog, entries are being committed.
> I see no relevant negative entries in either log (except of course the
> INVALID packets).
> 
>> 2) Can you see the committed entries in the kernel via `conntrack -L'
>> after the fail-over?
> 
> yes.
> 
>> 3) Are you noticing any abnormal CPU consumption?
> 
> no.

Is there any pattern in the invalid log messages that your rule-set
matches during the fail-over?

Are the packets hitting invalid or new-not-syn in your rule-set?

Can you check if the packets that are logged as invalid have a
state-entry? Just take one of the log messages and do `conntrack -L -p
tcp --dport XYZW' to check if there is a state-entry for that
connection while the kernel keeps logging the packet as if no such
state-entry existed.
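
A rough sketch of how that lookup could be derived from a log line, assuming the KEY=VALUE layout of the kernel log messages quoted earlier in this thread. The helper names are hypothetical, and the command is only printed, not executed:

```shell
#!/bin/sh
# Sketch only: turn one netfilter log line into a matching conntrack query.

# log_field KEY LINE
# Prints the value of the first KEY=VALUE token in LINE (empty if absent).
log_field() {
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# log_to_conntrack_query LINE
# Prints a `conntrack -L` command for the flow named in LINE, or returns 1
# if the line lacks one of SRC, DST, SPT, DPT.
# Caveat: -s/-d/--sport/--dport match the original direction of the flow, so
# for a packet logged in the reply direction the two pairs have to be swapped.
log_to_conntrack_query() {
    src=$(log_field SRC "$1")
    dst=$(log_field DST "$1")
    spt=$(log_field SPT "$1")
    dpt=$(log_field DPT "$1")
    if [ -z "$src" ] || [ -z "$dst" ] || [ -z "$spt" ] || [ -z "$dpt" ]; then
        return 1
    fi
    printf 'conntrack -L -p tcp -s %s -d %s --sport %s --dport %s\n' \
        "$src" "$dst" "$spt" "$dpt"
}
```

Feeding it a line copied from the kernel log prints the query to run on the node that is currently active.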

Are you noticing state-entries marked as UNREPLIED in TCP states !=
SYN_SENT?
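
One possible way to look for such entries, sketched as a filter over `conntrack -L -p tcp` output. It assumes the usual line layout in which the fourth whitespace-separated field is the TCP state and unreplied flows carry an [UNREPLIED] tag; the function name is hypothetical:

```shell
#!/bin/sh
# Sketch only: print TCP entries that are flagged [UNREPLIED] although their
# state is something other than SYN_SENT. Reads conntrack -L output on stdin.
unreplied_not_syn_sent() {
    awk '$1 == "tcp" && /\[UNREPLIED\]/ && $4 != "SYN_SENT"'
}
```

Typical use would be `conntrack -L -p tcp | unreplied_not_syn_sent` right after a fail-over.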

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets
  2008-09-02 12:34               ` Bernhard Bock
  2008-09-02 12:48                 ` Pablo Neira Ayuso
@ 2008-09-04 11:40                 ` Pablo Neira Ayuso
  1 sibling, 0 replies; 25+ messages in thread
From: Pablo Neira Ayuso @ 2008-09-04 11:40 UTC (permalink / raw)
  To: Bernhard Bock; +Cc: netfilter

Bernhard Bock wrote:
> Pablo Neira Ayuso wrote:
>> That's exactly the test that I do in my testbed and it works fine here,
>> the problem must be elsewhere. The following line should help to see how
>> the connection tracking is marking the traffic as invalid:
>>
>> echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid
>>
>> However, please see the comment below before doing this and repeating
>> the test.
> 
> I didn't know one can increase the verbosity. Now I get some (more)
> helpful logs.
> 
> kernel: nf_ct_tcp: invalid packet ignored IN= OUT= SRC=10.5.0.101
> DST=10.6.6.102 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=9977 DF PROTO=TCP
> SPT=39101 DPT=80 SEQ=3381624888 ACK=0 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
> 
> Invalid syn packet?

The firewall is out-of-sync, which means that the firewall has a different
state than the client and the server. The packet is not marked as
invalid, so it is not dropped, but it is logged if
ip_conntrack_log_invalid is set.

> Hm. And then:
> 
> kernel: nf_ct_tcp: killing out of sync session IN= OUT= SRC=10.6.6.102
> DST=10.5.0.101 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=80
> DPT=41647 SEQ=3243074286 ACK=3280173284 WINDOW=5792 RES=0x00 ACK SYN
> URGP=0 OPT

The firewall sees the syn-ack from the server after the previous -
ignored - syn packet.  Since the firewall is out-of-sync, the connection
tracking system kills the conntrack entry and now blocks the packet. Thus,
the client has to resend the syn to start a clean session.

Just to clarify, I don't think that these logs are very helpful for the
problem that you're reporting.

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2008-09-05 10:55 UTC | newest]

Thread overview: 25+ messages
2008-07-18  9:39 conntrack performance test results in INVALID packets Bernhard Bock
2008-07-18 10:13 ` Jan Engelhardt
2008-07-18 10:52   ` Bernhard Bock
2008-07-18 12:14     ` Pablo Neira Ayuso
2008-07-18 14:20       ` conntrackd failover works partially, was " Bernhard Bock
2008-07-21  0:37         ` Pablo Neira Ayuso
2008-07-21 14:22           ` conntrackd failover works partially Bernhard Bock
2008-07-23  8:51             ` Bernhard Bock
2008-07-23 12:50             ` Pablo Neira Ayuso
2008-07-23 15:20               ` Bernhard Bock
2008-08-08  8:47         ` conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets Pablo Neira Ayuso
2008-08-08 12:58           ` Bernhard Bock
2008-09-02  9:39           ` Bernhard Bock
2008-09-02  9:56             ` Pablo Neira Ayuso
2008-09-02 12:34               ` Bernhard Bock
2008-09-02 12:48                 ` Pablo Neira Ayuso
2008-09-02 15:18                   ` Bernhard Bock
2008-09-02 16:22                     ` Pablo Neira Ayuso
2008-09-02 16:55                       ` Bernhard Bock
2008-09-03  9:13                         ` Pablo Neira Ayuso
2008-09-03 11:26                           ` Bernhard Bock
2008-09-04 12:29                             ` Pablo Neira Ayuso
2008-09-04 13:27                               ` Bernhard Bock
2008-09-05 10:55                                 ` Pablo Neira Ayuso
2008-09-04 11:40                 ` Pablo Neira Ayuso
