* multiprimary conntrackd setup
From: Sebastian Vieira @ 2008-06-17 11:02 UTC
To: netfilter

Hi,

I must be looking in the wrong places for documentation, but so far I'm
unable to find it. I'm trying to set up a multiprimary (active-active)
conntrackd on two firewalls. I have conntrackd running on both nodes, and
'conntrackd -s' shows that multicast is working. However, I still have to
run a manual 'conntrackd -c; conntrackd -R' to sync both tables (as would
be proper in a failover / active-backup situation). Other than enabling
CacheWriteThrough, I couldn't find anything on multiprimary setup.

If someone could point me to the correct documentation, I would be very
happy indeed :)

thanks,
Sebastian
* Re: multiprimary conntrackd setup
From: Pablo Neira Ayuso @ 2008-06-18 13:05 UTC
To: Sebastian Vieira; Cc: netfilter

Sebastian Vieira wrote:
> Hi,
>
> I must be looking in the wrong places for documentation, but so far I'm
> unable to find it. I'm trying to set up a multiprimary (active-active)
> conntrackd on two firewalls. I have conntrackd running on both nodes,
> and 'conntrackd -s' shows that multicast is working. However, I still
> have to run a manual 'conntrackd -c; conntrackd -R' to sync both tables
> (as would be proper in a failover / active-backup situation). Other
> than enabling CacheWriteThrough, I couldn't find anything on
> multiprimary setup.

What kind of active-active? There are two kinds:

a) symmetric or flow-based: the packets of a given flow are always
handled by the same firewall replica. In this case, you only have to call
'conntrackd -c' during the failover (which is usually done by your HA
manager, such as keepalived).

b) asymmetric or packet-based: the typical case of OSPF setups. There is
no guarantee that a packet is handled by the same firewall replica, as
OSPF may change the routes at any time. In that case, you have to enable
CacheWriteThrough. However, from the design point of view, conntrackd is
better suited to scenario a).

> If someone could point me to the correct documentation, I would be very
> happy indeed :)

There's no documentation on active-active setups yet, but there will be
some at some point for sure. Anyway, I'd appreciate it if you could write
it. Feel free to ask whatever you need.

--
"Los honestos son inadaptados sociales" ("The honest are social misfits")
-- Les Luthiers
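[Editor's note] The failover hook mentioned above, which an HA manager
such as keepalived would invoke on a state transition, can be sketched
as a small wrapper. This is a hedged, illustrative sketch, not the
script shipped with conntrackd; the command sequence (commit the
external cache, then resync against the kernel table) follows the flags
discussed in this thread:

```python
import subprocess

# Commands an HA manager hook might run when this node becomes primary,
# per the thread: commit the external cache into the kernel conntrack
# table, then resync the internal cache against the kernel table.
# The exact sequence in conntrackd's shipped scripts may differ.
ON_PRIMARY = [
    ["conntrackd", "-c"],  # commit external cache into the kernel table
    ["conntrackd", "-R"],  # resync internal cache with the kernel table
]

def run_failover(commands=ON_PRIMARY, run=subprocess.run):
    """Execute the failover sequence; `run` is injectable for testing."""
    for cmd in commands:
        run(cmd, check=True)
```

In practice the HA manager would call `run_failover()` from its notify
hook; injecting `run` keeps the sequence testable without a live
conntrackd.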
* Re: multiprimary conntrackd setup
From: Sebastian Vieira @ 2008-06-23 6:46 UTC
To: Pablo Neira Ayuso; Cc: netfilter

On Wed, Jun 18, 2008 at 3:05 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>
> What kind of active-active? There are two kinds:

-snip-

> b) asymmetric or packet-based: the typical case of OSPF setups. There
> is no guarantee that a packet is handled by the same firewall replica,
> as OSPF may change the routes at any time. In that case, you have to
> enable CacheWriteThrough. However, from the design point of view,
> conntrackd is better suited to scenario a).

I'm using the asymmetric setup: two firewalls connected with BGP to the
service provider and, as you mentioned, no way of knowing which firewall
handles which packet.

But the funny thing is that it's working now :) Yes, I enabled the
CacheWriteThrough option, but I was testing with ICMP. I later learnt
that ICMP is unreliable for this kind of test, because when I tried a
simple TCP connection it worked fine.

I'm still fiddling around a bit with the ip_conntrack_max sysctl setting
because I tend to get dropped packets. Also, 'conntrackd -s' indicates
that both nodes failed to destroy connections in the internal cache.
These numbers roughly match the other node's successfully destroyed
connections:

node1:
connections destroyed: 31473050 failed: 7334

node2:
connections destroyed: 7441 failed: 31475657

Is this something I need to worry about?

regards,
Sebastian
* Re: multiprimary conntrackd setup
From: Pablo Neira Ayuso @ 2008-06-23 9:09 UTC
To: Sebastian Vieira; Cc: netfilter

Sebastian Vieira wrote:
> I'm using the asymmetric setup: two firewalls connected with BGP to the
> service provider and, as you mentioned, no way of knowing which
> firewall handles which packet.
>
> But the funny thing is that it's working now :) Yes, I enabled the
> CacheWriteThrough option, but I was testing with ICMP. I later learnt
> that ICMP is unreliable for this kind of test, because when I tried a
> simple TCP connection it worked fine.

As you said, replicating ICMP does not make too much sense to me either.

Some considerations on this setup: there's a shortcoming in the
asymmetric approach; conntrackd performs much better in a flow-based
multiprimary setup.

The multipath setup that you're using works fine if and only if the RTT
between the firewall cluster and the server peer is greater than the time
to send and inject the state change from FW1 to FW2. Otherwise, you'll
probably notice a slowdown in the connection setup. This condition holds
if the server peer is on the Internet (a DSL RTT is ~30 ms, and the
synchronization messages barely take 0.01 ms here). This limitation is
due to the asynchronous nature of the solution. The design of conntrackd
supports this scenario, but flow-based performs much better. In short:
BGP works at the packet level, while stateful firewalling operates at the
flow level.

> I'm still fiddling around a bit with the ip_conntrack_max sysctl
> setting because I tend to get dropped packets. Also, 'conntrackd -s'
> indicates that both nodes failed to destroy connections in the internal
> cache. These numbers roughly match the other node's successfully
> destroyed connections:
>
> node1:
> connections destroyed: 31473050 failed: 7334
>
> node2:
> connections destroyed: 7441 failed: 31475657
>
> Is this something I need to worry about?

Well, I need to know which replication approach you're using. Anyhow,
I'll try to make several assumptions from the information that you've
posted.

Basically, that output means that node1 has tried to destroy 7334
connections that were not available in its cache. Since you have trimmed
the output, I don't know if it's the internal or the external cache.
Assuming that the information you've posted refers to node1's internal
cache and node2's external cache:

1) node1 did not resynchronize against the kernel conntrack table at
startup (you forgot to include 'conntrackd -R' in your scripts to force
resynchronization between the internal cache and the kernel conntrack
table). Use 'conntrackd -i' to check that its output is similar to that
of 'conntrack -L'.

2) node2 has tried to destroy several connections in its external cache
that were not available. This means that node2 did not issue a
'conntrackd -n' to resynchronize its external cache with node1's internal
cache (assuming that you're using the FTFW or NOTRACK approach).
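[Editor's note] The RTT condition above can be written down explicitly.
In this hedged sketch, a new connection only notices the replication
delay when the state change has not reached the other firewall by the
time the reply packet arrives; the function name is illustrative, and
the figures are the ones quoted in the thread:

```python
def replication_is_hidden(rtt_ms: float, sync_ms: float) -> bool:
    """True if the peer RTT masks the time to send and inject the state
    change on the other firewall replica -- the condition stated in the
    thread for the asymmetric setup to work without slowing down
    connection setup."""
    return rtt_ms > sync_ms

# Figures quoted in the thread: ~30 ms DSL RTT vs ~0.01 ms sync time.
print(replication_is_hidden(30.0, 0.01))    # Internet peer: holds
print(replication_is_hidden(0.005, 0.01))   # same-switch LAN peer: fails
```

This is why the setup behaves well toward Internet peers but can slow
down connection setup between hosts on the same LAN.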
* Re: multiprimary conntrackd setup
From: Sebastian Vieira @ 2008-06-23 12:42 UTC
To: Pablo Neira Ayuso; Cc: netfilter

On Mon, Jun 23, 2008 at 11:09 AM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:

First of all, thanks a lot for the (quick) response, I appreciate it!

> As you said, replicating ICMP does not make too much sense to me
> either.
>
> Some considerations on this setup: there's a shortcoming in the
> asymmetric approach; conntrackd performs much better in a flow-based
> multiprimary setup.
>
> The multipath setup that you're using works fine if and only if the RTT
> between the firewall cluster and the server peer is greater than the
> time to send and inject the state change from FW1 to FW2. [...]

Right. Is there any way to measure these synchronization times? The setup
is located on the LAN, where even the synchronization messages pass
through a switch. Maybe we can overcome this by hooking up a crossover
cable, but that depends; I don't know if we have a free NIC :)

> Well, I need to know which replication approach you're using. Anyhow,
> I'll try to make several assumptions from the information that you've
> posted.
>
> Basically, that output means that node1 has tried to destroy 7334
> connections that were not available in its cache. Since you have
> trimmed the output, I don't know if it's the internal or the external
> cache.

Both figures are from the internal cache. I'll paste the output in full
below. Note that conntrackd was restarted just a couple of minutes ago:

node1:
cache internal:
current active connections: 15200
connections created: 28326 failed: 0
connections updated: 68477 failed: 0
connections destroyed: 13126 failed: 1

cache external:
current active connections: 167
connections created: 167 failed: 0
connections updated: 434 failed: 0
connections destroyed: 0 failed: 0

traffic processed:
0 Bytes 0 Pckts

multicast traffic:
6735580 Bytes sent 53708 Bytes recv
75025 Pckts sent 596 Pckts recv
0 Error send 0 Error recv

multicast sequence tracking:
0 Pckts mfrm 0 Pckts lost

node2:
cache internal:
current active connections: 1699
connections created: 1826 failed: 0
connections updated: 636 failed: 0
connections destroyed: 127 failed: 804

cache external:
current active connections: 11893
connections created: 11989 failed: 0
connections updated: 68991 failed: 0
connections destroyed: 96 failed: 0

traffic processed:
0 Bytes 0 Pckts

multicast traffic:
58372 Bytes sent 6810200 Bytes recv
646 Pckts sent 75940 Pckts recv
0 Error send 0 Error recv

multicast sequence tracking:
0 Pckts mfrm 0 Pckts lost

And for completeness' sake, the conntrackd.conf for both nodes (where
only IPv4_interface differs):

fw02:~# cat /etc/conntrackd/conntrackd.conf
Sync {
    Mode NOTRACK {
        CommitTimeout 180
    }
    Multicast {
        IPv4_address 225.0.0.50
        IPv4_interface 172.29.254.3
        Interface eth1
        Group 3780
        McastSndSocketBuffer 1249280
        McastRcvSocketBuffer 1249280
    }
    Checksum on
    CacheWriteThrough On
}
General {
    HashSize 8192
    HashLimit 65535
    LockFile /var/lock/conntrack.lock
    LogFile /var/log/conntrackd.log
    UNIX {
        Path /tmp/sync.sock
        Backlog 20
    }
    SocketBufferSize 262142
    SocketBufferSizeMaxGrown 655355
}
IgnoreTrafficFor {
    IPv4_address 172.29.254.3  # loc (fw02)
    IPv4_address 172.29.254.2  # loc (fw01)
    IPv4_address 172.29.253.1  # loc
    IPv4_address 127.0.0.1     # loopback
}
IgnoreProtocol {
    UDP
    VRRP
}

regards,
Sebastian
* Re: multiprimary conntrackd setup
From: Pablo Neira Ayuso @ 2008-06-24 16:06 UTC
To: Sebastian Vieira; Cc: netfilter

Sebastian Vieira wrote:
> Right. Is there any way to measure these synchronization times? The
> setup is located on the LAN, where even the synchronization messages
> pass through a switch. Maybe we can overcome this by hooking up a
> crossover cable, but that depends; I don't know if we have a free NIC :)

Actually, the nodes must use a dedicated link, otherwise you risk leaking
state information. And please, elaborate on your setup a bit more.

> Both figures are from the internal cache. I'll paste the output in full
> below. Note that conntrackd was restarted just a couple of minutes ago:
>
> node1:
> cache internal:
> current active connections: 15200
> [...]
>
> node2:
> cache internal:
> current active connections: 1699
> [...]
>
> And for completeness' sake, the conntrackd.conf for both nodes (where
> only IPv4_interface differs):
>
> fw02:~# cat /etc/conntrackd/conntrackd.conf
> Sync {
>     Mode NOTRACK {
>         CommitTimeout 180
>     }
> [...]

If you're using NOTRACK, the nodes do not seem to be in sync, as the
number of internal cache entries in node1 must be equal to node2's in the
external cache. I guess that you've been testing the failover several
times before posting these results. BTW, which HA manager are you using?
The HA manager is required to assist conntrackd, as it invokes several
important commands (see the scripts).
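[Editor's note] The sync check described above -- under NOTRACK, node1's
internal cache size should match node2's external cache size -- can be
applied mechanically to 'conntrackd -s' output. A hedged sketch follows;
the parsing assumes exactly the stat labels shown earlier in the thread,
and the helper names are illustrative:

```python
import re

def active_connections(stats_text: str) -> dict:
    """Extract 'current active connections' per cache section from
    conntrackd -s output (labels as shown in this thread)."""
    result = {}
    section = None
    for line in stats_text.splitlines():
        m = re.match(r"cache (\w+):", line.strip())
        if m:
            section = m.group(1)
            continue
        m = re.match(r"current active connections:\s*(\d+)", line.strip())
        if m and section:
            result[section] = int(m.group(1))
    return result

def in_sync(node1_stats: str, node2_stats: str, tolerance: int = 0) -> bool:
    """NOTRACK invariant from the thread: each node's internal cache
    should mirror the other node's external cache."""
    n1 = active_connections(node1_stats)
    n2 = active_connections(node2_stats)
    return (abs(n1["internal"] - n2["external"]) <= tolerance
            and abs(n2["internal"] - n1["external"]) <= tolerance)
```

Fed the numbers from the thread (node1 internal: 15200 vs node2
external: 11893), `in_sync` returns False, matching the diagnosis that
the nodes had drifted apart.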
* Re: multiprimary conntrackd setup
From: Sebastian Vieira @ 2008-06-25 21:02 UTC
To: Pablo Neira Ayuso; Cc: netfilter

On Tue, Jun 24, 2008 at 6:06 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Actually, the nodes must use a dedicated link, otherwise you risk
> leaking state information. And please, elaborate on your setup a bit
> more.

Hm, the hardware lacks a dedicated link at the moment, so I send the
multicast traffic over the least used link. I'll see if I can add an
extra NIC to get the dedicated link up.

I'll try to describe the setup better:

There are two firewalls, both configured as BGP routers using quagga.
Right now we have configured it so that traffic leaves through one node
and comes back through the other. There is no HA software configured for
this, only for the floating IP (the main gateway IP). It was my
understanding that connection updates would be inserted immediately,
without depending on an HA manager script.

Also, since this is a production environment and we can't really
'experiment' too much with it, the way we're simulating the active-active
setup now is by adding a static route to some external host via the
inactive firewall, so the packet route is as follows:

[our host in dmz] -----> [fw02] -----> [external host] -----> [fw01] -----> [our host in dmz]

But I guess that the ACK to our external host, and the SYN/ACK response
from it, is faster than the conntrackd synchronization between fw01 and
fw02. Am I right in that assumption?

> If you're using NOTRACK, the nodes do not seem to be in sync, as the
> number of internal cache entries in node1 must be equal to node2's in
> the external cache. I guess that you've been testing the failover
> several times before posting these results. BTW, which HA manager are
> you using? The HA manager is required to assist conntrackd, as it
> invokes several important commands (see the scripts).

See above. Did I understand the working of conntrackd incorrectly?

We have somewhat come to the conclusion that a multiprimary firewall
setup may be impossible to accomplish due to the latency, and that it
might be better (read: easier) to split the routers from the firewalls
(they are now one and the same physical machine), have only one firewall
active at a time, and add the HA manager to sync the conntrack tables
upon failover.

regards,
Sebastian
* Re: multiprimary conntrackd setup
From: Pablo Neira Ayuso @ 2008-06-26 15:25 UTC
To: Sebastian Vieira; Cc: netfilter

Sebastian Vieira wrote:
> Hm, the hardware lacks a dedicated link at the moment, so I send the
> multicast traffic over the least used link. I'll see if I can add an
> extra NIC to get the dedicated link up.
>
> I'll try to describe the setup better:
>
> There are two firewalls, both configured as BGP routers using quagga.
> Right now we have configured it so that traffic leaves through one node
> and comes back through the other. There is no HA software configured
> for this, only for the floating IP (the main gateway IP). It was my
> understanding that connection updates would be inserted immediately,
> without depending on an HA manager script.

Yes, but the HA manager assists conntrackd to flush/request a resync/etc.
whenever a node comes up or goes down. Otherwise, you'll probably get
flow entries stuck in conntrackd forever.

> Also, since this is a production environment and we can't really
> 'experiment' too much with it, the way we're simulating the
> active-active setup now is by adding a static route to some external
> host via the inactive firewall, so the packet route is as follows:
>
> [our host in dmz] -----> [fw02] -----> [external host] -----> [fw01] -----> [our host in dmz]
>
> But I guess that the ACK to our external host, and the SYN/ACK response
> from it, is faster than the conntrackd synchronization between fw01 and
> fw02. Am I right in that assumption?

It depends on where your external host is, i.e. the RTT between the
external host and your dmz host (see my previous email).

Moreover, the CPU consumption of the asymmetric approach is higher,
since:

a) you have to inject every single state change into the kernel.
b) you have to replicate every single state change.

The asymmetric multiprimary approach is less performant than the
symmetric one.

> See above. Did I understand the working of conntrackd incorrectly?
>
> We have somewhat come to the conclusion that a multiprimary firewall
> setup may be impossible to accomplish due to the latency, and that it
> might be better (read: easier) to split the routers from the firewalls
> (they are now one and the same physical machine), have only one
> firewall active at a time, and add the HA manager to sync the conntrack
> tables upon failover.

The per-packet (asymmetric) multiprimary setup is possible but, as said,
I'd suggest a per-flow multiprimary setup, i.e. the same firewall always
handles the same subset of flows. I have set up one symmetric
multiprimary testbed with ClusterIP; however, that target is focused on
backend clustering, thus not on gateway clustering. Anytime soon, I'd
like to come up with a new target similar to ClusterIP but for gateways
and, of course, some documentation to avoid this sort of thread.
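[Editor's note] The per-flow partitioning recommended above -- each
firewall deterministically owning a subset of flows -- boils down to
hashing the connection tuple to pick a node. A hedged, illustrative
sketch of the idea (this is not how ClusterIP or any conntrackd
component is actually implemented):

```python
import hashlib

def owner_node(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
               proto: str, n_firewalls: int = 2) -> int:
    """Map a connection tuple to the firewall replica that should handle
    every packet of that flow.  A stable hash keeps the mapping
    deterministic across packets and restarts (unlike Python's built-in
    hash(), which is randomized per process)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_firewalls
```

Because every packet of a flow maps to the same node, no state needs to
be injected into the peer's kernel while the owning node is alive --
which is exactly why the flow-based setup avoids the RTT constraint of
the packet-based one.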