From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pablo Neira Ayuso
Subject: Re: [PATCH] netfilter: xtables: add cluster match
Date: Mon, 16 Feb 2009 15:30:20 +0100
Message-ID: <4999787C.7050203@netfilter.org>
References: <20090214192936.11718.44732.stgit@Decadence>
	<49994643.8010001@trash.net> <499971CC.6040903@netfilter.org>
	<49997247.3010105@trash.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Cc: netfilter-devel@vger.kernel.org
To: Patrick McHardy
Return-path: 
Received: from mail.us.es ([193.147.175.20]:37281 "EHLO us.es"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1751323AbZBPOWG (ORCPT ); Mon, 16 Feb 2009 09:22:06 -0500
In-Reply-To: <49997247.3010105@trash.net>
Sender: netfilter-devel-owner@vger.kernel.org
List-ID: 

Patrick McHardy wrote:
> Pablo Neira Ayuso wrote:
>> Patrick McHardy wrote:
>>>> ip maddr add 01:00:5e:00:01:01 dev eth1
>>>> ip maddr add 01:00:5e:00:01:02 dev eth2
>>>> arptables -I OUTPUT -o eth1 --h-length 6 \
>>>>     -j mangle --mangle-mac-s 01:00:5e:00:01:01
>>>> arptables -I INPUT -i eth1 --h-length 6 \
>>>>     --destination-mac 01:00:5e:00:01:01 \
>>>>     -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
>>>
>>> Mhh, is the saving of one or two characters really worth these
>>> deviations from the kind-of established naming scheme? It's hard
>>> to remember all these minor differences in my opinion.
>>
>> Hm, do you mean the name "mangle" or the name of the option
>> "--mangle-mac-d"? This is what we currently have in kernel mainline
>> and in arptables userspace, it's not my fault :). I can send you a
>> patch to fix it with consistent naming without breaking backward
>> compatibility both in kernel and user-space.
>
> Great, I wasn't aware that this already existed in userspace :)

Yes, it's hosted by the ebtables project. That tool really needs some
care: it works, but I don't know if it's actively maintained. We can
probably offer hosting for it on git.netfilter.org; I'll investigate
this.

>>>> In the case of TCP connections, the pickup facility has to be
>>>> disabled to avoid marking TCP ACK packets coming in the reply
>>>> direction as valid.
>>>>
>>>> echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
>>>
>>> I'm not sure I understand this. You *don't* want to mark them
>>> as valid, and you need to disable pickup for this?
>>
>> If TCP pickup is enabled, a single TCP ACK packet coming in the
>> reply direction makes the connection enter the TCP ESTABLISHED
>> state. Since that is a valid state transition, the cluster match
>> will consider the packet part of a connection that this node is
>> handling. The cluster match does not mark packets that trigger
>> invalid state transitions.
>
> Why use conntrack at all? Shouldn't the cluster match simply
> filter out all packets not for this cluster and that's it?
> You stated it needs conntrack to get a constant tuple, but I
> don't see why the conntrack tuple would differ from the data
> that you can gather from the packet headers.

No, source NAT connections would have different headers: A -> B in the
original direction, but B -> FW in the reply direction. Thus, I cannot
apply the same hashing to packets going in the original and the reply
direction. Moreover, if this were packet-based, think also about
possible asymmetric filtering: the original traffic direction filtered
by node 1 but the reply traffic filtered by node 2. That will not work
for a stateful firewall.
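To illustrate the point, here is a simplified sketch (not the actual
match code; the function name, parameters and the exact hash are made
up for illustration). The hash is taken over the conntrack *original*
tuple, which remains A -> B for packets in both directions, so the
original and the reply traffic always select the same node, even when
SNAT rewrites the reply headers:

#include <linux/jhash.h>
#include <net/netfilter/nf_conntrack.h>
#include <net/netfilter/nf_conntrack_tuple.h>

/* Hypothetical helper: decide if this node handles the connection. */
static bool cluster_is_mine(const struct nf_conn *ct, u32 hash_seed,
			    u32 total_nodes, u32 node_id)
{
	/* The original tuple stays A -> B even if SNAT rewrites the
	 * reply headers to B -> FW, so both directions hash alike. */
	const struct nf_conntrack_tuple *orig =
		&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
	u32 hash = jhash_1word((__force u32)orig->src.u3.ip, hash_seed);

	return (hash % total_nodes) == node_id;
}

Hashing the packet headers instead would send the reply direction of a
NATed connection to a different node, which is exactly the asymmetry
that breaks a stateful setup.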
>>>> echo +2 > /proc/sys/net/netfilter/cluster/$PROC_NAME
>>>
>>> Does this provide anything you can't do by replacing the rule
>>> itself?
>>
>> Yes, the nodes in the cluster are identified by an ID, and the rule
>> allows you to specify one ID. Say you have two cluster nodes, one
>> with ID 1 and the other with ID 2. If the cluster node with ID 1
>> goes down, you can echo +1 to the node with ID 2 so that it will
>> handle the packets going to both node ID 1 and node ID 2. Of
>> course, you need conntrackd to allow node ID 2 to recover the
>> filtering.
>
> I see. That kind of makes sense, but if you're running a
> synchronization daemon anyways, you might as well renumber
> all nodes so you still have proper balancing, right?

Indeed. The daemon could also add a new rule for the node that has
gone down, but that results in an extra hash operation per rule just
to decide whether to mark the packet or not :(.

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers