From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pablo Neira Ayuso
Subject: Re: [PATCH nf-next] netfilter: conntrack: add support for flextuples
Date: Mon, 4 May 2015 15:08:28 +0200
Message-ID: <20150504130828.GA3607@salvia>
References: <776b8819c85c83088478b933a35691133055347a.1430733932.git.daniel@iogearbox.net> <20150504103451.GA12200@salvia> <55475F13.1000304@iogearbox.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netfilter-devel@vger.kernel.org, Thomas Graf , Madhu Challa
To: Daniel Borkmann
Return-path:
Received: from mail.us.es ([193.147.175.20]:44358 "EHLO mail.us.es" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752704AbbEDNDz (ORCPT ); Mon, 4 May 2015 09:03:55 -0400
Content-Disposition: inline
In-Reply-To: <55475F13.1000304@iogearbox.net>
Sender: netfilter-devel-owner@vger.kernel.org
List-ID:

On Mon, May 04, 2015 at 01:59:15PM +0200, Daniel Borkmann wrote:
> Hi Pablo,
>
> On 05/04/2015 12:34 PM, Pablo Neira Ayuso wrote:
> >On Mon, May 04, 2015 at 12:23:41PM +0200, Daniel Borkmann wrote:
> >>This patch adds support for the possibility of doing NAT with
> >>conflicting IP address/ports tuples from multiple, isolated
> >>tenants, represented as network namespaces and netfilter zones.
> >>For such internal VRFs, traffic is directed to a single or shared
> >>pool of public IP address/port range for the external/public VRF.
> >>
> >>Or in other words, this allows for doing NAT *between* VRFs
> >>instead of *inside* VRFs without requiring each tenant to NAT
> >>twice or to use its own dedicated IP address to SNAT to, also
> >>with the side effect to not requiring to expose a unique marker
> >>per tenant in the data center to the public.
> >>
> >>Simplified example scheme:
> >>
> >>  +--- VRF A ---+  +--- CT Zone 1 --------+
> >>  | 10.1.1.1/8  +--+ 10.1.1.1 ESTABLISHED |
> >>  +-------------+  +--+-------------------+
> >>                      |
> >>                   +--+--+
> >>                   | L3  +-SNAT-[20.1.1.1:20000-40000]--eth0
> >>                   +--+--+
> >>                      |
> >>  +-- VRF B ----+  +--- CT Zone 2 --------+
> >>  | 10.1.1.1/8  +--+ 10.1.1.1 ESTABLISHED |
> >>  +-------------+  +----------------------+
> >
> >So, it's the skb->mark that survives between the containers. I'm not
> >sure it makes sense to keep a zone 0 from the container that performs
> >SNAT. Instead, we can probably restore the zone based on the
> >skb->mark. The problem is that the existing zone is u16. In nftables,
> >Patrick already mentioned about supporting casting so we can do
> >something like:
> >
> >        ct zone set (u16)meta mark
> >
> >So you can reserve a part of the skb->mark to map it to the zone. I'm
> >not very convinced about this.
>
> Thanks for the feedback! I'm not yet sure though, I understood the
> above suggestion to the described problem fully so far, i.e. how
> would replies on the SNAT find the correct zone again?

From the original direction, you can set the zone based on the mark:

        -m mark --mark 1 -j CT --zone 1

Then, from the reply direction, you can restore it:

        -m conntrack --ctzone 1 -j MARK --set-mark 1

... --ctzone is not supported though, it would need a new revision for
the conntrack match.

> Our issue simplified, basically boils down to: given are two zones,
> both use IP address , both zones want to talk to IP address in
> a third zone. To let those two with talk to , connections are
> being routed + SNATed from a non-unique to a unique address/port
> tuple [which the proposed approach solves], so they can talk to .