Re: [PATCH nf-next] netfilter: conntrack: add support for flextuples

From: Daniel Borkmann <daniel@iogearbox.net>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org, Thomas Graf <tgraf@suug.ch>,
	Madhu Challa <challa@noironetworks.com>
Subject: Re: [PATCH nf-next] netfilter: conntrack: add support for flextuples
Date: Mon, 04 May 2015 15:51:37 +0200
Message-ID: <55477969.4000407@iogearbox.net>
In-Reply-To: <20150504130828.GA3607@salvia>

On 05/04/2015 03:08 PM, Pablo Neira Ayuso wrote:
> On Mon, May 04, 2015 at 01:59:15PM +0200, Daniel Borkmann wrote:
>> Hi Pablo,
>>
>> On 05/04/2015 12:34 PM, Pablo Neira Ayuso wrote:
>>> On Mon, May 04, 2015 at 12:23:41PM +0200, Daniel Borkmann wrote:
>>>> This patch adds support for the possibility of doing NAT with
>>>> conflicting IP address/ports tuples from multiple, isolated
>>>> tenants, represented as network namespaces and netfilter zones.
>>>> For such internal VRFs, traffic is directed to a single or shared
>>>> pool of public IP address/port range for the external/public VRF.
>>>>
>>>> Or in other words, this allows for doing NAT *between* VRFs
>>>> instead of *inside* VRFs without requiring each tenant to NAT
>>>> twice or to use its own dedicated IP address to SNAT to. As a
>>>> side effect, it also does not require exposing a unique marker
>>>> per tenant in the data center to the public.
>>>>
>>>> Simplified example scheme:
>>>>
>>>>    +--- VRF A ---+  +--- CT Zone 1 --------+
>>>>    | 10.1.1.1/8  +--+ 10.1.1.1 ESTABLISHED |
>>>>    +-------------+  +--+-------------------+
>>>>                        |
>>>>                     +--+--+
>>>>                     | L3  +-SNAT-[20.1.1.1:20000-40000]--eth0
>>>>                     +--+--+
>>>>                        |
>>>>    +-- VRF B ----+  +--- CT Zone 2 --------+
>>>>    | 10.1.1.1/8  +--+ 10.1.1.1 ESTABLISHED |
>>>>    +-------------+  +----------------------+
>>>
>>> So, it's the skb->mark that survives between the containers.  I'm not
>>> sure it makes sense to keep a zone 0 from the container that performs
>>> SNAT. Instead, we can probably restore the zone based on the
>>> skb->mark. The problem is that the existing zone is u16. In nftables,
>>> Patrick already mentioned supporting casting, so we can do
>>> something like:
>>>
>>>          ct zone set (u16)meta mark
>>>
>>> So you can reserve a part of the skb->mark to map it to the zone. I'm
>>> not very convinced about this.
>>
>> Thanks for the feedback! I'm not sure yet that I have fully
>> understood the above suggestion for the described problem, i.e. how
>> would replies on the SNAT find the correct zone again?
>
> From the original direction, you can set the zone based on the mark:
>
>          -m mark --mark 1 -j CT --zone 1
>
> Then, from the reply direction, you can restore it:
>
>          -m conntrack --ctzone 1 -j MARK --set-mark 1
>          ...
>
> --ctzone is not supported though, it would need a new revision for the
> conntrack match.

Ok, thanks a lot, now I see what you mean.

If I'm not missing something, I would see two problems with that: the
first is that the zone match would be linear, e.g. if we support 100 or
more zones, we would need to walk through the rules linearly until we
find --mark 100, right?
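
To make the cost concrete, here is a minimal sketch in Python (a toy
model, not kernel or iptables code). It assumes one
"-m mark --mark N -j CT --zone N" rule per zone in a single chain that
is evaluated top to bottom; build_rules() and rules_walked() are
hypothetical helper names used only for this illustration.

```python
def build_rules(num_zones):
    """Return one (mark, rule text) pair per zone, in chain order."""
    return [(zone, f"-m mark --mark {zone} -j CT --zone {zone}")
            for zone in range(1, num_zones + 1)]


def rules_walked(rules, mark):
    """Count how many rules are evaluated before one matches the mark."""
    for walked, (rule_mark, _text) in enumerate(rules, start=1):
        if rule_mark == mark:
            return walked
    # No rule matched, so every rule in the chain was evaluated.
    return len(rules)


rules = build_rules(100)
print(rules_walked(rules, 1))    # first rule matches right away
print(rules_walked(rules, 100))  # last rule: the whole chain is walked
```

In this model the number of evaluated rules grows with the zone
number, which is the linear behaviour described above.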

The other issue is that from the reply direction (when the packet comes
in with the translated address), we couldn't match in the connection
tracking table on the correct zone. The above restore rule would assume
that the match itself has already taken place and was successful, no?
(That is actually why we are direction based: --flextuple ORIGINAL|REPLY.)
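
The second point can be sketched the same way. The following is a toy
Python model, not the conntrack implementation: it assumes entries are
looked up by (zone, tuple), and all addresses and ports except
10.1.1.1 and 20.1.1.1 are made up for illustration. lookup_any_zone()
is a hypothetical helper that only shows why the reply tuple is unique
after SNAT; it is not meant to describe how the patch does it.

```python
# Toy model: conntrack entries are keyed by (zone, tuple).
SERVER = ("30.1.1.1", 80)    # stands in for <B>, address is made up
CLIENT = ("10.1.1.1", 1234)  # stands in for <A>, identical in both zones

table = {}


def track(zone, orig, reply):
    """Register both directions of one connection under its zone."""
    table[(zone, orig)] = ("ORIGINAL", zone)
    table[(zone, reply)] = ("REPLY", zone)


# Original direction: the zone is known (set from the mark) and SNAT
# picks a unique public port per connection.
track(1, (CLIENT, SERVER), (SERVER, ("20.1.1.1", 20000)))
track(2, (CLIENT, SERVER), (SERVER, ("20.1.1.1", 20001)))


def lookup(zone, tup):
    """Zone-keyed lookup: misses if the zone is not the right one."""
    return table.get((zone, tup))


def lookup_any_zone(tup):
    """Hypothetical: ignore the zone and accept only a unique hit."""
    hits = [entry for (_zone, t), entry in table.items() if t == tup]
    return hits[0] if len(hits) == 1 else None


# A reply from the public side carries no mark, so it is in zone 0.
reply_pkt = (SERVER, ("20.1.1.1", 20001))
print(lookup(0, reply_pkt))             # miss: no entry to restore a mark from
print(lookup_any_zone(reply_pkt))       # unique thanks to the SNAT port
print(lookup_any_zone((CLIENT, SERVER)))  # ambiguous in the original direction
```

The zone-keyed lookup misses for the reply packet, so a later
--ctzone match would have no conntrack entry to work with.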

>> Our issue, simplified, basically boils down to this: given two zones
>> that both use IP address <A>, both want to talk to IP address <B> in
>> a third zone. To let those two with <A> talk to <B>, connections are
>> being routed + SNATed from a non-unique to a unique address/port
>> tuple [which the proposed approach solves], so they can talk to <B>.

Thread overview: 12+ messages
2015-05-04 10:23 [PATCH nf-next] netfilter: conntrack: add support for flextuples Daniel Borkmann
2015-05-04 10:34 ` Pablo Neira Ayuso
2015-05-04 11:59   ` Daniel Borkmann
2015-05-04 13:08     ` Pablo Neira Ayuso
2015-05-04 13:47       ` Thomas Graf
2015-05-06 14:27         ` Pablo Neira Ayuso
2015-05-06 18:00           ` Daniel Borkmann
2015-05-06 18:50             ` Pablo Neira Ayuso
2015-05-07 12:01               ` Daniel Borkmann
2015-05-07 18:10                 ` Pablo Neira Ayuso
2015-05-08  9:45                   ` Daniel Borkmann
2015-05-04 13:51       ` Daniel Borkmann [this message]