netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Mika Penttilä" <mika.penttila@nextfour.com>
To: Pablo Neira Ayuso <pablo@netfilter.org>, netfilter-devel@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org
Subject: Re: [PATCH net 3/9] netfilter: nf_nat: undo erroneous tcp edemux lookup
Date: Sat, 6 Mar 2021 16:49:19 +0200	[thread overview]
Message-ID: <b0d7a77b-a33d-150c-65e2-6caebcec772f@nextfour.com> (raw)
In-Reply-To: <20210306121223.28711-4-pablo@netfilter.org>



On 6.3.2021 14.12, Pablo Neira Ayuso wrote:
> From: Florian Westphal <fw@strlen.de>
>
> Under extremely rare conditions TCP early demux will retrieve the wrong
> socket.
>
> 1. local machine establishes a connection to a remote server, S, on port
>     p.
>
>     This gives:
>     laddr:lport -> S:p
>     ... both in tcp and conntrack.
>
> 2. local machine establishes a connection to host H, on port p2.
>     2a. TCP stack choses same laddr:lport, so we have
>     laddr:lport -> H:p2 from TCP point of view.
>     2b). There is a destination NAT rewrite in place, translating
>          H:p2 to S:p.  This results in following conntrack entries:
>
>     I)  laddr:lport -> S:p  (origin)  S:p -> laddr:lport (reply)
>     II) laddr:lport -> H:p2 (origin)  S:p -> laddr:lport2 (reply)
>
>     NAT engine has rewritten laddr:lport to laddr:lport2 to map
>     the reply packet to the correct origin.
Could you eloborate where and how linux nat engine is doing the

laddr:lport to laddr:lport2

rewrite? There's only DST nat and there should be conflict (for reply) 
in tuple establishment afaik....


>
>     When server sends SYN/ACK to laddr:lport2, the PREROUTING hook
>     will undo-the SNAT transformation, rewriting IP header to
>     S:p -> laddr:lport
>
>     This causes TCP early demux to associate the skb with the TCP socket
>     of the first connection.
>
>     The INPUT hook will then reverse the DNAT transformation, rewriting
>     the IP header to H:p2 -> laddr:lport.
>
> Because packet ends up with the wrong socket, the new connection
> never completes: originator stays in SYN_SENT and conntrack entry
> remains in SYN_RECV until timeout, and responder retransmits SYN/ACK
> until it gives up.
>
> To resolve this, orphan the skb after the input rewrite:
> Because the source IP address changed, the socket must be incorrect.
> We can't move the DNAT undo to prerouting due to backwards
> compatibility, doing so will make iptables/nftables rules to no longer
> match the way they did.
>
> After orphan, the packet will be handed to the next protocol layer
> (tcp, udp, ...) and that will repeat the socket lookup just like as if
> early demux was disabled.
>
> Fixes: 41063e9dd1195 ("ipv4: Early TCP socket demux.")
> Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1427
> Signed-off-by: Florian Westphal <fw@strlen.de>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> ---
>   net/netfilter/nf_nat_proto.c | 25 +++++++++++++++++++++----
>   1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/net/netfilter/nf_nat_proto.c b/net/netfilter/nf_nat_proto.c
> index e87b6bd6b3cd..4731d21fc3ad 100644
> --- a/net/netfilter/nf_nat_proto.c
> +++ b/net/netfilter/nf_nat_proto.c
> @@ -646,8 +646,8 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
>   }
>   
>   static unsigned int
> -nf_nat_ipv4_in(void *priv, struct sk_buff *skb,
> -	       const struct nf_hook_state *state)
> +nf_nat_ipv4_pre_routing(void *priv, struct sk_buff *skb,
> +			const struct nf_hook_state *state)
>   {
>   	unsigned int ret;
>   	__be32 daddr = ip_hdr(skb)->daddr;
> @@ -659,6 +659,23 @@ nf_nat_ipv4_in(void *priv, struct sk_buff *skb,
>   	return ret;
>   }
>   
> +static unsigned int
> +nf_nat_ipv4_local_in(void *priv, struct sk_buff *skb,
> +		     const struct nf_hook_state *state)
> +{
> +	__be32 saddr = ip_hdr(skb)->saddr;
> +	struct sock *sk = skb->sk;
> +	unsigned int ret;
> +
> +	ret = nf_nat_ipv4_fn(priv, skb, state);
> +
> +	if (ret == NF_ACCEPT && sk && saddr != ip_hdr(skb)->saddr &&
> +	    !inet_sk_transparent(sk))
> +		skb_orphan(skb); /* TCP edemux obtained wrong socket */
> +
> +	return ret;
> +}
> +
>   static unsigned int
>   nf_nat_ipv4_out(void *priv, struct sk_buff *skb,
>   		const struct nf_hook_state *state)
> @@ -736,7 +753,7 @@ nf_nat_ipv4_local_fn(void *priv, struct sk_buff *skb,
>   static const struct nf_hook_ops nf_nat_ipv4_ops[] = {
>   	/* Before packet filtering, change destination */
>   	{
> -		.hook		= nf_nat_ipv4_in,
> +		.hook		= nf_nat_ipv4_pre_routing,
>   		.pf		= NFPROTO_IPV4,
>   		.hooknum	= NF_INET_PRE_ROUTING,
>   		.priority	= NF_IP_PRI_NAT_DST,
> @@ -757,7 +774,7 @@ static const struct nf_hook_ops nf_nat_ipv4_ops[] = {
>   	},
>   	/* After packet filtering, change source */
>   	{
> -		.hook		= nf_nat_ipv4_fn,
> +		.hook		= nf_nat_ipv4_local_in,
>   		.pf		= NFPROTO_IPV4,
>   		.hooknum	= NF_INET_LOCAL_IN,
>   		.priority	= NF_IP_PRI_NAT_SRC,


  reply	other threads:[~2021-03-06 14:50 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-06 12:12 [PATCH net 0/9] Netfilter fixes for net Pablo Neira Ayuso
2021-03-06 12:12 ` [PATCH net 1/9] uapi: nfnetlink_cthelper.h: fix userspace compilation error Pablo Neira Ayuso
2021-03-07  1:10   ` patchwork-bot+netdevbpf
2021-03-06 12:12 ` [PATCH net 2/9] netfilter: conntrack: Remove a double space in a log message Pablo Neira Ayuso
2021-03-06 12:12 ` [PATCH net 3/9] netfilter: nf_nat: undo erroneous tcp edemux lookup Pablo Neira Ayuso
2021-03-06 14:49   ` Mika Penttilä [this message]
2021-03-06 16:10     ` Mika Penttilä
2021-03-06 12:12 ` [PATCH net 4/9] netfilter: conntrack: avoid misleading 'invalid' in log message Pablo Neira Ayuso
2021-03-06 12:12 ` [PATCH net 5/9] selftests: netfilter: test nat port clash resolution interaction with tcp early demux Pablo Neira Ayuso
2021-03-06 12:12 ` [PATCH net 6/9] netfilter: x_tables: gpf inside xt_find_revision() Pablo Neira Ayuso
2021-03-06 12:12 ` [PATCH net 7/9] netfilter: nftables: disallow updates on table ownership Pablo Neira Ayuso
2021-03-06 12:12 ` [PATCH net 8/9] netfilter: nftables: fix possible double hook unregistration with table owner Pablo Neira Ayuso
2021-03-06 12:12 ` [PATCH net 9/9] netfilter: nftables: bogus check for netlink portID " Pablo Neira Ayuso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b0d7a77b-a33d-150c-65e2-6caebcec772f@nextfour.com \
    --to=mika.penttila@nextfour.com \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).