From: Jakub Kicinski <kuba@kernel.org>
To: Maximilien Cuony <maximilien.cuony@arcanite.ch>
Cc: netdev@vger.kernel.org, Phil Sutter <phil@nwl.cc>,
	Florian Westphal <fw@strlen.de>,
	Mike Manning <mvrmanning@gmail.com>,
	David Ahern <dsahern@kernel.org>,
	netfilter-devel@vger.kernel.org
Subject: Re: [REGRESSION] Unable to NAT own TCP packets from another VRF with tcp_l3mdev_accept = 1
Date: Fri, 30 Sep 2022 17:42:37 -0700
Message-ID: <20220930174237.2e89c9e1@kernel.org>
In-Reply-To: <98348818-28c5-4cb2-556b-5061f77e112c@arcanite.ch>

Adding netfilter and vrf experts.

On Wed, 28 Sep 2022 16:02:43 +0200 Maximilien Cuony wrote:
> Hello,
> 
> We're using VRF with a machine used as a router and have a specific 
> issue where the router doesn't handle its own packets correctly during 
> NATing if the packet is coming from a different VRF.
> 
> We had the issue with debian buster (4.19), but the issue solved itself 
> when we updated to debian bullseye (5.10.92).
> 
> However, during an upgrade of debian bullseye to the latest kernel, the 
> issue appeared again (5.10.140).
> 
> We did a bisection and this led us to 
> "b0d67ef5b43aedbb558b9def2da5b4fffeb19966 net: allow unbound socket for 
> packets in VRF when tcp_l3mdev_accept set [ Upstream commit 
> 944fd1aeacb627fa617f85f8e5a34f7ae8ea4d8e ]".
> 
> Simplified case setup:
> 
> There are two machines in the setup. They both forward packets 
> (net.ipv4.ip_forward = 1) and there are two interfaces between them.
> 
> The main machine has two VRFs. The default VRF uses the second 
> machine as the default route, on a specific interface.
> The second machine has its default route via the main machine, in the 
> other VRF, using the second pair of interfaces.
> 
> On the main machine, the second interface is in a specific VRF. In that 
> VRF, packets are NATed to the internet on a third interface.
> 
> A diagram of the normal flow is available here: 
> https://etinacra.ch/kernel.png
> 
> Configuration commands:
> 
> Main machine:
> sysctl -w net.ipv4.tcp_l3mdev_accept=1
> sysctl -w net.ipv4.ip_forward=1
> iptables -t raw -A PREROUTING -i eth0 -j CT --zone 5
> iptables -t raw -A OUTPUT -o eth0 -j CT --zone 5
> iptables -t nat -A POSTROUTING -o eth2 -j SNAT --to 192.168.1.1
> cat /etc/network/interfaces
> 
> auto firewall
> iface firewall
>      vrf-table 1200
> 
> auto eth0
> iface eth0
>      address 192.168.5.1/24
>      gateway 192.168.5.2
> 
> auto eth1
> iface eth1
>      address 192.168.10.1/24
>      vrf firewall
>      up ip route add 192.168.5.0/24 via 192.168.10.2 vrf firewall
> 
> auto eth2
> iface eth2
>      address 192.168.1.1/24
>      gateway 192.168.1.250
>      vrf firewall
> 
> ==
> 
> Second machine:
> 
> sysctl -w net.ipv4.ip_forward=1
> 
> cat /etc/network/interfaces
> 
> auto eth0
> iface eth0
>      address 192.168.5.2/24
> 
> auto eth1
> iface eth1
>      address 192.168.10.2/24
>      gateway 192.168.10.1
> 
> ==
> 
> When the issue is absent, a tcpdump on all interfaces of the main 
> machine shows everything working (output truncated):
> 
> 10:28:32.811283 eth0 Out IP 192.168.5.1.55750 > 99.99.99.99.80: Flags [S], seq 2216112145
> 10:28:32.811666 eth1 In  IP 192.168.5.1.55750 > 99.99.99.99.80: Flags [S], seq 2216112145
> 10:28:32.811679 eth2 Out IP 192.168.1.1.55750 > 99.99.99.99.80: Flags [S], seq 2216112145
> 10:28:32.835138 eth2 In  IP 99.99.99.99.80 > 192.168.1.1.55750: Flags [S.], seq 383992840, ack 2216112146
> 10:28:32.835152 eth1 Out IP 99.99.99.99.80 > 192.168.5.1.55750: Flags [S.], seq 383992840, ack 2216112146
> 10:28:32.835457 eth0 In  IP 99.99.99.99.80 > 192.168.5.1.55750: Flags [S.], seq 383992840, ack 2216112146
> 10:28:32.835511 eth0 Out IP 192.168.5.1.55750 > 99.99.99.99.80: Flags [.], ack 1, win 502
> 
> However, when the issue is present, the SYN-ACK does arrive on eth2, but 
> is never "unNATed" back to eth1:
> 
> 10:25:07.644433 eth0 Out IP 192.168.5.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.644782 eth1 In  IP 192.168.5.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.644793 eth2 Out IP 192.168.1.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.668551 eth2 In  IP 54.36.61.42.80 > 192.168.1.1.48684: Flags [S.], seq 823335485, ack 3207393155
> 
> The issue only affects TCP connections. UDP and ICMP work fine.
> 
> Turning net.ipv4.tcp_l3mdev_accept back to 0 also fixes the issue, but 
> we need this flag since we use some sockets that do not understand VRFs.
> 
> We did have a look at the diff and the code of inet_bound_dev_eq, but we 
> didn't really understand the actual problem. It does seem, though, that 
> bound_dev_if is now checked for being non-zero before the bound_dev_if == 
> dif || bound_dev_if == sdif comparison, something that was not the case 
> before (especially since the result now depends on l3mdev_accept).
> 
> Maybe our setup is wrong and we should not be able to route packets like 
> that?
> 
> Thanks a lot and have a nice day!
> 
> Maximilien Cuony
> 
> 
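For readers following the inet_bound_dev_eq remark in the quoted report, here is a rough userspace model of the two checks as the report describes them. It is a sketch only, not kernel code: the function names match_before and match_after and the interface index values are hypothetical, and the "before" form is inferred from the reporter's description rather than taken from the source tree.

```python
# Simplified model of the socket/device match described in the report.
# Assumptions (not from the kernel source): function names, ifindex values.

ETH2_IFINDEX = 4          # hypothetical ifindex of eth2 (ingress device, dif)
VRF_FIREWALL_IFINDEX = 2  # hypothetical ifindex of the "firewall" VRF (sdif)


def match_before(bound_dev_if: int, dif: int, sdif: int) -> bool:
    """Plain comparison, with no special case for an unbound socket."""
    return bound_dev_if == dif or bound_dev_if == sdif


def match_after(l3mdev_accept: bool, bound_dev_if: int, dif: int, sdif: int) -> bool:
    """Unbound sockets (bound_dev_if == 0) are handled first, and the
    outcome for them depends on l3mdev_accept."""
    if not bound_dev_if:
        return (not sdif) or l3mdev_accept
    return bound_dev_if == dif or bound_dev_if == sdif


if __name__ == "__main__":
    # SYN-ACK arriving on eth2, which is enslaved to the firewall VRF,
    # looked up against an unbound socket created in the default VRF.
    unbound = 0
    print("before:", match_before(unbound, ETH2_IFINDEX, VRF_FIREWALL_IFINDEX))
    print("after, l3mdev_accept=1:",
          match_after(True, unbound, ETH2_IFINDEX, VRF_FIREWALL_IFINDEX))
    print("after, l3mdev_accept=0:",
          match_after(False, unbound, ETH2_IFINDEX, VRF_FIREWALL_IFINDEX))
```

In this model, the unbound socket only starts matching a packet received inside the VRF once the unbound case is special-cased and l3mdev_accept is set. That would be consistent with the report that setting tcp_l3mdev_accept back to 0 avoids the problem, but whether it is the actual cause is for the netfilter and VRF folks to confirm.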


Thread overview: 7+ messages
2022-09-28 14:02 [REGRESSION] Unable to NAT own TCP packets from another VRF with tcp_l3mdev_accept = 1 Maximilien Cuony
2022-09-30  7:36 ` Thorsten Leemhuis
2022-10-01  0:42 ` Jakub Kicinski [this message]
2022-10-07 14:42   ` David Ahern
2022-10-07 16:47   ` Mike Manning
2022-10-12 12:24     ` Maximilien Cuony
2022-10-26 12:40       ` [REGRESSION] Unable to NAT own TCP packets from another VRF with tcp_l3mdev_accept = 1 #forregzbot Thorsten Leemhuis
