Linux Netfilter discussions
* Issue migrating "iptables -m socket --transparent" into nftables
@ 2020-08-17 14:54 Nirgal Vourgère
  2020-08-17 19:34 ` Florian Westphal
  0 siblings, 1 reply; 4+ messages in thread
From: Nirgal Vourgère @ 2020-08-17 14:54 UTC (permalink / raw)
  To: netfilter

Hi

I have a working haproxy in transparent mode that analyzes the TLS SNI header to choose a route, without decrypting the packets. I use it as a front end for several https servers sharing the same IP address, and I'm very happy to have the pristine client IP address in my httpd.

My kernel has net.ipv4.ip_nonlocal_bind=1.

/etc/iproute2/rt_tables contains:

    100 haproxy

I am using

    ip rule add fwmark 1 lookup haproxy
    ip route add local default dev lo table haproxy

My firewall rules have

    iptables -t mangle -A PREROUTING -m socket --transparent -j MARK --set-mark 1

This works fine. But iptables is deprecated and will vanish at some point, so I’m trying to replace this with the new nftables system, and failing miserably.

I tried this nft rule:

    table inet haproxy {
        chain prerouting {
            type filter hook prerouting priority -150; policy accept;
            socket transparent 1 mark set 0x00000001
        }
    }

It does work, but all traffic is routed to the haproxy socket, including outbound masqueraded connections. I mean that when a box on the LAN side connects to a foreign https server, the connection is grabbed by haproxy, which is not what I want.

Does anyone know the proper equivalent to

    iptables -t mangle -A PREROUTING -m socket --transparent -j MARK --set-mark 1

using nft?



Here's a useful failure. My haproxy configuration contains:

    frontend https4-in
        bind :443 strict-sni transparent
        mode tcp
        ...

I tried replacing "bind :443" with "bind 1.2.3.4:443" in haproxy.cfg (where 1.2.3.4 is my IP address, obviously) and it works fine. But I have some servers with dynamic IP addresses, so this is not a solution for me.

My guess is that the iptables version is adding some logic.
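
For readers following along: the last reply in this thread attributes the difference to wildcard-bound listener sockets, which is exactly what separates "bind :443" from "bind 1.2.3.4:443". Below is a minimal sketch of that distinction. The helper name is_wildcard_bind is hypothetical, and treating "[::]" as wildcard is an assumption that mirrors the IPv4 case; the kernel snippet quoted later in the thread only shows the IPv4 test.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): exit 0 if an "address:port"
# string denotes a wildcard bind, exit 1 if it names a specific address.
# ASSUMPTION: "[::]" and "::" are treated as the IPv6 equivalent of 0.0.0.0.
is_wildcard_bind() {
    local addr="${1%:*}"    # strip the trailing ":port"
    case "$addr" in
        ''|'*'|0.0.0.0|'[::]'|'::') return 0 ;;
        *) return 1 ;;
    esac
}

# Classify the two haproxy bind forms discussed above, plus two variants.
for bind in ":443" "0.0.0.0:443" "[::]:443" "1.2.3.4:443"; do
    if is_wildcard_bind "$bind"; then
        echo "$bind wildcard"
    else
        echo "$bind specific"
    fi
done
```

The same classification could be applied by hand to the local-address column of "ss -ltn" to see which listeners on a box fall into the wildcard category.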



I also tried "nft add rule inet haproxy prerouting ct state new fib daddr . iif type local socket transparent 1 meta mark set 1", but it doesn't work either.

Any help would be appreciated.

I am using Debian stable (kernel 4.19.132 with nftables 0.9). haproxy runs in an LXC container.




* Re: Issue migrating "iptables -m socket --transparent" into nftables
  2020-08-17 14:54 Issue migrating "iptables -m socket --transparent" into nftables Nirgal Vourgère
@ 2020-08-17 19:34 ` Florian Westphal
  2020-08-17 23:25   ` Nirgal Vourgère
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2020-08-17 19:34 UTC (permalink / raw)
  To: Nirgal Vourgère; +Cc: netfilter

Nirgal Vourgère <contact_vgernf@nirgal.com> wrote:
> 
>     ip rule add fwmark 1 lookup haproxy
>     ip route add local default dev lo table haproxy
> 
> My firewall rules have
> 
>     iptables -t mangle -A PREROUTING -m socket --transparent -j MARK --set-mark 1

[..]

> I tried this nft rule:
> 
>     table inet haproxy {
>         chain prerouting {
>             type filter hook prerouting priority -150; policy accept;
>             socket transparent 1 mark set 0x00000001
>         }
>     }
> 
> It does work, but all traffic is routed to the haproxy socket, including outbound masqueraded connections. I mean that when a box on the LAN side connects to a foreign https server, the connection is grabbed by haproxy, which is not what I want.

I don't understand how the iptables rule would not do exactly the same
thing, there is nothing that checks interface names or addresses.

Are you sure there is nothing in the iptables rule set that
makes the socket rule only handle those packets that should be redirected?
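
One way to answer this question mechanically is to filter a saved ruleset for anything that matches on sockets or marks, or that sets a mark. The sketch below assumes the dump is in iptables-save format; the function name list_mark_related_rules and the sample dump are made up for illustration and are not taken from the poster's system.

```shell
#!/bin/bash
# Hypothetical helper: read an iptables-save style dump on stdin and print
# only the rules that use the socket/mark/connmark matches or the
# MARK/CONNMARK/TPROXY targets. Exits non-zero if nothing matches.
list_mark_related_rules() {
    grep -E -e '-m (socket|mark|connmark)( |$)' \
            -e '-j (MARK|CONNMARK|TPROXY)( |$)'
}

# Illustrative sample dump (invented, not the poster's ruleset).
list_mark_related_rules <<'EOF'
*mangle
:PREROUTING ACCEPT [0:0]
-A PREROUTING -m socket --transparent -j MARK --set-xmark 0x1/0xffffffff
-A FORWARD -p tcp -j ACCEPT
COMMIT
EOF
```

On a live system the input would come from something like "iptables-save -t mangle" run as root. If the only line printed is the socket rule itself, nothing else in that table is narrowing the match.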


* Re: Issue migrating "iptables -m socket --transparent" into nftables
  2020-08-17 19:34 ` Florian Westphal
@ 2020-08-17 23:25   ` Nirgal Vourgère
  2020-08-18 10:17     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 4+ messages in thread
From: Nirgal Vourgère @ 2020-08-17 23:25 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter


On Monday, 17 August 2020 21:34:06 CEST Florian Westphal wrote:
> Nirgal Vourgère <contact_vgernf@nirgal.com> wrote:
> > 
> >     ip rule add fwmark 1 lookup haproxy
> >     ip route add local default dev lo table haproxy
> > 
> > My firewall rules have
> > 
> >     iptables -t mangle -A PREROUTING -m socket --transparent -j MARK --set-mark 1
> 
> [..]
> 
> > I tried this nft rule:
> > 
> >     table inet haproxy {
> >         chain prerouting {
> >             type filter hook prerouting priority -150; policy accept;
> >             socket transparent 1 mark set 0x00000001
> >         }
> >     }
> > 
> > > It does work, but all traffic is routed to the haproxy socket, including outbound masqueraded connections. I mean that when a box on the LAN side connects to a foreign https server, the connection is grabbed by haproxy, which is not what I want.
> 
> I don't understand how the iptables rule would not do exactly the same
> thing, there is nothing that checks interface names or addresses.
> 
> Are you sure there is nothing in the iptables rule set that
> makes the socket rule only handle those packets that should be redirected?

I am sure. Yes.

Here's the output of "nft list ruleset" for my iptables-generated rules. This is only the fragment covering the mangle tables generated by iptables and ip6tables:

table ip mangle {
	chain PREROUTING {
		type filter hook prerouting priority -150; policy accept;
		# socket --transparent counter packets 83537 bytes 53363874 meta mark set 0x1 
	}

	chain INPUT {
		type filter hook input priority -150; policy accept;
	}

	chain FORWARD {
		type filter hook forward priority -150; policy accept;
	}

	chain OUTPUT {
		type route hook output priority -150; policy accept;
	}

	chain POSTROUTING {
		type filter hook postrouting priority -150; policy accept;
	}
}
table ip6 mangle {
	chain PREROUTING {
		type filter hook prerouting priority -150; policy accept;
		# socket --transparent counter packets 3 bytes 180 meta mark set 0x1 
	}

	chain INPUT {
		type filter hook input priority -150; policy accept;
	}

	chain FORWARD {
		type filter hook forward priority -150; policy accept;
	}

	chain OUTPUT {
		type route hook output priority -150; policy accept;
	}

	chain POSTROUTING {
		type filter hook postrouting priority -150; policy accept;
	}
}

This works.

No protocol check, no IP check, no port: just a simple, brutal "iptables -t mangle -A PREROUTING -m socket --transparent -j MARK --set-mark 1".
Attached are the complete mangle fragments of both firewall scripts. One works; the other does not.

Maybe there's some magic in the old transparent module that silently adds some conditions?

I've been using that setup on a whole bunch of servers.

[-- Attachment #2: fw_ok.sh --]
[-- Type: text/plain, Size: 1621 bytes --]

#!/bin/bash
ENABLE_HAPROXY=1
HA_RT_TABLE="haproxy"
RT_TABLES=/etc/iproute2/rt_tables
HAPROXY_IPMARK=1 # Id for packets to go to haproxy

ip46tables() {
    # simple function that run rule on both IPv4 and IPv6
    iptables "$@"
    ip6tables "$@"
}

###############################################################################
# Marking packets for haproxy
###############################################################################

ip46tables -t mangle --flush PREROUTING
if [ -n "$ENABLE_HAPROXY" ]
then
        sysctl -q -w net.ipv4.ip_nonlocal_bind=1
        ip46tables -t mangle -A PREROUTING -m socket --transparent -j MARK --set-mark $HAPROXY_IPMARK

        if grep -q $HA_RT_TABLE $RT_TABLES
        then
                for ipversion in 4 6
                do
                        # all packets marked by HAPROXY_IPMARK should be routed using $HA_RT_TABLE
                        ip -$ipversion rule del fwmark $HAPROXY_IPMARK
                        ip -$ipversion rule add fwmark $HAPROXY_IPMARK lookup $HA_RT_TABLE

                        # default for routing table $HA_RT_TABLE is to try local bind
                        # Note that net.ipv4.ip_nonlocal_bind=1
                        ip -$ipversion route flush table $HA_RT_TABLE
                        ip -$ipversion route add local default dev lo table $HA_RT_TABLE
                done
        else
                $LOGGER -p user.crit -- "$RT_TABLES does not have $HA_RT_TABLE entry. Consider running \"echo 100 $HA_RT_TABLE >> $RT_TABLES\". haproxy rules disabled."
        fi
else
        sysctl -q -w net.ipv4.ip_nonlocal_bind=0
fi


[-- Attachment #3: fw_nok.sh --]
[-- Type: text/plain, Size: 1600 bytes --]

#!/bin/bash
ENABLE_HAPROXY=1
HA_RT_TABLE="haproxy"
RT_TABLES=/etc/iproute2/rt_tables
HAPROXY_IPMARK=1 # Id for packets to go to haproxy

###############################################################################
# Marking packets for haproxy
###############################################################################

if [ -n "$ENABLE_HAPROXY" ]
then
        sysctl -q -w net.ipv4.ip_nonlocal_bind=1

        nft create table inet haproxy
        nft -- add chain inet haproxy prerouting \{ type filter hook prerouting priority -150\; \}
        nft add rule inet haproxy prerouting socket transparent 1 meta mark set $HAPROXY_IPMARK
        if grep -q $HA_RT_TABLE $RT_TABLES
        then
                for ipversion in 4 6
                do
                        # all packets marked by HAPROXY_IPMARK should be routed using $HA_RT_TABLE
                        ip -$ipversion rule del fwmark $HAPROXY_IPMARK
                        ip -$ipversion rule add fwmark $HAPROXY_IPMARK lookup $HA_RT_TABLE

                        # default for routing table $HA_RT_TABLE is to try local bind
                        # Note that net.ipv4.ip_nonlocal_bind=1
                        ip -$ipversion route flush table $HA_RT_TABLE
                        ip -$ipversion route add local default dev lo table $HA_RT_TABLE
                done
        else
                $LOGGER -p user.crit -- "$RT_TABLES does not have $HA_RT_TABLE entry. Consider running \"echo 100 $HA_RT_TABLE >> $RT_TABLES\". haproxy rules disabled."
        fi
else
        sysctl -q -w net.ipv4.ip_nonlocal_bind=0
fi




* Re: Issue migrating "iptables -m socket --transparent" into nftables
  2020-08-17 23:25   ` Nirgal Vourgère
@ 2020-08-18 10:17     ` Pablo Neira Ayuso
  0 siblings, 0 replies; 4+ messages in thread
From: Pablo Neira Ayuso @ 2020-08-18 10:17 UTC (permalink / raw)
  To: Nirgal Vourgère; +Cc: Florian Westphal, netfilter, Balazs Scheidler

On Tue, Aug 18, 2020 at 01:25:45AM +0200, Nirgal Vourgère wrote:
> Maybe there's some magic in the old transparent module that silently adds some conditions?

Balazs cannot reply to the mailing list for some reason. He sent me
this privately:

"The original iptables "socket" match had an extra check so that it wouldn't
match listener sockets, at least by default (that is if --nowildcard is not
specified).

I don't see however how "outbound masqueraded connection" could be
impacted. The "socket transparent 1" expression should require that the
socket being matched has IP_TRANSPARENT setsockopt set. Are those
connections also initiated by haproxy?

In any case, I think the check to ignore wildcard bound listener sockets is
definitely missing, however I am not sure how to properly add it to
nftables. If I added it to the socket match implementation that might break
a few currently well behaving use-cases.

This is the check that is in iptables -m socket:

                wildcard = (!(info->flags & XT_SOCKET_NOWILDCARD) &&
                            sk_fullsock(sk) &&
                            inet_sk(sk)->inet_rcv_saddr == 0);

And then if --transparent is used, these sockets are not accepted / the
rule does not match."
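
A possible nftables counterpart to that check is sketched below. It is not something proposed in this thread: it assumes an nftables and kernel combination that understands a "socket wildcard" key, which as far as I know is newer than the nftables 0.9 and kernel 4.19 mentioned above, so the version requirement should be verified before relying on it. The function name emit_haproxy_ruleset is hypothetical. The script only prints the ruleset; it does not apply anything.

```shell
#!/bin/bash
# Dry-run sketch: print an nftables ruleset on stdout, apply nothing.
# ASSUMPTION: "socket wildcard" is only understood by nftables/kernel
# releases newer than the ones used in this thread; verify before use.
emit_haproxy_ruleset() {
    local mark="${1:-1}"    # fwmark to set, defaults to 1 as in the thread
    cat <<EOF
table inet haproxy {
    chain prerouting {
        type filter hook prerouting priority -150; policy accept;
        socket transparent 1 socket wildcard 0 meta mark set $mark
    }
}
EOF
}

emit_haproxy_ruleset 1
```

To try it, the output could be piped into "nft -c -f -" as root for a syntax check, and into "nft -f -" to load it. "socket wildcard 0" is intended to skip listener sockets bound to 0.0.0.0 or ::, mirroring the default behaviour of iptables -m socket described above.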
