From: Simon Kirby <sim@hostway.ca>
To: Florian Westphal <fw@strlen.de>
Cc: netdev@vger.kernel.org, netfilter-devel@vger.kernel.org,
	lvs-devel@vger.kernel.org
Subject: Re: Inability to IPVS DR with nft dnat since 9971a514ed26
Date: Wed, 27 Mar 2019 08:34:23 -0700
Message-ID: <20190327153422.GA1161@hostway.ca>
In-Reply-To: <20190327093027.gmflo27icuhr326p@breakpoint.cc>

On Wed, Mar 27, 2019 at 10:30:27AM +0100, Florian Westphal wrote:

> > I bisected this to 9971a514ed2697e542f3984a6162eac54bb1da98 ("netfilter:
> > nf_nat: add nat type hooks to nat core").
> > 
> > It should be pretty easy to see this with a minimal setup:
> > 
> > /etc/nftables.conf:
> > 
> > table ip nat {
> >     chain prerouting {
> > 		type nat hook prerouting priority 0;
> > 
> > 		ip daddr $ext_ip dnat to $vip
> > 	}
> > 	chain postrouting {
> > 		type nat hook postrouting priority 100;
> > 
> > 		# In theory this hook no longer needed since this commit,
> > 		# but we also need to do some unrelated snatting.
> > 	}
> > }
> > 
> > /etc/sysctl.conf:
> > 	
> > net.ipv4.conf.all.accept_local = 1
> > net.ipv4.vs.conntrack = 1
> > 
> > IPVS DR setup:
> > 
> > ipvsadm -A -t $vip:80 -s wrr
> > ipvsadm -a -t $vip:80 -r $real_ip:80 -g -w 100
> 
> I have a hard time figuring out how to expand $ext_ip, $vip and $real_ip,
> and where to place those addresses on the nft machine.

$ext_ip is an address reachable from the "outside"; traffic to it just has
to come from something that can reach the nft box and is neither the real
server nor the nft box itself. We have a public IP in this case.

$vip is an address on the local LAN "behind" the nft box. In our
case this is an RFC 1918 IP address.

$real_ip is on the same subnet as the $vip and is just a way for IPVS to
resolve the neighbor of one of the real servers in order to forward the
packet. With this example configuration, IPVS is basically equivalent to:

ip route add $vip via $real_ip

Except that it hooks the input path because $vip is expected to be bound
locally, and normally you have multiple real servers and some algorithm
selected for balancing. I didn't mention it before, but you also need to
bind $vip on the nft box, and on the real server too if you want it to
actually be able to respond.
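
To make the placeholders concrete, here is a minimal sketch with made-up
values: 203.0.113.10 (documentation range) stands in for $ext_ip and
192.168.0.0/24 for the LAN. The interface name eth1, binding the VIP to lo
on the real server, and the ARP sysctls are assumptions taken from common
LVS-DR practice, not from our actual setup. The script only prints the
commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Hypothetical example values; substitute your own.
ext_ip=203.0.113.10     # public address that outside clients connect to
vip=192.168.0.100       # RFC 1918 address "behind" the nft box
real_ip=192.168.0.11    # real server, same subnet as $vip
lan_if=eth1             # assumed LAN-facing interface on the nft box

# Print each command; execute it only when APPLY=1 (needs root).
run() {
    echo "$*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# On the nft/IPVS box: bind the VIP so IPVS sees the traffic on input,
# then add the same DR service as in the original report.
director_cmds() {
    echo "# nft rule expands to: ip daddr $ext_ip dnat to $vip"
    run ip addr add "$vip/32" dev "$lan_if"
    run ipvsadm -A -t "$vip:80" -s wrr
    run ipvsadm -a -t "$vip:80" -r "$real_ip:80" -g -w 100
}

# On the real server: bind the VIP so it accepts and answers the traffic.
# The ARP sysctls are the usual LVS-DR precaution so the real server does
# not answer ARP for the VIP; whether they are needed here is an assumption.
real_server_cmds() {
    run ip addr add "$vip/32" dev lo
    run sysctl -w net.ipv4.conf.all.arp_ignore=1
    run sysctl -w net.ipv4.conf.all.arp_announce=2
}
```

Call director_cmds on the nft box and real_server_cmds on the real server;
without APPLY=1 both just list what they would run.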

"LVS-HOWTO" has info on how to set up LVS-DR. The only difference here is
that we're using it in a relatively new (2009) configuration where "DR"
(Direct Routing) mode is actually symmetric: the real server replies back
through the nft box instead of directly to a separate router. This lets
NAT actually work since it can see traffic in both directions.
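
One way to get that symmetric return path, sketched under the assumption
that the real server simply uses the nft box's LAN address as its default
gateway. Both addresses below are hypothetical (192.168.0.1 for the nft
box, 198.51.100.7 for an outside client), and the script only prints the
commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Hypothetical values; substitute your own.
director_lan_ip=192.168.0.1   # LAN address of the nft/IPVS box
client_ip=198.51.100.7        # any outside client (documentation range)

# Print each command; execute it only when APPLY=1 (needs root).
run() {
    echo "$*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# On the real server: send replies back through the nft box so that
# conntrack/NAT there sees both directions, then show the chosen route.
real_server_return_path() {
    run ip route replace default via "$director_lan_ip"
    run ip route get "$client_ip"
}
```

With APPLY=1, the "ip route get" output should name the nft box as the
next hop for the outside client.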

Simon-
