From: Peter Rabbitson <rabbit@rabbit.us>
To: lartc@vger.kernel.org
Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter
Date: Mon, 14 May 2007 11:24:31 +0000
Message-ID: <464846EF.3080109@rabbit.us>
In-Reply-To: <4647FA30.5040401@rabbit.us>

Answer inlined:

Salim S I wrote:
>     iptables -t mangle -A PREROUTING -j ISP2
> 
> Doesn't it need to check for state NEW? Or packets will not reach the
> restore-mark rule.

Of course, and the real script does check. I typed this line by hand
because the copy-paste cut it off, and I left out the obvious check.
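
For reference, the fallback rule with the check restored would presumably
look like the line printed below. This is a sketch of what the check adds,
not a copy from the real script; the `fallback_rule` wrapper is
hypothetical and only prints the command, so it can be inspected without
root.

```shell
#!/bin/sh
# Hypothetical wrapper: print the corrected rule instead of applying it.
fallback_rule() {
    # Only NEW connections are marked for ISP2 here; packets of established
    # connections fall through to the CONNMARK --restore-mark rule.
    echo iptables -t mangle -A PREROUTING -m state --state NEW -j ISP2
}

fallback_rule
```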

> You may have to manually populate the routing tables when an interface
> comes up, after being down for some time. (Kernel would have removed the
> routing entries for this interface after it found the interface down.
> This happens only if its nexthop is down)

This is what I can't really understand (and it applies to DGD as well) -
how often in real life does someone yank a cable out, so that an
interface goes down? In over 7 years of dealing with various ISPs I have
never seen a link go so dead that the kernel downs the interface and
removes all associated routing information. What I have seen, on the
other hand, is the link dying at the 2nd or 3rd hop, which (if I
understand correctly) DGD simply cannot detect. Correct me if my
assumption is wrong.

> I tend to favor this approach, because it is more flexible in selecting
> the interface. You can use different weights/probability depending on
> different factors. I have seen a variation of this method, used with
> 'recent' (-m recent) match, instead of CONNMARK.

I see. But recent would have a "caching effect" and, from what I
understand, is heavier on the kernel. CONNMARK, by contrast, hooks into
conntrack, which has to track the connections either way.

> The only downside in using this method, as far as I can see, is the need
> to reconfigure rules and routing tables, in case of a failure/coming-up.
> But lately, I have found that even with multipath method, there IS a
> need for reconfiguration.

Got you. This pretty much answers my original question. Thank you for
your time.

> -----Original Message-----
> From: lartc-bounces@mailman.ds9a.nl
> [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Peter Rabbitson
> Sent: Monday, May 14, 2007 3:16 PM
> To: lartc@mailman.ds9a.nl
> Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter
> 
> Salim S I wrote:
>>> -----Original Message-----
>>> From: lartc-bounces@mailman.ds9a.nl
>>> [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Peter Rabbitson
>>> Sent: Monday, May 14, 2007 1:57 PM
>>> To: lartc@mailman.ds9a.nl
>>> Subject: [LARTC] Multihome load balancing - kernel vs netfilter
>>>
>>> Hi,
>>> I have searched the archives on the topic, and it seems that the list
>>> gurus favor load balancing to be done in the kernel as opposed to
>>> other means. I have been using a home-grown approach, which splits
>>> traffic based on `-m statistic --mode random --probability X`, then
>>> CONNMARKs the individual connections and the kernel happily routes
>>> them. I understand that for > 2 links it will become impractical to
>>> calculate a correct X. But if we only have 2 gateways to the internet -
>>> are there any advantages in letting the kernel multipath scheduler do
>>> the balancing (with all the downsides of route caching), as opposed to
>>> the pure random approach described above?
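
On the remark about more than 2 links: one way to derive the per-rule
probabilities is sketched below. It assumes a first-match chain of
statistic rules, one per link, where each rule only sees the traffic not
taken by the rules before it. The function name and the weights are
hypothetical.

```shell
#!/bin/sh
# Given per-link weights, print the --probability value for each rule so
# that link i ends up with an overall share of w_i / sum(w).
# Rule i only sees what rules 1..i-1 left over, hence the shrinking
# denominator; the last value is always 1 (no statistic match needed).
chain_probabilities() {
    printf '%s\n' "$@" | awk '
        { w[NR] = $1; total += $1 }
        END {
            rest = total
            for (i = 1; i <= NR; i++) {
                printf "%.4f\n", w[i] / rest
                rest -= w[i]
            }
        }'
}

# Three equally weighted links: the rules get 1/3, 1/2 and 1.
chain_probabilities 1 1 1
```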
>> I have thought about this approach, but, I think, this approach does
>> not handle failover/dead-gateway-detection well. Because you need to
>> alter all your netfilter routing rules if you find a link down. And then
>> reconfigure again when the link comes up. I am interested to know how
>> you handle that.
>>
> 
> Certainly. What I am doing is NATing a large company network, which gets
> load balanced and receives fail over protection. I also have a number of
> services running on the router which must not be balanced nor failed
> over, as they are expected to respond on a specific IP only. All
> remaining traffic on the server itself is not balanced but fails over
> when the designated primary link goes down.
> 
> I start with a simple pinger app that pings several well-known remote
> sites once a minute using a large ICMP packet (1k of payload). The RTTs
> are averaged and used to calculate the current "quality" of the link
> (the large packet makes congestion a visible factor). If the result for
> one of the interfaces is 0 (meaning not a single one of the pinged
> hosts has responded), that link is dead.
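
A minimal sketch of such a pinger follows. It is not the actual app: the
function names are hypothetical, and it assumes iputils `ping` and that
only the hosts that answered are averaged, which the description above
does not spell out.

```shell
#!/bin/sh
# Hypothetical helper: average the RTTs (ms) of the hosts that answered.
# Prints 0 when no host answered, which marks the link as dead.
link_quality() {
    # Arguments: one RTT per probed host; unanswered probes are passed as 0.
    printf '%s\n' "$@" | awk '
        $1 > 0 { sum += $1; n++ }
        END    { if (n) printf "%.1f\n", sum / n; else print 0 }'
}

# Hypothetical probe: send one ping with 1024 bytes of payload out of
# interface $1 to host $2; print the RTT in ms, or 0 if there is no reply.
probe() {
    rtt=$(ping -n -c 1 -W 2 -s 1024 -I "$1" "$2" 2>/dev/null |
          sed -n 's/.*time=\([0-9.]*\).*/\1/p')
    echo "${rtt:-0}"
}
```

Usage would be along the lines of
`link_quality "$(probe eth1 host_a)" "$(probe eth1 host_b)"`, repeated per
interface, with the interface and host names filled in for the site.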
> 
> In iproute I have two separate tables, each using one of the links as
> default gw, matching a certain mark. The default route is set to a
> single gateway (not a multipath), either by hardcoding, or by using the
> first input of the pinger (it can run without a default gw set,
> explanation follows)
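
The two tables could be set up roughly as sketched below. The interface
names, gateway addresses and the choice of table numbers equal to the
marks are assumptions for illustration; `run` is a hypothetical dry-run
wrapper that prints the commands instead of executing them.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command instead of executing it.
# Replace the body with "$@" to apply the commands for real (needs root).
run() { echo "$@"; }

# Hypothetical values; 11 and 12 are the marks set by the ISP1/ISP2 chains.
I1=eth1; GW1=192.0.2.1
I2=eth2; GW2=198.51.100.1

setup_tables() {
    # One table per uplink, each with its own default gateway.
    run ip route add default via "$GW1" dev "$I1" table 11
    run ip route add default via "$GW2" dev "$I2" table 12
    # Packets carrying a mark are looked up in the matching table.
    run ip rule add fwmark 11 table 11
    run ip rule add fwmark 12 table 12
}

setup_tables
```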
> 
> In iptables I have two user defined chains:
>     iptables -t mangle -A ISP1 -j CONNMARK --set-mark 11
>     iptables -t mangle -A ISP1 -j MARK --set-mark 11
>     iptables -t mangle -A ISP1 -j ACCEPT
> 
>     iptables -t mangle -A ISP2 -j CONNMARK --set-mark 12
>     iptables -t mangle -A ISP2 -j MARK --set-mark 12
>     iptables -t mangle -A ISP2 -j ACCEPT
> 
> The rules that reference those chains are:
> 
> For all locally originating traffic:
>     iptables -t mangle -A OUTPUT -o $I1 -j ISP1
>     iptables -t mangle -A OUTPUT -o $I2 -j ISP2
> 
> For all incoming traffic from the internet:
>     iptables -t mangle -A PREROUTING -i $I1 -m state --state NEW -j ISP1
>     iptables -t mangle -A PREROUTING -i $I2 -m state --state NEW -j ISP2
> 
> For all other traffic (nat)
>     iptables -t mangle -A PREROUTING -m state --state NEW \
>         -m statistic --mode random --probability $X -j ISP1
>     iptables -t mangle -A PREROUTING -j ISP2
> 
> At the end of the PREROUTING chain I have
>     iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
> 
> The NATing is trivially solved by:
>     iptables -t nat -A POSTROUTING -s 10.0.58.0/24 -j SOURCE_NAT
>     iptables -t nat -A POSTROUTING -s 192.168.58.0/24 -j SOURCE_NAT
>     iptables -t nat -A POSTROUTING -s 192.168.8.0/24 -j SOURCE_NAT
> 
>     iptables -t nat -A SOURCE_NAT -o $I1 -j SNAT --to $I1_IP
>     iptables -t nat -A SOURCE_NAT -o $I2 -j SNAT --to $I2_IP
> 
> 
> What does this achieve:
> * Local applications that have explicitly requested a specific IP to
> bind to, will be routed over the corresponding interface and will stay
> that way. Only applications binding to 0.0.0.0 will be routed by
> consulting the default route.
> * Responses to connections from the internet are guaranteed to leave
> from the same interface they came in.
> * All new connections not coming from the external interfaces are load
> balanced by the weight of $X, and are again guaranteed to stay there for
> the life of the connection, but another connection to the same host is
> not guaranteed to go over the same link. This is important in a company
> environment, since most employees use the same online resources.
> 
> On every run of the pinger I do the following:
> * If both gateways are alive I replace the -m statistic rule, adjusting
> the value of $X
> * If one is detected dead, I adjust the probability accordingly (or
> alternatively remove the statistic match altogether), and change the
> default gateway if it is the one that failed.
> 
> So really the whole exercise revolves around changing a single rule (or
> two rules, if you want to control the probability in a more fine-grained
> way).
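
The per-run adjustment might be sketched as follows. How $X is actually
derived is not stated above, so the inverse-RTT weighting in `calc_x` is
an assumption, as are the function names and the rule position RULE_NUM.
The rule is only printed, not applied.

```shell
#!/bin/sh
# Assumption: X (the share of new connections sent to ISP1) comes from
# inverse-RTT weighting of the two link qualities; 0 means a dead link.
calc_x() {
    awk -v q1="$1" -v q2="$2" 'BEGIN {
        if (q1 == 0 && q2 == 0) exit 1       # both links dead, nothing to pick
        if (q1 == 0) { print "0.00"; exit }  # ISP1 dead: everything to ISP2
        if (q2 == 0) { print "1.00"; exit }  # ISP2 dead: everything to ISP1
        printf "%.2f\n", q2 / (q1 + q2)      # lower RTT gets the larger share
    }'
}

# Dry run: print the replacement rule. RULE_NUM is a hypothetical position
# of the statistic rule within the mangle PREROUTING chain.
RULE_NUM=5
replace_rule() {
    echo iptables -t mangle -R PREROUTING "$RULE_NUM" \
        -m state --state NEW \
        -m statistic --mode random --probability "$1" -j ISP1
}

# ISP1 averages 30 ms and ISP2 10 ms, so ISP1 gets a quarter of the load.
replace_rule "$(calc_x 30 10)"
```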
> 
> Last but not least this setup allowed me to program exception tables for
> certain IP blocks. For instance Yahoo has a braindead two tier
> authentication system for commercial solutions. It remembers the IP
> which you used to login with first, and it must match the IP used to
> login to a more secure area (using another password). Or users from
> within the LAN might want to use one of the ISPs' SMTP servers, which
> keeps a close eye on who is talking to it. So I have a $PREFERRED which
> is adjusted to either ISP1 or ISP2, depending on the current state of
> affairs, and rules like:
>     iptables -t mangle -A PREROUTING -d 66.218.64.0/19 \
>         -m state --state NEW -j $PREFERRED
>     iptables -t mangle -A PREROUTING -d 68.142.192.0/18 \
>         -m state --state NEW -j $PREFERRED
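
With more than a couple of blocks, the exception rules could be generated
from a list, roughly as sketched below. The `EXCEPTIONS` variable and the
function name are hypothetical; the two networks are the ones from the
rules above. The commands are printed rather than applied.

```shell
#!/bin/sh
# Hypothetical list of destination blocks that must stick to one ISP.
EXCEPTIONS="66.218.64.0/19 68.142.192.0/18"

# Print one rule per block, sending new connections to the chain in $1
# (ISP1 or ISP2, whichever is currently preferred).
exception_rules() {
    for net in $EXCEPTIONS; do
        echo iptables -t mangle -A PREROUTING -d "$net" \
            -m state --state NEW -j "$1"
    done
}

exception_rules ISP1
```

Since the ISP1 and ISP2 chains end in ACCEPT, the first matching rule
decides, so these rules would need to sit ahead of the statistic rule in
the chain.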
> 
> This pretty much sums it up. The only downside I can think of is that
> loss of service can be observed between two runs of the pinger. Let me
> know if I missed something be it critical or minor.
> 
> Thanks
> 
> Peter
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

