From: Eliezer Croitoru <eliezer@ngtech.co.il>
To: netfilter@vger.kernel.org
Cc: "Humberto Jucá" <betolj@gmail.com>
Subject: Re: A question about routing cache (for load balancing).
Date: Fri, 08 Nov 2013 03:23:56 +0200
Message-ID: <527C3D2C.5010502@ngtech.co.il>
In-Reply-To: <CACuyg25OkrrcSngu9MCUT3vwvQ87L7Y_HCaXvBkY80Hb2jWoqw@mail.gmail.com>
On 11/08/2013 02:39 AM, Humberto Jucá wrote:
> You're talking about this:
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=89aef8921bfbac22f00e04f8450f6e447db13e42
Exactly!!!
> I will need to research about it too.
The main problem with load balancing based on the route cache is that it
is not real load balancing but rather an exploit of how the routing code
is designed.
A route cache entry is per *route*, which means all traffic from this
"src ip" that travels through "this local ip" towards "this remote ip"
will continue to be routed the same way for the next 300 secs.
As a route cache entry shows:
# ip route get 209.169.10.131
209.169.10.131 via 192.168.10.254 dev eth0 src 192.168.10.1
cache mtu 1500 advmss 1460 hoplimit 64
##end
So let's say I open 20 connections towards the same host: the exact
same gateway will be used for as long as garbage collection does not
remove the entry.
In that time the nexthop/gateway could go down and come back up 60 or
more times.
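
As a rough illustration of the stickiness described above, here is a toy
Python model of a per-(src, dst) cache. It is not kernel code: the
RouteCache class, the second gateway 192.168.10.253 and the fixed
300-second lifetime are assumptions made only for this sketch.

```python
# Toy model of a per-destination route cache (not kernel code).
CACHE_TTL = 300  # seconds; the lifetime mentioned above, assumed fixed here


class RouteCache:
    def __init__(self, gateways):
        self.gateways = list(gateways)  # candidate nexthops (hypothetical)
        self.next_index = 0             # round-robin pointer used on a miss
        self.entries = {}               # (src, dst) -> (gateway, expiry time)

    def lookup(self, src, dst, now):
        key = (src, dst)
        hit = self.entries.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]               # cache hit: same gateway again
        # Miss or expired entry: pick the next gateway and cache the choice.
        gateway = self.gateways[self.next_index % len(self.gateways)]
        self.next_index += 1
        self.entries[key] = (gateway, now + CACHE_TTL)
        return gateway


cache = RouteCache(["192.168.10.254", "192.168.10.253"])
# 20 connections to the same host, one per second.
used = {cache.lookup("192.168.10.1", "209.169.10.131", now=t) for t in range(20)}
# Only after the entry expires does the other gateway get a chance.
after_expiry = cache.lookup("192.168.10.1", "209.169.10.131", now=301)
print(used, after_expiry)
```

All 20 lookups land on one gateway; the model never asks whether that
gateway is still alive, which is the weakness being described.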
So what I understood from the kernel change is that a routing system
should use the FIB to calculate the right path for every packet (is
10Mpps enough?), as expected from a dynamic routing system, instead of
routing packets from a cache and falling back to a FIB lookup only when
needed (which can mean two lookups: one in the cache and a second in
the FIB).
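
To make the "FIB lookup on every packet" idea concrete, here is a small
Python sketch of a longest-prefix match that is repeated per packet and
therefore notices a dead nexthop immediately. The FIB contents, the
"alive" set and the fallback to a shorter prefix are my own
simplifications, not how the kernel implements it.

```python
import ipaddress

# Toy FIB: prefix -> nexthops in preference order (all hypothetical).
FIB = {
    ipaddress.ip_network("0.0.0.0/0"): ["192.168.10.254", "192.168.10.253"],
    ipaddress.ip_network("192.168.10.0/24"): ["direct"],
}


def fib_lookup(dst, alive):
    """Longest-prefix match, then the first nexthop currently marked alive."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in FIB if addr in net]
    # Most specific prefix first; falling back to shorter prefixes when no
    # nexthop is alive is a simplification of this sketch.
    for net in sorted(matches, key=lambda n: n.prefixlen, reverse=True):
        for nexthop in FIB[net]:
            if nexthop == "direct" or nexthop in alive:
                return nexthop
    return None  # no usable route


alive = {"192.168.10.254", "192.168.10.253"}
first = fib_lookup("209.169.10.131", alive)
alive.discard("192.168.10.254")           # the primary gateway goes down
second = fib_lookup("209.169.10.131", alive)
local = fib_lookup("192.168.10.7", alive)
print(first, second, local)
```

Because nothing is cached, the second lookup already avoids the failed
gateway instead of waiting for an entry to expire.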
For a load balancer I would assume there is a need for iptables
connection marking, which can really follow the TCP and UDP connections
instead of just routing based on the cache.
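
The idea behind connection marking can be sketched in a few lines of
Python: the first packet of a flow gets a mark, the mark is saved on the
connection, and every later packet of that flow gets the same mark back.
This is only a model of the concept; the mark values, the conntrack dict
and the function name are hypothetical, and it ignores reply-direction
packets, which real connection tracking matches to the same connection.

```python
import itertools

UPLINK_MARKS = [1, 2]            # hypothetical marks, one per uplink
_next_mark = itertools.cycle(UPLINK_MARKS)
conntrack = {}                   # 5-tuple -> mark saved on the connection


def mark_for_packet(proto, src, sport, dst, dport):
    """Return the mark for a packet, assigning one on the flow's first packet."""
    flow = (proto, src, sport, dst, dport)
    if flow not in conntrack:            # new connection: choose an uplink
        conntrack[flow] = next(_next_mark)
    return conntrack[flow]               # later packets: restore saved mark


# Two connections from the same client to the same server, three packets each.
a = [mark_for_packet("tcp", "10.0.0.5", 40000, "209.169.10.131", 80) for _ in range(3)]
b = [mark_for_packet("tcp", "10.0.0.5", 40001, "209.169.10.131", 80) for _ in range(3)]
print(a, b)
```

Unlike the route cache, the two connections here can use different
uplinks even though source and destination addresses are identical,
while each single connection stays on one path.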
In any case, iptables-based load balancing at the TCP level (not the
application level) can take a few more cycles and a bit more RAM, but it
is still faster than any proxy application.
But an application that constantly monitors the LB router and the load
on the service servers can change "static" routes to make sure that the
load is distributed.
In this case there must be some connection tracking on the LB to make
sure that clients' TCP connections do not break in the middle of a
connection, which can lead to a "read error", for example.
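
A minimal Python sketch of that combination, under my own assumptions:
a monitor reports load figures, new flows go to the least loaded server,
and flows that are already tracked stay pinned so they are not broken
when the figures change. The Balancer class, the server names and the
load numbers are all made up for the illustration.

```python
class Balancer:
    def __init__(self, backends):
        self.load = {b: 0 for b in backends}   # backend -> last reported load
        self.flows = {}                        # flow id -> pinned backend

    def report_load(self, backend, value):
        """Called by the (hypothetical) monitor with a fresh measurement."""
        self.load[backend] = value

    def route(self, flow):
        # Established flows stay where they are, whatever the monitor says.
        if flow not in self.flows:
            self.flows[flow] = min(self.load, key=self.load.get)
        return self.flows[flow]

    def close(self, flow):
        """Forget a finished flow so its id can be balanced afresh."""
        self.flows.pop(flow, None)


lb = Balancer(["srv-a", "srv-b"])
lb.report_load("srv-a", 10)
lb.report_load("srv-b", 50)
first = lb.route("flow-1")        # least loaded server at this moment
lb.report_load("srv-a", 90)       # the monitor sees srv-a get busy
same = lb.route("flow-1")         # tracked flow is not moved
new = lb.route("flow-2")          # new flow follows the updated load
print(first, same, new)
```

Only new connections react to the monitor; moving an established one
is what would produce the "read error" on the client side.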
How to handle these errors? I think that is another subject, which I
want to read more about later on.
Eliezer