* [LARTC] Solved: Using more than 1 Internet Line
@ 2001-12-02 19:24 Christoph Simon
2001-12-02 19:41 ` Christoph Simon
` (13 more replies)
0 siblings, 14 replies; 15+ messages in thread
From: Christoph Simon @ 2001-12-02 19:24 UTC (permalink / raw)
To: lartc
[-- Attachment #1: Type: text/plain, Size: 722 bytes --]
I've been looking for about a year to find a way to use more than one
independent link to the Internet from a single computer. But I only
found many others who tried to do the same, without success.
Then I found out about Julian Anastasov's patches, and well, now it's
working. Unfortunately, this is not a five minutes solution, nor will
it help in all kinds of situations. So I decided to write a document
which tells step by step how this can be done and how it works, for
which I received lots of support from Julian. Thanks.
I've Attached the text to this message. I hope it is useful to
others. Here it is already working.
--
Christoph Simon
ciccio@kiosknet.com.br
---
^X^C
q
quit
:q
^C
end
x
exit
ZZ
^D
?
help
.
[-- Attachment #2: nano.txt --]
[-- Type: text/plain, Size: 34350 bytes --]
Nano-Howto to use more than one independent Internet connection.
----------------------------------------------------------------
1. Intro
========
1.0 Thanks
----------
The solution presented here is based on the patches by Julian
Anastasov. Like several other users on the net struggling with this, I
believe they fill in yet another gap for using Linux as a very
advanced router. It was hard for me not only to get this running for
the first time, but also to understand most of the principles. Also in
this Julian has proofed to have unlimited patience. Thanks.
He also agreed to expose this file on his site:
http://www.linuxvirtualserver.org/~julian/nano.txt
1.1 Initial situation
----------------------
I've been looking for a solution to use more than one connection to
Internet at the same time for a long time. I've found out, that I'm
not alone; there seem to be many having this very problem. The reasons
are generally just to cheaply increase total throughput, or to provide
for a fail-over in case of cheap lines. This is not the situation of a
cluster of web-servers, where main traffic is input, but for networks
which are mainly dedicated to any kind of download.
1.2. Bad news
-------------
The setup of all this is not a question of 5 minutes, so I'll tell at
the beginning where are the limits.
First, this will not work for a single computer with two modem
connections. Load balancing does work reasonably well, but only with a
high number of simultaneous connections to different remote sites.
The tests I'm performing are on a network with 17 client computers
with an daily download (12h) of some 600MB of HTTP traffic in several
hundreds of thousands of connections.
When one connection fails while being used, this connection will be
lost and can't be continued on another line. The user will have to
establish this connection anew. If it was a big download and the
server doesn't support continuation, he'll be out of luck.
The lines I'm using have external modems which react as any other
device with Ethernet. There is no PPP, PPPoE, etc. I can't tell if
this works or would need modification for instance to use two
telephone lines. Maybe someone else can fill in this.
1.3 Good news
-------------
What does work is:
- there is no restriction to 2 lines, though I didn't test more.
- using 2.4 kernels
- interaction with ip(8) and iptables(8)
- usage of both lines at the same time
- if any line fails, new connections will use any other which working.
- if a line fails, no reconfiguration of the network is required.
Right now, I can't tell how long it takes in case of a fail over, but
there where several cases I found in the logs and no user realized
that there was a problem.
2. Setup
========
2.1. Preparing the kernel
-------------------------
Stock kernels are not able to provide this functionality. Julian
Anastasov wrote a patch for 2.2 and 2.4 Linux kernels. I only tried
with several 2.4 kernels, now running 2.4.15pre5. The patches have not
been written for this version, but apply cleanly. The patches are
available on Julian's web-page:
http://www.linuxvirtualserver.org/~julian/#routes
Choose the patches for your kernel, download them, and apply all of
them.
The patches will not offer any additional configuration options. The
kernel needs to have "equal cost multi path" enabled. This is done in
Network Options. After choosing "IP: advanced router" there will be an
option for it. Only an indirect requirement is, that we will need NAT:
Any host on the local network needs to be able to appear on the
Internet with any of the external IPs; this is the main purpose of
NAT. To get NAT, of course we have to enable connection tracking. But
this would be done anyway. No further options are required, but any
networking option may be used. Particularly, I've enabled almost all
options for netfilter and QoS and it's working fine.
Nothing special is required to run this kernel (no boot options,
etc.).
2.2 Netfilter issues
--------------------
As I stated at the beginning, to make use of the patches and the
configuration described here, we need a relatively high number of
connections and remote servers. I've tried bitterly, but wasn't able
to achieve this effect alone. I started several simultaneous
downloads, ran Mozilla on several computers at the same time, opening
different big sites, but no avail. When the regular users came, things
suddenly showed to work correctly. So this is certainly not a setup
for a single box, and then, we'll need to do some address
translation. Beside this, a firewall will almost always be
unavoidable, but it's setup is out of scope here. A minimal setup
would enable forwarding in the kernel, establish a default policy of
ALLOW, and create at least two SNAT rules:
iptables -t nat -A POSTROUTING -o IFE1 -s NWE1/NME1 -j SNAT --to IPE1
iptables -t nat -A POSTROUTING -o IFE2 -s NWE2/NME2 -j SNAT --to IPE2
If the ISP didn't give us a fixed IP, for instance using PPPoE, but it
seems that it must be an ARP type device, we would say:
iptables -t nat -A POSTROUTING -o IFE1 -s NWE1/NME1 -j MASQUERADE
iptables -t nat -A POSTROUTING -o IFE2 -s NWE2/NME2 -j MASQUERADE
I didn't try this, but it will work if the analog thing works with
just one line. This is not really new, as we would have to do
something similar anyway, even with only one line to the Internet.
At least for netfilter (not sure for ipfwadm/ipchains), the firewall
must be stateful. This can be done by:
iptables -t filter -N keep_state
iptables -t filter -A keep_state -m state --state RELATED,ESTABLISHED \
-j ACCEPT
iptables -t filter -A keep_state -j RETURN
iptables -t nat -N keep_state
iptables -t nat -A keep_state -m state --state RELATED,ESTABLISHED \
-j ACCEPT
iptables -t nat -A keep_state -j RETURN
and calling this at the beginning of the script:
iptables -t nat -A PREROUTING -j keep_state
iptables -t nat -A POSTROUTING -j keep_state
iptables -t nat -A OUTPUT -j keep_state
iptables -t filter -A INPUT -j keep_state
iptables -t filter -A FORWARD -j keep_state
iptables -t filter -A OUTPUT -j keep_state
2.3 Routing
-----------
Here I'll assume that the computer has three network cards: one to the
local network, and two for each of the external links (DSL modems in
my case). This needs not to be so, as there could be more DSL modems,
and, at least in theory, it's possible to use one card for more than
one modems, using a hub and two or more IPs on the same device. This
is not a recommended setup because it may introduce security problems,
but might be the only one if we had more lines than PCI slots for
additional network cards.
Abbreviations:
IFI internal interface
IPI IP address of internal interface
NMI netmask for the internal interface
IFE1, IFE2 external interfaces
IPE1, IPE2 external IP address
NWE1, NWE2 external network address
NME1, NME2 mask for the external network (number of bits, like in /24)
BRD1, BRD2 broadcast address for external network
GWE1, GWE2 gateway for external interface
2.3.1 The local loopback and the internal interface
---------------------------------------------------
The local loopback must be up and running. There is nothing unusual
about this.
ip link set lo up
ip addr add 127.0.0.1/8 brd + dev lo
At this point, we should be able to ping localhost.
We also set up the internal interface IFI with the internal address
IPI and the netmask for the local network NMI. Assigning an address
will cause the kernel to automatically insert an appropriate route
into table main. This is just fine. Later, there will also be a route
for each external interface, but none of the routes in table main will
have a gateway.
We'll want to give table main a priority of 50 to make sure it is
looked at first.
But also we want to make sure that there is no default route in table
main. The commands so far didn't insert any, but if this is a
reconfiguration, there might remain some from the previous setup. If
there is none, the last command will fail, but that's OK: we just want
to be sure.
ip link set IFI up
ip addr add IPI/NMI brd + dev IFI
ip rule add prio 50 table main
ip route del default table main
At this point, we should be able to ping any host on the local
network.
2.3.2 Setup of external interfaces
----------------------------------
The configuration of the external interfaces isn't really different
from that of the internal, but for now, we don't specify a gateway nor
do we insert a default route.
ip link set IFE1 up
ip addr flush dev IFE1
ip addr add IPE1/NME1 brd BRD1 dev IFE1
ip link set IFE2 up
ip addr flush dev IFE2
ip addr add IPE2/NME2 brd BRD2 dev IFE2
As ip(8) allows to assign more than one address to an interface, it's
important that there doesn't remain any configuration from a previous
setup; failing to do so might cause the same address to be specified
twice. That's what the "addr flush" commands are for. When we actually
specify the address (the last line of each block), the kernel, again,
will insert automatically a route into table main, very much like in
the case of the internal interface.
Now, we should be able to ping the external IPs (IPE1 and IPE2), as
well as the gateways or any other host or device which belongs to
those external network addresses. In my case there is none beside the
gateways. For this reason it would be correct to consider such an
interface like a point-to-point device, but technically it is not.
2.3.3 Setup of the default routes
---------------------------------
This is the complicated part of the setup. During this discussion,
I'll mention the setup commands step by step. In the end, I'll repeat
them all together.
Before really starting, let's make a small experiment: Using any kind
of packet tracer (logs from iptables, tcpdump, etc), we will observe
that if we ping 127.0.0.1, the source address of that packet will be
127.0.0.1; if we ping any address of the local network, the source
address will be that of the internal interface. And if we ping any
host or device outside, the source address will be that of the device
where the packet is going out. Who is being that smart? It might be
the application (e.g., ping) itself, using a bind(2) call, but this is
not normally done for a client. It's the routing system. In other
words, until a packet actually reaches the routing system, it doesn't
have a source address.
Well, in practice, some programs try to be smarter than the routing
system, use the bind(2) call, but actually choose a less than optimal
route. This would easily be the case under our setup, as a program
can't know which of the two external interfaces it should use. Also,
there are some versions of ping which allow the user to specify the
source address. While being an interesting tool also in our setup, the
`being smart' is then really up to the user.
As I stated in the requirements, NAT is one of them and so is
connection tracking. This consists of a set of information for each
known connection, including specific NAT details.
We see, even if the packet originates on the same computer, in our
case, a packet can have one of four addresses: 127.0.0.1, IPI, IPE1,
or IPE2. This is called the local source address, which plays a major
role when doing NAT. Here, this was one of the key concepts I had to
understand to be able to understand the rest.
I'll try to `deduce' the routing commands by following a packet from a
host on the local network. While it might also originate from the
computer we are configuring, this will be the most typical path. I'll
call that host w1 (the first workstation).
The setup of w1 is as usual: It has a loopback interface, a network
card with an IP address compatible with the local network address, and
a default route which uses as a gateway the computer we are
configuring here (IPI). As soon as the user on w1 does anything which
requires access to Internet, her program will set up a socket and call
connect(2). Then, an IP packet will be formed. The routing system on
w1 will set the source address to w1's local IP, the destination
address to that of the remote server on the Internet. It will use the
default route to send this packet to us.
The packet will come in by the local interface IFI, being welcome, if
present, by the prerouting chain of the mangle table, but certainly by
the prerouting chain of the nat table.
The nat table will search the already existing connection structures
and find out, that this is a new connection. It will create an
uninitialized info-block, and pass the packet to the routing system.
At this time, we have the following information: The packet came in by
the local interface IFI, has the source address of w1, but we don't
know yet the local source address.
Being a new connection, there will be nothing matching in the
cache. Also, our configuration so far doesn't have any default route
yet. So we need one which tells what to do with such a packet:
ip rule add prio 222 table 222
ip route add default table 222 proto static \
nexthop via GWE1 dev IFE1 \
nexthop via GWE2 dev IFE2
This is a multipath default route, causing the kernel to extract each
time another alternative; there could be more than these two.
Now, we get two additional informations: the output device (either
IFE1 or IFE2) and the IP address of the gateway (either GWE1 or
GWE2). Actually we also may get a third information, which is our
local IP address, which is not contained in this route, but the system
will deduce it by interface and gateway if it is not a dynamic
IP. That's it, routing is doing for us right now, beside sending the
packet to the forward chain of the filter table.
Of course, forwarding must be enabled in the kernel, and there must
not be any netfilter rule disallowing this packet. After that, the
packet is passed along to the postrouting chain of the nat table.
Using the information about the packet we gathered so far, the NAT
system will look for a rule, and find one of the two we gave in the
previous section. If our IP is fixed, now the local source address
will became known, but if it is dynamic, i.e., we used the MASQUERADE
target, we will need to go back to routing, as there is no other way
to find out the right local source address. The answer will come from
table 222, the multipath route, but this time, also interface and
gateway need to match. The NAT system will insert this into the, now
initialized, NAT information in the connection structure, replace the
source address in the packet by this one (which was still the IP of
w1), and get this packet on it's way to the gateway, going out by the
proper output device.
Note here, that when asking the multipath route, table 222, with a
known output device and gateway, the kernel still has to deduce the IP
address. It will do so, looking at the first address attached to the
interface which, by netmask, is compatible with the gateway
address. If there were more than one possible answers, only the first
will be used. This might be the case if we have both lines on the same
network card, which will work only if we use the MASQUERADE target in
the iptables rules. If we didn't use the gateway (stock kernels don't)
to make this match, we surely would be in troubles, because the output
device is the same for both. This means: To be able to use more than
one Internet connection on one and the same network card, the gateways
must be different, the IP addresses must be public IPs and must not
belong to the same subnet. Additionally, we should avoid SNAT in the
iptables rules and take special measures to protect against IP
spoofing.
The gateway, and potentially lots of routers in the Internet, are
supposed to do the right thing to make this packet reach it's
destination. There, the answer packet will use the source address
(which is now either IPE1 or IPE2) to come back into our computer by
the same interface where it left. Essentially, the procedure is
reversed; our routing table will find the way to w1 already in table
main.
Most user actions on the Internet aren't happy with just one packet of
data; let's have a look what happens if a second packet is sent from
w1 to the same destination:
Again, the packet will arrive at the local interface of this computer
and pass through the prerouting chain of the mangle table if that
exists. The prerouting chain of the nat table will start looking for
an existing, matching connection, and this time, it will find one. As
there is no DNAT rule in the netfilter setup for this packet, NAT will
just hand over the packet to the routing system.
The routing system will look at the connection, and the patches will
find that there is NAT information present, and that after routing
some mangling will be done. Now, the routing question will give the
input device (IFI), the source address (the IP of w1), the destination
address (that of the server on the Internet), and the local source
address (either IPE1 or IPE2). In this situation, we do not want to
match the multipath route, because the multipath route doesn't look at
local source address, we couldn't make sure that we get this time the
same line as before. So we add a table for each interface:
ip rule add prio 201 from NWE1/NME1 table 201
ip route add default via GWE1 dev IFE1 src IPE1 proto static table 201
ip route append prohibit default table 201 metric 1 proto static
ip rule add prio 202 from NWE2/NME2 table 202
ip route add default via GWE2 dev IFE2 src IPE2 proto static table 202
ip route append prohibit default table 202 metric 1 proto static
This can't be matched when table 222 should be matched, as we now
require the local source address to be known. We choose a priority
that makes sure these routes are found after the main table and before
table 222, which is more general than these.
These are still default route, because they must match any Internet
address.
The third line of each block is similar to a REJECT target in iptables
in case the corresponding interface is not working: If the client on
the local network sends a packet on an established connection, but in
the meanwhile the interface stopped operating, we will send this
client an ICMP controll message `PKT_FILTERED', hoping to cause it to
stop sending packets, and the user might wish to open a new
connection, which will succeed if at least one other line is still
working.
Stepping through all routing tables, either table 201 or table 202
will match, according to the output device selected when we matched
the multipath route for the first packet. The tricky part is, what the
patches are doing here: They look at the local source address. If it
is zero, the old behavior would just match the multipath route in
table 222, but as we do know already the local source address (i.e.,
it is not zero) from what we found in the connection structure,
particularly what concerns NAT, we will try to match the rules against
the local source address rather than the packet's source address,
which is still the IP of w1. This is why either table 201 or table 202
will provide the match, telling the gateway address and the output
device.
Having done it's job, the routing system will send the packet to the
forward chain of the filter table, to continue in the postrouting
chain of the nat table. That has an easy game now, because, no matter
if the found target is MASQUERADE or SNAT, all necessary information
is already known. The packet's source address will be replaced by our
local source address and the packet can continue it's way going out by
the chosen output device, and reach it's destination through the
Internet. The same procedure is repeated for all further packets,
because the connection information is kept alive during a certain
time, even after closing the connection (FIN or RST).
Now let's have a look at a packet which originates on this computer
with a destination on the Internet. As we found out in our experiment,
the source address is not known before routing, and we won't need any
NAT. Here we will assume that the program which generates this packet
does not use the bind() call. If it would, either table 201 or table
202, according to the IP this program binds to, would be used and no
(further) load balancing can take place. So, the packet will start
crossing the output chain of the mangle table, if that is present, and
go to the output chain of nat. As there is no rule for a destination
nat, it will continue being evaluated in the routing system. No local
source address is known yet, so the multipath route in table 222 will
match. This will provide the output device, the gateway and also the
local source IP. Now the connection information can be completed. With
that, the packet will cross the output chain of the filter table and
reach the postrouting chain of the nat table. From there the packet
can be sent to the gateway. From now on, the connection is known and
all further packets will use this path.
Now the full sequence of commands, all together. For convenience, I'll
repeat the meaning of the abbreviations:
Abbreviations:
IFI internal interface
IPI IP address of internal interface
NMI netmask for the internal interface
IFE1, IFE2 external interfaces
IPE1, IPE2 external IP address
NWE1, NWE2 external network address
NME1, NME2 mask for the external network (number of bits, like in /24)
BRD1, BRD2 broadcast address for external network
GWE1, GWE2 gateway for external interface
ip link set IFI up
ip addr add IPI/NMI brd + dev IFI
ip rule add prio 50 table main
ip route del default table main
ip link set IFE1 up
ip addr flush dev IFE1
ip addr add IPE1/NME1 brd BRD1 dev IFE1
ip link set IFE2 up
ip addr flush dev IFE2
ip addr add IPE2/NME2 brd BRD2 dev IFE2
ip rule add prio 201 from NWE1/NME1 table 201
ip route add default via GWE1 dev IFE1 src IPE1 proto static table 201
ip route append prohibit default table 201 metric 1 proto static
ip rule add prio 202 from NWE2/NME2 table 202
ip route add default via GWE2 dev IFE2 src IPE2 proto static table 202
ip route append prohibit default table 202 metric 1 proto static
ip rule add prio 222 table 222
ip route add default table 222 proto static \
nexthop via GWE1 dev IFE1 \
nexthop via GWE2 dev IFE2
Let's check it out:
ip address
This should print on the terminal one entry for the local loopback,
IFI, IFE1 and IFE2, and maybe some other things, if we have it
configured (like my GRE tunnels).
ip rule
This should look like this:
0: from all lookup local
50: from all lookup main
201: from NWE1/NME1 lookup 201
202: from NWE2/NME2 lookup 202
222: from all lookup 222
32766: from all lookup main
32767: from all lookup default
Table local is used for the local loopback, table main has the network
routes to the internal network and for the external interfaces, which
only give access to our gateways. Tables 201 and 202 (which also might
have the same priority), will provide a default route if the local
source address is known (because they have to match NWE1 or NWE2). And
table 222 will provide the multipath route. The tables with priority
32766 and 32767 will not be used.
ip route list table main
Giving:
NWI/NMI dev IFI proto kernel scope link src IPI
NWE1/NME1 dev IFE1 proto kernel scope link src IPE1
NWE2/NME2 dev IFE2 proto kernel scope link src IPE2
These are only routes to the corresponding networks without using a
gateway.
ip route list table 201
Giving:
default via GWE1 dev IFE1 proto static src IPE1
prohibit default proto static metric 1
And:
ip route list table 201
Giving:
default via GWE1 dev IFE1 proto static src IPE1
prohibit default proto static metric 1
These are the default routes requiring the local source address to be
known. And finally:
ip route list table 222
Giving:
default proto static
nexthop via GWE1 dev IFE1 weight 1
nexthop via GWE2 dev IFE2 weight 1
which of course is our multipath route. The weights where inserted by
the kernel and could be changed. Actually, they need to be changed
specially if the two lines have a very different capacity. With this
configuration, both lines are going to be used the same amount of
times (at least in the long run), even in case the first line has 4
times the capacity of the second. To compensate for that, we would add
"weight 4" after IFE1, like in:
ip route add default table 222 proto static \
nexthop via GWE1 dev IFE1 weight 4\
nexthop via GWE2 dev IFE2 weight 1
The numbers should be as small as possible, because in this case, the
kernel will create 5 alternative routes, four of them being the
same. For this reason, it wouldn't really be an equivalent to say
"weight 400" and "weight 100", nor would it be a good idea.
Now, let's see how the kernel chooses certain routes:
Asking, how to reach the gateway:
ip route get GW1
We get the route from table main:
GW1 dev IFE1 src IPE1
cache mtu 1500
Now, let's ask for the route to a host on the internet. Note, that the
ip(8) command does no DNS lookups; we need to provide the IP. This one
goes to www.kernel.org:
ip route get 204.152.189.113
My system answered:
204.152.189.113 via GWE1 dev IFE1 src IPE1
cache mtu 1500
i.e., having chosen the first interface. As it got this from the
cache, it won't be easy to get the answer for the second interface
until the cache expires, or we flush it explicitly. If not found in
the cache, this answer must come from table 222. To test tables 201
and 202, we can use:
ip route get from IPE1 to 204.152.189.113
ip route get from IPE2 to 204.152.189.113
and would get respectively:
204.152.189.113 from IPE1 via GWE1 dev IFE1
cache mtu 1500
204.152.189.113 from IPE2 via GWE2 dev IFE2
cache mtu 1500
Up to here, we know how all external lines are being used while they
are working. It's also easy to guess what happens if all external
lines fail. But let's see what happens if only one external line
fails:
At a first glance, nothing special happens. If one of the routes in
the multipath statement is unavailable, it just stops being eligible
and we get always the other one (one of the others, if more than
two). But it is less evident, what happen during the transition.
When we try to send a packet to another device, and that fails, like
in real live, we'll blame the other side. But if we can't reach one
neighbor, it doesn't mean necessarily that we can't reach any
neighbor. This depends on the reason for the failure. If the problem
is in our network card or in the cable attached to it, we won't reach
any other, but it's also possible that only this remote device is
down. Well, in our particular case, this is the same, because, from
the functional point of view, this is a point to point connection, and
beside the gateway and ourselves, there is no other device. But the
lower level transport system can't know that and hence will mark this
link just as suspected to fail. Neither the routing cache nor the
connection information have yet knowledge of the problem, and the
user's connection will block.
After a certain time, Ethernet will consider the gateway dead and mark
it as unreachable. If this is because the link was brought down, the
routing cache will be flushed immediately, forcing the next packet to
choose a new route, which will get only a working one. Otherwise, the
packet could use a route in the cache which is actually unreachable,
causing it to fail: All connections which are in the cache and which
use this route will just go nowhere. After a certain interval of time,
the entries in the routing cache expire, and than the route have to be
looked for again, offering only working routes, and things will work
again.
From the user's point of view: An established connection using a line
which stops working will be lost. She'll need to establish a new
connection. Until the cache expires, a new connection will work only
if the route isn't yet in the cache, or if the cached route uses a
working line. But if the route is cached and uses the failing line,
the connection will not work.
The time until the cache expires is controlled by the kernel variable
/proc/sys/net/ipv4/route/gc_interval and defaults to 60 seconds. The
bigger this number, the longer it will take until failing routes are
not being used again, but the shorter this time, the smaller the
chances to find a connection in the cache, loosing time to look up the
connection in the routing tables.
The inverse is analog: If the line comes back up, cached routes won't
use it. But routes found by matching the multipath route in table 222
will consider this route as eligible again, and possibly use it. But
more on this in the next section.
2.4 Keeping them alive
----------------------
We're almost done. There is a subtle issue about the kernel detecting
the state of an interface. We can not find out whether an interface is
working without just trying to use it. As long as the kernel doesn't
use the interface, it can't know if it is working or not. When we try
to use it, the kernel will mark the interface according to the result
of that operation. If this means, the interface is down, and we
continue trying to access Internet, the kernel will simply use the
other interface. How then can the kernel find out, that the failing
interface is working again? Well, it can't, because it wouldn't even
try. This is good, because users we wouldn't like to have to wait for
a timeout with each packet. But then, there needs to be a way to
notice when the interface comes back up. The only way is to use the
interface explicitly with non critical packets. Before using these
patches I had written a daemon which would set up explicit routes to
each gateway and send pings in regular intervals (once per
minute). Then the default route was set either to the preferred
interface or to the only one working. Now, this daemon doesn't need to
change routes anymore, but it still has to send the pings.
But there isn't really a daemon needed for this. A simple script:
while : ; do
ping -c 1 GWE1 > /dev/null 2>&1
ping -c 1 GWE2 > /dev/null 2>&1
sleep 60
done
can do this.
At the beginning, I was worried because I've a link where the provider
doesn't reply pings. But this is really not necessary. The trick is,
that ARP requests must be answered for anything to work. One
consequence of this is, that for now I'm only sure that this works for
devices which are based on Ethernet, i.e., which use the ARP. Other
devices, like dialup links, can only work if the underlying network
protocol provides a correct status change. But I'm not aware if or
which Linux protocols are implemented this way.
What I've learned from this is, that it's not enough to just set a
device up or down, or to add or remove a route. Somehow, the kernel
must also be told, that this device or route is working or stopped
doing so.
Running the pings once every minute all the time isn't what seems to
be an elegant solution, but it works. Hopefully, someday we'll get
this done behind the scene.
During development of this setup, I've considered using arping rather
than ICMP echo replies. Both can be problematic. The gateway can drop
echo messages, so ping wouldn't work. On the other hand, I've noticed
in two situations that the provider apparently had installed some kind
of load balancing making the MAC address of the gateway toggling from
one to another. This renders arping useless. So I stayed with
ping. This will work anyway, because my daemon doesn't need to know
the result: Even if I don't get an ICMP echo reply, the kernel will
fill in the right MAC address for this gateway and know that it is
reachable or not.
It seems that newer versions of arping use a broadcast mode, which
would solve the problem with a provider that changes the MAC address
of the gateway every few minutes. I didn't try it yet, but I will.
3. Comments
3.2 About the patches
---------------------
Some of us are curious and wish to know what the Julian's patches
actually do. Here is a summary:
- With the patches enabled, we can use "proto static" for all
routes. Doing so, the route for a link will survive even if that
device stops working. This is essential, specially if the reason for
using two lines is redundancy: If a line fails, all attributes of
the device remain, beside the fact that it can't be used until it
comes up again. We can continue pinging the gateway for to detect
the moment the link comes up again. Equal cost multipath exists also
without these patches, but wouldn't work in such a situation.
- If the device continues existing even if it doesn't work, the kernel
needs a way to decide whether this is a route might be chosen,
specially if several are available. This is called the ``dead
gateway detection''; it works for any kind of unicast routes
(multipath/unipath, input/output) and provided by these patches.
- A multipath route consists of several routes with the same source
and destination, but a different path, which is specified by the
device and the gateway. It is because of the patches that the kernel
knows how to distinguish the route not only by source and
destination, but also by device and gateway. Also, at least when
each line has a different device, the kernel's reverse path checking
(rp_filter) will work correctly.
- The NAT system has no idea how to deal with alternative routes, this
is something the routing system has to do. And one of the additions
of these patches is make the routing system prepare the route of an
individual packet for the nat system, such that it looks as if it
where the only possible route.
3.1 Practical experience
-----------------------
It's very little time that I use this, so obviously I can't give any
long term results. But after the first few days of working, I can say
that load-balancing is working reasonably well. I could imagine a more
fine grained behavior which would benefit single users with two DSL
lines, but in my particular situation I've seen a difference on total
load of only 15%.
Fail-over happened already several times and went by without being
noticed by the users. In the time I'm using this, I didn't experience
any problem. It just works :-).
ciccio@kiosknet.com.br
^ permalink raw reply [flat|nested] 15+ messages in thread* [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
@ 2001-12-02 19:41 ` Christoph Simon
2001-12-03 20:22 ` Whit Blauvelt
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Simon @ 2001-12-02 19:41 UTC (permalink / raw)
To: lartc
I've been looking for about a year to find a way to use more than one
independent link to the Internet from a single computer. But I only
found many others who tried to do the same, without success.
Then I found out about Julian Anastasov's patches, and well, now it's
working. Unfortunately, this is not a five minutes solution, nor will
it help in all kinds of situations. So I decided to write a document
which tells step by step how this can be done and how it works, for
which I received lots of support from Julian. Thanks.
The text can be found at:
http://www.linuxvirtualserver.org/~julian/nano.txt
--
Christoph Simon
ciccio@kiosknet.com.br
---
^X^C
q
quit
:q
^C
end
x
exit
ZZ
^D
?
help
.
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
2001-12-02 19:41 ` Christoph Simon
@ 2001-12-03 20:22 ` Whit Blauvelt
2001-12-03 20:45 ` Christoph Simon
` (11 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Whit Blauvelt @ 2001-12-03 20:22 UTC (permalink / raw)
To: lartc
Thanks Christoph (and Julian!), by happy coincidence this is exactly what
I'm looking for today.
In nano.txt you say the firewall, for iptables, must be stateful. Of course,
ipchains doesn't do stateful. I'm looking at using Julian's patches with a
2.2.20 kernel and ipchains and masquerading. Does anyone know offhand
whether I should:
1. Expect this to work?
2. Expect this to get weird?
If 2:
- What weirdness should I look out for?
- What, in theory, is the statefulness accomplishing in this context?
Whit
On Sun, Dec 02, 2001 at 05:41:54PM -0200, Christoph Simon wrote:
> I've been looking for about a year to find a way to use more than one
> independent link to the Internet from a single computer. But I only
> found many others who tried to do the same, without success.
>
> Then I found out about Julian Anastasov's patches, and well, now it's
> working. Unfortunately, this is not a five minutes solution, nor will
> it help in all kinds of situations. So I decided to write a document
> which tells step by step how this can be done and how it works, for
> which I received lots of support from Julian. Thanks.
>
> The text can be found at:
>
> http://www.linuxvirtualserver.org/~julian/nano.txt
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
2001-12-02 19:41 ` Christoph Simon
2001-12-03 20:22 ` Whit Blauvelt
@ 2001-12-03 20:45 ` Christoph Simon
2001-12-03 21:43 ` Julian Anastasov
` (10 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Simon @ 2001-12-03 20:45 UTC (permalink / raw)
To: lartc
On Mon, 3 Dec 2001 15:22:25 -0500
Whit Blauvelt <whit@transpect.com> wrote:
> Thanks Christoph (and Julian!), by happy coincidence this is exactly what
> I'm looking for today.
>
> In nano.txt you say the firewall, for iptables, must be stateful. Of course,
> ipchains doesn't do stateful. I'm looking at using Julian's patches with a
> 2.2.20 kernel and ipchains and masquerading. Does anyone know offhand
> whether I should:
>
> 1. Expect this to work?
>
> 2. Expect this to get weird?
>
> If 2:
>
> - What weirdness should I look out for?
>
> - What, in theory, is the statefulness accomplishing in this context?
Honestly, I have no idea. Maybe Julian can help you out. I've tested
this only on 2.4 with iptables. The setup isn't a three mouseclick
action, but it's also not that bad, that you couldn't just try.
--
Christoph Simon
ciccio@kiosknet.com.br
---
^X^C
q
quit
:q
^C
end
x
exit
ZZ
^D
?
help
.
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (2 preceding siblings ...)
2001-12-03 20:45 ` Christoph Simon
@ 2001-12-03 21:43 ` Julian Anastasov
2001-12-03 22:04 ` Arthur van Leeuwen
` (9 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Julian Anastasov @ 2001-12-03 21:43 UTC (permalink / raw)
To: lartc
Hello,
On Mon, 3 Dec 2001, Whit Blauvelt wrote:
> Thanks Christoph (and Julian!), by happy coincidence this is exactly what
> I'm looking for today.
>
> In nano.txt you say the firewall, for iptables, must be stateful. Of course,
> ipchains doesn't do stateful. I'm looking at using Julian's patches with a
Assume that this is recommendation (should).
> 2.2.20 kernel and ipchains and masquerading. Does anyone know offhand
> whether I should:
>
> 1. Expect this to work?
If the settings are correct and I didn't broke something
when building all pieces together. The end goal of the patches
both for 2.2 and 2.4 should be same. The implementations differ,
Netfilter is more suitable for such changes while 2.2 has some
weirdness supporting these extensions. The same weirdness you can
see in the changes for the ipchains compat code in 2.4.
> 2. Expect this to get weird?
>
> If 2:
>
> - What weirdness should I look out for?
Make some tests before going to production :) It needs
some understanding. That is why the document Christoph wrote is
so useful.
> - What, in theory, is the statefulness accomplishing in this context?
If I understand your question correctly, this is not a goal.
It is a conntracking specific thing which is not touched from these
patches. These patches change the routing and the way it is used from
NAT after adding or extending some nice features.
> Whit
Regards
--
Julian Anastasov <ja@ssi.bg>
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (3 preceding siblings ...)
2001-12-03 21:43 ` Julian Anastasov
@ 2001-12-03 22:04 ` Arthur van Leeuwen
2001-12-03 22:19 ` Christoph Simon
` (8 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Arthur van Leeuwen @ 2001-12-03 22:04 UTC (permalink / raw)
To: lartc
On Mon, 3 Dec 2001, Whit Blauvelt wrote:
> Thanks Christoph (and Julian!), by happy coincidence this is exactly what
> I'm looking for today.
> In nano.txt you say the firewall, for iptables, must be stateful. Of course,
> ipchains doesn't do stateful. I'm looking at using Julian's patches with a
> 2.2.20 kernel and ipchains and masquerading. Does anyone know offhand
> whether I should:
> 1. Expect this to work?
Yes.
> 2. Expect this to get weird?
Maybe. But that's not related to 2.2, it can happen with 2.4 as well.
> If 2:
>
> - What weirdness should I look out for?
OpenSSH sets up the TOS fields *after* authenticating. This breaks the
entries in the route-cache, as they are keyed on source, destination and TOS
field.
> - What, in theory, is the statefulness accomplishing in this context?
Don't really know, as I haven't needed it. (I've set up a similar system
with only 2.2, never even so much as thinking about 2.4).
Doei, Arthur. (Oh... in my opinion the firewalling is an optional extra.)
--
/\ / | arthurvl@sci.kun.nl | Work like you don't need the money
/__\ / | A friend is someone with whom | Love like you have never been hurt
/ \/__ | you can dare to be yourself | Dance like there's nobody watching
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (4 preceding siblings ...)
2001-12-03 22:04 ` Arthur van Leeuwen
@ 2001-12-03 22:19 ` Christoph Simon
2001-12-03 22:33 ` Whit Blauvelt
` (7 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Simon @ 2001-12-03 22:19 UTC (permalink / raw)
To: lartc
On Mon, 3 Dec 2001 23:45:49 +0000 (GMT)
Julian Anastasov <ja@ssi.bg> wrote:
> On Mon, 3 Dec 2001, Whit Blauvelt wrote:
>
> > Thanks Christoph (and Julian!), by happy coincidence this is exactly what
> > I'm looking for today.
> >
> > In nano.txt you say the firewall, for iptables, must be stateful. Of course,
> > ipchains doesn't do stateful. I'm looking at using Julian's patches with a
>
> Assume that this is recommendation (should).
I was reviewing my notes and now I believe, that statefulness should
not strictly be required, though at least for netfilter, I think it is
very recommendable. What is required is connection tracking, as
Julian's patches use their information for routing decisions. If you
use the netfilter snat target, actually it shouldn't make a
difference, as far as I can see. Maybe there is one, when using the
masquerade target, because there might be a second call to the routing
system, but I tend to believe that not. It's possible that I'm
completely wrong, but I think it's definitively worth a try.
--
Christoph Simon
ciccio@kiosknet.com.br
---
^X^C
q
quit
:q
^C
end
x
exit
ZZ
^D
?
help
.
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (5 preceding siblings ...)
2001-12-03 22:19 ` Christoph Simon
@ 2001-12-03 22:33 ` Whit Blauvelt
2001-12-03 22:44 ` Julian Anastasov
` (6 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Whit Blauvelt @ 2001-12-03 22:33 UTC (permalink / raw)
To: lartc
On Mon, Dec 03, 2001 at 11:04:20PM +0100, Arthur van Leeuwen wrote:
> > - What weirdness should I look out for?
>
> OpenSSH sets up the TOS fields *after* authenticating. This breaks the
> entries in the route-cache, as they are keyed on source, destination and TOS
> field.
Hmm, I use OpenSSH all the time. Does this fully break it, or just add some
flakiness?
Thanks again,
Whit
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (6 preceding siblings ...)
2001-12-03 22:33 ` Whit Blauvelt
@ 2001-12-03 22:44 ` Julian Anastasov
2001-12-04 8:52 ` Arthur van Leeuwen
` (5 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Julian Anastasov @ 2001-12-03 22:44 UTC (permalink / raw)
To: lartc
Hello,
On Mon, 3 Dec 2001, Whit Blauvelt wrote:
> On Mon, Dec 03, 2001 at 11:04:20PM +0100, Arthur van Leeuwen wrote:
>
> > > - What weirdness should I look out for?
> >
> > OpenSSH sets up the TOS fields *after* authenticating. This breaks the
> > entries in the route-cache, as they are keyed on source, destination and TOS
> > field.
>
> Hmm, I use OpenSSH all the time. Does this fully break it, or just add some
> flakiness?
It breaks only the plain kernel. The patches use the
multipath routes only for the first packet.
> Thanks again,
> Whit
Regards
--
Julian Anastasov <ja@ssi.bg>
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (7 preceding siblings ...)
2001-12-03 22:44 ` Julian Anastasov
@ 2001-12-04 8:52 ` Arthur van Leeuwen
2001-12-04 10:57 ` Julian Anastasov
` (4 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Arthur van Leeuwen @ 2001-12-04 8:52 UTC (permalink / raw)
To: lartc
On Tue, 4 Dec 2001, Julian Anastasov wrote:
>
> Hello,
>
> On Mon, 3 Dec 2001, Whit Blauvelt wrote:
>
> > On Mon, Dec 03, 2001 at 11:04:20PM +0100, Arthur van Leeuwen wrote:
> >
> > > > - What weirdness should I look out for?
> > >
> > > OpenSSH sets up the TOS fields *after* authenticating. This breaks the
> > > entries in the route-cache, as they are keyed on source, destination and TOS
> > > field.
> >
> > Hmm, I use OpenSSH all the time. Does this fully break it, or just add some
> > flakiness?
>
> It breaks only the plain kernel. The patches use the
> multipath routes only for the first packet.
Okay, I've read both the nanohowto and the docs on Julian's patches by now.
A few things to note: the nanohowto's information is good even without
Julian's patches, although things will become trickier. One has to do ones
own link-probing and rerouting from userland. That is very doable however,
provided you have machines somewhere at the ISP's site that will answer to
either pings ore traceroutes or somesuch, as you will need answers.
The patches Julian provided fix a bunch of nastiness. For one, dead gateway
detection is done on the ARP level in kernelspace. Very neat when you have
ARP, thus on ethernet, but not very useful without. Furthermore, they
provide true alternative routes, not only multipath default routes. This is
once more extremely neat, but not directly necessary for the usual case.
Thirdly, Julian's patches add gateways as a routing key. This will not help
pure routing boxes, such as would be standard issue in an office full of
Windows toasters, as the gateway will be determined at the routing stage, so
it cannot be used as a key.
The *main* reason to use Julian's patches is the masquerading connection
rerouting. This will fix the big bugs in your setup by just redirecting a
masqueraded connection out to a different interface when the old one is
dead. This is *very* cool on UDP, and will make UDP failover to another
route fully transparent. However, it will not fix stateful protocols in
which the server on the other side keeps state on the IP address it was
talking to, such as SSH. It will fix the TOS nastiness OpenSSH brings to the
fore, as it will *reroute* after masquerading. Bit of a hack, that. I
simply nixed the TOS bits in the firewalling code. :)
Summarizing: Yes, you can do equal cost multipath. Yes it is cool. Yes it
can be made nicer and friendlier to set up using Julian's patches. However,
it will not be an ideal solution. Things *will* break. Load will just be
approximately balanced. Failover is in most cases definitely not transparent
to the user: new connections have to be set up. If the links stay up though,
equal cost multipath is a *good* thing.
Oh, and it does work on >2 uplinks. I've set up a system for a client using
1 ISDN line, 2 ADSL links (with the Dutch MXStream cruftiness, but I
digress) and 1 cable modem using masquerading only on the last three, using
the standard kernel (Julian's patches didn't exist a year ago, when I did
this). Worked splendidly (and still does, I'm told). Needed some manual
supervision though, as link failover and especially failback is *not*
trivial.
Doei, Arthur.
--
/\ / | arthurvl@sci.kun.nl | Work like you don't need the money
/__\ / | A friend is someone with whom | Love like you have never been hurt
/ \/__ | you can dare to be yourself | Dance like there's nobody watching
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (8 preceding siblings ...)
2001-12-04 8:52 ` Arthur van Leeuwen
@ 2001-12-04 10:57 ` Julian Anastasov
2001-12-04 11:05 ` Christoph Simon
` (3 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Julian Anastasov @ 2001-12-04 10:57 UTC (permalink / raw)
To: lartc
Hello,
On Tue, 4 Dec 2001, Arthur van Leeuwen wrote:
> Okay, I've read both the nanohowto and the docs on Julian's patches by now.
> A few things to note: the nanohowto's information is good even without
> Julian's patches, although things will become trickier. One has to do ones
> own link-probing and rerouting from userland. That is very doable however,
what do you mean with "rerouting from userland"?
> provided you have machines somewhere at the ISP's site that will answer to
> either pings ore traceroutes or somesuch, as you will need answers.
BTW, only ARP answers are needed, they can't be filtered
with an excuse of ISP's policy. The ICMP replies are used from
the userland only (for monitoring). Of course, this is one solution,
as you said, one can do the same from user space by changing the
routes after a failover. Even from user space it can be faster in
some cases as the kernel do passive dead gateway (neighbour) detection,
not active.
>
> The patches Julian provided fix a bunch of nastiness. For one, dead gateway
> detection is done on the ARP level in kernelspace. Very neat when you have
> ARP, thus on ethernet, but not very useful without. Furthermore, they
Right. But there are non-ARP device drivers that can alter
the device link state. For others, it is in the hands of a user program.
At least, the patches handle situations with such pointtopoint devices.
> provide true alternative routes, not only multipath default routes. This is
> once more extremely neat, but not directly necessary for the usual case.
>
> Thirdly, Julian's patches add gateways as a routing key. This will not help
> pure routing boxes, such as would be standard issue in an office full of
> Windows toasters, as the gateway will be determined at the routing stage, so
> it cannot be used as a key.
The gateway as key has the purpose only for the NAT to select
the best source address according to data provided at routing time
about the path from the multipath route that is selected for this
packet. No other hosts depend on this. It is a way to select a source
IP for the link already selected at routing time when you are using
paths with same output device in the multipath route. We can
distinguish them only by outdev and gateway. The plain kernel
matches them only by device with the assumptions that the other variant
is insecure.
>
> The *main* reason to use Julian's patches is the masquerading connection
> rerouting. This will fix the big bugs in your setup by just redirecting a
> masqueraded connection out to a different interface when the old one is
> dead. This is *very* cool on UDP, and will make UDP failover to another
> route fully transparent. However, it will not fix stateful protocols in
It is no protocol specific. As for the established connections,
they are already bound to specific source and the number of variants are
equal to the number of links that serve this public IP (1 in most of
the setups), so, there is no place for failover for the masqueraded
connections in the usual setups. As for the plain usage of alternative
routes even TCP can start to use them except when the socket is
bound to device. In all other cases you can setup subnet over 2
devices and when one fails the TCP connections (with short gc_interval)
will switch to the other line (slooowly sometimes).
> which the server on the other side keeps state on the IP address it was
> talking to, such as SSH. It will fix the TOS nastiness OpenSSH brings to the
Nobody changes IP addresses, once the connection is created
its traffic can go only through the allowed devices for the addresses.
The multipath route is used only from the new connections to select
different line for them. So, the remote end will not detect packets
with changed source. We simply can't send them through the wrong device.
The routes say so.
> fore, as it will *reroute* after masquerading. Bit of a hack, that. I
> simply nixed the TOS bits in the firewalling code. :)
The changes in 2.2 are a big hack. But the masquerading there
will be not slower than the plain 2.2 kernel because the ip_route_output
call is still permanently there to select maddr which is already known.
> Summarizing: Yes, you can do equal cost multipath. Yes it is cool. Yes it
> can be made nicer and friendlier to set up using Julian's patches. However,
> it will not be an ideal solution. Things *will* break. Load will just be
These patches fix small number of things. You can do more
of the work (even the same work) with unpatched kernel and some
scripts in user space. The problems come when the route cache entries
expire.
> approximately balanced. Failover is in most cases definitely not transparent
> to the user: new connections have to be set up. If the links stay up though,
The failover simply depends on gc_interval (used from
the new connections only). As for the broken established connections:
nobody can resurrect them if they have only one line available for
the public IP addresses they use. If you can send traffic from one
public IP through 2 external devices and if the 1st fails, the other
will be used. Only the local connections bound to device will die, the
NAT and the unbound TCP sockets can use another alive device if there
are routes that say so.
> equal cost multipath is a *good* thing.
>
> Oh, and it does work on >2 uplinks. I've set up a system for a client using
Yes, it is tested, for example, with 2 WANs and one ethernet
to route packets to another router with WAN. 2 routers, 3 WANs total.
In short, solution for end hosts and routers that use border gateways.
All other (VRRP, etc) solutions for failover are for intermediate
routers. You can't move the physical link from the failed box to
another in some of the cases.
> 1 ISDN line, 2 ADSL links (with the Dutch MXStream cruftiness, but I
> digress) and 1 cable modem using masquerading only on the last three, using
> the standard kernel (Julian's patches didn't exist a year ago, when I did
> this). Worked splendidly (and still does, I'm told). Needed some manual
> supervision though, as link failover and especially failback is *not*
> trivial.
There are some things in these patches that are not for the
innocent Linux user: the need for active monitoring from user space
for the gateways in the multipath and the alternative routes. I have
an idea to do active monitoring in kernel but this is TODO. The
problem is in fact for the multipath routes: we need valid neighbour
state for these gateways (provided from ARP pings). If the ARP info
is not known we can end with using only part of the paths in the
multipath route and this bad, at least unexpected for the users
that do not feed the kernel with fresh ARP info. But in such situation
it seems they don't demand failover detection.
> Doei, Arthur.
Regards
--
Julian Anastasov <ja@ssi.bg>
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (9 preceding siblings ...)
2001-12-04 10:57 ` Julian Anastasov
@ 2001-12-04 11:05 ` Christoph Simon
2001-12-04 16:13 ` Don Cohen
` (2 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Simon @ 2001-12-04 11:05 UTC (permalink / raw)
To: lartc
On Tue, 4 Dec 2001 09:52:49 +0100 (MET)
Arthur van Leeuwen <arthurvl@sci.kun.nl> wrote:
> Okay, I've read both the nanohowto and the docs on Julian's patches by now.
> A few things to note: the nanohowto's information is good even without
> Julian's patches, although things will become trickier. One has to do ones
> own link-probing and rerouting from userland. That is very doable however,
> provided you have machines somewhere at the ISP's site that will answer to
> either pings ore traceroutes or somesuch, as you will need answers.
I'm not sure which is the situation Julian wanted to address with the
patches, but having more than one line and using both of them at the
same time, while being able to continue in case of failures, was mine
to use those patches: I have no relation to my ISPs, they are not
going to cooperate in any sense, often not even paying lots of money,
and often they are doing things just to make it more difficult,
because they want our money, but the do not want that we actually
_use_ their installations (this is a special greating for spanish
Telefónica and spanish Terra).
If I had access to ISP cooperation, I could use a perfect line
balancing between two known machines, or much easier: just ask for one
bigger line, But the idea is just NOT to use the same ISP because they
are unreliably. Again, just let me name as an example the spanish
company Terra operating in Brazil, which has huge money and
demonstrates record beating levels of incompetence and bad faith.
> The patches Julian provided fix a bunch of nastiness. For one, dead gateway
> detection is done on the ARP level in kernelspace. Very neat when you have
> ARP, thus on ethernet, but not very useful without. Furthermore, they
> provide true alternative routes, not only multipath default routes. This is
> once more extremely neat, but not directly necessary for the usual case.
This is not strictly true, as far as I understood Julians
comments. The requirement is not to be ARP but to get some support
from the link level, which is being the case, but which could be the
case or at least could be done on other protocols as well. So if it
doesn't work with non ARP devices, the bug is not in Julians patches
but in the implementation of that protocols.
> Thirdly, Julian's patches add gateways as a routing key. This will not help
> pure routing boxes, such as would be standard issue in an office full of
> Windows toasters, as the gateway will be determined at the routing stage, so
> it cannot be used as a key.
I did think of that, and one, at least indirect, consequence is that
the gateways actually must be different. I had one case where both
lines came from the same ISP (for regional reasons of availability)
and I got two compatible IPs with the same gateway:
IFE1 eth0 IPE1 200.201.202.28 GWE1 200.201.202.1
IFE2 eth1 IPE2 200.201.202.29 GWE1 200.201.202.1
(only the forth digits are the actual ones)
Then I didn't know of Julians patches and wrote a daemon which would
reconfigure the network on failure or comeback of the lines. To do
that, I used to have one explicit host route to each gateway, send
pings to both and set the default route to the preferred line if both
are working, or to the working line if one failed. With a patched or
unpatched kernel, what would you do? To make it more difficult, this
was still before 2.4 kernels, using route and ifconfig only, no ip(8).
A priori, it is not obvious why this doesn't work, but it just can't
work. I gave back one line and hired another one, because once again,
the ISP didn't want to cooperate, preferring to charge uninstallation
and installation again (then some 300 US$). Not even arping broadcasts
will work. Maybe a hack with source routed packets, but I usually
disable this possibility in the kernel.
Yes, the dependency on the gateway is a limitation, but it's _much_
better than nothing. Maybe the solution would be to use more
consequently the output device, but I guess this requires
modifications which may reach the user interface. Julian's packets
didn't do that, ip(8) & Co. is still used as before.
> The *main* reason to use Julian's patches is the masquerading connection
> rerouting. This will fix the big bugs in your setup by just redirecting a
> masqueraded connection out to a different interface when the old one is
> dead.
Could you elaborate which are those bugs in my setup? I would also be
very grateful for any suggestion in fixing them.
> This is *very* cool on UDP, and will make UDP failover to another
> route fully transparent. However, it will not fix stateful protocols in
> which the server on the other side keeps state on the IP address it was
> talking to, such as SSH. It will fix the TOS nastiness OpenSSH brings to the
> fore, as it will *reroute* after masquerading. Bit of a hack, that. I
> simply nixed the TOS bits in the firewalling code. :)
I also thought of a hack when I knew about the reroute at NAT time for
masquerade, but then, first, it works, and second this seems to have
been the way to not rewrite most of Linux networking stuff.
Also, a stateful connection really can't survive. We do not have any
support at the ISP side; the connections are 100% independent, most of
the times one ISP doesn't (shouldn't) even know of the other. There
_is_ no solution to it. If you request a download, the remote server
will start sending the packets to one particular IP. What are you
suggesting to persuade that remote server, to continue sending them to
another IP? I guess many crackers would love having such an oportunity
to take over a connection! The remote server has no chance to know if
the two IPs are in the same computer or thousands of kilometers apart.
> Summarizing: Yes, you can do equal cost multipath. Yes it is cool. Yes it
> can be made nicer and friendlier to set up using Julian's patches. However,
> it will not be an ideal solution. Things *will* break. Load will just be
> approximately balanced. Failover is in most cases definitely not transparent
> to the user: new connections have to be set up. If the links stay up though,
> equal cost multipath is a *good* thing.
No. Things do *not* break beyond having to restart certain connections
(see above). I'm using this now since more than a week with hundreds
of thousands of connections a day and it never broke. It works as
smoothe as I was much to afraid to wish since a long time. The purpose
of having more than one line is because the ISPs are unreliably. There
are three to four line failures per day as an average. Nobody noticed
ever any failover. OK, most of them are just surfing the net and not
doing prolongated downloads or ssh session. I use ssh for maintainance
and do know that a failing line can require reopening a new
session. But now I can do that, and before I had to wait a couple of
minutes until my daemon would reconfigure the network, including the
netfilter rules, and establish the new situation. I did mention the
result of load balancing in the nano howto; for now I observed a daily
disbalance of maybe some 15%, which tend to diminish on longer periods
of time.
> Oh, and it does work on >2 uplinks. I've set up a system for a client using
> 1 ISDN line, 2 ADSL links (with the Dutch MXStream cruftiness, but I
> digress) and 1 cable modem using masquerading only on the last three, using
> the standard kernel (Julian's patches didn't exist a year ago, when I did
> this). Worked splendidly (and still does, I'm told). Needed some manual
> supervision though, as link failover and especially failback is *not*
> trivial.
Well, if you did that, you did it very much in secret! Many persons
around the globe, including myself, asked on this and other lists for
an answer. I do remember a two-liner answer from you, which in that
form was useless because it just wouldn't work. A second direct
question remained without reply. Maybe actually it didn't work so
splendidly after all, because after having studied Julians patches, I
don't really see how it could be done. But then, you dispose obviously
of much more advance knowledge.
--
Christoph Simon
ciccio@kiosknet.com.br
---
^X^C
q
quit
:q
^C
end
x
exit
ZZ
^D
?
help
.
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (10 preceding siblings ...)
2001-12-04 11:05 ` Christoph Simon
@ 2001-12-04 16:13 ` Don Cohen
2001-12-04 16:20 ` Arthur van Leeuwen
2001-12-04 16:56 ` Don Cohen
13 siblings, 0 replies; 15+ messages in thread
From: Don Cohen @ 2001-12-04 16:13 UTC (permalink / raw)
To: lartc
...
> > dead. This is *very* cool on UDP, and will make UDP failover to another
> > route fully transparent. However, it will not fix stateful protocols in
> > which the server on the other side keeps state on the IP address it was
> > talking to, such as SSH.
Sending packets with the same source address out another link only
helps UDP if it's not necessary to get replies, in which case I
wouldn't call it a "connection". If you didn't need the replies then
you could do the same thing for TCP.
Note that this trick only works at all if your ISPs are not doing the
ingress filtering they should - a safe assumption for now, but I hope
not in the future.
The problem is that you normally don't have any control over the
routing between two places. Even though your ISPs will let you send
out packets with the source addresses belonging to each other, the
replies will be routed to the one that "should" have sent the packets,
and if that one is down you can't get the replies.
This problem could be fixed by extending TCP (and of course, changing
the kernel) to support multiple IP addresses. I suggest a new option
that says "here's another IP address for me" (or perhaps, here's an
alternative IP/port). The kernel then has to merge these two input
streams. On the output side (when you send to someone who has told
you about alternative addresses) I can think of several ways to
control which address you send to. I suppose the application should
have some way to influence that, but as a first stab, I suggest that
whenever tcp has to resend a packet, it should move to the next
address.
Anyone interested in trying to implement it? Put it on your queue and
post to the list if you ever get to it. Also please post to the list
any bugs (and fixes) you see in the proposal.
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (11 preceding siblings ...)
2001-12-04 16:13 ` Don Cohen
@ 2001-12-04 16:20 ` Arthur van Leeuwen
2001-12-04 16:56 ` Don Cohen
13 siblings, 0 replies; 15+ messages in thread
From: Arthur van Leeuwen @ 2001-12-04 16:20 UTC (permalink / raw)
To: lartc
On Tue, 4 Dec 2001, Don Cohen wrote:
> ...
> > > dead. This is *very* cool on UDP, and will make UDP failover to another
> > > route fully transparent. However, it will not fix stateful protocols in
> > > which the server on the other side keeps state on the IP address it was
> > > talking to, such as SSH.
>
> Sending packets with the same source address out another link only
> helps UDP if it's not necessary to get replies, in which case I
> wouldn't call it a "connection". If you didn't need the replies then
> you could do the same thing for TCP.
Ah, but the rerouting patch doesn't do that. It reroutes and *remasquerades*
packets going out. You will only lose return packets destined to the other
interface. The problem really only lies in the other end maintaining state
in the higher level protocol.
> This problem could be fixed by extending TCP (and of course, changing
> the kernel) to support multiple IP addresses. I suggest a new option
> that says "here's another IP address for me" (or perhaps, here's an
> alternative IP/port). The kernel then has to merge these two input
> streams. On the output side (when you send to someone who has told
> you about alternative addresses) I can think of several ways to
> control which address you send to. I suppose the application should
> have some way to influence that, but as a first stab, I suggest that
> whenever tcp has to resend a packet, it should move to the next
> address.
Ooh, that'd be cool. Building your own anycast group dynamically... and
registering on the other side as said anycast group. Unfortunately, IPv4
doesn't allow for IP anycasting. IPv6, anyone? :)
Oh, you're talking about implementing it at the TCP level? Right then...
right. That should be possible... if only programs couldn't bind to specific
addresses...
Doei, Arthur. (Note: the idea is one of the coolest I've seen in a while)
--
/\ / | arthurvl@sci.kun.nl | Work like you don't need the money
/__\ / | A friend is someone with whom | Love like you have never been hurt
/ \/__ | you can dare to be yourself | Dance like there's nobody watching
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [LARTC] Solved: Using more than 1 Internet Line
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
` (12 preceding siblings ...)
2001-12-04 16:20 ` Arthur van Leeuwen
@ 2001-12-04 16:56 ` Don Cohen
13 siblings, 0 replies; 15+ messages in thread
From: Don Cohen @ 2001-12-04 16:56 UTC (permalink / raw)
To: lartc
Arthur van Leeuwen writes:
> > This problem could be fixed by extending TCP (and of course, changing
> > the kernel) to support multiple IP addresses. I suggest a new option
> > that says "here's another IP address for me" (or perhaps, here's an
> > alternative IP/port). The kernel then has to merge these two input
> > streams. On the output side (when you send to someone who has told
> > you about alternative addresses) I can think of several ways to
> > control which address you send to. I suppose the application should
> > have some way to influence that, but as a first stab, I suggest that
> > whenever tcp has to resend a packet, it should move to the next
> > address.
...
> Oh, you're talking about implementing it at the TCP level? Right then...
> right. That should be possible... if only programs couldn't bind to specific
> addresses...
Right, I forgot about the other half, what interface programs will
have to control their own alternative addresses.
While I do think there should be such interfaces, I suggest a default
that does the "right" thing for applications that don't know about the
extension. There should be some command to tell the kernel about the
alternatives, e.g., whenever you allocate a port for address 1.2.3.*
you should also allocate alternatives for addresses 6.7.8.9 and
4.5.6.7.
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://ds9a.nl/2.4Routing/
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2001-12-04 16:56 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-12-02 19:24 [LARTC] Solved: Using more than 1 Internet Line Christoph Simon
2001-12-02 19:41 ` Christoph Simon
2001-12-03 20:22 ` Whit Blauvelt
2001-12-03 20:45 ` Christoph Simon
2001-12-03 21:43 ` Julian Anastasov
2001-12-03 22:04 ` Arthur van Leeuwen
2001-12-03 22:19 ` Christoph Simon
2001-12-03 22:33 ` Whit Blauvelt
2001-12-03 22:44 ` Julian Anastasov
2001-12-04 8:52 ` Arthur van Leeuwen
2001-12-04 10:57 ` Julian Anastasov
2001-12-04 11:05 ` Christoph Simon
2001-12-04 16:13 ` Don Cohen
2001-12-04 16:20 ` Arthur van Leeuwen
2001-12-04 16:56 ` Don Cohen
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.