* [LARTC] multipath device round robin not working?
@ 2007-01-13 11:54 zutph3n
2007-01-14 9:26 ` Alex Samad
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: zutph3n @ 2007-01-13 11:54 UTC (permalink / raw)
To: lartc
Hi,
I have a Linux server running kernel 2.6.19 that is connected with 2
separate 100Mbit links to the same ISP:
                               +---+
+---------------+              | I |            +---------------+
|               |              | S |            |               |
|        eth0 --+--------------+ P |            |               |
|               |              | S |            |               |
| linux 2.6.19  |              | W |============|  ISP GATEWAY  |
|               |              | I |            |               |
|        eth1 --+--------------+ T |            |               |
|               |              | C |            |               |
+---------------+              | H |            +---------------+
                               +---+
Both links have their own IP address but share the same gateway. The
problem is that I can't seem to get egress traffic load-balanced over
the 2 NICs.
IP config after boot (dhcp from isp)
ip a:
1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: eth0: <BROADCAST,MULTICAST,NOTRAILERS,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:00:00:00:00:0f brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.110/24 brd 10.0.0.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,NOTRAILERS,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:00:00:00:00:ed brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.120/24 brd 10.0.0.255 scope global eth1
Default routing table after boot
ip r:
10.0.0.0/24 dev eth0 scope link
10.0.0.0/24 dev eth1 scope link metric 1
127.0.0.0/8 dev lo scope link
default via 10.0.0.1 dev eth0
default via 10.0.0.1 dev eth1 metric 1
I enabled ip_forward and set arp_ignore to 1 for eth0 and eth1 to make
sure the correct nic answers to arp requests.
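For reference, a minimal sketch of how those two settings might be applied with sysctl. The DRY_RUN switch and the helper names are assumptions added for this example (they are not from the original post); with the default DRY_RUN=1 the script only prints the commands instead of changing anything.

```shell
#!/bin/sh
# Sketch only: print (or apply) the sysctl settings described above.
# DRY_RUN and the helper names are hypothetical conveniences.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # In dry-run mode just show the command; otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

apply_settings() {
    # Let the box forward packets between interfaces.
    run sysctl -w net.ipv4.ip_forward=1
    # arp_ignore=1: answer an ARP request only if the target IP is
    # configured on the interface the request arrived on.
    for nic in eth0 eth1; do
        run sysctl -w "net.ipv4.conf.$nic.arp_ignore=1"
    done
}

apply_settings
```

Running it with DRY_RUN=0 as root would apply the settings until the next reboot.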
I tried to get the egress load balancing to work by replacing the above
two default routes with:
  ip route add default mpath drr \
      nexthop via 10.0.0.1 dev eth0 weight 1 onlink \
      nexthop via 10.0.0.1 dev eth1 weight 1 onlink
I assumed that with mpath device round robin both nics would be used
more or less equally, but the reality is only one of the nics actually
works and the second nic even stops responding to arp requests.
Am I doing something totally wrong or impossible here or is the device
round robin code not working properly?
_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
* Re: [LARTC] multipath device round robin not working?
From: Alex Samad @ 2007-01-14 9:26 UTC (permalink / raw)
To: lartc
On Sat, Jan 13, 2007 at 12:54:24PM +0100, zutph3n@gmail.com wrote:
> Hi,
>
> I have a Linux server running kernel 2.6.19 that is connected with 2
> separate 100Mbit links to the same ISP:
>
>
>                                +---+
> +---------------+              | I |            +---------------+
> |               |              | S |            |               |
> |        eth0 --+--------------+ P |            |               |
> |               |              | S |            |               |
> | linux 2.6.19  |              | W |============|  ISP GATEWAY  |
> |               |              | I |            |               |
> |        eth1 --+--------------+ T |            |               |
> |               |              | C |            |               |
> +---------------+              | H |            +---------------+
>                                +---+
>
> Both links have their own ip but have the same gateway. The problem is I
> can't seem to get egress traffic load balanced over the 2 nics.
>
> IP config after boot (dhcp from isp)
> ip a:
>
> 1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
>
> 2: eth0: <BROADCAST,MULTICAST,NOTRAILERS,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>     link/ether 00:00:00:00:00:0f brd ff:ff:ff:ff:ff:ff
>     inet 10.0.0.110/24 brd 10.0.0.255 scope global eth0
>
> 3: eth1: <BROADCAST,MULTICAST,NOTRAILERS,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>     link/ether 00:00:00:00:00:ed brd ff:ff:ff:ff:ff:ff
>     inet 10.0.0.120/24 brd 10.0.0.255 scope global eth1
>
> Default routing table after boot
> ip r:
>
> 10.0.0.0/24 dev eth0 scope link
> 10.0.0.0/24 dev eth1 scope link metric 1
> 127.0.0.0/8 dev lo scope link
> default via 10.0.0.1 dev eth0
> default via 10.0.0.1 dev eth1 metric 1
>
> I enabled ip_forward and set arp_ignore to 1 for eth0 and eth1 to make
> sure the correct nic answers to arp requests.
>
> I tried to get the egress load balancing to work by replacing the above
> two default routes with:
>
>   ip route add default mpath drr \
>       nexthop via 10.0.0.1 dev eth0 weight 1 onlink \
>       nexthop via 10.0.0.1 dev eth1 weight 1 onlink
>
> I assumed that with mpath device round robin both nics would be used
> more or less equally, but the reality is only one of the nics actually
> works and the second nic even stops responding to arp requests.
>
> Am I doing something totally wrong or impossible here or is the device
> round robin code not working properly?
Out of curiosity, why use such a setup? Is your ISP link faster than
2 Gbps? Why not bond the interfaces if you want HA?
As for why it is not round-robining: I am going to guess, but this line
  default via 10.0.0.1 dev eth0
costs less to use than
  default via 10.0.0.1 dev eth1 metric 1
so it should never use the second. I say guess because I don't know what
the default metric is if you don't add one.
What you want is something that looks like this:
  default proto static metric 5
          nexthop via 144.132.144.1 dev vlan2 weight 5
          nexthop via 10.20.20.230 dev ppp0 weight 20
There is a link to a howto on the web site that steps through how to set
this up.
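A rough sketch of how a route like the listing above could be built as an `ip route` command. The gateways, devices and weights are the ones from the listing; the helper name build_multipath_cmd and the choice of `replace` rather than `add` are assumptions for illustration. The script only prints the command, it does not touch the routing table.

```shell
#!/bin/sh
# Sketch only: build an `ip route` command for a weighted multipath
# default route. build_multipath_cmd is a hypothetical helper.

build_multipath_cmd() {
    # Arguments come in triples: gateway device weight
    cmd="ip route replace default proto static metric 5"
    while [ "$#" -ge 3 ]; do
        cmd="$cmd nexthop via $1 dev $2 weight $3"
        shift 3
    done
    echo "$cmd"
}

# Print the command; run the printed line as root to apply it.
build_multipath_cmd 144.132.144.1 vlan2 5 10.20.20.230 ppp0 20
```

Note that each nexthop here points at a different gateway, which is the case this kind of route is normally used for.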
Alex
* Re: [LARTC] multipath device round robin not working?
From: Grant Taylor @ 2007-01-15 4:14 UTC (permalink / raw)
To: lartc
On 01/13/07 05:54, zutph3n@gmail.com wrote:
> Both links have their own ip but have the same gateway. The problem is I
> can't seem to get egress traffic load balanced over the 2 nics.
I don't know if it is still a problem or not, but I ran into something
very similar a LONG time ago (mid 2.4).
Basically, what I found was that (I believe) multi-path routing really
is multi-gateway routing, i.e. load balancing across two (or more)
different gateways. In your case, and in the case that I had, all the
IPs in the world did not make any difference because each path had the
same default gateway.
My solution at the time was to use UML (User-Mode Linux) routers to
provide different subnets to the box doing ECMP (equal-cost multipath)
routing. Each UML instance basically NATed from the one upstream network
to a small downstream private subnet that was unique for each link. This
allowed the box doing ECMP to see different gateways. This worked great
until I hit a memory limit on connection state.
I should probably say that the problem I ran into was not a problem
with ECMP itself but with the number of hosts that I was trying to NAT
given the amount of RAM in the box. I was able to resolve this by adding
RAM to provide a larger connection state table. For the record, the box
was running a mid-2.4 kernel with a subnet that was the size of 4 class C
networks (2048 IPs) on a box that had 256 MB of RAM. I ended up taking
the box up to 2 GB of RAM and things have been working GREAT ever since.
I do believe this memory / connection state problem was resolved long
ago. However, the system is working and paid for, and the client is
perfectly happy with what is in place and sees no reason to do anything
with it. (If anyone would like more details, just ask.)
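As a rough illustration of the connection state limit mentioned above, here is a small sketch that looks up the connection tracking table limit. It assumes the limit is exposed under one of two /proc paths (a newer nf_conntrack name and an older ip_conntrack name); which one exists depends on the kernel and on whether connection tracking is loaded. The helper first_readable is hypothetical.

```shell
#!/bin/sh
# Sketch only: report the connection tracking table limit, if exposed.
# The candidate paths are assumptions about where a given kernel
# publishes the value; neither is guaranteed to exist.

first_readable() {
    # Print the first argument that is a readable file; fail if none is.
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

if max_file=$(first_readable /proc/sys/net/netfilter/nf_conntrack_max \
                             /proc/sys/net/ipv4/ip_conntrack_max); then
    echo "conntrack limit ($max_file): $(cat "$max_file")"
else
    echo "no conntrack limit found (connection tracking not loaded?)"
fi
```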
Grant. . . .
* Re: [LARTC] multipath device round robin not working?
From: Grant Taylor @ 2007-01-16 0:44 UTC (permalink / raw)
To: lartc
On 01/15/07 15:20, zutph3n@gmail.com wrote:
> Wow, that's a complicated solution. Nicely done:) But I think that's a
> bit too complicated for my setup.... thx for the input anyway.
Thanks.
Indeed, the setup is not simple. You may want to consider talking with
your ISP to see if they can assign one of your links an IP on a
different subnet.
I have found that ISPs that are worth their salt are willing to work
with you to help you resolve these types of problems.
Grant. . . .
* Re: [LARTC] multipath device round robin not working?
From: Alex Samad @ 2007-01-16 19:52 UTC (permalink / raw)
To: lartc
On Mon, Jan 15, 2007 at 06:44:54PM -0600, Grant Taylor wrote:
> On 01/15/07 15:20, zutph3n@gmail.com wrote:
> >Wow, that's a complicated solution. Nicely done:) But I think that's a
> >bit too complicated for my setup.... thx for the input anyway.
>
> Thanks.
>
> Indeed the set up is not simple. You may consider talking with your ISP
> and seeing if they can assign one of your links an IP on a different subnet.
>
> I have found that ISPs that are worth their salt are willing to work
> with you to help you resolve these types of problems.
>
>
>
> Grant. . . .
Something else to look out for, because you have 2 NICs in the same
broadcast domain:
http://cactuswax.net/blog/articles/2006/09/arp_ignore.html explains
arp_ignore.
With the default setting you are going to find that 1 NIC will answer
ARP requests for both IP addresses!
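A small sketch for checking what arp_ignore value is actually in effect on each NIC. It relies on the kernel using the larger of the conf/all and conf/<interface> values; the interface names eth0 and eth1 come from the original post, and the helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: show the effective arp_ignore value per interface.
# The kernel uses the larger of conf/all and conf/<iface>.

max_of() {
    if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

effective_arp_ignore() {
    # $1 = interface name; fails if the sysctl files are not readable.
    base=/proc/sys/net/ipv4/conf
    [ -r "$base/all/arp_ignore" ] && [ -r "$base/$1/arp_ignore" ] || return 1
    max_of "$(cat "$base/all/arp_ignore")" "$(cat "$base/$1/arp_ignore")"
}

for nic in eth0 eth1; do
    if val=$(effective_arp_ignore "$nic"); then
        echo "$nic: effective arp_ignore=$val"
    else
        echo "$nic: arp_ignore not readable on this system"
    fi
done
```

A value of 0 means any interface may answer for any local address, which is the behaviour described above.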
* Re: [LARTC] multipath device round robin not working?
From: Grant Taylor @ 2007-01-17 5:04 UTC (permalink / raw)
To: lartc
On 01/16/07 04:01, Andrew Lyon wrote:
> Can you suggest how this can be done with 2.6 as we will be upgrading our
> server soon and I don't think ECMP patch exists for 2.6?
The patch does not exist because ECMP is in the mainline kernel:
Networking ->
  Networking options ->
    IP: advanced router
      IP: equal cost multipath
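One way to check whether a kernel was built with these options is to look for the corresponding symbols in its config file. This is a sketch under assumptions: it expects a plain-text config at /boot/config-$(uname -r), which is a common distribution convention but not universal, and check_multipath_config is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: look for the advanced router / multipath options in a
# plain-text kernel config file.

check_multipath_config() {
    # $1 = path to a kernel config file
    for opt in CONFIG_IP_ADVANCED_ROUTER CONFIG_IP_ROUTE_MULTIPATH; do
        if grep -q "^$opt=y" "$1"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled"
        fi
    done
}

# Assumed location of the running kernel's config.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    check_multipath_config "$cfg"
else
    echo "no readable config at $cfg"
fi
```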
Grant. . . .