From mboxrd@z Thu Jan 1 00:00:00 1970
From: William T Mullaney
Date: Fri, 15 Sep 2006 04:12:16 +0000
Subject: RE: [LARTC] Problem with Load Balancing
Message-Id: <4986F8D166F1E44CBA92A91C98ACAD5214CA0E@sql_server>
MIME-Version: 1
Content-Type: multipart/mixed; boundary="===============0922384777=="
List-Id:
References:
In-Reply-To:
To: lartc@vger.kernel.org

This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

--===============0922384777==
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C6D87D.20DEC5E0"

------_=_NextPart_001_01C6D87D.20DEC5E0
Content-Type: text/plain

Vlad,

We have also set up a somewhat similar method of load balancing. Our
traffic is never an exact 50-50 split (well, 3:2 is how we have it set, but
it doesn't always get close to that), but as the load picks up, the split
tends to get closer to the configured ratio.

Dead gateway detection has never worked for us, and one day I'll probably
bother other members of the LARTC group to get some help, but the method
that we use is to check the output of the ip neighbor command. Basically,
if our two ISP gateways are 10.1.1.254 and 10.2.2.254, we run a bash script
via cron every minute that does a call something like:

    ETH1=$(ip neigh show 10.1.1.254 | egrep -c "REACHABLE|DELAY|PROBE|STALE")
    ETH2=$(ip neigh show 10.2.2.254 | egrep -c "REACHABLE|DELAY|PROBE|STALE")

The neighbor system basically monitors ARP, and if it sees a message leave
an interface without a reply after something like 3-5 seconds, it moves the
neighbor entry to DELAY. After another few seconds the entry moves to PROBE
and the kernel does an active ARP request, and if that fails to work in a
few seconds, the entry becomes INCOMPLETE or FAILED or simply isn't listed.
If no data is sent either way for a while, the entry can be marked STALE or
removed.
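To make the state test concrete, here is a minimal sketch of it as a small
bash function. The function names (neigh_state_up, gateway_up) and the
sample MAC address are made up for illustration and are not taken from the
script described above; the sample lines only approximate what ip neigh
prints. The first function reads neighbor-table text on stdin, so it can be
tried without touching the network.

```shell
#!/bin/bash
# Sketch only: function names and the canned sample lines are hypothetical.

# Read `ip neigh show <addr>`-style output on stdin and print 1 if any line
# carries a state we treat as "gateway alive", otherwise print 0.
neigh_state_up() {
    if grep -Eq 'REACHABLE|DELAY|PROBE|STALE'; then
        echo 1
    else
        echo 0
    fi
}

# Query the kernel neighbor table for one gateway address.
# An address with no entry at all produces no output, which counts as 0.
gateway_up() {
    ip neigh show "$1" 2>/dev/null | neigh_state_up
}

# Canned examples (no network access needed):
echo "10.1.1.254 dev eth1 lladdr 00:11:22:33:44:55 REACHABLE" | neigh_state_up
echo "10.1.1.254 dev eth1  FAILED" | neigh_state_up
```

On a live system the call would be something like ETH1=$(gateway_up 10.1.1.254).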

With the above lines, we get a 1 in the ETH1 or ETH2 variable if that
gateway is up, and a 0 if not. From there you can use some if statements to
detect whether both are up or, if only one is up, which one. In our case,
if both are up we clear the default route and then make it something like

    ip route add default nexthop via 10.1.1.254 dev eth1 weight 1 \
        nexthop via 10.2.2.254 dev eth2 weight 1

and if only one is up we clear it and make it:

    ip route add default nexthop via 10.1.1.254 dev eth1
or
    ip route add default nexthop via 10.2.2.254 dev eth2

With some additional scripting we can allow this to be overridden: we can
set the link to prefer using only one line but switch to the other if the
preferred line fails, and we can take input from programs like Nagios to
auto-prefer one line or another if ping times get high, etc. In addition,
the script remembers the state it was in (so that it only changes the
routing table when needed), controls DNS, can flush the DNS cache, and
reports status back to Nagios. Once I get all the bugs out and write some
documentation, I'd be happy to post it to the news group, though you or
anyone else can send me an email if you would like to take a look at it
before then.

In practice, this method usually detects failures and adjusts outbound
connections quickly without user intervention; DNS changes with short TTLs
take care of inbound connections. Just be careful... if you don't have
something sending traffic out to your upstream routers (and back) every few
minutes, the entry in your ARP table can potentially be removed and thus
cause your system to think an unused gateway has failed, or that a
recovered gateway is still down. This could be checked with a quick "if ip
neigh test fails, ping neighbor 5 times, then test again before making
decisions". Running an uptime monitor that pings or does something else
to/through the gateway (regardless of default route) also takes care of
this.
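
The decision logic above could be sketched roughly as follows. This is not
the script mentioned in this message, only an illustration under stated
assumptions: the gateway addresses and device names are the example values
from the text, while STATE_FILE, the function names, and the --apply switch
are invented here. It also uses ip route replace in place of the
clear-then-add step, which is a substitution on my part.

```shell
#!/bin/bash
# Sketch only. Addresses/devices are the examples from the text; STATE_FILE,
# the function names and the --apply switch are hypothetical.
GW1=10.1.1.254; DEV1=eth1
GW2=10.2.2.254; DEV2=eth2
STATE_FILE=${STATE_FILE:-/tmp/gw-failover.state}
ALIVE='REACHABLE|DELAY|PROBE|STALE'

# Print the nexthop arguments for the default route, given two 0/1 flags.
# Prints nothing when both gateways look down.
route_args() {
    local up1=$1 up2=$2
    if [ "$up1" = 1 ] && [ "$up2" = 1 ]; then
        echo "nexthop via $GW1 dev $DEV1 weight 1 nexthop via $GW2 dev $DEV2 weight 1"
    elif [ "$up1" = 1 ]; then
        echo "nexthop via $GW1 dev $DEV1"
    elif [ "$up2" = 1 ]; then
        echo "nexthop via $GW2 dev $DEV2"
    fi
}

# Succeed (and record the new state) only if it differs from the last run.
state_changed() {
    local new=$1 old=""
    [ -f "$STATE_FILE" ] && old=$(cat "$STATE_FILE")
    [ "$new" = "$old" ] && return 1
    printf '%s\n' "$new" > "$STATE_FILE"
}

# Print 1 or 0 for one gateway. If the neighbor entry looks dead or is
# missing, ping the gateway 5 times to refresh it, then look again.
check_gw() {
    local gw=$1 dev=$2
    if ! ip neigh show "$gw" dev "$dev" | grep -Eq "$ALIVE"; then
        ping -c 5 -I "$dev" "$gw" >/dev/null 2>&1
    fi
    ip neigh show "$gw" dev "$dev" | grep -Ec "$ALIVE"
}

main() {
    local up1 up2 args
    up1=$(check_gw "$GW1" "$DEV1")
    up2=$(check_gw "$GW2" "$DEV2")
    args=$(route_args "$up1" "$up2")
    # Touch the routing table only on a state change, and leave the route
    # alone when neither gateway answers. $args is unquoted on purpose so
    # that it splits into separate words for ip.
    if state_changed "$up1$up2" && [ -n "$args" ]; then
        ip route replace default $args
    fi
}

# Nothing is changed unless the script is started with --apply (as root).
if [ "${1:-}" = "--apply" ]; then
    main
fi
```

Keeping route_args separate from the code that calls ip means the choice of
route can be checked by hand (for example, route_args 1 0) before the script
is allowed to modify anything.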

-Will

-----Original Message-----
From: Vladimir Burciaga Aguilar [mailto:anakinv7@hotmail.com]
Sent: Thursday, September 14, 2006 10:25 PM
To: lartc@mailman.ds9a.nl
Subject: [LARTC] Problem with Load Balancing

Hi everybody!

I'm trying to implement load balancing for a LAN with two ISPs. I've
installed SUSE Linux Enterprise Server 9 with iproute2 for that purpose.

The server has two NICs; one of them is for both the LAN and ISP 1. I've
set up both NICs with YaST (if I use ip for this, then the whole thing
doesn't work!) and executed the following commands to set up the routing
tables:

ip route flush cache
ip route flush default
ip route flush table 1
ip route flush table 2

[snip]

------_=_NextPart_001_01C6D87D.20DEC5E0--

--===============0922384777==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

--===============0922384777==--