* Multiple load balancers problem
@ 2012-08-25  7:37 Dmitry Akindinov
  2012-08-25 10:13 ` Dmitry Akindinov
  2012-08-25 11:53 ` Julian Anastasov
  0 siblings, 2 replies; 13+ messages in thread
From: Dmitry Akindinov @ 2012-08-25  7:37 UTC
  To: lvs-devel

Hello,

We are currently stuck with the following ipvs problem:

1. The configuration includes a (potentially large) set of servers providing various services - besides HTTP (POP, IMAP, LDAP, SMTP, XMPP, etc.) The test setup includes just 2 servers, though.
2. Each server runs a stock version of CentOS 6.0
3. The application software (CommuniGate Pro) controls the ipvs kernel module using the ipvsadm commands.
4. On each server, iptables are configured to:
   a) disable connection tracking for VIP address(es)
   b) mark all packets coming to the VIP address(es) with the mark value of 100.
5. On the currently active load balancer, the ipvsadm is used to configure ipvs to load-balance packets with the marker 100:
   -A -f 100 -s rr -p 1
   -a -f 100 -r <server1> -g
   -a -f 100 -r <server2> -g
   ....
   where the active balancer itself is one of the <serverN>
6. All other servers (just 1 "other" server in our test config) are running ipvs, but with an empty rule set.
7. The active load balancer runs the sync daemon started with ipvsadm --start-daemon master
8. All other servers run the sync daemon started with ipvsadm --start-daemon backup.

As a result, all servers have the duplicated ipvs connection tables. If the active balancer fails, some other server assumes its role by arp-broadcasting VIP and loading the ipvs rule set listed above.

When a connection is being established to the VIP address, and the active load balancer directs it to itself, everything works fine.

When a connection is being established to the VIP address, and the active load balancer directs it to some other server, the connection is established fine, and if the protocol is POP, IMAP, SMTP, the server prompt is sent to the client via VIP, and it is seen by the client just fine. But when the client tries to send anything to the server, the packet (according to tcpdump) reaches the load balancer server, and from there it reaches the "other" server, where the packet is dropped. The client resends that packet, it goes to the active balancer, then to the "other" server, and it is dropped again.

Observations:

*) if ipvs is switched off on that "other" server, everything works just fine (service ipvsadm stop)

*) if ipvs is left running on that "other" server, but the syncing daemon is switched off, everything works just fine. We are 95% sure that the problem appears only if the "other server" ipvs connection table gets a copy of this connection from the active balancer. If the copy is not there (the sync daemon was stopped when the connection was established, and restarted immediately after), everything works just fine.

*) the problem exists for protocols like POP, IMAP, SMTP - where the server immediately sends some data (prompt) to the client, as soon as the connection is established. When the HTTP protocol is used, the problem does not exist, but only if the entire request is sent as one packet. If the HTTP connection is a "keep-alive" one, subsequent requests in the same connection do not reach the application either. I.e. it looks like the "idling" ipvs allows only one incoming data packet in, and only if there has been no outgoing packet on that connection yet.

*) Sometimes (we still cannot reproduce this reliably) the ksoftirqd threads on the "other" server jump to 100% CPU utilization, and when it happens, it happens in reaction to one connection being established.

Received suggestions:

*) it was suggested that we use iptables to filter the packets to VIP that come from other servers in the farm (using their MAC addresses) and direct them directly to the local application, bypassing ipvs processing. We cannot do that, as servers in the farm can be added at any moment, and updating the list of MACs on all servers is not trivial. It may be easier to filter the packets that come from the router(s), which are less numerous and do not change that often. But it does not look like a good solution. If the ipvs table on the "inactive" balancer drops packets, why would it stop dropping them when it becomes an "active" balancer? Just because there will be ipvs rules present?

*) The suggestion to separate load balancer(s) and real servers won't work for us at all.

*) We tried not to empty the ipvs table on the "other" server(s). Instead, we left it balancing - but with only one "real server" - this server itself. Now, the "active" load balancer distributes packets to itself and other servers, and when the packets hit the "other" server(s), they get to the ipvs again, where they are balanced again, but to the local server only.

It looks like it does solve the problem. But now the ipvs connection table on the "other" server(s) is filled by both that server's ipvs itself and by the sync daemon. While the locally-generated connection table entries should be the same as the corresponding entries received with the sync daemon, it does not look good when the same table is modified from two sources.

Any comment, please? Should we use the last suggestion?

-- 
Best regards,
Dmitry Akindinov
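
[Editorial sketch] The last suggestion in the message above (keep a rule set on every server, but let a backup balance only to itself) can be written out as two rule sets. The script below is a minimal dry-run sketch, not the configuration CommuniGate Pro actually generates: it only prints the ipvsadm commands, the function names active_rules and backup_rules are hypothetical, and the 192.0.2.x addresses are documentation placeholders standing in for <server1> and <server2>.

```shell
#!/bin/sh
# Dry-run sketch: prints ipvsadm commands, does not execute them.
# FWMARK matches the mark value of 100 used in the thread.
FWMARK=100

# Active balancer: one fwmark virtual service, every farm member
# (including the balancer itself) as a direct-routing real server.
active_rules() {
    echo "ipvsadm -A -f $FWMARK -s rr -p 1"
    for server in "$@"; do
        echo "ipvsadm -a -f $FWMARK -r $server -g"
    done
    echo "ipvsadm --start-daemon master"
}

# Backup: same virtual service, but the only real server is the
# local one, so traffic relayed by the active balancer stays local.
backup_rules() {
    local_ip=$1
    echo "ipvsadm -A -f $FWMARK -s rr -p 1"
    echo "ipvsadm -a -f $FWMARK -r $local_ip -g"
    echo "ipvsadm --start-daemon backup"
}

# Hypothetical two-server farm (placeholder addresses).
active_rules 192.0.2.11 192.0.2.12
backup_rules 192.0.2.12
```

On failover, the server taking over would presumably replace its backup rule set with the active one; whether the sync daemon role also has to be switched at that point is not stated in the message.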
* Re: Multiple load balancers problem 2012-08-25 7:37 Multiple load balancers problem Dmitry Akindinov @ 2012-08-25 10:13 ` Dmitry Akindinov 2012-08-25 11:53 ` Julian Anastasov 1 sibling, 0 replies; 13+ messages in thread From: Dmitry Akindinov @ 2012-08-25 10:13 UTC (permalink / raw) To: lvs-devel Hello, A small addition below. On 2012-08-25 11:37, Dmitry Akindinov wrote: > Hello, > > We are currently stuck with the following ipvs problem: > > 1. The configuration includes a (potentially large) set of servers > providing various services - besides HTTP (POP, IMAP, LDAP, SMTP, XMPP, > etc.) The test setup includes just 2 servers, though. > 2. Each server runs a stock version of CentOS 6.0 > 3. The application software (CommuniGate Pro) controls the ipvs kernel > module using the ipvsadm commands. > 4. On each server, iptables are configured to: > a) disable connection tracking for VIP address(es) > b) mark all packets coming to the VIP address(es) with the mark value of > 100. > 5. On the currently active load balancer, the ipvsadm is used to > configure ipvs to load-balance packets with the marker 100: > -A -f 100 -s rr -p 1 > -a -f 100 -r <server1> -g > -a -f 100 -r <server2> -g > .... > where the active balancer itself is one of the <serverN> > 6. All other servers (just 1 "other" server in our test config) are > running ipvs, but with an empty rule set. > 7. The active load balancer runs the sync daemon started with ipvsadm > --start-daemon master > 7. All other servers run the sync daemon started with ipvsadm > --start-daemon backup. > > As a result, all servers have the duplicated ipvs connection tables. If > the active balancer fails, some other server assumes its role by > arp-broadcasting VIP and loading the ipvs rule set listed above. > > When a connection is being established to the VIP address, and the > active load balancer directs it to itself, everything works fine. 
> When a connection is being established to the VIP address, and the > active load balancer directs it to some other server, the connection is > established fine, and if the protocol is POP, IMAP, SMTP, the server > prompt is sent to the client via VIP, and it is seen by client just fine. > But when the client tries to send anything to the server, the packet > (according to tcpdump) reaches the load balancer server, and from there > it reaches the "other" server. Where the packet is dropped. The client > resends that packet, it goes to the active balancer, then to the "other" > server, and it is dropped again. > > > Observations: > *) if ipvs is switched off on that "other" server, everything works just > fine (service ipvsadm stop) > > *) if ipvs is left running on that "other" server, but syncing daemon is > switched off, everything works just fine. > We are 95% sure that the problem appears only if the "other server" ipvs > connection table gets a copy of this > connection from the active balancer. If the copy is not there (the sync > daemon was stopped when the connection > was established, and restarted immediately after), everything works just > fine. > > *) the problem exists for protocols like POP, IMAP, SMTP - where the > server immediately sends some data (prompt) to the client, as soon as > the connection is established. > When the HTTP protocol is used, the problem does not exist, but only if > the entire request is sent as one packet. If the HTTP connection is a > "keep-alive" one, subsequent requests in the same connection do not > reach the application either. > I.e. it looks like the "idling" ipvs allows only one incoming data > packet in, and only if there has been no outgoing packet on that > connection yet. > > *) Sometimes (we still cannot reproduce this reliably) the ksoftirqd > threads on the "other" server jump to 100% CPU > utilization, and when it happens, it happens in reaction to one > connection being established. 

And when a new connection is being established, the second ksoftirqd thread using 100% CPU appears in the "top" output, and so on - till all ksoftirqd threads (8 in case of our 8-CPU test servers) are looping somewhere, consuming most of the CPU cycles.

> Received suggestions:
> *) it was suggested that we use iptables to filter the packets to VIP
> that come from other servers in the farm (using their MAC addresses) and
> direct them directly to the local application, bypassing ipvs
> processing. We cannot do that, as servers in the farm can be added at
> any moment, and updating the list of MACs on all servers is not trivial.
> It may be easier to filter the packets that come from the router(s),
> which are less numerous and do not change that often.
> But it does not look like a good solution. If the ipvs table on
> "inactive" balancer drops packets, why would it stop dropping them when
> it becomes an "active" balancer? Just because there will be ipvs rules
> present?
>
> *) The suggestion to separate load balancer(s) and real servers won't
> work for us at all.
>
> *) We tried not to empty the ipvs table on the "other" server(s).
> Instead, we left it balancing - but with only one "real server" - this
> server itself. Now, the "active" load balancer distributes packets to
> itself and other servers, and when the packets hit the "other"
> server(s), they get to the ipvs again, where they are balanced again,
> but to the local server only.
>
> It looks like it does solve the problem. But now the ipvs connection
> table on the "other" server(s) is filled by both that server ipvs itself
> and by the sync-daemon. While the locally-generated connection table
> entries should be the same as corresponding entries received with the
> sync daemon, it does not look good when the same table is modified from
> two sources.
>
> Any comment, please? Should we use the last suggestion?
>
>

-- 
Best regards,
Dmitry Akindinov
* Re: Multiple load balancers problem 2012-08-25 7:37 Multiple load balancers problem Dmitry Akindinov 2012-08-25 10:13 ` Dmitry Akindinov @ 2012-08-25 11:53 ` Julian Anastasov 2012-08-27 8:02 ` Dmitry Akindinov 1 sibling, 1 reply; 13+ messages in thread From: Julian Anastasov @ 2012-08-25 11:53 UTC (permalink / raw) To: Dmitry Akindinov; +Cc: lvs-devel Hello, On Sat, 25 Aug 2012, Dmitry Akindinov wrote: > Hello, > > We are currently stuck with the following ipvs problem: > > 1. The configuration includes a (potentially large) set of servers providing > various services - besides HTTP (POP, IMAP, LDAP, SMTP, XMPP, etc.) The test > setup includes just 2 servers, though. > 2. Each server runs a stock version of CentOS 6.0 OK, I don't know what kernel and patches includes every distribution. Can you tell at least what shows uname -a? > 3. The application software (CommuniGate Pro) controls the ipvs kernel module > using the ipvsadm commands. > 4. On each server, iptables are configured to: > a) disable connection tracking for VIP address(es) > b) mark all packets coming to the VIP address(es) with the mark value of > 100. > 5. On the currently active load balancer, the ipvsadm is used to configure > ipvs to load-balance packets with the marker 100: > -A -f 100 -s rr -p 1 > -a -f 100 -r <server1> -g > -a -f 100 -r <server2> -g > .... > where the active balancer itself is one of the <serverN> > 6. All other servers (just 1 "other" server in our test config) are running > ipvs, but with an empty rule set. I think, running slaves without same rules is a mistake. When the slave receives sync message it has to assign it to some virtual server and even assign real server for this connection. But if this slave is also a real server the things are complicated. I now check the code and do not see where we prevent backup to schedule traffic received from current master. 
The master gives the traffic to backup because considers it a real server but this backup with rules decides to schedule it to different real server. This problem can not happen for NAT, only for DR/TUN, I see that you are using DR forwarding method. So, currently, IPVS users do not add ipvsadm rules in backup for DR/TUN for this reason? > 7. The active load balancer runs the sync daemon started with ipvsadm > --start-daemon master > 7. All other servers run the sync daemon started with ipvsadm --start-daemon > backup. > > As a result, all servers have the duplicated ipvs connection tables. If the > active balancer fails, some other server assumes its role by arp-broadcasting > VIP and loading the ipvs rule set listed above. In initial email you said: "Now, we initiate a failover. During the failover, the ipvs table on the old "active" balancer is cleared," Why do you clear the connection table? What if you decide after 10 seconds to return back control to the first master? > When a connection is being established to the VIP address, and the active load > balancer directs it to itself, everything works fine. I assume you are talking for box 2 (the new master) > When a connection is being established to the VIP address, and the active load > balancer directs it to some other server, the connection is established fine, > and if the protocol is POP, IMAP, SMTP, the server prompt is sent to the > client via VIP, and it is seen by client just fine. You mean, new connection does 3-way handshake via the new master to other real servers and succeeds, or already established connection before the failover work after failover? Is packet directed to old master? > But when the client tries to send anything to the server, the packet > (according to tcpdump) reaches the load balancer server, and from there it > reaches the "other" server. Where the packet is dropped. 
The client resends > that packet, it goes to the active balancer, then to the "other" server, and > it is dropped again. Why this real server drops the packet? What is different in this packet? Are you talking about connections created before failover, that they can not continue to work after failover? May be problem happens for DR. Can you show tcpdump in old master that 3-way traffic is received and also that it is replied by it, not by some real server. Problem can happen only if master sends new traffic to backup (its real server). For example: - master schedules SYN to real server which is backup with same rules - SYNC conn is not sent before IPVS conn enters ESTABLISHED state, so backup does not know for such connection, it looks like new one - backup has rules, it decides to use real server 3 and directs the SYN there. It can happen only for DR/TUN because the daddr is VIP, that is why people overcome the problem by checking that packet comes from some master and not from uplink gateway MAC. For NAT there is no such double-step scheduling because the backups' rules do not match the internal real server IP in the daddr, they work only for VIP - more traffic comes, backup directs it to real server 3 - the first SYNC message for this connection comes from master but the SYNC message claims the backup is a real server for this connection. Looking at current code, ip_vs_proc_conn ignores the fact that master wants the backup as real server for this connection, backup will continue to use real server 3. For now, I don't see where this can fail except if persistence comes in the game or if failover happens to another backup which will use real server 3. The result is that the backup acts as balancer even if it is just a backup without master function. > Observations: > *) if ipvs is switched off on that "other" server, everything works just fine > (service ipvsadm stop) So, someone stops the SYN traffic in backup? 
> *) if ipvs is left running on that "other" server, but syncing daemon is > switched off, everything works just fine. Without rules in this backup? > We are 95% sure that the problem appears only if the "other server" ipvs > connection table gets a copy of this > connection from the active balancer. If the copy is not there (the sync daemon > was stopped when the connection > was established, and restarted immediately after), everything works just fine. Interesting, new master forwards to old master, so it should send SYNC containing the old master as real server, how can there be a problem, may be your kernel does not support properly the local server function which is fixed 2 years ago. > *) the problem exists for protocols like POP, IMAP, SMTP - where the server > immediately sends some data (prompt) to the client, as soon as the connection > is established. The SYNC packets always go after the traffic, so not sure why SYN will work while there will be difference for other traffic. May be your kernel version reacts differently when first SYNC message claims server 3 is the real server, not backup 1 and the double-scheduling is broken after 3-way handshake. > When the HTTP protocol is used, the problem does not exist, but only if the > entire request is sent as one packet. If the HTTP connection is a "keep-alive" > one, subsequent requests in the same connection do not reach the application > either. > I.e. it looks like the "idling" ipvs allows only one incoming data packet in, > and only if there has been no outgoing packet on that connection yet. May be SYNC message changes the destination in backup as I already said above? Some tcpdump output will be helpful in case you don't know how to dig into the sources of your kernel. > *) Sometimes (we still cannot reproduce this reliably) the ksoftirqd threads > on the "other" server jump to 100% CPU > utilization, and when it happens, it happens in reaction to one connection > being established. 
This sounds as a problem fixed 2 years ago: http://marc.info/?t=128428786100001&r=1&w=2 At that time even fwmark was not supported for sync purposes. Note that many changes happened in this 2 year period, some for fwmark support for IPVS sync, some for the 100% loops. Without knowing the kernel version I'm not willing to flood you with changes that you should check if they are present in your kernel if it contains additional patches. > Received suggestions: > *) it was suggested that we use iptables to filter the packets to VIP that > come from other servers in the farm (using their MAC addresses) and direct > them directly to the local application, bypassing ipvs processing. We cannot > do that, as servers in the farm can be added at any moment, and updating the > list of MACs on all servers is not trivial. It may be easier to filter the > packets that come from the router(s), which are less numerous and do not > change that often. > But it does not look like a good solution. If the ipvs table on "inactive" > balancer drops packets, why would it stop dropping them when it becomes an > "active" balancer? Just because there will be ipvs rules present? > > *) The suggestion to separate load balancer(s) and real servers won't work for > us at all. > > *) We tried not to empty the ipvs table on the "other" server(s). Instead, we > left it balancing - but with only one "real server" - this server itself. Now, > the "active" load balancer dsitributes packets to itself and other servers, > and when the packets hit the "other" server(s), they get to the ipvs again, > where they are balanced again, but to the local server only. Very good, only that you need recent kernel for this, 2010-Nov +, there are fixes even after that time. > It looks like it does solve the problem. But now the ipvs connection table on > the "other" server(s) is filled by both that server ipvs itself and by the > sync-daemon. 
While the locally-generated connection table entries should be > the same as corresponding entries received with the sync daemon, it does not > look good when the same table is modified from two sources. Sync happens only in one direction at a time, from current master to current backup (it can be more than one). The benefit is that all servers used for sync have same table and you can switch between them at any time. Of course, there is some performance price for traffic that goes to the local stack of backups but they should get from current master only traffic for their stack. > Any comment, please? Should we use the last suggestion? I think, with fresh kernel your setup should be supported. After showing the kernel version we can decide for further steps. I'm not sure if we need to change kernel not to schedule new connections for the BACKUP && !MASTER configuration. By this way backup can have same rules as backup which can work for DR/TUN. Without such change we can not do role change without breaking connections because the SYNC protocol declares real server 1 as server while some backup overrides this decision and uses real server 3, decision not known by other potential masters. > -- > Best regards, > Dmitry Akindinov > -- Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 13+ messages in thread
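
[Editorial sketch] Julian notes above that, for DR/TUN, people avoid double scheduling by checking whether a packet came from the uplink gateway MAC rather than from a master, and Dmitry's first message observes that routers are fewer and change less often than farm members. A minimal dry-run sketch of that idea follows: only packets to the VIP that arrive from a known router MAC get the fwmark, so packets relayed by the active balancer stay unmarked and never match the fwmark virtual service. The function name mark_rules, the VIP 192.0.2.100, and the MAC addresses are hypothetical placeholders, and the script only prints the iptables commands. It assumes the VIP is configured locally on every server so unmarked packets reach the local stack.

```shell
#!/bin/sh
# Dry-run sketch: prints iptables commands, does not execute them.
# FWMARK matches the mark value of 100 used in the thread.
FWMARK=100

# Usage: mark_rules <vip> <router-mac>...
mark_rules() {
    vip=$1
    shift
    # Step 4a of the original setup: no connection tracking for the VIP.
    echo "iptables -t raw -A PREROUTING -d $vip -j NOTRACK"
    # Step 4b, restricted: mark only packets that arrive from a router.
    for mac in "$@"; do
        echo "iptables -t mangle -A PREROUTING -d $vip -m mac --mac-source $mac -j MARK --set-mark $FWMARK"
    done
}

# Hypothetical VIP and two uplink router MACs (placeholder values).
mark_rules 192.0.2.100 00:00:5e:00:53:01 00:00:5e:00:53:02
```

The trade-off raised in the thread still applies: the router MAC list has to be kept current on every server, which is the maintenance burden Dmitry wanted to avoid for farm members.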
* Re: Multiple load balancers problem 2012-08-25 11:53 ` Julian Anastasov @ 2012-08-27 8:02 ` Dmitry Akindinov 2012-08-27 11:17 ` Julian Anastasov 0 siblings, 1 reply; 13+ messages in thread From: Dmitry Akindinov @ 2012-08-27 8:02 UTC (permalink / raw) To: Julian Anastasov; +Cc: lvs-devel Hello, On 2012-08-25 15:53, Julian Anastasov wrote: > > Hello, > > On Sat, 25 Aug 2012, Dmitry Akindinov wrote: > >> Hello, >> >> We are currently stuck with the following ipvs problem: >> >> 1. The configuration includes a (potentially large) set of servers providing >> various services - besides HTTP (POP, IMAP, LDAP, SMTP, XMPP, etc.) The test >> setup includes just 2 servers, though. >> 2. Each server runs a stock version of CentOS 6.0 > > OK, I don't know what kernel and patches includes > every distribution. Can you tell at least what shows uname -a? Ah, sorry. That was [root@fm1 ~]# uname -a Linux fm1.***.com 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 x86_64 x86_64 x86_64 GNU/Linux >> 3. The application software (CommuniGate Pro) controls the ipvs kernel module >> using the ipvsadm commands. >> 4. On each server, iptables are configured to: >> a) disable connection tracking for VIP address(es) >> b) mark all packets coming to the VIP address(es) with the mark value of >> 100. >> 5. On the currently active load balancer, the ipvsadm is used to configure >> ipvs to load-balance packets with the marker 100: >> -A -f 100 -s rr -p 1 >> -a -f 100 -r<server1> -g >> -a -f 100 -r<server2> -g >> .... >> where the active balancer itself is one of the<serverN> >> 6. All other servers (just 1 "other" server in our test config) are running >> ipvs, but with an empty rule set. > > I think, running slaves without same rules is a mistake. > When the slave receives sync message it has to assign it > to some virtual server and even assign real server for > this connection. But if this slave is also a real server > the things are complicated. 
I now check the code and > do not see where we prevent backup to schedule traffic > received from current master. The master gives the traffic > to backup because considers it a real server but this > backup with rules decides to schedule it to different real server. Yes, exactly. And to avoid this "secondary load balancing", we do not load the rules into ipvs until it becomes the active balancer. Looks like it's causing problems, so the alternative we are using now is to load the rules, but make them balance everything to a single server - the local one. > This problem can not happen for NAT, only for DR/TUN, > I see that you are using DR forwarding method. So, > currently, IPVS users do not add ipvsadm rules in backup > for DR/TUN for this reason? Yes, please see above. >> 7. The active load balancer runs the sync daemon started with ipvsadm >> --start-daemon master >> 7. All other servers run the sync daemon started with ipvsadm --start-daemon >> backup. >> >> As a result, all servers have the duplicated ipvs connection tables. If the >> active balancer fails, some other server assumes its role by arp-broadcasting >> VIP and loading the ipvs rule set listed above. > > In initial email you said: > "Now, we initiate a failover. During the failover, the ipvs table on the > old "active" balancer is cleared," > > Why do you clear the connection table? What if you > decide after 10 seconds to return back control to the first > master? No, we do not clear the connection table, we clear the rule set, to avoid the "double balancing" problem. Now, instead of clearing the rule set completely, we simply remove from it all other servers, leaving only the local one. >> When a connection is being established to the VIP address, and the active load >> balancer directs it to itself, everything works fine. > > I assume you are talking for box 2 (the new master) Yes. 
>> When a connection is being established to the VIP address, and the active load >> balancer directs it to some other server, the connection is established fine, >> and if the protocol is POP, IMAP, SMTP, the server prompt is sent to the >> client via VIP, and it is seen by client just fine. > > You mean, new connection does 3-way handshake via > the new master to other real servers and succeeds, or already > established connection before the failover work after failover? > Is packet directed to old master? We have only tested new connections so far. We will now test how existing connections survive during failover, and we will report if there are problems there. >> But when the client tries to send anything to the server, the packet >> (according to tcpdump) reaches the load balancer server, and from there it >> reaches the "other" server. Where the packet is dropped. The client resends >> that packet, it goes to the active balancer, then to the "other" server, and >> it is dropped again. > > Why this real server drops the packet? What is > different in this packet? Are you talking about connections > created before failover, that they can not continue to work after > failover? May be problem happens for DR. Can you show > tcpdump in old master that 3-way traffic is received and also > that it is replied by it, not by some real server. We ran tcpdump on both systems - the active balancer and the inactive one. We saw all incoming packets properly coming to the active balancer (so, no arp problems), and we saw the active balancer directing some connections to the inactive one: both the active and inactive balancers are the only real servers in the ipvs config of the active balancer. When we look at the connections directed by the active balancer to the inactive one, we see the incoming packets reaching the inactive balancer, and we see the inactive balancer (the application on that server) receiving the connection. 
We see the application sending the prompt message out, and we see (tcpdump) that this packet goes out, directly to the client (to the router). Now, we see the client trying to send some data to the server, and we see the data packet hitting the active load balancer, and then - the inactive load balancer. And there we see the packet disappearing - the application does not see it, and since there is no "ack" sent back to the client, we see the client TCP stack resending that packet over and over, but all resent packets have the same fate - they disappear inside the inactive load balancer.

We can send the actual tcpdumps if needed.

> Problem can happen only if master sends new traffic to
> backup (its real server). For example:
>
> - master schedules SYN to real server which is backup with same rules
> - SYNC conn is not sent before IPVS conn enters ESTABLISHED state,
> so backup does not know for such connection, it looks like new one
> - backup has rules, it decides to use real server 3 and
> directs the SYN there. It can happen only for DR/TUN because
> the daddr is VIP, that is why people overcome the problem
> by checking that packet comes from some master and not
> from uplink gateway MAC. For NAT there is no such double-step
> scheduling because the backups' rules do not match the
> internal real server IP in the daddr, they work only for VIP

No, this is not the case. The backup balancer did not have rules, so it could not schedule the packet to some server 3. Also, the "sync" exchange, which happens when there is no connection table record yet, works just fine. Packets are dropped only after/when the connection record appears in the inactive load balancer (via sync'ing with the active balancer).

> - more traffic comes, backup directs it to real server 3
> - the first SYNC message for this connection comes from
> master but the SYNC message claims the backup is a real
> server for this connection.
> Looking at current code,
> ip_vs_proc_conn ignores the fact that master wants the
> backup as real server for this connection, backup will
> continue to use real server 3. For now, I don't see where
> this can fail except if persistence comes in the game
> or if failover happens to another backup which will use
> real server 3. The result is that the backup acts as
> balancer even if it is just a backup without master function.

Again, it was not the case. It looks like (as you have specified initially) if there is a connection record (received from the active balancer), the inactive (backup) balancer must assign it to some local ipvs rule. In our case, the rule set on the backup balancer was empty, and that drove ipvs there mad, and somehow resulted in it dropping the packets belonging to this "orphan" connection.

When we added 2 rules to the "inactive/backup" ipvs, one for the virtual server and one for the only real server - the local server, the problem disappeared.

>> Observations:
>> *) if ipvs is switched off on that "other" server, everything works just fine
>> (service ipvsadm stop)
>
> So, someone stops the SYN traffic in backup?

SYN negotiations worked fine. The problem started AFTER the SYN exchange was over. Here is a theory (most likely, a wrong one ;-) ):

during the SYN exchange, a connection record does not exist (as you've mentioned), so the SYN exchange works fine.

after the SYN exchange, a connection record is created on the active balancer and sent to the backup balancer, where it is entered into its table.

when the application on the backup balancer sends the prompt out, the client receives it, and sends back the ACK packet. The ACK packet goes to the active balancer, and from there - to the backup one. Upon receiving this FIRST packet for the newly created connection record, ipvs marks that record somehow, but lets this packet through, the local TCP stack gets it, so it does not resend the application prompt.

Now, when the client sends its data, it comes to the backup balancer via the active balancer. The ipvs module sees that this packet is directed to the connection record it already has, and that connection record is "marked" somehow by the first packet, and this mark forces ipvs to drop this and the following data packets.

If the TCP protocol used does not include a server prompt (HTTP, for example), then the first data packet the client sends does reach the application. But it also marks the connection record somehow, so all subsequent packets are dropped.

As soon as we added the records to ipvs on the backup balancer, the problem disappeared.

>> *) if ipvs is left running on that "other" server, but syncing daemon is
>> switched off, everything works just fine.
>
> Without rules in this backup?

Yes. When there is no syncing, the backup balancer works just fine - i.e. it does not do anything and does not interfere with the traffic in any way.

>> We are 95% sure that the problem appears only if the "other server" ipvs
>> connection table gets a copy of this
>> connection from the active balancer. If the copy is not there (the sync daemon
>> was stopped when the connection
>> was established, and restarted immediately after), everything works just fine.
>
> Interesting, new master forwards to old master,
> so it should send SYNC containing the old master as real
> server, how can there be a problem, may be your kernel does
> not support properly the local server function which is
> fixed 2 years ago.

Hmm. I assume the kernel we use is pretty fresh.

>> *) the problem exists for protocols like POP, IMAP, SMTP - where the server
>> immediately sends some data (prompt) to the client, as soon as the connection
>> is established.
>
> The SYNC packets always go after the traffic, so
> not sure why SYN will work while there will be difference for
> other traffic.
May be your kernel version reacts differently > when first SYNC message claims server 3 is the real server, > not backup 1 and the double-scheduling is broken after > 3-way handshake. There is no 3rd server anywhere in the config. Please see our theory above. >> When the HTTP protocol is used, the problem does not exist, but only if the >> entire request is sent as one packet. If the HTTP connection is a "keep-alive" >> one, subsequent requests in the same connection do not reach the application >> either. >> I.e. it looks like the "idling" ipvs allows only one incoming data packet in, >> and only if there has been no outgoing packet on that connection yet. > > May be SYNC message changes the destination in > backup as I already said above? Some tcpdump output will > be helpful in case you don't know how to dig into the > sources of your kernel. There is no change in destination. The dropped packets are really dropped, not relayed somewhere. Also, if they were relayed, they could only be relayed to the active balancer, as ipvs config only has or had these two servers in it. And tcpdump on the active balancer properly shows the packets sent to the backup balancer, but no packets coming back from that balancer. >> *) Sometimes (we still cannot reproduce this reliably) the ksoftirqd threads >> on the "other" server jump to 100% CPU >> utilization, and when it happens, it happens in reaction to one connection >> being established. > > This sounds as a problem fixed 2 years ago: > > http://marc.info/?t=128428786100001&r=1&w=2 Yes, our kernel may be susceptible to this problem > At that time even fwmark was not supported for > sync purposes. > > Note that many changes happened in this 2 year > period, some for fwmark support for IPVS sync, some for > the 100% loops. Without knowing the kernel version > I'm not willing to flood you with changes that you > should check if they are present in your kernel if > it contains additional patches. 
>
>> Received suggestions:
>> *) it was suggested that we use iptables to filter the packets to VIP that
>> come from other servers in the farm (using their MAC addresses) and direct
>> them directly to the local application, bypassing ipvs processing. We cannot
>> do that, as servers in the farm can be added at any moment, and updating the
>> list of MACs on all servers is not trivial. It may be easier to filter the
>> packets that come from the router(s), which are less numerous and do not
>> change that often.
>> But it does not look like a good solution. If the ipvs table on "inactive"
>> balancer drops packets, why would it stop dropping them when it becomes an
>> "active" balancer? Just because there will be ipvs rules present?
>>
>> *) The suggestion to separate load balancer(s) and real servers won't work for
>> us at all.
>>
>> *) We tried not to empty the ipvs table on the "other" server(s). Instead, we
>> left it balancing - but with only one "real server" - this server itself. Now,
>> the "active" load balancer distributes packets to itself and other servers,
>> and when the packets hit the "other" server(s), they get to the ipvs again,
>> where they are balanced again, but to the local server only.
>
> Very good, only that you need recent kernel for this,
> 2010-Nov +, there are fixes even after that time.

Yes, it looks like we have the kernels built in May-2011.

>> It looks like it does solve the problem. But now the ipvs connection table on
>> the "other" server(s) is filled by both that server ipvs itself and by the
>> sync-daemon. While the locally-generated connection table entries should be
>> the same as corresponding entries received with the sync daemon, it does not
>> look good when the same table is modified from two sources.
>
> Sync happens only in one direction at a time, from
> current master to current backup (it can be more than one).
> The benefit is that all servers used for sync have same > table and you can switch between them at any time. Of > course, there is some performance price for traffic that > goes to the local stack of backups but they should get from > current master only traffic for their stack. That's not what concerns us. IPVS on the backup balancer is now being filled by 2 sources: the "sync" process, which copies records from the active balancer, and the IPVS itself. I.e. now (when we have rules in the backup balancer, too) - when a new connection arrives to the backup balancer, the balancer creates a connection record and places it into its connection table. A few moments later, the sync daemon receives a connection record for the same connection from the active load balancer, and it also wants to put that record into the connection table on the backup balancer. Our concern is a potential conflict here: that record is already in the table. If you say that there can be no conflict - it would be nice, but we do not know how ipvs is designed, so we cannot get rid of that concern on our own. >> Any comment, please? Should we use the last suggestion? > > I think, with fresh kernel your setup should be > supported. After showing the kernel version we can decide > for further steps. I'm not sure if we need to change kernel > not to schedule new connections for the BACKUP&& !MASTER > configuration. By this way backup can have same rules > as backup which can work for DR/TUN. Without such change > we can not do role change without breaking connections > because the SYNC protocol declares real server 1 as > server while some backup overrides this decision and > uses real server 3, decision not known by other > potential masters. We now keep the IPVS rule set on backup balancer(s) with just 2 records: a virt server and one real server - the local one. 
All connections the backup balancers get (from the active one) are directed to their local applications, so they are properly "balanced". When a backup balancer is instructed to become an active one, our application automatically loads the ruleset with all other real servers into its ipvs rule set, and then sends arp broadcast for all VIPs, switching the traffic to the new active balancer. The existing connections should survive, as the connection table contains all the records sync'ed from the old active balancer, right? The interesting question is how ipvs assigns the connection records received via the sync protocol: as we have seen, we had to put the virt server and the local real server rules into ipvs in order to stop the problem of the "backup" mode. Now, during the failover, we add the rules for other "real servers" AFTER the connection records for their connections were received from the then-active balancer. Will it cause the same type of problem? We will test failovers now, and we will report. -- Best regards, Dmitry Akindinov -- Stalker Labs. ^ permalink raw reply [flat|nested] 13+ messages in thread
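The two rule sets described in the message above (backup: virtual service plus the local real server only; active: all real servers) can be sketched as follows. This is a minimal illustration, not the CommuniGate Pro implementation: the helper name `ipvs_rules` and the 192.0.2.x addresses are hypothetical, and the rule lines simply reuse the options quoted in this thread. The function only prints rule lines, so running it changes nothing on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an IPVS rule set, one ipvsadm option string
# per line, in the same form as the rules listed in this thread.
#   $1   role: "active" or "backup"
#   $2   firewall mark (100 in the thread)
#   $3   address of the local real server
#   $4.. addresses of the other real servers
ipvs_rules() {
    local role="$1" mark="$2" self="$3" peer
    shift 3
    # Virtual service: fwmark-based, round-robin, persistence timeout 1s.
    printf -- '-A -f %s -s rr -p 1\n' "$mark"
    # The local server is a real server in both roles (-g = direct routing).
    printf -- '-a -f %s -r %s -g\n' "$mark" "$self"
    # Only the active balancer schedules to the other real servers.
    if [ "$role" = active ]; then
        for peer in "$@"; do
            printf -- '-a -f %s -r %s -g\n' "$mark" "$peer"
        done
    fi
}

# Placeholder addresses; prints the backup rule set for one peer.
ipvs_rules backup 100 192.0.2.12 192.0.2.11
```

One way to load the printed lines is to pipe them into `ipvsadm -R`, which restores rules from standard input; whether that fits a given deployment is left as an assumption here.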
* Re: Multiple load balancers problem 2012-08-27 8:02 ` Dmitry Akindinov @ 2012-08-27 11:17 ` Julian Anastasov 2012-08-27 15:15 ` Dmitry Akindinov 0 siblings, 1 reply; 13+ messages in thread From: Julian Anastasov @ 2012-08-27 11:17 UTC (permalink / raw) To: Dmitry Akindinov; +Cc: lvs-devel Hello, On Mon, 27 Aug 2012, Dmitry Akindinov wrote: > > OK, I don't know what kernel and patches includes > > every distribution. Can you tell at least what shows uname -a? > > Ah, sorry. That was > > [root@fm1 ~]# uname -a > Linux fm1.***.com 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 > x86_64 x86_64 x86_64 GNU/Linux I downloaded kernel-2.6.32-71.el6.src.rpm and I see that it does not contain the needed changes to support backup to be real server for DR/TUN: commit fc604767613b6d2036cdc35b660bc39451040a47 Author: Julian Anastasov <ja@ssi.bg> Date: Sun Oct 17 16:38:15 2010 +0300 ipvs: changes for local real server and to support fwmark for SYNC: commit fe5e7a1efb664df0280f10377813d7099fb7eb0f Author: Hans Schillstrom <hans.schillstrom@ericsson.com> Date: Fri Nov 19 14:25:12 2010 +0100 IPVS: Backup, Adding Version 1 receive capability Functionality improvements * flags changed from 16 to 32 bits * fwmark added (32 bits) * timeout in sec. added (32 bits) * pe data added (Variable length) * IPv6 capabilities (3x16 bytes for addr.) * Version and type in every conn msg. > Yes, exactly. And to avoid this "secondary load balancing", we > do not load the rules into ipvs until it becomes the active balancer. > > Looks like it's causing problems, so the alternative we are using now > is to load the rules, but make them balance everything to a single > server - the local one. It seems even this is not enough because when the backup receives the sync message it creates SYNC connection (after passing the initial SYN and ACK) but this connection claims this backup is a real server and is using DR method. 
Without the commit fc604767613b6d2036cdc35b660bc39451040a47 when next packets come ip_vs_dr_xmit tries to send them to LOCAL_OUT (DR forwarding) instead of returning NF_ACCEPT as for LOCALNODE. As result, packet does not reach local stack as the previous SYN and ACK packets and may be you see that packet loops in the stack causing 100% CPU usage as you said below that it disappears:

> Now, we see the client trying to send some data to the server,
> and we see the data packet hitting the active load balancer,
> and then - the inactive load balancer. And there we see the
> packet disappearing - the application does not see it, and since
> there is not "ack" sent back to the client, we see the client
> TCP stack resending that packet over and over, but all resent
> packets have the same fate - they disappear inside the inactive
> load balancer.
>
> We can send the actual tcpdumps if needed.

Not needed, I think, you need kernel update.

> > directs the SYN there. It can happen only for DR/TUN because
> > the daddr is VIP, that is why people overcome the problem
> > by checking that packet comes from some master and not
> > from uplink gateway MAC. For NAT there is no such double-step
> > scheduling because the backups' rules do not match the
> > internal real server IP in the daddr, they work only for VIP
>
> No, this is not the case. The backup balancer did not have rules,

Yes, I just explained this variant too.

> > Interesting, new master forwards to old master,
> > so it should send SYNC containing the old master as real
> > server, how can there be a problem, may be your kernel does
> > not support properly the local server function which is
> > fixed 2 years ago.
>
> Hmm. I assume the kernel we use is pretty fresh.

I see ip_vs_conn.c from Sep 1 2010 is the latest file from IPVS.

> > May be SYNC message changes the destination in
> > backup as I already said above?
Some tcpdump output will > > be helpful in case you don't know how to dig into the > > sources of your kernel. > > There is no change in destination. The dropped packets are really dropped, not > relayed somewhere. Also, if they were relayed, they could only be relayed to > the active balancer, as ipvs config only has or had these two servers in it. > And tcpdump on the active balancer properly shows the packets sent to the > backup balancer, but no packets coming back from that balancer. Yes, may be they loop in stack: DR via LOCAL_OUT, then they appear again in LOCAL_IN for forwarding? > > Very good, only that you need recent kernel for this, > > 2010-Nov +, there are fixes even after that time. > > Yes, it looks like we have the kernels built in May-2011. Yep. > > table and you can switch between them at any time. Of > > course, there is some performance price for traffic that > > goes to the local stack of backups but they should get from > > current master only traffic for their stack. > > That's not what concerns us. IPVS on the backup balancer is now > being filled by 2 sources: the "sync" process, which copies records > from the active balancer, and the IPVS itself. > > I.e. now (when we have rules in the backup balancer, too) - > when a new connection arrives to the backup balancer, > the balancer creates a connection record and places it into its > connection table. > A few moments later, the sync daemon receives a connection > record for the same connection from the active load balancer, > and it also wants to put that record into the connection table > on the backup balancer. > Our concern is a potential conflict here: that record is already > in the table. If you say that there can be no conflict - it would > be nice, but we do not know how ipvs is designed, so we > cannot get rid of that concern on our own. May be we should stop any forwarding while we are in backup mode. 
The problem is that we can be both in master and backup mode and I'm not sure if this is used at all. I guess master and backup use different syncid but anyways, may be such setup works only for NAT. > When a backup balancer is instructed to become an active one, > our application automatically loads the ruleset with all other > real servers into its ipvs rule set, and then sends arp broadcast > for all VIPs, switching the traffic to the new active balancer. > > The existing connections should survive, as the connection table > contains all the records sync'ed from the old active balancer, right? Yes. > The interesting question is how ipvs assigns the connection records > received via the sync protocol: as we have seen, we had to put the > virt server and the local real server rules into ipvs in order to stop > the problem of the "backup" mode. > Now, during the failover, we add the rules for other "real servers" > AFTER the connection records for their connections were received > from the then-active balancer. Will it cause the same type of problem? Not fatal but without rules we can not maintain actual counters for active/inactive conns. After failover the setup will start with zeroed counters that are later modified only for new connections, all SYNCed conns are not accounted and the first minutes after failover we can see some imbalance. Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Multiple load balancers problem 2012-08-27 11:17 ` Julian Anastasov @ 2012-08-27 15:15 ` Dmitry Akindinov 2012-08-27 15:27 ` Dmitry Akindinov 2012-08-27 16:13 ` Julian Anastasov 0 siblings, 2 replies; 13+ messages in thread From: Dmitry Akindinov @ 2012-08-27 15:15 UTC (permalink / raw) To: Julian Anastasov; +Cc: lvs-devel

Hello,

Sorry for top posting: doing this to avoid clutter below.

Thank you for your assistance. Let me summarize the current situation, after all the changes and testing.

1. The test system consists of two servers, S1 and S2. Both are running CentOS 6.0:

[root@fm1 ~]# uname -a
Linux fm1.***.com 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 x86_64 x86_64 x86_64 GNU/Linux

We are now setting up new boxes (CentOS 6.3) to re-test with newer kernels.

2. Both systems have iptables configured to mark the traffic to VIP with the "100" marker.

3. At the beginning of the test, the IPVS configuration on S1 is:
-A -f 100 -s rr -p 1
-a -f 100 -r S1:0 -g -w 100
-a -f 100 -r S2:0 -g -w 100

The IPVS configuration on S2 is:
-A -f 100 -s rr -p 1
-a -f 100 -r S1:0 -g -w 0
-a -f 100 -r S2:0 -g -w 100

We establish test connections from client systems to port 110 of VIP; S1 routes one connection to itself, the other one to S2. Both connections are alive and well, and the connection tables on both systems are the same due to the ipvs syncing daemons:

ipvsadm -l -c -n:
IP  00:49 NONE        C1:0     0.0.0.100:0 S1:0
TCP 14:49 ESTABLISHED C1:54837 VIP:110     S1:110
TCP 14:43 ESTABLISHED C2:54648 VIP:110     S2:110
IP  00:43 NONE        C2:0     0.0.0.100:0 S2:0

Now, we initiate a failover, so S2 becomes the active balancer. The IPVS rules on S2 are updated, so they become the same as they were on S1:
-A -f 100 -s rr -p 1
-a -f 100 -r S1:0 -g -w 100
-a -f 100 -r S2:0 -g -w 100

And S1 gets the same config as S2 used before the failover.

All the connections that existed on S2 before failover continue to work.
But the connections that existed on S1 are closed as soon as the client sends any data to that connection. The tcpdump on S1 does not show any incoming packets, and tcpdump on S2 shows that it's S2 itself (the new load balancer) that closes these connections (the data the client has sent was "HELP\r\n" - 6 bytes):

07:54:59.214200 IP (tos 0x10, ttl 54, id 20406, offset 0, flags [DF], proto TCP (6), length 58)
    C1.54837 > VIP.110: Flags [P.], cksum 0xba0d (correct), seq 3572724860:3572724866, ack 1018696840, win 33304, options [nop,nop,TS val 3371318384 ecr 1243703099], length 6
07:54:59.214253 IP (tos 0x10, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    VIP.110 > C1.54837: Flags [R], cksum 0x5767 (correct), seq 1018696840, win 0, length 0

What can cause the new load balancer to reset (Flags [R]) the existing connections to the "old" balancer?

New connections now work fine, being distributed by the new load balancer to itself and to the old balancer.

On 2012-08-27 15:17, Julian Anastasov wrote:
>
> Hello,
>
> On Mon, 27 Aug 2012, Dmitry Akindinov wrote:
>
>>> OK, I don't know what kernel and patches includes
>>> every distribution. Can you tell at least what shows uname -a?
>>
>> Ah, sorry.
That was >> >> [root@fm1 ~]# uname -a >> Linux fm1.***.com 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 >> x86_64 x86_64 x86_64 GNU/Linux > > I downloaded kernel-2.6.32-71.el6.src.rpm and I see > that it does not contain the needed changes to support > backup to be real server for DR/TUN: > > commit fc604767613b6d2036cdc35b660bc39451040a47 > Author: Julian Anastasov<ja@ssi.bg> > Date: Sun Oct 17 16:38:15 2010 +0300 > > ipvs: changes for local real server > > and to support fwmark for SYNC: > > commit fe5e7a1efb664df0280f10377813d7099fb7eb0f > Author: Hans Schillstrom<hans.schillstrom@ericsson.com> > Date: Fri Nov 19 14:25:12 2010 +0100 > > IPVS: Backup, Adding Version 1 receive capability > > Functionality improvements > * flags changed from 16 to 32 bits > * fwmark added (32 bits) > * timeout in sec. added (32 bits) > * pe data added (Variable length) > * IPv6 capabilities (3x16 bytes for addr.) > * Version and type in every conn msg. > >> Yes, exactly. And to avoid this "secondary load balancing", we >> do not load the rules into ipvs until it becomes the active balancer. >> >> Looks like it's causing problems, so the alternative we are using now >> is to load the rules, but make them balance everything to a single >> server - the local one. > > It seems even this is not enough because when > the backup receives the sync message it creates SYNC > connection (after passing the initial SYN and ACK) but > this connection claims this backup is a real server > and is using DR method. Without the commit > fc604767613b6d2036cdc35b660bc39451040a47 > when next packets come ip_vs_dr_xmit tries to send them > to LOCAL_OUT (DR forwarding) instead of returning > NF_ACCEPT as for LOCALNODE. 
> As result, packet does not
> reach local stack as the previous SYN and ACK packets
> and may be you see that packet loops in the stack causing
> 100% CPU usage as you said below that it disappears:
>
>> Now, we see the client trying to send some data to the server,
>> and we see the data packet hitting the active load balancer,
>> and then - the inactive load balancer. And there we see the
>> packet disappearing - the application does not see it, and since
>> there is not "ack" sent back to the client, we see the client
>> TCP stack resending that packet over and over, but all resent
>> packets have the same fate - they disappear inside the inactive
>> load balancer.
>>
>> We can send the actual tcpdumps if needed.
>
> Not needed, I think, you need kernel update.
>
>>> directs the SYN there. It can happen only for DR/TUN because
>>> the daddr is VIP, that is why people overcome the problem
>>> by checking that packet comes from some master and not
>>> from uplink gateway MAC. For NAT there is no such double-step
>>> scheduling because the backups' rules do not match the
>>> internal real server IP in the daddr, they work only for VIP
>>
>> No, this is not the case. The backup balancer did not have rules,
>
> Yes, I just explained this variant too.
>
>>> Interesting, new master forwards to old master,
>>> so it should send SYNC containing the old master as real
>>> server, how can there be a problem, may be your kernel does
>>> not support properly the local server function which is
>>> fixed 2 years ago.
>>
>> Hmm. I assume the kernel we use is pretty fresh.
>
> I see ip_vs_conn.c from Sep 1 2010 is the latest
> file from IPVS.
>
>>> May be SYNC message changes the destination in
>>> backup as I already said above? Some tcpdump output will
>>> be helpful in case you don't know how to dig into the
>>> sources of your kernel.
>>
>> There is no change in destination. The dropped packets are really dropped, not
>> relayed somewhere.
Also, if they were relayed, they could only be relayed to >> the active balancer, as ipvs config only has or had these two servers in it. >> And tcpdump on the active balancer properly shows the packets sent to the >> backup balancer, but no packets coming back from that balancer. > > Yes, may be they loop in stack: DR via LOCAL_OUT, > then they appear again in LOCAL_IN for forwarding? > >>> Very good, only that you need recent kernel for this, >>> 2010-Nov +, there are fixes even after that time. >> >> Yes, it looks like we have the kernels built in May-2011. > > Yep. > >>> table and you can switch between them at any time. Of >>> course, there is some performance price for traffic that >>> goes to the local stack of backups but they should get from >>> current master only traffic for their stack. >> >> That's not what concerns us. IPVS on the backup balancer is now >> being filled by 2 sources: the "sync" process, which copies records >> from the active balancer, and the IPVS itself. >> >> I.e. now (when we have rules in the backup balancer, too) - >> when a new connection arrives to the backup balancer, >> the balancer creates a connection record and places it into its >> connection table. >> A few moments later, the sync daemon receives a connection >> record for the same connection from the active load balancer, >> and it also wants to put that record into the connection table >> on the backup balancer. >> Our concern is a potential conflict here: that record is already >> in the table. If you say that there can be no conflict - it would >> be nice, but we do not know how ipvs is designed, so we >> cannot get rid of that concern on our own. > > May be we should stop any forwarding while we are > in backup mode. The problem is that we can be both in > master and backup mode and I'm not sure if this is used > at all. I guess master and backup use different syncid but > anyways, may be such setup works only for NAT. 
> >> When a backup balancer is instructed to become an active one, >> our application automatically loads the ruleset with all other >> real servers into its ipvs rule set, and then sends arp broadcast >> for all VIPs, switching the traffic to the new active balancer. >> >> The existing connections should survive, as the connection table >> contains all the records sync'ed from the old active balancer, right? > > Yes. > >> The interesting question is how ipvs assigns the connection records >> received via the sync protocol: as we have seen, we had to put the >> virt server and the local real server rules into ipvs in order to stop >> the problem of the "backup" mode. >> Now, during the failover, we add the rules for other "real servers" >> AFTER the connection records for their connections were received >> from the then-active balancer. Will it cause the same type of problem? > > Not fatal but without rules we can not maintain > actual counters for active/inactive conns. After failover > the setup will start with zeroed counters that are > later modified only for new connections, all SYNCed conns are > not accounted and the first minutes after failover we can > see some imbalance. > > Regards > > -- > Julian Anastasov<ja@ssi.bg> -- Best regards, Dmitry Akindinov -- Stalker Labs. ^ permalink raw reply [flat|nested] 13+ messages in thread
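The connection listing quoted in the message above can be summarised per real server, for example to compare the synced tables on S1 and S2 before a failover. A minimal sketch, assuming the six-column layout shown in the thread (protocol, expire, state, source, virtual, destination) and IPv4-style `host:port` destinations; the function name `conns_per_server` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ipvsadm -l -c -n` style output on stdin and
# print "<real server> <count>" for every ESTABLISHED TCP entry.
conns_per_server() {
    awk '$1 == "TCP" && $3 == "ESTABLISHED" {
             dest = $6
             sub(/:[0-9]+$/, "", dest)   # strip the trailing :port
             count[dest]++
         }
         END { for (d in count) print d, count[d] }' | sort
}

# Sample input copied from the listing in this thread.
conns_per_server <<'EOF'
IP  00:49 NONE        C1:0     0.0.0.100:0 S1:0
TCP 14:49 ESTABLISHED C1:54837 VIP:110     S1:110
TCP 14:43 ESTABLISHED C2:54648 VIP:110     S2:110
IP  00:43 NONE        C2:0     0.0.0.100:0 S2:0
EOF
```

On a live balancer the input would come from `ipvsadm -l -c -n | conns_per_server`. The persistence template entries (the `IP ... NONE` lines) are skipped on purpose, since only established TCP connections are at risk during the role change discussed here.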
* Re: Multiple load balancers problem 2012-08-27 15:15 ` Dmitry Akindinov @ 2012-08-27 15:27 ` Dmitry Akindinov 2012-08-27 16:13 ` Julian Anastasov 1 sibling, 0 replies; 13+ messages in thread From: Dmitry Akindinov @ 2012-08-27 15:27 UTC (permalink / raw) To: Julian Anastasov; +Cc: lvs-devel

Hello,

An addition below.

On 2012-08-27 19:15, Dmitry Akindinov wrote:
> Hello,
>
> Sorry for top posting: doing this to avoid clutter below.
>
> Thank you for your assistance. Let me summarize the current
> situation, after all the changes and testing.
>
> 1. The test system consists of two servers, S1 and S2. Both running
> CentOS 6.0:
>
> [root@fm1 ~]# uname -a
> Linux fm1.***.com 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011
> x86_64 x86_64 x86_64 GNU/Linux
>
> We are now setting up new boxes (CentOS 6.3) to re-test with newer kernels
>
> 2. Both systems have iptables configured to mark the traffic to VIP with
> the "100" marker.
>
> 3. At the beginning of the test,
> IPVS on S1 configuration is:
> -A -f 100 -s rr -p 1
> -a -f 100 -r S1:0 -g -w 100
> -a -f 100 -r S2:0 -g -w 100
>
> IPVS on S2 configuration is:
> -A -f 100 -s rr -p 1
> -a -f 100 -r S1:0 -g -w 0
> -a -f 100 -r S2:0 -g -w 100
>
> We establish test connections from client systems to the port 110 of
> VIP, the S1 routes one connection to itself, the other one to S2. Both
> connections are alive and well, the connection tables on both systems
> are the same due to ipvs syncing daemons:
>
> ipvsadm -l -c -n:
> IP 00:49 NONE C1:0 0.0.0.100:0 S1:0
> TCP 14:49 ESTABLISHED C1:54837 VIP:110 S1:110
> TCP 14:43 ESTABLISHED C2:54648 VIP:110 S2:110
> IP 00:43 NONE C2:0 0.0.0.100:0 S2:0
>
> Now, we initiate a failover, so S2 becomes the active balancer.
> The IPVS rules on S2 are updated, so they become the same as they were
> on S1:
> -A -f 100 -s rr -p 1
> -a -f 100 -r S1:0 -g -w 100
> -a -f 100 -r S2:0 -g -w 100
>
> And S1 gets the same config as S2 used before the failover.
>
> All the connections that existed on S2 before failover continue to work.
>
> But the connections that existed on S1 are closed as soon as the client
> sends any data to that connection. The tcpdump on S1 does not show any
> incoming packets, and tcpdump on S2 shows that it's S2 itself (a new
> load balancer) that closes these connections (the data the client has
> sent was "HELP\r\n" - 6 bytes):
>
> 07:54:59.214200 IP (tos 0x10, ttl 54, id 20406, offset 0, flags [DF],
> proto TCP (6), length 58)
> C1.54837 > VIP.110: Flags [P.], cksum 0xba0d (correct), seq
> 3572724860:3572724866, ack 1018696840, win 33304, options [nop,nop,TS
> val 3371318384 ecr 1243703099], length 6
> 07:54:59.214253 IP (tos 0x10, ttl 64, id 0, offset 0, flags [DF], proto
> TCP (6), length 40)
> VIP.110 > C1.54837: Flags [R], cksum 0x5767 (correct), seq 1018696840,
> win 0, length 0
>
> What can cause the new load balancer to reset (Flags [R]) the existing
> connections to the "old" balancer?
>
> New connections now work fine, being distributed by the new load
> balancer to itself and to the old balancer.

PS. A look at the ipvsadm -l -c -n output on the new balancer showed that the troubled connection (directed to the "old" balancer) appears in the ESTABLISHED state both before and after the client sent some data and the new load balancer decided to drop the connection.
The connection tracking is switched off on both servers:

*raw
:PREROUTING ACCEPT [887797:396864975]
:OUTPUT ACCEPT [426902:66177111]
-A PREROUTING -d VIP/32 -j NOTRACK

The backup balancer used the following commands when it became the new active balancer:

43 STARTBALANCER\n
* switching on
* ipvsadm -e -f 100 -r S1 -g -w 100
* ipvsadm --stop-daemon backup
* ipvsadm --start-daemon master --mcast-interface eth0 --syncid 0
* sysctl net.ipv4.conf.all.arp_ignore=0
* result=net.ipv4.conf.all.arp_ignore = 0
* sysctl net.ipv4.conf.eth0.arp_ignore=0
* result=net.ipv4.conf.eth0.arp_ignore = 0
* sysctl net.ipv4.conf.all.arp_announce=0
* result=net.ipv4.conf.all.arp_announce = 0
* sysctl net.ipv4.conf.eth0.arp_announce=0
* result=net.ipv4.conf.eth0.arp_announce = 0
* arping -c 1 -I eth0 -U VIP
* result=ARPING VIP from VIP eth0
Sent 1 probes (1 broadcast(s))
Received 0 response(s)

As you can see, the rule for the virtual server (-A -f 100) and the rule for the local real server were not touched, and the record for the other server (the old load balancer) was edited (ipvsadm -e -f 100 -r S1 -g -w 100), not removed and re-added, to make it "active" (it was -w 0 while this server was not an active balancer).

Still, when this server becomes the active balancer, it resets all existing connections to the old balancer.

> On 2012-08-27 15:17, Julian Anastasov wrote:
>>
>> Hello,
>>
>> On Mon, 27 Aug 2012, Dmitry Akindinov wrote:
>>
>>>> OK, I don't know what kernel and patches includes
>>>> every distribution. Can you tell at least what shows uname -a?
>>>
>>> Ah, sorry.
That was >>> >>> [root@fm1 ~]# uname -a >>> Linux fm1.***.com 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST >>> 2011 >>> x86_64 x86_64 x86_64 GNU/Linux >> >> I downloaded kernel-2.6.32-71.el6.src.rpm and I see >> that it does not contain the needed changes to support >> backup to be real server for DR/TUN: >> >> commit fc604767613b6d2036cdc35b660bc39451040a47 >> Author: Julian Anastasov<ja@ssi.bg> >> Date: Sun Oct 17 16:38:15 2010 +0300 >> >> ipvs: changes for local real server >> >> and to support fwmark for SYNC: >> >> commit fe5e7a1efb664df0280f10377813d7099fb7eb0f >> Author: Hans Schillstrom<hans.schillstrom@ericsson.com> >> Date: Fri Nov 19 14:25:12 2010 +0100 >> >> IPVS: Backup, Adding Version 1 receive capability >> >> Functionality improvements >> * flags changed from 16 to 32 bits >> * fwmark added (32 bits) >> * timeout in sec. added (32 bits) >> * pe data added (Variable length) >> * IPv6 capabilities (3x16 bytes for addr.) >> * Version and type in every conn msg. >> >>> Yes, exactly. And to avoid this "secondary load balancing", we >>> do not load the rules into ipvs until it becomes the active balancer. >>> >>> Looks like it's causing problems, so the alternative we are using now >>> is to load the rules, but make them balance everything to a single >>> server - the local one. >> >> It seems even this is not enough because when >> the backup receives the sync message it creates SYNC >> connection (after passing the initial SYN and ACK) but >> this connection claims this backup is a real server >> and is using DR method. Without the commit >> fc604767613b6d2036cdc35b660bc39451040a47 >> when next packets come ip_vs_dr_xmit tries to send them >> to LOCAL_OUT (DR forwarding) instead of returning >> NF_ACCEPT as for LOCALNODE. 
>> As result, packet does not
>> reach local stack as the previous SYN and ACK packets
>> and may be you see that packet loops in the stack causing
>> 100% CPU usage as you said below that it disappears:
>>
>>> Now, we see the client trying to send some data to the server,
>>> and we see the data packet hitting the active load balancer,
>>> and then - the inactive load balancer. And there we see the
>>> packet disappearing - the application does not see it, and since
>>> there is not "ack" sent back to the client, we see the client
>>> TCP stack resending that packet over and over, but all resent
>>> packets have the same fate - they disappear inside the inactive
>>> load balancer.
>>>
>>> We can send the actual tcpdumps if needed.
>>
>> Not needed, I think, you need kernel update.
>>
>>>> directs the SYN there. It can happen only for DR/TUN because
>>>> the daddr is VIP, that is why people overcome the problem
>>>> by checking that packet comes from some master and not
>>>> from uplink gateway MAC. For NAT there is no such double-step
>>>> scheduling because the backups' rules do not match the
>>>> internal real server IP in the daddr, they work only for VIP
>>>
>>> No, this is not the case. The backup balancer did not have rules,
>>
>> Yes, I just explained this variant too.
>>
>>>> Interesting, new master forwards to old master,
>>>> so it should send SYNC containing the old master as real
>>>> server, how can there be a problem, may be your kernel does
>>>> not support properly the local server function which is
>>>> fixed 2 years ago.
>>>
>>> Hmm. I assume the kernel we use is pretty fresh.
>>
>> I see ip_vs_conn.c from Sep 1 2010 is the latest
>> file from IPVS.
>>
>>>> May be SYNC message changes the destination in
>>>> backup as I already said above? Some tcpdump output will
>>>> be helpful in case you don't know how to dig into the
>>>> sources of your kernel.
>>>
>>> There is no change in destination.
The dropped packets are really >>> dropped, not >>> relayed somewhere. Also, if they were relayed, they could only be >>> relayed to >>> the active balancer, as ipvs config only has or had these two servers >>> in it. >>> And tcpdump on the active balancer properly shows the packets sent to >>> the >>> backup balancer, but no packets coming back from that balancer. >> >> Yes, may be they loop in stack: DR via LOCAL_OUT, >> then they appear again in LOCAL_IN for forwarding? >> >>>> Very good, only that you need recent kernel for this, >>>> 2010-Nov +, there are fixes even after that time. >>> >>> Yes, it looks like we have the kernels built in May-2011. >> >> Yep. >> >>>> table and you can switch between them at any time. Of >>>> course, there is some performance price for traffic that >>>> goes to the local stack of backups but they should get from >>>> current master only traffic for their stack. >>> >>> That's not what concerns us. IPVS on the backup balancer is now >>> being filled by 2 sources: the "sync" process, which copies records >>> from the active balancer, and the IPVS itself. >>> >>> I.e. now (when we have rules in the backup balancer, too) - >>> when a new connection arrives to the backup balancer, >>> the balancer creates a connection record and places it into its >>> connection table. >>> A few moments later, the sync daemon receives a connection >>> record for the same connection from the active load balancer, >>> and it also wants to put that record into the connection table >>> on the backup balancer. >>> Our concern is a potential conflict here: that record is already >>> in the table. If you say that there can be no conflict - it would >>> be nice, but we do not know how ipvs is designed, so we >>> cannot get rid of that concern on our own. >> >> May be we should stop any forwarding while we are >> in backup mode. The problem is that we can be both in >> master and backup mode and I'm not sure if this is used >> at all. 
I guess master and backup use different syncid but >> anyways, may be such setup works only for NAT. >> >>> When a backup balancer is instructed to become an active one, >>> our application automatically loads the ruleset with all other >>> real servers into its ipvs rule set, and then sends arp broadcast >>> for all VIPs, switching the traffic to the new active balancer. >>> >>> The existing connections should survive, as the connection table >>> contains all the records sync'ed from the old active balancer, right? >> >> Yes. >> >>> The interesting question is how ipvs assigns the connection records >>> received via the sync protocol: as we have seen, we had to put the >>> virt server and the local real server rules into ipvs in order to stop >>> the problem of the "backup" mode. >>> Now, during the failover, we add the rules for other "real servers" >>> AFTER the connection records for their connections were received >>> from the then-active balancer. Will it cause the same type of problem? >> >> Not fatal but without rules we can not maintain >> actual counters for active/inactive conns. After failover >> the setup will start with zeroed counters that are >> later modified only for new connections, all SYNCed conns are >> not accounted and the first minutes after failover we can >> see some imbalance. >> >> Regards >> >> -- >> Julian Anastasov<ja@ssi.bg> > -- Best regards, Dmitry Akindinov -- Stalker Labs. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Multiple load balancers problem
2012-08-27 15:15 ` Dmitry Akindinov
2012-08-27 15:27 ` Dmitry Akindinov
@ 2012-08-27 16:13 ` Julian Anastasov
2012-08-27 20:24 ` Dmitry Akindinov
1 sibling, 1 reply; 13+ messages in thread
From: Julian Anastasov @ 2012-08-27 16:13 UTC (permalink / raw)
To: Dmitry Akindinov; +Cc: lvs-devel

Hello,

On Mon, 27 Aug 2012, Dmitry Akindinov wrote:

> Hello,
>
> Sorry for top posting: doing this to avoid clutter below.
>
> Thank you for your assistance. Let me summarize the current situation,
> after all the changes and testing.
>
> 1. The test system consists of two servers, S1 and S2. Both running CentOS
> 6.0:
>
> [root@fm1 ~]# uname -a
> Linux fm1.***.com 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011
> x86_64 x86_64 x86_64 GNU/Linux
>
> We are now setting up new boxes (CentOS 6.3) to re-test with newer kernels

The last bug fixes that can be needed for sync
are from April 2012.

> 2. Both systems have iptables configured to mark the traffic to VIP with the
> "100" marker.
>
> 3. At the beginning of the test,
> IPVS on S1 configuration is:
> -A -f 100 -s rr -p 1
> -a -f 100 -r S1:0 -g -w 100
> -a -f 100 -r S2:0 -g -w 100
>
> IPVS on S2 configuration is:
> -A -f 100 -s rr -p 1
> -a -f 100 -r S1:0 -g -w 0
> -a -f 100 -r S2:0 -g -w 100
>
> We establish test connections from client systems to the port 110 of VIP, the
> S1 routes one connection to itself, the other one to S2. Both connections are
> alive and well, the connection tables on both systems are the same due to ipvs
> syncing daemons:
>
> ipvsadm -l -c -n:
> IP  00:49 NONE        C1:0     0.0.0.100:0 S1:0
> TCP 14:49 ESTABLISHED C1:54837 VIP:110     S1:110
> TCP 14:43 ESTABLISHED C2:54648 VIP:110     S2:110
> IP  00:43 NONE        C2:0     0.0.0.100:0 S2:0
>
> Now, we initiate a failover, so S2 becomes the active balancer.
> The IPVS rules on S2 are updated, so they become the same as they were on S1:
> -A -f 100 -s rr -p 1
> -a -f 100 -r S1:0 -g -w 100
> -a -f 100 -r S2:0 -g -w 100
>
> And S1 gets the same config as S2 used before the failover.
>
> All the connections that existed on S2 before failover continue to work.

No loop? Because S2 will receive DR method in
the SYNC messages for its stack. You see traffic flow
or just that there is no reset?

> But the connections that existed on S1 are closed as soon as the client sends

This is fixed by the mentioned change:
"ipvs: changes for local real server"

Before this change the SYNC messages indicate
LOCALNODE as forwarding method. So, S2 thinks that the
conns for S1 stack are LOCALNODE, S2 delivers them locally
and they are reset.

> any data to that connection. The tcpdump on S1 does not show any incoming
> packets, and tcpdump on S2 shows that it's S2 itself (a new load balancer)
> that closes these connections (the data the client has sent was "HELP\r\n" - 6
> bytes):
>
> 07:54:59.214200 IP (tos 0x10, ttl 54, id 20406, offset 0, flags [DF], proto
> TCP (6), length 58)
> C1.54837 > VIP.110: Flags [P.], cksum 0xba0d (correct), seq
> 3572724860:3572724866, ack 1018696840, win 33304, options [nop,nop,TS val
> 3371318384 ecr 1243703099], length 6
> 07:54:59.214253 IP (tos 0x10, ttl 64, id 0, offset 0, flags [DF], proto TCP
> (6), length 40)
> VIP.110 > C1.54837: Flags [R], cksum 0x5767 (correct), seq 1018696840, win
> 0, length 0
>
> What can cause the new load balancer to reset (Flags [R.]) the existing
> connections to the "old" balancer?

The wrong forwarding type. That is why we now
save the original forwarding type (DR), we sync it and
finally when packet is transmitted if the destination is
local address we deliver it to local stack. We do not
send LOCALNODE type in messages to backups anymore, we
removed this forwarding type.

> New connections now work fine, being distributed by the new load balancer to
> itself and to the old balancer.

Regards

--
Julian Anastasov <ja@ssi.bg>

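The forwarding-type explanation in the reply above can be sketched as a small Python model. This is an illustration only, not kernel code: the names `Fwd`, `method_in_sync_message` and `outcome_on_new_master` are hypothetical, and the model covers just the two behaviours described in the reply (LOCALNODE reported in SYNC messages before the "ipvs: changes for local real server" commit, the original DR type kept after it). The packet-loop case on a backup without rules is not modelled here.

```python
from enum import Enum


class Fwd(Enum):
    """Forwarding methods mentioned in the thread (hypothetical model type)."""
    DR = "DR"                # direct routing, "-g" in the quoted rules
    LOCALNODE = "LOCALNODE"  # deliver to the balancer's own stack


def method_in_sync_message(owner_is_master: bool, has_fix: bool) -> Fwd:
    """Method the master reports for a connection in its SYNC message.

    Before the fix, connections served by the master itself were reported
    as LOCALNODE; with the fix the original type (DR) is kept.
    """
    if owner_is_master and not has_fix:
        return Fwd.LOCALNODE
    return Fwd.DR


def outcome_on_new_master(method: Fwd, owner: str, new_master: str) -> str:
    """What happens to a client packet once the former backup is master.

    DR towards a local address is treated as local delivery, which is the
    post-fix behaviour described in the reply.
    """
    deliver_locally = method is Fwd.LOCALNODE or owner == new_master
    if not deliver_locally:
        return f"forwarded to {owner}"
    # Local delivery only works if the TCP socket really lives on this host.
    return "delivered locally" if owner == new_master else "reset"


# Connection C1 -> VIP:110 is served by S1; S1 was master, S2 takes over.
for has_fix in (False, True):
    method = method_in_sync_message(owner_is_master=True, has_fix=has_fix)
    print(has_fix, method.name, outcome_on_new_master(method, "S1", "S2"))
```

In this model the S1-owned connection is reset on S2 only when the SYNC message carried LOCALNODE, which matches the tcpdump quoted above where S2 itself answers with a reset.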
* Re: Multiple load balancers problem
2012-08-27 16:13 ` Julian Anastasov
@ 2012-08-27 20:24 ` Dmitry Akindinov
2012-08-28 7:21 ` Julian Anastasov
0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Akindinov @ 2012-08-27 20:24 UTC (permalink / raw)
To: Julian Anastasov; +Cc: lvs-devel

Hello,

On 2012-08-27 20:13, Julian Anastasov wrote:
>
> Hello,
>
> On Mon, 27 Aug 2012, Dmitry Akindinov wrote:
>
>> Hello,
>>
>> Sorry for top posting: doing this to avoid clutter below.
>>
>> Thank you for your assistance. Let me summarize the current situation,
>> after all the changes and testing.
>>
>> 1. The test system consists of two servers, S1 and S2. Both running CentOS
>> 6.0:
>>
>> [root@fm1 ~]# uname -a
>> Linux fm1.***.com 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> We are now setting up new boxes (CentOS 6.3) to re-test with newer kernels
>
> The last bug fixes that can be needed for sync
> are from April 2012.
>
>> 2. Both systems have iptables configured to mark the traffic to VIP with the
>> "100" marker.
>>
>> 3. At the beginning of the test,
>> IPVS on S1 configuration is:
>> -A -f 100 -s rr -p 1
>> -a -f 100 -r S1:0 -g -w 100
>> -a -f 100 -r S2:0 -g -w 100
>>
>> IPVS on S2 configuration is:
>> -A -f 100 -s rr -p 1
>> -a -f 100 -r S1:0 -g -w 0
>> -a -f 100 -r S2:0 -g -w 100
>>
>> We establish test connections from client systems to the port 110 of VIP, the
>> S1 routes one connection to itself, the other one to S2. Both connections are
>> alive and well, the connection tables on both systems are the same due to ipvs
>> syncing daemons:
>>
>> ipvsadm -l -c -n:
>> IP  00:49 NONE        C1:0     0.0.0.100:0 S1:0
>> TCP 14:49 ESTABLISHED C1:54837 VIP:110     S1:110
>> TCP 14:43 ESTABLISHED C2:54648 VIP:110     S2:110
>> IP  00:43 NONE        C2:0     0.0.0.100:0 S2:0
>>
>> Now, we initiate a failover, so S2 becomes the active balancer.
>> The IPVS rules on S2 are updated, so they become the same as they were on S1:
>> -A -f 100 -s rr -p 1
>> -a -f 100 -r S1:0 -g -w 100
>> -a -f 100 -r S2:0 -g -w 100
>>
>> And S1 gets the same config as S2 used before the failover.
>>
>> All the connections that existed on S2 before failover continue to work.
>
> No loop? Because S2 will receive DR method in
> the SYNC messages for its stack. You see traffic flow
> or just that there is no reset?

Yes, after the failover, all connections that were open on the S2 (which
now becomes the active balancer) do not reset and continue to function
just fine (traffic flow in both directions).

We do understand (we think) your explanation about the sync table
problem - the connections to the actual balancer are marked in a
special way, which causes problems when these connection table records
are used on a new balancer.

It looks like updating the kernels is the only way, if (as you and Hans
Schillstrom outlines) even the latest CentOS/RedHat kernels do not
contain necessary patches. It's a pity, as the idea was to provide an
"out of the box" solution for our customers, and asking them to update
to some Linux kernel is not what they like to hear.

But thank you very much in any case, we will update the kernels on our
test systems and will see if this (last?) problem disappears.

>> But the connections that existed on S1 are closed as soon as the client sends
>
> This is fixed by the mentioned change:
> "ipvs: changes for local real server"
>
> Before this change the SYNC messages indicate
> LOCALNODE as forwarding method. So, S2 thinks that the
> conns for S1 stack are LOCALNODE, S2 delivers them locally
> and they are reset.
>
>> any data to that connection. The tcpdump on S1 does not show any incoming
>> packets, and tcpdump on S2 shows that it's S2 itself (a new load balancer)
>> that closes these connections (the data the client has sent was "HELP\r\n" - 6
>> bytes):
>>
>> 07:54:59.214200 IP (tos 0x10, ttl 54, id 20406, offset 0, flags [DF], proto
>> TCP (6), length 58)
>> C1.54837 > VIP.110: Flags [P.], cksum 0xba0d (correct), seq
>> 3572724860:3572724866, ack 1018696840, win 33304, options [nop,nop,TS val
>> 3371318384 ecr 1243703099], length 6
>> 07:54:59.214253 IP (tos 0x10, ttl 64, id 0, offset 0, flags [DF], proto TCP
>> (6), length 40)
>> VIP.110 > C1.54837: Flags [R], cksum 0x5767 (correct), seq 1018696840, win
>> 0, length 0
>>
>> What can cause the new load balancer to reset (Flags [R.]) the existing
>> connections to the "old" balancer?
>
> The wrong forwarding type. That is why we now
> save the original forwarding type (DR), we sync it and
> finally when packet is transmitted if the destination is
> local address we deliver it to local stack. We do not
> send LOCALNODE type in messages to backups anymore, we
> removed this forwarding type.
>
>> New connections now work fine, being distributed by the new load balancer to
>> itself and to the old balancer.
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
>

--
Best regards,
Dmitry Akindinov

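The two rule sets quoted in the message above differ only in the weight given to the peer server, so they can be derived from the host's role. The following Python sketch generates the rule lines exactly as they are quoted in the thread; the function name `ipvs_rules` and its parameters are hypothetical, and the thread does not say how CommuniGate Pro actually passes these lines to ipvsadm.

```python
from typing import List


def ipvs_rules(servers: List[str], local: str, active: bool,
               fwmark: int = 100) -> List[str]:
    """Build the rule lines quoted in the thread for one host.

    An active balancer gives every real server weight 100. A backup keeps
    the same rules but gives weight 0 to every server except itself, so it
    never schedules traffic to another host.
    """
    lines = [f"-A -f {fwmark} -s rr -p 1"]
    for server in servers:
        weight = 100 if active or server == local else 0
        lines.append(f"-a -f {fwmark} -r {server}:0 -g -w {weight}")
    return lines


# S2 before the failover (backup), then after it (active).
for active in (False, True):
    print("\n".join(ipvs_rules(["S1", "S2"], local="S2", active=active)))
    print()
```

Keeping the backup's rule set in place with zero weights, rather than empty, follows the workaround described earlier in the thread.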
* Re: Multiple load balancers problem
2012-08-27 20:24 ` Dmitry Akindinov
@ 2012-08-28 7:21 ` Julian Anastasov
0 siblings, 0 replies; 13+ messages in thread
From: Julian Anastasov @ 2012-08-28 7:21 UTC (permalink / raw)
To: Dmitry Akindinov; +Cc: lvs-devel

Hello,

On Tue, 28 Aug 2012, Dmitry Akindinov wrote:

> > No loop? Because S2 will receive DR method in
> > the SYNC messages for its stack. You see traffic flow
> > or just that there is no reset?
>
> Yes, after the failover, all connections that were open on the S2 (which now
> becomes the active balancer) do not reset and continue to function just fine
> (traffic flow in both directions).

Aha, I now see why, ip_vs_bind_dest assigns the
forwarding mode from real server, not from SYNC messages.
As you have LOCALNODE when S2 is slave, the DR mode from
SYNC message is not used, that is why after failover the
S2 connections work and are not reset.

So, at least, we have explanation for all things
that happen. If you run slave without any rules the
connection will use the mode from master and you will see
packet loop.

> --
> Best regards,
> Dmitry Akindinov

Regards

--
Julian Anastasov <ja@ssi.bg>

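The point about ip_vs_bind_dest in the reply above can be illustrated with a tiny Python model. It only covers a synced connection whose real server is the backup itself, on the old kernel discussed in the thread, and it is a simplification rather than the kernel logic; the function names are hypothetical.

```python
def method_for_own_connection(synced_method: str, has_local_rule: bool) -> str:
    """Forwarding mode the backup uses for a synced connection to itself.

    With a local real-server rule the mode comes from that rule (LOCALNODE
    on the old kernel); without any rule the mode from the SYNC message
    is used unchanged.
    """
    return "LOCALNODE" if has_local_rule else synced_method


def symptom(method: str) -> str:
    """Observed behaviour for a connection that terminates on this host."""
    return "works" if method == "LOCALNODE" else "packet loop"


# S2 as slave, receiving a DR connection record for its own stack.
for has_rule in (True, False):
    method = method_for_own_connection("DR", has_local_rule=has_rule)
    print(has_rule, method, symptom(method))
```

This is why the two setups tried in the thread behaved differently: the backup with an empty rule set lost packets, while the backup with a local rule kept its own connections alive.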
* Re[2]: Multiple load balancers problem
@ 2012-08-27 20:43 Hans Schillstrom
2012-08-30 17:24 ` Dmitry Akindinov
0 siblings, 1 reply; 13+ messages in thread
From: Hans Schillstrom @ 2012-08-27 20:43 UTC (permalink / raw)
To: Dmitry Akindinov; +Cc: Julian Anastasov, lvs-devel
Hello
[snip]
>>
>> No loop? Because S2 will receive DR method in
>> the SYNC messages for its stack. You see traffic flow
>> or just that there is no reset?
>
>Yes, after the failover, all connections that were open on the S2 (which
>now becomes the active balancer) do not reset and continue to function
>just fine (traffic flow in both directions).
>
>We do understand (we think) your explanation about the sync table
>problem - the connections to the actual balancer are marked in a
>special way, which causes problems when these connection table records
>are used on a new balancer.
>
>It looks like updating the kernels is the only way, if (as you and Hans
>Schillstrom outlines) even the latest CentOS/RedHat kernels do not
>contain necessary patches. It's a pity, as the idea was to provide an
>"out of the box" solution for our customers, and asking them to update
>to some Linux kernel is not what they like to hear.
>
>But thank you very much in any case, we will update the kernels on our
>test systems and will see if this (last?) problem disappears.
>
There are newer kernels floating around in the 3.x range.
Have a look at the ELRepo Project; there you can find a 3.5.x kernel
that has all necessary changes.
BR
Hans
* Re: Multiple load balancers problem
2012-08-27 20:43 Re[2]: " Hans Schillstrom
@ 2012-08-30 17:24 ` Dmitry Akindinov
2012-08-30 20:00 ` Julian Anastasov
0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Akindinov @ 2012-08-30 17:24 UTC (permalink / raw)
To: Hans Schillstrom; +Cc: Julian Anastasov, lvs-devel

Hello,

On 2012-08-28 00:43, Hans Schillstrom wrote:
> Hello
>
> [snip]
>
>>>
>>> No loop? Because S2 will receive DR method in
>>> the SYNC messages for its stack. You see traffic flow
>>> or just that there is no reset?
>>
>> Yes, after the failover, all connections that were open on the S2 (which
>> now becomes the active balancer) do not reset and continue to function
>> just fine (traffic flow in both directions).
>>
>> We do understand (we think) your explanation about the sync table
>> problem - the connections to the actual balancer are marked in a
>> special way, which causes problems when these connection table records
>> are used on a new balancer.
>>
>> It looks like updating the kernels is the only way, if (as you and Hans
>> Schillstrom outlines) even the latest CentOS/RedHat kernels do not
>> contain necessary patches. It's a pity, as the idea was to provide an
>> "out of the box" solution for our customers, and asking them to update
>> to some Linux kernel is not what they like to hear.
>>
>> But thank you very much in any case, we will update the kernels on our
>> test systems and will see if this (last?) problem disappears.
>>
>
> There are newer kernels floating around in the 3.x range.
> Have a look at the ELRepo Project; there you can find a 3.5.x kernel
> that has all necessary changes.

Installed CentOS 6.3 and the kernel 3.5.3 from elrepo.org

[root@fe3 ~]# uname -a
Linux fe3.msk 3.5.3-1.el6.elrepo.x86_64 #1 SMP Sun Aug 26 14:05:15 EDT
2012 x86_64 x86_64 x86_64 GNU/Linux


But we still see the same problem: connections to the old balancer are
RSET'ed when the new balancer takes over. Any idea which particular
kernel we need to use? Or, do we need to apply some specific patches and
build our own kernel?

Thank you for all your help so far!

> BR
> Hans
>

--
Best regards,
Dmitry Akindinov

* Re: Multiple load balancers problem
2012-08-30 17:24 ` Dmitry Akindinov
@ 2012-08-30 20:00 ` Julian Anastasov
0 siblings, 0 replies; 13+ messages in thread
From: Julian Anastasov @ 2012-08-30 20:00 UTC (permalink / raw)
To: Dmitry Akindinov; +Cc: Hans Schillstrom, lvs-devel

Hello,

On Thu, 30 Aug 2012, Dmitry Akindinov wrote:

> Installed CentOS 6.3 and the kernel 3.5.3 from elrepo.org
>
> [root@fe3 ~]# uname -a
> Linux fe3.msk 3.5.3-1.el6.elrepo.x86_64 #1 SMP Sun Aug 26 14:05:15 EDT 2012
> x86_64 x86_64 x86_64 GNU/Linux
>
>
> But we still see the same problem: connections to the old balancer are RSET'ed
> when the new balancer takes over. Any idea which particular kernel we need to
> use? Or, do we need to apply some specific patches and build our own kernel?

This kernel has everything you need. Is it installed
on both machines? Do you see that new balancer forwards
traffic to old balancer, i.e. who exactly sends the
resets? Old or new balancer?

> Thank you for all your help so far!
>
> > BR
> > Hans
> >
>
> --
> Best regards,
> Dmitry Akindinov

Regards

--
Julian Anastasov <ja@ssi.bg>

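The question "Is it installed on both machines?" is easy to check mechanically. Below is a minimal Python sketch that compares `uname -r` style release strings collected from each balancer. The helper names are hypothetical, and the default minimum of 3.5.3 is an assumption taken only from the kernel the thread confirms as sufficient; earlier versions or distribution kernels with backported patches may also work, which a version comparison cannot detect.

```python
import re
from typing import Dict, List, Tuple


def kernel_version(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def hosts_below(releases: Dict[str, str],
                minimum: Tuple[int, int, int] = (3, 5, 3)) -> List[str]:
    """Return the hosts whose kernel is older than `minimum` (assumed value)."""
    return sorted(host for host, release in releases.items()
                  if kernel_version(release) < minimum)


# Release strings as they appear in the thread; pairing them as one
# cluster is for illustration only.
releases = {
    "fe3": "3.5.3-1.el6.elrepo.x86_64",
    "fm1": "2.6.32-71.el6.x86_64",
}
print(hosts_below(releases))
```

A non-empty result means at least one balancer still runs an older kernel, which is the mixed situation the question above is probing for.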
* Re[2]: Multiple load balancers problem
@ 2012-08-31 8:21 Hans Schillstrom
2012-09-03 7:54 ` Dmitry Akindinov
0 siblings, 1 reply; 13+ messages in thread
From: Hans Schillstrom @ 2012-08-31 8:21 UTC (permalink / raw)
To: Dmitry Akindinov; +Cc: Julian Anastasov, lvs-devel

>
>Hello,
>
>On 2012-08-28 00:43, Hans Schillstrom wrote:
>> Hello
>>
>> [snip]
>>
>>>>
>>>> No loop? Because S2 will receive DR method in
>>>> the SYNC messages for its stack. You see traffic flow
>>>> or just that there is no reset?
>>>
>>> Yes, after the failover, all connections that were open on the S2 (which
>>> now becomes the active balancer) do not reset and continue to function
>>> just fine (traffic flow in both directions).
>>>
>>> We do understand (we think) your explanation about the sync table
>>> problem - the connections to the actual balancer are marked in a
>>> special way, which causes problems when these connection table records
>>> are used on a new balancer.
>>>
>>> It looks like updating the kernels is the only way, if (as you and Hans
>>> Schillstrom outlines) even the latest CentOS/RedHat kernels do not
>>> contain necessary patches. It's a pity, as the idea was to provide an
>>> "out of the box" solution for our customers, and asking them to update
>>> to some Linux kernel is not what they like to hear.
>>>
>>> But thank you very much in any case, we will update the kernels on our
>>> test systems and will see if this (last?) problem disappears.
>>>
>>
>> There are newer kernels floating around in the 3.x range.
>> Have a look at the ELRepo Project; there you can find a 3.5.x kernel
>> that has all necessary changes.
>
>Installed CentOS 6.3 and the kernel 3.5.3 from elrepo.org
>
>[root@fe3 ~]# uname -a
>Linux fe3.msk 3.5.3-1.el6.elrepo.x86_64 #1 SMP Sun Aug 26 14:05:15 EDT
>2012 x86_64 x86_64 x86_64 GNU/Linux
>
>
>But we still see the same problem: connections to the old balancer are
>RSET'ed when the new balancer takes over. Any idea which particular
>kernel we need to use? Or, do we need to apply some specific patches and
>build our own kernel?

A silly question, old and new in this case
I hope it's a 3.5.x kernel in both cases ...

>
>Thank you for all your help so far!
>
>> BR
>> Hans
>>
>
>--
>Best regards,
>Dmitry Akindinov

* Re: Multiple load balancers problem
2012-08-31 8:21 Re[2]: " Hans Schillstrom
@ 2012-09-03 7:54 ` Dmitry Akindinov
0 siblings, 0 replies; 13+ messages in thread
From: Dmitry Akindinov @ 2012-09-03 7:54 UTC (permalink / raw)
To: Hans Schillstrom; +Cc: Julian Anastasov, lvs-devel

Hello,

On 2012-08-31 12:21, Hans Schillstrom wrote:
>>
>> Hello,
>>
>> On 2012-08-28 00:43, Hans Schillstrom wrote:
>>> Hello
>>>
>>> [snip]
>>>
>>>>>
>>>>> No loop? Because S2 will receive DR method in
>>>>> the SYNC messages for its stack. You see traffic flow
>>>>> or just that there is no reset?
>>>>
>>>> Yes, after the failover, all connections that were open on the S2 (which
>>>> now becomes the active balancer) do not reset and continue to function
>>>> just fine (traffic flow in both directions).
>>>>
>>>> We do understand (we think) your explanation about the sync table
>>>> problem - the connections to the actual balancer are marked in a
>>>> special way, which causes problems when these connection table records
>>>> are used on a new balancer.
>>>>
>>>> It looks like updating the kernels is the only way, if (as you and Hans
>>>> Schillstrom outlines) even the latest CentOS/RedHat kernels do not
>>>> contain necessary patches. It's a pity, as the idea was to provide an
>>>> "out of the box" solution for our customers, and asking them to update
>>>> to some Linux kernel is not what they like to hear.
>>>>
>>>> But thank you very much in any case, we will update the kernels on our
>>>> test systems and will see if this (last?) problem disappears.
>>>>
>>>
>>> There are newer kernels floating around in the 3.x range.
>>> Have a look at the ELRepo Project; there you can find a 3.5.x kernel
>>> that has all necessary changes.
>>
>> Installed CentOS 6.3 and the kernel 3.5.3 from elrepo.org
>>
>> [root@fe3 ~]# uname -a
>> Linux fe3.msk 3.5.3-1.el6.elrepo.x86_64 #1 SMP Sun Aug 26 14:05:15 EDT
>> 2012 x86_64 x86_64 x86_64 GNU/Linux
>>
>>
>> But we still see the same problem: connections to the old balancer are
>> RSET'ed when the new balancer takes over. Any idea which particular
>> kernel we need to use? Or, do we need to apply some specific patches and
>> build our own kernel?
>
> A silly question, old and new in this case
> I hope it's a 3.5.x kernel in both cases ...

Yes, sure. :-) The kernels were upgraded on both frontends.

And in fact, that helped. My previous report that the problem was still
observed with the new kernel - well, it was not correct. The test
environment on virtual machines was not set up correctly.

We will do more extensive testing this week, but for now it appears that
the problems we reported initially went away when we upgraded the kernel
to 3.5.3.

Thank you!

--
Best regards,
Dmitry Akindinov

end of thread, other threads:[~2012-09-03 7:54 UTC | newest]

Thread overview: 13+ messages
2012-08-25 7:37 Multiple load balancers problem Dmitry Akindinov
2012-08-25 10:13 ` Dmitry Akindinov
2012-08-25 11:53 ` Julian Anastasov
2012-08-27 8:02 ` Dmitry Akindinov
2012-08-27 11:17 ` Julian Anastasov
2012-08-27 15:15 ` Dmitry Akindinov
2012-08-27 15:27 ` Dmitry Akindinov
2012-08-27 16:13 ` Julian Anastasov
2012-08-27 20:24 ` Dmitry Akindinov
2012-08-28 7:21 ` Julian Anastasov
-- strict thread matches above, loose matches on Subject: below --
2012-08-27 20:43 Re[2]: " Hans Schillstrom
2012-08-30 17:24 ` Dmitry Akindinov
2012-08-30 20:00 ` Julian Anastasov
2012-08-31 8:21 Re[2]: " Hans Schillstrom
2012-09-03 7:54 ` Dmitry Akindinov