* Multiple load balancers problem
@ 2012-08-25  7:37 Dmitry Akindinov
  2012-08-25 10:13 ` Dmitry Akindinov
  2012-08-25 11:53 ` Julian Anastasov
  0 siblings, 2 replies; 13+ messages in thread
From: Dmitry Akindinov @ 2012-08-25  7:37 UTC (permalink / raw)
  To: lvs-devel

Hello,

We are currently stuck with the following ipvs problem:

1. The configuration includes a (potentially large) set of servers 
providing various services besides HTTP (POP, IMAP, LDAP, SMTP, XMPP, 
etc.). The test setup includes just 2 servers, though.
2. Each server runs a stock version of CentOS 6.0
3. The application software (CommuniGate Pro) controls the ipvs kernel 
module using the ipvsadm commands.
4. On each server, iptables are configured to:
   a) disable connection tracking for VIP address(es)
   b) mark all packets coming to the VIP address(es) with the mark value 
of 100.
5. On the currently active load balancer, ipvsadm is used to 
configure ipvs to load-balance packets carrying the mark 100:
-A -f 100 -s rr -p 1
-a -f 100 -r <server1> -g
-a -f 100 -r <server2> -g
....
where the active balancer itself is one of the <serverN>
6. All other servers (just 1 "other" server in our test config) are 
running ipvs, but with an empty rule set.
7. The active load balancer runs the sync daemon started with ipvsadm 
--start-daemon master
8. All other servers run the sync daemon started with ipvsadm 
--start-daemon backup.
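
A minimal sketch of the iptables, ipvsadm and sync-daemon steps listed above, written as a dry run that only prints the commands. The VIP address, the real-server addresses, and the choice of the raw table with NOTRACK and the mangle table with MARK are assumptions for illustration; the list above only says that connection tracking is disabled and that packets are marked with 100.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# VIP and the real-server addresses are placeholder assumptions.
VIP="192.0.2.10"
FWMARK=100

# On every server: skip conntrack for the VIP and mark its packets.
# (Assumed tables/targets: raw + NOTRACK, mangle + MARK.)
emit_iptables_rules() {
    echo "iptables -t raw -A PREROUTING -d $VIP -j NOTRACK"
    echo "iptables -t mangle -A PREROUTING -d $VIP -j MARK --set-mark $FWMARK"
}

# On the active balancer: a round-robin fwmark service with a
# 1-second persistence timeout and direct-routing (-g) real servers.
emit_ipvs_rules() {
    echo "ipvsadm -A -f $FWMARK -s rr -p 1"
    for rs in "$@"; do
        echo "ipvsadm -a -f $FWMARK -r $rs -g"
    done
}

# Sync daemon role: "master" on the active balancer, "backup" elsewhere.
emit_sync_daemon() {
    echo "ipvsadm --start-daemon $1"
}
```

For example, `emit_ipvs_rules 10.0.0.1 10.0.0.2` prints the service line followed by one real-server line per address; piping the output to `sh` as root would apply it.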

As a result, all servers hold duplicate copies of the ipvs connection 
table. If the active balancer fails, some other server assumes its role 
by broadcasting ARP for the VIP and loading the ipvs rule set listed above.
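
The takeover described above could be sketched roughly as follows, again as a dry run. The interface name, the addresses and the use of iputils `arping -U` for the ARP announcement are assumptions; the message does not say which tool the application uses.

```shell
#!/bin/sh
# Dry-run sketch: prints the takeover commands instead of running them.
# VIP, IFACE and the real-server addresses are placeholder assumptions.
VIP="192.0.2.10"
IFACE="eth0"
FWMARK=100

emit_takeover() {
    # Announce the VIP from this host with unsolicited (gratuitous) ARP.
    echo "arping -U -c 3 -I $IFACE $VIP"
    # Load the same fwmark service the failed balancer was running.
    echo "ipvsadm -A -f $FWMARK -s rr -p 1"
    for rs in "$@"; do
        echo "ipvsadm -a -f $FWMARK -r $rs -g"
    done
}
```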

When a connection to the VIP address is established and the active 
load balancer directs it to itself, everything works fine.
When a connection to the VIP address is established and the active 
load balancer directs it to some other server, the connection is 
established fine, and if the protocol is POP, IMAP, or SMTP, the server 
prompt is sent to the client via the VIP and the client sees it just fine.
But when the client tries to send anything to the server, the packet 
(according to tcpdump) reaches the load balancer, and from there it 
reaches the "other" server, where it is dropped. The client resends 
the packet, it goes to the active balancer, then to the "other" server, 
and it is dropped again.


Observations:
*) If ipvs is switched off on that "other" server (service ipvsadm 
stop), everything works just fine.

*) If ipvs is left running on that "other" server but the sync daemon 
is switched off, everything works just fine.
We are 95% sure that the problem appears only if the "other" server's 
ipvs connection table gets a copy of this connection from the active 
balancer. If the copy is not there (the sync daemon was stopped when 
the connection was established, and restarted immediately after), 
everything works just fine.

*) the problem exists for protocols like POP, IMAP, SMTP - where the 
server immediately sends some data (prompt) to the client, as soon as 
the connection is established.
When the HTTP protocol is used, the problem does not exist, but only if 
the entire request is sent as one packet. If the HTTP connection is a 
"keep-alive" one, subsequent requests in the same connection do not 
reach the application either.
I.e. it looks like the "idling" ipvs allows only one incoming data 
packet in, and only if there has been no outgoing packet on that 
connection yet.

*) Sometimes (we still cannot reproduce this reliably) the ksoftirqd 
threads on the "other" server jump to 100% CPU utilization, and when 
this happens, it happens in reaction to a single connection being 
established.
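
One way to check the observation about synced copies is to look for the client's entry in the connection table on the "other" server. The helper below is a sketch: `filter_conn_entries` is a hypothetical name, the client address is a placeholder, and it assumes the usual column layout of `ipvsadm -L -n -c` (pro, expire, state, source, virtual, destination).

```shell
#!/bin/sh
# Sketch: keep only connection entries whose source address is the
# given client IP. Reads `ipvsadm -L -n -c` style output on stdin and
# assumes the source "ip:port" is the 4th whitespace-separated column.
filter_conn_entries() {
    awk -v ip="$1" 'index($4, ip ":") == 1'
}
```

Typical use on the backup would be `ipvsadm -L -n -c | filter_conn_entries 198.51.100.7`; an empty result suggests that no copy of the connection was received from the master.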

Received suggestions:
*) It was suggested that we use iptables to filter the packets to the 
VIP that come from other servers in the farm (using their MAC addresses) 
and pass them directly to the local application, bypassing ipvs 
processing. We cannot do that, as servers can be added to the farm at 
any moment, and updating the list of MACs on all servers is not trivial. 
It may be easier to filter the packets that come from the router(s), 
which are less numerous and do not change that often.
But it does not look like a good solution. If the ipvs table on an 
"inactive" balancer drops packets, why would it stop dropping them when 
it becomes an "active" balancer? Just because there will be ipvs rules 
present?

*) The suggestion to separate load balancer(s) and real servers won't 
work for us at all.

*) We tried not emptying the ipvs rule set on the "other" server(s). 
Instead, we left it balancing, but with only one "real server": this 
server itself. Now, the "active" load balancer distributes packets to 
itself and other servers, and when the packets hit the "other" 
server(s), they get to ipvs again, where they are balanced again, 
but to the local server only.

It looks like this does solve the problem. But now the ipvs connection 
table on the "other" server(s) is filled both by that server's own ipvs 
and by the sync daemon. While the locally generated connection table 
entries should be the same as the corresponding entries received via the 
sync daemon, it does not look good when the same table is modified from 
two sources.
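
A sketch of that last workaround, as a dry run: each non-active server keeps the same fwmark service but lists only itself as a real server. The address passed in is a placeholder, and reusing the scheduler and persistence options of the active balancer is an assumption.

```shell
#!/bin/sh
# Dry-run sketch: prints the rule set for a non-active server that
# balances only to itself. SELF is passed as the first argument.
FWMARK=100

emit_local_only_rules() {
    self="$1"
    # Same fwmark service as on the active balancer (assumed options).
    echo "ipvsadm -A -f $FWMARK -s rr -p 1"
    # Single real server: this host, direct routing.
    echo "ipvsadm -a -f $FWMARK -r $self -g"
}
```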

Any comments, please? Should we use the last suggestion?


-- 
Best regards,
Dmitry Akindinov

* Re[2]:  Multiple load balancers problem
@ 2012-08-27 20:43 Hans Schillstrom
  2012-08-30 17:24 ` Dmitry Akindinov
  0 siblings, 1 reply; 13+ messages in thread
From: Hans Schillstrom @ 2012-08-27 20:43 UTC (permalink / raw)
  To: Dmitry Akindinov; +Cc: Julian Anastasov, lvs-devel

Hello

[snip]

>>
>> 	No loop? Because S2 will receive DR method in
>> the SYNC messages for its stack. You see traffic flow
>> or just that there is no reset?
>
>Yes, after the failover, all connections that were open on the S2 (which 
>now becomes the active balancer) do not reset and continue to function 
>just fine (traffic flow in both directions).
>
>We do understand (we think) your explanation about the sync table 
>problem  - the connections to the actual balancer are marked in a 
>special way, which causes problems when these connection table records 
>are used on a new balancer.
>
>It looks like updating the kernels is the only way, if (as you and Hans 
>Schillstrom outlined) even the latest CentOS/RedHat kernels do not 
>contain necessary patches. It's a pity, as the idea was to provide an 
>"out of the box" solution for our customers, and asking them to update 
>to some Linux kernel is not what they like to hear.
>
>But thank you very much in any case, we will update the kernels on our 
>test systems and will see if this (last?) problem disappears.
>

There are newer kernels floating around in the 3.x range.
Have a look at the ELRepo Project; there you can find a 3.5.x kernel
that has all the necessary changes.

BR
Hans 


* Re[2]:  Multiple load balancers problem
@ 2012-08-31  8:21 Hans Schillstrom
  2012-09-03  7:54 ` Dmitry Akindinov
  0 siblings, 1 reply; 13+ messages in thread
From: Hans Schillstrom @ 2012-08-31  8:21 UTC (permalink / raw)
  To: Dmitry Akindinov; +Cc: Julian Anastasov, lvs-devel

>
>Hello,
>
>On 2012-08-28 00:43, Hans Schillstrom wrote:
>> Hello
>>
>> [snip]
>>
>>>>
>>>> 	No loop? Because S2 will receive DR method in
>>>> the SYNC messages for its stack. You see traffic flow
>>>> or just that there is no reset?
>>>
>>> Yes, after the failover, all connections that were open on the S2 (which
>>> now becomes the active balancer) do not reset and continue to function
>>> just fine (traffic flow in both directions).
>>>
>>> We do understand (we think) your explanation about the sync table
>>> problem  - the connections to the actual balancer are marked in a
>>> special way, which causes problems when these connection table records
>>> are used on a new balancer.
>>>
>>> It looks like updating the kernels is the only way, if (as you and Hans
>>> Schillstrom outlined) even the latest CentOS/RedHat kernels do not
>>> contain necessary patches. It's a pity, as the idea was to provide an
>>> "out of the box" solution for our customers, and asking them to update
>>> to some Linux kernel is not what they like to hear.
>>>
>>> But thank you very much in any case, we will update the kernels on our
>>> test systems and will see if this (last?) problem disappears.
>>>
>>
>> There are newer kernels floating around in the 3.x range.
>> Have a look at the ELRepo Project; there you can find a 3.5.x kernel
>> that has all the necessary changes.
>
>Installed CentOS 6.3 and the kernel 3.5.3 from elrepo.org
>
>[root@fe3 ~]# uname -a
>Linux fe3.msk 3.5.3-1.el6.elrepo.x86_64 #1 SMP Sun Aug 26 14:05:15 EDT 
>2012 x86_64 x86_64 x86_64 GNU/Linux
>
>
>But we still see the same problem: connections to the old balancer are 
>RST'ed when the new balancer takes over. Any idea which particular 
>kernel we need to use? Or do we need to apply some specific patches and 
>build our own kernel?

A silly question about "old" and "new" in this case:
I hope it's a 3.5.x kernel on both balancers ...


>
>Thank you for all your help so far!
>
>> BR
>> Hans
>>
>
>-- 
>Best regards,
>Dmitry Akindinov


Thread overview: 13+ messages
2012-08-25  7:37 Multiple load balancers problem Dmitry Akindinov
2012-08-25 10:13 ` Dmitry Akindinov
2012-08-25 11:53 ` Julian Anastasov
2012-08-27  8:02   ` Dmitry Akindinov
2012-08-27 11:17     ` Julian Anastasov
2012-08-27 15:15       ` Dmitry Akindinov
2012-08-27 15:27         ` Dmitry Akindinov
2012-08-27 16:13         ` Julian Anastasov
2012-08-27 20:24           ` Dmitry Akindinov
2012-08-28  7:21             ` Julian Anastasov
  -- strict thread matches above, loose matches on Subject: below --
2012-08-27 20:43 Re[2]: " Hans Schillstrom
2012-08-30 17:24 ` Dmitry Akindinov
2012-08-30 20:00   ` Julian Anastasov
2012-08-31  8:21 Re[2]: " Hans Schillstrom
2012-09-03  7:54 ` Dmitry Akindinov
