[Cluster-devel] [PATCH] Fix pacemaker's wrong quorum view in a CMAN+pacemaker cluster


From: Simone Gotti <simone.gotti@gmail.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH] Fix pacemaker's wrong quorum view in a CMAN+pacemaker cluster
Date: Mon, 14 Mar 2011 16:37:43 +0100
Message-ID: <4D7E3647.9060403@gmail.com>
In-Reply-To: <AANLkTi=_Osyg5bLYdyAr4z7jZ94q66g+8S=A7XmTeAut@mail.gmail.com>

On 03/14/2011 11:41 AM, Andrew Beekhof wrote:
> On Mon, Mar 14, 2011 at 11:04 AM, Simone Gotti <simone.gotti@gmail.com> wrote:
>> On 03/14/2011 08:36 AM, Andrew Beekhof wrote:
>>> On Sun, Mar 13, 2011 at 2:17 PM, Simone Gotti <simone.gotti@gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> Testing a cman+pacemaker cluster on rhel6 I noticed a very nasty
>>>> behavior when some nodes were leaving and rejoining the cluster. When a
>>>> node starts leaving and rejoining the cluster, the quorum view of
>>>> pacemaker sometimes starts to differ from the quorum view of
>>>> cman. The one not telling the truth was pacemaker.
>>> Do they ever start agreeing again?  In other words, is the situation
>>> transient or is Pacemaker always 1 (or more) behind after that?
>> It looks like it will always be behind, and further behind at every new
>> cluster change. I don't see any other place in the code where CMAN's
>> events are dequeued other than this one.
> Ok, it looks like the functions behind G_main_add_fd() don't work as
> one would expect.
> This is a very surprising thing to find out after all these years.
>
> Patch applied. Thanks!
> Could I trouble you to create a RHEL6 bug for this though?
> That will allow me to fix it there too.

Done. https://bugzilla.redhat.com/show_bug.cgi?id=684825

Thanks!

>> The event just says whether or not we have quorum. So it can happen that
>> an old dequeued message says the same as the current real state, but
>> that's just a coincidence.
>>
>>
>>
>>>> I reproduced the problem with a simple test case made of 2 nodes using
>>>> cman (no two_node flag) and pacemaker (started only on the first node:
>>>> pcmk01).
>>>>
>>>> For the tests I was using the latest version of pacemaker (1.1.5) while
>>>> keeping the original versions of the corosync and cluster (cman) packages
>>>> provided by rhel6 (corosync-1.2.3-21.el6.x86_64,
>>>> cman-3.0.12-23.el6.4.x86_64).
>>>>
>>>> The problem is that when a node joins a cluster (starting cman), the cman
>>>> on the other nodes emits not one but two events (I didn't investigate
>>>> whether this is normal or present only in some versions of cman), but when
>>>> crmd calls cman_dispatch it uses the flag CMAN_DISPATCH_ONE, so only one
>>>> of the two events is dequeued. On the next cluster event, the stale one
>>>> is dequeued instead of the new one.
>>>>
>>>> The fix I tried uses CMAN_DISPATCH_ALL instead of CMAN_DISPATCH_ONE, and
>>>> it looks like it's working.
>>>>
>>>> I'm CCing the cluster-devel list as they may be interested in the double
>>>> event emitted by cman.
>>>>
>>>>
>>>> Thanks.
>>>>
>>>> Bye!
>>>>
>>>>
>>>> == Test case ==
>>>>
>>>> === Without the patch ===
>>>>
>>>> Start with both nodes with cman started (so the cluster is quorate).
>>>>
>>>>
>>>> Now stop cman on pcmk02. Output on pcmk01:
>>>>
>>>> pcmk01 corosync[16793]:   [CMAN  ] quorum lost, blocking activity
>>>> pcmk01 corosync[16793]:   [QUORUM] This node is within the non-primary
>>>> component and will NOT provide any services.
>>>> pcmk01 corosync[16793]:   [QUORUM] Members[1]: 1
>>>> pcmk01 corosync[16793]:   [TOTEM ] A processor joined or left the
>>>> membership and a new membership was formed.
>>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 1
>>>> pcmk01 corosync[16793]:   [CPG   ] chosen downlist from node r(0)
>>>> ip(192.168.200.71)
>>>> pcmk01 corosync[16793]:   [MAIN  ] Completed service synchronization,
>>>> ready to provide service.
>>>> pcmk01 crmd: [16993]: notice: cman_event_callback: Membership 668:
>>>> quorum lost
>>>>
>>>> Only one event is enqueued.
>>>>
>>>> Now start cman again on pcmk02. Output on pcmk01:
>>>>
>>>> pcmk01 corosync[16793]:   [TOTEM ] A processor joined or left the
>>>> membership and a new membership was formed.
>>>> pcmk01 corosync[16793]:   [CMAN  ] quorum regained, resuming activity
>>>> pcmk01 corosync[16793]:   [QUORUM] This node is within the primary
>>>> component and will provide service.
>>>> pcmk01 corosync[16793]:   [QUORUM] Members[2]: 1 2
>>>> pcmk01 corosync[16793]:   [QUORUM] Members[2]: 1 2
>>>> pcmk01 crmd: [16993]: notice: cman_event_callback: Membership 672:
>>>> quorum acquired
>>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 0
>>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 0
>>>> pcmk01 corosync[16793]:   [CPG   ] chosen downlist from node r(0)
>>>> ip(192.168.200.71)
>>>> pcmk01 corosync[16793]:   [MAIN  ] Completed service synchronization,
>>>> ready to provide service.
>>>>
>>>> As you can see, two events are enqueued and only one is dequeued (due to
>>>> the CMAN_DISPATCH_ONE flag passed to cman_dispatch).
>>>>
>>>> The quorum is regained on both cman and crmd. But there's another event
>>>> in the queue saying that the quorum is regained.
>>>>
>>>>
>>>> Now stop cman again on pcmk02. Output on pcmk01:
>>>>
>>>> pcmk01 corosync[16793]:   [CMAN  ] quorum lost, blocking activity
>>>> pcmk01 corosync[16793]:   [QUORUM] This node is within the non-primary
>>>> component and will NOT provide any services.
>>>> pcmk01 corosync[16793]:   [QUORUM] Members[1]: 1
>>>> pcmk01 corosync[16793]:   [TOTEM ] A processor joined or left the
>>>> membership and a new membership was formed.
>>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 1
>>>> pcmk01 corosync[16793]:   [CPG   ] chosen downlist from node r(0)
>>>> ip(192.168.200.71)
>>>> pcmk01 corosync[16793]:   [MAIN  ] Completed service synchronization,
>>>> ready to provide service.
>>>> pcmk01 crmd: [16993]: info: cman_event_callback: Membership 676: quorum
>>>> retained
>>>>
>>>> CMAN says that the quorum is lost and only one event is enqueued. But
>>>> crmd dequeued the stale event from the previous step and thinks that we
>>>> still have quorum.
>>>>
>>>>
>>>> Now start cman again on pcmk02. Output on pcmk01:
>>>>
>>>> pcmk01 corosync[16793]:   [TOTEM ] A processor joined or left the
>>>> membership and a new membership was formed.
>>>> pcmk01 corosync[16793]:   [CMAN  ] quorum regained, resuming activity
>>>> pcmk01 corosync[16793]:   [QUORUM] This node is within the primary
>>>> component and will provide service.
>>>> pcmk01 corosync[16793]:   [QUORUM] Members[2]: 1 2
>>>> pcmk01 corosync[16793]:   [QUORUM] Members[2]: 1 2
>>>> pcmk01 crmd: [16993]: notice: cman_event_callback: Membership 680:
>>>> quorum lost
>>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 0
>>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 0
>>>> pcmk01 corosync[16793]:   [CPG   ] chosen downlist from node r(0)
>>>> ip(192.168.200.71)
>>>> pcmk01 corosync[16793]:   [MAIN  ] Completed service synchronization,
>>>> ready to provide service.
>>>>
>>>> CMAN says that the quorum is regained, but crmd again dequeued an old
>>>> event and now says that the quorum is lost. And so on...
>>>>
>>>>
>>>>
>>>> === With the patch ===
>>>>
>>>> Stop cman on pcmk02. Output on pcmk01:
>>>>
>>>> pcmk01 corosync[13149]:   [CMAN  ] quorum lost, blocking activity
>>>> pcmk01 corosync[13149]:   [QUORUM] This node is within the non-primary
>>>> component and will NOT provide any services.
>>>> pcmk01 corosync[13149]:   [QUORUM] Members[1]: 1
>>>> pcmk01 corosync[13149]:   [TOTEM ] A processor joined or left the
>>>> membership and a new membership was formed.
>>>> pcmk01 corosync[13149]:   [CPG   ] downlist received left_list: 1
>>>> pcmk01 corosync[13149]:   [CPG   ] chosen downlist from node r(0)
>>>> ip(192.168.200.71)
>>>> pcmk01 corosync[13149]:   [MAIN  ] Completed service synchronization,
>>>> ready to provide service.
>>>>
>>>>  pcmk01 crmd: [13351]: notice: cman_event_callback: Membership 648:
>>>> quorum lost
>>>>
>>>> Only one event is enqueued.
>>>>
>>>>
>>>> Now start cman again on pcmk02. Output on pcmk01:
>>>>
>>>> pcmk01 corosync[13149]:   [TOTEM ] A processor joined or left the
>>>> membership and a new membership was formed.
>>>> pcmk01 corosync[13149]:   [CMAN  ] quorum regained, resuming activity
>>>> pcmk01 corosync[13149]:   [QUORUM] This node is within the primary
>>>> component and will provide service.
>>>> pcmk01 corosync[13149]:   [QUORUM] Members[2]: 1 2
>>>> pcmk01 corosync[13149]:   [QUORUM] Members[2]: 1 2
>>>> pcmk01 crmd: [13351]: notice: cman_event_callback: Membership 652:
>>>> quorum acquired
>>>> pcmk01 corosync[13149]:   [CPG   ] downlist received left_list: 0
>>>> pcmk01 corosync[13149]:   [CPG   ] downlist received left_list: 0
>>>> pcmk01 corosync[13149]:   [CPG   ] chosen downlist from node r(0)
>>>> ip(192.168.200.71)
>>>> pcmk01 corosync[13149]:   [MAIN  ] Completed service synchronization,
>>>> ready to provide service.
>>>> pcmk01 crmd: [13351]: info: cman_event_callback: Membership 652: quorum
>>>> retained
>>>>
>>>> As you can see, two events are enqueued and both are dequeued.
>>>>
>>>>
>>>>
>>>> --
>>>> Simone Gotti
>>>>
>>>>
>>>>
>>
>> --
>> Simone Gotti
>>
>>
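The off-by-one drift described in the test case can be reproduced with a small
model. This is only a sketch in Python, not the libcman API or pacemaker's
code: the function name simulate() and the queue are hypothetical, and it
assumes (as reported above) that cman enqueues one event when a node leaves and
two when it joins, and that crmd dispatches once per membership change.

```python
from collections import deque


def simulate(dispatch_all):
    """Model crmd's quorum view while pcmk02 is stopped/started twice.

    Returns one (cman_quorate, crmd_quorate, backlog) tuple per change.
    dispatch_all=False mimics CMAN_DISPATCH_ONE, True mimics CMAN_DISPATCH_ALL.
    """
    queue = deque()        # pending quorum events, oldest first
    crmd_quorate = True    # both nodes up at the start, cluster quorate
    history = []
    # pcmk02: stop, start, stop, start
    for cman_quorate in (False, True, False, True):
        # Assumption taken from the report: 1 event on leave, 2 on join.
        n_events = 2 if cman_quorate else 1
        queue.extend([cman_quorate] * n_events)
        # One dispatch call per membership change.
        while queue:
            crmd_quorate = queue.popleft()
            if not dispatch_all:
                break      # DISPATCH_ONE: stop after a single event
        history.append((cman_quorate, crmd_quorate, len(queue)))
    return history


if __name__ == "__main__":
    for mode in (False, True):
        print("dispatch_all =", mode)
        for cman, crmd, backlog in simulate(mode):
            print("  cman:", cman, "crmd:", crmd, "backlog:", backlog)
```

With dispatch_all=False the model agrees with cman for the first two changes
and then disagrees on the third and fourth, with the backlog growing, which is
the pattern in the "Without the patch" logs. With dispatch_all=True the queue
is drained every time and the two views stay in step.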

Thread overview: 6+ messages
2011-03-13 13:17 [Cluster-devel] [PATCH] Fix pacemaker's wrong quorum view in a CMAN+pacemaker cluster Simone Gotti
2011-03-13 14:34 ` Fabio M. Di NItto
2011-03-14  7:36 ` Andrew Beekhof
2011-03-14 10:04   ` Simone Gotti
2011-03-14 10:41     ` Andrew Beekhof
2011-03-14 15:37       ` Simone Gotti [this message]
