cluster-devel.redhat.com archive mirror
* [Cluster-devel] Error running corosync
@ 2011-11-07 12:34 Nick Khamis
  2011-11-08  2:08 ` [Cluster-devel] [Linux-HA] " Tim Serong
  0 siblings, 1 reply; 3+ messages in thread
From: Nick Khamis @ 2011-11-07 12:34 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hello Everyone,

After being unsuccessful trying to get cman+pacemaker working,
I decided to try the latest committed version of pacemaker ("git clone
https://github.com/ClusterLabs/pacemaker.git"). I am now receiving
the following error from ocfs2_controld.pcmk:


 ocfs2_controld.pcmk -D
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional service options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'corosync_quorum' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional service options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'corosync_cman' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional service options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_clm' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional service options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_evt' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional service options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_ckpt' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional service options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_msg' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional service options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_lck' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional service options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_tmr' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next: No
additional configuration supplied for: service
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
Processing additional quorum options...
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'quorum_cman' for option: provider
ocfs2_controld[6883]: 2011/11/03_16:34:20 info: get_cluster_type:
Detected an active 'cman' cluster
ocfs2_controld[6883]: 2011/11/03_16:34:20 info: get_local_node_name:
Using CMAN node name: astdrbd1
ocfs2_controld[6883]: 2011/11/03_16:34:20 info:
init_ais_connection_once: Connection to 'cman': established
ocfs2_controld[6883]: 2011/11/03_16:34:20 info: crm_new_peer: Node
astdrbd1 now has id: 1
ocfs2_controld[6883]: 2011/11/03_16:34:20 info: crm_new_peer: Node 1
is now known as astdrbd1
ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
send_ais_text: Triggered assert at corosync.c:352 : dest !=
crm_msg_ais
Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
send_ais_text: Triggered assert at corosync.c:352 : dest !=
crm_msg_ais
Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
1320352460 setup_stack at 170: Cluster connection established.  Local node id: 1
1320352460 setup_stack at 174: Added Pacemaker as client 1 with fd -1
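
Because the log above was wrapped by the mail client, the ERROR entries are spread over several lines. A minimal Python sketch for rejoining the wrapped lines and pulling out the errors might look like the following. It assumes a new entry always starts with "name[pid]: " and treats every other line as a continuation, which is a heuristic (the bare "Sending message ..." lines are probably stderr output rather than part of the crm_abort entry). The names rejoin and errors are made up for illustration.

```python
import re

# Excerpt from the log above, wrapped the way the mail client wrapped it.
WRAPPED = """\
ocfs2_controld[6883]: 2011/11/03_16:34:20 info: crm_new_peer: Node 1
is now known as astdrbd1
ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
send_ais_text: Triggered assert at corosync.c:352 : dest !=
crm_msg_ais
Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
"""

# Assumption: a new entry starts with "name[pid]: "; anything else
# is a continuation of the previous entry.
ENTRY_START = re.compile(r"^\S+\[\d+\]: ")


def rejoin(text):
    """Merge wrapped continuation lines back into single log entries."""
    entries = []
    for line in text.splitlines():
        if ENTRY_START.match(line) or not entries:
            entries.append(line.strip())
        else:
            entries[-1] += " " + line.strip()
    return entries


def errors(entries):
    """Keep only the entries logged at ERROR level."""
    return [e for e in entries if " ERROR: " in e]


ENTRIES = rejoin(WRAPPED)
ERRORS = errors(ENTRIES)
for entry in ERRORS:
    print(entry)
```

Run against the full log, this leaves only the assert at corosync.c:352 and the failed cpg sends, which are the lines that matter for the question.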

Setup:

PCMK 1.1.6-2d8fad5
CMAN 3.1.7
Corosync 1.4.2
OpenAIS Latest version

I just want to mention that I never start OpenAIS separately, only
corosync. Is this OK for dlm and configfs, or should I be using openais?

If I had more time I would dig deeper; however, I am currently under so
much pressure to finish things up that I am diverting my efforts to
different parts of the cluster.

Your help is greatly appreciated,

Nick.




* [Cluster-devel] [Linux-HA] Error running corosync
  2011-11-07 12:34 [Cluster-devel] Error running corosync Nick Khamis
@ 2011-11-08  2:08 ` Tim Serong
  2011-11-11  4:17   ` Andrew Beekhof
  0 siblings, 1 reply; 3+ messages in thread
From: Tim Serong @ 2011-11-08  2:08 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On 11/07/2011 11:34 PM, Nick Khamis wrote:
> Hello Everyone,
>
> After being unsuccessful trying to get cman+pacemaker working,
> I decided to try the latest committed version of pacemaker ("git clone
> https://github.com/ClusterLabs/pacemaker.git"). I am now receiving
> the following error from ocfs2_controld.pcmk:
>
>

I still believe these errors are the result of pacemaker (apparently) 
not knowing/thinking it's running on/with openais (for some reason).  See:

http://oss.clusterlabs.org/pipermail/pacemaker/2011-November/011978.html

I also don't see how the patch Andrew mentioned at 
http://oss.clusterlabs.org/pipermail/pacemaker/2011-November/011992.html 
could fix this (but would be delighted to be proved wrong).

> Setup:
>
> PCMK 1.1.6-2d8fad5
> CMAN 3.1.7
> Corosync 1.4.2
> OpenAIS Latest version
>
> I just want to mention that I never start OpenAIS separately, only
> corosync. Is this OK for dlm and configfs, or should I be using openais?

ocfs2_controld with Pacemaker needs openais, but openais isn't something 
you "start" separately, it's a bunch of plugins that corosync is meant 
to load.  What this means in a CMAN environment, I do not know.
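
For what it's worth, the log in the original post already names the services found in the configuration, so one way to see whether the openais plugins are at least configured is to pull those names out of the log. The following is a rough Python sketch, not anything shipped with Pacemaker; the function name services_by_prefix is made up, and finding a name in the configuration does not prove the plugin actually loaded.

```python
import re
from collections import defaultdict

# The get_config_opt lines from the original post, still wrapped by the mailer.
LOG = """\
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'corosync_quorum' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'corosync_cman' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_clm' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_evt' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_ckpt' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_msg' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_lck' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'openais_tmr' for option: name
ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
'quorum_cman' for option: provider
"""

# Only "option: name" lines describe services; the quorum provider is skipped.
FOUND = re.compile(r"get_config_opt: Found '([^']+)' for option: name")


def services_by_prefix(text):
    """Group configured service names by their prefix (corosync, openais, ...)."""
    flat = " ".join(text.split())  # undo the mail client's line wrapping
    groups = defaultdict(list)
    for name in FOUND.findall(flat):
        groups[name.split("_", 1)[0]].append(name)
    return dict(groups)


SERVICES = services_by_prefix(LOG)
for prefix, names in SERVICES.items():
    print(prefix, names)
```

In this log the openais group contains clm, evt, ckpt, msg, lck and tmr, so the plugins do appear in the configuration that corosync was started with.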

IMO (and as Florian alluded to in another message), you'd probably save 
yourself a lot of trouble taking prebuilt packages from a distro where 
the pieces you need are known to work together.

Not to say I think what you're doing won't ultimately be worthwhile, but 
it could be the case that you are the first person in the world to try 
to combine these versions of these specific components in exactly the 
way you are doing so.

Regards,

Tim
-- 
Tim Serong
Senior Clustering Engineer
SUSE
tserong at suse.com




* [Cluster-devel] [Linux-HA] Error running corosync
  2011-11-08  2:08 ` [Cluster-devel] [Linux-HA] " Tim Serong
@ 2011-11-11  4:17   ` Andrew Beekhof
  0 siblings, 0 replies; 3+ messages in thread
From: Andrew Beekhof @ 2011-11-11  4:17 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Nov 8, 2011 at 1:08 PM, Tim Serong <tserong@suse.com> wrote:
> On 11/07/2011 11:34 PM, Nick Khamis wrote:
>> Hello Everyone,
>>
>> After being unsuccessful trying to get cman+pacemaker working,
>> I decided to try the latest committed version of pacemaker ("git clone
>> https://github.com/ClusterLabs/pacemaker.git"). I am now receiving
>> the following error from ocfs2_controld.pcmk:
>>
>>
>
> I still believe these errors are the result of pacemaker (apparently)
> not knowing/thinking it's running on/with openais (for some reason).  See:
>
> http://oss.clusterlabs.org/pipermail/pacemaker/2011-November/011978.html
>
> I also don't see how the patch Andrew mentioned at
> http://oss.clusterlabs.org/pipermail/pacemaker/2011-November/011992.html
> could fix this (but would be delighted to be proved wrong).

The RA, IIRC, was looking for HA_quorum_type, which was unset before
that patch when starting pacemaker from the daemon.

However, if he's getting this problem while using cman (I've
completely lost track at this point), then the problem is that the RA
is selecting ocfs2_controld.pcmk instead of the "normal"
ocfs2_controld.
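
To illustrate the kind of selection being described, here is a hedged Python sketch (the real RA is a shell script, and this is not its code). The two daemon names come from this thread; the HA_quorum_type values "cman" and "pcmk", the mapping itself, and the function name pick_daemon are assumptions for illustration only.

```python
import os

# Hypothetical mapping. The daemon names are the two mentioned in this
# thread; the HA_quorum_type values used as keys are assumptions.
DAEMON_BY_STACK = {
    "cman": "ocfs2_controld",       # the "normal" daemon
    "pcmk": "ocfs2_controld.pcmk",  # the Pacemaker-specific daemon
}


def pick_daemon(env=None):
    """Choose a control daemon from HA_quorum_type, refusing to guess."""
    env = os.environ if env is None else env
    stack = env.get("HA_quorum_type")
    if stack is None:
        # The situation described above: the variable was never exported.
        raise RuntimeError("HA_quorum_type is unset; cannot tell which stack is active")
    if stack not in DAEMON_BY_STACK:
        raise RuntimeError(f"unrecognised HA_quorum_type: {stack!r}")
    return DAEMON_BY_STACK[stack]


if __name__ == "__main__":
    print(pick_daemon({"HA_quorum_type": "cman"}))
```

The point of failing loudly on an unset variable is that a silent default would end up starting ocfs2_controld.pcmk on a cman cluster, which matches the symptom in the log.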

>
>> Setup:
>>
>> PCMK 1.1.6-2d8fad5
>> CMAN 3.1.7
>> Corosync 1.4.2
>> OpenAIS Latest version
>>
>> I just want to mention that I never start OpenAIS separately, only
>> corosync. Is this OK for dlm and configfs, or should I be using openais?
>
> ocfs2_controld with Pacemaker needs openais, but openais isn't something
> you "start" separately, it's a bunch of plugins that corosync is meant
> to load.  What this means in a CMAN environment, I do not know.

Cman starts corosync+openais.

>
> IMO (and as Florian alluded to in another message), you'd probably save
> yourself a lot of trouble taking prebuilt packages from a distro where
> the pieces you need are known to work together.

Indeed.

>
> Not to say I think what you're doing won't ultimately be worthwhile, but
> it could be the case that you are the first person in the world to try
> to combine these versions of these specific components in exactly the
> way you are doing so.



