cluster-devel.redhat.com archive mirror
* [Cluster-devel] Re: [Debian-ha-maintainers] again: "redhat-cluster: services are not relocated when a node fails"
From: Fabio M. Di Nitto @ 2009-11-19 22:01 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Guido Günther wrote:
> Hi Ernesto,
> On Wed, Nov 18, 2009 at 02:30:57PM +0100, Ernesto Rodriguez Reina wrote:
>> Hi everyone!
>>
>> I recently started using RHCS for a project I'm working on, but I found
>> that RHCS2 in Debian Lenny does not relocate services when a node fails.
>> I found the thread [1] where Guido Günther says that this problem was
>> solved in RHCS 3.0.2. I then downloaded and installed RHCS 3.0.4 (the
>> deb packages from the Debian mirror) and reproduced Martin Waite's
>> experiment, and again the service was not relocated on node failure.
>> Has anyone made it work as it should in Debian? Martin, Guido, or
>> anybody else, can you please help me find out why it is not working as
>> it should?

> I checked with RHCS 3.0.4 as it's currently in unstable rebuilt for
> Lenny. The kernel enters a soft lock after I shut off one node (see
> attached log) and no resource takeover happens. Fabione, any idea what
> triggers this?

Since you guys are running cluster 3.0.4, please do the following:

1) add <logging debug="on"/> in cluster.conf

<cluster...
 <logging debug="on"/>
...

2) reproduce the above scenario, then collect all the logs, from all
daemons, on all nodes, from /var/log/cluster (this is the upstream
default; please check with Debian whether they have changed it).
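
For anyone who wants to script steps 1 and 2, here is a rough sketch in
Python. It is not part of the cluster tooling. The function names are
made up for illustration, and the config path shown in the usage note
(/etc/cluster/cluster.conf) is an assumption; /var/log/cluster is the
upstream default mentioned above and may differ on Debian.

```python
import socket
import tarfile
import xml.etree.ElementTree as ET
from pathlib import Path


def enable_debug_logging(conf_xml: str) -> str:
    """Return cluster.conf text with <logging debug="on"/> under <cluster>.

    Hypothetical helper: it only edits the XML text and does not
    propagate the new configuration to the other nodes.
    """
    root = ET.fromstring(conf_xml)
    if root.tag != "cluster":
        raise ValueError("expected <cluster> as the root element")
    logging_el = root.find("logging")
    if logging_el is None:
        # No <logging> element yet: add one as the first child.
        logging_el = ET.Element("logging")
        root.insert(0, logging_el)
    logging_el.set("debug", "on")
    return ET.tostring(root, encoding="unicode")


def collect_logs(log_dir, archive_path) -> Path:
    """Pack every file under log_dir into a gzip-compressed tarball.

    The top-level directory inside the archive carries this node's
    hostname so that archives from several nodes can be told apart.
    """
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        raise FileNotFoundError(f"log directory not found: {log_dir}")
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(log_dir, arcname=f"{socket.gethostname()}-cluster-logs")
    return archive_path
```

On each node this could be used as
collect_logs("/var/log/cluster", "/tmp/cluster-logs.tar.gz"), and
enable_debug_logging() could be applied to the text read from the
(assumed) /etc/cluster/cluster.conf. Keep the archive outside the log
directory so that it does not end up inside itself.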

Then I'd like to see your cluster.conf and get a better idea of how a
node is "killed". If cluster.conf contains sensitive data such as
passwords, either blank them or send the file to me only. I'll keep it
confidential, but please do NOT randomly mangle the configuration to hide
bits.
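
One way to blank passwords without otherwise touching the configuration
is sketched below in Python. It is only an illustration: the attribute
names treated as sensitive ("passwd" and "password") are an assumption
about what a given cluster.conf contains, so check the result by hand
before sending it.

```python
import xml.etree.ElementTree as ET

# Assumed names of attributes that hold secrets; extend as needed.
SENSITIVE_ATTRS = {"passwd", "password"}


def blank_passwords(conf_xml: str) -> str:
    """Return the configuration with sensitive attribute values emptied.

    Elements, attribute names, and all other values are left as they
    are, so the structure of the configuration stays readable.
    """
    root = ET.fromstring(conf_xml)
    for element in root.iter():
        # Copy the attribute names first so the dict is not iterated
        # while it is being updated.
        for name in list(element.attrib):
            if name.lower() in SENSITIVE_ATTRS:
                element.set(name, "")
    return ET.tostring(root, encoding="unicode")
```

Only the values are emptied; nothing is renamed or removed, which keeps
the file useful for debugging.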

The recovery operation depends on several different things, so the
configuration and the logs should be able to tell us something.

Thanks
Fabio




* [Cluster-devel] Re: [Debian-ha-maintainers] again: "redhat-cluster: services are not relocated when a node fails"
From: Fabio M. Di Nitto @ 2009-11-25 18:19 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Ernesto Rodriguez Reina wrote:
> I'm glad to tell you all that with the kernel patch Fabio pointed to, the
> problem is solved and RedHat-Cluster-Suite 3.0.4 works perfectly well
> in all the tests made from yesterday up to now (and believe me, there have
> been a lot). We'll keep on testing, but for now I want to thank all of you.
> 
> My best regards,
> 

Hi Ernesto,

Thanks for the feedback, but you have sent it to me only :)

I am CC'ing the other folks.

This is the fix that is missing from the Debian kernel:

http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=063c4c99630c0b06afad080d2a18bda64172c1a2

as pointed out by David. It will be in 2.6.32 upstream, or you can
cherry-pick that change.
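
A quick way to see whether a node's kernel is at least 2.6.32 is
sketched below in Python. This is only a heuristic with made-up helper
names: an older distribution kernel may already carry the cherry-picked
change, and a version comparison cannot detect that.

```python
import re

# First upstream release expected to contain the fix, per the message above.
FIX_VERSION = (2, 6, 32)


def parse_kernel_version(release: str) -> tuple:
    """Turn a release string such as '2.6.26-2-amd64' into (2, 6, 26)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing third component (e.g. '3.2') is treated as 0.
    return tuple(int(part) if part is not None else 0
                 for part in match.groups())


def has_upstream_fix(release: str) -> bool:
    """True if the release is 2.6.32 or newer (version check only)."""
    return parse_kernel_version(release) >= FIX_VERSION
```

On a running node the release string can be obtained with
platform.release() from the standard library and passed to
has_upstream_fix().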

Fabio


