* [Cluster-devel] Re: [Debian-ha-maintainers] again: "redhat-cluster: services are not relocated when a node fails"
[parent not found: <20091119124739.GA30480@bogon.sigxcpu.org>]
From: Fabio M. Di Nitto @ 2009-11-19 22:01 UTC
To: cluster-devel.redhat.com

Guido Günther wrote:
> Hi Ernesto,
> On Wed, Nov 18, 2009 at 02:30:57PM +0100, Ernesto Rodriguez Reina wrote:
>> Hi everyone!
>>
>> I recently start using RHCS for a project I'm working on but I found
>> that RHCS2 in Debian Lenny do not relocate services when a node fails.
>> I found the thread [1] where Guido Günther says that this problem was
>> solved on RHCS 3.0.2. Then I downloaded and installed RHCS 3.0.4 (the
>> deb packages from debian mirror) and reproduced the experiment of
>> Martin Waite and again the service was not relocated on node fail.
>> Does someone had make it work as it should in Debian? Martin, or Guido
>> or anybody can you please help me to find out why it is not working as
>> it should?
> I checked with RHCS 3.0.4 as it's currently in unstable rebuilt for
> Lenny. The kernel enters a soft lock after I shut off one node (see
> attached log) and no resource takeover happens. Fabione, any idea what
> triggers this?

since you guys are running cluster 3.0.4, please do the following:

1) add <logging debug="on"/> in cluster.conf

<cluster...
  <logging debug="on"/>
...

2) reproduce the above scenario, then collect all the logs, from all
daemons, from all nodes from /var/log/cluster (this is upstream default,
check with Debian if they have changed it please).

then I'd like to see your cluster.conf and have a better idea on how a
node is "killed".

If cluster.conf contains sensitive data such as passwords, either blank
them or send the file to me only. I'll keep it confidential but please
do NOT randomly mangle the configuration to hide bits.

The recovery operation is strictly dependent on different things. The
configuration and the logs should be able to tell us something.

Thanks
Fabio
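For anyone repeating step 2 above, here is a minimal sketch of one way to bundle the logs on each node before sending them. It is not part of the cluster suite. The function name `collect_cluster_logs`, the archive layout, and the node label are hypothetical choices made for illustration, and the log directory is assumed to be the upstream default /var/log/cluster mentioned in the message.

```python
import tarfile
from pathlib import Path


def collect_cluster_logs(log_dir, archive_path, node_name):
    """Pack every regular file under log_dir into a gzip-compressed tarball.

    Hypothetical helper: each file is stored under a top-level folder named
    after the node, so archives from several nodes can be unpacked side by
    side without overwriting each other.
    Returns the list of member names written to the archive.
    """
    log_dir = Path(log_dir)
    members = []
    with tarfile.open(archive_path, "w:gz") as tar:
        # rglob("*") walks the directory recursively; only regular files
        # are added, directories are implied by the member paths.
        for path in sorted(log_dir.rglob("*")):
            if path.is_file():
                relative = path.relative_to(log_dir).as_posix()
                arcname = f"{node_name}/{relative}"
                tar.add(path, arcname=arcname)
                members.append(arcname)
    return members
```

Run on each node (with enough privileges to read the logs), a call such as `collect_cluster_logs("/var/log/cluster", "/tmp/node1-logs.tar.gz", "node1")` would produce one archive per node; the output path and node label here are placeholders. If Debian has moved the log directory, pass that path instead.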
[parent not found: <ac20632d0911210755v6229e34cq4cf2628a1dc643eb@mail.gmail.com>]
[parent not found: <4B0AE4D5.6030302@redhat.com>]
[parent not found: <ac20632d0911231150u408d526cr9f01e48f7a855e8a@mail.gmail.com>]
[parent not found: <4B0B743B.7080803@redhat.com>]
[parent not found: <ac20632d0911240732q9e56b58va0a38e2033d87064@mail.gmail.com>]
[parent not found: <ac20632d0911251015t440582bct10f6ef53329d149b@mail.gmail.com>]
* [Cluster-devel] Re: [Debian-ha-maintainers] again: "redhat-cluster: services are not relocated when a node fails"
From: Fabio M. Di Nitto @ 2009-11-25 18:19 UTC
To: cluster-devel.redhat.com

Ernesto Rodriguez Reina wrote:
> I'm glad to tell you people that using the kernel patch Fabio says the
> problem is solved and RedHat-Cluster-Suite 3.0.4 works perfectly well
> in all test made from yesterday up to now (and believe me have been a
> lot). We'll keep on testing, but for now I want to thanks all of you.
>
> My best regards,
>

Hi Ernesto,

thanks for the feedback, but you have sent it to me only :) I am CC'ing
the other folks..

and this is the missing fix from Debian kernel:

http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=063c4c99630c0b06afad080d2a18bda64172c1a2

as pointed out by David. It will be in 2.6.32 upstream or you can cherry
pick that change.

Fabio