From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lon Hohberger
Date: Fri, 08 Feb 2008 15:56:50 -0500
Subject: [Cluster-devel] rind-0.8.1 patch
In-Reply-To: <200802070938.48446.grimme@atix.de>
References: <1196441345.2454.25.camel@localhost.localdomain>
 <200802061003.24749.grimme@atix.de>
 <1202317294.21504.50.camel@ayanami.boston.devel.redhat.com>
 <200802070938.48446.grimme@atix.de>
Message-ID: <1202504210.6443.84.camel@ayanami.boston.devel.redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Thu, 2008-02-07 at 09:38 +0100, Marc Grimme wrote:

> Something else I was thinking about when playing with those things:
> 1. Why are USER, CONFIG and MIGRATION events not yet being passed? It could be
> quite interesting as well to trigger those.

USER and CONFIG events are being passed to the event handlers in CVS;
you just can't define events off of them currently in the configuration.
I think what we have right now is plenty for blowing your own foot off,
but we certainly could add those.

Virtual machine requests (e.g. clusvcadm -M) aren't going out with 5.2
for central_processing.

> 2. And wouldn't it be a good idea to being able to call some kind of
> higherlevel os-skript?

I disagree here, sort of:

 * I don't think the possibility of lots of fork/execs while trying to
   determine service placement after a failure is a great idea. We want
   to try to be as neutral as we can during this situation.

A really low-impact script interface that reorders a node list might be
okay; i.e.:

   node_list = external_reorder("my_script", old_node_list);

I suppose it's kind of like shuffle(), but with intelligence. That
script could then sort the node IDs by whatever criteria it wanted.

As for processing events in external scripts, I disagree fairly
strongly:

 * The data rgmanager is currently using to make decisions (e.g.
   configuration info such as failover domains, service recovery
   policies, and extended stuff which you can randomly add) is difficult
   to access from shell scripts.

 * Internal rgmanager operations (flipping service states, for example)
   can't be done from outside rgmanager in a sane way.

> I thought it might then be possible to generate a more
> dynamic failoverdomain.

Agreed.

> For example one with the lowest loaded node being
> lowest prioritized. That can be quite nice when having services or vms which
> produce very high load.

There are lots of kinds of load:

 * memory pressure
 * cpu load
 * run queue average length (the 'uptime' load)
 * i/o bandwidth to shared storage
 * network bandwidth

I'd recommend that whatever load monitoring we care about be done
proactively. That is, have something publish current load states
periodically, and have the data 'already there' - so that in the event
of a failure, we can just act on what is known, rather than asking
around for various pieces of data.

We're getting a little far afield, though: does what's in CVS work for
doing the 'follows' logic or not? :)

-- Lon
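
To make the "publish load proactively, then just reorder the node list" idea above concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `LoadCache`, `publish`, `known_load`, and `reorder_by_load` are illustrative names, not part of rgmanager or of the patch under discussion, and a single float stands in for whichever kind of load is actually being tracked. The point is only that at failover time nothing is forked and no node is queried; the reorder acts on data that is already there.

```python
import time
from dataclasses import dataclass


@dataclass
class LoadSample:
    load: float       # one number standing in for any load metric (assumption)
    timestamp: float  # when the sample was published, in seconds


class LoadCache:
    """Hypothetical store of the most recently published load per node ID."""

    def __init__(self, max_age=30.0):
        self.max_age = max_age  # samples older than this are treated as unknown
        self._samples = {}

    def publish(self, node_id, load, now=None):
        """Record a load value for a node; called periodically, not at failure time."""
        stamp = time.time() if now is None else now
        self._samples[node_id] = LoadSample(load, stamp)

    def known_load(self, node_id, now=None):
        """Return the cached load, or None if it is missing or stale."""
        now = time.time() if now is None else now
        sample = self._samples.get(node_id)
        if sample is None or now - sample.timestamp > self.max_age:
            return None
        return sample.load


def reorder_by_load(node_list, cache, now=None):
    """Return node IDs least-loaded first.

    Nodes without fresh data go last and keep their original relative
    order, so missing data never promotes a node.
    """
    def sort_key(item):
        index, node_id = item
        load = cache.known_load(node_id, now)
        return (load is None, 0.0 if load is None else load, index)

    return [node_id for _, node_id in sorted(enumerate(node_list), key=sort_key)]


cache = LoadCache(max_age=30.0)
cache.publish(1, 2.5, now=100.0)
cache.publish(2, 0.3, now=100.0)
cache.publish(3, 1.0, now=50.0)   # will be stale by the time of the failure
new_order = reorder_by_load([1, 2, 3, 4], cache, now=110.0)
print(new_order)
```

With the sample data, node 2 comes first because it has the lowest fresh load, node 1 follows, and nodes 3 (stale) and 4 (never published) keep their original order at the end.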