From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zhigang Wang
Subject: Re: [PATCH] [RFC] Add lock on domain start
Date: Wed, 05 Aug 2009 16:39:23 +0800
Message-ID: <4A79453B.8000205@oracle.com>
References: <48993F48.9030705@novell.com> <18592.7953.353263.676403@mariner.uk.xensource.com> <48A06CA3.2080801@novell.com> <20090805074128.GG24960@edu.joroinen.fi>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------070402080200060301050408"
In-Reply-To: <20090805074128.GG24960@edu.joroinen.fi>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Pasi
Cc: Jim Fehlig, xen-devel@lists.xensource.com, Ian Jackson
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------070402080200060301050408
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Pasi wrote:
> On Mon, Aug 11, 2008 at 10:45:23AM -0600, Jim Fehlig wrote:
>> Ian Jackson wrote:
>>> Jim Fehlig writes ("[Xen-devel] [PATCH] [RFC] Add lock on domain start"):
>>>
>>>> This patch adds a simple lock mechanism when starting domains by placing
>>>> a lock file in xend-domains-path/. The lock file is removed
>>>> when domain is stopped. The motivation for such a mechanism is to
>>>> prevent starting the same domain from multiple hosts.
>>>>
>>> I think this should be dealt with in your next-layer-up management
>>> tools.
>>>
>> Perhaps. I wanted to see if there was any interest in having such a
>> feature at the xend layer. If not, I will no longer pursue this option.
>>
>
> Replying a bit late to this.. I think there is demand for this feature!
>
> Many people (mostly in a smaller environments) don't want to use
> 'next-layer-up' management tools..
>
>>> Lockfiles are bad because they can become stale.
>>>
>> Yep.
>> Originally I considered a 'lockless-lock' approach where a bit is
>> set and counter is spun on a 'reserved' sector of vbd, e.g. first
>> sector. Attempting to attach the vbd to another domain would fail if
>> lock bit is set and counter is incrementing. If counter is not
>> incrementing assume lock is stale and proceed. This approach is
>> certainly more complex. We support various image formats (raw, qcow,
>> vmdk, ...) and such an approach may mean changing the format (e.g.
>> qcow3). Wouldn't work for existing images. Who is responsible for
>> spinning the counter? Anyhow seemed like a lot of complexity as
>> compared to the suggested simple approach with override for stale lock.
>>
>
> I assume you guys have this patch included in OpenSuse/SLES Xen rpms.
>
> Is the latest version available from somewhere?
>
> -- Pasi

I have seen such a patch in the SUSE Xen RPM before. Maybe Jim can tell
you the latest status.

In Oracle VM, we add hooks in xend and use an external locking utility.

Currently, we use DLM (distributed lock manager) to manage the domain
running lock, which prevents the same VM from being started on two
servers simultaneously.

We have added hooks to VM start/shutdown/migration to acquire/release
the lock.

Note that during migration, we release the lock before starting the
migration process, and the lock is then acquired on the destination
side. There is still a chance that a server other than the destination
acquires the lock in between, which causes the migration to fail.

I hope someone can give some advice.

Here is the patch for your reference.
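
For illustration only, the command-line contract that the hooks expect
from the locking utility (--lock or --unlock together with --name and
--uuid, exit status zero on success) could be sketched as below. This is
not the DLM-based utility used in Oracle VM. It is a minimal stand-in
that assumes a lock directory on storage shared by all hosts and a
filesystem that honours O_EXCL; LOCK_DIR, acquire(), release() and
main() are hypothetical names, and stale locks are not handled.

```python
import argparse
import errno
import os
import socket

# Assumption: a directory visible to every host that may start the VM.
LOCK_DIR = "/var/lib/xend/running-locks"


def lock_file(uuid, lock_dir):
    """Path of the lock file for one domain, keyed by its UUID."""
    return os.path.join(lock_dir, uuid + ".lock")


def acquire(name, uuid, lock_dir):
    """Return 0 if the lock was taken, 1 if somebody already holds it."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only
        # one caller can create it.
        fd = os.open(lock_file(uuid, lock_dir),
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return 1
        raise
    with os.fdopen(fd, "w") as f:
        # Record the holder to help an administrator spot stale locks.
        f.write("%s %s\n" % (name, socket.gethostname()))
    return 0


def release(name, uuid, lock_dir):
    """Return 0 if the lock was removed, 1 if there was no lock."""
    try:
        os.unlink(lock_file(uuid, lock_dir))
    except OSError as e:
        if e.errno == errno.ENOENT:
            return 1
        raise
    return 0


def main(argv=None, lock_dir=LOCK_DIR):
    """Parse the options and return the exit status for the caller."""
    parser = argparse.ArgumentParser(description="domain running lock")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--lock", action="store_true")
    mode.add_argument("--unlock", action="store_true")
    parser.add_argument("--name", required=True)
    parser.add_argument("--uuid", required=True)
    args = parser.parse_args(argv)
    if args.lock:
        return acquire(args.name, args.uuid, lock_dir)
    return release(args.name, args.uuid, lock_dir)
```

A wrapper script would finish with sys.exit(main()) so that xend sees
the return value as the exit status. A second --lock for the same UUID
returns non-zero until --unlock has been run.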

thanks,

zhigang

--------------070402080200060301050408
Content-Type: text/x-patch;
 name="xen-unstable-locking-callout-hook.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="xen-unstable-locking-callout-hook.patch"

diff -Nurp --exclude '*.orig' xen-3.4.0.bak/tools/examples/xend-config.sxp xen-3.4.0/tools/examples/xend-config.sxp
--- xen-3.4.0.bak/tools/examples/xend-config.sxp 2009-08-05 16:17:42.000000000 +0800
+++ xen-3.4.0/tools/examples/xend-config.sxp 2009-08-04 10:23:17.000000000 +0800
@@ -69,6 +69,12 @@
 (xend-unix-path /var/lib/xend/xend-socket)
 
+# External locking utility for get/release domain running lock. By default,
+# no utility is specified. Thus there will be no lock as VM running.
+# The locking utility should accept:
+# <--lock | --unlock> --name --uuid
+# command line options, and returns zero on success, others on error.
+#(xend-domains-lock-path '')
 
 # Address and port xend should use for the legacy TCP XMLRPC interface,
 # if xend-tcp-xmlrpc-server is set.
diff -Nurp --exclude '*.orig' xen-3.4.0.bak/tools/python/xen/xend/XendDomainInfo.py xen-3.4.0/tools/python/xen/xend/XendDomainInfo.py
--- xen-3.4.0.bak/tools/python/xen/xend/XendDomainInfo.py 2009-08-05 16:17:42.000000000 +0800
+++ xen-3.4.0/tools/python/xen/xend/XendDomainInfo.py 2009-08-05 16:35:35.000000000 +0800
@@ -359,6 +359,8 @@ class XendDomainInfo:
     @type state_updated: threading.Condition
     @ivar refresh_shutdown_lock: lock for polling shutdown state
     @type refresh_shutdown_lock: threading.Condition
+    @ivar running_lock: lock for running VM
+    @type running_lock: bool or None
     @ivar _deviceControllers: device controller cache for this domain
     @type _deviceControllers: dict 'string' to DevControllers
     """
@@ -427,6 +429,8 @@ class XendDomainInfo:
         self.refresh_shutdown_lock = threading.Condition()
         self._stateSet(DOM_STATE_HALTED)
 
+        self.running_lock = None
+
         self._deviceControllers = {}
 
         for state in DOM_STATES_OLD:
@@ -453,6 +457,7 @@ class XendDomainInfo:
         if self._stateGet() in (XEN_API_VM_POWER_STATE_HALTED, XEN_API_VM_POWER_STATE_SUSPENDED, XEN_API_VM_POWER_STATE_CRASHED):
             try:
+                self.acquire_running_lock();
                 XendTask.log_progress(0, 30, self._constructDomain)
                 XendTask.log_progress(31, 60, self._initDomain)
 
@@ -485,6 +490,7 @@ class XendDomainInfo:
         state = self._stateGet()
         if state in (DOM_STATE_SUSPENDED, DOM_STATE_HALTED):
             try:
+                self.acquire_running_lock();
                 self._constructDomain()
 
                 try:
@@ -2617,6 +2623,11 @@ class XendDomainInfo:
 
             self._stateSet(DOM_STATE_HALTED)
             self.domid = None # Do not push into _stateSet()!
+
+            try:
+                self.release_running_lock()
+            except:
+                log.exception("Release running lock failed: %s" % status)
         finally:
             self.refresh_shutdown_lock.release()
 
@@ -4073,6 +4084,28 @@ class XendDomainInfo:
                              params.get('burst', '50K'))
         return 1
 
+    def acquire_running_lock(self):
+        if not self.running_lock:
+            lock_path = xoptions.get_xend_domains_lock_path()
+            if lock_path:
+                status = os.system('%s --lock --name %s --uuid %s' % \
+                                   (lock_path, self.info['name_label'], self.info['uuid']))
+                if status == 0:
+                    self.running_lock = True
+                else:
+                    raise XendError('Acquire running lock failed: %s' % status)
+
+    def release_running_lock(self):
+        if self.running_lock:
+            lock_path = xoptions.get_xend_domains_lock_path()
+            if lock_path:
+                status = os.system('%s --unlock --name %s --uuid %s' % \
+                                   (lock_path, self.info['name_label'], self.info['uuid']))
+                if status == 0:
+                    self.running_lock = False
+                else:
+                    raise XendError('Release running lock failed: %s' % status)
+
     def __str__(self):
         return '' % \
             (str(self.domid), self.info['name_label'],
diff -Nurp --exclude '*.orig' xen-3.4.0.bak/tools/python/xen/xend/XendDomain.py xen-3.4.0/tools/python/xen/xend/XendDomain.py
--- xen-3.4.0.bak/tools/python/xen/xend/XendDomain.py 2009-08-05 16:17:09.000000000 +0800
+++ xen-3.4.0/tools/python/xen/xend/XendDomain.py 2009-08-04 10:23:17.000000000 +0800
@@ -1317,6 +1317,7 @@ class XendDomain:
                              POWER_STATE_NAMES[dominfo._stateGet()])
 
         """ The following call may raise a XendError exception """
+        dominfo.release_running_lock();
         dominfo.testMigrateDevices(True, dst)
 
         if live:
diff -Nurp --exclude '*.orig' xen-3.4.0.bak/tools/python/xen/xend/XendOptions.py xen-3.4.0/tools/python/xen/xend/XendOptions.py
--- xen-3.4.0.bak/tools/python/xen/xend/XendOptions.py 2009-08-05 16:17:42.000000000 +0800
+++ xen-3.4.0/tools/python/xen/xend/XendOptions.py 2009-08-04 10:23:17.000000000 +0800
@@ -281,6 +281,11 @@ class XendOptions:
         """
         return self.get_config_string("xend-domains-path", self.xend_domains_path_default)
 
+    def get_xend_domains_lock_path(self):
+        """ Get the path of the lock utility for running domains.
+        """
+        return self.get_config_string("xend-domains-lock-path")
+
     def get_xend_state_path(self):
         """ Get the path for persistent domain configuration storage
         """

--------------070402080200060301050408
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--------------070402080200060301050408--