From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yosuke Iwamatsu
Subject: [PATCH] xend: Sleep before sending SIGKILL to device model
Date: Thu, 29 Jan 2009 17:40:49 +0900
Message-ID: <49816B91.4080809@ab.jp.nec.com>
References: <49801B68.5060700@ab.jp.nec.com> <18816.15891.759792.425858@mariner.uk.xensource.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------020403000305010901010609"
Return-path:
In-Reply-To: <18816.15891.759792.425858@mariner.uk.xensource.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Jackson
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------020403000305010901010609
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Ian Jackson wrote:
> The code already has a timeout to forcibly kill the device model after
> (I think) 10 seconds. Surely we should reuse that code path (and the
> same timeout value) ?
>
> Restarting xend is not a usual thing to do and I think it's OK if
> shutting down a domain started by a previous xend involves waiting for
> such a longer timeout. It's better to err on the side of safety.

O.K. Attached is a revised patch which reuses the existing code path.
10 seconds seems to me a bit too long, but I can agree we had better
keep on the safe side.

> Also, your patch was:
> Content-Type: all/allfiles;
> This is not a recognised content type and prevented both of my
> mailreaders from displaying it to me. Can you please fix your MUA ?

Sorry for the inconvenience.
This time your mail client can recognize it, I think.

--
Yosuke

Signed-off-by: Yosuke Iwamatsu

--------------020403000305010901010609
Content-Type: text/plain;
 name="xend_dm_sigkill.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="xend_dm_sigkill.txt"

diff -r 39517e863cc8 tools/python/xen/xend/image.py
--- a/tools/python/xen/xend/image.py	Mon Jan 26 16:19:42 2009 +0000
+++ b/tools/python/xen/xend/image.py	Thu Jan 29 17:30:20 2009 +0900
@@ -558,24 +558,30 @@
                     os.kill(self.pid, signal.SIGHUP)
                 except OSError, exn:
                     log.exception(exn)
-                try:
-                    # Try to reap the child every 100ms for 10s. Then SIGKILL it.
-                    for i in xrange(100):
+                # Try to reap the child every 100ms for 10s. Then SIGKILL it.
+                for i in xrange(100):
+                    try:
                         (p, rv) = os.waitpid(self.pid, os.WNOHANG)
                         if p == self.pid:
                             break
-                        time.sleep(0.1)
-                    else:
-                        log.warning("DeviceModel %d took more than 10s "
-                                    "to terminate: sending SIGKILL" % self.pid)
+                    except OSError:
+                        # This is expected if Xend has been restarted within
+                        # the life of this domain. In this case, we can kill
+                        # the process, but we can't wait for it because it's
+                        # not our child. We continue this loop, and after it is
+                        # terminated make really sure the process is going away
+                        # (SIGKILL).
+                        pass
+                    time.sleep(0.1)
+                else:
+                    log.warning("DeviceModel %d took more than 10s "
+                                "to terminate: sending SIGKILL" % self.pid)
+                    try:
                         os.kill(self.pid, signal.SIGKILL)
                         os.waitpid(self.pid, 0)
-                except OSError, exn:
-                    # This is expected if Xend has been restarted within the
-                    # life of this domain. In this case, we can kill the process,
-                    # but we can't wait for it because it's not our child.
-                    # We just make really sure it's going away (SIGKILL) first.
-                    os.kill(self.pid, signal.SIGKILL)
+                    except OSError:
+                        # This happens if the process doesn't exist.
+                        pass
                 state = xstransact.Remove("/local/domain/0/device-model/%i"
                                           % self.vm.getDomid())
             finally:

--------------020403000305010901010609
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--------------020403000305010901010609--
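
For readers who want to try the control flow of the patch outside xend,
here is a minimal standalone sketch. It is not the xend code: it assumes
Python 3 (the patch itself is Python 2), and the function name
terminate_device_model, its parameters attempts and interval, and the
"reaped"/"killed" return values are hypothetical names chosen for
illustration. Unlike the patch, it ignores a failed SIGHUP instead of
logging it.

```python
import os
import signal
import time


def terminate_device_model(pid, attempts=100, interval=0.1):
    """Send SIGHUP to pid, poll for its exit, then fall back to SIGKILL.

    Hypothetical helper mirroring the patched loop: with the defaults it
    polls every 100ms for 10s. Returns "reaped" if the child was reaped
    during the grace period, "killed" if SIGKILL had to be sent.
    """
    try:
        os.kill(pid, signal.SIGHUP)
    except OSError:
        # The process may already be gone; the loop below copes with that.
        pass

    for _ in range(attempts):
        try:
            reaped_pid, _status = os.waitpid(pid, os.WNOHANG)
            if reaped_pid == pid:
                break
        except OSError:
            # Raised when pid is not our child (for example after the
            # parent daemon was restarted). We cannot reap it, so keep
            # waiting out the grace period before escalating.
            pass
        time.sleep(interval)
    else:
        # The loop finished without a successful reap: force termination.
        try:
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
        except OSError:
            # The process no longer exists, or it is not ours to reap.
            pass
        return "killed"
    return "reaped"
```

When pid is not a child of the caller, every os.waitpid call raises, so
the sketch always waits the full grace period and then sends SIGKILL,
which is the trade-off discussed in the thread above.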