From: Anthony Liguori
Subject: Re: Detecting deadlocks with hypervisor..
Date: Fri, 07 Apr 2006 12:41:20 -0500
Message-ID: <4436A440.90704@us.ibm.com>
To: T S
Cc: rthelen@netapp.com, Xen-devel@lists.xensource.com, ewan@xensource.com, edwin.zhai@intel.com
List-Id: xen-devel@lists.xenproject.org

T S wrote:
>> From: Anthony Liguori
>> To: T S
>> CC: xen-devel@lists.xensource.com
>> Subject: Re: [Xen-devel] Detecting deadlocks with hypervisor..
>> Date: Fri, 24 Mar 2006 13:24:46 -0600
>>
>> T S wrote:
>>> This may sound like a silly question (pardon me, I am relatively
>>> new to the Linux kernel): will it be possible to continue running
>>> reboot.c (or for that matter any kernel thread) when the kernel is
>>> deadlocked? In Linux, is the kernel a single process or a bunch of
>>> entities executing in parallel? If the latter, then during a kernel
>>> deadlock (e.g. caused by loading a faulty module that disables
>>> interrupts and does something silly) some other processes/threads
>>> can still run, right?
>>
>> Sorry for not making this more clear previously. You cannot restore a
>> deadlocked domain if a normal xm save doesn't work. One thing that
>> makes Xen unique is that guests actually are aware of what physical
>> pages are assigned to them. When one does a save/restore, the guest
>> has to canonicalize all of its internal references to physical
>> pages. When it's restored, it then remaps its newly assigned
>> physical pages to all the old places where it needed to know about
>> them for some reason or another.
>
> We took a look at the xc_linux_save() function ... and what we see is
> that the canonicalize action is actually done by the Dom-0 (and not by
> the Dom-U);

Take a look at linux-2.6-sparse/drivers/core/reboot.c:__do_suspend().
Canonicalization is done both in Dom-0 and in the guest itself. Dom-0
attempts to do as much of it as it can but, as I've said before, it
cannot do all of it.

> Also, given that Dom-0 can access the page tables and other structures
> of the deadlocked guest, would one of you be able to tell me what
> changes I need to make to xm_linux_save( ) (and other related
> functions) to save the state of the deadlocked guest without doing any
> handshake with the guest OS?

If you want to attempt to futz with the state of a guest while it's
running without the guest cooperating, your best bet is to do as Keir
suggested: pause the domain, make your changes, and then unpause.

Regards,

Anthony Liguori

>
> thanks!
> - T
>
>
>> If the guest isn't responsive when you do a save, then it will never
>> canonicalize itself and there is no way to restore the domain.
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>> thanks
>>> TS
>>>
>>>>
>>>> If a suspend completes correctly, Xend will see it (another watch
>>>> will fire), and xc_linux_save will be free to complete the save.
>>>>
>>>> > Also, does it seem viable to clone a copy of a deadlocked guest
>>>> > OS in the first place?
>>>>
>>>> If you have a byte-for-byte copy of a deadlocked guest, even if you
>>>> could suspend it, surely it will be deadlocked when it is resumed.
>>>> How do you intend to break the deadlock, and how is it easier to do
>>>> that from outside than it is to perform deadlock detection in the
>>>> guest?
>>>>
>>>> Ewan.
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@lists.xensource.com
>>>> http://lists.xensource.com/xen-devel
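
To illustrate the canonicalization step discussed in this thread, here
is a minimal sketch in Python. It is a toy model, not Xen's save/restore
code: the frame-number lists, the function names canonicalize() and
uncanonicalize(), and the MFN/PFN naming are all assumptions introduced
for the example and do not come from the messages above. It shows only
the general idea described in the thread: before a save, references to
the pages a guest currently holds are rewritten into a form that does
not depend on those particular pages, and after a restore they are
rewritten again to point at the newly assigned pages.

```python
from typing import Dict, List


def canonicalize(entries: List[int], p2m: List[int]) -> List[int]:
    """Rewrite machine frame numbers (MFNs) as guest-relative frame numbers (PFNs).

    `p2m[pfn]` is the machine frame currently backing guest frame `pfn`
    (a hypothetical, simplified stand-in for a guest's frame table).
    """
    # Invert the table so each machine frame maps back to its guest frame.
    m2p: Dict[int, int] = {mfn: pfn for pfn, mfn in enumerate(p2m)}
    return [m2p[mfn] for mfn in entries]


def uncanonicalize(entries: List[int], new_p2m: List[int]) -> List[int]:
    """Rewrite guest-relative frame numbers as the newly assigned machine frames."""
    return [new_p2m[pfn] for pfn in entries]


# Before the save: the guest owns four machine frames (made-up values).
old_p2m = [0x1A00, 0x1A07, 0x2B10, 0x0F42]
# Some internal structure of the guest refers to three of those frames.
references = [0x2B10, 0x1A00, 0x0F42]

# Save: references become independent of the machine frames in use.
saved = canonicalize(references, old_p2m)

# Restore: the domain is given different machine frames (made-up values).
new_p2m = [0x5000, 0x5001, 0x7777, 0x6003]
restored = uncanonicalize(saved, new_p2m)
```

In this model the whole rewrite is done by one function with full
knowledge of every reference. The point made in the thread is that in
practice Dom-0 can only do part of this work, and the guest has to
canonicalize the rest itself, which an unresponsive guest never does.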