From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: Detect guest panic
Date: Tue, 18 Nov 2008 18:13:31 +0100
Message-ID: <4922F7BB.6010501@siemens.com>
References: <20081118153645.GM1897@easter-eggs.com> <4922EFC7.9030803@mair-family.org> <20081118164922.GO1897@easter-eggs.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Emmanuel Lacour
Return-path:
Received: from gecko.sbs.de ([194.138.37.40]:20190 "EHLO gecko.sbs.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755265AbYKRROU (ORCPT ); Tue, 18 Nov 2008 12:14:20 -0500
In-Reply-To: <20081118164922.GO1897@easter-eggs.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Emmanuel Lacour wrote:
> On Tue, Nov 18, 2008 at 09:39:35AM -0700, David Mair wrote:
>>>
>> If the guest has a reachable IP address the simplest way might be to
>> ping the guest from the host every so often and, if it stops responding
>> for long enough to make you believe it has frozen, kill the qemu process
>> and run it again. I suppose you could also expose the qemu console via a
>> socket or other host file descriptor then you can have the pinging
>> program on the host try to reset the guest without killing the qemu
>> process.
>>
>
> Thanks for your help, but ping is not enough, if it doesn't answer it
> doesn't mean that the WM is crashed, it can means that only the network
> is crashed (and I have this kind of problems too (see other recent
> thread for virtio_net ;)) and I have other fixes for those kind of
> problems.
>
> Well I'm looking for some sort of "watchdog" kvm device ;)

nmi_watchdog=1 (NMI watchdog via IO-APIC) works for Linux guests if the
host uses kvm-intel (this is not yet implemented for kvm-amd). Other
guest OSes that can use the same trick should benefit as well. There is
just one open issue regarding NMIs, for which a patch is pending, but
expect the next kvm release to include a fix.
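
To confirm from inside a Linux guest that the option was actually passed
at boot, something like the following rough sketch could be used. The
function name is made up for illustration, and it only handles the plain
numeric form nmi_watchdog=<n>; if the option appears more than once, the
sketch simply takes the last one.

```python
import re
from typing import Optional


def nmi_watchdog_mode(cmdline: str) -> Optional[int]:
    """Return the numeric nmi_watchdog= value from a kernel command line.

    Hypothetical helper: returns None if the option is absent or is not
    in the plain numeric form.
    """
    mode = None
    for token in cmdline.split():
        match = re.fullmatch(r"nmi_watchdog=(\d+)", token)
        if match:
            mode = int(match.group(1))  # keep the last occurrence seen
    return mode


def guest_nmi_watchdog_mode(path: str = "/proc/cmdline") -> Optional[int]:
    """Read the running kernel's command line (Linux guests only)."""
    with open(path) as f:
        return nmi_watchdog_mode(f.read())
```

Inside the guest, guest_nmi_watchdog_mode() returning 1 would match the
IO-APIC setting mentioned above; None means the option was not given.
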
Otherwise, you are free to define and implement some virt-watchdog (what
would be a hardware watchdog with a link to some reset pin in real
life), letting the emulation code trigger a system_reset when the timer
fires. You could also choose to emulate an existing watchdog interface
for which there are already drivers for your guest OS (we've done that
for virtualizing a custom board).

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2 ES-OS
Corporate Competence Center Embedded Linux
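
As a rough illustration of the virt-watchdog idea described above, here
is a toy model of the timer logic only. It is not kvm or qemu code: the
class, its method names and the on_expire callback (a stand-in for
whatever the emulation would call to reset the guest) are all invented
for this sketch, and time is passed in explicitly to keep it simple.

```python
from typing import Callable, Optional


class VirtWatchdog:
    """Toy watchdog: the guest must kick it before the timeout elapses."""

    def __init__(self, timeout: float, on_expire: Callable[[], None]):
        self.timeout = timeout
        self.on_expire = on_expire  # stand-in for the emulator's reset hook
        self.deadline: Optional[float] = None  # None means disarmed

    def kick(self, now: float) -> None:
        """Guest driver access: arm or re-arm the timer."""
        self.deadline = now + self.timeout

    def tick(self, now: float) -> bool:
        """Emulator-side periodic check; fires once when the deadline passes."""
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None  # disarm so the reset is not repeated
            self.on_expire()
            return True
        return False
```

A healthy guest keeps calling kick(); once it hangs, the next tick()
after the deadline invokes on_expire, and the watchdog stays disarmed
until the rebooted guest kicks it again.
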