From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: NMI: Enable watchdog by default
Date: Wed, 7 Mar 2012 16:55:04 +0000
Message-ID: <4F5792E8.1000002@citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------080407050903000204000409"
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "xen-devel@lists.xensource.com"
Cc: Keir Fraser, Jan Beulich
List-Id: xen-devel@lists.xenproject.org

--------------080407050903000204000409
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit

This patch is based on one which has been in XenServer for a very long
time.

To keep the trend of documentation going, it also corrects the new
command line document.

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

--------------080407050903000204000409
Content-Type: text/x-patch; name="enable-nmi-watchdog.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="enable-nmi-watchdog.patch"

# HG changeset patch
# Parent e193375fb82af819eb55a1189308fcd4c1e8b40f
NMI: Enable watchdog by default.

This should make hung conditions more visible.

Change the timeout from 5 seconds to 300, as several operations on large
guests take far longer than 5 seconds.

Signed-off-by: Andrew Cooper

diff -r e193375fb82a docs/misc/xen-command-line.markdown
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -391,7 +391,10 @@ The optional `keep` parameter causes Xen
 
 ### watchdog
 > `= `
-Run an NMI watchdog on each processor. Defaults to disabled.
+> Default: `true`
+
+Run an NMI watchdog on each processor. A CPU hung for more than 300 seconds will result in a panic, making hung conditions more visible.
+
 ### x2apic
 ### x2apic\_phys
 ### xencons
diff -r e193375fb82a xen/arch/x86/nmi.c
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -425,11 +425,12 @@ void nmi_watchdog_tick(struct cpu_user_r
          !atomic_read(&watchdog_disable_count) )
     {
         /*
-         * Ayiee, looks like this CPU is stuck ... wait a few IRQs (5 seconds)
-         * before doing the oops ...
+         * Ayiee, looks like this CPU is stuck.
+         * Wait 5 minutes before panic because some actions on large guests
+         * can take many seconds to complete.
          */
         this_cpu(alert_counter)++;
-        if ( this_cpu(alert_counter) == 5*nmi_hz )
+        if ( this_cpu(alert_counter) == 300*nmi_hz )
         {
             console_force_unlock();
             printk("Watchdog timer detects that CPU%d is stuck!\n",
diff -r e193375fb82a xen/arch/x86/setup.c
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -55,7 +55,7 @@ static unsigned int __initdata max_cpus;
 integer_param("maxcpus", max_cpus);
 
 /* opt_watchdog: If true, run a watchdog NMI on each processor. */
-static bool_t __initdata opt_watchdog;
+static bool_t __initdata opt_watchdog = 1;
 boolean_param("watchdog", opt_watchdog);
 
 /* smep: Enable/disable Supervisor Mode Execution Protection (default on). */

--------------080407050903000204000409
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--------------080407050903000204000409--
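
For reference, the counter arithmetic behind the new timeout can be sketched in standalone C. This is only an illustration under stated assumptions, not the Xen code: `struct cpu_watchdog`, `watchdog_tick`, `ticks_until_report` and `WATCHDOG_TIMEOUT_SECS` are hypothetical names, and the reset-on-progress branch is assumed, since the hunk in the patch shows only the increment and the `== 300*nmi_hz` comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU state; not the Xen definitions. */
struct cpu_watchdog {
    unsigned int last_irq_sum;  /* activity count seen on the previous tick */
    unsigned int alert_counter; /* consecutive ticks with no progress */
};

#define WATCHDOG_TIMEOUT_SECS 300u /* timeout value introduced by the patch */

/*
 * One watchdog tick. Returns true only on the tick where the counter
 * reaches the threshold, mirroring the `==` comparison in the patch.
 * Resetting the counter on progress is an assumption of this sketch.
 */
static bool watchdog_tick(struct cpu_watchdog *wd, unsigned int irq_sum,
                          unsigned int nmi_hz)
{
    assert(nmi_hz > 0);
    if (irq_sum != wd->last_irq_sum) {
        /* The CPU made progress since the last tick: start over. */
        wd->last_irq_sum = irq_sum;
        wd->alert_counter = 0;
        return false;
    }
    wd->alert_counter++;
    return wd->alert_counter == WATCHDOG_TIMEOUT_SECS * nmi_hz;
}

/* Count the ticks a permanently stuck CPU goes through before being reported. */
static unsigned int ticks_until_report(unsigned int nmi_hz)
{
    struct cpu_watchdog wd = { .last_irq_sum = 0, .alert_counter = 0 };
    unsigned int ticks = 1;

    while (!watchdog_tick(&wd, 0, nmi_hz))
        ticks++;
    return ticks;
}
```

In this sketch the threshold scales with the tick rate, so a stuck CPU is reported after `300 * nmi_hz` ticks, which is 300 seconds regardless of how many ticks arrive per second. Any progress between ticks restarts the count.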