From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [PATCH 3/4] x86/nmi: wait for all CPUs in check_nmi_watchdog()
Date: Wed, 14 May 2014 15:09:46 +0100
Message-ID: <5373792A.6070506@citrix.com>
References: <1400072299-2285-1-git-send-email-david.vrabel@citrix.com> <1400072299-2285-4-git-send-email-david.vrabel@citrix.com> <537392FE02000078000123AD@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1WkZst-0003iK-H0 for xen-devel@lists.xenproject.org; Wed, 14 May 2014 14:10:15 +0000
In-Reply-To: <537392FE02000078000123AD@mail.emea.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: xen-devel@lists.xenproject.org, Keir Fraser, David Vrabel
List-Id: xen-devel@lists.xenproject.org

On 14/05/14 14:59, Jan Beulich wrote:
>>>> On 14.05.14 at 14:58, wrote:
>> The counting of a CPUs NMIs in check_nmi_watchdog() is only reliable
>> if all CPUs have been spinning for 5 or more ticks. There may be
>> delays in waking other CPUs from deep power states that can mean that
>> when the counts are checked CPUs haven't run for long enough.
> 5 ticks ought to be a couple of orders of a magnitude longer than
> the worst possible wakeup time. I.e. I don't buy this argument
> without actual numbers to support it.
>
> Jan

It's not necessarily the wakeup time, although on some systems that does
appear to be a consideration.

We have a prototype 8-socket system where the CPUs on the more distant
sockets showed progressively smaller NMI deltas. Most of them are declared
stuck because the BSP reads the deltas before the APs have completed
their busy loops.

~Andrew
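
The ordering problem described above can be sketched in a few lines of C.
This is only an illustrative model, not the Xen code: the struct
cpu_sample type, the count_stuck() and all_cpus_done() helpers, and the
MIN_NMI_DELTA threshold are all hypothetical names invented here. The
sketch assumes each CPU records whether it has finished its busy loop,
and shows how reading the deltas too early misreports slow-starting CPUs
as stuck.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold: a CPU needs at least this many NMIs to pass. */
#define MIN_NMI_DELTA 5

/* Hypothetical per-CPU snapshot, as seen by the BSP. */
struct cpu_sample {
    unsigned int start_count; /* NMI count when the check began */
    unsigned int now_count;   /* NMI count when the BSP reads it */
    bool loop_done;           /* has this CPU finished its busy loop? */
};

/* Returns true only if every CPU has finished its busy loop. */
bool all_cpus_done(const struct cpu_sample *cpus, size_t n)
{
    assert(cpus != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (!cpus[i].loop_done)
            return false;
    return true;
}

/*
 * Counts the CPUs whose NMI delta is below MIN_NMI_DELTA.
 * With wait_for_all set, returns -1 while any CPU is still spinning,
 * meaning "too early to judge"; the caller would retry later.
 */
int count_stuck(const struct cpu_sample *cpus, size_t n, bool wait_for_all)
{
    int stuck = 0;

    if (wait_for_all && !all_cpus_done(cpus, n))
        return -1;

    for (size_t i = 0; i < n; i++)
        if (cpus[i].now_count - cpus[i].start_count < MIN_NMI_DELTA)
            stuck++;

    return stuck;
}
```

With a snapshot taken while two distant CPUs are still mid-loop, the
unsynchronised check reports them as stuck, whereas the waiting variant
declines to judge until every CPU has finished.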