From: David Vrabel
Subject: Re: [PATCH] x86/xen: resume timer irqs early
Date: Fri, 8 Aug 2014 18:38:43 +0100
Message-ID: <53E50B23.5080200@citrix.com>
References: <1407431785-21394-1-git-send-email-david.vrabel@citrix.com>
 <53E3B797.9060701@oracle.com> <53E4A93F.6030704@citrix.com>
 <53E4D8E3.3070805@oracle.com> <53E4E02F.1000506@citrix.com>
 <20140808171502.GG13551@laptop.dumpdata.com>
In-Reply-To: <20140808171502.GG13551@laptop.dumpdata.com>
To: Konrad Rzeszutek Wilk
Cc: xen-devel@lists.xenproject.org, Boris Ostrovsky

On 08/08/14 18:15, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 08, 2014 at 03:35:27PM +0100, David Vrabel wrote:
>> On 08/08/14 15:04, Boris Ostrovsky wrote:
>>> On 08/08/2014 06:41 AM, David Vrabel wrote:
>>>> On 07/08/14 18:29, Boris Ostrovsky wrote:
>>>>> On 08/07/2014 01:16 PM, David Vrabel wrote:
>>>>>> If the timer irqs are resumed during device resume it is possible in
>>>>>> certain circumstances for the resume to hang early on, before device
>>>>>> interrupts are resumed.
>>>>>>
>>>>>> It is not entirely clear what is occurring at the point of the hang but I
>>>>>> think a task necessary for the resume calls schedule_timeout(),
>>>>>> waiting for a timer interrupt (which never arrives). This failure may
>>>>>> require specific tasks to be running on the other VCPUs to trigger
>>>>>> (processes are not frozen during a suspend/resume if PREEMPT is
>>>>>> disabled).
>>>>>>
>>>>>> Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
>>>>>> syscore_resume().
>>>>>>
>>>>>> Also add IRQF_NO_SUSPEND as it is not necessary to suspend the timer
>>>>>> interrupts and IRQF_FORCE_RESUME was already set.
>>>>>
>>>>> IRQF_NO_SUSPEND is a component of IRQF_TIMER.
>>>> So it is. How about this instead?
>>>
>>> The change makes sense so
>>>
>>> Reviewed-by: Boris Ostrovsky
>>>
>>> but I am curious whether you actually were able to prove that it in fact
>>> fixes the hang (the description doesn't make it clear).
>>
>> Without the patch repeatedly migrating a VM would hang during resume
>> after < 500 iterations. With the patch the VM was migrated > 8000 times
>> without a problem.
>
> Ah, should said patch have a Reported-by too then?

I don't think the XenServer automated test system really minds.

> It would also be neat to have that in the description of the patch I think.

I'm not really keen on system-specific numbers like this, but it would
probably be useful in this case since my analysis is so woolly.

David