From: Julien Grall
Subject: Re: [PATCH for-4.5] xen/arm: Fix virtual timer on ARMv8 Model
Date: Thu, 27 Nov 2014 12:46:14 +0000
Message-ID: <54771D16.4090607@linaro.org>
References: <1416937469-8162-1-git-send-email-julien.grall@linaro.org> <1417084837.12784.5.camel@citrix.com>
To: Stefano Stabellini, Ian Campbell
Cc: xen-devel@lists.xenproject.org, tim@xen.org, stefano.stabellini@citrix.com

Hi Stefano,

On 27/11/14 10:51, Stefano Stabellini wrote:
> On Thu, 27 Nov 2014, Ian Campbell wrote:
>> On Tue, 2014-11-25 at 17:44 +0000, Julien Grall wrote:
>>> ARMv8 model may not disable correctly the timer interrupt when Xen
>>
>> "correct disable"
>>
>>> context switch to an idle vCPU. Therefore Xen may receive a spurious
>>
>> "context switches" and s/spurious/unexpected/ (since spurious has a
>> specific meaning in the h/w which does not match what is happening here)
>>
>>> timer interrupt. As the idle domain doesn't have vGIC, Xen will crash
>>> when trying to inject the interrupt with the following stack trace.
>>>
>>> (XEN) [<0000000000228388>] _spin_lock_irqsave+0x28/0x94 (PC)
>>> (XEN) [<0000000000228380>] _spin_lock_irqsave+0x20/0x94 (LR)
>>> (XEN) [<0000000000250510>] vgic_vcpu_inject_irq+0x40/0x1b0
>>> (XEN) [<000000000024bcd0>] vtimer_interrupt+0x4c/0x54
>>> (XEN) [<0000000000247010>] do_IRQ+0x1a4/0x220
>>> (XEN) [<0000000000244864>] gic_interrupt+0x50/0xec
>>> (XEN) [<000000000024fbac>] do_trap_irq+0x20/0x2c
>>> (XEN) [<0000000000255240>] hyp_irq+0x5c/0x60
>>> (XEN) [<0000000000241084>] context_switch+0xb8/0xc4
>>> (XEN) [<000000000022482c>] schedule+0x684/0x6d0
>>> (XEN) [<000000000022785c>] __do_softirq+0xcc/0xe8
>>> (XEN) [<00000000002278d4>] do_softirq+0x14/0x1c
>>> (XEN) [<0000000000240fac>] idle_loop+0x134/0x154
>>> (XEN) [<000000000024c160>] start_secondary+0x14c/0x15c
>>> (XEN) [<0000000000000001>] 0000000000000001
>>>
>>> While we receive spurious virtual timer interrupt, this could be safely
>>> ignore for the time being. A proper fix need to be found for Xen 4.6.
>>>
>>> Signed-off-by: Julien Grall
>>
>> Acked-by: Ian Campbell
>>
>> Although I wonder if we should log, perhaps rate limited or only once.
>>
>> Also, I've some grammar nits (above and below) which I can fix on commit
>> if there is no resend...
>>
>>>
>>> ---
>>>
>>> This patch is a bug fix candidate for Xen 4.5. Any ARMv8 model may
>>> randomly crash when running Xen.
>>
>> CCing Konrad.
>>
>>> This patch don't inject the virtual timer interrupt if the current VCPU
>>> is the idle one. Entering in this function with the idle VCPU is already
>>> a bug itself. For now, I think this patch is the safest way to resolve
>>> the problem.
>>>
>>> Meanwhile, I'm investigating with ARM to see wheter the bug comes from
>>> Xen or the model.
>
> It is worth noting that there are no bad side effects of this change:
> the vtimer_interrupt is always supposed to be received on non-idle
> domains.
> As Julien wrote, the fact that we are receiving a
> vtimer_interrupt in the idle_domain is a bug, one that probably comes
> from the ARM model not emulating hardware correctly.

ARM says:

"The v8A ARM ARM says that the signal output will be disabled if , so
the signal will be set to 0. However, how this is treated by the GIC
depends on its configuration."

So I'm not so sure it's a model bug.

Regards,

-- 
Julien Grall
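
The workaround discussed in this thread (do not inject the virtual timer
interrupt when the current vCPU is the idle one) can be sketched roughly as
below. This is only an illustration under stated assumptions, not the patch
itself: struct vcpu, inject_irq() and vtimer_interrupt_sketch() are
hypothetical stand-ins for Xen's own structures and for
vgic_vcpu_inject_irq() / vtimer_interrupt(), and the assert merely models the
crash seen in the stack trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for Xen's vCPU state; not the real structure. */
struct vcpu {
    bool is_idle;   /* true for the idle vCPU, which has no vGIC */
    int  injected;  /* number of interrupts delivered to this vCPU */
};

/*
 * Hypothetical stand-in for vgic_vcpu_inject_irq(). Injecting into the
 * idle vCPU is modelled as a fatal error, mirroring the crash in the trace.
 */
static void inject_irq(struct vcpu *v, unsigned int irq)
{
    (void)irq;
    assert(!v->is_idle);
    v->injected++;
}

/*
 * Sketch of the guarded handler: drop the interrupt if the current vCPU
 * is the idle one, otherwise inject it as usual.
 * Returns true if the interrupt was injected, false if it was ignored.
 */
static bool vtimer_interrupt_sketch(struct vcpu *current_vcpu, unsigned int irq)
{
    if (current_vcpu->is_idle)
        return false;   /* unexpected interrupt, ignore for now */

    inject_irq(current_vcpu, irq);
    return true;
}
```

With this guard an interrupt arriving while the idle vCPU is running is
silently dropped. Logging it (rate limited or once), as suggested in the
review, would be a small addition at the early return.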