From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [xen-unstable test] 33690: regressions - FAIL
Date: Mon, 26 Jan 2015 14:56:31 +0000
Message-ID: <54C6559F.8010700@citrix.com>
References: <54C62D560200007800059615@mail.emea.novell.com>
 <54C63560020000780005967F@mail.emea.novell.com>
 <54C6540B.7080506@citrix.com>
 <54C65461.2000503@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1YFl5i-0005FX-DN for xen-devel@lists.xenproject.org;
 Mon, 26 Jan 2015 14:56:38 +0000
In-Reply-To: <54C65461.2000503@oracle.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Boris Ostrovsky , Jan Beulich
Cc: xen-devel
List-Id: xen-devel@lists.xenproject.org

On 26/01/15 14:51, Boris Ostrovsky wrote:
> On 01/26/2015 09:49 AM, Andrew Cooper wrote:
>> On 26/01/15 11:38, Jan Beulich wrote:
>>>>>> On 26.01.15 at 12:04, wrote:
>>>>>>> On 24.01.15 at 13:54, wrote:
>>>>> test-amd64-amd64-xl-qemut-win7-amd64 7 windows-install fail
>>>>> REGR. vs. 33637
>>>> Jan 24 00:35:16.262627 (XEN) ----[ Xen-4.6-unstable x86_64
>>>> debug=y Not tainted ]----
>>>> Jan 24 00:35:16.478599 (XEN) CPU: 1
>>>> Jan 24 00:35:16.478624 (XEN) RIP: e008:[<0000000000000000>]
>>>> 0000000000000000
>>>> Jan 24 00:35:16.486596 (XEN) RFLAGS: 0000000000010082 CONTEXT:
>>>> hypervisor
>>>> ...
>>>> Jan 24 00:35:16.678620 (XEN) Xen call trace:
>>>> Jan 24 00:35:16.678650 (XEN) []
>>>> vpmu_do_interrupt+0x2f/0x8a
>>>> Jan 24 00:35:16.686605 (XEN) []
>>>> pmu_apic_interrupt+0x33/0x35
>>>> Jan 24 00:35:16.698582 (XEN) [] do_IRQ+0x9c/0x624
>>>> Jan 24 00:35:16.698615 (XEN) []
>>>> common_interrupt+0x62/0x70
>>>> Jan 24 00:35:16.698653 (XEN) []
>>>> _spin_unlock_irq+0x30/0x31
>>>> Jan 24 00:35:16.706604 (XEN) []
>>>> __do_softirq+0x81/0x8c
>>>> Jan 24 00:35:16.706638 (XEN) []
>>>> do_softirq+0x13/0x15
>>>> Jan 24 00:35:16.718591 (XEN) []
>>>> vmx_asm_do_vmentry+0x2a/0x50
>>> I think I see what the problem here is: Commit 8097616fbd
>>> ("x86/VPMU: handle APIC_LVTPC accesses") gives the guest
>>> control over LVTPC.mask regardless of whether the vPMU was
>>> actually initialized for it. Supposedly in the case above the
>>> guest is being run with core2_no_vpmu_ops, which in
>>> particular has .do_interrupt == NULL. It's not immediately
>>> clear whether vpmu_lvtpc_update() should do the check or its
>>> (sole) caller. In any event I'm going to revert that commit as
>>> the primary suspect for causing the regression.
>> I have just fallen over this as well. I second a revert in the absence
>> of a clear way to fix the patch.
>
> I can't reproduce this -- neither at this patch level nor at full series.
>
> Yes, we can test for do_interrupt presence in vpmu_lvtpc_update() (or
> in vpmu_interrupt() itself) but since we cannot arm the counters
> (there is no do_wrmsr op) I am not sure I understand what can trigger
> this interrupt.
>
> -boris
>
>

As Jan explained, the patch in question allows guests (Windows in both
problematic cases) to arm LVTPC even when the vpmu instance has a NULL
pointer for do_interrupt. When a PMU APIC interrupt arrives, the
interrupt handler dies from a NULL function pointer dereference.

~Andrew
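
For illustration only, here is a minimal C sketch of the failure mode
described above. It is not the Xen code: every name in it
(vpmu_ops_sketch, vpmu_sketch, handle_pmu_interrupt and so on) is a
hypothetical stand-in, and it assumes only what the thread states, namely
that the ops table in use has do_interrupt == NULL. Where such a check
belongs in Xen itself (vpmu_lvtpc_update(), its caller, or the interrupt
path) is the open question in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real Xen structures. */
struct vpmu_ops_sketch {
    int (*do_interrupt)(void);  /* NULL when no vPMU is set up */
};

struct vpmu_sketch {
    const struct vpmu_ops_sketch *ops;
};

/* Stub handler for a vPMU that really is initialised. */
int working_do_interrupt(void)
{
    return 1;
}

/* Models an ops table like the one named in the thread: no handler. */
const struct vpmu_ops_sketch no_vpmu_ops_sketch = {
    .do_interrupt = NULL,
};

const struct vpmu_ops_sketch working_ops_sketch = {
    .do_interrupt = working_do_interrupt,
};

/*
 * Guarded dispatch. Calling vpmu->ops->do_interrupt() without the NULL
 * test would jump to address 0, which matches the RIP of
 * 0000000000000000 in the crash dump quoted above.
 */
int handle_pmu_interrupt(const struct vpmu_sketch *vpmu)
{
    if (vpmu == NULL || vpmu->ops == NULL ||
        vpmu->ops->do_interrupt == NULL)
        return 0;  /* nothing to call: ignore the interrupt */

    return vpmu->ops->do_interrupt();
}

/* Usage: exercise both the "no vPMU" and the "working vPMU" cases. */
void sketch_self_check(void)
{
    struct vpmu_sketch without_vpmu = { .ops = &no_vpmu_ops_sketch };
    struct vpmu_sketch with_vpmu = { .ops = &working_ops_sketch };

    assert(handle_pmu_interrupt(&without_vpmu) == 0);
    assert(handle_pmu_interrupt(&with_vpmu) == 1);
}
```

The guard only stops the crash at dispatch time. Preventing the guest
from arming LVTPC when no vPMU is initialised would be a separate check.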