public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Alexander Graf <agraf@suse.de>
To: kvm-ppc@vger.kernel.org
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, mtosatti@redhat.com,
	Paul Mackerras <paulus@samba.org>
Subject: [PULL 38/41] KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs
Date: Fri, 30 May 2014 14:42:53 +0200	[thread overview]
Message-ID: <1401453776-55285-39-git-send-email-agraf@suse.de> (raw)
In-Reply-To: <1401453776-55285-1-git-send-email-agraf@suse.de>

From: Paul Mackerras <paulus@samba.org>

This adds workarounds for two hardware bugs in the POWER8 performance
monitor unit (PMU), both related to interrupt generation.  The effect
of these bugs is that PMU interrupts can get lost, leading to tools
such as perf reporting fewer counts and samples than they should.

The first bug relates to the PMAO (perf. mon. alert occurred) bit in
MMCR0; setting it should cause an interrupt, but doesn't.  The other
bug relates to the PMAE (perf. mon. alert enable) bit in MMCR0.
Setting PMAE when a counter is negative and counter negative
conditions are enabled to cause alerts should cause an alert, but
doesn't.

The workaround for the first bug is to create conditions where a
counter will overflow, whenever we are about to restore a MMCR0
value that has PMAO set (and PMAO_SYNC clear).  The workaround for
the second bug is to freeze all counters using MMCR2 before reading
MMCR0.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/reg.h          | 12 ++++---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 59 +++++++++++++++++++++++++++++++--
 2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index e5d2e0b..4852bcf 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -670,18 +670,20 @@
 #define   MMCR0_PROBLEM_DISABLE MMCR0_FCP
 #define   MMCR0_FCM1	0x10000000UL /* freeze counters while MSR mark = 1 */
 #define   MMCR0_FCM0	0x08000000UL /* freeze counters while MSR mark = 0 */
-#define   MMCR0_PMXE	0x04000000UL /* performance monitor exception enable */
-#define   MMCR0_FCECE	0x02000000UL /* freeze ctrs on enabled cond or event */
+#define   MMCR0_PMXE	ASM_CONST(0x04000000) /* perf mon exception enable */
+#define   MMCR0_FCECE	ASM_CONST(0x02000000) /* freeze ctrs on enabled cond or event */
 #define   MMCR0_TBEE	0x00400000UL /* time base exception enable */
 #define   MMCR0_BHRBA	0x00200000UL /* BHRB Access allowed in userspace */
 #define   MMCR0_EBE	0x00100000UL /* Event based branch enable */
 #define   MMCR0_PMCC	0x000c0000UL /* PMC control */
 #define   MMCR0_PMCC_U6	0x00080000UL /* PMC1-6 are R/W by user (PR) */
 #define   MMCR0_PMC1CE	0x00008000UL /* PMC1 count enable*/
-#define   MMCR0_PMCjCE	0x00004000UL /* PMCj count enable*/
+#define   MMCR0_PMCjCE	ASM_CONST(0x00004000) /* PMCj count enable*/
 #define   MMCR0_TRIGGER	0x00002000UL /* TRIGGER enable */
-#define   MMCR0_PMAO_SYNC 0x00000800UL /* PMU interrupt is synchronous */
-#define   MMCR0_PMAO	0x00000080UL /* performance monitor alert has occurred, set to 0 after handling exception */
+#define   MMCR0_PMAO_SYNC ASM_CONST(0x00000800) /* PMU intr is synchronous */
+#define   MMCR0_C56RUN	ASM_CONST(0x00000100) /* PMC5/6 count when RUN=0 */
+/* performance monitor alert has occurred, set to 0 after handling exception */
+#define   MMCR0_PMAO	ASM_CONST(0x00000080)
 #define   MMCR0_SHRFC	0x00000040UL /* SHRre freeze conditions between threads */
 #define   MMCR0_FC56	0x00000010UL /* freeze counters 5 and 6 */
 #define   MMCR0_FCTI	0x00000008UL /* freeze counters in tags inactive mode */
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index ffbb871..60fe8ba 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -86,6 +86,12 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 	lbz	r4, LPPACA_PMCINUSE(r3)
 	cmpwi	r4, 0
 	beq	23f			/* skip if not */
+BEGIN_FTR_SECTION
+	ld	r3, HSTATE_MMCR(r13)
+	andi.	r4, r3, MMCR0_PMAO_SYNC | MMCR0_PMAO
+	cmpwi	r4, MMCR0_PMAO
+	beql	kvmppc_fix_pmao
+END_FTR_SECTION_IFSET(CPU_FTR_PMAO_BUG)
 	lwz	r3, HSTATE_PMC(r13)
 	lwz	r4, HSTATE_PMC + 4(r13)
 	lwz	r5, HSTATE_PMC + 8(r13)
@@ -726,6 +732,12 @@ skip_tm:
 	sldi	r3, r3, 31		/* MMCR0_FC (freeze counters) bit */
 	mtspr	SPRN_MMCR0, r3		/* freeze all counters, disable ints */
 	isync
+BEGIN_FTR_SECTION
+	ld	r3, VCPU_MMCR(r4)
+	andi.	r5, r3, MMCR0_PMAO_SYNC | MMCR0_PMAO
+	cmpwi	r5, MMCR0_PMAO
+	beql	kvmppc_fix_pmao
+END_FTR_SECTION_IFSET(CPU_FTR_PMAO_BUG)
 	lwz	r3, VCPU_PMC(r4)	/* always load up guest PMU registers */
 	lwz	r5, VCPU_PMC + 4(r4)	/* to prevent information leak */
 	lwz	r6, VCPU_PMC + 8(r4)
@@ -1324,6 +1336,30 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
 25:
 	/* Save PMU registers if requested */
 	/* r8 and cr0.eq are live here */
+BEGIN_FTR_SECTION
+	/*
+	 * POWER8 seems to have a hardware bug where setting
+	 * MMCR0[PMAE] along with MMCR0[PMC1CE] and/or MMCR0[PMCjCE]
+	 * when some counters are already negative doesn't seem
+	 * to cause a performance monitor alert (and hence interrupt).
+	 * The effect of this is that when saving the PMU state,
+	 * if there is no PMU alert pending when we read MMCR0
+	 * before freezing the counters, but one becomes pending
+	 * before we read the counters, we lose it.
+	 * To work around this, we need a way to freeze the counters
+	 * before reading MMCR0.  Normally, freezing the counters
+	 * is done by writing MMCR0 (to set MMCR0[FC]) which
+	 * unavoidably writes MMCR0[PMA0] as well.  On POWER8,
+	 * we can also freeze the counters using MMCR2, by writing
+	 * 1s to all the counter freeze condition bits (there are
+	 * 9 bits each for 6 counters).
+	 */
+	li	r3, -1			/* set all freeze bits */
+	clrrdi	r3, r3, 10
+	mfspr	r10, SPRN_MMCR2
+	mtspr	SPRN_MMCR2, r3
+	isync
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 	li	r3, 1
 	sldi	r3, r3, 31		/* MMCR0_FC (freeze counters) bit */
 	mfspr	r4, SPRN_MMCR0		/* save MMCR0 */
@@ -1347,6 +1383,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
 	std	r4, VCPU_MMCR(r9)
 	std	r5, VCPU_MMCR + 8(r9)
 	std	r6, VCPU_MMCR + 16(r9)
+BEGIN_FTR_SECTION
+	std	r10, VCPU_MMCR + 24(r9)
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 	std	r7, VCPU_SIAR(r9)
 	std	r8, VCPU_SDAR(r9)
 	mfspr	r3, SPRN_PMC1
@@ -1370,12 +1409,10 @@ BEGIN_FTR_SECTION
 	stw	r11, VCPU_PMC + 28(r9)
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)
 BEGIN_FTR_SECTION
-	mfspr	r4, SPRN_MMCR2
 	mfspr	r5, SPRN_SIER
 	mfspr	r6, SPRN_SPMC1
 	mfspr	r7, SPRN_SPMC2
 	mfspr	r8, SPRN_MMCRS
-	std	r4, VCPU_MMCR + 24(r9)
 	std	r5, VCPU_SIER(r9)
 	stw	r6, VCPU_PMC + 24(r9)
 	stw	r7, VCPU_PMC + 28(r9)
@@ -2311,3 +2348,21 @@ kvmppc_msr_interrupt:
 	li	r0, 1
 1:	rldimi	r11, r0, MSR_TS_S_LG, 63 - MSR_TS_T_LG
 	blr
+
+/*
+ * This works around a hardware bug on POWER8E processors, where
+ * writing a 1 to the MMCR0[PMAO] bit doesn't generate a
+ * performance monitor interrupt.  Instead, when we need to have
+ * an interrupt pending, we have to arrange for a counter to overflow.
+ */
+kvmppc_fix_pmao:
+	li	r3, 0
+	mtspr	SPRN_MMCR2, r3
+	lis	r3, (MMCR0_PMXE | MMCR0_FCECE)@h
+	ori	r3, r3, MMCR0_PMCjCE | MMCR0_C56RUN
+	mtspr	SPRN_MMCR0, r3
+	lis	r3, 0x7fff
+	ori	r3, r3, 0xffff
+	mtspr	SPRN_PMC6, r3
+	isync
+	blr
-- 
1.8.1.4


  parent reply	other threads:[~2014-05-30 12:43 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-30 12:42 [PULL 00/41] ppc patch queue 2014-05-30 Alexander Graf
2014-05-30 12:42 ` [PULL 01/41] KVM: PPC: E500: Ignore L1CSR1_ICFI,ICLFR Alexander Graf
2014-05-30 12:42 ` [PULL 02/41] KVM: PPC: E500: Add dcbtls emulation Alexander Graf
2014-05-30 12:42 ` [PULL 03/41] KVM: PPC: BOOK3S: PR: Enable Little Endian PR guest Alexander Graf
2014-05-30 12:42 ` [PULL 04/41] KVM: PPC: BOOK3S: PR: Fix WARN_ON with debug options on Alexander Graf
2014-05-30 12:42 ` [PULL 05/41] KVM: PPC: Book3S: PR: Fix C/R bit setting Alexander Graf
2014-05-30 12:42 ` [PULL 06/41] KVM: PPC: Book3S_32: PR: Access HTAB in big endian Alexander Graf
2014-05-30 12:42 ` [PULL 07/41] KVM: PPC: Book3S_64 " Alexander Graf
2014-05-30 12:42 ` [PULL 08/41] KVM: PPC: Book3S_64 PR: Access shadow slb " Alexander Graf
2014-05-30 12:42 ` [PULL 09/41] KVM: PPC: Book3S PR: Default to big endian guest Alexander Graf
2014-05-30 12:42 ` [PULL 10/41] KVM: PPC: Book3S PR: PAPR: Access HTAB in big endian Alexander Graf
2014-05-30 12:42 ` [PULL 11/41] KVM: PPC: Book3S PR: PAPR: Access RTAS " Alexander Graf
2014-05-30 12:42 ` [PULL 12/41] KVM: PPC: PR: Fill pvinfo hcall instructions " Alexander Graf
2014-05-30 12:42 ` [PULL 13/41] KVM: PPC: Make shared struct aka magic page guest endian Alexander Graf
2014-05-30 12:42 ` [PULL 14/41] KVM: PPC: Book3S PR: Do dcbz32 patching with big endian instructions Alexander Graf
2014-05-30 12:42 ` [PULL 15/41] KVM: PPC: Book3S: Move little endian conflict to HV KVM Alexander Graf
2014-05-30 12:42 ` [PULL 16/41] KVM: PPC: Book3S PR: Ignore PMU SPRs Alexander Graf
2014-05-30 12:42 ` [PULL 17/41] KVM: PPC: Book3S PR: Emulate TIR register Alexander Graf
2014-05-30 12:42 ` [PULL 18/41] KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR Alexander Graf
2014-05-30 12:42 ` [PULL 19/41] KVM: PPC: Book3S PR: Expose TAR facility to guest Alexander Graf
2014-05-30 12:42 ` [PULL 20/41] KVM: PPC: Book3S PR: Expose EBB registers Alexander Graf
2014-05-30 12:42 ` [PULL 21/41] KVM: PPC: Book3S PR: Expose TM registers Alexander Graf
2014-05-30 12:42 ` [PULL 22/41] KVM: PPC: BOOK3S: HV: Prefer CMA region for hash page table allocation Alexander Graf
2014-05-30 12:42 ` [PULL 23/41] KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest Alexander Graf
2014-05-30 12:42 ` [PULL 24/41] KVM: PPC: Disable NX for old magic page using guests Alexander Graf
2014-05-30 12:42 ` [PULL 25/41] PPC: KVM: Make NX bit available with magic page Alexander Graf
2014-05-30 12:42 ` [PULL 26/41] KVM: PPC: BOOK3S: Always use the saved DAR value Alexander Graf
2014-05-30 12:42 ` [PULL 27/41] KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler Alexander Graf
2014-05-30 12:42 ` [PULL 28/41] PPC: ePAPR: Fix hypercall on LE guest Alexander Graf
2014-05-30 12:42 ` [PULL 29/41] KVM: PPC: Graciously fail broken LE hypercalls Alexander Graf
2014-05-30 12:42 ` [PULL 30/41] KVM: PPC: MPIC: Reset IRQ source private members Alexander Graf
2014-05-30 12:42 ` [PULL 31/41] KVM: PPC: Add CAP to indicate hcall fixes Alexander Graf
2014-05-30 12:42 ` [PULL 32/41] KVM: PPC: Book3S: Add ONE_REG register names that were missed Alexander Graf
2014-05-30 12:42 ` [PULL 33/41] KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number Alexander Graf
2014-05-30 15:50   ` Paolo Bonzini
2014-05-30 15:53     ` Alexander Graf
2014-05-30 15:55       ` Paolo Bonzini
2014-05-30 15:58         ` Alexander Graf
2014-05-30 16:03           ` Paolo Bonzini
2014-05-30 16:08             ` Alexander Graf
2014-05-30 16:11               ` Paolo Bonzini
2014-05-30 16:14                 ` Alexander Graf
2014-05-30 12:42 ` [PULL 34/41] KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates() Alexander Graf
2014-05-30 12:42 ` [PULL 35/41] KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address Alexander Graf
2014-05-30 12:42 ` [PULL 36/41] KVM: PPC: Book3S HV: Fix dirty map for hugepages Alexander Graf
2014-05-30 12:42 ` [PULL 37/41] KVM: PPC: Book3S HV: Make sure we don't miss dirty pages Alexander Graf
2014-05-30 12:42 ` Alexander Graf [this message]
2014-05-30 12:42 ` [PULL 39/41] KVM: PPC: Book3S HV: Fix machine check delivery to guest Alexander Graf
2014-05-30 12:42 ` [PULL 40/41] KVM: PPC: Book3S PR: Use SLB entry 0 Alexander Graf
2014-05-30 12:42 ` [PULL 41/41] KVM: PPC: Book3S PR: Rework SLB switching code Alexander Graf
2014-05-30 12:58 ` [PULL 00/41] ppc patch queue 2014-05-30 Paolo Bonzini
2014-05-30 13:10   ` Alexander Graf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1401453776-55285-39-git-send-email-agraf@suse.de \
    --to=agraf@suse.de \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=paulus@samba.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox