From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mm01.cs.columbia.edu (mm01.cs.columbia.edu [128.59.11.253]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2F549ECAAA1 for ; Mon, 24 Oct 2022 10:29:32 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 8588A49E18; Mon, 24 Oct 2022 06:29:32 -0400 (EDT) X-Virus-Scanned: at lists.cs.columbia.edu Authentication-Results: mm01.cs.columbia.edu (amavisd-new); dkim=softfail (fail, message has been altered) header.i=@kernel.org Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id WjHX84wzUHbN; Mon, 24 Oct 2022 06:29:31 -0400 (EDT) Received: from mm01.cs.columbia.edu (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 178DE4B177; Mon, 24 Oct 2022 06:29:31 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 071D549EC2 for ; Mon, 24 Oct 2022 06:29:30 -0400 (EDT) X-Virus-Scanned: at lists.cs.columbia.edu Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id O4dltteb1nVA for ; Mon, 24 Oct 2022 06:29:28 -0400 (EDT) Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by mm01.cs.columbia.edu (Postfix) with ESMTPS id 9D6FB49E18 for ; Mon, 24 Oct 2022 06:29:28 -0400 (EDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3F2A6B810B0; Mon, 24 Oct 2022 10:29:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EE13AC433C1; Mon, 24 Oct 2022 10:29:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666607366; bh=e9EJjLUy6t6gwdKoVDL3bho6pQx55paAnahTf23c/+I=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=V5KSuSnkGIMXtpKaGL98PrEKJ9aQkHuz0r24lmwoCxFg6Ps/6RuaIyw8MPVoEEnJV WuQZE+qJyjyIoT8R000LLhkrk93xsxyMcLae7lKIcPUv6LS0WImMy3+FhzDHcLonpT WjSzZLadY4L+snHKyFblPA1QQ7uWkcpZqMA+P1vVxLfH7LhgD7cXvj9QuaGjycCjfX UTaJBGcG+0FHhgTMbdQ5PVINv5Y0ikRXakPxKYDFIzArgxjgiTokH9yQ9VT753eiwu DsVtouLX70chZw1YZ9h39HGP0i/5u/YEqsfbIPdx5/eCJb0kcgpo6kirMTosftJtTT NCBJBTiM3NyUw== Received: from sofa.misterjones.org ([185.219.108.64] helo=goblin-girl.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1omuhv-0017qA-Kt; Mon, 24 Oct 2022 11:29:23 +0100 Date: Mon, 24 Oct 2022 11:29:23 +0100 Message-ID: <86zgdlms58.wl-maz@kernel.org> From: Marc Zyngier To: Reiji Watanabe Subject: Re: [PATCH 1/9] KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode In-Reply-To: References: <20220805135813.2102034-1-maz@kernel.org> <20220805135813.2102034-2-maz@kernel.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: reijiw@google.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, kernel-team@android.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Cc: kernel-team@android.com, kvmarm@lists.cs.columbia.edu, Linux ARM , kvm@vger.kernel.org X-BeenThere: kvmarm@lists.cs.columbia.edu X-Mailman-Version: 2.1.14 Precedence: list List-Id: Where KVM/ARM decisions are made List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: kvmarm-bounces@lists.cs.columbia.edu Sender: kvmarm-bounces@lists.cs.columbia.edu Hi Reiji, Catching up on this. On Tue, 23 Aug 2022 05:30:21 +0100, Reiji Watanabe wrote: > > Hi Marc, > > On Fri, Aug 5, 2022 at 6:58 AM Marc Zyngier wrote: > > > > Ricardo recently pointed out that the PMU chained counter emulation > > in KVM wasn't quite behaving like the one on actual hardware, in > > the sense that a chained counter would expose an overflow on > > both halves of a chained counter, while KVM would only expose the > > overflow on the top half. > > > > The difference is subtle, but significant. What does the architecture > > say (DDI0087 H.a): > > > > - Before PMUv3p4, all counters but the cycle counter are 32bit > > - A 32bit counter that overflows generates a CHAIN event on the > > adjacent counter after exposing its own overflow status > > - The CHAIN event is accounted if the counter is correctly > > configured (CHAIN event selected and counter enabled) > > > > This all means that our current implementation (which uses 64bit > > perf events) prevents us from emulating this overflow on the lower half. > > > > How to fix this? By implementing the above, to the letter. > > > > This largly results in code deletion, removing the notions of > > "counter pair", "chained counters", and "canonical counter". > > The code is further restructured to make the CHAIN handling similar > > to SWINC, as the two are now extremely similar in behaviour. > > > > Reported-by: Ricardo Koller > > Signed-off-by: Marc Zyngier > > --- > > arch/arm64/kvm/pmu-emul.c | 324 +++++++++++--------------------------- > > include/kvm/arm_pmu.h | 2 - > > 2 files changed, 91 insertions(+), 235 deletions(-) > > > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c > > index 11c43bed5f97..4986e8b3ea6c 100644 > > --- a/arch/arm64/kvm/pmu-emul.c > > +++ b/arch/arm64/kvm/pmu-emul.c [...] > > +/* > > + * Perform an increment on any of the counters described in @mask, > > + * generating the overflow if required, and propagate it as a chained > > + * event if possible. > > + */ > > +static void kvm_pmu_counter_increment(struct kvm_vcpu *vcpu, > > + unsigned long mask, u32 event) > > +{ > > + int i; > > + > > + if (!kvm_vcpu_has_pmu(vcpu)) > > + return; > > + > > + if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E)) > > + return; > > + > > + /* Weed out disabled counters */ > > + mask &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0); > > + > > + for_each_set_bit(i, &mask, ARMV8_PMU_CYCLE_IDX) { > > + u64 type, reg; > > + > > + /* Filter on event type */ > > + type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i); > > + type &= kvm_pmu_event_mask(vcpu->kvm); > > + if (type != event) > > + continue; > > + > > + /* Increment this counter */ > > + reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1; > > + reg = lower_32_bits(reg); > > + __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg; > > + > > + if (reg) /* No overflow? move on */ > > + continue; > > + > > + /* Mark overflow */ > > + __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i); > > Perhaps it might be useful to create another helper that takes > care of just one counter (it would essentially do the code above > in the loop). The helper could be used (in addition to the above > loop) from the code below for the CHAIN event case and from > kvm_pmu_perf_overflow(). Then unnecessary execution of > for_each_set_bit() could be avoided for these two cases. I'm not sure it really helps. We would still need to check whether the counter is enabled, and we'd need to bring that into the helper instead of keeping it outside of the loop. [...] > > @@ -625,30 +528,27 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx) > > { > > struct arm_pmu *arm_pmu = vcpu->kvm->arch.arm_pmu; > > struct kvm_pmu *pmu = &vcpu->arch.pmu; > > - struct kvm_pmc *pmc; > > + struct kvm_pmc *pmc = &pmu->pmc[select_idx]; > > struct perf_event *event; > > struct perf_event_attr attr; > > u64 eventsel, counter, reg, data; > > > > - /* > > - * For chained counters the event type and filtering attributes are > > - * obtained from the low/even counter. We also use this counter to > > - * determine if the event is enabled/disabled. > > - */ > > - pmc = kvm_pmu_get_canonical_pmc(&pmu->pmc[select_idx]); > > - > > - reg = (pmc->idx == ARMV8_PMU_CYCLE_IDX) > > + reg = (select_idx == ARMV8_PMU_CYCLE_IDX) > > ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + pmc->idx; > > You may want to use select_idx instead of pmc->id for consistency ? Yes. Although Oliver had a point in saying that these pmc->idx vs select_idx conversions were not strictly necessary and cluttered the patch. [...] > > @@ -752,11 +607,15 @@ static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx) > > void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data, > > u64 select_idx) > > { > > + struct kvm_pmu *pmu = &vcpu->arch.pmu; > > + struct kvm_pmc *pmc = &pmu->pmc[select_idx]; > > u64 reg, mask; > > > > if (!kvm_vcpu_has_pmu(vcpu)) > > return; > > > > + kvm_pmu_stop_counter(vcpu, pmc); > > It appears that kvm_pmu_stop_counter() doesn't have to be called here > because it is called in the beginning of kvm_pmu_create_perf_event(). It feels a bit odd to change the event type without stopping the counter first, but I can't see anything going wrong if we omit it. I'll drop it. Thanks, M. -- Without deviation from the norm, progress is not possible. _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C8750ECAAA1 for ; Mon, 24 Oct 2022 10:31:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Subject:Cc:To:From:Message-ID:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=WwxPW3/uPFlxbriXeUK5hKtiv9RCOyD45gX1DwhCjEI=; b=iyEA4Ae95sTOiy m8NGzx3U8hKZ3FU3Pwk75O0tLeyTdZm/OyE+9bzisJ7WzNBLx2KrPCPEyuB9K51fAfJKsC/ilIh86 nwnRfVpzXF3Nv7dYLOu6hTYIg/LU+h6e2lOJIllDcsMOdRKf7KXDoukZ6FB8D1tHc4f2nMt8gn8sU 6BfbCMP+OxnnKcrwocRBty3FeDWjEbTi+h9BKxMGMSblEE/zrjlfsP2wK/l7irLjrOz22876mz5eq nOf87f7w+ac+MyoOzpCJNUdZYBr+KC8Xo2KcsQ1wkorSLeR6nLCDSngvadHbjQqy4aIDY7FZCpXl5 YjKAjd9hGyUfugeEKuWg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1omuiV-000kZ5-SK; Mon, 24 Oct 2022 10:30:00 +0000 Received: from ams.source.kernel.org ([145.40.68.75]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1omui1-000kAH-7c for linux-arm-kernel@lists.infradead.org; Mon, 24 Oct 2022 10:29:32 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3F2A6B810B0; Mon, 24 Oct 2022 10:29:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EE13AC433C1; Mon, 24 Oct 2022 10:29:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666607366; bh=e9EJjLUy6t6gwdKoVDL3bho6pQx55paAnahTf23c/+I=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=V5KSuSnkGIMXtpKaGL98PrEKJ9aQkHuz0r24lmwoCxFg6Ps/6RuaIyw8MPVoEEnJV WuQZE+qJyjyIoT8R000LLhkrk93xsxyMcLae7lKIcPUv6LS0WImMy3+FhzDHcLonpT WjSzZLadY4L+snHKyFblPA1QQ7uWkcpZqMA+P1vVxLfH7LhgD7cXvj9QuaGjycCjfX UTaJBGcG+0FHhgTMbdQ5PVINv5Y0ikRXakPxKYDFIzArgxjgiTokH9yQ9VT753eiwu DsVtouLX70chZw1YZ9h39HGP0i/5u/YEqsfbIPdx5/eCJb0kcgpo6kirMTosftJtTT NCBJBTiM3NyUw== Received: from sofa.misterjones.org ([185.219.108.64] helo=goblin-girl.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1omuhv-0017qA-Kt; Mon, 24 Oct 2022 11:29:23 +0100 Date: Mon, 24 Oct 2022 11:29:23 +0100 Message-ID: <86zgdlms58.wl-maz@kernel.org> From: Marc Zyngier To: Reiji Watanabe Cc: Linux ARM , kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, kernel-team@android.com Subject: Re: [PATCH 1/9] KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode In-Reply-To: References: <20220805135813.2102034-1-maz@kernel.org> <20220805135813.2102034-2-maz@kernel.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: reijiw@google.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, kernel-team@android.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221024_032929_635638_D7AB10C9 X-CRM114-Status: GOOD ( 49.03 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Hi Reiji, Catching up on this. On Tue, 23 Aug 2022 05:30:21 +0100, Reiji Watanabe wrote: > > Hi Marc, > > On Fri, Aug 5, 2022 at 6:58 AM Marc Zyngier wrote: > > > > Ricardo recently pointed out that the PMU chained counter emulation > > in KVM wasn't quite behaving like the one on actual hardware, in > > the sense that a chained counter would expose an overflow on > > both halves of a chained counter, while KVM would only expose the > > overflow on the top half. > > > > The difference is subtle, but significant. What does the architecture > > say (DDI0087 H.a): > > > > - Before PMUv3p4, all counters but the cycle counter are 32bit > > - A 32bit counter that overflows generates a CHAIN event on the > > adjacent counter after exposing its own overflow status > > - The CHAIN event is accounted if the counter is correctly > > configured (CHAIN event selected and counter enabled) > > > > This all means that our current implementation (which uses 64bit > > perf events) prevents us from emulating this overflow on the lower half. > > > > How to fix this? By implementing the above, to the letter. > > > > This largly results in code deletion, removing the notions of > > "counter pair", "chained counters", and "canonical counter". > > The code is further restructured to make the CHAIN handling similar > > to SWINC, as the two are now extremely similar in behaviour. > > > > Reported-by: Ricardo Koller > > Signed-off-by: Marc Zyngier > > --- > > arch/arm64/kvm/pmu-emul.c | 324 +++++++++++--------------------------- > > include/kvm/arm_pmu.h | 2 - > > 2 files changed, 91 insertions(+), 235 deletions(-) > > > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c > > index 11c43bed5f97..4986e8b3ea6c 100644 > > --- a/arch/arm64/kvm/pmu-emul.c > > +++ b/arch/arm64/kvm/pmu-emul.c [...] > > +/* > > + * Perform an increment on any of the counters described in @mask, > > + * generating the overflow if required, and propagate it as a chained > > + * event if possible. > > + */ > > +static void kvm_pmu_counter_increment(struct kvm_vcpu *vcpu, > > + unsigned long mask, u32 event) > > +{ > > + int i; > > + > > + if (!kvm_vcpu_has_pmu(vcpu)) > > + return; > > + > > + if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E)) > > + return; > > + > > + /* Weed out disabled counters */ > > + mask &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0); > > + > > + for_each_set_bit(i, &mask, ARMV8_PMU_CYCLE_IDX) { > > + u64 type, reg; > > + > > + /* Filter on event type */ > > + type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i); > > + type &= kvm_pmu_event_mask(vcpu->kvm); > > + if (type != event) > > + continue; > > + > > + /* Increment this counter */ > > + reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1; > > + reg = lower_32_bits(reg); > > + __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg; > > + > > + if (reg) /* No overflow? move on */ > > + continue; > > + > > + /* Mark overflow */ > > + __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i); > > Perhaps it might be useful to create another helper that takes > care of just one counter (it would essentially do the code above > in the loop). The helper could be used (in addition to the above > loop) from the code below for the CHAIN event case and from > kvm_pmu_perf_overflow(). Then unnecessary execution of > for_each_set_bit() could be avoided for these two cases. I'm not sure it really helps. We would still need to check whether the counter is enabled, and we'd need to bring that into the helper instead of keeping it outside of the loop. [...] > > @@ -625,30 +528,27 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx) > > { > > struct arm_pmu *arm_pmu = vcpu->kvm->arch.arm_pmu; > > struct kvm_pmu *pmu = &vcpu->arch.pmu; > > - struct kvm_pmc *pmc; > > + struct kvm_pmc *pmc = &pmu->pmc[select_idx]; > > struct perf_event *event; > > struct perf_event_attr attr; > > u64 eventsel, counter, reg, data; > > > > - /* > > - * For chained counters the event type and filtering attributes are > > - * obtained from the low/even counter. We also use this counter to > > - * determine if the event is enabled/disabled. > > - */ > > - pmc = kvm_pmu_get_canonical_pmc(&pmu->pmc[select_idx]); > > - > > - reg = (pmc->idx == ARMV8_PMU_CYCLE_IDX) > > + reg = (select_idx == ARMV8_PMU_CYCLE_IDX) > > ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + pmc->idx; > > You may want to use select_idx instead of pmc->id for consistency ? Yes. Although Oliver had a point in saying that these pmc->idx vs select_idx conversions were not strictly necessary and cluttered the patch. [...] > > @@ -752,11 +607,15 @@ static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx) > > void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data, > > u64 select_idx) > > { > > + struct kvm_pmu *pmu = &vcpu->arch.pmu; > > + struct kvm_pmc *pmc = &pmu->pmc[select_idx]; > > u64 reg, mask; > > > > if (!kvm_vcpu_has_pmu(vcpu)) > > return; > > > > + kvm_pmu_stop_counter(vcpu, pmc); > > It appears that kvm_pmu_stop_counter() doesn't have to be called here > because it is called in the beginning of kvm_pmu_create_perf_event(). It feels a bit odd to change the event type without stopping the counter first, but I can't see anything going wrong if we omit it. I'll drop it. Thanks, M. -- Without deviation from the norm, progress is not possible. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D245CC38A2D for ; Mon, 24 Oct 2022 10:29:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229674AbiJXK3e (ORCPT ); Mon, 24 Oct 2022 06:29:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230141AbiJXK3c (ORCPT ); Mon, 24 Oct 2022 06:29:32 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5540426E8 for ; Mon, 24 Oct 2022 03:29:28 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 57BB2B810D9 for ; Mon, 24 Oct 2022 10:29:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EE13AC433C1; Mon, 24 Oct 2022 10:29:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666607366; bh=e9EJjLUy6t6gwdKoVDL3bho6pQx55paAnahTf23c/+I=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=V5KSuSnkGIMXtpKaGL98PrEKJ9aQkHuz0r24lmwoCxFg6Ps/6RuaIyw8MPVoEEnJV WuQZE+qJyjyIoT8R000LLhkrk93xsxyMcLae7lKIcPUv6LS0WImMy3+FhzDHcLonpT WjSzZLadY4L+snHKyFblPA1QQ7uWkcpZqMA+P1vVxLfH7LhgD7cXvj9QuaGjycCjfX UTaJBGcG+0FHhgTMbdQ5PVINv5Y0ikRXakPxKYDFIzArgxjgiTokH9yQ9VT753eiwu DsVtouLX70chZw1YZ9h39HGP0i/5u/YEqsfbIPdx5/eCJb0kcgpo6kirMTosftJtTT NCBJBTiM3NyUw== Received: from sofa.misterjones.org ([185.219.108.64] helo=goblin-girl.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1omuhv-0017qA-Kt; Mon, 24 Oct 2022 11:29:23 +0100 Date: Mon, 24 Oct 2022 11:29:23 +0100 Message-ID: <86zgdlms58.wl-maz@kernel.org> From: Marc Zyngier To: Reiji Watanabe Cc: Linux ARM , kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, kernel-team@android.com Subject: Re: [PATCH 1/9] KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode In-Reply-To: References: <20220805135813.2102034-1-maz@kernel.org> <20220805135813.2102034-2-maz@kernel.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: reijiw@google.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, kernel-team@android.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Hi Reiji, Catching up on this. On Tue, 23 Aug 2022 05:30:21 +0100, Reiji Watanabe wrote: > > Hi Marc, > > On Fri, Aug 5, 2022 at 6:58 AM Marc Zyngier wrote: > > > > Ricardo recently pointed out that the PMU chained counter emulation > > in KVM wasn't quite behaving like the one on actual hardware, in > > the sense that a chained counter would expose an overflow on > > both halves of a chained counter, while KVM would only expose the > > overflow on the top half. > > > > The difference is subtle, but significant. What does the architecture > > say (DDI0087 H.a): > > > > - Before PMUv3p4, all counters but the cycle counter are 32bit > > - A 32bit counter that overflows generates a CHAIN event on the > > adjacent counter after exposing its own overflow status > > - The CHAIN event is accounted if the counter is correctly > > configured (CHAIN event selected and counter enabled) > > > > This all means that our current implementation (which uses 64bit > > perf events) prevents us from emulating this overflow on the lower half. > > > > How to fix this? By implementing the above, to the letter. > > > > This largly results in code deletion, removing the notions of > > "counter pair", "chained counters", and "canonical counter". > > The code is further restructured to make the CHAIN handling similar > > to SWINC, as the two are now extremely similar in behaviour. > > > > Reported-by: Ricardo Koller > > Signed-off-by: Marc Zyngier > > --- > > arch/arm64/kvm/pmu-emul.c | 324 +++++++++++--------------------------- > > include/kvm/arm_pmu.h | 2 - > > 2 files changed, 91 insertions(+), 235 deletions(-) > > > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c > > index 11c43bed5f97..4986e8b3ea6c 100644 > > --- a/arch/arm64/kvm/pmu-emul.c > > +++ b/arch/arm64/kvm/pmu-emul.c [...] > > +/* > > + * Perform an increment on any of the counters described in @mask, > > + * generating the overflow if required, and propagate it as a chained > > + * event if possible. > > + */ > > +static void kvm_pmu_counter_increment(struct kvm_vcpu *vcpu, > > + unsigned long mask, u32 event) > > +{ > > + int i; > > + > > + if (!kvm_vcpu_has_pmu(vcpu)) > > + return; > > + > > + if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E)) > > + return; > > + > > + /* Weed out disabled counters */ > > + mask &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0); > > + > > + for_each_set_bit(i, &mask, ARMV8_PMU_CYCLE_IDX) { > > + u64 type, reg; > > + > > + /* Filter on event type */ > > + type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i); > > + type &= kvm_pmu_event_mask(vcpu->kvm); > > + if (type != event) > > + continue; > > + > > + /* Increment this counter */ > > + reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1; > > + reg = lower_32_bits(reg); > > + __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg; > > + > > + if (reg) /* No overflow? move on */ > > + continue; > > + > > + /* Mark overflow */ > > + __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i); > > Perhaps it might be useful to create another helper that takes > care of just one counter (it would essentially do the code above > in the loop). The helper could be used (in addition to the above > loop) from the code below for the CHAIN event case and from > kvm_pmu_perf_overflow(). Then unnecessary execution of > for_each_set_bit() could be avoided for these two cases. I'm not sure it really helps. We would still need to check whether the counter is enabled, and we'd need to bring that into the helper instead of keeping it outside of the loop. [...] > > @@ -625,30 +528,27 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx) > > { > > struct arm_pmu *arm_pmu = vcpu->kvm->arch.arm_pmu; > > struct kvm_pmu *pmu = &vcpu->arch.pmu; > > - struct kvm_pmc *pmc; > > + struct kvm_pmc *pmc = &pmu->pmc[select_idx]; > > struct perf_event *event; > > struct perf_event_attr attr; > > u64 eventsel, counter, reg, data; > > > > - /* > > - * For chained counters the event type and filtering attributes are > > - * obtained from the low/even counter. We also use this counter to > > - * determine if the event is enabled/disabled. > > - */ > > - pmc = kvm_pmu_get_canonical_pmc(&pmu->pmc[select_idx]); > > - > > - reg = (pmc->idx == ARMV8_PMU_CYCLE_IDX) > > + reg = (select_idx == ARMV8_PMU_CYCLE_IDX) > > ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + pmc->idx; > > You may want to use select_idx instead of pmc->id for consistency ? Yes. Although Oliver had a point in saying that these pmc->idx vs select_idx conversions were not strictly necessary and cluttered the patch. [...] > > @@ -752,11 +607,15 @@ static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx) > > void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data, > > u64 select_idx) > > { > > + struct kvm_pmu *pmu = &vcpu->arch.pmu; > > + struct kvm_pmc *pmc = &pmu->pmc[select_idx]; > > u64 reg, mask; > > > > if (!kvm_vcpu_has_pmu(vcpu)) > > return; > > > > + kvm_pmu_stop_counter(vcpu, pmc); > > It appears that kvm_pmu_stop_counter() doesn't have to be called here > because it is called in the beginning of kvm_pmu_create_perf_event(). It feels a bit odd to change the event type without stopping the counter first, but I can't see anything going wrong if we omit it. I'll drop it. Thanks, M. -- Without deviation from the norm, progress is not possible.