From: Marc Zyngier <maz@kernel.org>
To: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: will@kernel.org, catalin.marinas@arm.com,
alexandru.elisei@arm.com, james.morse@arm.com,
suzuki.poulose@arm.com, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
konrad.wilk@oracle.com
Subject: Re: [PATCH] KVM: arm64: Disabling disabled PMU counters wastes a lot of time
Date: Tue, 29 Jun 2021 10:06:43 +0100
Message-ID: <878s2tavks.wl-maz@kernel.org>
In-Reply-To: <20210628161925.401343-1-alexandre.chartre@oracle.com>

Hi Alexandre,

Thanks for looking into this.

On Mon, 28 Jun 2021 17:19:25 +0100,
Alexandre Chartre <alexandre.chartre@oracle.com> wrote:
>
> In a KVM guest on ARM, performance counters interrupts have an

nit: arm64. 32-bit ARM never had any working KVM PMU emulation.

> unnecessary overhead which slows down execution when using the "perf
> record" command and limits the "perf record" sampling period.
>
> The problem is that when a guest VM disables counters by clearing the
> PMCR_EL0.E bit (bit 0), KVM will disable all counters defined in
> PMCR_EL0 even if they are not enabled in PMCNTENSET_EL0.
>
> KVM disables a counter by calling into the perf framework, in particular
> by calling perf_event_create_kernel_counter() which is a time consuming
> operation. So, for example, with a Neoverse N1 CPU core which has 6 event
> counters and one cycle counter, KVM will always disable all 7 counters
> even if only one is enabled.
>
> This typically happens when using the "perf record" command in a guest
> VM: perf will disable all event counters with PMCNTENTSET_EL0 and only
> uses the cycle counter. And when using the "perf record" -F option with
> a high profiling frequency, the overhead of KVM disabling all counters
> instead of one on every counter interrupt becomes very noticeable.
>
> The problem is fixed by having KVM disable only counters which are
> enabled in PMCNTENSET_EL0. If a counter is not enabled in PMCNTENSET_EL0
> then KVM will not enable it when setting PMCR_EL0.E and it will remain
> disable as long as it is not enabled in PMCNTENSET_EL0. So there is

nit: disabled

> effectively no need to disable a counter when clearing PMCR_EL0.E if it
> is not enabled PMCNTENSET_EL0.
>
> Fixes: 76993739cd6f ("arm64: KVM: Add helper to handle PMCR register bits")

This isn't a fix (the current behaviour is correct per the
architecture), "only" a performance improvement. We reserve "Fixes:"
for things that are actually broken.

> Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
> ---
> arch/arm64/kvm/pmu-emul.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index fd167d4f4215..bab4b735a0cf 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -571,7 +571,8 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
>  		kvm_pmu_enable_counter_mask(vcpu,
>  		       __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask);
>  	} else {
> -		kvm_pmu_disable_counter_mask(vcpu, mask);
> +		kvm_pmu_disable_counter_mask(vcpu,
> +			__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask);

This seems to perpetuate a flawed pattern. Why do we need to work out
the *valid* PMCNTENSET_EL0 bits? They should be correct by construction,
and the way the shadow sysreg gets populated already enforces this:

<quote>
static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
			   const struct sys_reg_desc *r)
{
	[...]
	mask = kvm_pmu_valid_counter_mask(vcpu);
	if (p->is_write) {
		val = p->regval & mask;
		if (r->Op2 & 0x1) {
			/* accessing PMCNTENSET_EL0 */
			__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) |= val;
			kvm_pmu_enable_counter_mask(vcpu, val);
			kvm_vcpu_pmu_restore_guest(vcpu);
</quote>

So the sysreg is the only thing we should consider, and I think we
should drop the useless masking. There is at least one other instance
of this in the PMU code (kvm_pmu_overflow_status()), and apart from
kvm_pmu_vcpu_reset(), only the sysreg accessors should care about the
masking to sanitise accesses.
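
To illustrate the "correct by construction" argument, here is a minimal
userspace sketch, not kernel code. Every toy_* name and the struct are
hypothetical stand-ins, and it assumes a PMU with 6 event counters plus
the cycle counter (the Neoverse N1 case from the commit message). It
models accessors that mask guest writes, then compares how many counters
a PMCR_EL0.E clear would touch when driven by the whole valid mask
versus by the shadow sysreg alone.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CYCLE_IDX 31	/* cycle counter bit, as in PMCNTENSET_EL0 */

/* Hypothetical stand-in for the vcpu PMU state; not the kernel's types. */
struct toy_vcpu {
	unsigned int nr_event_counters;	/* models PMCR_EL0.N */
	uint64_t pmcntenset;		/* models the shadow PMCNTENSET_EL0 */
};

/* Implemented event counters plus the cycle counter. */
static uint64_t toy_valid_counter_mask(const struct toy_vcpu *v)
{
	uint64_t mask = 1ULL << TOY_CYCLE_IDX;

	if (v->nr_event_counters)
		mask |= (1ULL << v->nr_event_counters) - 1;
	return mask;
}

/* The accessors are the only place where guest input gets sanitised. */
static void toy_write_pmcntenset(struct toy_vcpu *v, uint64_t regval)
{
	v->pmcntenset |= regval & toy_valid_counter_mask(v);
}

static void toy_write_pmcntenclr(struct toy_vcpu *v, uint64_t regval)
{
	v->pmcntenset &= ~(regval & toy_valid_counter_mask(v));
}

/* Counts the set bits, i.e. the counters a disable would walk over. */
static unsigned int toy_disable_cost(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Cost of clearing PMCR_EL0.E: whole valid mask vs. the sysreg alone. */
static unsigned int toy_pmcr_disable_cost(int use_sysreg_only)
{
	struct toy_vcpu v = { .nr_event_counters = 6, .pmcntenset = 0 };

	toy_write_pmcntenset(&v, ~0ULL);	/* guest tries to set every bit */
	toy_write_pmcntenclr(&v, 0x3f);		/* then keeps only the cycle counter */

	/* Valid by construction: masking again changes nothing. */
	assert((v.pmcntenset & toy_valid_counter_mask(&v)) == v.pmcntenset);

	return toy_disable_cost(use_sysreg_only ? v.pmcntenset
						: toy_valid_counter_mask(&v));
}
```

In this toy model, toy_pmcr_disable_cost(0) walks 7 counters and
toy_pmcr_disable_cost(1) walks only 1, and the assert shows that
re-applying the valid mask to the stored value is a no-op.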

What do you think?

M.

--
Without deviation from the norm, progress is not possible.