Date: Fri, 20 Jan 2023 18:04:55 +0000
From: Oliver Upton
To: Marc Zyngier
Cc: Reiji Watanabe, kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, James Morse, Alexandru Elisei,
	Zenghui Yu, Suzuki K Poulose, Paolo Bonzini, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata
Subject: Re: [PATCH v2 3/8] KVM: arm64: PMU: Preserve vCPU's PMCR_EL0.N value on vCPU reset
References: <20230117013542.371944-1-reijiw@google.com>
	<20230117013542.371944-4-reijiw@google.com>
	<86pmb9mmkv.wl-maz@kernel.org>
In-Reply-To: <86pmb9mmkv.wl-maz@kernel.org>

Hey Marc,

On Fri, Jan 20, 2023 at 12:12:32PM +0000, Marc Zyngier wrote:
> On Fri, 20 Jan 2023 00:30:33 +0000, Oliver Upton wrote:
> > I think we need to derive a sanitised value for PMCR_EL0.N, as I believe
> > nothing in the architecture prevents implementers from gluing together
> > cores with varying numbers of PMCs. We probably haven't noticed it yet
> > since it would appear all Arm designs have had 6 PMCs.
>
> This brings back the question of late onlining. How do you cope
> with the onlining of such a CPU that has a smaller set of counters
> than its online counterparts? This is at odds with the way the PMU
> code works.

You're absolutely right: any illusion we derive from the online set of
CPUs could fall apart when a different core is onlined late.

> If you have a different set of counters, you are likely to have a
> different PMU altogether:
>
> [    1.192606] hw perfevents: enabled with armv8_cortex_a57 PMU driver, 7 counters available
> [    1.201254] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
>
> This isn't a broken system, but it has two sets of cores which are
> massively different, and two PMUs.
>
> This really should tie back to the PMU type we're counting on, and to
> the set of CPUs that implements it. We already have some
> infrastructure to check for the affinity of the PMU vs the CPU we're
> running on, and this is already visible to userspace.
>
> Can't we just leave this responsibility to userspace?

Believe me, I'm always a fan of offloading things to userspace :)

If the VMM is privy to the details of the system it is on, then the
differing PMUs can be passed through to the guest with pinned vCPU
threads. I just worry about the case of a naive VMM that assumes a
homogeneous system. I don't think I could entirely blame the VMM in this
case either, as we've gone to lengths to sanitise the feature set exposed
to userspace.

What happens when a vCPU gets scheduled on a core where the vPMU doesn't
match? Ignoring other incongruences, it is not possible to virtualize
more counters than are supported by the vPMU of the core.

Stopping short of any major hacks in the kernel to fudge around the
problem, I believe we may need to provide better documentation of how
heterogeneous CPUs are handled in KVM and what userspace can do about it.

--
Thanks,
Oliver
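
For illustration only, here is a rough sketch of the kind of check a VMM
could do before pinning a vCPU, assuming it already knows which CPUs each
host PMU covers and how many counters each CPU provides. None of this is
from the series or from any existing VMM: the helper names and the example
numbers are hypothetical. The CPU list strings follow the kernel's usual
"0-3,6" format, which is what I would expect a PMU's `cpus` file under
/sys/bus/event_source/devices/ to contain, but treat that path as an
assumption and verify it on the target system.

```python
def parse_cpulist(text):
    """Parse a kernel-style CPU list such as "0-3,6" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A bare number like "6" has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def pinning_is_safe(pmu_cpus, vcpu_affinity):
    """True if every CPU the vCPU may run on is covered by the chosen PMU."""
    return bool(vcpu_affinity) and vcpu_affinity <= pmu_cpus


def usable_counters(counters_per_cpu, vcpu_affinity):
    """Largest counter count that every CPU in the affinity set can back."""
    if not vcpu_affinity:
        raise ValueError("vCPU affinity set is empty")
    return min(counters_per_cpu[cpu] for cpu in vcpu_affinity)


# Hypothetical big.LITTLE-style host: CPUs 0-1 have 6 counters, CPUs 2-3 have 4.
counters_per_cpu = {0: 6, 1: 6, 2: 4, 3: 4}
big_pmu_cpus = parse_cpulist("0-1")

# A vCPU pinned to the big cores stays within one PMU and can use 6 counters.
pinned = parse_cpulist("0-1")
# A vCPU free to run anywhere crosses PMUs and is limited to 4 counters.
unpinned = parse_cpulist("0-3")
```

With these made-up numbers, `pinning_is_safe(big_pmu_cpus, unpinned)` is
false, which is exactly the naive-VMM case above: the guest could be shown
more counters than some core it lands on can back.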