Date: Thu, 26 Feb 2026 15:05:26 -0800
From: Oliver Upton
To: Akihiko Odaki
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Kees Cook, "Gustavo A. R. Silva",
	Paolo Bonzini, Jonathan Corbet, Shuah Khan,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
	devel@daynix.com, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v3 1/2] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY
References: <20260225-hybrid-v3-0-46e8fe220880@rsg.ci.i.u-tokyo.ac.jp>
	<20260225-hybrid-v3-1-46e8fe220880@rsg.ci.i.u-tokyo.ac.jp>

On Thu, Feb 26, 2026 at 11:47:54PM +0900, Akihiko Odaki wrote:
> On 2026/02/26 23:43, Akihiko Odaki wrote:
> > On 2026/02/26 20:54, Oliver Upton wrote:
> > > Hi Akihiko,
> > >
> > > On Wed, Feb 25, 2026 at 01:31:15PM +0900, Akihiko Odaki wrote:
> > > > @@ -629,6 +629,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > > >           kvm_vcpu_load_vhe(vcpu);
> > > >       kvm_arch_vcpu_load_fp(vcpu);
> > > >       kvm_vcpu_pmu_restore_guest(vcpu);
> > > > +    if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags))
> > > > +        kvm_make_request(KVM_REQ_CREATE_PMU, vcpu);
> > >
> > > We only need to set the request if the vCPU has migrated to a different
> > > PMU implementation, no?
> >
> > Indeed.
> > I was too lazy to implement such a check since it won't affect
> > performance unless the new feature is requested, but having one may
> > still be nice.

I'd definitely like to see this.

> > > >       if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
> > > >           kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
> > > > @@ -1056,6 +1058,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> > > >           if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu))
> > > >               kvm_vcpu_reload_pmu(vcpu);
> > > > +        if (kvm_check_request(KVM_REQ_CREATE_PMU, vcpu))
> > > > +            kvm_vcpu_create_pmu(vcpu);
> > > > +
> > >
> > > My strong preference would be to squash the migration handling into
> > > kvm_vcpu_reload_pmu(). It is already reprogramming PMU events in
> > > response to other things.
> >
> > Can you share a reason for that?
> >
> > In terms of complexity, I don't think it will help reduce complexity,
> > since the only thing common between kvm_vcpu_reload_pmu() and
> > kvm_vcpu_create_pmu() is the enumeration of enabled counters, which
> > is simple enough.

I prefer it in terms of code organization. We should have a single
helper that refreshes the backing perf events when something has
globally changed for the vPMU. Besides this, "create" is confusing
since the vPMU has already been instantiated.

> > In terms of performance, I guess it is better to keep
> > kvm_vcpu_create_pmu() small since it is triggered for each migration.

I think the surrounding KVM code for iterating over the counters is
inconsequential compared to the overhead of calling into perf to
recreate the PMU events. Since we expect this to be slow, we should
only set the request when absolutely necessary.
> > > > +static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)
> > > > +{
> > > > +    struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
> > > > +
> > > > +    return kvm_pmu_enabled_counter_mask(vcpu) & BIT(pmc->idx);
> > > >   }
> > >
> > > You're churning a good bit of code, this needs to happen in a separate
> > > patch (if at all).
> >
> > It makes sense. The next version will have a separate patch for this.

If I have the full picture right, you may not need it with a common
request handler.

> > > > @@ -689,6 +710,14 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
> > > >       int eventsel;
> > > >       u64 evtreg;
> > > > +    if (!arm_pmu) {
> > > > +        arm_pmu = kvm_pmu_probe_armpmu(vcpu->cpu);
> > >
> > > kvm_pmu_probe_armpmu() takes a global mutex, I'm not sure that's what we
> > > want.
> > >
> > > What prevents us from opening a PERF_TYPE_RAW event and allowing perf to
> > > work out the right PMU for this CPU?
> >
> > Unfortunately perf does not seem to have a capability to switch to the
> > right PMU. tools/perf/Documentation/intel-hybrid.txt says the perf tool
> > creates events for each PMU in a hybrid configuration, for example.
>
> I think I misunderstood what you meant. Letting
> perf_event_create_kernel_counter() figure out which PMU to use may be a
> good idea. I'll give it a try with the next version.

Yep, this is what I was alluding to.

> > > > +        if (!arm_pmu) {
> > > > +            vcpu_set_on_unsupported_cpu(vcpu);
> > >
> > > At this point it seems pretty late to flag the CPU as unsupported. Maybe
> > > instead we can compute the union cpumask for all the PMU implementations
> > > the VM may schedule on.
> >
> > This is just a safeguard and it is the responsibility of userspace to
> > schedule the vCPU properly. It is conceptually the same as what
> > kvm_arch_vcpu_load() does when migrating to an unsupported CPU.

I agree with you that we need to have some handling for this situation.
What I don't like about this is that userspace doesn't discover its
mistake until the guest actually programs a PMC. I'd much rather preserve
the existing ABI where KVM proactively rejects running a vCPU on an
unsupported CPU.

> > > > +    case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
> > > > +        lockdep_assert_held(&vcpu->kvm->arch.config_lock);
> > > > +        if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags))
> > > > +            return 0;
> > >
> > > We don't need a getter for this, userspace should remember how it
> > > provisioned the VM.
> >
> > The getter is useful for debugging and testing. The selftest will use it
> > to query the current state.

That's fine for debugging this on your own kernel, but we don't need it
upstream. There are several other vPMU attributes that are write-only,
like KVM_ARM_VCPU_PMU_V3_SET_PMU.

Thanks,
Oliver