From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 5 May 2026 12:58:37 -0700
From: Sean Christopherson
To: Yosry Ahmed
Cc: Paolo Bonzini, Jim Mattson, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 07/13] KVM: x86/pmu: Disable counters based on Host-Only/Guest-Only bits in SVM
References: <20260430202750.3924147-1-yosry@kernel.org>
	<20260430202750.3924147-8-yosry@kernel.org>
X-Mailing-List: kvm@vger.kernel.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"

On Tue, May 05, 2026, Yosry Ahmed wrote:
> > > Did you see my other replies and code snippet tracking disabling
> > > reasons? I think the code snippet would still work, I just need to
> > > move the pmc_is_nested_disabled() check into pmc_is_locally_enabled().
> >
> > I did. IMO, all of what you proposed is an optimization to avoid the "costly"
> > checks at the time of pmc_is_locally_enabled(). In quotes because I don't think
> > the _overall_ cost is actually all that high. pmc_is_locally_enabled() is only
> > called in relatively slow paths, and my guess is the CALL+RET (or untrained RET,
> > ugh) is probably more expensive than the logic itself.
> >
> > The very nice side effect of incorporating the logic into pmc_is_locally_enabled()
> > is that I _think_ we can drop kvm_pmu_ops.reprogram_counters(), because
> > amd_mediated_pmu_handle_host_guest_bits() will instead be:
> >
> > static bool amd_pmc_is_locally_disabled(struct kvm_pmc *pmc)
> > {
> > 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> > 	struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
> > 	u64 host_guest_bits;
> >
> > 	/* Common code is supposed to check the common enable bit. */
> > 	if (WARN_ON_ONCE(!(pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE)))
> > 		return false;
> >
> > 	/*
> > 	 * If both bits are cleared, always keep the counter enabled. Otherwise,
> > 	 * counter enablement needs to be re-evaluated on every nested
> > 	 * transition (and EFER.SVME change).
> > 	 */
> > 	host_guest_bits = pmc->eventsel & AMD64_EVENTSEL_HOST_GUEST_MASK;
> > 	if (!host_guest_bits)
> > 		return true;
> >
> > 	/* If either bit is set and EFER.SVME=0, the counter is disabled. */
> > 	if (!(vcpu->arch.efer & EFER_SVME))
> > 		return false;
> >
> > 	if (host_guest_bits == AMD64_EVENTSEL_HOST_GUEST_MASK)
> > 		return true;
> >
> > 	return !!(host_guest_bits & AMD64_EVENTSEL_GUESTONLY) == is_guest_mode(vcpu);
> > }
>
> If we do this and drop kvm_pmu_ops.reprogram_counters(), we still need
> somewhere to actually clear ARCH_PERFMON_EVENTSEL_ENABLE in eventsel_hw.

Yep. I was thinking we'd do that as part of reprogram_counter().

> What if we call kvm_pmu_ops.pmc_is_locally_disabled() at the top of
> reprogram_counter(), cache the result, and use that for eventsel_hw
> modification and in pmc_is_locally_enabled()?

Why bother caching it? I suspect it will make the code harder to read, I would
generally prefer pmc_is_locally_enabled() be self-sufficient, and I don't think
the caching will elide any lookups. Legacy PMU doesn't care, and even if it
did, it should only query pmc_is_locally_enabled() once. And the mediated PMU
should only query it once.
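E.g. a minimal sketch of a self-sufficient pmc_is_locally_enabled() (purely
illustrative, not code from this series; it assumes kvm_pmu_ops grows a
pmc_is_locally_disabled() hook, and that the hook returns true when
vendor-specific bits disable the counter):

static bool pmc_is_locally_enabled(struct kvm_pmc *pmc)
{
	/* The architectural enable bit gates everything else. */
	if (!(pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE))
		return false;

	/*
	 * Let vendor code veto enablement, e.g. for AMD's Host-Only/
	 * Guest-Only bits (assumed polarity: returns true if the counter
	 * is disabled).
	 */
	return !kvm_pmu_call(pmc_is_locally_disabled)(pmc);
}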
> We'd also probably want to rename it. I would honestly just use 'nested'
> instead of 'locally_disabled' and 'mode_specific_enables' as that's the
> only current user.

But it's not strictly nested specific. E.g. even if the vCPU doesn't support
nested virtualization, a (stupid) guest can still set HOST_ONLY and effectively
disable the counter, thanks to the bizarro behavior of HOST_ONLY when
EFER.SVME=0.

> Something like this with your proposed amd_pmc_is_locally_disabled()
> above, which is similar to the kvm_pmu_ops.mediated_reprogram_counter()
> implementation in v4 except that the vendor-specific callback is more
> targeted:
>
> static void pmc_check_nested_disabled(struct kvm_pmc *pmc)
> {
> 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>
> 	if (!(pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE))
> 		return;
>
> 	if (!test_bit(pmc->idx, &pmu->pmc_has_nested_enables))
> 		return;
>
> 	pmc->is_nested_disabled = kvm_pmu_call(pmc_is_nested_disabled)(pmc);
> 	if (!pmc->is_nested_disabled)
> 		pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;

I don't want multiple paths mucking with eventsel_hw's
ARCH_PERFMON_EVENTSEL_ENABLE. That's how we end up with ordering bugs.
E.g. pmc_check_nested_disabled() *must* be called after
kvm_mediated_pmu_refresh_event_filter(), which is gross and brittle.

E.g. something like so. We can even optimize away the PMU filter lookup (which
I suspect would be more expensive in the common case?) if the counter is
disabled thanks to H/G bits.

diff --git arch/x86/kvm/pmu.c arch/x86/kvm/pmu.c
index 7b2b4ce6bdad..4ca4a0078d35 100644
--- arch/x86/kvm/pmu.c
+++ arch/x86/kvm/pmu.c
@@ -530,21 +530,24 @@ static bool pmc_is_event_allowed(struct kvm_pmc *pmc)
 	return is_fixed_event_allowed(filter, pmc->idx);
 }
 
-static void kvm_mediated_pmu_refresh_event_filter(struct kvm_pmc *pmc)
+static void kvm_mediated_pmu_refresh_eventsel_hw(struct kvm_pmc *pmc)
 {
-	bool allowed = pmc_is_event_allowed(pmc);
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 
 	if (pmc_is_gp(pmc)) {
 		pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
-		if (allowed)
+		if (!test_bit(pmc->idx, &pmu->pmc_has_mode_specific_enables) &&
+		    kvm_pmu_call(pmc_is_locally_disabled)(pmc))
+			return;
+
+		if (pmc_is_event_allowed(pmc))
 			pmc->eventsel_hw |= pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
 	} else {
 		u64 mask = intel_fixed_bits_by_idx(pmc->idx - KVM_FIXED_PMC_BASE_IDX, 0xf);
 
 		pmu->fixed_ctr_ctrl_hw &= ~mask;
-		if (allowed)
+		if (pmc_is_event_allowed(pmc))
 			pmu->fixed_ctr_ctrl_hw |= pmu->fixed_ctr_ctrl & mask;
 	}
 }
 
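For reference, a quick recap of the Host-Only/Guest-Only semantics the above
relies on, per my reading of the logic earlier in the thread (worth
double-checking against the APM):

/*
 * HostOnly  GuestOnly  EFER.SVME  => counter counts in...
 * --------  ---------  ---------  -----------------------------------
 *     0         0          *      all modes (bits are a don't-care)
 *     1         1          1      all modes
 *     0         1          1      guest mode (is_guest_mode()) only
 *     1         0          1      host mode only
 *    either bit set        0      nowhere, i.e. effectively disabled
 */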
Actually, this code is being silly. It unnecessarily performs the PMU filter
lookup when the _guest_ disables the counters via eventsel. If you first "fix"
that by querying pmc_is_locally_enabled() before checking the event filter,
then you won't even need to touch that code when introducing H/G bits, because
it will Just Work thanks to pmc_is_locally_enabled() doing the right thing (as
it should, because as mentioned earlier, that logic is architectural).

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 7b2b4ce6bdad..c84f2f971ab1 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -530,22 +530,23 @@ static bool pmc_is_event_allowed(struct kvm_pmc *pmc)
 	return is_fixed_event_allowed(filter, pmc->idx);
 }
 
-static void kvm_mediated_pmu_refresh_event_filter(struct kvm_pmc *pmc)
+static void kvm_mediated_pmu_refresh_eventsel_hw(struct kvm_pmc *pmc)
 {
-	bool allowed = pmc_is_event_allowed(pmc);
+	bool allowed = pmc_is_locally_enabled(pmc) && pmc_is_event_allowed(pmc);
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 
 	if (pmc_is_gp(pmc)) {
-		pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
 		if (allowed)
-			pmc->eventsel_hw |= pmc->eventsel &
-					    ARCH_PERFMON_EVENTSEL_ENABLE;
+			pmc->eventsel_hw |= ARCH_PERFMON_EVENTSEL_ENABLE;
+		else
+			pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
 	} else {
 		u64 mask = intel_fixed_bits_by_idx(pmc->idx - KVM_FIXED_PMC_BASE_IDX, 0xf);
 
-		pmu->fixed_ctr_ctrl_hw &= ~mask;
 		if (allowed)
-			pmu->fixed_ctr_ctrl_hw |= pmu->fixed_ctr_ctrl & mask;
+			pmu->fixed_ctr_ctrl_hw |= mask;
+		else
+			pmu->fixed_ctr_ctrl_hw &= ~mask;
 	}
 }
 
@@ -558,7 +559,7 @@ static int reprogram_counter(struct kvm_pmc *pmc)
 	u8 fixed_ctr_ctrl;
 
 	if (kvm_vcpu_has_mediated_pmu(pmu_to_vcpu(pmu))) {
-		kvm_mediated_pmu_refresh_event_filter(pmc);
+		kvm_mediated_pmu_refresh_eventsel_hw(pmc);
 		return 0;
 	}

> Also, would you rather I send a new version with everything, or do you
> want to pick up some of the patches in this version independently?

Uh, probably just send a new version. I doubt I'll get through the first few
patches before you're ready to send the next version.