Date: Tue, 22 Oct 2024 09:30:49 -0700
References: <20241014105124.24473-1-adrian.hunter@intel.com>
 <20241014105124.24473-4-adrian.hunter@intel.com>
Subject: Re: [PATCH V13 03/14] KVM: x86: Fix Intel PT Host/Guest mode when host tracing also
From: Sean Christopherson
To: Adrian Hunter
Cc: Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Mark Rutland,
 Alexander Shishkin, Heiko Carstens, Thomas Richter, Hendrik Brueckner,
 Suzuki K Poulose, Mike Leach, James Clark, coresight@lists.linaro.org,
 linux-arm-kernel@lists.infradead.org, Yicong Yang, Jonathan Cameron,
 Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers,
 Andi Kleen, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86@kernel.org,
 H Peter Anvin, Kan Liang, Zhenyu Wang, mizhang@google.com,
 kvm@vger.kernel.org, Shuah Khan, linux-kselftest@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org

On Tue, Oct 22, 2024, Adrian Hunter wrote:
> On 14/10/24 21:25, Sean Christopherson wrote:
> >> Fixes: 2ef444f1600b ("KVM: x86: Add Intel PT context switch for each vcpu")
> >> Cc: stable@vger.kernel.org
> >
> > This is way, way too big for stable@. Given that host/guest mode is disabled by
> > default and that no one has complained about this, I think it's safe to say that
> > unless we can provide a minimal patch, fixing this in LTS kernels isn't a priority.
> >
> > Alternatively, I'm tempted to simply drop support for host/guest mode.
> > It clearly hasn't been well tested, and given the lack of bug reports, likely
> > doesn't have many, if any, users. And I'm guessing the overhead needed to
> > context switch all the RTIT MSRs makes tracing in the guest relatively useless.
>
> As a control flow trace, it is not affected by context switch overhead.

Out of curiosity, how much is Intel PT used purely for control flow tracing, i.e.
without caring _at all_ about perceived execution time?

> Intel PT timestamps are also not affected by that.

Timestamps are affected because the guest will see inexplicable jumps in time.
Those gaps are unavoidable to some degree, but context switching on every entry
and exit is

> This patch reduces the MSR switching.

To be clear, I'm not objecting to any of the ideas in this patch, I'm "objecting"
to trying to put band-aids on KVM's existing implementation, which is clearly
buggy and, like far too many PMU-ish features in KVM, was probably developed
without any thought as to how it would affect use cases beyond the host admin
and the VM owner being a single person. And I'm also objecting, vehemently, to
sending anything of this magnitude and complexity to LTS kernels.

> > /me fiddles around
> >
> > LOL, yeah, this needs to be burned with fire. It's wildly broken. So for stable@,
>
> It doesn't seem wildly broken. Just the VMM passing invalid CPUID
> and KVM not validating it.

Heh, I agree with "just", but unfortunately "just ... not validating" a large
swath of userspace inputs is pretty wildly broken. More importantly, it's not
easy to fix. E.g. KVM could require the inputs to exactly match hardware, but
that creates an ABI that I'm not entirely sure is desirable in the long term.

> > I'll post a patch to hide the module param if CONFIG_BROKEN=n (and will omit
> > stable@ for the previous patch).
> >
> > Going forward, if someone actually cares about virtualizing PT enough to want to
> > fix KVM's mess, then they can put in the effort to fix all the bugs, write all
> > the tests, and in general clean up the implementation to meet KVM's current
> > standards. E.g. KVM usage of intel_pt_validate_cap() instead of KVM's guest CPUID
> > and capabilities infrastructure needs to go.
>
> The problem below seems to be caused by not validating against the *host*
> CPUID. KVM's CPUID information seems to be invalid.

Yes.

> > My vote is to queue the current code for removal, and revisit support after the
> > mediated PMU has landed. Because I don't see any point in supporting Intel PT
> > without a mediated PMU, as host/guest mode really only makes sense if the entire
> > PMU is being handed over to the guest.
>
> Why?

To simplify the implementation, and because I don't see how virtualizing Intel PT
without also enabling the mediated PMU makes any sense.

Conceptually, KVM's PT implementation is very, very similar to the mediated PMU.
They both effectively give the guest control of hardware when the vCPU starts
running, and take back control when the vCPU stops running.

If KVM allows Intel PT without the mediated PMU, then KVM and perf have to support
two separate implementations for the same model. If virtualizing Intel PT is
allowed if and only if the mediated PMU is enabled, then .handle_intel_pt_intr()
goes away. And on the flip side, it becomes super obvious that host usage of
Intel PT needs to be mutually exclusive with the mediated PMU.

> Intel PT PMU is programmed separately from the x86 PMU.

Except for the minor detail that Intel PT generates PMIs, and that PEBS can log
to PT buffers. Oh, and giving the guest control of the PMU means host usage of
Intel PT will break the host *and* guest. The host won't get PMIs, while the
guest will see spurious PMIs.

So I don't see any reason to try to separate the two.

> > [ 1458.686107] ------------[ cut here ]------------
> > [ 1458.690766] Invalid MSR 588, please adapt vmx_possible_passthrough_msrs[]
>
> VMM is trying to set a non-existent MSR. Looks like it has
> decided there are more PT address filter MSRs than are architecturally
> possible.
>
> I had no idea QEMU was so broken.

It's not QEMU that's broken, it's KVM that's broken.

> I always just use -cpu host.

Yes, and that's exactly the problem. The only people that have ever touched this
likely only ever use `-cpu host`, and so KVM's flaws have gone unnoticed.

> What were you setting?

I tweaked your selftest to feed KVM garbage.
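
For illustration only, and not taken from any posted patch: a minimal sketch of
why MSR 0x588 in the splat above cannot exist, and of the kind of check against
the host that the thread says is missing. It assumes the architectural layout as
I understand it from the SDM, i.e. the PT address filter MSRs start at 0x580
(IA32_RTIT_ADDR0_A) and come in _A/_B pairs for at most four ranges. The helper
names rtit_addr_msr_is_valid() and guest_nr_ranges_is_valid() are hypothetical
and are not KVM functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed architectural values; names follow the kernel's msr-index.h style. */
#define MSR_IA32_RTIT_ADDR0_A	0x00000580u
#define RTIT_ADDR_RANGE_MAX	4u	/* ADDR0..ADDR3, each an _A/_B pair */

/*
 * Hypothetical helper: true if @msr is a PT address filter MSR that exists
 * when @nr_ranges address ranges are supported.
 */
static inline bool rtit_addr_msr_is_valid(uint32_t msr, uint32_t nr_ranges)
{
	/* A range count above the architectural maximum is itself invalid. */
	if (nr_ranges > RTIT_ADDR_RANGE_MAX)
		return false;
	if (msr < MSR_IA32_RTIT_ADDR0_A)
		return false;
	/* Two MSRs (_A and _B) per range, laid out contiguously. */
	return (msr - MSR_IA32_RTIT_ADDR0_A) < 2 * nr_ranges;
}

/*
 * Hypothetical check: accept a guest-visible range count only if it does
 * not exceed what the host reports (and the host value is itself sane).
 */
static inline bool guest_nr_ranges_is_valid(uint32_t guest_nr, uint32_t host_nr)
{
	return host_nr <= RTIT_ADDR_RANGE_MAX && guest_nr <= host_nr;
}
```

With these assumptions, 0x587 (ADDR3_B) is the last valid index, so 0x588 would
be the _A MSR of a fifth range. Whether KVM should reject such input outright or
clamp it is exactly the ABI question raised above; the sketch takes no position.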