From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CFCA7C7618E for ; Wed, 26 Apr 2023 16:56:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233520AbjDZQ4G (ORCPT ); Wed, 26 Apr 2023 12:56:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53506 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231186AbjDZQ4E (ORCPT ); Wed, 26 Apr 2023 12:56:04 -0400 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0D08E6D for ; Wed, 26 Apr 2023 09:56:02 -0700 (PDT) Received: by mail-pf1-x44a.google.com with SMTP id d2e1a72fcca58-63d3b5c334eso4984109b3a.3 for ; Wed, 26 Apr 2023 09:56:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1682528162; x=1685120162; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=TLtDRjMjMhUauQTfVjTwks2U9xARAIB2c32l6X3FD+Q=; b=7dZxd0/7JHac0T+Cv2Wf4Z85eHRXtIHuNqAfzRAkmZf7OX37ouxMWyfnKNy0LQfCbv NhNDooT3eprDgzDK67IrumTU+QVW4ntQfUjZTCbtn0bhQPcsBP09aPjTsPZwlWs4rog+ Zy0hP1AGcKh2rzywN1xwvChzFornLI3vdqGQ8iDZjpSd8IWObZJ12pVwAs+w5ABNTJlv pDbd7YSEEjrziDQ6+1rGAgZHgXb34776GLqnHM9rLPt0B+I8nRDVHnOrtrMVjrQBci1l bAyWOIh7Ll5dG2mpmfMPV+7Zyht9hESvhlEGk/FWqn+meWW7RLus8qqnP7n8TvZNETwg F24A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682528162; x=1685120162; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=TLtDRjMjMhUauQTfVjTwks2U9xARAIB2c32l6X3FD+Q=; b=KDLf8z7igXtAYkfhn+wUPRhAULI3qVsX9qY2DvnReiKspB2R8Cbf47PWFornEeKABY +q5mDL5maZZaHkHuP4m37/eu9+XJ4pw4GdF3hN6vnqJqGw1QPctBPerp3FJvkMHdfz4B ozPKItNPfmiME8FRAh7LZm5KdhhWWw0FGpjnteNvCgmgAuFY1DIhmCMr1WAU7bSZPEhA 07yhP6yS5qEWBqgU4CugittihOZeIr3Lh9XcS4CCEGNJ4Sl/l1i/9EshlsiR1ydZPMmA DaxDXx2dr09Z63oO6rJkayClHdOwR0qxZ+8y99eRM5QexvOcVvYBnRJbGhs9wolvyW0b yMQg== X-Gm-Message-State: AAQBX9e5YQFr83mrOlfgB10wMDvSIo337CDSfLN4MaOEdOBaowiW4PDG eEyf9vH6PEOy5MW4fZ7iicsxdOuAypM= X-Google-Smtp-Source: AKy350Y3mlmjd1pqRsrZI2x6e/W959s9/299pvAYOMxJOnQ1S5fmSFYRp5P+DHpNT1clQpfA3W+fc7MQUo8= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:aa7:8890:0:b0:625:5949:6dc0 with SMTP id z16-20020aa78890000000b0062559496dc0mr8750261pfe.4.1682528162349; Wed, 26 Apr 2023 09:56:02 -0700 (PDT) Date: Wed, 26 Apr 2023 09:56:00 -0700 In-Reply-To: <20230426082601.85372-1-abusse@amazon.com> Mime-Version: 1.0 References: <20230426082601.85372-1-abusse@amazon.com> Message-ID: Subject: Re: [PATCH] KVM: x86: Add a vCPU stat for #AC exceptions From: Sean Christopherson To: Anselm Busse Cc: dwmw@amazon.co.uk, hborghor@amazon.de, sironi@amazon.de, Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , kvm@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="us-ascii" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Apr 26, 2023, Anselm Busse wrote: > This patch adds a KVM vCPU stat that reflects the number of #AC > exceptions caused by a guest. This improves the identification and > debugging of issues that are possibly caused by guests triggering > split-locks and allows more insides compared to the current situation > of having only a warning printed when an #AC exception is raised. Irrespective of the inaccuracy Xiaoyao pointed out, I don't want to add a one-off stat for _any_ exception. I agree with what Marc said[*] when we (Google / GCP) tried to push our pile o' stats upstream: : Because I'm pretty sure that whatever stat we expose, every cloud : vendor will want their own variant, so we may just as well put the : matter in their own hands. That doesn't mean I don't want a massive pile of stats about all things KVM, quite the opposite, but I don't think they belong in upstream where KVM has to maintain them in perpetuity. E.g. at some point in the (distant) future, split-lock #AC will be completely uninteresting because all software will have been updated/fixed. FWIW, we looked at using eBPF for our out-of-tree stats and ultimately decided that carrying patches to add our stats would be significantly easier to maintain than an eBPF-based approach, e.g. rebasing this patch is trivial. But the challenges we anticipated with switching to eBPF were largely specific to running at scale. eBPF is a very viable approach for gathering information for debug, development, individual users, etc. On idea I had for easing the pain of out-of-tree stats was to clean up KVM x86's tracepoints, e.g. to give eBPF programs more stable and useful hooks, but also to allow CSPs like us to play macro games to "inject" stats at key points, e.g. add infrastructure to #define overload tracepoints to make KVM trampoline through out-of-tree stats code. But we haven't pursued that idea because (a) as above, carrying patches for out-of-tree stats requires minimal effort and (b) it wouldn't eliminate "invasive" code because we'd (GCP) inevitably want stats in places where a KVM tracepoint makes no sense. So as much as I advocate for pushing code upstream, this is one of the few areas where I think it's better to carry code out-of-tree. [*] https://lore.kernel.org/all/875yusv3vm.wl-maz@kernel.org