Date: Wed, 2 Jun 2021 15:39:47 +0000
From: Sean Christopherson
To: Lai Jiangshan
Cc: linux-kernel@vger.kernel.org, Lai Jiangshan, Maxim Levitsky,
        Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
        Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
        x86@kernel.org, "H. Peter Anvin", Radim Krčmář,
        kvm@vger.kernel.org
Peter Anvin" , Radim =?utf-8?B?S3LEjW3DocWZ?= , kvm@vger.kernel.org Subject: Re: [PATCH V2] KVM: X86: fix tlb_flush_guest() Message-ID: References: <4c3ef411ba68ca726531a379fb6c9d16178c8513.camel@redhat.com> <20210531172256.2908-1-jiangshanlai@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210531172256.2908-1-jiangshanlai@gmail.com> Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On Tue, Jun 01, 2021, Lai Jiangshan wrote: > From: Lai Jiangshan > > For KVM_VCPU_FLUSH_TLB used in kvm_flush_tlb_multi(), the guest expects > the hypervisor do the operation that equals to native_flush_tlb_global() > or invpcid_flush_all() in the specified guest CPU. I don't like referencing guest code, here and in the comment. The paravirt flush isn't limited to Linux guests, the existing kernel code might change, and it doesn't directly explain KVM's responsibilities in response to a guest TLB flush. Something like: When using shadow paging, unload the guest MMU when emulating a guest TLB flush to all roots are synchronized. From the guest's perspective, flushing the TLB ensure any and all modifications to its PTEs will be recognized by the CPU. Note, unloading the MMU is overkill, but is done to mirror KVM's existing handling of INVPCID(all) and ensure the bug is squashed. Future cleanup can be done to more precisely synchronize roots when servicing a guest TLB flush. > When TDP is enabled, there is no problem to just flush the hardware > TLB of the specified guest CPU. > > But when using shadowpaging, the hypervisor should have to sync the > shadow pagetable at first before flushing the hardware TLB so that > it can truely emulate the operation of invpcid_flush_all() in guest. > > The problem exists since the first implementation of KVM_VCPU_FLUSH_TLB > in commit f38a7b75267f ("KVM: X86: support paravirtualized help for TLB > shootdowns"). But I don't think it would be a real world problem that > time since the local CPU's tlb is flushed at first in guest before queuing > KVM_VCPU_FLUSH_TLB to other CPUs. It means that the hypervisor syncs the > shadow pagetable before seeing the corresponding KVM_VCPU_FLUSH_TLBs. > > After commit 4ce94eabac16 ("x86/mm/tlb: Flush remote and local TLBs > concurrently"), the guest doesn't flush local CPU's tlb at first and > the hypervisor can handle other VCPU's KVM_VCPU_FLUSH_TLB earlier than > local VCPU's tlb flush and might flush the hardware tlb without syncing > the shadow pagetable beforehand. Maybe reword the last two paragraphs to make it clear that a change in the Linux kernel exposed the KVM bug? This bug has existed since the introduction of KVM_VCPU_FLUSH_TLB, but was only recently exposed after Linux guests stopped flushing the local CPU's TLB prior to flushing remote TLBs (see commit 4ce94eabac16, "x86/mm/tlb: Flush remote and local TLBs concurrently"). > Cc: Maxim Levitsky > Fixes: f38a7b75267f ("KVM: X86: support paravirtualized help for TLB shootdowns") > Signed-off-by: Lai Jiangshan > --- > Changed from V1 > Use kvm_mmu_unload() instead of KVM_REQ_MMU_RELOAD to avoid > causing unneeded iteration of vcpu_enter_guest(). 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index bbc4e04e67ad..27248e330767 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3072,6 +3072,22 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu) > static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu) > { > ++vcpu->stat.tlb_flush; > + > + if (!tdp_enabled) { > + /* > + * When two dimensional paging is not enabled, the > + * operation should equal to native_flush_tlb_global() > + * or invpcid_flush_all() on the guest's behalf via > + * synchronzing shadow pagetable and flushing. > + * > + * kvm_mmu_unload() results consequent kvm_mmu_load() > + * before entering guest which will do the required > + * pagetable synchronzing and TLB flushing. > + */ > + kvm_mmu_unload(vcpu); I don't like the full unload, but I suppose the big hammer does make sense for a backport since handle_invcpid() and toggling CR4.PGE already nuke everything. :-/ > + return; > + } > + > static_call(kvm_x86_tlb_flush_guest)(vcpu); > } > > -- > 2.19.1.6.gb485710b >