From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 11 Apr 2024 10:28:35 -0700
From: David Matlack
To: James Houghton
Cc: Andrew Morton, Paolo Bonzini, Yu Zhao, Marc Zyngier, Oliver Upton,
 Sean Christopherson, Jonathan Corbet, James Morse, Suzuki K Poulose,
 Zenghui Yu, Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, "H. Peter Anvin", Steven Rostedt,
 Masami Hiramatsu, Mathieu Desnoyers, Shaoqin Huang, Gavin Shan,
 Ricardo Koller, Raghavendra Rao Ananta, Ryan Roberts, David Rientjes,
 Axel Rasmussen, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v3 5/7] KVM: x86: Participate in bitmap-based PTE aging
References: <20240401232946.1837665-1-jthoughton@google.com>
 <20240401232946.1837665-6-jthoughton@google.com>
X-Mailing-List: kvmarm@lists.linux.dev

On 2024-04-11 10:08 AM, David Matlack wrote:
> On 2024-04-01 11:29 PM, James Houghton wrote:
> > Only handle the TDP MMU case for now. In other cases, if a bitmap was
> > not provided, fallback to the slowpath that takes mmu_lock, or, if a
> > bitmap was provided, inform the caller that the bitmap is unreliable.
> >
> > Suggested-by: Yu Zhao
> > Signed-off-by: James Houghton
> > ---
> >  arch/x86/include/asm/kvm_host.h | 14 ++++++++++++++
> >  arch/x86/kvm/mmu/mmu.c          | 16 ++++++++++++++--
> >  arch/x86/kvm/mmu/tdp_mmu.c      | 10 +++++++++-
> >  3 files changed, 37 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 3b58e2306621..c30918d0887e 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -2324,4 +2324,18 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
> >   */
> >  #define KVM_EXIT_HYPERCALL_MBZ GENMASK_ULL(31, 1)
> >
> > +#define kvm_arch_prepare_bitmap_age kvm_arch_prepare_bitmap_age
> > +static inline bool kvm_arch_prepare_bitmap_age(struct mmu_notifier *mn)
> > +{
> > +	/*
> > +	 * Indicate that we support bitmap-based aging when using the TDP MMU
> > +	 * and the accessed bit is available in the TDP page tables.
> > +	 *
> > +	 * We have no other preparatory work to do here, so we do not need to
> > +	 * redefine kvm_arch_finish_bitmap_age().
> > +	 */
> > +	return IS_ENABLED(CONFIG_X86_64) && tdp_mmu_enabled
> > +		&& shadow_accessed_mask;
> > +}
> > +
> >  #endif /* _ASM_X86_KVM_HOST_H */
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 992e651540e8..fae1a75750bb 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1674,8 +1674,14 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> >  {
> >  	bool young = false;
> >
> > -	if (kvm_memslots_have_rmaps(kvm))
> > +	if (kvm_memslots_have_rmaps(kvm)) {
> > +		if (range->lockless) {
> > +			kvm_age_set_unreliable(range);
> > +			return false;
> > +		}
>
> If a VM has TDP MMU enabled, supports A/D bits, and is using nested
> virtualization, MGLRU will effectively be blind to all accesses made by
> the VM.
>
> kvm_arch_prepare_bitmap_age() will return true indicating that the
> bitmap is supported.
> But then kvm_age_gfn() and kvm_test_age_gfn() will
> return false immediately and indicate the bitmap is unreliable because a
> shadow root is allocated. The notifier will then return
> MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE.
>
> Looking at the callers, MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE is never
> consumed or used. So I think MGLRU will assume all memory is
> unaccessed?
>
> One way to improve the situation would be to re-order the TDP MMU
> function first and return young instead of false, so that way MGLRU at
> least has visibility into accesses made by L1 (and L2 if EPT is disabled
> in L2). But that still means MGLRU is blind to accesses made by L2.
>
> What about grabbing the mmu_lock if there's a shadow root allocated and
> getting rid of MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE altogether?
>
> 	if (kvm_memslots_have_rmaps(kvm)) {
> 		write_lock(&kvm->mmu_lock);
> 		young |= kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
> 		write_unlock(&kvm->mmu_lock);
> 	}
>
> The TDP MMU walk would still be lockless. KVM only has to take the
> mmu_lock to collect accesses made by L2.
>
> kvm_age_rmap() and kvm_test_age_rmap() will need to become bitmap-aware
> as well, but that seems relatively simple with the helper functions.

Wait, even simpler, just check kvm_memslots_have_rmaps() in
kvm_arch_prepare_bitmap_age() and skip the shadow MMU when processing a
bitmap request.

i.e.

static inline bool kvm_arch_prepare_bitmap_age(struct kvm *kvm, struct mmu_notifier *mn)
{
	/*
	 * Indicate that we support bitmap-based aging when using the TDP MMU
	 * and the accessed bit is available in the TDP page tables.
	 *
	 * We have no other preparatory work to do here, so we do not need to
	 * redefine kvm_arch_finish_bitmap_age().
	 */
	return IS_ENABLED(CONFIG_X86_64) && tdp_mmu_enabled &&
	       shadow_accessed_mask && !kvm_memslots_have_rmaps(kvm);
}

bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
{
	bool young = false;

	if (!range->arg.metadata->bitmap && kvm_memslots_have_rmaps(kvm))
		young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);

	if (tdp_mmu_enabled)
		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);

	return young;
}

bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
{
	bool young = false;

	if (!range->arg.metadata->bitmap && kvm_memslots_have_rmaps(kvm))
		young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);

	if (tdp_mmu_enabled)
		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);

	return young;
}

Sure this could race with the creation of a shadow root but so can the
non-bitmap code.
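
To make the difference between the two behaviours easier to see, here is a
minimal userspace sketch of the decision logic only. It is not kernel code:
struct model_vm, prepare_v3(), age_v3(), prepare_suggested(),
age_suggested(), scan_v3() and scan_suggested() are all hypothetical
stand-ins, and the sketch assumes that a lockless request is the same thing
as a bitmap request and that the caller takes the bitmap path only when the
prepare hook returns true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the state discussed above; not kernel code. */
struct model_vm {
	bool tdp_mmu_enabled;	/* models tdp_mmu_enabled */
	bool accessed_mask;	/* models shadow_accessed_mask != 0 */
	bool have_rmaps;	/* models kvm_memslots_have_rmaps(), e.g. nested virt */
	bool tdp_young;		/* an accessed bit is set in the TDP MMU */
	bool shadow_young;	/* an accessed bit is set in the shadow MMU (L2 access) */
};

/* v3 as posted: bitmap support is advertised regardless of rmaps. */
static bool prepare_v3(const struct model_vm *vm)
{
	return vm->tdp_mmu_enabled && vm->accessed_mask;
}

/* v3 as posted: a lockless request bails out as soon as rmaps exist. */
static bool age_v3(const struct model_vm *vm, bool lockless, bool *unreliable)
{
	bool young = false;

	if (vm->have_rmaps) {
		if (lockless) {
			*unreliable = true;
			return false;
		}
		young = vm->shadow_young;
	}
	if (vm->tdp_mmu_enabled)
		young |= vm->tdp_young;
	return young;
}

/* Suggested: refuse bitmap aging up front when rmaps exist. */
static bool prepare_suggested(const struct model_vm *vm)
{
	return vm->tdp_mmu_enabled && vm->accessed_mask && !vm->have_rmaps;
}

/* Suggested: skip the shadow MMU only for bitmap requests. */
static bool age_suggested(const struct model_vm *vm, bool bitmap)
{
	bool young = false;

	if (!bitmap && vm->have_rmaps)
		young = vm->shadow_young;
	if (vm->tdp_mmu_enabled)
		young |= vm->tdp_young;
	return young;
}

/* Hypothetical caller: takes the bitmap path only if prepare() allows it. */
static bool scan_v3(const struct model_vm *vm)
{
	bool unreliable = false;

	/* The unreliable flag is dropped here, mirroring "never consumed". */
	return age_v3(vm, prepare_v3(vm), &unreliable);
}

static bool scan_suggested(const struct model_vm *vm)
{
	return age_suggested(vm, prepare_suggested(vm));
}
```

In this model, a VM with the TDP MMU, A/D bits and rmaps allocated reports
nothing through scan_v3() even when both MMUs hold accessed bits, while
scan_suggested() falls back to the non-bitmap path and sees them. A VM
without rmaps behaves the same under both.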