* [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn)
@ 2024-10-11 2:10 Sean Christopherson
2024-10-11 2:10 ` [PATCH 01/18] KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from RO SPTE Sean Christopherson
` (19 more replies)
0 siblings, 20 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
This is effectively an extensive of the kvm_follow_pfn series[*] (and
applies on top of said series), but is x86-specific and is *almost*
entirely related to Accessed and Dirty bits.
There's no central theme beyond cleaning up things that were discovered
when digging deep for the kvm_follow_pfn overhaul, and to a lesser extent
the series to add MGLRU support in KVM x86.
[*] https://lore.kernel.org/all/20241010182427.1434605-1-seanjc@google.com
Sean Christopherson (18):
KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from
RO SPTE
KVM: x86/mmu: Always set SPTE's dirty bit if it's created as writable
KVM: x86/mmu: Fold all of make_spte()'s writable handling into one
if-else
KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit
KVM: x86/mmu: Don't flush TLBs when clearing Dirty bit in shadow MMU
KVM: x86/mmu: Drop ignored return value from
kvm_tdp_mmu_clear_dirty_slot()
KVM: x86/mmu: Fold mmu_spte_update_no_track() into mmu_spte_update()
KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears
MMU-writable
KVM: x86/mmu: Add a dedicated flag to track if A/D bits are globally
enabled
KVM: x86/mmu: Set shadow_accessed_mask for EPT even if A/D bits
disabled
KVM: x86/mmu: Set shadow_dirty_mask for EPT even if A/D bits disabled
KVM: x86/mmu: Use Accessed bit even when _hardware_ A/D bits are
disabled
KVM: x86/mmu: Process only valid TDP MMU roots when aging a gfn range
KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE
found
KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE
changes
KVM: x86/mmu: Set Dirty bit for new SPTEs, even if _hardware_ A/D bits
are disabled
KVM: Allow arch code to elide TLB flushes when aging a young page
KVM: x86: Don't emit TLB flushes when aging SPTEs for mmu_notifiers
arch/x86/kvm/Kconfig | 1 +
arch/x86/kvm/mmu/mmu.c | 72 +++++++-----------------
arch/x86/kvm/mmu/spte.c | 59 ++++++++------------
arch/x86/kvm/mmu/spte.h | 72 ++++++++++++------------
arch/x86/kvm/mmu/tdp_mmu.c | 109 +++++++++++++++++--------------------
arch/x86/kvm/mmu/tdp_mmu.h | 2 +-
virt/kvm/Kconfig | 4 ++
virt/kvm/kvm_main.c | 20 ++-----
8 files changed, 142 insertions(+), 197 deletions(-)
base-commit: 3f9cf3d569fdf7fb451294b636991291965573ce
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 01/18] KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from RO SPTE
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 02/18] KVM: x86/mmu: Always set SPTE's dirty bit if it's created as writable Sean Christopherson
` (18 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Don't force a remote TLB flush if KVM happens to effectively "refresh" a
read-only SPTE that is still MMU-Writable, as KVM allows MMU-Writable SPTEs
to have Writable TLB entries, even if the SPTE is !Writable. Remote TLBs
need to be flushed only when creating a read-only SPTE for write-tracking,
i.e. when installing a !MMU-Writable SPTE.
In practice, especially now that KVM doesn't overwrite existing SPTEs when
prefetching, KVM will rarely "refresh" a read-only, MMU-Writable SPTE,
i.e. this is unlikely to eliminate many, if any, TLB flushes. But, more
precisely flushing makes it easier to understand exactly when KVM does and
doesn't need to flush.
Note, x86 architecturally requires relevant TLB entries to be invalidated
on a page fault, i.e. there is no risk of putting a vCPU into an infinite
loop of read-only page faults.
Cc: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 12 +++++++-----
arch/x86/kvm/mmu/spte.c | 6 ------
2 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 55eeca931e23..176fc37540df 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -514,9 +514,12 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
/* Rules for using mmu_spte_update:
* Update the state bits, it means the mapped pfn is not changed.
*
- * Whenever an MMU-writable SPTE is overwritten with a read-only SPTE, remote
- * TLBs must be flushed. Otherwise rmap_write_protect will find a read-only
- * spte, even though the writable spte might be cached on a CPU's TLB.
+ * If the MMU-writable flag is cleared, i.e. the SPTE is write-protected for
+ * write-tracking, remote TLBs must be flushed, even if the SPTE was read-only,
+ * as KVM allows stale Writable TLB entries to exist. When dirty logging, KVM
+ * flushes TLBs based on whether or not dirty bitmap/ring entries were reaped,
+ * not whether or not SPTEs were modified, i.e. only the write-tracking case
+ * needs to flush at the time the SPTEs is modified, before dropping mmu_lock.
*
* Returns true if the TLB needs to be flushed
*/
@@ -533,8 +536,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
* we always atomically update it, see the comments in
* spte_has_volatile_bits().
*/
- if (is_mmu_writable_spte(old_spte) &&
- !is_writable_pte(new_spte))
+ if (is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte))
flush = true;
/*
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index f1a50a78badb..e5af69a8f101 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -133,12 +133,6 @@ static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
*/
bool spte_has_volatile_bits(u64 spte)
{
- /*
- * Always atomically update spte if it can be updated
- * out of mmu-lock, it can ensure dirty bit is not lost,
- * also, it can help us to get a stable is_writable_pte()
- * to ensure tlb flush is not missed.
- */
if (!is_writable_pte(spte) && is_mmu_writable_spte(spte))
return true;
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 02/18] KVM: x86/mmu: Always set SPTE's dirty bit if it's created as writable
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
2024-10-11 2:10 ` [PATCH 01/18] KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from RO SPTE Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 03/18] KVM: x86/mmu: Fold all of make_spte()'s writable handling into one if-else Sean Christopherson
` (17 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
When creating a SPTE, always set the Dirty bit if the Writable bit is set,
i.e. if KVM is creating a writable mapping. If two (or more) vCPUs are
racing to install a writable SPTE on a !PRESENT fault, only the "winning"
vCPU will create a SPTE with W=1 and D=1, all "losers" will generate a
SPTE with W=1 && D=0.
As a result, tdp_mmu_map_handle_target_level() will fail to detect that
the losing faults are effectively spurious, and will overwrite the D=1
SPTE with a D=0 SPTE. For normal VMs, overwriting a present SPTE is a
small performance blip; KVM blasts a remote TLB flush, but otherwise life
goes on.
For upcoming TDX VMs, overwriting a present SPTE is much more costly, and
can even lead to the VM being terminated if KVM isn't careful, e.g. if KVM
attempts TDH.MEM.PAGE.AUG because the TDX code doesn't detect that the
new SPTE is actually the same as the old SPTE (which would be a bug in its
own right).
Suggested-by: Sagi Shahar <sagis@google.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/spte.c | 28 +++++++++-------------------
1 file changed, 9 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index e5af69a8f101..09ce93c4916a 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -219,30 +219,21 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
if (pte_access & ACC_WRITE_MASK) {
spte |= PT_WRITABLE_MASK | shadow_mmu_writable_mask;
- /*
- * When overwriting an existing leaf SPTE, and the old SPTE was
- * writable, skip trying to unsync shadow pages as any relevant
- * shadow pages must already be unsync, i.e. the hash lookup is
- * unnecessary (and expensive).
- *
- * The same reasoning applies to dirty page/folio accounting;
- * KVM marked the folio dirty when the old SPTE was created,
- * thus there's no need to mark the folio dirty again.
- *
- * Note, both cases rely on KVM not changing PFNs without first
- * zapping the old SPTE, which is guaranteed by both the shadow
- * MMU and the TDP MMU.
- */
- if (is_last_spte(old_spte, level) && is_writable_pte(old_spte))
- goto out;
-
/*
* Unsync shadow pages that are reachable by the new, writable
* SPTE. Write-protect the SPTE if the page can't be unsync'd,
* e.g. it's write-tracked (upper-level SPs) or has one or more
* shadow pages and unsync'ing pages is not allowed.
+ *
+ * When overwriting an existing leaf SPTE, and the old SPTE was
+ * writable, skip trying to unsync shadow pages as any relevant
+ * shadow pages must already be unsync, i.e. the hash lookup is
+ * unnecessary (and expensive). Note, this relies on KVM not
+ * changing PFNs without first zapping the old SPTE, which is
+ * guaranteed by both the shadow MMU and the TDP MMU.
*/
- if (mmu_try_to_unsync_pages(vcpu->kvm, slot, gfn, synchronizing, prefetch)) {
+ if ((!is_last_spte(old_spte, level) || !is_writable_pte(old_spte)) &&
+ mmu_try_to_unsync_pages(vcpu->kvm, slot, gfn, synchronizing, prefetch)) {
wrprot = true;
pte_access &= ~ACC_WRITE_MASK;
spte &= ~(PT_WRITABLE_MASK | shadow_mmu_writable_mask);
@@ -252,7 +243,6 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
if (pte_access & ACC_WRITE_MASK)
spte |= spte_shadow_dirty_mask(spte);
-out:
if (prefetch && !synchronizing)
spte = mark_spte_for_access_track(spte);
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 03/18] KVM: x86/mmu: Fold all of make_spte()'s writable handling into one if-else
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
2024-10-11 2:10 ` [PATCH 01/18] KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from RO SPTE Sean Christopherson
2024-10-11 2:10 ` [PATCH 02/18] KVM: x86/mmu: Always set SPTE's dirty bit if it's created as writable Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 04/18] KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit Sean Christopherson
` (16 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Now that make_spte() no longer uses a funky goto to bail out for a special
case of its unsync handling, combine all of the unsync vs. writable logic
into a single if-else statement.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/spte.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 09ce93c4916a..030813781a63 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -217,8 +217,6 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
spte |= (u64)pfn << PAGE_SHIFT;
if (pte_access & ACC_WRITE_MASK) {
- spte |= PT_WRITABLE_MASK | shadow_mmu_writable_mask;
-
/*
* Unsync shadow pages that are reachable by the new, writable
* SPTE. Write-protect the SPTE if the page can't be unsync'd,
@@ -233,16 +231,13 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
* guaranteed by both the shadow MMU and the TDP MMU.
*/
if ((!is_last_spte(old_spte, level) || !is_writable_pte(old_spte)) &&
- mmu_try_to_unsync_pages(vcpu->kvm, slot, gfn, synchronizing, prefetch)) {
+ mmu_try_to_unsync_pages(vcpu->kvm, slot, gfn, synchronizing, prefetch))
wrprot = true;
- pte_access &= ~ACC_WRITE_MASK;
- spte &= ~(PT_WRITABLE_MASK | shadow_mmu_writable_mask);
- }
+ else
+ spte |= PT_WRITABLE_MASK | shadow_mmu_writable_mask |
+ spte_shadow_dirty_mask(spte);
}
- if (pte_access & ACC_WRITE_MASK)
- spte |= spte_shadow_dirty_mask(spte);
-
if (prefetch && !synchronizing)
spte = mark_spte_for_access_track(spte);
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 04/18] KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (2 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 03/18] KVM: x86/mmu: Fold all of make_spte()'s writable handling into one if-else Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 05/18] KVM: x86/mmu: Don't flush TLBs when clearing Dirty bit in shadow MMU Sean Christopherson
` (15 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Don't force a TLB flush if mmu_spte_update() clears the Accessed bit, as
access tracking tolerates false negatives, as evidenced by the
mmu_notifier hooks that explicitly test and age SPTEs without doing a TLB
flush.
In practice, this is very nearly a nop. spte_write_protect() and
spte_clear_dirty() never clear the Accessed bit. make_spte() always
sets the Accessed bit for !prefetch scenarios. FNAME(sync_spte) only sets
SPTE if the protection bits are changing, i.e. if a flush will be needed
regardless of the Accessed bits. And FNAME(pte_prefetch) sets SPTE if and
only if the old SPTE is !PRESENT.
That leaves kvm_arch_async_page_ready() as the one path that will generate
a !ACCESSED SPTE *and* overwrite a PRESENT SPTE. And that's very arguably
a bug, as clobbering a valid SPTE in that case is nonsensical.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 30 +++++++++---------------------
1 file changed, 9 insertions(+), 21 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 176fc37540df..9ccfe7eba9b4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -521,36 +521,24 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
* not whether or not SPTEs were modified, i.e. only the write-tracking case
* needs to flush at the time the SPTEs is modified, before dropping mmu_lock.
*
+ * Remote TLBs also need to be flushed if the Dirty bit is cleared, as false
+ * negatives are not acceptable, e.g. if KVM is using D-bit based PML on VMX.
+ *
+ * Don't flush if the Accessed bit is cleared, as access tracking tolerates
+ * false negatives, and the one path that does care about TLB flushes,
+ * kvm_mmu_notifier_clear_flush_young(), uses mmu_spte_update_no_track().
+ *
* Returns true if the TLB needs to be flushed
*/
static bool mmu_spte_update(u64 *sptep, u64 new_spte)
{
- bool flush = false;
u64 old_spte = mmu_spte_update_no_track(sptep, new_spte);
if (!is_shadow_present_pte(old_spte))
return false;
- /*
- * For the spte updated out of mmu-lock is safe, since
- * we always atomically update it, see the comments in
- * spte_has_volatile_bits().
- */
- if (is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte))
- flush = true;
-
- /*
- * Flush TLB when accessed/dirty states are changed in the page tables,
- * to guarantee consistency between TLB and page tables.
- */
-
- if (is_accessed_spte(old_spte) && !is_accessed_spte(new_spte))
- flush = true;
-
- if (is_dirty_spte(old_spte) && !is_dirty_spte(new_spte))
- flush = true;
-
- return flush;
+ return (is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte)) ||
+ (is_dirty_spte(old_spte) && !is_dirty_spte(new_spte));
}
/*
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 05/18] KVM: x86/mmu: Don't flush TLBs when clearing Dirty bit in shadow MMU
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (3 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 04/18] KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 06/18] KVM: x86/mmu: Drop ignored return value from kvm_tdp_mmu_clear_dirty_slot() Sean Christopherson
` (14 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Don't force a TLB flush when an SPTE update in the shadow MMU happens to
clear the Dirty bit, as KVM unconditionally flushes TLBs when enabling
dirty logging, and when clearing dirty logs, KVM flushes based on its
software structures, not the SPTEs. I.e. the flows that care about
accurate Dirty bit information already ensure there are no stale TLB
entries.
Opportunistically drop is_dirty_spte() as mmu_spte_update() was the sole
caller.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 14 ++++++++------
arch/x86/kvm/mmu/spte.h | 7 -------
2 files changed, 8 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9ccfe7eba9b4..faa524d5a0e8 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -521,12 +521,15 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
* not whether or not SPTEs were modified, i.e. only the write-tracking case
* needs to flush at the time the SPTEs is modified, before dropping mmu_lock.
*
- * Remote TLBs also need to be flushed if the Dirty bit is cleared, as false
- * negatives are not acceptable, e.g. if KVM is using D-bit based PML on VMX.
- *
* Don't flush if the Accessed bit is cleared, as access tracking tolerates
* false negatives, and the one path that does care about TLB flushes,
- * kvm_mmu_notifier_clear_flush_young(), uses mmu_spte_update_no_track().
+ * kvm_mmu_notifier_clear_flush_young(), flushes if a young SPTE is found, i.e.
+ * doesn't rely on lower helpers to detect the need to flush.
+ *
+ * Lastly, don't flush if the Dirty bit is cleared, as KVM unconditionally
+ * flushes when enabling dirty logging (see kvm_mmu_slot_apply_flags()), and
+ * when clearing dirty logs, KVM flushes based on whether or not dirty entries
+ * were reaped from the bitmap/ring, not whether or not dirty SPTEs were found.
*
* Returns true if the TLB needs to be flushed
*/
@@ -537,8 +540,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
if (!is_shadow_present_pte(old_spte))
return false;
- return (is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte)) ||
- (is_dirty_spte(old_spte) && !is_dirty_spte(new_spte));
+ return is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte);
}
/*
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index c81cac9358e0..574ca9a1fcab 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -363,13 +363,6 @@ static inline bool is_accessed_spte(u64 spte)
: !is_access_track_spte(spte);
}
-static inline bool is_dirty_spte(u64 spte)
-{
- u64 dirty_mask = spte_shadow_dirty_mask(spte);
-
- return dirty_mask ? spte & dirty_mask : spte & PT_WRITABLE_MASK;
-}
-
static inline u64 get_rsvd_bits(struct rsvd_bits_validate *rsvd_check, u64 pte,
int level)
{
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 06/18] KVM: x86/mmu: Drop ignored return value from kvm_tdp_mmu_clear_dirty_slot()
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (4 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 05/18] KVM: x86/mmu: Don't flush TLBs when clearing Dirty bit in shadow MMU Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 07/18] KVM: x86/mmu: Fold mmu_spte_update_no_track() into mmu_spte_update() Sean Christopherson
` (13 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Drop the return value from kvm_tdp_mmu_clear_dirty_slot() as its sole
caller ignores the result (KVM flushes after clearing dirty logs based on
the logs themselves, not based on SPTEs).
Cc: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/tdp_mmu.c | 20 ++++++--------------
arch/x86/kvm/mmu/tdp_mmu.h | 2 +-
2 files changed, 7 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 91caa73a905b..9c66be7fb002 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1492,13 +1492,12 @@ static bool tdp_mmu_need_write_protect(struct kvm_mmu_page *sp)
return kvm_mmu_page_ad_need_write_protect(sp) || !kvm_ad_enabled();
}
-static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
- gfn_t start, gfn_t end)
+static void clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
+ gfn_t start, gfn_t end)
{
const u64 dbit = tdp_mmu_need_write_protect(root) ? PT_WRITABLE_MASK :
shadow_dirty_mask;
struct tdp_iter iter;
- bool spte_set = false;
rcu_read_lock();
@@ -1519,31 +1518,24 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
if (tdp_mmu_set_spte_atomic(kvm, &iter, iter.old_spte & ~dbit))
goto retry;
-
- spte_set = true;
}
rcu_read_unlock();
- return spte_set;
}
/*
* Clear the dirty status (D-bit or W-bit) of all the SPTEs mapping GFNs in the
- * memslot. Returns true if an SPTE has been changed and the TLBs need to be
- * flushed.
+ * memslot.
*/
-bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
+void kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
const struct kvm_memory_slot *slot)
{
struct kvm_mmu_page *root;
- bool spte_set = false;
lockdep_assert_held_read(&kvm->mmu_lock);
for_each_valid_tdp_mmu_root_yield_safe(kvm, root, slot->as_id)
- spte_set |= clear_dirty_gfn_range(kvm, root, slot->base_gfn,
- slot->base_gfn + slot->npages);
-
- return spte_set;
+ clear_dirty_gfn_range(kvm, root, slot->base_gfn,
+ slot->base_gfn + slot->npages);
}
static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 1b74e058a81c..d842bfe103ab 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -34,7 +34,7 @@ bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm,
const struct kvm_memory_slot *slot, int min_level);
-bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
+void kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
const struct kvm_memory_slot *slot);
void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm,
struct kvm_memory_slot *slot,
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 07/18] KVM: x86/mmu: Fold mmu_spte_update_no_track() into mmu_spte_update()
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (5 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 06/18] KVM: x86/mmu: Drop ignored return value from kvm_tdp_mmu_clear_dirty_slot() Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 08/18] KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears MMU-writable Sean Christopherson
` (12 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Fold the guts of mmu_spte_update_no_track() into mmu_spte_update() now
that the latter doesn't flush when clearing A/D bits, i.e. now that there
is no need to explicitly avoid TLB flushes when aging SPTEs.
Opportunistically WARN if mmu_spte_update() requests a TLB flush when
aging SPTEs, as aging should never modify a SPTE in such a way that KVM
thinks a TLB flush is needed.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 50 ++++++++++++++++++------------------------
1 file changed, 21 insertions(+), 29 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index faa524d5a0e8..a72ecac63e07 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -485,32 +485,6 @@ static void mmu_spte_set(u64 *sptep, u64 new_spte)
__set_spte(sptep, new_spte);
}
-/*
- * Update the SPTE (excluding the PFN), but do not track changes in its
- * accessed/dirty status.
- */
-static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
-{
- u64 old_spte = *sptep;
-
- WARN_ON_ONCE(!is_shadow_present_pte(new_spte));
- check_spte_writable_invariants(new_spte);
-
- if (!is_shadow_present_pte(old_spte)) {
- mmu_spte_set(sptep, new_spte);
- return old_spte;
- }
-
- if (!spte_has_volatile_bits(old_spte))
- __update_clear_spte_fast(sptep, new_spte);
- else
- old_spte = __update_clear_spte_slow(sptep, new_spte);
-
- WARN_ON_ONCE(spte_to_pfn(old_spte) != spte_to_pfn(new_spte));
-
- return old_spte;
-}
-
/* Rules for using mmu_spte_update:
* Update the state bits, it means the mapped pfn is not changed.
*
@@ -535,10 +509,23 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
*/
static bool mmu_spte_update(u64 *sptep, u64 new_spte)
{
- u64 old_spte = mmu_spte_update_no_track(sptep, new_spte);
+ u64 old_spte = *sptep;
- if (!is_shadow_present_pte(old_spte))
+ WARN_ON_ONCE(!is_shadow_present_pte(new_spte));
+ check_spte_writable_invariants(new_spte);
+
+ if (!is_shadow_present_pte(old_spte)) {
+ mmu_spte_set(sptep, new_spte);
return false;
+ }
+
+ if (!spte_has_volatile_bits(old_spte))
+ __update_clear_spte_fast(sptep, new_spte);
+ else
+ old_spte = __update_clear_spte_slow(sptep, new_spte);
+
+ WARN_ON_ONCE(!is_shadow_present_pte(old_spte) ||
+ spte_to_pfn(old_spte) != spte_to_pfn(new_spte));
return is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte);
}
@@ -1587,8 +1574,13 @@ static bool kvm_rmap_age_gfn_range(struct kvm *kvm,
clear_bit((ffs(shadow_accessed_mask) - 1),
(unsigned long *)sptep);
} else {
+ /*
+ * WARN if mmu_spte_update() signals the need
+ * for a TLB flush, as Access tracking a SPTE
+ * should never trigger an _immediate_ flush.
+ */
spte = mark_spte_for_access_track(spte);
- mmu_spte_update_no_track(sptep, spte);
+ WARN_ON_ONCE(mmu_spte_update(sptep, spte));
}
young = true;
}
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 08/18] KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears MMU-writable
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (6 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 07/18] KVM: x86/mmu: Fold mmu_spte_update_no_track() into mmu_spte_update() Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 09/18] KVM: x86/mmu: Add a dedicated flag to track if A/D bits are globally enabled Sean Christopherson
` (11 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Do a remote TLB flush if installing a leaf SPTE overwrites an existing
leaf SPTE (with the same target pfn, which is enforced by a BUG() in
handle_changed_spte()) and clears the MMU-Writable bit. Since the TDP MMU
passes ACC_ALL to make_spte(), i.e. always requests a Writable SPTE, the
only scenario in which make_spte() should create a !MMU-Writable SPTE is
if the gfn is write-tracked or if KVM is prefetching a SPTE.
When write-protecting for write-tracking, KVM must hold mmu_lock for write,
i.e. can't race with a vCPU faulting in the SPTE. And when prefetching a
SPTE, the TDP MMU takes care to avoid clobbering a shadow-present SPTE,
i.e. it should be impossible to replace a MMU-writable SPTE with a
!MMU-writable SPTE when handling a TDP MMU fault.
Cc: David Matlack <dmatlack@google.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/tdp_mmu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 9c66be7fb002..bc9e2f50dc80 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1033,7 +1033,9 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
else if (tdp_mmu_set_spte_atomic(vcpu->kvm, iter, new_spte))
return RET_PF_RETRY;
else if (is_shadow_present_pte(iter->old_spte) &&
- !is_last_spte(iter->old_spte, iter->level))
+ (!is_last_spte(iter->old_spte, iter->level) ||
+ WARN_ON_ONCE(is_mmu_writable_spte(iter->old_spte) &&
+ !is_mmu_writable_spte(new_spte))))
kvm_flush_remote_tlbs_gfn(vcpu->kvm, iter->gfn, iter->level);
/*
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 09/18] KVM: x86/mmu: Add a dedicated flag to track if A/D bits are globally enabled
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (7 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 08/18] KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears MMU-writable Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 10/18] KVM: x86/mmu: Set shadow_accessed_mask for EPT even if A/D bits disabled Sean Christopherson
` (10 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Add a dedicated flag to track if KVM has enabled A/D bits at the module
level, instead of inferring the state based on whether or not the MMU's
shadow_accessed_mask is non-zero. This will allow defining and using
shadow_accessed_mask even when A/D bits aren't used by hardware.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 6 +++---
arch/x86/kvm/mmu/spte.c | 6 ++++++
arch/x86/kvm/mmu/spte.h | 20 +++++++++-----------
arch/x86/kvm/mmu/tdp_mmu.c | 4 ++--
4 files changed, 20 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a72ecac63e07..06fb0fd3f87d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3350,7 +3350,7 @@ static bool page_fault_can_be_fast(struct kvm *kvm, struct kvm_page_fault *fault
* by setting the Writable bit, which can be done out of mmu_lock.
*/
if (!fault->present)
- return !kvm_ad_enabled();
+ return !kvm_ad_enabled;
/*
* Note, instruction fetches and writes are mutually exclusive, ignore
@@ -3485,7 +3485,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
* uses A/D bits for non-nested MMUs. Thus, if A/D bits are
* enabled, the SPTE can't be an access-tracked SPTE.
*/
- if (unlikely(!kvm_ad_enabled()) && is_access_track_spte(spte))
+ if (unlikely(!kvm_ad_enabled) && is_access_track_spte(spte))
new_spte = restore_acc_track_spte(new_spte);
/*
@@ -5462,7 +5462,7 @@ kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
role.efer_nx = true;
role.smm = cpu_role.base.smm;
role.guest_mode = cpu_role.base.guest_mode;
- role.ad_disabled = !kvm_ad_enabled();
+ role.ad_disabled = !kvm_ad_enabled;
role.level = kvm_mmu_get_tdp_level(vcpu);
role.direct = true;
role.has_4_byte_gpte = false;
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 030813781a63..c9543625dda2 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -24,6 +24,8 @@ static bool __ro_after_init allow_mmio_caching;
module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
EXPORT_SYMBOL_GPL(enable_mmio_caching);
+bool __read_mostly kvm_ad_enabled;
+
u64 __read_mostly shadow_host_writable_mask;
u64 __read_mostly shadow_mmu_writable_mask;
u64 __read_mostly shadow_nx_mask;
@@ -414,6 +416,8 @@ EXPORT_SYMBOL_GPL(kvm_mmu_set_me_spte_mask);
void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
{
+ kvm_ad_enabled = has_ad_bits;
+
shadow_user_mask = VMX_EPT_READABLE_MASK;
shadow_accessed_mask = has_ad_bits ? VMX_EPT_ACCESS_BIT : 0ull;
shadow_dirty_mask = has_ad_bits ? VMX_EPT_DIRTY_BIT : 0ull;
@@ -447,6 +451,8 @@ void kvm_mmu_reset_all_pte_masks(void)
u8 low_phys_bits;
u64 mask;
+ kvm_ad_enabled = true;
+
/*
* If the CPU has 46 or less physical address bits, then set an
* appropriate mask to guard against L1TF attacks. Otherwise, it is
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 574ca9a1fcab..908cb672c4e1 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -167,6 +167,15 @@ static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK));
#define SHADOW_NONPRESENT_VALUE 0ULL
#endif
+
+/*
+ * True if A/D bits are supported in hardware and are enabled by KVM. When
+ * enabled, KVM uses A/D bits for all non-nested MMUs. Because L1 can disable
+ * A/D bits in EPTP12, SP and SPTE variants are needed to handle the scenario
+ * where KVM is using A/D bits for L1, but not L2.
+ */
+extern bool __read_mostly kvm_ad_enabled;
+
extern u64 __read_mostly shadow_host_writable_mask;
extern u64 __read_mostly shadow_mmu_writable_mask;
extern u64 __read_mostly shadow_nx_mask;
@@ -285,17 +294,6 @@ static inline bool is_ept_ve_possible(u64 spte)
(spte & VMX_EPT_RWX_MASK) != VMX_EPT_MISCONFIG_WX_VALUE;
}
-/*
- * Returns true if A/D bits are supported in hardware and are enabled by KVM.
- * When enabled, KVM uses A/D bits for all non-nested MMUs. Because L1 can
- * disable A/D bits in EPTP12, SP and SPTE variants are needed to handle the
- * scenario where KVM is using A/D bits for L1, but not L2.
- */
-static inline bool kvm_ad_enabled(void)
-{
- return !!shadow_accessed_mask;
-}
-
static inline bool sp_ad_disabled(struct kvm_mmu_page *sp)
{
return sp->role.ad_disabled;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index bc9e2f50dc80..f77927b57962 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1075,7 +1075,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
struct kvm_mmu_page *sp, bool shared)
{
- u64 spte = make_nonleaf_spte(sp->spt, !kvm_ad_enabled());
+ u64 spte = make_nonleaf_spte(sp->spt, !kvm_ad_enabled);
int ret = 0;
if (shared) {
@@ -1491,7 +1491,7 @@ static bool tdp_mmu_need_write_protect(struct kvm_mmu_page *sp)
* from level, so it is valid to key off any shadow page to determine if
* write protection is needed for an entire tree.
*/
- return kvm_mmu_page_ad_need_write_protect(sp) || !kvm_ad_enabled();
+ return kvm_mmu_page_ad_need_write_protect(sp) || !kvm_ad_enabled;
}
static void clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 10/18] KVM: x86/mmu: Set shadow_accessed_mask for EPT even if A/D bits disabled
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (8 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 09/18] KVM: x86/mmu: Add a dedicated flag to track if A/D bits are globally enabled Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 11/18] KVM: x86/mmu: Set shadow_dirty_mask " Sean Christopherson
` (9 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Now that KVM doesn't use shadow_accessed_mask to detect if hardware A/D
bits are enabled, set shadow_accessed_mask for EPT even when A/D bits
are disabled in hardware. This will allow using shadow_accessed_mask for
software purposes, e.g. to preserve accessed status in a non-present SPTE
acros NUMA balancing, if something like that is ever desirable.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/spte.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index c9543625dda2..e352d1821319 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -419,7 +419,7 @@ void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
kvm_ad_enabled = has_ad_bits;
shadow_user_mask = VMX_EPT_READABLE_MASK;
- shadow_accessed_mask = has_ad_bits ? VMX_EPT_ACCESS_BIT : 0ull;
+ shadow_accessed_mask = VMX_EPT_ACCESS_BIT;
shadow_dirty_mask = has_ad_bits ? VMX_EPT_DIRTY_BIT : 0ull;
shadow_nx_mask = 0ull;
shadow_x_mask = VMX_EPT_EXECUTABLE_MASK;
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 11/18] KVM: x86/mmu: Set shadow_dirty_mask for EPT even if A/D bits disabled
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (9 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 10/18] KVM: x86/mmu: Set shadow_accessed_mask for EPT even if A/D bits disabled Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 12/18] KVM: x86/mmu: Use Accessed bit even when _hardware_ A/D bits are disabled Sean Christopherson
` (8 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Set shadow_dirty_mask to the architectural EPT Dirty bit value even if
A/D bits are disabled at the module level, i.e. even if KVM will never
enable A/D bits in hardware. Doing so provides consistent behavior for
Accessed and Dirty bits, i.e. doesn't leave KVM in a state where it sets
shadow_accessed_mask but not shadow_dirty_mask.
Functionally, this should be one big nop, as consumption of
shadow_dirty_mask is always guarded by a check that hardware A/D bits are
enabled.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/spte.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index e352d1821319..54d8c9b76050 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -420,7 +420,7 @@ void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
shadow_user_mask = VMX_EPT_READABLE_MASK;
shadow_accessed_mask = VMX_EPT_ACCESS_BIT;
- shadow_dirty_mask = has_ad_bits ? VMX_EPT_DIRTY_BIT : 0ull;
+ shadow_dirty_mask = VMX_EPT_DIRTY_BIT;
shadow_nx_mask = 0ull;
shadow_x_mask = VMX_EPT_EXECUTABLE_MASK;
/* VMX_EPT_SUPPRESS_VE_BIT is needed for W or X violation. */
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 12/18] KVM: x86/mmu: Use Accessed bit even when _hardware_ A/D bits are disabled
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (10 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 11/18] KVM: x86/mmu: Set shadow_dirty_mask " Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 13/18] KVM: x86/mmu: Process only valid TDP MMU roots when aging a gfn range Sean Christopherson
` (7 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Use the Accessed bit in SPTEs even when A/D bits are disabled in hardware,
i.e. propagate accessed information to SPTE.Accessed even when KVM is
doing manual tracking by making SPTEs not-present. In addition to
eliminating a small amount of code in is_accessed_spte(), this also paves
the way for preserving Accessed information when a SPTE is zapped in
response to a mmu_notifier PROTECTION event, e.g. if a SPTE is zapped
because NUMA balancing kicks in.
Note, EPT is the only flavor of paging in which A/D bits are conditionally
enabled, and the Accessed (and Dirty) bit is software-available when A/D
bits are disabled.
Note #2, there are currently no concrete plans to preserve Accessed
information. Explorations on that front were the initial catalyst, but
the cleanup is the motivation for the actual commit.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 3 ++-
arch/x86/kvm/mmu/spte.c | 4 ++--
arch/x86/kvm/mmu/spte.h | 11 +----------
3 files changed, 5 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 06fb0fd3f87d..5be3b5f054f1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3486,7 +3486,8 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
* enabled, the SPTE can't be an access-tracked SPTE.
*/
if (unlikely(!kvm_ad_enabled) && is_access_track_spte(spte))
- new_spte = restore_acc_track_spte(new_spte);
+ new_spte = restore_acc_track_spte(new_spte) |
+ shadow_accessed_mask;
/*
* To keep things simple, only SPTEs that are MMU-writable can
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 54d8c9b76050..617479efd127 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -175,7 +175,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
spte |= shadow_present_mask;
if (!prefetch || synchronizing)
- spte |= spte_shadow_accessed_mask(spte);
+ spte |= shadow_accessed_mask;
/*
* For simplicity, enforce the NX huge page mitigation even if not
@@ -346,7 +346,7 @@ u64 mark_spte_for_access_track(u64 spte)
spte |= (spte & SHADOW_ACC_TRACK_SAVED_BITS_MASK) <<
SHADOW_ACC_TRACK_SAVED_BITS_SHIFT;
- spte &= ~shadow_acc_track_mask;
+ spte &= ~(shadow_acc_track_mask | shadow_accessed_mask);
return spte;
}
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 908cb672c4e1..c8dc75337c8b 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -316,12 +316,6 @@ static inline bool spte_ad_need_write_protect(u64 spte)
return (spte & SPTE_TDP_AD_MASK) != SPTE_TDP_AD_ENABLED;
}
-static inline u64 spte_shadow_accessed_mask(u64 spte)
-{
- KVM_MMU_WARN_ON(!is_shadow_present_pte(spte));
- return spte_ad_enabled(spte) ? shadow_accessed_mask : 0;
-}
-
static inline u64 spte_shadow_dirty_mask(u64 spte)
{
KVM_MMU_WARN_ON(!is_shadow_present_pte(spte));
@@ -355,10 +349,7 @@ static inline kvm_pfn_t spte_to_pfn(u64 pte)
static inline bool is_accessed_spte(u64 spte)
{
- u64 accessed_mask = spte_shadow_accessed_mask(spte);
-
- return accessed_mask ? spte & accessed_mask
- : !is_access_track_spte(spte);
+ return spte & shadow_accessed_mask;
}
static inline u64 get_rsvd_bits(struct rsvd_bits_validate *rsvd_check, u64 pte,
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 13/18] KVM: x86/mmu: Process only valid TDP MMU roots when aging a gfn range
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (11 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 12/18] KVM: x86/mmu: Use Accessed bit even when _hardware_ A/D bits are disabled Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 14/18] KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE found Sean Christopherson
` (6 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Skip invalid TDP MMU roots when aging a gfn range. There is zero reason
to process invalid roots, as they by definition hold stale information.
E.g. if a root is invalid because its from a previous memslot generation,
in the unlikely event the root has a SPTE for the gfn, then odds are good
that the gfn=>hva mapping is different, i.e. doesn't map to the hva that
is being aged by the primary MMU.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f77927b57962..e8c061bf94ec 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1205,9 +1205,11 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
/*
* Don't support rescheduling, none of the MMU notifiers that funnel
- * into this helper allow blocking; it'd be dead, wasteful code.
+ * into this helper allow blocking; it'd be dead, wasteful code. Note,
+ * this helper must NOT be used to unmap GFNs, as it processes only
+ * valid roots!
*/
- for_each_tdp_mmu_root(kvm, root, range->slot->as_id) {
+ for_each_valid_tdp_mmu_root(kvm, root, range->slot->as_id) {
rcu_read_lock();
tdp_root_for_each_leaf_pte(iter, root, range->start, range->end)
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 14/18] KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE found
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (12 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 13/18] KVM: x86/mmu: Process only valid TDP MMU roots when aging a gfn range Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-17 16:52 ` Paolo Bonzini
2024-10-11 2:10 ` [PATCH 15/18] KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE changes Sean Christopherson
` (5 subsequent siblings)
19 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Return immediately if a young SPTE is found when testing, but not updating,
SPTEs. The return value is a boolean, i.e. whether there is one young SPTE
or fifty is irrelevant (ignoring the fact that it's impossible for there to
be fifty SPTEs, as KVM has a hard limit on the number of valid TDP MMU
roots).
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/tdp_mmu.c | 84 ++++++++++++++++++--------------------
1 file changed, 40 insertions(+), 44 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e8c061bf94ec..f412bca206c5 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1192,35 +1192,6 @@ bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
return flush;
}
-typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter,
- struct kvm_gfn_range *range);
-
-static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
- struct kvm_gfn_range *range,
- tdp_handler_t handler)
-{
- struct kvm_mmu_page *root;
- struct tdp_iter iter;
- bool ret = false;
-
- /*
- * Don't support rescheduling, none of the MMU notifiers that funnel
- * into this helper allow blocking; it'd be dead, wasteful code. Note,
- * this helper must NOT be used to unmap GFNs, as it processes only
- * valid roots!
- */
- for_each_valid_tdp_mmu_root(kvm, root, range->slot->as_id) {
- rcu_read_lock();
-
- tdp_root_for_each_leaf_pte(iter, root, range->start, range->end)
- ret |= handler(kvm, &iter, range);
-
- rcu_read_unlock();
- }
-
- return ret;
-}
-
/*
* Mark the SPTEs range of GFNs [start, end) unaccessed and return non-zero
* if any of the GFNs in the range have been accessed.
@@ -1229,15 +1200,10 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
* from the clear_young() or clear_flush_young() notifier, which uses the
* return value to determine if the page has been accessed.
*/
-static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
- struct kvm_gfn_range *range)
+static void kvm_tdp_mmu_age_spte(struct tdp_iter *iter)
{
u64 new_spte;
- /* If we have a non-accessed entry we don't need to change the pte. */
- if (!is_accessed_spte(iter->old_spte))
- return false;
-
if (spte_ad_enabled(iter->old_spte)) {
iter->old_spte = tdp_mmu_clear_spte_bits(iter->sptep,
iter->old_spte,
@@ -1253,23 +1219,53 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
trace_kvm_tdp_mmu_spte_changed(iter->as_id, iter->gfn, iter->level,
iter->old_spte, new_spte);
- return true;
+}
+
+static bool __kvm_tdp_mmu_age_gfn_range(struct kvm *kvm,
+ struct kvm_gfn_range *range,
+ bool test_only)
+{
+ struct kvm_mmu_page *root;
+ struct tdp_iter iter;
+ bool ret = false;
+
+ /*
+ * Don't support rescheduling, none of the MMU notifiers that funnel
+ * into this helper allow blocking; it'd be dead, wasteful code. Note,
+ * this helper must NOT be used to unmap GFNs, as it processes only
+ * valid roots!
+ */
+ for_each_valid_tdp_mmu_root(kvm, root, range->slot->as_id) {
+ rcu_read_lock();
+
+ tdp_root_for_each_leaf_pte(iter, root, range->start, range->end) {
+ if (!is_accessed_spte(iter.old_spte))
+ continue;
+
+ ret = true;
+ if (test_only)
+ break;
+
+ kvm_tdp_mmu_age_spte(&iter);
+ }
+
+ rcu_read_unlock();
+
+ if (ret && test_only)
+ break;
+ }
+
+ return ret;
}
bool kvm_tdp_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
{
- return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range);
-}
-
-static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter,
- struct kvm_gfn_range *range)
-{
- return is_accessed_spte(iter->old_spte);
+ return __kvm_tdp_mmu_age_gfn_range(kvm, range, false);
}
bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
{
- return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn);
+ return __kvm_tdp_mmu_age_gfn_range(kvm, range, true);
}
/*
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 15/18] KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE changes
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (13 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 14/18] KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE found Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-17 16:53 ` Paolo Bonzini
2024-10-11 2:10 ` [PATCH 16/18] KVM: x86/mmu: Set Dirty bit for new SPTEs, even if _hardware_ A/D bits are disabled Sean Christopherson
` (4 subsequent siblings)
19 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Now that the shadow MMU and TDP MMU have identical logic for detecting
required TLB flushes when updating SPTEs, move said logic to a helper so
that the TDP MMU code can benefit from the comments that are currently
exclusive to the shadow MMU.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 19 +------------------
arch/x86/kvm/mmu/spte.h | 29 +++++++++++++++++++++++++++++
arch/x86/kvm/mmu/tdp_mmu.c | 3 +--
3 files changed, 31 insertions(+), 20 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5be3b5f054f1..f75915ff33be 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -488,23 +488,6 @@ static void mmu_spte_set(u64 *sptep, u64 new_spte)
/* Rules for using mmu_spte_update:
* Update the state bits, it means the mapped pfn is not changed.
*
- * If the MMU-writable flag is cleared, i.e. the SPTE is write-protected for
- * write-tracking, remote TLBs must be flushed, even if the SPTE was read-only,
- * as KVM allows stale Writable TLB entries to exist. When dirty logging, KVM
- * flushes TLBs based on whether or not dirty bitmap/ring entries were reaped,
- * not whether or not SPTEs were modified, i.e. only the write-tracking case
- * needs to flush at the time the SPTEs is modified, before dropping mmu_lock.
- *
- * Don't flush if the Accessed bit is cleared, as access tracking tolerates
- * false negatives, and the one path that does care about TLB flushes,
- * kvm_mmu_notifier_clear_flush_young(), flushes if a young SPTE is found, i.e.
- * doesn't rely on lower helpers to detect the need to flush.
- *
- * Lastly, don't flush if the Dirty bit is cleared, as KVM unconditionally
- * flushes when enabling dirty logging (see kvm_mmu_slot_apply_flags()), and
- * when clearing dirty logs, KVM flushes based on whether or not dirty entries
- * were reaped from the bitmap/ring, not whether or not dirty SPTEs were found.
- *
* Returns true if the TLB needs to be flushed
*/
static bool mmu_spte_update(u64 *sptep, u64 new_spte)
@@ -527,7 +510,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
WARN_ON_ONCE(!is_shadow_present_pte(old_spte) ||
spte_to_pfn(old_spte) != spte_to_pfn(new_spte));
- return is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte);
+ return is_tlb_flush_required_for_leaf_spte(old_spte, new_spte);
}
/*
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index c8dc75337c8b..a404279ba731 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -467,6 +467,35 @@ static inline bool is_mmu_writable_spte(u64 spte)
return spte & shadow_mmu_writable_mask;
}
+/*
+ * If the MMU-writable flag is cleared, i.e. the SPTE is write-protected for
+ * write-tracking, remote TLBs must be flushed, even if the SPTE was read-only,
+ * as KVM allows stale Writable TLB entries to exist. When dirty logging, KVM
+ * flushes TLBs based on whether or not dirty bitmap/ring entries were reaped,
+ * not whether or not SPTEs were modified, i.e. only the write-tracking case
+ * needs to flush at the time the SPTEs is modified, before dropping mmu_lock.
+ *
+ * Don't flush if the Accessed bit is cleared, as access tracking tolerates
+ * false negatives, and the one path that does care about TLB flushes,
+ * kvm_mmu_notifier_clear_flush_young(), flushes if a young SPTE is found, i.e.
+ * doesn't rely on lower helpers to detect the need to flush.
+ *
+ * Lastly, don't flush if the Dirty bit is cleared, as KVM unconditionally
+ * flushes when enabling dirty logging (see kvm_mmu_slot_apply_flags()), and
+ * when clearing dirty logs, KVM flushes based on whether or not dirty entries
+ * were reaped from the bitmap/ring, not whether or not dirty SPTEs were found.
+ *
+ * Note, this logic only applies to shadow-present leaf SPTEs. The caller is
+ * responsible for checking that the old SPTE is shadow-present, and is also
+ * responsible for determining whether or not a TLB flush is required when
+ * modifying a shadow-present non-leaf SPTE.
+ */
+static inline bool is_tlb_flush_required_for_leaf_spte(u64 old_spte,
+ u64 new_spte)
+{
+ return is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte);
+}
+
static inline u64 get_mmio_spte_generation(u64 spte)
{
u64 gen;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f412bca206c5..615c6a84fd60 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1034,8 +1034,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
return RET_PF_RETRY;
else if (is_shadow_present_pte(iter->old_spte) &&
(!is_last_spte(iter->old_spte, iter->level) ||
- WARN_ON_ONCE(is_mmu_writable_spte(iter->old_spte) &&
- !is_mmu_writable_spte(new_spte))))
+ WARN_ON_ONCE(is_tlb_flush_required_for_leaf_spte(iter->old_spte, new_spte))))
kvm_flush_remote_tlbs_gfn(vcpu->kvm, iter->gfn, iter->level);
/*
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 16/18] KVM: x86/mmu: Set Dirty bit for new SPTEs, even if _hardware_ A/D bits are disabled
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (14 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 15/18] KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE changes Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 17/18] KVM: Allow arch code to elide TLB flushes when aging a young page Sean Christopherson
` (3 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
When making a SPTE, set the Dirty bit in the SPTE as appropriate, even if
hardware A/D bits are disabled. Only EPT allows A/D bits to be disabled,
and for EPT, the bits are software-available (ignored by hardware) when
A/D bits are disabled, i.e. it is perfectly legal for KVM to use the Dirty
to track dirty pages in software.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/spte.c | 2 +-
arch/x86/kvm/mmu/spte.h | 6 ------
2 files changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 617479efd127..fd8c3c92ade0 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -237,7 +237,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
wrprot = true;
else
spte |= PT_WRITABLE_MASK | shadow_mmu_writable_mask |
- spte_shadow_dirty_mask(spte);
+ shadow_dirty_mask;
}
if (prefetch && !synchronizing)
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index a404279ba731..e90cc401c168 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -316,12 +316,6 @@ static inline bool spte_ad_need_write_protect(u64 spte)
return (spte & SPTE_TDP_AD_MASK) != SPTE_TDP_AD_ENABLED;
}
-static inline u64 spte_shadow_dirty_mask(u64 spte)
-{
- KVM_MMU_WARN_ON(!is_shadow_present_pte(spte));
- return spte_ad_enabled(spte) ? shadow_dirty_mask : 0;
-}
-
static inline bool is_access_track_spte(u64 spte)
{
return !spte_ad_enabled(spte) && (spte & shadow_acc_track_mask) == 0;
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 17/18] KVM: Allow arch code to elide TLB flushes when aging a young page
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (15 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 16/18] KVM: x86/mmu: Set Dirty bit for new SPTEs, even if _hardware_ A/D bits are disabled Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-11 2:10 ` [PATCH 18/18] KVM: x86: Don't emit TLB flushes when aging SPTEs for mmu_notifiers Sean Christopherson
` (2 subsequent siblings)
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Add a Kconfig to allow architectures to opt-out of a TLB flush when a
young page is aged, as invalidating TLB entries is not functionally
required on most KVM-supported architectures. Stale TLB entries can
result in false negatives and theoretically lead to suboptimal reclaim,
but in practice all observations have been that the performance gained by
skipping TLB flushes outweighs any performance lost by reclaiming hot
pages.
E.g. the primary MMUs for x86 RISC-V, s390, and PPC Book3S elide the TLB
flush for ptep_clear_flush_young(), and arm64's MMU skips the trailing DSB
that's required for ordering (presumably because there are optimizations
related to eliding other TLB flushes when doing make-before-break).
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
virt/kvm/Kconfig | 4 ++++
virt/kvm/kvm_main.c | 20 ++++++--------------
2 files changed, 10 insertions(+), 14 deletions(-)
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index fd6a3010afa8..54e959e7d68f 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -100,6 +100,10 @@ config KVM_GENERIC_MMU_NOTIFIER
select MMU_NOTIFIER
bool
+config KVM_ELIDE_TLB_FLUSH_IF_YOUNG
+ depends on KVM_GENERIC_MMU_NOTIFIER
+ bool
+
config KVM_GENERIC_MEMORY_ATTRIBUTES
depends on KVM_GENERIC_MMU_NOTIFIER
bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b1b10dc408a0..83b525d16b61 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -630,7 +630,8 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
unsigned long start,
unsigned long end,
- gfn_handler_t handler)
+ gfn_handler_t handler,
+ bool flush_on_ret)
{
struct kvm *kvm = mmu_notifier_to_kvm(mn);
const struct kvm_mmu_notifier_range range = {
@@ -638,7 +639,7 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
.end = end,
.handler = handler,
.on_lock = (void *)kvm_null_fn,
- .flush_on_ret = true,
+ .flush_on_ret = flush_on_ret,
.may_block = false,
};
@@ -650,17 +651,7 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn
unsigned long end,
gfn_handler_t handler)
{
- struct kvm *kvm = mmu_notifier_to_kvm(mn);
- const struct kvm_mmu_notifier_range range = {
- .start = start,
- .end = end,
- .handler = handler,
- .on_lock = (void *)kvm_null_fn,
- .flush_on_ret = false,
- .may_block = false,
- };
-
- return __kvm_handle_hva_range(kvm, &range).ret;
+ return kvm_handle_hva_range(mn, start, end, handler, false);
}
void kvm_mmu_invalidate_begin(struct kvm *kvm)
@@ -825,7 +816,8 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn,
{
trace_kvm_age_hva(start, end);
- return kvm_handle_hva_range(mn, start, end, kvm_age_gfn);
+ return kvm_handle_hva_range(mn, start, end, kvm_age_gfn,
+ !IS_ENABLED(CONFIG_KVM_ELIDE_TLB_FLUSH_IF_YOUNG));
}
static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn,
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 18/18] KVM: x86: Don't emit TLB flushes when aging SPTEs for mmu_notifiers
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (16 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 17/18] KVM: Allow arch code to elide TLB flushes when aging a young page Sean Christopherson
@ 2024-10-11 2:10 ` Sean Christopherson
2024-10-17 16:55 ` [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Paolo Bonzini
2024-10-31 19:51 ` Sean Christopherson
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-11 2:10 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
Follow x86's primary MMU, which hasn't flushed TLBs when clearing Accessed
bits for 10+ years, and skip all TLB flushes when aging SPTEs in response
to a clear_flush_young() mmu_notifier event. As documented in x86's
ptep_clear_flush_young(), the probability and impact of "bad" reclaim due
to stale A-bit information is relatively low, whereas the performance cost
of TLB flushes is relatively high. I.e. the cost of flushing TLBs
outweighs the benefits.
On KVM x86, the cost of TLB flushes is even higher, as KVM doesn't batch
TLB flushes for mmu_notifier events (KVM's mmu_notifier contract with MM
makes it all but impossible), and sending IPIs forces all running vCPUs to
go through a VM-Exit => VM-Enter roundtrip.
Furthermore, MGLRU aging of secondary MMUs is expected to use flush-less
mmu_notifiers, i.e. flushing for the !MGLRU will make even less sense, and
will be actively confusing as it wouldn't be clear why KVM "needs" to
flush TLBs for legacy LRU aging, but not for MGLRU aging.
Cc: James Houghton <jthoughton@google.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/all/20240926013506.860253-18-jthoughton@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/Kconfig | 1 +
arch/x86/kvm/mmu/spte.h | 5 ++---
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index f09f13c01c6b..1ed1e4f5d51c 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -22,6 +22,7 @@ config KVM_X86
depends on X86_LOCAL_APIC
select KVM_COMMON
select KVM_GENERIC_MMU_NOTIFIER
+ select KVM_ELIDE_TLB_FLUSH_IF_YOUNG
select HAVE_KVM_IRQCHIP
select HAVE_KVM_PFNCACHE
select HAVE_KVM_DIRTY_RING_TSO
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index e90cc401c168..8b09a0d60ea6 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -470,9 +470,8 @@ static inline bool is_mmu_writable_spte(u64 spte)
* needs to flush at the time the SPTEs is modified, before dropping mmu_lock.
*
* Don't flush if the Accessed bit is cleared, as access tracking tolerates
- * false negatives, and the one path that does care about TLB flushes,
- * kvm_mmu_notifier_clear_flush_young(), flushes if a young SPTE is found, i.e.
- * doesn't rely on lower helpers to detect the need to flush.
+ * false negatives, e.g. KVM x86 omits TLB flushes even when aging SPTEs for a
+ * mmu_notifier.clear_flush_young() event.
*
* Lastly, don't flush if the Dirty bit is cleared, as KVM unconditionally
* flushes when enabling dirty logging (see kvm_mmu_slot_apply_flags()), and
--
2.47.0.rc1.288.g06298d1525-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 14/18] KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE found
2024-10-11 2:10 ` [PATCH 14/18] KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE found Sean Christopherson
@ 2024-10-17 16:52 ` Paolo Bonzini
0 siblings, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2024-10-17 16:52 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
On 10/11/24 04:10, Sean Christopherson wrote:
> Return immediately if a young SPTE is found when testing, but not updating,
> SPTEs. The return value is a boolean, i.e. whether there is one young SPTE
> or fifty is irrelevant (ignoring the fact that it's impossible for there to
> be fifty SPTEs, as KVM has a hard limit on the number of valid TDP MMU
> roots).
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/mmu/tdp_mmu.c | 84 ++++++++++++++++++--------------------
> 1 file changed, 40 insertions(+), 44 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index e8c061bf94ec..f412bca206c5 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1192,35 +1192,6 @@ bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
> return flush;
> }
>
> -typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter,
> - struct kvm_gfn_range *range);
> -
> -static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
> - struct kvm_gfn_range *range,
> - tdp_handler_t handler)
> -{
> - struct kvm_mmu_page *root;
> - struct tdp_iter iter;
> - bool ret = false;
> -
> - /*
> - * Don't support rescheduling, none of the MMU notifiers that funnel
> - * into this helper allow blocking; it'd be dead, wasteful code. Note,
> - * this helper must NOT be used to unmap GFNs, as it processes only
> - * valid roots!
> - */
> - for_each_valid_tdp_mmu_root(kvm, root, range->slot->as_id) {
> - rcu_read_lock();
> -
> - tdp_root_for_each_leaf_pte(iter, root, range->start, range->end)
> - ret |= handler(kvm, &iter, range);
> -
> - rcu_read_unlock();
> - }
> -
> - return ret;
> -}
> -
> /*
> * Mark the SPTEs range of GFNs [start, end) unaccessed and return non-zero
> * if any of the GFNs in the range have been accessed.
> @@ -1229,15 +1200,10 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
> * from the clear_young() or clear_flush_young() notifier, which uses the
> * return value to determine if the page has been accessed.
> */
> -static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
> - struct kvm_gfn_range *range)
> +static void kvm_tdp_mmu_age_spte(struct tdp_iter *iter)
> {
> u64 new_spte;
>
> - /* If we have a non-accessed entry we don't need to change the pte. */
> - if (!is_accessed_spte(iter->old_spte))
> - return false;
> -
> if (spte_ad_enabled(iter->old_spte)) {
> iter->old_spte = tdp_mmu_clear_spte_bits(iter->sptep,
> iter->old_spte,
> @@ -1253,23 +1219,53 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
>
> trace_kvm_tdp_mmu_spte_changed(iter->as_id, iter->gfn, iter->level,
> iter->old_spte, new_spte);
> - return true;
> +}
> +
> +static bool __kvm_tdp_mmu_age_gfn_range(struct kvm *kvm,
> + struct kvm_gfn_range *range,
> + bool test_only)
> +{
> + struct kvm_mmu_page *root;
> + struct tdp_iter iter;
> + bool ret = false;
> +
> + /*
> + * Don't support rescheduling, none of the MMU notifiers that funnel
> + * into this helper allow blocking; it'd be dead, wasteful code. Note,
> + * this helper must NOT be used to unmap GFNs, as it processes only
> + * valid roots!
> + */
> + for_each_valid_tdp_mmu_root(kvm, root, range->slot->as_id) {
> + rcu_read_lock();
> +
> + tdp_root_for_each_leaf_pte(iter, root, range->start, range->end) {
> + if (!is_accessed_spte(iter.old_spte))
> + continue;
> +
> + ret = true;
> + if (test_only)
> + break;
> +
> + kvm_tdp_mmu_age_spte(&iter);
> + }
> +
> + rcu_read_unlock();
> +
> + if (ret && test_only)
> + break;
> + }
> +
> + return ret;
> }
If you use guard(rcu)() you can avoid the repeated breaks:
for_each_valid_tdp_mmu_root(kvm, root, range->slot->as_id) {
guard(rcu)();
tdp_root_for_each_leaf_pte(iter, root, range->start, range->end) {
if (!is_accessed_spte(iter.old_spte))
continue;
ret = true;
if (test_only)
return ret;
kvm_tdp_mmu_age_spte(&iter);
}
}
return ret;
Paolo
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 15/18] KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE changes
2024-10-11 2:10 ` [PATCH 15/18] KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE changes Sean Christopherson
@ 2024-10-17 16:53 ` Paolo Bonzini
0 siblings, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2024-10-17 16:53 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
On 10/11/24 04:10, Sean Christopherson wrote:
> +static inline bool is_tlb_flush_required_for_leaf_spte(u64 old_spte,
> + u64 new_spte)
> +{
> + return is_mmu_writable_spte(old_spte) && !is_mmu_writable_spte(new_spte);
> +}
Shorter name? leaf_spte_change_needs_tlb_flush?
Paolo
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn)
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (17 preceding siblings ...)
2024-10-11 2:10 ` [PATCH 18/18] KVM: x86: Don't emit TLB flushes when aging SPTEs for mmu_notifiers Sean Christopherson
@ 2024-10-17 16:55 ` Paolo Bonzini
2024-10-31 19:51 ` Sean Christopherson
19 siblings, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2024-10-17 16:55 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
On 10/11/24 04:10, Sean Christopherson wrote:
> This is effectively an extensive of the kvm_follow_pfn series[*] (and
> applies on top of said series), but is x86-specific and is *almost*
> entirely related to Accessed and Dirty bits.
>
> There's no central theme beyond cleaning up things that were discovered
> when digging deep for the kvm_follow_pfn overhaul, and to a lesser extent
> the series to add MGLRU support in KVM x86.
Very nice - looks obvious in retrospect, as it often happens.
Paolo
> [*] https://lore.kernel.org/all/20241010182427.1434605-1-seanjc@google.com
>
> Sean Christopherson (18):
> KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from
> RO SPTE
> KVM: x86/mmu: Always set SPTE's dirty bit if it's created as writable
> KVM: x86/mmu: Fold all of make_spte()'s writable handling into one
> if-else
> KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit
> KVM: x86/mmu: Don't flush TLBs when clearing Dirty bit in shadow MMU
> KVM: x86/mmu: Drop ignored return value from
> kvm_tdp_mmu_clear_dirty_slot()
> KVM: x86/mmu: Fold mmu_spte_update_no_track() into mmu_spte_update()
> KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears
> MMU-writable
> KVM: x86/mmu: Add a dedicated flag to track if A/D bits are globally
> enabled
> KVM: x86/mmu: Set shadow_accessed_mask for EPT even if A/D bits
> disabled
> KVM: x86/mmu: Set shadow_dirty_mask for EPT even if A/D bits disabled
> KVM: x86/mmu: Use Accessed bit even when _hardware_ A/D bits are
> disabled
> KVM: x86/mmu: Process only valid TDP MMU roots when aging a gfn range
> KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE
> found
> KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE
> changes
> KVM: x86/mmu: Set Dirty bit for new SPTEs, even if _hardware_ A/D bits
> are disabled
> KVM: Allow arch code to elide TLB flushes when aging a young page
> KVM: x86: Don't emit TLB flushes when aging SPTEs for mmu_notifiers
>
> arch/x86/kvm/Kconfig | 1 +
> arch/x86/kvm/mmu/mmu.c | 72 +++++++-----------------
> arch/x86/kvm/mmu/spte.c | 59 ++++++++------------
> arch/x86/kvm/mmu/spte.h | 72 ++++++++++++------------
> arch/x86/kvm/mmu/tdp_mmu.c | 109 +++++++++++++++++--------------------
> arch/x86/kvm/mmu/tdp_mmu.h | 2 +-
> virt/kvm/Kconfig | 4 ++
> virt/kvm/kvm_main.c | 20 ++-----
> 8 files changed, 142 insertions(+), 197 deletions(-)
>
>
> base-commit: 3f9cf3d569fdf7fb451294b636991291965573ce
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn)
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
` (18 preceding siblings ...)
2024-10-17 16:55 ` [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Paolo Bonzini
@ 2024-10-31 19:51 ` Sean Christopherson
19 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-10-31 19:51 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Yan Zhao, Sagi Shahar, Alex Bennée,
David Matlack, James Houghton
On Thu, 10 Oct 2024 19:10:32 -0700, Sean Christopherson wrote:
> This is effectively an extensive of the kvm_follow_pfn series[*] (and
> applies on top of said series), but is x86-specific and is *almost*
> entirely related to Accessed and Dirty bits.
>
> There's no central theme beyond cleaning up things that were discovered
> when digging deep for the kvm_follow_pfn overhaul, and to a lesser extent
> the series to add MGLRU support in KVM x86.
>
> [...]
Applied to kvm-x86 mmu, with Paolo's suggestions (the guard(rcu) trick in
particular was quite nice).
[01/18] KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from RO SPTE
https://github.com/kvm-x86/linux/commit/081976992f43
[02/18] KVM: x86/mmu: Always set SPTE's dirty bit if it's created as writable
https://github.com/kvm-x86/linux/commit/cc7ed3358e41
[03/18] KVM: x86/mmu: Fold all of make_spte()'s writable handling into one if-else
https://github.com/kvm-x86/linux/commit/0387d79e24d6
[04/18] KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit
https://github.com/kvm-x86/linux/commit/b7ed46b201a4
[05/18] KVM: x86/mmu: Don't flush TLBs when clearing Dirty bit in shadow MMU
https://github.com/kvm-x86/linux/commit/856cf4a60cff
[06/18] KVM: x86/mmu: Drop ignored return value from kvm_tdp_mmu_clear_dirty_slot()
https://github.com/kvm-x86/linux/commit/010344122dca
[07/18] KVM: x86/mmu: Fold mmu_spte_update_no_track() into mmu_spte_update()
https://github.com/kvm-x86/linux/commit/67c93802928b
[08/18] KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears MMU-writable
https://github.com/kvm-x86/linux/commit/1a175082b190
[09/18] KVM: x86/mmu: Add a dedicated flag to track if A/D bits are globally enabled
https://github.com/kvm-x86/linux/commit/a5da5dde4ba4
[10/18] KVM: x86/mmu: Set shadow_accessed_mask for EPT even if A/D bits disabled
https://github.com/kvm-x86/linux/commit/3835819fb1b3
[11/18] KVM: x86/mmu: Set shadow_dirty_mask for EPT even if A/D bits disabled
https://github.com/kvm-x86/linux/commit/53510b912518
[12/18] KVM: x86/mmu: Use Accessed bit even when _hardware_ A/D bits are disabled
https://github.com/kvm-x86/linux/commit/7971801b5618
[13/18] KVM: x86/mmu: Process only valid TDP MMU roots when aging a gfn range
https://github.com/kvm-x86/linux/commit/526e609f0567
[14/18] KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE found
https://github.com/kvm-x86/linux/commit/51192ebdd145
[15/18] KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE changes
https://github.com/kvm-x86/linux/commit/c9b625625ba3
[16/18] KVM: x86/mmu: Set Dirty bit for new SPTEs, even if _hardware_ A/D bits are disabled
https://github.com/kvm-x86/linux/commit/85649117511d
[17/18] KVM: Allow arch code to elide TLB flushes when aging a young page
https://github.com/kvm-x86/linux/commit/2ebbe0308c29
[18/18] KVM: x86: Don't emit TLB flushes when aging SPTEs for mmu_notifiers
https://github.com/kvm-x86/linux/commit/b9883ee40d7e
--
https://github.com/kvm-x86/linux/tree/next
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2024-10-31 19:56 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-11 2:10 [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Sean Christopherson
2024-10-11 2:10 ` [PATCH 01/18] KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from RO SPTE Sean Christopherson
2024-10-11 2:10 ` [PATCH 02/18] KVM: x86/mmu: Always set SPTE's dirty bit if it's created as writable Sean Christopherson
2024-10-11 2:10 ` [PATCH 03/18] KVM: x86/mmu: Fold all of make_spte()'s writable handling into one if-else Sean Christopherson
2024-10-11 2:10 ` [PATCH 04/18] KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit Sean Christopherson
2024-10-11 2:10 ` [PATCH 05/18] KVM: x86/mmu: Don't flush TLBs when clearing Dirty bit in shadow MMU Sean Christopherson
2024-10-11 2:10 ` [PATCH 06/18] KVM: x86/mmu: Drop ignored return value from kvm_tdp_mmu_clear_dirty_slot() Sean Christopherson
2024-10-11 2:10 ` [PATCH 07/18] KVM: x86/mmu: Fold mmu_spte_update_no_track() into mmu_spte_update() Sean Christopherson
2024-10-11 2:10 ` [PATCH 08/18] KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears MMU-writable Sean Christopherson
2024-10-11 2:10 ` [PATCH 09/18] KVM: x86/mmu: Add a dedicated flag to track if A/D bits are globally enabled Sean Christopherson
2024-10-11 2:10 ` [PATCH 10/18] KVM: x86/mmu: Set shadow_accessed_mask for EPT even if A/D bits disabled Sean Christopherson
2024-10-11 2:10 ` [PATCH 11/18] KVM: x86/mmu: Set shadow_dirty_mask " Sean Christopherson
2024-10-11 2:10 ` [PATCH 12/18] KVM: x86/mmu: Use Accessed bit even when _hardware_ A/D bits are disabled Sean Christopherson
2024-10-11 2:10 ` [PATCH 13/18] KVM: x86/mmu: Process only valid TDP MMU roots when aging a gfn range Sean Christopherson
2024-10-11 2:10 ` [PATCH 14/18] KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE found Sean Christopherson
2024-10-17 16:52 ` Paolo Bonzini
2024-10-11 2:10 ` [PATCH 15/18] KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE changes Sean Christopherson
2024-10-17 16:53 ` Paolo Bonzini
2024-10-11 2:10 ` [PATCH 16/18] KVM: x86/mmu: Set Dirty bit for new SPTEs, even if _hardware_ A/D bits are disabled Sean Christopherson
2024-10-11 2:10 ` [PATCH 17/18] KVM: Allow arch code to elide TLB flushes when aging a young page Sean Christopherson
2024-10-11 2:10 ` [PATCH 18/18] KVM: x86: Don't emit TLB flushes when aging SPTEs for mmu_notifiers Sean Christopherson
2024-10-17 16:55 ` [PATCH 00/18] KVM: x86/mmu: A/D cleanups (on top of kvm_follow_pfn) Paolo Bonzini
2024-10-31 19:51 ` Sean Christopherson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox