* [PATCH 1/3] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
2023-10-27 17:26 [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG David Matlack
@ 2023-10-27 17:26 ` David Matlack
2023-10-27 17:26 ` [PATCH 2/3] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU David Matlack
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: David Matlack @ 2023-10-27 17:26 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: David Matlack, Ben Gardon, kvm, Vipin Sharma
Fix an off-by-1 error when passing in the range of pages to
kvm_mmu_try_split_huge_pages() during CLEAR_DIRTY_LOG. Specifically, end
is the last page that needs to be split (inclusive), so pass in `end + 1`,
since kvm_mmu_try_split_huge_pages() treats its `end` parameter as
exclusive.
At worst this will cause a huge page to be write-protected instead of
eagerly split, which is purely a performance issue, not a correctness
issue. But even that is unlikely, as it would require userspace to pass in
a bitmap where the last page is the only 4K page of a huge page that needs
to be split.
Reported-by: Vipin Sharma <vipinsh@google.com>
Fixes: f2928aae8b9a ("UPSTREAM: KVM: x86/mmu: Split huge pages mapped by the TDP MMU during KVM_CLEAR_DIRTY_LOG")
Signed-off-by: David Matlack <dmatlack@google.com>
---
arch/x86/kvm/mmu/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f7901cb4d2fa..6aa966631cab 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1382,7 +1382,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
 
 		if (READ_ONCE(eager_page_split))
-			kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K);
+			kvm_mmu_try_split_huge_pages(kvm, slot, start, end + 1, PG_LEVEL_4K);
 
 		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
--
2.42.0.820.g83a721a137-goog
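To make the inclusive vs. exclusive `end` semantics above concrete, here is
a minimal standalone C sketch. It is not KVM code: my_ffs()/my_fls() are
hypothetical stand-ins for the kernel's __ffs()/__fls(), which return the
0-based index of the first/last set bit in a word.

#include <stdio.h>

/*
 * Hypothetical stand-ins for the kernel's __ffs()/__fls(). Both are
 * undefined for mask == 0, as are the underlying builtins.
 */
static unsigned long my_ffs(unsigned long mask)
{
	return __builtin_ctzl(mask);
}

static unsigned long my_fls(unsigned long mask)
{
	return 8 * sizeof(mask) - 1 - __builtin_clzl(mask);
}

int main(void)
{
	unsigned long base_gfn = 0x1000;
	unsigned long mask = 0xF0;	/* pages 4..7 of this word are dirty */
	unsigned long start = base_gfn + my_ffs(mask);	/* 0x1004, inclusive */
	unsigned long end = base_gfn + my_fls(mask);	/* 0x1007, inclusive */
	unsigned long gfn;

	/*
	 * Because end is derived from the *last* set bit, it names the last
	 * dirty page itself. An exclusive-end callee must therefore be
	 * passed end + 1, or the final page (0x1007 here) is silently
	 * skipped.
	 */
	for (gfn = start; gfn < end + 1; gfn++)
		printf("split gfn 0x%lx\n", gfn);

	return 0;
}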
* [PATCH 2/3] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU
2023-10-27 17:26 [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG David Matlack
2023-10-27 17:26 ` [PATCH 1/3] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR David Matlack
@ 2023-10-27 17:26 ` David Matlack
2023-10-27 17:26 ` [PATCH 3/3] KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG David Matlack
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: David Matlack @ 2023-10-27 17:26 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: David Matlack, Ben Gardon, kvm, Vipin Sharma
Re-check that the given SPTE is still a present leaf SPTE after a failed
cmpxchg in clear_dirty_gfn_range(). clear_dirty_gfn_range() intends to
operate only on present leaf SPTEs, but either property can change after a
failed cmpxchg.
A check for present was added in commit 3354ef5a592d ("KVM: x86/mmu:
Check for present SPTE when clearing dirty bit in TDP MMU") but the
check for leaf is still buried in tdp_root_for_each_leaf_pte() and does
not get rechecked on retry.
Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Signed-off-by: David Matlack <dmatlack@google.com>
---
arch/x86/kvm/mmu/tdp_mmu.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 6cd4dd631a2f..038983b13574 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1522,12 +1522,13 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 	rcu_read_lock();
 
-	tdp_root_for_each_leaf_pte(iter, root, start, end) {
+	tdp_root_for_each_pte(iter, root, start, end) {
 retry:
-		if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true))
+		if (!is_shadow_present_pte(iter.old_spte) ||
+		    !is_last_spte(iter.old_spte, iter.level))
 			continue;
 
-		if (!is_shadow_present_pte(iter.old_spte))
+		if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true))
 			continue;
 
 		KVM_MMU_WARN_ON(kvm_ad_enabled() &&
--
2.42.0.820.g83a721a137-goog
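The retry pattern the fix enforces boils down to the self-contained C11
sketch below. The bit layout and is_present()/is_leaf()/clear_dirty() are
simplified, hypothetical stand-ins for KVM's is_shadow_present_pte()/
is_last_spte() and the TDP MMU's cmpxchg-based SPTE update, not the real
definitions.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PRESENT_BIT	(UINT64_C(1) << 0)
#define LEAF_BIT	(UINT64_C(1) << 1)
#define DIRTY_BIT	(UINT64_C(1) << 2)

static bool is_present(uint64_t spte) { return spte & PRESENT_BIT; }
static bool is_leaf(uint64_t spte)    { return spte & LEAF_BIT; }

static void clear_dirty(_Atomic uint64_t *sptep)
{
	uint64_t old = atomic_load(sptep);

	do {
		/*
		 * A failed compare-exchange means another thread changed the
		 * SPTE, e.g. zapped it (no longer present) or replaced it
		 * with a page table (no longer a leaf). Both predicates must
		 * therefore be re-evaluated on every attempt, not checked
		 * once before the loop.
		 */
		if (!is_present(old) || !is_leaf(old))
			return;
	} while (!atomic_compare_exchange_weak(sptep, &old, old & ~DIRTY_BIT));
}

atomic_compare_exchange_weak() reloads `old` with the current value on
failure, so the predicates see fresh state on every retry; the bug being
fixed is the variant where the leaf check lives in the iterator, outside
the retry path, and is never re-evaluated.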
* [PATCH 3/3] KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG
2023-10-27 17:26 [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG David Matlack
2023-10-27 17:26 ` [PATCH 1/3] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR David Matlack
2023-10-27 17:26 ` [PATCH 2/3] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU David Matlack
@ 2023-10-27 17:26 ` David Matlack
2023-11-08 0:05 ` [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG Sean Christopherson
2023-12-01 1:52 ` Sean Christopherson
4 siblings, 0 replies; 8+ messages in thread
From: David Matlack @ 2023-10-27 17:26 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: David Matlack, Ben Gardon, kvm, Vipin Sharma
Drop and reacquire the mmu_lock during CLEAR_DIRTY_LOG to avoid blocking
other threads from acquiring the mmu_lock (e.g. vCPUs taking page
faults).
It should be safe to drop and reacquire the mmu_lock from a correctness
standpoint. slots_lock already ensures that only one thread in KVM is
processing a GET/CLEAR_DIRTY_LOG ioctl. And KVM already has to be
prepared to handle vCPUs updating the dirty log concurrent with
CLEAR_DIRTY_LOG, hence the atomic_long_fetch_andnot(). So holding the
mmu_lock across loop iterations is entirely unnecessary. It only needs
to be acquired when calling into the arch-specific code to modify page
tables.
This change eliminates the drops in guest performance we've observed
during live migration of a 160 vCPU VM while userspace is issuing CLEAR
ioctls (tested with 1GiB and 8GiB CLEARs). Userspace could
issue finer-grained CLEAR ioctls, which would also reduce contention on
the mmu_lock, but doing so will increase the rate of remote TLB
flushing, so there is a limit. And there's really no reason to punt this
problem to userspace. KVM can just drop and reacquire the lock more
frequently to avoid holding it for too long.
Signed-off-by: David Matlack <dmatlack@google.com>
---
virt/kvm/kvm_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 486800a7024b..afa61a2309d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2297,7 +2297,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n))
 		return -EFAULT;
 
-	KVM_MMU_LOCK(kvm);
 	for (offset = log->first_page, i = offset / BITS_PER_LONG,
 	     n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--;
 	     i++, offset += BITS_PER_LONG) {
@@ -2316,11 +2315,12 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 		 */
 		if (mask) {
 			flush = true;
+			KVM_MMU_LOCK(kvm);
 			kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
 								offset, mask);
+			KVM_MMU_UNLOCK(kvm);
 		}
 	}
-	KVM_MMU_UNLOCK(kvm);
 
 	if (flush)
 		kvm_flush_remote_tlbs_memslot(kvm, memslot);
--
2.42.0.820.g83a721a137-goog
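As a rough userspace sketch of the resulting locking pattern, a pthread
mutex and C11 atomics stand in below for KVM_MMU_LOCK() and
atomic_long_fetch_andnot(); clear_dirty_log(), arch_clear_dirty(), and the
`wanted` bitmap (the bits userspace asked to clear) are hypothetical names,
not KVM's.

#include <pthread.h>
#include <stdatomic.h>

#define BITS_PER_LONG	(8 * sizeof(unsigned long))

static pthread_mutex_t mmu_lock = PTHREAD_MUTEX_INITIALIZER;

/* Empty stand-in for kvm_arch_mmu_enable_log_dirty_pt_masked(). */
static void arch_clear_dirty(unsigned long offset, unsigned long mask)
{
	(void)offset;
	(void)mask;
}

static void clear_dirty_log(_Atomic unsigned long *dirty_bitmap,
			    const unsigned long *wanted,
			    unsigned long first_page, unsigned long num_pages)
{
	unsigned long i = first_page / BITS_PER_LONG;
	unsigned long offset = first_page;
	unsigned long n = (num_pages + BITS_PER_LONG - 1) / BITS_PER_LONG;
	unsigned long mask;

	for (; n--; i++, offset += BITS_PER_LONG) {
		/*
		 * Harvest and clear the requested bits atomically; vCPUs may
		 * be setting bits concurrently, so a plain read-modify-write
		 * on dirty_bitmap[i] could lose their updates.
		 */
		mask = atomic_fetch_and(&dirty_bitmap[i], ~wanted[i]) & wanted[i];

		if (mask) {
			/* Lock per 64-page mask, not across the whole walk. */
			pthread_mutex_lock(&mmu_lock);
			arch_clear_dirty(offset, mask);
			pthread_mutex_unlock(&mmu_lock);
		}
	}
}

Since slots_lock already serializes CLEAR_DIRTY_LOG ioctls and the bitmap
harvest is atomic, nothing depends on the mutex being held across
iterations; bounding it to the arch callback is what lets vCPU page faults
interleave between 64-page mask words.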
* Re: [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
2023-10-27 17:26 [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG David Matlack
` (2 preceding siblings ...)
2023-10-27 17:26 ` [PATCH 3/3] KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG David Matlack
@ 2023-11-08 0:05 ` Sean Christopherson
2023-11-08 0:23 ` David Matlack
2023-12-01 1:52 ` Sean Christopherson
4 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2023-11-08 0:05 UTC (permalink / raw)
To: David Matlack; +Cc: Paolo Bonzini, Ben Gardon, kvm, Vipin Sharma
On Fri, Oct 27, 2023, David Matlack wrote:
> This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
> (Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
> David Matlack (3):
> KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
> KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP
> MMU
> KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG
Is there an actual dependency between 1-2 and 3? AFAICT, no? I ask because I
can take the first two as soon as -rc1 is out, but the generic change definitely
needs testing and acks from other architectures.
* Re: [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
2023-11-08 0:05 ` [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG Sean Christopherson
@ 2023-11-08 0:23 ` David Matlack
0 siblings, 0 replies; 8+ messages in thread
From: David Matlack @ 2023-11-08 0:23 UTC (permalink / raw)
To: Sean Christopherson; +Cc: Paolo Bonzini, Ben Gardon, kvm, Vipin Sharma
On Tue, Nov 7, 2023 at 4:05 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Oct 27, 2023, David Matlack wrote:
> > This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
> > (Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
> > David Matlack (3):
> > KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
> > KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP
> > MMU
> > KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG
>
> Is there an actual dependency between 1-2 and 3? AFAICT, no? I ask because I
> can take the first two as soon as -rc1 is out, but the generic change definitely
> needs testing and acks from other architectures.
No, there isn't any dependency between any of the commits. Feel free
to grab 1-2 independent of 3.
* Re: [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
2023-10-27 17:26 [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG David Matlack
` (3 preceding siblings ...)
2023-11-08 0:05 ` [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG Sean Christopherson
@ 2023-12-01 1:52 ` Sean Christopherson
2023-12-01 15:59 ` Sean Christopherson
4 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2023-12-01 1:52 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, David Matlack; +Cc: kvm, Vipin Sharma
On Fri, 27 Oct 2023 10:26:37 -0700, David Matlack wrote:
> This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
> (Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
>
> We've observed that guest performance can drop while userspace is
> issuing CLEAR_DIRTY_LOG ioctls and tracked down the problem to
> contention on the mmu_lock in vCPU threads. CLEAR_DIRTY_LOG holds the
> write-lock, so this isn't that surprising. We previously explored
> converting CLEAR_DIRTY_LOG to hold the read-lock [1], but that has some
> negative consequences:
>
> [...]
Applied 1 and 2 to kvm-x86 mmu. To get traction on #3, I recommend resending it
as a standalone patch with all KVM arch maintainers Cc'd.
[1/3] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
https://github.com/kvm-x86/linux/commit/7cd1bf039eeb
[2/3] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU
https://github.com/kvm-x86/linux/commit/76d1492924bc
--
https://github.com/kvm-x86/linux/tree/next
* Re: [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
2023-12-01 1:52 ` Sean Christopherson
@ 2023-12-01 15:59 ` Sean Christopherson
0 siblings, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2023-12-01 15:59 UTC (permalink / raw)
To: Paolo Bonzini, David Matlack; +Cc: kvm, Vipin Sharma
On Thu, Nov 30, 2023, Sean Christopherson wrote:
> On Fri, 27 Oct 2023 10:26:37 -0700, David Matlack wrote:
> > This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
> > (Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
> >
> > We've observed that guest performance can drop while userspace is
> > issuing CLEAR_DIRTY_LOG ioctls and tracked down the problem to
> > contention on the mmu_lock in vCPU threads. CLEAR_DIRTY_LOG holds the
> > write-lock, so this isn't that surprising. We previously explored
> > converting CLEAR_DIRTY_LOG to hold the read-lock [1], but that has some
> > negative consequences:
> >
> > [...]
>
> Applied 1 and 2 to kvm-x86 mmu. To get traction on #3, I recommend resending it
> as a standalone patch with all KVM arch maintainers Cc'd.
>
> [1/3] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
> https://github.com/kvm-x86/linux/commit/7cd1bf039eeb
> [2/3] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU
> https://github.com/kvm-x86/linux/commit/76d1492924bc
Force pushed because the Fixes: in patch 1 referenced a Google-internal commit.
New hashes:
[1/2] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
https://github.com/kvm-x86/linux/commit/1aa4bb916808
[2/2] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU
https://github.com/kvm-x86/linux/commit/45a61ebb2211