* [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
@ 2023-10-27 17:26 David Matlack
From: David Matlack @ 2023-10-27 17:26 UTC
To: Paolo Bonzini, Sean Christopherson
Cc: David Matlack, Ben Gardon, kvm, Vipin Sharma
This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
(Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
We've observed that guest performance can drop while userspace is
issuing CLEAR_DIRTY_LOG ioctls and tracked down the problem to
contention on the mmu_lock in vCPU threads. CLEAR_DIRTY_LOG holds the
write-lock, so this isn't that surprising. We previously explored
converting CLEAR_DIRTY_LOG to hold the read-lock [1], but that has some
negative consequences:
- Pretty significant code churn is required on x86 and ARM to support
doing CLEAR under the read-lock. Things get especially hairy on x86
when considering how to support the Shadow MMU.
- Holding the read-lock means KVM will have to use atomic
compare-and-exchange operations during eager splitting and clearing
dirty bits, which can be quite slow on certain ARM platforms.
This series proposes an alternative (well, complementary, really)
approach of simply dropping mmu_lock more frequently. I tested this
series with one of our internal Live Migration tests where the guest
is running MySQL in a 160-vCPU VM (Intel Broadwell host) and it
eliminates the performance drops we were seeing when userspace issues
CLEAR ioctls. Furthermore, I don't see any noticeable improvement when
I test with this series plus a prototype patch converting CLEAR to the
read-lock on x86, i.e. it seems we can eliminate most of the lock
contention by just dropping the lock more frequently.
Cc: Vipin Sharma <vipinsh@google.com>
[1] https://lore.kernel.org/kvm/20230602160914.4011728-1-vipinsh@google.com/
David Matlack (3):
KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP
MMU
KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG
arch/x86/kvm/mmu/mmu.c | 2 +-
arch/x86/kvm/mmu/tdp_mmu.c | 7 ++++---
virt/kvm/kvm_main.c | 4 ++--
3 files changed, 7 insertions(+), 6 deletions(-)
base-commit: 2b3f2325e71f09098723727d665e2e8003d455dc
--
2.42.0.820.g83a721a137-goog
* [PATCH 1/3] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
From: David Matlack @ 2023-10-27 17:26 UTC
To: Paolo Bonzini, Sean Christopherson
Cc: David Matlack, Ben Gardon, kvm, Vipin Sharma
Fix an off-by-1 error when passing in the range of pages to
kvm_mmu_try_split_huge_pages() during CLEAR_DIRTY_LOG. Specifically,
`end` is the last page that needs to be split (inclusive), so pass in
`end + 1`, since kvm_mmu_try_split_huge_pages() expects its `end` to
be exclusive.
At worst this will cause a huge page to be write-protected instead of
eagerly split, which is purely a performance issue, not a correctness
issue. But even that is unlikely, as it would require userspace to
pass in a bitmap where the last page is the only 4K page on a huge
page that needs to be split.
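To make the inclusive/exclusive mismatch concrete, here is a minimal
userspace sketch (illustration only, not kernel code; fls_index() is a
stand-in with the same semantics as the kernel's __fls(), built on a
GCC/Clang builtin):

#include <stdio.h>

/* Stand-in for the kernel's __fls(): bit index of the most-significant
 * set bit in a non-zero word. */
static unsigned long fls_index(unsigned long mask)
{
        return 8 * sizeof(mask) - 1 - __builtin_clzl(mask);
}

int main(void)
{
        unsigned long mask = 0x8;  /* only bit 3 (the last page) is dirty */
        unsigned long gfn_offset = 0;
        unsigned long end = gfn_offset + fls_index(mask);  /* inclusive: page 3 */

        /* An exclusive-end API given `end` stops short of page 3; it
         * must be given `end + 1` to cover the last dirty page. */
        printf("inclusive end = %lu, exclusive bound = %lu\n", end, end + 1);
        return 0;
}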
Reported-by: Vipin Sharma <vipinsh@google.com>
Fixes: f2928aae8b9a ("UPSTREAM: KVM: x86/mmu: Split huge pages mapped by the TDP MMU during KVM_CLEAR_DIRTY_LOG")
Signed-off-by: David Matlack <dmatlack@google.com>
---
arch/x86/kvm/mmu/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f7901cb4d2fa..6aa966631cab 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1382,7 +1382,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
if (READ_ONCE(eager_page_split))
- kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K);
+ kvm_mmu_try_split_huge_pages(kvm, slot, start, end + 1, PG_LEVEL_4K);
kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
--
2.42.0.820.g83a721a137-goog
* [PATCH 2/3] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU
From: David Matlack @ 2023-10-27 17:26 UTC
To: Paolo Bonzini, Sean Christopherson
Cc: David Matlack, Ben Gardon, kvm, Vipin Sharma
Re-check that the given SPTE is still a present leaf SPTE after a
failed cmpxchg in clear_dirty_gfn_range(). clear_dirty_gfn_range()
intends to operate only on present leaf SPTEs, but that could change
after a failed cmpxchg.
A check for present was added in commit 3354ef5a592d ("KVM: x86/mmu:
Check for present SPTE when clearing dirty bit in TDP MMU") but the
check for leaf is still buried in tdp_root_for_each_leaf_pte() and does
not get rechecked on retry.
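The hazard generalizes to any cmpxchg retry loop: a failed
compare-and-exchange hands back a refreshed value, so every predicate
checked before the cmpxchg must be re-evaluated on retry. A standalone
C11 sketch of the pattern (illustrative only; the function names and
the is_present_leaf() predicate are hypothetical, not the TDP MMU's
API):

#include <stdatomic.h>
#include <stdbool.h>

/* Clear the "dirty" bit (bit 0 here, for illustration) iff the entry
 * is still a present leaf at the moment the swap succeeds. */
static bool clear_dirty_if_leaf(_Atomic unsigned long *spte,
                                bool (*is_present_leaf)(unsigned long))
{
        unsigned long old = atomic_load(spte);

        for (;;) {
                /* Re-checked on every iteration: a failed cmpxchg means
                 * the entry changed underneath us and may no longer be
                 * a present leaf. */
                if (!is_present_leaf(old))
                        return false;
                if (atomic_compare_exchange_strong(spte, &old, old & ~1UL))
                        return true;
                /* On failure, `old` now holds the refreshed value and
                 * the loop re-validates it. */
        }
}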
Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Signed-off-by: David Matlack <dmatlack@google.com>
---
arch/x86/kvm/mmu/tdp_mmu.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 6cd4dd631a2f..038983b13574 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1522,12 +1522,13 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
rcu_read_lock();
- tdp_root_for_each_leaf_pte(iter, root, start, end) {
+ tdp_root_for_each_pte(iter, root, start, end) {
retry:
- if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true))
+ if (!is_shadow_present_pte(iter.old_spte) ||
+ !is_last_spte(iter.old_spte, iter.level))
continue;
- if (!is_shadow_present_pte(iter.old_spte))
+ if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true))
continue;
KVM_MMU_WARN_ON(kvm_ad_enabled() &&
--
2.42.0.820.g83a721a137-goog
* [PATCH 3/3] KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG
From: David Matlack @ 2023-10-27 17:26 UTC
To: Paolo Bonzini, Sean Christopherson
Cc: David Matlack, Ben Gardon, kvm, Vipin Sharma
Drop and reacquire the mmu_lock during CLEAR_DIRTY_LOG to avoid blocking
other threads from acquiring the mmu_lock (e.g. vCPUs taking page
faults).
It should be safe to drop and reacquire the mmu_lock from a correctness
standpoint. slots_lock already ensures that only one thread in KVM is
processing a GET/CLEAR_DIRTY_LOG ioctl. And KVM already has to be
prepared to handle vCPUs updating the dirty log concurrent with
CLEAR_DIRTY_LOG, hence the atomic_long_fetch_andnot(). So holding the
mmu_lock across loop iterations is entirely unnecessary. It only needs
to be acquired when calling into the arch-specific code to modify page
tables.
This change eliminates the drops in guest performance that we've
observed during the live migration of a 160-vCPU VM while userspace is
issuing CLEAR ioctls (tested with 1GiB and 8GiB CLEARs). Userspace could
issue finer-grained CLEAR ioctls, which would also reduce contention on
the mmu_lock, but doing so will increase the rate of remote TLB
flushing, so there is a limit. And there's really no reason to punt this
problem to userspace. KVM can just drop and reacquire the lock more
frequently to avoid holding it for too long.
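The shape of the change is simply narrowing the critical section from
the whole loop to each chunk of work. A hedged userspace sketch of the
pattern (names like write_protect_chunk() are hypothetical, and
atomic_exchange() stands in for the kernel's
atomic_long_fetch_andnot()):

#include <pthread.h>
#include <stdatomic.h>

#define NR_CHUNKS 64

static pthread_mutex_t mmu_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic unsigned long dirty_bitmap[NR_CHUNKS];

/* Hypothetical stand-in for the arch page-table update. */
static void write_protect_chunk(int i, unsigned long mask)
{
        (void)i;
        (void)mask;
}

static void clear_dirty_log(void)
{
        for (int i = 0; i < NR_CHUNKS; i++) {
                /* Lock-free handoff with concurrently-dirtying vCPUs. */
                unsigned long mask = atomic_exchange(&dirty_bitmap[i], 0);

                if (!mask)
                        continue;

                /* The lock is held only for the page-table update, so
                 * waiters (e.g. vCPU faults) can interleave between
                 * chunks instead of stalling for the whole bitmap. */
                pthread_mutex_lock(&mmu_lock);
                write_protect_chunk(i, mask);
                pthread_mutex_unlock(&mmu_lock);
        }
}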
Signed-off-by: David Matlack <dmatlack@google.com>
---
virt/kvm/kvm_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 486800a7024b..afa61a2309d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2297,7 +2297,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n))
return -EFAULT;
- KVM_MMU_LOCK(kvm);
for (offset = log->first_page, i = offset / BITS_PER_LONG,
n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--;
i++, offset += BITS_PER_LONG) {
@@ -2316,11 +2315,12 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
*/
if (mask) {
flush = true;
+ KVM_MMU_LOCK(kvm);
kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
offset, mask);
+ KVM_MMU_UNLOCK(kvm);
}
}
- KVM_MMU_UNLOCK(kvm);
if (flush)
kvm_flush_remote_tlbs_memslot(kvm, memslot);
--
2.42.0.820.g83a721a137-goog
* Re: [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
From: Sean Christopherson @ 2023-11-08 0:05 UTC
To: David Matlack; +Cc: Paolo Bonzini, Ben Gardon, kvm, Vipin Sharma
On Fri, Oct 27, 2023, David Matlack wrote:
> This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
> (Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
> David Matlack (3):
> KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
> KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP
> MMU
> KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG
Is there an actual dependency between 1-2 and 3? AFAICT, no? I ask because I
can take the first two as soon as -rc1 is out, but the generic change definitely
needs testing and acks from other architectures.
* Re: [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
From: David Matlack @ 2023-11-08 0:23 UTC
To: Sean Christopherson; +Cc: Paolo Bonzini, Ben Gardon, kvm, Vipin Sharma
On Tue, Nov 7, 2023 at 4:05 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Oct 27, 2023, David Matlack wrote:
> > This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
> > (Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
> > David Matlack (3):
> > KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
> > KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP
> > MMU
> > KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG
>
> Is there an actual dependency between 1-2 and 3? AFAICT, no? I ask because I
> can take the first two as soon as -rc1 is out, but the generic change definitely
> needs testing and acks from other architectures.
No, there isn't any dependency between any of the commits. Feel free
to grab 1-2 independent of 3.
* Re: [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
From: Sean Christopherson @ 2023-12-01 1:52 UTC
To: Sean Christopherson, Paolo Bonzini, David Matlack; +Cc: kvm, Vipin Sharma
On Fri, 27 Oct 2023 10:26:37 -0700, David Matlack wrote:
> This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
> (Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
>
> We've observed that guest performance can drop while userspace is
> issuing CLEAR_DIRTY_LOG ioctls and tracked down the problem to
> contention on the mmu_lock in vCPU threads. CLEAR_DIRTY_LOG holds the
> write-lock, so this isn't that surprising. We previously explored
> converting CLEAR_DIRTY_LOG to hold the read-lock [1], but that has some
> negative consequences:
>
> [...]
Applied 1 and 2 to kvm-x86 mmu. To get traction on #3, I recommend resending it
as a standalone patch with all KVM arch maintainers Cc'd.
[1/3] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
https://github.com/kvm-x86/linux/commit/7cd1bf039eeb
[2/3] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU
https://github.com/kvm-x86/linux/commit/76d1492924bc
--
https://github.com/kvm-x86/linux/tree/next
* Re: [PATCH 0/3] KVM: Performance and correctness fixes for CLEAR_DIRTY_LOG
From: Sean Christopherson @ 2023-12-01 15:59 UTC
To: Paolo Bonzini, David Matlack; +Cc: kvm, Vipin Sharma
On Thu, Nov 30, 2023, Sean Christopherson wrote:
> On Fri, 27 Oct 2023 10:26:37 -0700, David Matlack wrote:
> > This series reduces the impact of CLEAR_DIRTY_LOG on guest performance
> > (Patch 3) and fixes 2 minor bugs found along the way (Patches 1 and 2).
> >
> > We've observed that guest performance can drop while userspace is
> > issuing CLEAR_DIRTY_LOG ioctls and tracked down the problem to
> > contention on the mmu_lock in vCPU threads. CLEAR_DIRTY_LOG holds the
> > write-lock, so this isn't that surprising. We previously explored
> > converting CLEAR_DIRTY_LOG to hold the read-lock [1], but that has some
> > negative consequences:
> >
> > [...]
>
> Applied 1 and 2 to kvm-x86 mmu. To get traction on #3, I recommend resending it
> as a standalone patch with all KVM arch maintainers Cc'd.
>
> [1/3] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
> https://github.com/kvm-x86/linux/commit/7cd1bf039eeb
> [2/3] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU
> https://github.com/kvm-x86/linux/commit/76d1492924bc
Force pushed because the Fixes: in patch 1 referenced a Google-internal commit.
New hashes:
[1/2] KVM: x86/mmu: Fix off-by-1 when splitting huge pages during CLEAR
https://github.com/kvm-x86/linux/commit/1aa4bb916808
[2/2] KVM: x86/mmu: Check for leaf SPTE when clearing dirty bit in the TDP MMU
https://github.com/kvm-x86/linux/commit/45a61ebb2211