From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Ben Gardon <bgardon@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.18 11/35] KVM: x86/MMU: Zap non-leaf SPTEs when disabling dirty logging
Date: Tue, 9 Aug 2022 20:00:40 +0200
Message-ID: <20220809175515.481015905@linuxfoundation.org>
In-Reply-To: <20220809175515.046484486@linuxfoundation.org>
From: Ben Gardon <bgardon@google.com>
[ Upstream commit 5ba7c4c6d1c7af47a916f728bb5940669684a087 ]
Currently disabling dirty logging with the TDP MMU is extremely slow.
On a 96 vCPU / 96G VM backed with gigabyte pages, it takes ~200 seconds
to disable dirty logging with the TDP MMU, as opposed to ~4 seconds with
the shadow MMU.
When disabling dirty logging, zap non-leaf parent entries to allow
replacement with huge pages instead of recursing and zapping all of the
child, leaf entries. This reduces the number of TLB flushes required
and reduces the disable dirty log time with the TDP MMU to ~3 seconds.
Opportunistically add a WARN() to catch GFNs that are mapped at a
higher level than their max level.
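As a rough, back-of-the-envelope illustration of why zapping at the parent
level helps, the sketch below counts how many mappings cover the 96G guest
at two page sizes. It assumes, purely for illustration, that dirty logging
had split all guest memory down to 4K mappings; the helper name
mappings_needed() and the TOY_* constants are hypothetical and not part of
the patch.

```c
#include <stdint.h>

#define TOY_SZ_4K (1ULL << 12)
#define TOY_SZ_1G (1ULL << 30)

/*
 * Hypothetical helper: number of mappings of size page_size needed to
 * cover mem_bytes of guest memory, rounding up for a partial last page.
 */
static uint64_t mappings_needed(uint64_t mem_bytes, uint64_t page_size)
{
	return (mem_bytes + page_size - 1) / page_size;
}
```

Under that assumption, mappings_needed(96ULL << 30, TOY_SZ_4K) gives
25165824 leaf entries, while mappings_needed(96ULL << 30, TOY_SZ_1G) gives
96 entries at the 1G level. Since each successful atomic zap also does a
remote TLB flush, zapping one parent entry per gigabyte region instead of
every 4K leaf beneath it cuts the flush count by several orders of
magnitude.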
Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20220525230904.1584480-1-bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/kvm/mmu/tdp_iter.c | 9 +++++++++
arch/x86/kvm/mmu/tdp_iter.h | 1 +
arch/x86/kvm/mmu/tdp_mmu.c | 38 +++++++++++++++++++++++++++++++------
3 files changed, 42 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
index 6d3b3e5a5533..ee4802d7b36c 100644
--- a/arch/x86/kvm/mmu/tdp_iter.c
+++ b/arch/x86/kvm/mmu/tdp_iter.c
@@ -145,6 +145,15 @@ static bool try_step_up(struct tdp_iter *iter)
return true;
}
+/*
+ * Step the iterator back up a level in the paging structure. Should only be
+ * used when the iterator is below the root level.
+ */
+void tdp_iter_step_up(struct tdp_iter *iter)
+{
+ WARN_ON(!try_step_up(iter));
+}
+
/*
* Step to the next SPTE in a pre-order traversal of the paging structure.
* To get to the next SPTE, the iterator either steps down towards the goal
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index f0af385c56e0..adfca0cf94d3 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -114,5 +114,6 @@ void tdp_iter_start(struct tdp_iter *iter, struct kvm_mmu_page *root,
int min_level, gfn_t next_last_level_gfn);
void tdp_iter_next(struct tdp_iter *iter);
void tdp_iter_restart(struct tdp_iter *iter);
+void tdp_iter_step_up(struct tdp_iter *iter);
#endif /* __KVM_X86_MMU_TDP_ITER_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 922b06bf4b94..b61a11d462cc 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1748,12 +1748,12 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
gfn_t start = slot->base_gfn;
gfn_t end = start + slot->npages;
struct tdp_iter iter;
+ int max_mapping_level;
kvm_pfn_t pfn;
rcu_read_lock();
tdp_root_for_each_pte(iter, root, start, end) {
-retry:
if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true))
continue;
@@ -1761,15 +1761,41 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
!is_last_spte(iter.old_spte, iter.level))
continue;
+ /*
+ * This is a leaf SPTE. Check if the PFN it maps can
+ * be mapped at a higher level.
+ */
pfn = spte_to_pfn(iter.old_spte);
- if (kvm_is_reserved_pfn(pfn) ||
- iter.level >= kvm_mmu_max_mapping_level(kvm, slot, iter.gfn,
- pfn, PG_LEVEL_NUM))
+
+ if (kvm_is_reserved_pfn(pfn))
continue;
+ max_mapping_level = kvm_mmu_max_mapping_level(kvm, slot,
+ iter.gfn, pfn, PG_LEVEL_NUM);
+
+ WARN_ON(max_mapping_level < iter.level);
+
+ /*
+ * If this page is already mapped at the highest
+ * viable level, there's nothing more to do.
+ */
+ if (max_mapping_level == iter.level)
+ continue;
+
+ /*
+ * The page can be remapped at a higher level, so step
+ * up to zap the parent SPTE.
+ */
+ while (max_mapping_level > iter.level)
+ tdp_iter_step_up(&iter);
+
/* Note, a successful atomic zap also does a remote TLB flush. */
- if (tdp_mmu_zap_spte_atomic(kvm, &iter))
- goto retry;
+ tdp_mmu_zap_spte_atomic(kvm, &iter);
+
+ /*
+ * If the atomic zap fails, the iter will recurse back into
+ * the same subtree to retry.
+ */
}
rcu_read_unlock();
--
2.35.1
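
The step-up loop added to zap_collapsible_spte_range() can be modelled
outside the kernel with a toy iterator. This is a simplified sketch, not
the kernel implementation: struct toy_iter and the toy_* functions are
hypothetical, and the model assumes 9 index bits per paging level (level 1
maps 4K, level 2 maps 2M, level 3 maps 1G) and tracks only the level and
the GFN, not the SPTE pointers.

```c
#include <stdint.h>

/* Toy stand-in for struct tdp_iter: only the fields the loop needs. */
struct toy_iter {
	int level;       /* current level: 1 = 4K, 2 = 2M, 3 = 1G */
	int root_level;  /* level of the root of the paging structure */
	uint64_t gfn;    /* guest frame number the iterator points at */
};

/* Number of 4K pages covered by one entry at the given level. */
static uint64_t toy_pages_per_level(int level)
{
	return 1ULL << ((level - 1) * 9);
}

/*
 * Move one level up and align the GFN to the start of the region the
 * parent entry covers. Returns 0 if already at the root.
 */
static int toy_step_up(struct toy_iter *it)
{
	if (it->level == it->root_level)
		return 0;

	it->level++;
	it->gfn &= ~(toy_pages_per_level(it->level) - 1);
	return 1;
}

/*
 * Mirror of the patch's "while (max_mapping_level > iter.level)" loop:
 * climb until the iterator sits on the entry that would be zapped.
 * Returns the number of levels climbed.
 */
static int toy_step_up_to(struct toy_iter *it, int max_mapping_level)
{
	int steps = 0;

	while (max_mapping_level > it->level && toy_step_up(it))
		steps++;

	return steps;
}
```

For example, starting from a 4K leaf at GFN 0x52345 with a maximum mapping
level of 3, toy_step_up_to() climbs two levels and leaves the iterator at
level 3 with GFN 0x40000, the start of the enclosing 1G region. Zapping
that single entry removes the whole subtree, so the leaves beneath it never
need to be visited individually.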