From: James Houghton <jthoughton@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>
Cc: Vipin Sharma <vipinsh@google.com>,
David Matlack <dmatlack@google.com>,
James Houghton <jthoughton@google.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 0/7] KVM: x86/mmu: Run TDP MMU NX huge page recovery under MMU read lock
Date: Mon, 7 Jul 2025 22:47:13 +0000
Message-ID: <20250707224720.4016504-1-jthoughton@google.com>
Hi Sean/Paolo,
I'm finishing off Vipin's NX huge page recovery optimization for the TDP
MMU from last year. This is a respin of the series I sent a couple of
weeks ago, v4. Below is a mostly unchanged cover letter from v4.
NX huge page recovery can cause guest performance jitter, originally
noticed with network tests in Windows guests. Please see Vipin's earlier
performance results[1]. Below is some new data I have collected with the
nx_huge_pages_perf_test that I've included with this series.
The NX huge page recovery for the shadow MMU is still done under the MMU
write lock, but with the TDP MMU, we can instead do it under the MMU
read lock by:
  1. Tracking the possible NX huge pages for the two MMUs separately
     (patch 1).
  2. Updating the NX huge page recovery routine for the TDP MMU to
      - zap SPTEs atomically, and
      - grab tdp_mmu_pages_lock to iterate over the NX huge page list
     (patch 3).
I threw in patch 4 because it seems harmless and closer to the "right"
thing to do. Feel free to drop it if you don't agree with me. :)
I'm also grabbing David's execute_perf_test[3] while I'm at it. It was
dropped before simply because it didn't apply at the time. David's test
works well as a stress test for NX huge page recovery when NX huge page
recovery is tuned to be very aggressive.
Changes since v4[4]:
- 32-bit build fixups for patches 1 and 3.
- Small variable rename in patch 3.
Changes since v3[2]:
- Dropped the move of the `sp->nx_huge_page_disallowed` check to outside
  of the tdp_mmu_pages_lock.
- Implemented Sean's array suggestion for `possible_nx_huge_pages`.
- Implemented some other cleanup suggestions from Sean.
- Made shadow MMU not take the RCU lock in NX huge page recovery.
- Added a selftest for measuring jitter.
- Added David's execute_perf_test[3].
-- Results
$ cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
100
$ cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
4
$ ./nx_huge_pages_perf_test -b 16G -s anonymous_hugetlb_1gb
[Unpatched] Max fault latency: 8496724 cycles
[Unpatched] Max fault latency: 8404426 cycles
[ Patched ] Max fault latency: 49418 cycles
[ Patched ] Max fault latency: 51948 cycles
$ ./nx_huge_pages_perf_test -b 16G -s anonymous_hugetlb_2mb
[Unpatched] Max fault latency: 5320740 cycles
[Unpatched] Max fault latency: 5384554 cycles
[ Patched ] Max fault latency: 50052 cycles
[ Patched ] Max fault latency: 103774 cycles
$ ./nx_huge_pages_perf_test -b 16G -s anonymous_thp
[Unpatched] Max fault latency: 7625022 cycles
[Unpatched] Max fault latency: 6339934 cycles
[ Patched ] Max fault latency: 107976 cycles
[ Patched ] Max fault latency: 108386 cycles
$ ./nx_huge_pages_perf_test -b 16G -s anonymous
[Unpatched] Max fault latency: 143036 cycles
[Unpatched] Max fault latency: 287444 cycles
[ Patched ] Max fault latency: 274626 cycles
[ Patched ] Max fault latency: 303984 cycles
We can see roughly a 50x to 170x decrease in maximum fault latency for
2M pages (HugeTLB and THP) and 1G pages; the 4K anonymous case shows no
comparable change. This test only times writes to unmapped pages that
are not themselves currently undergoing NX huge page recovery. It only
produces interesting results when NX huge page recovery is actually
occurring, so the parameters are tuned to make it very likely for
recovery to occur in the middle of the test.
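
As a quick sanity check of the improvement figures, the ratios can be recomputed from the numbers above. This is a small sketch, not part of the series; `conservative_ratio()` is a made-up helper that divides the smaller of the two unpatched maxima by the larger of the two patched maxima, so it understates the improvement rather than overstating it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: worst-case-latency improvement from two unpatched
 * and two patched runs, computed pessimistically (smallest unpatched
 * value over largest patched value) and rounded down.
 */
static uint64_t conservative_ratio(uint64_t unpatched_a, uint64_t unpatched_b,
				   uint64_t patched_a, uint64_t patched_b)
{
	uint64_t lo = unpatched_a < unpatched_b ? unpatched_a : unpatched_b;
	uint64_t hi = patched_a > patched_b ? patched_a : patched_b;

	return hi ? lo / hi : 0;
}
```

Fed the cycle counts listed above, this gives 161 for anonymous_hugetlb_1gb, 51 for anonymous_hugetlb_2mb, 58 for anonymous_thp, and 0 (no improvement) for anonymous.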
Based on latest kvm/next.
[1]: https://lore.kernel.org/kvm/20240906204515.3276696-3-vipinsh@google.com/
[2]: https://lore.kernel.org/kvm/20240906204515.3276696-1-vipinsh@google.com/
[3]: https://lore.kernel.org/kvm/20221109185905.486172-2-dmatlack@google.com/
[4]: https://lore.kernel.org/kvm/20250616181144.2874709-1-jthoughton@google.com/
David Matlack (1):
  KVM: selftests: Introduce a selftest to measure execution performance

James Houghton (3):
  KVM: x86/mmu: Only grab RCU lock for nx hugepage recovery for TDP MMU
  KVM: selftests: Provide extra mmap flags in vm_mem_add()
  KVM: selftests: Add an NX huge pages jitter test

Vipin Sharma (3):
  KVM: x86/mmu: Track TDP MMU NX huge pages separately
  KVM: x86/mmu: Rename kvm_tdp_mmu_zap_sp() to better indicate its
    purpose
  KVM: x86/mmu: Recover TDP MMU NX huge pages using MMU read lock
 arch/x86/include/asm/kvm_host.h               |  43 +++-
 arch/x86/kvm/mmu/mmu.c                        | 180 +++++++++-----
 arch/x86/kvm/mmu/mmu_internal.h               |   7 +-
 arch/x86/kvm/mmu/tdp_mmu.c                    |  49 +++-
 arch/x86/kvm/mmu/tdp_mmu.h                    |   3 +-
 tools/testing/selftests/kvm/Makefile.kvm      |   2 +
 .../testing/selftests/kvm/execute_perf_test.c | 199 ++++++++++++++++
 .../testing/selftests/kvm/include/kvm_util.h  |   3 +-
 .../testing/selftests/kvm/include/memstress.h |   4 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |  15 +-
 tools/testing/selftests/kvm/lib/memstress.c   |  25 +-
 .../kvm/x86/nx_huge_pages_perf_test.c         | 223 ++++++++++++++++++
 .../kvm/x86/private_mem_conversions_test.c    |   2 +-
 13 files changed, 656 insertions(+), 99 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/execute_perf_test.c
 create mode 100644 tools/testing/selftests/kvm/x86/nx_huge_pages_perf_test.c
base-commit: 8046d29dde17002523f94d3e6e0ebe486ce52166
--
2.50.0.727.gbf7dc18ff4-goog