All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] Optimize S2 page splitting
@ 2026-06-18 13:14 Leonardo Bras
  2026-06-18 13:14 ` [PATCH v2 1/3] KVM: arm64: Avoid re-testing walk_continue Leonardo Bras
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Leonardo Bras @ 2026-06-18 13:14 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Fuad Tabba, Leonardo Bras, Raghavendra Rao Ananta
  Cc: linux-arm-kernel, kvmarm, linux-kernel

While playing with dirty-bit tracking, I decided to take a look on how page
splitting works. Found out all entries are walked, even though we can infer,
for instance that:
- If a level-3 entry is walked, it means the parent level-2 entry is split
- If a split just succeeded in an table entry, it means all children nodes
  are already split

This patches' idea is to introduce new walking flags to skip pagetable
levels 0-3.

The idea of skipping child nodes was also tested, but it was marginally
slower than just skipping levels, so it was discarted.

Optimization measured on two scenarios involving eager-splitting on a
VM with 1 memslot of 16GB:
- Scenario 1: No manual protect, whole memslot split at dirty-track enable
  (KVM_SET_USER_MEMORY_REGION2 ioctl with KVM_MEM_LOG_DIRTY_PAGES)
  - Split happens only once, whole region
  - Evalutes improved batch performance of splitting
- Scenario 2: Manual protect, split happens during every dirty-bit clean
  (KVM_CLEAR_DIRTY_LOG ioctl), average for 2 iterations.
  - Split called multiple times, for smaller 64-page sections.
  - Evaluate improved performance for multiple calls

Scenario 1, improvement on dirty-track enable ioctl for the memslot:
- Memory was already split (4k pages):  -44.01% runtime (stdev 2.80%)
- THP backed memory:                    -24.66% runtime (stdev 1.21%)
- 16x1GB hugetlb memory:                -24.78% runtime (stdev 0.85%)

Scenario 2, improvement on dirty-log clean ioctl for the memslot:
- Memory was already split (4k pages):  -38.98% runtime (stdev 1.91%)
- THP backed memory:                    -25.49% runtime (stdev 0.65%)
- 16x1GB hugetlb memory:                -24.24% runtime (stdev 0.65%)

For collecting above numbers, the following script was ran in both vanilla
and patched kernels, with kernel parameter 'default_hugepagesz=1G', on an
TX2 with 32GB RAM.

--- dirty_test.sh
#!/bin/bash
filename=$(uname -r |cut -d'-' -f 4-)

run_test(){
  uname -a
  cat /proc/cmdline

  #prepare
  sudo bash -c 'echo 64 > /proc/sys/vm/nr_hugepages'

  ./dirty_log_perf_test -g -b 64G
  ./dirty_log_perf_test -g -b 64G -s anonymous_thp
  ./dirty_log_perf_test -g -b 64G -s shared_hugetlb

  ./dirty_log_perf_test -b 64G
  ./dirty_log_perf_test -b 64G -s anonymous_thp
  ./dirty_log_perf_test -b 64G -s shared_hugetlb
}

run_test 2>&1 | tee ${filename}
---

Above dirty_log_perf_test command is the standard kvm selftest found in the
kernel tree. It tested the following guest modes:
Testing guest mode: PA-bits:40,  VA-bits:48,  4K pages
Testing guest mode: PA-bits:40,  VA-bits:48, 64K pages
Testing guest mode: PA-bits:36,  VA-bits:48,  4K pages
Testing guest mode: PA-bits:36,  VA-bits:48, 64K pages

Performance numbers from above modes were used to calculate average and
stdev showed in the optimization results.

Changes since v1:
- Fixed inverted flag verification priority (Sashiko)
- Fixed incorrectly skipping POST call if level was skipped (Sashiko), and to that
- New pre-patch that changes goto-out -> return to avoid re-testing walk_continue 
v1 Link: https://lore.kernel.org/lkml/20260610202112.2695205-2-leo.bras@arm.com/

Changes since RFC:
- Changed approach from return value to walk flags (Will Deacon)
- Discarted skip_child approach (Oliver Upton)
- Measured in real hardware, and from userspace perspective (Marc Zyngier)
- Better explanation of what and how numbers were collected
RFC Link: https://lore.kernel.org/all/20260515195904.2466381-1-leo.bras@arm.com/

Thanks!
Leo

Leonardo Bras (3):
  KVM: arm64: Avoid re-testing walk_continue
  KVM: arm64: Introduce KVM_PGTABLE_WALK_SKIP_LEVEL* walk flags
  KVM: arm64: Make stage2_split_walker() skip unnecessary walks

 arch/arm64/include/asm/kvm_pgtable.h | 13 +++++++++++++
 arch/arm64/kvm/hyp/pgtable.c         | 28 +++++++++++++++++++++-------
 2 files changed, 34 insertions(+), 7 deletions(-)


base-commit: 66affa37cfac0aec061cc4bcf4a065b0c52f7e19
-- 
2.54.0



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-06-18 14:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-18 13:14 [PATCH v2 0/3] Optimize S2 page splitting Leonardo Bras
2026-06-18 13:14 ` [PATCH v2 1/3] KVM: arm64: Avoid re-testing walk_continue Leonardo Bras
2026-06-18 13:14 ` [PATCH v2 2/3] KVM: arm64: Introduce KVM_PGTABLE_WALK_SKIP_LEVEL* walk flags Leonardo Bras
2026-06-18 13:14 ` [PATCH v2 3/3] KVM: arm64: Make stage2_split_walker() skip unnecessary walks Leonardo Bras
2026-06-18 14:38 ` [PATCH v2 0/3] Optimize S2 page splitting Leonardo Bras

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.