* [RFC PATCH 0/3] KVM: selftests: Improvements on dirty-ring
@ 2026-06-24 17:16 Leonardo Bras
2026-06-24 17:16 ` [RFC PATCH 1/3] KVM: selftests: memstress: Add option to enable dirty-ring on VM creation Leonardo Bras
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Leonardo Bras @ 2026-06-24 17:16 UTC
To: Paolo Bonzini, Shuah Khan, Sean Christopherson, David Matlack,
Leonardo Bras, Ackerley Tng, Marc Zyngier, Oliver Upton, Wu Fei,
Claudio Imbrenda, Steffen Eiden
Cc: kvm, linux-kselftest, linux-kernel
Add dirty-ring support to dirty_log_perf_test, so we can measure
performance differences when we change or improve the mechanism.
The main open question in this series: dirty-ring has to be enabled
after the VM is created, but before any vCPU is created. To handle
that, I changed memstress so it takes the dirty-ring size as a
parameter at VM creation. (Patch #1)
I honestly don't think it's pretty, but I could not think of a
better way of doing this without messing too much with the code.
Also added a check to the dirty-ring enable path, so the error message
for an invalid ring size is more informative. (Patch #2)
Please provide feedback :)
Thanks!
Leo
Leonardo Bras (3):
KVM: selftests: memstress: Add option to enable dirty-ring on VM
creation
KVM: selftests: Check dirty-ring size before enabling
KVM: selftests: dirty_log_perf_test: Add dirty-ring support
.../testing/selftests/kvm/include/memstress.h | 3 +-
.../selftests/kvm/access_tracking_perf_test.c | 2 +-
.../selftests/kvm/demand_paging_test.c | 2 +-
.../selftests/kvm/dirty_log_perf_test.c | 100 ++++++++++++++++--
tools/testing/selftests/kvm/lib/kvm_util.c | 18 +++-
tools/testing/selftests/kvm/lib/memstress.c | 34 +++++-
.../kvm/memslot_modification_stress_test.c | 2 +-
.../kvm/x86/dirty_log_page_splitting_test.c | 2 +-
8 files changed, 140 insertions(+), 23 deletions(-)
base-commit: 840ef6c78e6a2f694b578ecb9063241c992aaa9e
--
2.54.0
* [RFC PATCH 1/3] KVM: selftests: memstress: Add option to enable dirty-ring on VM creation
2026-06-24 17:16 [RFC PATCH 0/3] KVM: selftests: Improvements on dirty-ring Leonardo Bras
@ 2026-06-24 17:16 ` Leonardo Bras
2026-06-24 17:16 ` [RFC PATCH 2/3] KVM: selftests: Check dirty-ring size before enabling Leonardo Bras
2026-06-24 17:16 ` [RFC PATCH 3/3] KVM: selftests: dirty_log_perf_test: Add dirty-ring support Leonardo Bras
2 siblings, 0 replies; 6+ messages in thread
From: Leonardo Bras @ 2026-06-24 17:16 UTC
To: Paolo Bonzini, Shuah Khan, Sean Christopherson, David Matlack,
Leonardo Bras, Ackerley Tng, Marc Zyngier, Oliver Upton, Wu Fei,
Claudio Imbrenda, Steffen Eiden
Cc: kvm, linux-kselftest, linux-kernel
Dirty-ring must be enabled after the VM is created, but before any vCPU
is created. To do so, add a parameter to memstress_create_vm() that takes
the dirty-ring size and enables the ring at the correct moment.
This required a new function, memstress_vm_create_with_vcpus(), based on
the previously used __vm_create_with_vcpus().
Signed-off-by: Leonardo Bras <leo.bras@arm.com>
---
.../testing/selftests/kvm/include/memstress.h | 3 +-
.../selftests/kvm/access_tracking_perf_test.c | 2 +-
.../selftests/kvm/demand_paging_test.c | 2 +-
.../selftests/kvm/dirty_log_perf_test.c | 2 +-
tools/testing/selftests/kvm/lib/memstress.c | 34 ++++++++++++++++---
.../kvm/memslot_modification_stress_test.c | 2 +-
.../kvm/x86/dirty_log_page_splitting_test.c | 2 +-
7 files changed, 37 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/memstress.h b/tools/testing/selftests/kvm/include/memstress.h
index 0d1d6230cc05..e17f344978af 100644
--- a/tools/testing/selftests/kvm/include/memstress.h
+++ b/tools/testing/selftests/kvm/include/memstress.h
@@ -51,21 +51,22 @@ struct memstress_args {
bool stop_vcpus;
struct memstress_vcpu_args vcpu_args[KVM_MAX_VCPUS];
};
extern struct memstress_args memstress_args;
struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus,
u64 vcpu_memory_bytes, int slots,
enum vm_mem_backing_src_type backing_src,
- bool partition_vcpu_memory_access);
+ bool partition_vcpu_memory_access,
+ u32 dirty_ring_size);
void memstress_destroy_vm(struct kvm_vm *vm);
void memstress_set_write_percent(struct kvm_vm *vm, u32 write_percent);
void memstress_set_random_access(struct kvm_vm *vm, bool random_access);
void memstress_start_vcpu_threads(int vcpus, void (*vcpu_fn)(struct memstress_vcpu_args *));
void memstress_join_vcpu_threads(int vcpus);
void memstress_guest_code(u32 vcpu_id);
u64 memstress_nested_pages(int nr_vcpus);
diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c
index 4415c94b2866..f78d56265d89 100644
--- a/tools/testing/selftests/kvm/access_tracking_perf_test.c
+++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c
@@ -402,21 +402,21 @@ static void mark_memory_idle(struct kvm_vm *vm, int nr_vcpus)
run_iteration(vm, nr_vcpus, "Mark memory idle (page_idle)");
}
static void run_test(enum vm_guest_mode mode, void *arg)
{
struct test_params *params = arg;
struct kvm_vm *vm;
int nr_vcpus = params->nr_vcpus;
vm = memstress_create_vm(mode, nr_vcpus, params->vcpu_memory_bytes, 1,
- params->backing_src, !overlap_memory_access);
+ params->backing_src, !overlap_memory_access, 0);
/*
* If guest_page_size is larger than the host's page size, the
* guest (memstress) will only fault in a subset of the host's pages.
*/
test_pages = params->nr_vcpus * params->vcpu_memory_bytes /
max(memstress_args.guest_page_size,
(u64)getpagesize());
memstress_start_vcpu_threads(nr_vcpus, vcpu_thread_main);
diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index 302c4923d093..92246e27ef74 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -155,21 +155,21 @@ static void run_test(enum vm_guest_mode mode, void *arg)
struct test_params *p = arg;
struct uffd_desc **uffd_descs = NULL;
u64 uffd_region_size;
struct timespec start;
struct timespec ts_diff;
double vcpu_paging_rate;
struct kvm_vm *vm;
int i, num_uffds = 0;
vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
- p->src_type, p->partition_vcpu_memory_access);
+ p->src_type, p->partition_vcpu_memory_access, 0);
demand_paging_size = get_backing_src_pagesz(p->src_type);
guest_data_prototype = malloc(demand_paging_size);
TEST_ASSERT(guest_data_prototype,
"Failed to allocate buffer for guest data pattern");
memset(guest_data_prototype, 0xAB, demand_paging_size);
if (p->uffd_mode == UFFDIO_REGISTER_MODE_MINOR) {
num_uffds = p->single_uffd ? 1 : nr_vcpus;
diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index ef779fa91827..69b38791440e 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -112,21 +112,21 @@ static void run_test(enum vm_guest_mode mode, void *arg)
struct timespec start;
struct timespec ts_diff;
struct timespec get_dirty_log_total = (struct timespec){0};
struct timespec vcpu_dirty_total = (struct timespec){0};
struct timespec avg;
struct timespec clear_dirty_log_total = (struct timespec){0};
int i;
vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size,
p->slots, p->backing_src,
- p->partition_vcpu_memory_access);
+ p->partition_vcpu_memory_access, 0);
memstress_set_write_percent(vm, p->write_percent);
guest_num_pages = (nr_vcpus * guest_percpu_mem_size) >> vm->page_shift;
guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages);
host_num_pages = vm_num_host_pages(mode, guest_num_pages);
pages_per_slot = host_num_pages / p->slots;
bitmaps = memstress_alloc_bitmaps(p->slots, pages_per_slot);
diff --git a/tools/testing/selftests/kvm/lib/memstress.c b/tools/testing/selftests/kvm/lib/memstress.c
index 6dcd15910a06..c9f85533ffff 100644
--- a/tools/testing/selftests/kvm/lib/memstress.c
+++ b/tools/testing/selftests/kvm/lib/memstress.c
@@ -114,24 +114,49 @@ void memstress_setup_vcpus(struct kvm_vm *vm, int nr_vcpus,
}
vcpu_args_set(vcpus[i], 1, i);
pr_debug("Added VCPU %d with test mem gpa [%lx, %lx)\n",
i, vcpu_args->gpa, vcpu_args->gpa +
(vcpu_args->pages * args->guest_page_size));
}
}
+static struct kvm_vm *memstress_vm_create_with_vcpus(struct vm_shape shape,
+ u32 nr_vcpus,
+ u64 extra_mem_pages,
+ void *guest_code,
+ struct kvm_vcpu *vcpus[],
+ u32 dirty_ring_size)
+{
+ struct kvm_vm *vm;
+ int i;
+
+ TEST_ASSERT(!nr_vcpus || vcpus, "Must provide vCPU array");
+
+ vm = __vm_create(shape, nr_vcpus, extra_mem_pages);
+
+ if (dirty_ring_size)
+ vm_enable_dirty_ring(vm, dirty_ring_size);
+
+ for (i = 0; i < nr_vcpus; ++i)
+ vcpus[i] = vm_vcpu_add(vm, i, guest_code);
+
+ kvm_arch_vm_finalize_vcpus(vm);
+ return vm;
+}
+
struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus,
u64 vcpu_memory_bytes, int slots,
enum vm_mem_backing_src_type backing_src,
- bool partition_vcpu_memory_access)
+ bool partition_vcpu_memory_access,
+ u32 dirty_ring_size)
{
struct memstress_args *args = &memstress_args;
struct kvm_vm *vm;
u64 guest_num_pages, slot0_pages = 0;
u64 backing_src_pagesz = get_backing_src_pagesz(backing_src);
u64 region_end_gfn;
int i;
pr_info("Testing guest mode: %s\n", vm_guest_mode_string(mode));
@@ -160,23 +185,24 @@ struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus,
* in-memory data structures.
*/
if (args->nested)
slot0_pages += memstress_nested_pages(nr_vcpus);
/*
* Pass guest_num_pages to populate the page tables for test memory.
* The memory is also added to memslot 0, but that's a benign side
* effect as KVM allows aliasing HVAs in meslots.
*/
- vm = __vm_create_with_vcpus(VM_SHAPE(mode), nr_vcpus,
- slot0_pages + guest_num_pages,
- memstress_guest_code, vcpus);
+ vm = memstress_vm_create_with_vcpus(VM_SHAPE(mode), nr_vcpus,
+ slot0_pages + guest_num_pages,
+ memstress_guest_code, vcpus,
+ dirty_ring_size);
args->vm = vm;
/* Put the test region at the top guest physical memory. */
region_end_gfn = vm->max_gfn + 1;
#ifdef __x86_64__
/*
* When running vCPUs in L2, restrict the test region to 48 bits to
* avoid needing 5-level page tables to identity map L2.
diff --git a/tools/testing/selftests/kvm/memslot_modification_stress_test.c b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
index 9c7578a098c3..3a1be9b75e01 100644
--- a/tools/testing/selftests/kvm/memslot_modification_stress_test.c
+++ b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
@@ -83,21 +83,21 @@ struct test_params {
bool disable_slot_zap_quirk;
};
static void run_test(enum vm_guest_mode mode, void *arg)
{
struct test_params *p = arg;
struct kvm_vm *vm;
vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
VM_MEM_SRC_ANONYMOUS,
- p->partition_vcpu_memory_access);
+ p->partition_vcpu_memory_access, 0);
#ifdef __x86_64__
if (p->disable_slot_zap_quirk)
vm_enable_cap(vm, KVM_CAP_DISABLE_QUIRKS2, KVM_X86_QUIRK_SLOT_ZAP_ALL);
pr_info("Memslot zap quirk %s\n", p->disable_slot_zap_quirk ?
"disabled" : "enabled");
#endif
pr_info("Finished creating vCPUs\n");
diff --git a/tools/testing/selftests/kvm/x86/dirty_log_page_splitting_test.c b/tools/testing/selftests/kvm/x86/dirty_log_page_splitting_test.c
index 388ba4101f97..661e9abfb439 100644
--- a/tools/testing/selftests/kvm/x86/dirty_log_page_splitting_test.c
+++ b/tools/testing/selftests/kvm/x86/dirty_log_page_splitting_test.c
@@ -94,21 +94,21 @@ static void run_test(enum vm_guest_mode mode, void *unused)
u64 pages_per_slot;
int i;
struct kvm_page_stats stats_populated;
struct kvm_page_stats stats_dirty_logging_enabled;
struct kvm_page_stats stats_dirty_pass[ITERATIONS];
struct kvm_page_stats stats_clear_pass[ITERATIONS];
struct kvm_page_stats stats_dirty_logging_disabled;
struct kvm_page_stats stats_repopulated;
vm = memstress_create_vm(mode, VCPUS, guest_percpu_mem_size,
- SLOTS, backing_src, false);
+ SLOTS, backing_src, false, 0);
guest_num_pages = (VCPUS * guest_percpu_mem_size) >> vm->page_shift;
guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages);
host_num_pages = vm_num_host_pages(mode, guest_num_pages);
pages_per_slot = host_num_pages / SLOTS;
TEST_ASSERT_EQ(host_num_pages, pages_per_slot * SLOTS);
TEST_ASSERT(!(host_num_pages % 512),
"Number of pages, '%lu' not a multiple of 2MiB", host_num_pages);
bitmaps = memstress_alloc_bitmaps(SLOTS, pages_per_slot);
--
2.54.0
* [RFC PATCH 2/3] KVM: selftests: Check dirty-ring size before enabling
2026-06-24 17:16 [RFC PATCH 0/3] KVM: selftests: Improvements on dirty-ring Leonardo Bras
2026-06-24 17:16 ` [RFC PATCH 1/3] KVM: selftests: memstress: Add option to enable dirty-ring on VM creation Leonardo Bras
@ 2026-06-24 17:16 ` Leonardo Bras
2026-06-24 17:28 ` sashiko-bot
2026-06-24 17:16 ` [RFC PATCH 3/3] KVM: selftests: dirty_log_perf_test: Add dirty-ring support Leonardo Bras
2 siblings, 1 reply; 6+ messages in thread
From: Leonardo Bras @ 2026-06-24 17:16 UTC
To: Paolo Bonzini, Shuah Khan, Sean Christopherson, David Matlack,
Leonardo Bras, Ackerley Tng, Marc Zyngier, Oliver Upton, Wu Fei,
Claudio Imbrenda, Steffen Eiden
Cc: kvm, linux-kselftest, linux-kernel
Currently, trying to enable dirty-ring with a size larger than the
maximum fails with an opaque "argument list too long" (E2BIG) error.
Change vm_enable_dirty_ring() to query the maximum size and compare it to
the requested size before enabling. If the value is invalid, print a more
precise error message.
Signed-off-by: Leonardo Bras <leo.bras@arm.com>
---
tools/testing/selftests/kvm/lib/kvm_util.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 195f3fdae1e3..237f0ce0ae60 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -160,24 +160,34 @@ unsigned int kvm_check_cap(long cap)
ret = __kvm_ioctl(kvm_fd, KVM_CHECK_EXTENSION, (void *)cap);
TEST_ASSERT(ret >= 0, KVM_IOCTL_ERROR(KVM_CHECK_EXTENSION, ret));
kvm_free_fd(kvm_fd);
return (unsigned int)ret;
}
void vm_enable_dirty_ring(struct kvm_vm *vm, u32 ring_size)
{
- if (vm_check_cap(vm, KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
- vm_enable_cap(vm, KVM_CAP_DIRTY_LOG_RING_ACQ_REL, ring_size);
- else
- vm_enable_cap(vm, KVM_CAP_DIRTY_LOG_RING, ring_size);
+ long cap = KVM_CAP_DIRTY_LOG_RING_ACQ_REL;
+ int max_size = vm_check_cap(vm, cap);
+
+ if (!max_size) {
+ cap = KVM_CAP_DIRTY_LOG_RING;
+ max_size = vm_check_cap(vm, cap);
+ }
+
+ TEST_ASSERT(ring_size <= max_size && is_power_of_2(ring_size),
+ "Invalid dirty-ring size: Should be a power of two "
+ "smaller than %d entries\n",
+ max_size / sizeof(struct kvm_dirty_gfn));
+
+ vm_enable_cap(vm, cap, ring_size);
vm->dirty_ring_size = ring_size;
}
static void vm_open(struct kvm_vm *vm)
{
vm->kvm_fd = _open_kvm_dev_path_or_exit(O_RDWR);
TEST_REQUIRE(kvm_has_cap(KVM_CAP_IMMEDIATE_EXIT));
vm->fd = __kvm_ioctl(vm->kvm_fd, KVM_CREATE_VM, (void *)vm->type);
--
2.54.0
* [RFC PATCH 3/3] KVM: selftests: dirty_log_perf_test: Add dirty-ring support
2026-06-24 17:16 [RFC PATCH 0/3] KVM: selftests: Improvements on dirty-ring Leonardo Bras
2026-06-24 17:16 ` [RFC PATCH 1/3] KVM: selftests: memstress: Add option to enable dirty-ring on VM creation Leonardo Bras
2026-06-24 17:16 ` [RFC PATCH 2/3] KVM: selftests: Check dirty-ring size before enabling Leonardo Bras
@ 2026-06-24 17:16 ` Leonardo Bras
2026-06-24 17:29 ` sashiko-bot
2 siblings, 1 reply; 6+ messages in thread
From: Leonardo Bras @ 2026-06-24 17:16 UTC
To: Paolo Bonzini, Shuah Khan, Sean Christopherson, David Matlack,
Leonardo Bras, Ackerley Tng, Marc Zyngier, Oliver Upton, Wu Fei,
Claudio Imbrenda, Steffen Eiden
Cc: kvm, linux-kselftest, linux-kernel
dirty_log_test supports both dirty-bitmap and dirty-ring as dirty-page
tracking mechanisms, while dirty_log_perf_test only supports dirty-bitmap.
Add dirty-ring support to dirty_log_perf_test so it can be used to
compare performance across changes to the mechanism.
Signed-off-by: Leonardo Bras <leo.bras@arm.com>
---
.../selftests/kvm/dirty_log_perf_test.c | 100 ++++++++++++++++--
1 file changed, 90 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 69b38791440e..659efa679bc7 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -6,63 +6,115 @@
*
* Copyright (C) 2018, Red Hat, Inc.
* Copyright (C) 2020, Google, Inc.
*/
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>
#include <linux/bitmap.h>
+#include <asm/barrier.h>
#include "kvm_util.h"
#include "test_util.h"
#include "memstress.h"
#include "guest_modes.h"
#include "ucall_common.h"
/* How many host loops to run by default (one KVM_GET_DIRTY_LOG for each loop)*/
#define TEST_HOST_LOOP_N 2UL
static int nr_vcpus = 1;
static u64 guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
static bool run_vcpus_while_disabling_dirty_logging;
/* Host variables */
static u64 dirty_log_manual_caps;
+static u32 dirty_ring_size;
static bool host_quit;
static int iteration;
static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];
+static struct timespec vcpu_dirty_ring_collect[KVM_MAX_VCPUS];
+
+static void dirty_ring_collect(struct kvm_vcpu *vcpu, u32 *ring_idx,
+ struct timespec *ts)
+{
+ struct timespec start;
+ struct kvm_dirty_gfn *dirty_gfns = vcpu_map_dirty_ring(vcpu);
+ u32 ret, idx = *ring_idx;
+ u32 ring_size = vcpu->vm->dirty_ring_size;
+
+ clock_gettime(CLOCK_MONOTONIC, &start);
+
+ while (true) {
+ struct kvm_dirty_gfn *cur;
+
+ cur = &dirty_gfns[idx % ring_size];
+ if (smp_load_acquire(&cur->flags) != KVM_DIRTY_GFN_F_DIRTY)
+ break;
+
+ smp_store_release(&cur->flags, KVM_DIRTY_GFN_F_RESET);
+ idx++;
+ }
+
+ idx -= *ring_idx;
+ *ring_idx += idx;
+
+ ret = kvm_vm_reset_dirty_ring(vcpu->vm);
+
+ TEST_ASSERT(ret == idx, "Reset dirty pages (%u) mismatch "
+ "with collected (%u)", ret, idx);
+
+ *ts = timespec_add(*ts, timespec_elapsed(start));
+}
static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
{
struct kvm_vcpu *vcpu = vcpu_args->vcpu;
int vcpu_idx = vcpu_args->vcpu_idx;
u64 pages_count = 0;
struct kvm_run *run;
struct timespec start;
struct timespec ts_diff;
struct timespec total = (struct timespec){0};
struct timespec avg;
+ bool use_dirty_ring = !!vcpu->vm->dirty_ring_size;
+ u32 ring_idx = 0;
int ret;
run = vcpu->run;
while (!READ_ONCE(host_quit)) {
int current_iteration = READ_ONCE(iteration);
+ struct timespec collect = (struct timespec){0};
clock_gettime(CLOCK_MONOTONIC, &start);
- ret = _vcpu_run(vcpu);
+
+ do {
+ ret = _vcpu_run(vcpu);
+ if (!use_dirty_ring)
+ break;
+
+ dirty_ring_collect(vcpu, &ring_idx, &collect);
+ } while (ret == KVM_EXIT_DIRTY_RING_FULL);
+
ts_diff = timespec_elapsed(start);
+ if (use_dirty_ring) {
+ ts_diff = timespec_sub(ts_diff, collect);
+ vcpu_dirty_ring_collect[vcpu_idx] = collect;
+ }
+
TEST_ASSERT(ret == 0, "vcpu_run failed: %d", ret);
- TEST_ASSERT(get_ucall(vcpu, NULL) == UCALL_SYNC,
+ TEST_ASSERT(get_ucall(vcpu, NULL) == UCALL_SYNC ||
+ (use_dirty_ring && run->exit_reason == KVM_EXIT_DIRTY_RING_FULL),
"Invalid guest sync status: exit_reason=%s",
exit_reason_str(run->exit_reason));
pr_debug("Got sync event from vCPU %d\n", vcpu_idx);
vcpu_last_completed_iteration[vcpu_idx] = current_iteration;
pr_debug("vCPU %d updated last completed iteration to %d\n",
vcpu_idx, vcpu_last_completed_iteration[vcpu_idx]);
if (current_iteration) {
pages_count += vcpu_args->pages;
@@ -112,42 +164,45 @@ static void run_test(enum vm_guest_mode mode, void *arg)
struct timespec start;
struct timespec ts_diff;
struct timespec get_dirty_log_total = (struct timespec){0};
struct timespec vcpu_dirty_total = (struct timespec){0};
struct timespec avg;
struct timespec clear_dirty_log_total = (struct timespec){0};
int i;
vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size,
p->slots, p->backing_src,
- p->partition_vcpu_memory_access, 0);
+ p->partition_vcpu_memory_access,
+ dirty_ring_size);
memstress_set_write_percent(vm, p->write_percent);
guest_num_pages = (nr_vcpus * guest_percpu_mem_size) >> vm->page_shift;
guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages);
host_num_pages = vm_num_host_pages(mode, guest_num_pages);
pages_per_slot = host_num_pages / p->slots;
bitmaps = memstress_alloc_bitmaps(p->slots, pages_per_slot);
if (dirty_log_manual_caps)
vm_enable_cap(vm, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2,
dirty_log_manual_caps);
/* Start the iterations */
iteration = 0;
host_quit = false;
clock_gettime(CLOCK_MONOTONIC, &start);
- for (i = 0; i < nr_vcpus; i++)
+ for (i = 0; i < nr_vcpus; i++) {
vcpu_last_completed_iteration[i] = -1;
+ vcpu_dirty_ring_collect[i] = (struct timespec){0};
+ }
/*
* Use 100% writes during the population phase to ensure all
* memory is actually populated and not just mapped to the zero
* page. The prevents expensive copy-on-write faults from
* occurring during the dirty memory iterations below, which
* would pollute the performance results.
*/
memstress_set_write_percent(vm, 100);
memstress_set_random_access(vm, false);
@@ -188,20 +243,35 @@ static void run_test(enum vm_guest_mode mode, void *arg)
while (READ_ONCE(vcpu_last_completed_iteration[i])
!= iteration)
;
}
ts_diff = timespec_elapsed(start);
vcpu_dirty_total = timespec_add(vcpu_dirty_total, ts_diff);
pr_info("Iteration %d dirty memory time: %ld.%.9lds\n",
iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
+ if (dirty_ring_size) {
+ struct timespec iteration_sum = (struct timespec){0};
+
+ for (i = 0; i < nr_vcpus; i++)
+ iteration_sum = timespec_add(iteration_sum,
+ vcpu_dirty_ring_collect[i]);
+
+ pr_info("Iteration %d clear dirty ring time: %ld.%.9lds\n",
+ iteration, iteration_sum.tv_sec, iteration_sum.tv_nsec);
+
+ clear_dirty_log_total = timespec_add(clear_dirty_log_total,
+ iteration_sum);
+ continue;
+ }
+
clock_gettime(CLOCK_MONOTONIC, &start);
memstress_get_dirty_log(vm, bitmaps, p->slots);
ts_diff = timespec_elapsed(start);
get_dirty_log_total = timespec_add(get_dirty_log_total,
ts_diff);
pr_info("Iteration %d get dirty log time: %ld.%.9lds\n",
iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
if (dirty_log_manual_caps) {
clock_gettime(CLOCK_MONOTONIC, &start);
@@ -231,46 +301,51 @@ static void run_test(enum vm_guest_mode mode, void *arg)
ts_diff.tv_sec, ts_diff.tv_nsec);
/*
* Tell the vCPU threads to quit. No need to manually check that vCPUs
* have stopped running after disabling dirty logging, the join will
* wait for them to exit.
*/
host_quit = true;
memstress_join_vcpu_threads(nr_vcpus);
- avg = timespec_div(get_dirty_log_total, p->iterations);
- pr_info("Get dirty log over %lu iterations took %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n",
- p->iterations, get_dirty_log_total.tv_sec,
- get_dirty_log_total.tv_nsec, avg.tv_sec, avg.tv_nsec);
+ if (!dirty_ring_size) {
+ avg = timespec_div(get_dirty_log_total, p->iterations);
+ pr_info("Get dirty log over %lu iterations took %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n",
+ p->iterations, get_dirty_log_total.tv_sec,
+ get_dirty_log_total.tv_nsec, avg.tv_sec, avg.tv_nsec);
+ }
- if (dirty_log_manual_caps) {
+ if (dirty_log_manual_caps || dirty_ring_size) {
avg = timespec_div(clear_dirty_log_total, p->iterations);
pr_info("Clear dirty log over %lu iterations took %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n",
p->iterations, clear_dirty_log_total.tv_sec,
clear_dirty_log_total.tv_nsec, avg.tv_sec, avg.tv_nsec);
}
memstress_free_bitmaps(bitmaps, p->slots);
memstress_destroy_vm(vm);
}
static void help(char *name)
{
puts("");
printf("usage: %s [-h] [-a] [-i iterations] [-p offset] [-g] "
"[-m mode] [-n] [-b vcpu bytes] [-v vcpus] [-o] [-r random seed ] [-s mem type]"
"[-x memslots] [-w percentage] [-c physical cpus to run test on]\n", name);
puts("");
printf(" -a: access memory randomly rather than in order.\n");
printf(" -i: specify iteration counts (default: %"PRIu64")\n",
TEST_HOST_LOOP_N);
+ printf(" -d: specify the size of dirty-ring for tracking dirty pages.\n"
+ " If non-zero, will cause dirty-ring to be used instead of\n"
+ " dirty-bitmap. Must be a power of two.\n");
printf(" -g: Do not enable KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2. This\n"
" makes KVM_GET_DIRTY_LOG clear the dirty log (i.e.\n"
" KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE is not enabled)\n"
" and writes will be tracked as soon as dirty logging is\n"
" enabled on the memslot (i.e. KVM_DIRTY_LOG_INITIALLY_SET\n"
" is not enabled).\n");
printf(" -p: specify guest physical test memory offset\n"
" Warning: a low offset can conflict with the loaded test code.\n");
guest_modes_help();
printf(" -n: Run the vCPUs in nested mode (L2)\n");
@@ -313,31 +388,36 @@ int main(int argc, char *argv[])
/* Override the seed to be deterministic by default. */
guest_random_seed = 1;
dirty_log_manual_caps =
kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
KVM_DIRTY_LOG_INITIALLY_SET);
guest_modes_append_default();
- while ((opt = getopt(argc, argv, "ab:c:eghi:m:nop:r:s:v:x:w:")) != -1) {
+ while ((opt = getopt(argc, argv, "ab:c:d:eghi:m:nop:r:s:v:x:w:")) != -1) {
switch (opt) {
case 'a':
p.random_access = true;
break;
case 'b':
guest_percpu_mem_size = parse_size(optarg);
break;
case 'c':
pcpu_list = optarg;
break;
+ case 'd':
+ dirty_ring_size = parse_size(optarg);
+ dirty_ring_size *= sizeof(struct kvm_dirty_gfn);
+ dirty_log_manual_caps = 0;
+ break;
case 'e':
/* 'e' is for evil. */
run_vcpus_while_disabling_dirty_logging = true;
break;
case 'g':
dirty_log_manual_caps = 0;
break;
case 'h':
help(argv[0]);
break;
--
2.54.0
* Re: [RFC PATCH 2/3] KVM: selftests: Check dirty-ring size before enabling
2026-06-24 17:16 ` [RFC PATCH 2/3] KVM: selftests: Check dirty-ring size before enabling Leonardo Bras
@ 2026-06-24 17:28 ` sashiko-bot
0 siblings, 0 replies; 6+ messages in thread
From: sashiko-bot @ 2026-06-24 17:28 UTC
To: Leonardo Bras; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Low] Format string type mismatch causing undefined behavior on 64-bit big-endian architectures
- [Low] Misleading error message contradicts inclusive boundary check
- [Low] Misleading assertion failure when dirty ring capabilities are unsupported
--
commit 8555e295b2631adeaf18b3b867988e6adc58eb4c
Author: Leonardo Bras <leo.bras@arm.com>
KVM: selftests: Check dirty-ring size before enabling
This commit changes vm_enable_dirty_ring() to dynamically fetch the maximum
dirty-ring size via KVM_CAP_DIRTY_LOG_RING_ACQ_REL or KVM_CAP_DIRTY_LOG_RING.
It then validates that the requested ring size is less than or equal to this
maximum and is a power of two, printing a specific error message if the check
fails.
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
[ ... ]
> void vm_enable_dirty_ring(struct kvm_vm *vm, u32 ring_size)
> {
> - if (vm_check_cap(vm, KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
> - vm_enable_cap(vm, KVM_CAP_DIRTY_LOG_RING_ACQ_REL, ring_size);
> - else
> - vm_enable_cap(vm, KVM_CAP_DIRTY_LOG_RING, ring_size);
> + long cap = KVM_CAP_DIRTY_LOG_RING_ACQ_REL;
> + int max_size = vm_check_cap(vm, cap);
> +
> + if (!max_size) {
> + cap = KVM_CAP_DIRTY_LOG_RING;
> + max_size = vm_check_cap(vm, cap);
> + }
> +
[Severity: Low]
If the host KVM subsystem does not support these capabilities, max_size
will evaluate to 0.
Since the assertion below will then fail for any positive ring_size, does
this create a misleading error message about "smaller than 0 entries"
rather than indicating the feature is unsupported?
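One way to keep the two failure modes apart is to check for support first and validate the size second. The sketch below is a standalone model, not selftest code: check_ring_request() and its enum are hypothetical names, and max_bytes merely stands in for the value that vm_check_cap() returns in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes; not part of the KVM selftest library. */
enum ring_check {
	RING_OK,
	RING_UNSUPPORTED,	/* capability query returned 0 */
	RING_BAD_SIZE,		/* not a power of two, or above the maximum */
};

static bool is_pow2(uint32_t v)
{
	return v && !(v & (v - 1));
}

/*
 * max_bytes models the capability query result: 0 means "not supported",
 * any other value is the largest ring size, in bytes, the host accepts.
 */
static enum ring_check check_ring_request(uint32_t ring_bytes, int max_bytes)
{
	if (max_bytes <= 0)
		return RING_UNSUPPORTED;
	if (!is_pow2(ring_bytes) || ring_bytes > (uint32_t)max_bytes)
		return RING_BAD_SIZE;
	return RING_OK;
}
```

With the cases split like this, the unsupported case could be reported as a skip (for example with TEST_REQUIRE, as vm_open() already does for another capability) while only RING_BAD_SIZE produces the "invalid size" message.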
> + TEST_ASSERT(ring_size <= max_size && is_power_of_2(ring_size),
> + "Invalid dirty-ring size: Should be a power of two "
> + "smaller than %d entries\n",
[Severity: Low]
The condition checks if ring_size is less than or equal to max_size, but the
error message text specifies it must be "smaller than".
Does this mismatch make it confusing to debug failures when the requested size
is exactly equal to the maximum size?
> + max_size / sizeof(struct kvm_dirty_gfn));
[Severity: Low]
The expression max_size / sizeof(struct kvm_dirty_gfn) evaluates to a size_t,
but the format string uses %d.
Can this cause format warnings, and result in incorrectly printing 0 on
64-bit big-endian architectures since the format parsing will read the most
significant 32 bits?
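For illustration, both the wording and the format specifier could be addressed along these lines. This is a standalone sketch under stated assumptions: struct dirty_gfn_model is a stand-in assumed to have the same 16-byte layout as struct kvm_dirty_gfn, and ring_max_entries() and format_ring_error() are hypothetical helpers, not existing selftest functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in assumed to match the layout of struct kvm_dirty_gfn. */
struct dirty_gfn_model {
	uint32_t flags;
	uint32_t slot;
	uint64_t offset;
};

/* Convert the maximum ring size in bytes into a number of entries. */
static size_t ring_max_entries(int max_bytes)
{
	return (size_t)max_bytes / sizeof(struct dirty_gfn_model);
}

/* Write the diagnostic into buf; returns whatever snprintf() returns. */
static int format_ring_error(char *buf, size_t len, int max_bytes)
{
	/* %zu matches size_t; "at most" matches the inclusive <= check. */
	return snprintf(buf, len,
			"Invalid dirty-ring size: should be a power of two, "
			"at most %zu entries", ring_max_entries(max_bytes));
}
```

Casting the quotient to int and keeping %d would also silence the mismatch; %zu is used here only because it avoids the cast.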
> +
> + vm_enable_cap(vm, cap, ring_size);
> vm->dirty_ring_size = ring_size;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260624171656.1737580-1-leo.bras@arm.com?part=2
* Re: [RFC PATCH 3/3] KVM: selftests: dirty_log_perf_test: Add dirty-ring support
2026-06-24 17:16 ` [RFC PATCH 3/3] KVM: selftests: dirty_log_perf_test: Add dirty-ring support Leonardo Bras
@ 2026-06-24 17:29 ` sashiko-bot
0 siblings, 0 replies; 6+ messages in thread
From: sashiko-bot @ 2026-06-24 17:29 UTC
To: Leonardo Bras; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Medium] dirty_ring_collect() computes array indices using the dirty ring size in bytes rather than the number of elements, causing a severe out-of-bounds memory access.
- [Medium] The do-while loop in vcpu_worker() terminates prematurely because it incorrectly checks the _vcpu_run() return value instead of the KVM exit reason, artificially shortening the workload.
- [Medium] Concurrent calls to kvm_vm_reset_dirty_ring() by multiple vCPU threads cause a TEST_ASSERT failure because the VM-wide reset count is compared against a thread-local harvested count.
- [Medium] The elapsed time calculation for clearing the dirty ring incorrectly aggregates parallel CPU time instead of serial wall-clock time, causing an unfair comparison with the dirty-bitmap mechanism.
--
commit 70a79bc8da1cc1134a0b9d174c9a32912039dc85
Author: Leonardo Bras <leo.bras@arm.com>
Subject: KVM: selftests: dirty_log_perf_test: Add dirty-ring support
dirty_log_test supports both dirty-bitmap and dirty-ring as dirty-page
tracking mechanisms, while dirty_log_perf_test only supports dirty-bitmap.
Add dirty-ring support to dirty_log_perf_test so it can be used to
compare performance across changes to the mechanism.
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 69b38791440e7..659efa679bc74 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
[ ... ]
> @@ -29,9 +30,43 @@ static bool run_vcpus_while_disabling_dirty_logging;
>
> /* Host variables */
> static u64 dirty_log_manual_caps;
> +static u32 dirty_ring_size;
> static bool host_quit;
> static int iteration;
> static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];
> +static struct timespec vcpu_dirty_ring_collect[KVM_MAX_VCPUS];
> +
> +static void dirty_ring_collect(struct kvm_vcpu *vcpu, u32 *ring_idx,
> + struct timespec *ts)
> +{
> + struct timespec start;
> + struct kvm_dirty_gfn *dirty_gfns = vcpu_map_dirty_ring(vcpu);
> + u32 ret, idx = *ring_idx;
> + u32 ring_size = vcpu->vm->dirty_ring_size;
> +
> + clock_gettime(CLOCK_MONOTONIC, &start);
> +
> + while (true) {
> + struct kvm_dirty_gfn *cur;
> +
> + cur = &dirty_gfns[idx % ring_size];
[Severity: Medium]
Since the ring size is configured in bytes (e.g., in main() via
dirty_ring_size *= sizeof(struct kvm_dirty_gfn)), can idx % ring_size exceed
the number of struct kvm_dirty_gfn entries in the mapping and cause
out-of-bounds memory accesses when indexing dirty_gfns[]?
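One possible way to avoid the mismatch is to convert the byte size into an
entry count before wrapping the index. The sketch below is only an
illustration: struct dirty_gfn_example is a stand-in for struct kvm_dirty_gfn
so it builds without kernel headers, and ring_entries() / ring_slot() are
hypothetical helpers, not part of the selftest library.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct kvm_dirty_gfn (assumed layout, for illustration only). */
struct dirty_gfn_example {
	uint32_t flags;
	uint32_t slot;
	uint64_t offset;
};

/* Hypothetical helper: number of entries in a ring whose size is in bytes. */
static uint32_t ring_entries(uint32_t ring_size_bytes)
{
	return ring_size_bytes / sizeof(struct dirty_gfn_example);
}

/* Hypothetical helper: wrap a free-running index into the ring's entries. */
static uint32_t ring_slot(uint32_t idx, uint32_t ring_size_bytes)
{
	return idx % ring_entries(ring_size_bytes);
}
```

With this shape, dirty_gfns[ring_slot(idx, ring_size)] stays inside the
mapping even though ring_size itself is expressed in bytes.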
> + if (smp_load_acquire(&cur->flags) != KVM_DIRTY_GFN_F_DIRTY)
> + break;
> +
> + smp_store_release(&cur->flags, KVM_DIRTY_GFN_F_RESET);
> + idx++;
> + }
> +
> + idx -= *ring_idx;
> + *ring_idx += idx;
> +
> + ret = kvm_vm_reset_dirty_ring(vcpu->vm);
> +
> + TEST_ASSERT(ret == idx, "Reset dirty pages (%u) mismatch "
> + "with collected (%u)", ret, idx);
[Severity: Medium]
Will this assertion falsely fail under concurrency?
The kvm_vm_reset_dirty_ring() call performs a VM-wide reset of the dirty
rings for all vCPUs.
In a multi-vCPU test, is it possible for one thread's ioctl call to reap
reset entries prepared by other threads, resulting in a return value greater
than its local idx count?
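A toy model of the concern, assuming the reset is accounted for VM-wide
rather than per caller. struct vm_model, mark_collected() and reset_all() are
hypothetical names used only for this illustration and do not model real KVM
locking or ordering.

```c
#include <stdint.h>

/* Toy model: one VM-wide count of entries waiting to be reset. */
struct vm_model {
	uint32_t pending_reset;
};

/* A vCPU thread marks n harvested entries as ready for reset. */
static void mark_collected(struct vm_model *vm, uint32_t n)
{
	vm->pending_reset += n;
}

/* VM-wide reset: returns everything pending, regardless of who marked it. */
static uint32_t reset_all(struct vm_model *vm)
{
	uint32_t n = vm->pending_reset;

	vm->pending_reset = 0;
	return n;
}
```

If thread A marks 3 entries and thread B marks 5 before either resets, the
first reset_all() returns 8 and the second returns 0, so neither matches the
caller's local count. Under that assumption, a check against the local idx
would only be safe if the resets were serialized or compared against a
VM-wide total.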
> +
> + *ts = timespec_add(*ts, timespec_elapsed(start));
> +}
>
> static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
> {
[ ... ]
> @@ -43,19 +78,36 @@ static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
> struct timespec ts_diff;
> struct timespec total = (struct timespec){0};
> struct timespec avg;
> + bool use_dirty_ring = !!vcpu->vm->dirty_ring_size;
> + u32 ring_idx = 0;
> int ret;
>
> run = vcpu->run;
>
> while (!READ_ONCE(host_quit)) {
> int current_iteration = READ_ONCE(iteration);
> + struct timespec collect = (struct timespec){0};
>
> clock_gettime(CLOCK_MONOTONIC, &start);
> - ret = _vcpu_run(vcpu);
> +
> + do {
> + ret = _vcpu_run(vcpu);
> + if (!use_dirty_ring)
> + break;
> +
> + dirty_ring_collect(vcpu, &ring_idx, &collect);
> + } while (ret == KVM_EXIT_DIRTY_RING_FULL);
[Severity: Medium]
Does this loop terminate prematurely because it compares the return value of
_vcpu_run() instead of the KVM exit reason?
_vcpu_run() returns the ioctl status (0 on success), not the exit reason.
This means the condition evaluates to while (0 == KVM_EXIT_DIRTY_RING_FULL)
and will exit after the very first dirty ring fill.
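A minimal sketch of a loop keyed on the exit reason instead of the ioctl
status. Everything here is a stand-in: struct run_model, run_once() and the
EXAMPLE_* constants are hypothetical and only mimic the split between "the
call succeeded" and "why the guest stopped".

```c
#include <stdint.h>

/* Hypothetical exit reasons, for illustration only. */
enum { EXAMPLE_EXIT_IO = 2, EXAMPLE_EXIT_RING_FULL = 31 };

struct run_model {
	uint32_t exit_reason;
	int fills_left;		/* how many more times the ring fills up */
};

/* Returns 0 on success, like an ioctl; the reason is reported separately. */
static int run_once(struct run_model *run)
{
	if (run->fills_left > 0) {
		run->fills_left--;
		run->exit_reason = EXAMPLE_EXIT_RING_FULL;
	} else {
		run->exit_reason = EXAMPLE_EXIT_IO;
	}
	return 0;
}

/* Re-enter the guest for as long as it stops because the ring is full. */
static int run_until_done(struct run_model *run)
{
	int collections = 0;
	int ret;

	do {
		ret = run_once(run);
		if (ret)
			break;
		collections++;	/* stands in for dirty_ring_collect() */
	} while (run->exit_reason == EXAMPLE_EXIT_RING_FULL);

	return collections;
}
```

In the selftest this would presumably mean testing vcpu->run->exit_reason in
the while condition and keeping ret for the existing TEST_ASSERT.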
> +
> ts_diff = timespec_elapsed(start);
>
> + if (use_dirty_ring) {
> + ts_diff = timespec_sub(ts_diff, collect);
> + vcpu_dirty_ring_collect[vcpu_idx] = collect;
> + }
> +
> TEST_ASSERT(ret == 0, "vcpu_run failed: %d", ret);
[ ... ]
> @@ -195,6 +250,21 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> pr_info("Iteration %d dirty memory time: %ld.%.9lds\n",
> iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
>
> + if (dirty_ring_size) {
> + struct timespec iteration_sum = (struct timespec){0};
> +
> + for (i = 0; i < nr_vcpus; i++)
> + iteration_sum = timespec_add(iteration_sum,
> + vcpu_dirty_ring_collect[i]);
> +
> + pr_info("Iteration %d clear dirty ring time: %ld.%.9lds\n",
> + iteration, iteration_sum.tv_sec, iteration_sum.tv_nsec);
[Severity: Medium]
Does summing the per-vCPU collection times into iteration_sum artificially
inflate the reported dirty-ring clearing time?
The dirty-bitmap mechanism measures sequential wall-clock time in the main
thread. The vCPU threads collect in parallel, so this sum scales linearly
with the number of vCPUs and seems likely to produce misleading performance
metrics when comparing the two mechanisms.
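To illustrate the difference, the sketch below compares the sum of per-vCPU
times with their maximum. Taking the maximum is only one rough approximation
of wall-clock time for collections that fully overlap, and is an assumption
of this example rather than something the patch proposes. ts_sum() and
ts_max() are hypothetical helpers that expect normalized timespec values.

```c
#include <time.h>

#define EXAMPLE_NSEC_PER_SEC 1000000000L

/* Sum of all per-vCPU times: grows with the number of vCPUs. */
static struct timespec ts_sum(const struct timespec *v, int n)
{
	struct timespec sum = { 0, 0 };

	for (int i = 0; i < n; i++) {
		sum.tv_sec += v[i].tv_sec;
		sum.tv_nsec += v[i].tv_nsec;
		if (sum.tv_nsec >= EXAMPLE_NSEC_PER_SEC) {
			sum.tv_sec++;
			sum.tv_nsec -= EXAMPLE_NSEC_PER_SEC;
		}
	}
	return sum;
}

/* Longest per-vCPU time: closer to wall-clock if the threads overlap. */
static struct timespec ts_max(const struct timespec *v, int n)
{
	struct timespec max = { 0, 0 };

	for (int i = 0; i < n; i++) {
		if (v[i].tv_sec > max.tv_sec ||
		    (v[i].tv_sec == max.tv_sec && v[i].tv_nsec > max.tv_nsec))
			max = v[i];
	}
	return max;
}
```

For three vCPUs that spend 1.5s, 0.7s and 1.6s collecting, the sum is 3.8s
while the longest single collection is 1.6s, which shows how far the two
figures can drift apart as nr_vcpus grows.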
> +
> + clear_dirty_log_total = timespec_add(clear_dirty_log_total,
> + iteration_sum);
> + continue;
> + }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260624171656.1737580-1-leo.bras@arm.com?part=3
Thread overview: 6+ messages (newest: 2026-06-24 17:29 UTC)
2026-06-24 17:16 [RFC PATCH 0/3] KVM: selftests: Improvements on dirty-ring Leonardo Bras
2026-06-24 17:16 ` [RFC PATCH 1/3] KVM: selftests: memstress: Add option to enable dirty-ring on VM creation Leonardo Bras
2026-06-24 17:16 ` [RFC PATCH 2/3] KVM: selftests: Check dirty-ring size before enabling Leonardo Bras
2026-06-24 17:28 ` sashiko-bot
2026-06-24 17:16 ` [RFC PATCH 3/3] KVM: selftests: dirty_log_perf_test: Add dirty-ring support Leonardo Bras
2026-06-24 17:29 ` sashiko-bot