* [PATCH v2 0/4] KVM: Dynamic sized memslots array
@ 2024-09-04 22:35 Peter Xu
From: Peter Xu @ 2024-09-04 22:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova, peterx,
	Fabiano Rosas, Paolo Bonzini, Juraj Marcin, David Hildenbrand
v2:
- Reordered patches; made the fix patch cc:stable
- s/NUM/NR/ for the newly introduced macros [DavidH]
- Added kvm_slots_double() wrapper [DavidH]
v1: https://lore.kernel.org/r/20240904191635.3045606-1-peterx@redhat.com
This series makes KVM memslots allocated dynamically in QEMU.  It
fixes a migration performance regression that I observed, reducing each
precopy dirty sync pass from ~86ms to ~3ms.

Patch 1 is the fix to the problem; the remaining three patches are
cleanups.

Thanks,
Peter Xu (4):
  KVM: Dynamic sized kvm memslots array
  KVM: Define KVM_MEMSLOTS_NUM_MAX_DEFAULT
  KVM: Rename KVMMemoryListener.nr_used_slots to nr_slots_used
  KVM: Rename KVMState->nr_slots to nr_slots_max
 include/sysemu/kvm_int.h |   7 +--
 accel/kvm/kvm-all.c      | 111 +++++++++++++++++++++++++++++++--------
 accel/kvm/trace-events   |   1 +
 3 files changed, 93 insertions(+), 26 deletions(-)
-- 
2.45.0
* [PATCH v2 1/4] KVM: Dynamic sized kvm memslots array

From: Peter Xu @ 2024-09-04 22:35 UTC (permalink / raw)
To: qemu-devel
Cc: Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova, peterx,
	Fabiano Rosas, Paolo Bonzini, Juraj Marcin, David Hildenbrand,
	qemu-stable, Zhiyi Guo

Zhiyi reported an infinite loop issue in a VFIO use case.  The cause of
that was a separate discussion; however, while profiling it I found a
regression of dirty sync slowness.

Each KVMMemoryListener maintains an array of kvm memslots.  Currently
it's statically allocated to be the max supported by the kernel.
However after Linux commit 4fc096a99e ("KVM: Raise the maximum number
of user memslots"), the max supported memslots reported by the kernel
has grown so large that it is no longer wise to always statically
allocate with the max reported.

What's worse, QEMU kvm code still walks all the allocated memslot
entries to do any form of lookup.  That can drastically slow down all
memslot operations, because each such loop can run over 32K times on
new kernels.

Fix this issue by making the memslots allocated dynamically.

Here the initial size is set to 16 because it should cover basic VM
usages; the hope is that the majority of VM use cases may never need to
grow it at all (e.g. if one starts a VM with ./qemu-system-x86_64 by
default it'll consume 9 memslots), while not being so large as to waste
memory.

There may be even better ways to address this, but so far this is the
simplest, and it is already an improvement even compared to before the
kernel grew the max supported memslots.  For example, in the case of
the issue above where VFIO was attached on a 32GB system, only ~10
memslots were used.  So it should be good enough as of now.

In the above VFIO context, measurement shows that the precopy dirty
sync shrank from ~86ms to ~3ms with this patch applied.  It should also
apply to any KVM-enabled VM even without VFIO.

NOTE: we don't have a Fixes tag for this patch because there's no real
commit that regressed this in QEMU.  Such behavior existed for a long
time, but it only started to be a problem when the kernel reports a
very large max memslots value.  However that's pretty common now (the
kernel change was merged in 2021), so we attached cc:stable because
we'll want this change to be backported to stable branches.

Cc: qemu-stable <qemu-stable@nongnu.org>
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Tested-by: Zhiyi Guo <zhguo@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/sysemu/kvm_int.h |  1 +
 accel/kvm/kvm-all.c      | 93 +++++++++++++++++++++++++++++++++-------
 accel/kvm/trace-events   |  1 +
 3 files changed, 80 insertions(+), 15 deletions(-)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 1d8fb1473b..48e496b3d4 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -46,6 +46,7 @@ typedef struct KVMMemoryListener {
     MemoryListener listener;
     KVMSlot *slots;
     unsigned int nr_used_slots;
+    unsigned int nr_slots_allocated;
     int as_id;
     QSIMPLEQ_HEAD(, KVMMemoryUpdate) transaction_add;
     QSIMPLEQ_HEAD(, KVMMemoryUpdate) transaction_del;
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 75d11a07b2..f9368494a8 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -69,6 +69,9 @@
 #define KVM_GUESTDBG_BLOCKIRQ 0
 #endif
 
+/* Default num of memslots to be allocated when VM starts */
+#define KVM_MEMSLOTS_NR_ALLOC_DEFAULT 16
+
 struct KVMParkedVcpu {
     unsigned long vcpu_id;
     int kvm_fd;
@@ -165,6 +168,57 @@ void kvm_resample_fd_notify(int gsi)
     }
 }
 
+/**
+ * kvm_slots_grow(): Grow the slots[] array in the KVMMemoryListener
+ *
+ * @kml: The KVMMemoryListener* to grow the slots[] array
+ * @nr_slots_new: The new size of slots[] array
+ *
+ * Returns: True if the array grows larger, false otherwise.
+ */
+static bool kvm_slots_grow(KVMMemoryListener *kml, unsigned int nr_slots_new)
+{
+    unsigned int i, cur = kml->nr_slots_allocated;
+    KVMSlot *slots;
+
+    if (nr_slots_new > kvm_state->nr_slots) {
+        nr_slots_new = kvm_state->nr_slots;
+    }
+
+    if (cur >= nr_slots_new) {
+        /* Big enough, no need to grow, or we reached max */
+        return false;
+    }
+
+    if (cur == 0) {
+        slots = g_new0(KVMSlot, nr_slots_new);
+    } else {
+        assert(kml->slots);
+        slots = g_renew(KVMSlot, kml->slots, nr_slots_new);
+        /*
+         * g_renew() doesn't initialize extended buffers, however kvm
+         * memslots require fields to be zero-initialized. E.g. pointers,
+         * memory_size field, etc.
+         */
+        memset(&slots[cur], 0x0, sizeof(slots[0]) * (nr_slots_new - cur));
+    }
+
+    for (i = cur; i < nr_slots_new; i++) {
+        slots[i].slot = i;
+    }
+
+    kml->slots = slots;
+    kml->nr_slots_allocated = nr_slots_new;
+    trace_kvm_slots_grow(cur, nr_slots_new);
+
+    return true;
+}
+
+static bool kvm_slots_double(KVMMemoryListener *kml)
+{
+    return kvm_slots_grow(kml, kml->nr_slots_allocated * 2);
+}
+
 unsigned int kvm_get_max_memslots(void)
 {
     KVMState *s = KVM_STATE(current_accel());
@@ -193,15 +247,20 @@ unsigned int kvm_get_free_memslots(void)
 
 /* Called with KVMMemoryListener.slots_lock held */
 static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
 {
-    KVMState *s = kvm_state;
     int i;
 
-    for (i = 0; i < s->nr_slots; i++) {
+retry:
+    for (i = 0; i < kml->nr_slots_allocated; i++) {
         if (kml->slots[i].memory_size == 0) {
             return &kml->slots[i];
         }
     }
 
+    /* If no free slots, try to grow first by doubling */
+    if (kvm_slots_double(kml)) {
+        goto retry;
+    }
+
     return NULL;
 }
 
@@ -222,10 +281,9 @@ static KVMSlot *kvm_lookup_matching_slot(KVMMemoryListener *kml,
                                          hwaddr start_addr,
                                          hwaddr size)
 {
-    KVMState *s = kvm_state;
     int i;
 
-    for (i = 0; i < s->nr_slots; i++) {
+    for (i = 0; i < kml->nr_slots_allocated; i++) {
         KVMSlot *mem = &kml->slots[i];
 
         if (start_addr == mem->start_addr && size == mem->memory_size) {
@@ -267,7 +325,7 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void *ram,
     int i, ret = 0;
 
     kvm_slots_lock();
-    for (i = 0; i < s->nr_slots; i++) {
+    for (i = 0; i < kml->nr_slots_allocated; i++) {
         KVMSlot *mem = &kml->slots[i];
 
         if (ram >= mem->ram && ram < mem->ram + mem->memory_size) {
@@ -1071,7 +1129,7 @@ static int kvm_physical_log_clear(KVMMemoryListener *kml,
 
     kvm_slots_lock();
 
-    for (i = 0; i < s->nr_slots; i++) {
+    for (i = 0; i < kml->nr_slots_allocated; i++) {
         mem = &kml->slots[i];
         /* Discard slots that are empty or do not overlap the section */
         if (!mem->memory_size ||
@@ -1719,12 +1777,8 @@ static void kvm_log_sync_global(MemoryListener *l, bool last_stage)
     /* Flush all kernel dirty addresses into KVMSlot dirty bitmap */
     kvm_dirty_ring_flush();
 
-    /*
-     * TODO: make this faster when nr_slots is big while there are
-     * only a few used slots (small VMs).
-     */
     kvm_slots_lock();
-    for (i = 0; i < s->nr_slots; i++) {
+    for (i = 0; i < kml->nr_slots_allocated; i++) {
         mem = &kml->slots[i];
         if (mem->memory_size && mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
             kvm_slot_sync_dirty_pages(mem);
@@ -1839,12 +1893,9 @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
 {
     int i;
 
-    kml->slots = g_new0(KVMSlot, s->nr_slots);
     kml->as_id = as_id;
 
-    for (i = 0; i < s->nr_slots; i++) {
-        kml->slots[i].slot = i;
-    }
+    kvm_slots_grow(kml, KVM_MEMSLOTS_NR_ALLOC_DEFAULT);
 
     QSIMPLEQ_INIT(&kml->transaction_add);
     QSIMPLEQ_INIT(&kml->transaction_del);
@@ -2461,6 +2512,18 @@ static int kvm_init(MachineState *ms)
         s->nr_slots = 32;
     }
 
+    /*
+     * A VM will at least require a few memslots to work, or it can even
+     * fail to boot.  Make sure the supported value is always at least
+     * larger than what we will initially allocate.
+     */
+    if (s->nr_slots < KVM_MEMSLOTS_NR_ALLOC_DEFAULT) {
+        ret = -EINVAL;
+        fprintf(stderr, "KVM max supported number of slots (%d) too small\n",
+                s->nr_slots);
+        goto err;
+    }
+
     s->nr_as = kvm_check_extension(s, KVM_CAP_MULTI_ADDRESS_SPACE);
     if (s->nr_as <= 1) {
         s->nr_as = 1;
diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
index 37626c1ac5..ad2ae6fca5 100644
--- a/accel/kvm/trace-events
+++ b/accel/kvm/trace-events
@@ -36,3 +36,4 @@ kvm_io_window_exit(void) ""
 kvm_run_exit_system_event(int cpu_index, uint32_t event_type) "cpu_index %d, system_even_type %"PRIu32
 kvm_convert_memory(uint64_t start, uint64_t size, const char *msg) "start 0x%" PRIx64 " size 0x%" PRIx64 " %s"
 kvm_memory_fault(uint64_t start, uint64_t size, uint64_t flags) "start 0x%" PRIx64 " size 0x%" PRIx64 " flags 0x%" PRIx64
+kvm_slots_grow(unsigned int old, unsigned int new) "%u -> %u"
-- 
2.45.0
* Re: [PATCH v2 1/4] KVM: Dynamic sized kvm memslots array

From: Juraj Marcin @ 2024-09-05 15:32 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova,
	Fabiano Rosas, Paolo Bonzini, David Hildenbrand, qemu-stable,
	Zhiyi Guo

Hi Peter,

On Thu, Sep 5, 2024 at 12:35 AM Peter Xu <peterx@redhat.com> wrote:
>
> Zhiyi reported an infinite loop issue in a VFIO use case.  The cause of
> that was a separate discussion; however, while profiling it I found a
> regression of dirty sync slowness.

[...]

>  /* Called with KVMMemoryListener.slots_lock held */
>  static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
>  {
> -    KVMState *s = kvm_state;
>      int i;
>
> -    for (i = 0; i < s->nr_slots; i++) {
> +retry:
> +    for (i = 0; i < kml->nr_slots_allocated; i++) {
>          if (kml->slots[i].memory_size == 0) {
>              return &kml->slots[i];
>          }
>      }
>
> +    /* If no free slots, try to grow first by doubling */
> +    if (kvm_slots_double(kml)) {
> +        goto retry;

At this point we know all previously allocated slots were used and
there should be a free slot just after the last used slot (at the
start of the region zeroed in the grow function).  Wouldn't it be
faster to return it here right away, instead of iterating again
through slots that we know are still used?

> +    }
> +
>      return NULL;
>  }

[...]

--
Juraj Marcin
* Re: [PATCH v2 1/4] KVM: Dynamic sized kvm memslots array

From: Peter Xu @ 2024-09-05 15:59 UTC (permalink / raw)
To: Juraj Marcin
Cc: qemu-devel, Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova,
	Fabiano Rosas, Paolo Bonzini, David Hildenbrand, qemu-stable,
	Zhiyi Guo

On Thu, Sep 05, 2024 at 05:32:46PM +0200, Juraj Marcin wrote:
> Hi Peter,

Hi, Juraj,

[...]

> > +    /* If no free slots, try to grow first by doubling */
> > +    if (kvm_slots_double(kml)) {
> > +        goto retry;
>
> At this point we know all previously allocated slots were used and
> there should be a free slot just after the last used slot (at the
> start of the region zeroed in the grow function).  Wouldn't it be
> faster to return it here right away, instead of iterating again
> through slots that we know are still used?

Good question.

One trivial concern is that we'll then have assumptions about how
kvm_slots_double() behaves, e.g. that it must not move anything around
inside, and we need to know that it touches nr_slots_allocated, so we
have to cache that value first.  The outcome looks like this:

===8<===
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 020fd16ab8..7429fe87a8 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -249,9 +249,9 @@ unsigned int kvm_get_free_memslots(void)
 /* Called with KVMMemoryListener.slots_lock held */
 static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
 {
+    unsigned int n;
     int i;
 
-retry:
     for (i = 0; i < kml->nr_slots_allocated; i++) {
         if (kml->slots[i].memory_size == 0) {
             return &kml->slots[i];
@@ -259,8 +259,13 @@ retry:
     }
 
     /* If no free slots, try to grow first by doubling */
+    n = kml->nr_slots_allocated;
     if (kvm_slots_double(kml)) {
-        goto retry;
+        /*
+         * If it succeeds, we must have n used slots, followed by n
+         * free slots.
+         */
+        return &kml->slots[n];
     }
 
     return NULL;
===8<===

It's still good to get rid of the "goto", and it is faster indeed.
Though I wish we didn't need those assumptions; that's the downside.

One thing to mention is that I expect this to be an extremely slow
path, one that may not even be reached in major uses of QEMU, and when
reached it should be hit only once or a limited few times per VM life
cycle.  The re-walks here shouldn't be a perf concern IMHO, because by
the time they are a concern we'll be hitting them much more frequently
elsewhere; there are many hotter paths around.

So far it looks slightly more readable to me to keep the old way, but
I'm ok either way.  What do you think?

Thanks,

-- 
Peter Xu
* Re: [PATCH v2 1/4] KVM: Dynamic sized kvm memslots array

From: Juraj Marcin @ 2024-09-06 10:54 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova,
	Fabiano Rosas, Paolo Bonzini, David Hildenbrand, qemu-stable,
	Zhiyi Guo

Hi Peter,

On Thu, Sep 5, 2024 at 6:00 PM Peter Xu <peterx@redhat.com> wrote:
>
> One trivial concern is that we'll then have assumptions about how
> kvm_slots_double() behaves, e.g. that it must not move anything around
> inside, and we need to know that it touches nr_slots_allocated, so we
> have to cache that value first.  The outcome looks like this:

[...]

>     /* If no free slots, try to grow first by doubling */
> +    n = kml->nr_slots_allocated;
>     if (kvm_slots_double(kml)) {
> -        goto retry;
> +        /*
> +         * If it succeeds, we must have n used slots, followed by n
> +         * free slots.
> +         */
> +        return &kml->slots[n];
>     }

[...]

> So far it looks slightly more readable to me to keep the old way, but
> I'm ok either way.  What do you think?

I agree that it requires this assumption of not moving slots around,
but I think it's intuitive to assume that when it comes to
doubling/increasing the size of an array; realloc() and g_renew() also
don't shuffle existing elements.

In addition, there already is such an assumption.  If slots were moved
around, pointers returned by `return &kml->slots[i];` wouldn't point to
the same slot structure after doubling.

However, I realized there's also another problem with this return
statement: g_renew() could have moved the whole array to a new address,
making all the previously returned pointers invalid.  This could be
solved either by adding another layer of indirection, so that the
function returns a pointer to a single slot structure that never moves
and the array contains pointers to these structures, or by always
accessing the slots through an up-to-date pointer to the array,
probably from another structure or through a getter function.  With the
first approach, pointers in the array could shuffle; with the second
one, the index of a slot must not change during the lifetime of the
slot, keeping the assumption correct.

--
Juraj Marcin
* Re: [PATCH v2 1/4] KVM: Dynamic sized kvm memslots array

From: Peter Xu @ 2024-09-06 12:30 UTC (permalink / raw)
To: Juraj Marcin
Cc: qemu-devel, Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova,
	Fabiano Rosas, Paolo Bonzini, David Hildenbrand, qemu-stable,
	Zhiyi Guo

On Fri, Sep 06, 2024 at 12:54:37PM +0200, Juraj Marcin wrote:

[...]

> However, I realized there's also another problem with this return
> statement: g_renew() could have moved the whole array to a new address,
> making all the previously returned pointers invalid.  This could be
> solved either by adding another layer of indirection, so that the
> function returns a pointer to a single slot structure that never moves
> and the array contains pointers to these structures, or by always
> accessing the slots through an up-to-date pointer to the array,
> probably from another structure or through a getter function.

Note that all accesses to the kvm slots are protected by
kvm_slots_lock(), which is currently a mutex (i.e., non-RCU), and so
far we can't cache slot pointers anyway, because we don't know whether
there are concurrent updates.

Thanks,

-- 
Peter Xu
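A sketch of the access pattern this locking rule implies, assuming only
the kvm_slots_lock()/kvm_slots_unlock() helpers that already exist in
accel/kvm/kvm-all.c; the example_touch_slot() caller is hypothetical:

/* Hypothetical caller, shown only to illustrate the locking rule. */
static void example_touch_slot(KVMMemoryListener *kml, unsigned int idx)
{
    KVMSlot *mem;

    kvm_slots_lock();
    /*
     * Re-derive the pointer from the index while holding the lock: a
     * kvm_slots_grow() between two critical sections may have
     * g_renew()ed kml->slots to a different address.
     */
    mem = &kml->slots[idx];
    if (mem->memory_size) {
        /* ... operate on the slot while the lock is held ... */
    }
    kvm_slots_unlock();
    /* 'mem' must not be dereferenced after the lock is dropped. */
}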
* [PATCH v2 2/4] KVM: Define KVM_MEMSLOTS_NUM_MAX_DEFAULT

From: Peter Xu @ 2024-09-04 22:35 UTC (permalink / raw)
To: qemu-devel
Cc: Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova, peterx,
	Fabiano Rosas, Paolo Bonzini, Juraj Marcin, David Hildenbrand

Make the default max nr_slots a macro; it's only used when KVM reports
nothing.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 accel/kvm/kvm-all.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f9368494a8..bc1b039190 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -71,6 +71,8 @@
 /* Default num of memslots to be allocated when VM starts */
 #define KVM_MEMSLOTS_NR_ALLOC_DEFAULT 16
+/* Default max allowed memslots if kernel reported nothing */
+#define KVM_MEMSLOTS_NR_MAX_DEFAULT 32
 
 struct KVMParkedVcpu {
     unsigned long vcpu_id;
     int kvm_fd;
@@ -2509,7 +2511,7 @@ static int kvm_init(MachineState *ms)
 
     /* If unspecified, use the default value */
     if (!s->nr_slots) {
-        s->nr_slots = 32;
+        s->nr_slots = KVM_MEMSLOTS_NR_MAX_DEFAULT;
     }
 
     /*
-- 
2.45.0
* [PATCH v2 3/4] KVM: Rename KVMMemoryListener.nr_used_slots to nr_slots_used

From: Peter Xu @ 2024-09-04 22:35 UTC (permalink / raw)
To: qemu-devel
Cc: Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova, peterx,
	Fabiano Rosas, Paolo Bonzini, Juraj Marcin, David Hildenbrand

This makes all the nr_slots counters be named in the same manner.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/sysemu/kvm_int.h | 2 +-
 accel/kvm/kvm-all.c      | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 48e496b3d4..b705dfc9b4 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -45,7 +45,7 @@ typedef struct KVMMemoryUpdate {
 typedef struct KVMMemoryListener {
     MemoryListener listener;
     KVMSlot *slots;
-    unsigned int nr_used_slots;
+    unsigned int nr_slots_used;
     unsigned int nr_slots_allocated;
     int as_id;
     QSIMPLEQ_HEAD(, KVMMemoryUpdate) transaction_add;
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index bc1b039190..b7fb73ae18 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -239,7 +239,7 @@ unsigned int kvm_get_free_memslots(void)
         if (!s->as[i].ml) {
             continue;
         }
-        used_slots = MAX(used_slots, s->as[i].ml->nr_used_slots);
+        used_slots = MAX(used_slots, s->as[i].ml->nr_slots_used);
     }
     kvm_slots_unlock();
 
@@ -1510,7 +1510,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
             }
             start_addr += slot_size;
             size -= slot_size;
-            kml->nr_used_slots--;
+            kml->nr_slots_used--;
         } while (size);
         return;
     }
@@ -1549,7 +1549,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
         ram_start_offset += slot_size;
         ram += slot_size;
         size -= slot_size;
-        kml->nr_used_slots++;
+        kml->nr_slots_used++;
     } while (size);
 }
 
-- 
2.45.0
* [PATCH v2 4/4] KVM: Rename KVMState->nr_slots to nr_slots_max

From: Peter Xu @ 2024-09-04 22:35 UTC (permalink / raw)
To: qemu-devel
Cc: Vitaly Kuznetsov, Prasad Pandit, Julia Suvorova, peterx,
	Fabiano Rosas, Paolo Bonzini, Juraj Marcin, David Hildenbrand

This value reflects the maximum number of memslots supported by the
KVM kernel.  Rename it to make that clearer.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/sysemu/kvm_int.h |  4 ++--
 accel/kvm/kvm-all.c      | 18 +++++++++---------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index b705dfc9b4..2c57194b6b 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -103,8 +103,8 @@ struct KVMDirtyRingReaper {
 struct KVMState
 {
     AccelState parent_obj;
-
-    int nr_slots;
+    /* Max number of KVM slots supported */
+    int nr_slots_max;
     int fd;
     int vmfd;
     int coalesced_mmio;
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index b7fb73ae18..020fd16ab8 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -183,8 +183,8 @@ static bool kvm_slots_grow(KVMMemoryListener *kml, unsigned int nr_slots_new)
     unsigned int i, cur = kml->nr_slots_allocated;
     KVMSlot *slots;
 
-    if (nr_slots_new > kvm_state->nr_slots) {
-        nr_slots_new = kvm_state->nr_slots;
+    if (nr_slots_new > kvm_state->nr_slots_max) {
+        nr_slots_new = kvm_state->nr_slots_max;
     }
 
     if (cur >= nr_slots_new) {
@@ -225,7 +225,7 @@ unsigned int kvm_get_max_memslots(void)
 {
     KVMState *s = KVM_STATE(current_accel());
 
-    return s->nr_slots;
+    return s->nr_slots_max;
 }
 
 unsigned int kvm_get_free_memslots(void)
@@ -243,7 +243,7 @@ unsigned int kvm_get_free_memslots(void)
     }
     kvm_slots_unlock();
 
-    return s->nr_slots - used_slots;
+    return s->nr_slots_max - used_slots;
 }
 
 /* Called with KVMMemoryListener.slots_lock held */
@@ -2507,10 +2507,10 @@ static int kvm_init(MachineState *ms)
         (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
 
     kvm_immediate_exit = kvm_check_extension(s, KVM_CAP_IMMEDIATE_EXIT);
-    s->nr_slots = kvm_check_extension(s, KVM_CAP_NR_MEMSLOTS);
+    s->nr_slots_max = kvm_check_extension(s, KVM_CAP_NR_MEMSLOTS);
 
     /* If unspecified, use the default value */
-    if (!s->nr_slots) {
-        s->nr_slots = KVM_MEMSLOTS_NR_MAX_DEFAULT;
+    if (!s->nr_slots_max) {
+        s->nr_slots_max = KVM_MEMSLOTS_NR_MAX_DEFAULT;
     }
 
     /*
@@ -2519,10 +2519,10 @@ static int kvm_init(MachineState *ms)
      * fail to boot.  Make sure the supported value is always at least
      * larger than what we will initially allocate.
      */
-    if (s->nr_slots < KVM_MEMSLOTS_NR_ALLOC_DEFAULT) {
+    if (s->nr_slots_max < KVM_MEMSLOTS_NR_ALLOC_DEFAULT) {
         ret = -EINVAL;
         fprintf(stderr, "KVM max supported number of slots (%d) too small\n",
-                s->nr_slots);
+                s->nr_slots_max);
         goto err;
     }
 
-- 
2.45.0