From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 09/12] kvm: Persistent per kvmslot dirty bitmap
Date: Thu, 30 May 2019 14:53:30 +0100
Message-ID: <20190530135329.GG2823@work-vm>
In-Reply-To: <20190530092919.26059-10-peterx@redhat.com>
* Peter Xu (peterx@redhat.com) wrote:
> When synchronizing the dirty bitmap from KVM in the kernel, we do it
> per kvmslot and allocate the userspace bitmap anew for each ioctl.
> This patch instead makes the bitmap cache persistent, so we don't
> need to g_malloc0() every time.
>
> More importantly, the cached per-kvmslot dirty bitmap will also be
> used when we add support for KVM_CLEAR_DIRTY_LOG: it guarantees that
> we won't clear any dirty bits we don't know about, which would
> otherwise be a severe data loss issue for the migration code.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
Is there no way to allocate this the first time it's needed?
I'm thinking of the common case where the VM is never migrated, in which
case we're allocating this structure for no benefit.
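Roughly what I have in mind, as an untested sketch only. The names
(DemoSlot, demo_slot_get_dirty_bmap, demo_slot_unregister, the bmap_size
field) are made up for illustration and are not in the tree, and plain
calloc() stands in for g_malloc0():

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, stripped-down stand-in for KVMSlot (not the real struct). */
typedef struct DemoSlot {
    size_t bmap_size;           /* bytes; computed when the slot is registered */
    unsigned long *dirty_bmap;  /* stays NULL until dirty logging needs it */
} DemoSlot;

/*
 * Return the slot's dirty bitmap, allocating it zero-filled on first use.
 * Unlike g_malloc0(), calloc() can return NULL, so callers must check.
 */
static unsigned long *demo_slot_get_dirty_bmap(DemoSlot *slot)
{
    if (!slot->dirty_bmap) {
        slot->dirty_bmap = calloc(1, slot->bmap_size);
    }
    return slot->dirty_bmap;
}

/* Drop the bitmap when the slot goes away; free(NULL) is a no-op. */
static void demo_slot_unregister(DemoSlot *slot)
{
    free(slot->dirty_bmap);
    slot->dirty_bmap = NULL;
    slot->bmap_size = 0;
}
```

The sync path would then go through the getter rather than reading
mem->dirty_bmap directly. This ignores locking entirely, which a real
version would have to sort out.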
Dave
> ---
> accel/kvm/kvm-all.c | 39 +++++++++++++++++++++------------------
> include/sysemu/kvm_int.h | 2 ++
> 2 files changed, 23 insertions(+), 18 deletions(-)
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index b686531586..334c610918 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -497,31 +497,14 @@ static int kvm_physical_sync_dirty_bitmap(KVMMemoryListener *kml,
> return 0;
> }
>
> - /* XXX bad kernel interface alert
> - * For dirty bitmap, kernel allocates array of size aligned to
> - * bits-per-long. But for case when the kernel is 64bits and
> - * the userspace is 32bits, userspace can't align to the same
> - * bits-per-long, since sizeof(long) is different between kernel
> - * and user space. This way, userspace will provide buffer which
> - * may be 4 bytes less than the kernel will use, resulting in
> - * userspace memory corruption (which is not detectable by valgrind
> - * too, in most cases).
> - * So for now, let's align to 64 instead of HOST_LONG_BITS here, in
> - * a hope that sizeof(long) won't become >8 any time soon.
> - */
> - size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS),
> - /*HOST_LONG_BITS*/ 64) / 8;
> - d.dirty_bitmap = g_malloc0(size);
> -
> + d.dirty_bitmap = mem->dirty_bmap;
> d.slot = mem->slot | (kml->as_id << 16);
> if (kvm_vm_ioctl(s, KVM_GET_DIRTY_LOG, &d) == -1) {
> DPRINTF("ioctl failed %d\n", errno);
> - g_free(d.dirty_bitmap);
> return -1;
> }
>
> kvm_get_dirty_pages_log_range(section, d.dirty_bitmap);
> - g_free(d.dirty_bitmap);
> }
>
> return 0;
> @@ -765,6 +748,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
> MemoryRegion *mr = section->mr;
> bool writeable = !mr->readonly && !mr->rom_device;
> hwaddr start_addr, size;
> + unsigned long bmap_size;
> void *ram;
>
> if (!memory_region_is_ram(mr)) {
> @@ -796,6 +780,8 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
> }
>
> /* unregister the slot */
> + g_free(mem->dirty_bmap);
> + mem->dirty_bmap = NULL;
> mem->memory_size = 0;
> mem->flags = 0;
> err = kvm_set_user_memory_region(kml, mem, false);
> @@ -807,12 +793,29 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
> return;
> }
>
> + /*
> + * XXX bad kernel interface alert For dirty bitmap, kernel
> + * allocates array of size aligned to bits-per-long. But for case
> + * when the kernel is 64bits and the userspace is 32bits,
> + * userspace can't align to the same bits-per-long, since
> + * sizeof(long) is different between kernel and user space. This
> + * way, userspace will provide buffer which may be 4 bytes less
> + * than the kernel will use, resulting in userspace memory
> + * corruption (which is not detectable by valgrind too, in most
> + * cases). So for now, let's align to 64 instead of
> + * HOST_LONG_BITS here, in a hope that sizeof(long) won't become
> + * >8 any time soon.
> + */
> + bmap_size = ALIGN((size >> TARGET_PAGE_BITS),
> + /*HOST_LONG_BITS*/ 64) / 8;
> +
> /* register the new slot */
> mem = kvm_alloc_slot(kml);
> mem->memory_size = size;
> mem->start_addr = start_addr;
> mem->ram = ram;
> mem->flags = kvm_mem_flags(mr);
> + mem->dirty_bmap = g_malloc0(bmap_size);
>
> err = kvm_set_user_memory_region(kml, mem, true);
> if (err) {
> diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
> index f838412491..687a2ee423 100644
> --- a/include/sysemu/kvm_int.h
> +++ b/include/sysemu/kvm_int.h
> @@ -21,6 +21,8 @@ typedef struct KVMSlot
> int slot;
> int flags;
> int old_flags;
> + /* Dirty bitmap cache for the slot */
> + unsigned long *dirty_bmap;
> } KVMSlot;
>
> typedef struct KVMMemoryListener {
> --
> 2.17.1
>
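For anyone following along, the sizing rule in that comment works out as
below. This is a standalone illustration, not code from the patch:
demo_dirty_bmap_bytes and DEMO_PAGE_BITS are hypothetical names, and the
page shift of 12 (4 KiB pages) is an assumption standing in for
TARGET_PAGE_BITS.

```c
#include <stdint.h>

#define DEMO_PAGE_BITS 12  /* assumption: 4 KiB target pages */

/*
 * One bit per page, with the bit count rounded up to a multiple of 64
 * so the buffer is never smaller than what a 64-bit kernel writes,
 * then converted to bytes.
 */
static uint64_t demo_dirty_bmap_bytes(uint64_t memory_size)
{
    uint64_t pages = memory_size >> DEMO_PAGE_BITS;
    uint64_t bits = (pages + 63) & ~UINT64_C(63);

    return bits / 8;
}
```

With these assumptions a 65-page slot (266240 bytes) needs 128 bits, so
16 bytes. Rounding to 32 bits instead would give 96 bits, so 12 bytes,
which is the "4 bytes less" case the comment warns about.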
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK