From: Thomas Huth <thuth@redhat.com>
To: Liang Li <liang.z.li@intel.com>, qemu-devel@nongnu.org
Cc: kvm@vger.kernel.org, mst@redhat.com, lcapitulino@redhat.com,
	pbonzini@redhat.com, quintela@redhat.com, amit.shah@redhat.com,
	dgilbert@redhat.com
Subject: Re: [QEMU 1/7] balloon: speed up inflating & deflating process
Date: Tue, 14 Jun 2016 13:37:36 +0200
Message-ID: <575FEC80.2070500@redhat.com>
In-Reply-To: <1465813009-21390-2-git-send-email-liang.z.li@intel.com>

On 13.06.2016 12:16, Liang Li wrote:
> The implementation of the current virtio-balloon is not very efficient.
> Below are the test results for the time spent on inflating the balloon
> to 3GB of a 4GB idle guest:
> 
> a. allocating pages (6.5%, 103ms)
> b. sending PFNs to host (68.3%, 787ms)
> c. address translation (6.1%, 96ms)
> d. madvise (19%, 300ms)
> 
> It takes about 1577ms for the whole inflating process to complete. The
> test shows that the bottlenecks are stage b and stage d.
> 
> If we use a bitmap to send the page info instead of the PFNs, we can
> reduce the overhead spent on stage b quite a lot. Furthermore, it's
> possible to do the address translation and the madvise for a bulk of
> pages at once, instead of the current page-by-page way, so the overhead
> of stage c and stage d can also be reduced a lot.
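
For a rough sense of the stage b saving, here is a small sketch of the
arithmetic. It assumes 4 KiB balloon pages and one 32-bit PFN per page; the
helper names are made up for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB balloon pages, 32-bit PFN entries. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PFN_BYTES  4

/* Number of balloon pages needed to cover `bytes` of guest memory. */
static uint64_t pages_for(uint64_t bytes)
{
    return bytes >> SKETCH_PAGE_SHIFT;
}

/* Payload size when every page is described by its own PFN. */
static uint64_t pfn_list_bytes(uint64_t pages)
{
    return pages * SKETCH_PFN_BYTES;
}

/* Payload size when every page is described by one bit (rounded up). */
static uint64_t bitmap_bytes(uint64_t pages)
{
    return (pages + 7) / 8;
}
```

Under these assumptions a 3GB balloon is 786432 pages, which is 3145728
bytes of PFNs versus 98304 bytes of bitmap, a factor of 32. A real bitmap
also has to span any holes between the ballooned pages, so this is only a
best-case estimate.
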
> 
> This patch is the QEMU side implementation which is intended to speed
> up the inflating & deflating process by adding a new feature to the
> virtio-balloon device. Now, inflating the balloon to 3GB of a 4GB
> idle guest takes only 210ms, which is about 8 times as fast as before.
> 
> TODO: optimize stage a by allocating/freeing a chunk of pages instead
> of a single page at a time.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  hw/virtio/virtio-balloon.c                      | 159 ++++++++++++++++++++----
>  include/standard-headers/linux/virtio_balloon.h |   1 +
>  2 files changed, 139 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 8c15e09..8cf74c2 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -47,6 +47,76 @@ static void balloon_page(void *addr, int deflate)
>  #endif
>  }
>  
> +static void do_balloon_bulk_pages(ram_addr_t base_pfn, int page_shift,
> +                                  unsigned long len, bool deflate)
> +{
> +    ram_addr_t size, processed, chunk, base;
> +    void *addr;
> +    MemoryRegionSection section = {.mr = NULL};
> +
> +    size = (len << page_shift);
> +    base = (base_pfn << page_shift);
> +
> +    for (processed = 0; processed < size; processed += chunk) {
> +        chunk = size - processed;
> +        while (chunk >= TARGET_PAGE_SIZE) {
> +            section = memory_region_find(get_system_memory(),
> +                                         base + processed, chunk);
> +            if (!section.mr) {
> +                chunk = QEMU_ALIGN_DOWN(chunk / 2, TARGET_PAGE_SIZE);
> +            } else {
> +                break;
> +            }
> +        }
> +
> +        if (section.mr &&
> +            (int128_nz(section.size) && memory_region_is_ram(section.mr))) {
> +            addr = section.offset_within_region +
> +                   memory_region_get_ram_ptr(section.mr);
> +            qemu_madvise(addr, chunk,
> +                         deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
> +        } else {
> +            fprintf(stderr, "can't find the chunk, skip\n");

Please avoid adding new fprintf(stderr, ...) calls to the QEMU sources.
Use error_report(...) or, in this case, maybe rather
qemu_log_mask(LOG_GUEST_ERROR, ...) instead, and use a more meaningful
error message (e.g. one that makes it clear that the error happened in
the balloon code).
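
As a sketch of what such a message could look like: the helper below only
builds the text so that it can be compiled outside of QEMU. The helper name,
its parameters and the exact wording are assumptions for illustration; in
the patch itself the format string would be passed directly to
qemu_log_mask(), as noted in the comment.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in that formats a balloon-specific message.
 * Inside QEMU the equivalent call would look roughly like:
 *
 *   qemu_log_mask(LOG_GUEST_ERROR,
 *                 "virtio-balloon: no RAM at guest address 0x%" PRIx64
 *                 " (length 0x%" PRIx64 "), skipping one page\n",
 *                 addr, len);
 *
 * Returns what snprintf() returns: the length the full message needs.
 */
static int balloon_format_skip_msg(char *buf, size_t buflen,
                                   uint64_t addr, uint64_t len)
{
    return snprintf(buf, buflen,
                    "virtio-balloon: no RAM at guest address 0x%" PRIx64
                    " (length 0x%" PRIx64 "), skipping one page\n",
                    addr, len);
}
```

The prefix names the device and the address identifies the range that could
not be resolved, so the message can be traced back to the balloon code.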

> +            chunk = TARGET_PAGE_SIZE;
> +        }
> +    }
> +}
> +
> +static void balloon_bulk_pages(ram_addr_t base_pfn, unsigned long *bitmap,
> +                               unsigned long len, int page_shift, bool deflate)
> +{
> +#if defined(__linux__)

Why do you need this #if here?

> +    unsigned long end  = len * 8;
> +    unsigned long current = 0;
> +
> +    if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
> +                                         kvm_has_sync_mmu())) {
> +        while (current < end) {
> +            unsigned long one = find_next_bit(bitmap, end, current);
> +
> +            if (one < end) {
> +                unsigned long zero = find_next_zero_bit(bitmap, end, one + 1);
> +                unsigned long page_length;
> +
> +                if (zero >= end) {
> +                    page_length = end - one;
> +                } else {
> +                    page_length = zero - one;
> +                }
> +
> +                if (page_length) {
> +                    do_balloon_bulk_pages(base_pfn + one, page_shift,
> +                                          page_length, deflate);
> +                }
> +                current = one + page_length;
> +            } else {
> +                current = one;
> +            }
> +        }
> +    }
> +#endif
> +}
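
For reference, the run detection in balloon_bulk_pages() can be sketched
stand-alone as below. This is a simplified illustration and not the QEMU
code: scan_for() is a naive bit-by-bit stand-in for find_next_bit() and
find_next_zero_bit(), and struct run and collect_runs() are made-up names.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* One run of consecutive set bits: first bit index and number of bits. */
struct run {
    unsigned long start;
    unsigned long length;
};

/* Return 1 if bit `nr` is set in the bitmap, 0 otherwise. */
static int test_bit_at(const unsigned long *bitmap, unsigned long nr)
{
    return (int)((bitmap[nr / BITS_PER_ULONG] >> (nr % BITS_PER_ULONG)) & 1UL);
}

/* First index in [start, end) whose bit equals `value`, or `end` if none. */
static unsigned long scan_for(const unsigned long *bitmap, unsigned long end,
                              unsigned long start, int value)
{
    unsigned long i = start;

    while (i < end && test_bit_at(bitmap, i) != value) {
        i++;
    }
    return i;
}

/* Store up to max_runs runs of set bits in `runs`; return how many. */
static size_t collect_runs(const unsigned long *bitmap, unsigned long nbits,
                           struct run *runs, size_t max_runs)
{
    size_t count = 0;
    unsigned long current = 0;

    while (current < nbits && count < max_runs) {
        unsigned long one = scan_for(bitmap, nbits, current, 1);
        unsigned long zero;

        if (one >= nbits) {
            break;                    /* no set bits left */
        }
        zero = scan_for(bitmap, nbits, one + 1, 0);
        runs[count].start = one;
        runs[count].length = zero - one;  /* zero == nbits if run hits the end */
        count++;
        current = zero;
    }
    return count;
}
```

Each run found this way corresponds to one call of
do_balloon_bulk_pages(base_pfn + start, page_shift, length, deflate) in the
patch. Because scan_for() returns the end index when nothing is found, the
separate "zero >= end" case is not needed in this form.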

 Thomas


Thread overview: 60+ messages
2016-06-13 10:16 [QEMU 0/7] Fast balloon and fast live migration Liang Li
2016-06-13 10:16 ` [Qemu-devel] " Liang Li
2016-06-13 10:16 ` [QEMU 1/7] balloon: speed up inflating & deflating process Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-14 11:37   ` Thomas Huth [this message]
2016-06-14 11:37     ` Thomas Huth
2016-06-14 14:22     ` Li, Liang Z
2016-06-14 14:22       ` [Qemu-devel] " Li, Liang Z
2016-06-14 14:41       ` Li, Liang Z
2016-06-14 14:41         ` [Qemu-devel] " Li, Liang Z
2016-06-14 15:33         ` Thomas Huth
2016-06-14 15:33           ` [Qemu-devel] " Thomas Huth
2016-06-17  0:54           ` Li, Liang Z
2016-06-17  0:54             ` [Qemu-devel] " Li, Liang Z
2016-06-19  4:12   ` Michael S. Tsirkin
2016-06-19  4:12     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  1:37     ` Li, Liang Z
2016-06-20  1:37       ` [Qemu-devel] " Li, Liang Z
2016-06-13 10:16 ` [QEMU 2/7] virtio-balloon: add drop cache support Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-19  4:14   ` Michael S. Tsirkin
2016-06-19  4:14     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  2:09     ` Li, Liang Z
2016-06-20  2:09       ` [Qemu-devel] " Li, Liang Z
2016-06-13 10:16 ` [QEMU 3/7] Add the hmp and qmp interface for dropping cache Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-13 10:50   ` Daniel P. Berrange
2016-06-13 11:06     ` Daniel P. Berrange
2016-06-13 14:12       ` Li, Liang Z
2016-06-13 14:12         ` Li, Liang Z
2016-06-13 11:41     ` Paolo Bonzini
2016-06-13 14:14       ` Li, Liang Z
2016-06-13 14:14         ` Li, Liang Z
2016-06-13 13:50     ` Li, Liang Z
2016-06-13 13:50       ` Li, Liang Z
2016-06-13 15:09       ` Dr. David Alan Gilbert
2016-06-14  1:15         ` Li, Liang Z
2016-06-14  1:15           ` Li, Liang Z
2016-06-17  1:35         ` Li, Liang Z
2016-06-17  1:35           ` Li, Liang Z
2016-06-13 10:16 ` [QEMU 4/7] balloon: get free page info from guest Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-19  4:24   ` Michael S. Tsirkin
2016-06-19  4:24     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  2:48     ` Li, Liang Z
2016-06-20  2:48       ` [Qemu-devel] " Li, Liang Z
2016-06-13 10:16 ` [QEMU 5/7] bitmap: Add a new bitmap_move function Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-13 10:16 ` [QEMU 6/7] kvm: Add two new arch specific functions Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-19  4:27   ` Michael S. Tsirkin
2016-06-19  4:27     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  3:16     ` Li, Liang Z
2016-06-20  3:16       ` [Qemu-devel] " Li, Liang Z
2016-06-13 10:16 ` [QEMU 7/7] migration: skip free pages during live migration Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-19  4:43   ` Michael S. Tsirkin
2016-06-19  4:43     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  2:52     ` Li, Liang Z
2016-06-20  2:52       ` [Qemu-devel] " Li, Liang Z
