Re: [Qemu-devel] [PATCH] XBZRLE: Fix qemu crash when resize the xbzrle cache during migration

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Juan Quintela <quintela@redhat.com>,
	Luonengjun <luonengjun@huawei.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"owasserm@redhat.com" <owasserm@redhat.com>,
	"chenliang (T)" <chenliang88@huawei.com>,
	"Huangweidong (Hardware)" <huangweidong@huawei.com>
Subject: Re: [Qemu-devel] [PATCH] XBZRLE: Fix qemu crash when resize the xbzrle cache during migration
Date: Wed, 19 Feb 2014 10:53:58 +0000	[thread overview]
Message-ID: <20140219105357.GD2916@work-vm> (raw)
In-Reply-To: <33183CC9F5247A488A2544077AF19020815C5AB8@SZXEMA503-MBS.china.huawei.com>

* Gonglei (Arei) (arei.gonglei@huawei.com) wrote:

Hi Arei,

> It is likely to crash qemu when resize the xbzrle cache
> during migration. Because the xbzrle cache will be modified
> by migration thread and resize thread.

Thanks - we was just thinking about this last night after 
we hit it.

I was thinking about doing it by just moving the resize
into the ram save loop; but I think your lock looks about right.

> Test scene
> step one: set the size of xbzrle cache 1GB.
> step two: migrate vm which dirty memory continuously.
> step three: set the size of xbzrle cache 0.125GB.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com<mailto:chenliang88@huawei.com>>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com<mailto:arei.gonglei@huawei.com>>
> ---
> arch_init.c                    |   42 ++++++++++++++++++++++++++++++++++++---
> include/migration/page_cache.h |   14 +++++++++++++
> page_cache.c                   |   13 ++++++++++++
> 3 files changed, 65 insertions(+), 4 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 80574a0..e2d2c72 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -164,26 +164,56 @@ static struct {
>      uint8_t *encoded_buf;
>      /* buffer for storing page content */
>      uint8_t *current_buf;
> -    /* Cache for XBZRLE */
> +    /* Cache for XBZRLE, Protected by lock. */
>      PageCache *cache;
> +    QemuMutex lock;
> } XBZRLE = {
>      .encoded_buf = NULL,
>      .current_buf = NULL,
>      .cache = NULL,
> };
> +
> /* buffer used for XBZRLE decoding */
> static uint8_t *xbzrle_decoded_buf;
> 
> +static void XBZRLE_cache_lock(void)
> +{
> +    qemu_mutex_lock(&XBZRLE.lock);
> +}
> +
> +static void XBZRLE_cache_unlock(void)
> +{
> +    qemu_mutex_unlock(&XBZRLE.lock);
> +}
> +
> +
> int64_t xbzrle_cache_resize(int64_t new_size)
> {
> +    PageCache *new_cache, *old_cache;
> +
>      if (new_size < TARGET_PAGE_SIZE) {
>          return -1;
>      }
> 
>      if (XBZRLE.cache != NULL) {
> -        return cache_resize(XBZRLE.cache, new_size / TARGET_PAGE_SIZE) *
> -            TARGET_PAGE_SIZE;
> +        if (pow2floor(new_size) == cache_max_num_items(XBZRLE.cache)
> +                                               *TARGET_PAGE_SIZE) {
> +            goto ret;
> +        }
> +        new_cache = cache_init(new_size / TARGET_PAGE_SIZE,
> +                                      cache_page_size(XBZRLE.cache));

I wonder whether it's safe to even check the size of the XBZRLE.cache here;
it's a short race, but I think if you're unlucky and migration completes
between the NULL check and here then it would break.
Also 'migration_end' calls cache_fini, so it probably also needs to have
the lock around it.

> +        if (!new_cache) {
> +            DPRINTF("Error creating cache\n");
> +            return -1;
> +        }
> +        XBZRLE_cache_lock();
> +        old_cache = XBZRLE.cache;
> +        XBZRLE.cache = new_cache;
> +        XBZRLE_cache_unlock();
> +        cache_fini(old_cache);
> +        goto ret;
>      }
> +ret:
>      return pow2floor(new_size);
> }
> 
> @@ -522,6 +552,8 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>              ret = ram_control_save_page(f, block->offset,
>                                 offset, TARGET_PAGE_SIZE, &bytes_sent);
> 
> +            XBZRLE_cache_lock();
> +
>              if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
>                  if (ret != RAM_SAVE_CONTROL_DELAYED) {
>                      if (bytes_sent > 0) {
> @@ -553,6 +585,8 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>                  acct_info.norm_pages++;
>              }
> 
> +            XBZRLE_cache_unlock();
> +
>              /* if page is unmodified, continue to the next */
>              if (bytes_sent > 0) {
>                  last_sent_block = block;
> @@ -681,7 +715,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>              XBZRLE.encoded_buf = NULL;
>              return -1;
>          }
> -
> +        qemu_mutex_init(&XBZRLE.lock);
>          acct_clear();
>      }
> 
> diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
> index d156f0d..6be0103 100644
> --- a/include/migration/page_cache.h
> +++ b/include/migration/page_cache.h
> @@ -79,4 +79,18 @@ int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata);
>   */
> int64_t cache_resize(PageCache *cache, int64_t num_pages);
> 
> +/**
> + * cache_max_num_items: return the cache max num items.
> + *
> + * @cache pointer to the PageCache struct
> + */
> +int64_t cache_max_num_items(PageCache *cache);
> +
> +/**
> + * cache_page_size: return the cache page size
> + *
> + * @cache pointer to the PageCache struct
> + */
> +unsigned int cache_page_size(PageCache *cache);
> +
> #endif
> diff --git a/page_cache.c b/page_cache.c
> index 3ef6ee7..92e401c 100644
> --- a/page_cache.c
> +++ b/page_cache.c
> @@ -234,3 +234,16 @@ int64_t cache_resize(PageCache *cache, int64_t new_num_pages)
> 
>      return cache->max_num_items;
> }
> +
> +int64_t cache_max_num_items(PageCache *cache)
> +{
> +    g_assert(cache);
> +    return cache->max_num_items;
> +}
> +
> +unsigned int cache_page_size(PageCache *cache)
> +{
> +    g_assert(cache);
> +    return cache->page_size;
> +}
> +
> --
> 1.6.0.2
> 
> 
> Best regards,
> -Gonglei
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

next prev parent reply	other threads:[~2014-02-19 10:56 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-19 10:24 [Qemu-devel] [PATCH] XBZRLE: Fix qemu crash when resize the xbzrle cache during migration Gonglei (Arei)
2014-02-19 10:53 ` Dr. David Alan Gilbert [this message]
2014-02-20  1:52   ` Gonglei (Arei)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140219105357.GD2916@work-vm \
    --to=dgilbert@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=chenliang88@huawei.com \
    --cc=huangweidong@huawei.com \
    --cc=luonengjun@huawei.com \
    --cc=owasserm@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.