From: Wen Congyang <wency@cn.fujitsu.com>
To: "Denis V. Lunev" <den@openvz.org>
Cc: Igor Redko <redkoi@virtuozzo.com>,
Juan Quintela <quintela@redhat.com>,
qemu-devel@nongnu.org, Anna Melekhova <annam@virtuozzo.com>,
Amit Shah <amit.shah@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 2/2] migration: fix deadlock
Date: Fri, 25 Sep 2015 17:35:46 +0800
Message-ID: <56051572.5010107@cn.fujitsu.com>
In-Reply-To: <1443172180-1005-3-git-send-email-den@openvz.org>
On 09/25/2015 05:09 PM, Denis V. Lunev wrote:
> Release qemu global mutex before calling synchronize_rcu().
> synchronize_rcu() waits for all readers to finish their critical
> sections. There is at least one critical section in which we try
> to get QGM (the critical section is in address_space_rw(), and
> prepare_mmio_access() is trying to acquire QGM).
>
> Both functions (migration_end() and migration_bitmap_extend())
> are called from the main thread, which is holding QGM.
>
> Thus there is a race condition that ends in a deadlock:
>     main thread                         working thread
>     Lock QGM                                  |
>         |                          Call KVM_EXIT_IO handler
>         |                                     |
>         |                    Open rcu reader's critical section
>     Migration cleanup bh                      |
>         |                                     |
>     synchronize_rcu() is                      |
>     waiting for readers                       |
>         |              prepare_mmio_access() is waiting for QGM
>           \                                   /
>                          deadlock
>
> The patch changes bitmap freeing from direct g_free after synchronize_rcu
> to g_free_rcu.
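
To make the difference concrete, here is a minimal single-threaded model of deferred reclamation. It is only a sketch and not QEMU's RCU implementation: every name in it (rcu_head_model, reader_lock, defer_free, reclaim_if_quiescent) is hypothetical, and it conservatively waits for all readers rather than only those that started before the update. The point it illustrates is that the updater queues the free and returns immediately, so it never blocks while a reader is still inside its critical section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an RCU callback head (not QEMU's struct rcu_head). */
struct rcu_head_model {
    struct rcu_head_model *next;
    void (*func)(struct rcu_head_model *head);
};

static struct rcu_head_model *pending;  /* callbacks waiting to run */
static int active_readers;              /* readers inside a critical section */

static void reader_lock(void)
{
    active_readers++;
}

static void reader_unlock(void)
{
    assert(active_readers > 0);
    active_readers--;
}

/* Queue a free for later. Returns at once and never waits for readers. */
static void defer_free(struct rcu_head_model *head,
                       void (*func)(struct rcu_head_model *head))
{
    head->func = func;
    head->next = pending;
    pending = head;
}

/*
 * Reclaimer step: run the queued callbacks only when no reader is active.
 * Returns the number of callbacks that were run.
 */
static int reclaim_if_quiescent(void)
{
    int n = 0;

    if (active_readers > 0) {
        return 0;
    }
    while (pending != NULL) {
        struct rcu_head_model *head = pending;

        pending = head->next;
        head->func(head);
        n++;
    }
    return n;
}

/* Callback that releases an object whose only content is the head itself. */
static void free_object(struct rcu_head_model *head)
{
    free(head);
}
```

In this model the updater calls defer_free() while a reader is still active and carries on; the object is released only by a later reclaim_if_quiescent() call made after the reader has left its critical section.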
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> Reported-by: Igor Redko <redkoi@virtuozzo.com>
> CC: Igor Redko <redkoi@virtuozzo.com>
> CC: Anna Melekhova <annam@virtuozzo.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Amit Shah <amit.shah@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Wen Congyang <wency@cn.fujitsu.com>
> ---
> migration/ram.c | 43 ++++++++++++++++++++++++++++---------------
> 1 file changed, 28 insertions(+), 15 deletions(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index a712c68..56b6fce 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -221,12 +221,27 @@ static RAMBlock *last_seen_block;
>  /* This is the last block from where we have sent data */
>  static RAMBlock *last_sent_block;
>  static ram_addr_t last_offset;
> -static unsigned long *migration_bitmap;
>  static QemuMutex migration_bitmap_mutex;
>  static uint64_t migration_dirty_pages;
>  static uint32_t last_version;
>  static bool ram_bulk_stage;
>
> +static struct BitmapRcu {
> +    struct rcu_head rcu;
> +    unsigned long bmap[0];
> +} *migration_bitmap_rcu;
> +
> +static inline struct BitmapRcu *bitmap_new_rcu(long nbits)
> +{
> +    long len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> +    struct BitmapRcu *ptr = g_try_malloc0(len + sizeof(struct BitmapRcu));

It would be better to make two allocations: one for the BitmapRcu struct and
one for the bitmap itself, by calling bitmap_new(). That way the caller does
not need to know how the bitmap is implemented.

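
A rough sketch of what I mean, written as standalone C so it can be compiled outside the QEMU tree. The names rcu_head_stub, bitmap_new_stub, bitmap_rcu_new and bitmap_rcu_free are placeholders of mine, standing in for struct rcu_head, bitmap_new() and whatever helpers the patch ends up with. Note one consequence: with a separately allocated bitmap, freeing only the wrapper is no longer enough, so I assume the reclaim would go through call_rcu() with a callback like the one below instead of g_free_rcu().

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Placeholder for QEMU's struct rcu_head; an assumption for this sketch. */
struct rcu_head_stub {
    struct rcu_head_stub *next;
    void (*func)(struct rcu_head_stub *head);
};

#define BITS_PER_LONG_STUB (sizeof(unsigned long) * CHAR_BIT)

/* Placeholder for bitmap_new(): returns a zero-filled bitmap of nbits bits. */
static unsigned long *bitmap_new_stub(long nbits)
{
    size_t nlongs;
    unsigned long *map;

    assert(nbits > 0);
    nlongs = ((size_t)nbits + BITS_PER_LONG_STUB - 1) / BITS_PER_LONG_STUB;
    map = calloc(nlongs, sizeof(unsigned long));
    if (map == NULL) {
        abort();
    }
    return map;
}

struct BitmapRcu {
    struct rcu_head_stub rcu;   /* kept as the first member */
    unsigned long *bmap;        /* separate allocation, not a trailing array */
};

/* First allocation: the wrapper. Second allocation: the bitmap itself. */
static struct BitmapRcu *bitmap_rcu_new(long nbits)
{
    struct BitmapRcu *b = calloc(1, sizeof(*b));

    if (b == NULL) {
        abort();
    }
    b->bmap = bitmap_new_stub(nbits);
    return b;
}

/* Reclaim callback: releases the bitmap first, then the wrapper. */
static void bitmap_rcu_free(struct rcu_head_stub *head)
{
    /* Valid because rcu is the first member of struct BitmapRcu. */
    struct BitmapRcu *b = (struct BitmapRcu *)head;

    free(b->bmap);
    free(b);
}
```

The sizing arithmetic then lives only in the bitmap allocator, and the readers would keep using ->bmap exactly as in the hunks below.
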
> +    if (ptr == NULL) {
> +        abort();
> +    }
> +    return ptr;
> +}
> +
> +
>  struct CompressParam {
>      bool start;
>      bool done;
> @@ -508,7 +523,7 @@ ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
>
>      unsigned long next;
>
> -    bitmap = atomic_rcu_read(&migration_bitmap);
> +    bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
>      if (ram_bulk_stage && nr > base) {
>          next = nr + 1;
>      } else {
> @@ -526,7 +541,7 @@ ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
>  static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
>  {
>      unsigned long *bitmap;
> -    bitmap = atomic_rcu_read(&migration_bitmap);
> +    bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
>      migration_dirty_pages +=
>          cpu_physical_memory_sync_dirty_bitmap(bitmap, start, length);
>  }
> @@ -1029,12 +1044,11 @@ static void migration_end(void)
>      /* caller have hold iothread lock or is in a bh, so there is
>       * no writing race against this migration_bitmap
>       */
> -    unsigned long *bitmap = migration_bitmap;
> -    atomic_rcu_set(&migration_bitmap, NULL);
> +    struct BitmapRcu *bitmap = migration_bitmap_rcu;
> +    atomic_rcu_set(&migration_bitmap_rcu, NULL);
>      if (bitmap) {
>          memory_global_dirty_log_stop();
> -        synchronize_rcu();
> -        g_free(bitmap);
> +        g_free_rcu(bitmap, rcu);
>      }
>
>      XBZRLE_cache_lock();
> @@ -1070,9 +1084,9 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
>      /* called in qemu main thread, so there is
>       * no writing race against this migration_bitmap
>       */
> -    if (migration_bitmap) {
> -        unsigned long *old_bitmap = migration_bitmap, *bitmap;
> -        bitmap = bitmap_new(new);
> +    if (migration_bitmap_rcu) {
> +        struct BitmapRcu *old_bitmap = migration_bitmap_rcu, *bitmap;
> +        bitmap = bitmap_new_rcu(new);
>
>          /* prevent migration_bitmap content from being set bit
>           * by migration_bitmap_sync_range() at the same time.
> @@ -1080,12 +1094,11 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
>           * at the same time.
>           */
>          qemu_mutex_lock(&migration_bitmap_mutex);
> -        bitmap_copy(bitmap, old_bitmap, old);
> -        atomic_rcu_set(&migration_bitmap, bitmap);
> +        bitmap_copy(bitmap->bmap, old_bitmap->bmap, old);
> +        atomic_rcu_set(&migration_bitmap_rcu, bitmap);
>          qemu_mutex_unlock(&migration_bitmap_mutex);
>          migration_dirty_pages += new - old;
> -        synchronize_rcu();
> -        g_free(old_bitmap);
> +        g_free_rcu(old_bitmap, rcu);
>      }
>  }
>
> @@ -1144,7 +1157,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      reset_ram_globals();
>
>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
> -    migration_bitmap = bitmap_new(ram_bitmap_pages);
> +    migration_bitmap_rcu = bitmap_new_rcu(ram_bitmap_pages);
>
>      /*
>       * Count the total number of pages used by ram blocks not including any
>
Thread overview: 18+ messages
2015-09-24 12:53 [Qemu-devel] [PATCH 1/1] migration: fix deadlock Denis V. Lunev
2015-09-25 1:21 ` Wen Congyang
2015-09-25 8:03 ` Denis V. Lunev
2015-09-25 8:23 ` Wen Congyang
2015-09-25 9:09 ` [Qemu-devel] [PATCH v2 0/2] " Denis V. Lunev
2015-09-25 9:09 ` [Qemu-devel] [PATCH 1/2] migration: bitmap_set is unnecessary as bitmap_new uses g_try_malloc0 Denis V. Lunev
2015-09-25 9:24 ` Wen Congyang
2015-09-25 9:31 ` Denis V. Lunev
2015-09-25 9:37 ` Wen Congyang
2015-09-25 10:05 ` Denis V. Lunev
2015-09-25 9:09 ` [Qemu-devel] [PATCH 2/2] migration: fix deadlock Denis V. Lunev
2015-09-25 9:35 ` Wen Congyang [this message]
2015-09-25 9:46 ` [Qemu-devel] [PATCH v2 0/2] " Wen Congyang
2015-09-28 10:55 ` Igor Redko
2015-09-28 15:12 ` Igor Redko
2015-09-29 8:47 ` Dr. David Alan Gilbert
2015-09-30 14:28 ` Igor Redko
2015-09-29 15:32 ` [Qemu-devel] [PATCH 1/1] " Igor Redko