From: Paolo Bonzini <pbonzini@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>, Juan Quintela <quintela@redhat.com>
Cc: amit.shah@redhat.com, qemu-devel@nongnu.org,
	Li Zhijian <lizhijian@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PULL 27/28] migration: protect migration_bitmap
Date: Wed, 8 Jul 2015 22:35:57 +0200
Message-ID: <559D89AD.4090000@redhat.com>
In-Reply-To: <20150708191318.GM4117@noname.redhat.com>



On 08/07/2015 21:13, Kevin Wolf wrote:
> On 07.07.2015 at 15:09, Juan Quintela wrote:
>> From: Li Zhijian <lizhijian@cn.fujitsu.com>
>>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  migration/ram.c | 23 +++++++++++++++++------
>>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> In current master, HMP 'savevm' is broken (looks like a deadlock in RCU
> code, it just hangs indefinitely). git bisect points to this patch.

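This looks like synchronize_rcu() is being called inside an
rcu_read_lock()/rcu_read_unlock() critical section. It waits for all
readers to finish, and the calling thread is itself one of them, so it
never returns.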

The easiest fix is to somehow use call_rcu, but I haven't looked at the
code very well.
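
To illustrate the idea, here is a toy, single-threaded sketch of deferring
the free instead of waiting for readers in place. It is not QEMU code: every
toy_* name is hypothetical, the "grace period" is just a check that no
read-side section is active, and the real call_rcu machinery works across
threads. It only shows why a blocking wait cannot complete while the caller
is itself a reader, whereas a queued callback can run once the reader has
left.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy single-threaded model. Every toy_* name is hypothetical, not QEMU API. */
struct toy_cb {
    void (*func)(void *data);
    void *data;
    struct toy_cb *next;
};

static int toy_read_depth;          /* nesting depth of read-side sections */
static struct toy_cb *toy_pending;  /* reclamation deferred past the readers */
static int toy_freed;               /* how many deferred frees have run */
static unsigned long *toy_bitmap;   /* stands in for migration_bitmap */

static void toy_read_lock(void)   { toy_read_depth++; }
static void toy_read_unlock(void) { assert(toy_read_depth > 0); toy_read_depth--; }

/* A blocking wait for readers can never finish while the caller is a reader. */
static int toy_sync_would_hang(void) { return toy_read_depth > 0; }

/* Queue func(data) to run later instead of waiting here. */
static int toy_call_later(void (*func)(void *), void *data)
{
    struct toy_cb *cb = malloc(sizeof(*cb));
    if (!cb) {
        return -1;
    }
    cb->func = func;
    cb->data = data;
    cb->next = toy_pending;
    toy_pending = cb;
    return 0;
}

/* Run queued callbacks only once no read-side section is active. */
static int toy_grace_period(void)
{
    int ran = 0;
    if (toy_read_depth > 0) {
        return 0;
    }
    while (toy_pending) {
        struct toy_cb *cb = toy_pending;
        toy_pending = cb->next;
        cb->func(cb->data);
        free(cb);
        ran++;
    }
    return ran;
}

static void toy_free_bitmap(void *data) { free(data); toy_freed++; }

/* Unpublish the bitmap, then defer the free rather than block. */
static int toy_migration_end(void)
{
    unsigned long *old = toy_bitmap;
    toy_bitmap = NULL;
    return old ? toy_call_later(toy_free_bitmap, old) : 0;
}

/* Returns 0 if the deferred free behaves as described above. */
static int toy_demo(void)
{
    int hang, ran_inside, ran_after;

    toy_bitmap = calloc(4, sizeof(*toy_bitmap));
    if (!toy_bitmap) {
        return -1;
    }
    toy_read_lock();                  /* caller is now a reader */
    hang = toy_sync_would_hang();     /* a blocking wait would hang here */
    if (toy_migration_end() != 0) {
        return -1;
    }
    ran_inside = toy_grace_period();  /* nothing runs inside the section */
    toy_read_unlock();
    ran_after = toy_grace_period();   /* the free runs after the reader left */
    return (hang && ran_inside == 0 && ran_after == 1 && !toy_bitmap) ? 0 : 1;
}
```

Calling toy_demo() from a main() exercises the whole sequence. The point of
the design is that the updater never waits on itself: it unpublishes the
pointer and hands the free to whoever runs after the readers are gone.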

I found another embarrassing bug in the RCU code, but it's been there
forever and can wait for after -rc0 (and it wasn't really a problem
until BQL-less MMIO was merged a couple days ago).

Paolo

> The stack trace looks like this:
> 
> (gdb) thread apply all bt
> 
> Thread 3 (Thread 0x7f06febfe700 (LWP 5717)):
> #0  0x00007f070e749f7d in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00007f070e745d32 in _L_lock_791 () from /lib64/libpthread.so.0
> #2  0x00007f070e745c38 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00007f070fed8bc9 in qemu_mutex_lock (mutex=mutex@entry=0x7f07107e6700 <rcu_gp_lock>) at util/qemu-thread-posix.c:73
> #4  0x00007f070fee7631 in synchronize_rcu () at util/rcu.c:129
> #5  0x00007f070fee77d9 in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:240
> #6  0x00007f070e743df5 in start_thread () from /lib64/libpthread.so.0
> #7  0x00007f07066ab1ad in clone () from /lib64/libc.so.6
> 
> Thread 2 (Thread 0x7f06f940f700 (LWP 5719)):
> #0  0x00007f070e749f7d in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00007f070e74c4ec in _L_cond_lock_792 () from /lib64/libpthread.so.0
> #2  0x00007f070e74c3c8 in __pthread_mutex_cond_lock () from /lib64/libpthread.so.0
> #3  0x00007f070e747795 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #4  0x00007f070fed8ca9 in qemu_cond_wait (cond=<optimized out>, mutex=mutex@entry=0x7f07103b0400 <qemu_global_mutex>) at util/qemu-thread-posix.c:132
> #5  0x00007f070fc4daab in qemu_tcg_cpu_thread_fn (arg=<optimized out>) at /home/kwolf/source/qemu/cpus.c:1050
> #6  0x00007f070e743df5 in start_thread () from /lib64/libpthread.so.0
> #7  0x00007f07066ab1ad in clone () from /lib64/libc.so.6
> 
> Thread 1 (Thread 0x7f070fb20bc0 (LWP 5716)):
> #0  0x00007f07066a5949 in syscall () from /lib64/libc.so.6
> #1  0x00007f070fed8fa2 in futex_wait (val=4294967295, ev=0x7f07107e66c0 <rcu_gp_event>) at util/qemu-thread-posix.c:301
> #2  qemu_event_wait (ev=ev@entry=0x7f07107e66c0 <rcu_gp_event>) at util/qemu-thread-posix.c:399
> #3  0x00007f070fee7713 in wait_for_readers () at util/rcu.c:120
> #4  synchronize_rcu () at util/rcu.c:149
> #5  0x00007f070fc6e0c2 in migration_end () at /home/kwolf/source/qemu/migration/ram.c:1033
> #6  0x00007f070fc6ef23 in ram_save_complete (f=0x7f07122f9aa0, opaque=<optimized out>) at /home/kwolf/source/qemu/migration/ram.c:1241
> #7  0x00007f070fc71d75 in qemu_savevm_state_complete (f=f@entry=0x7f07122f9aa0) at /home/kwolf/source/qemu/migration/savevm.c:836
> #8  0x00007f070fc7298e in qemu_savevm_state (errp=0x7ffe2a081ff8, f=0x7f07122f9aa0) at /home/kwolf/source/qemu/migration/savevm.c:945
> #9  hmp_savevm (mon=0x7f071233b500, qdict=<optimized out>) at /home/kwolf/source/qemu/migration/savevm.c:1353
> #10 0x00007f070fc552d0 in handle_hmp_command (mon=mon@entry=0x7f071233b500, cmdline=0x7f0712350197 "foo") at /home/kwolf/source/qemu/monitor.c:4058
> #11 0x00007f070fc56467 in monitor_command_cb (opaque=0x7f071233b500, cmdline=<optimized out>, readline_opaque=<optimized out>)
>     at /home/kwolf/source/qemu/monitor.c:5081
> #12 0x00007f070fee6dbf in readline_handle_byte (rs=0x7f0712350190, ch=<optimized out>) at util/readline.c:391
> #13 0x00007f070fc55387 in monitor_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /home/kwolf/source/qemu/monitor.c:5064
> #14 0x00007f070fd17b21 in qemu_chr_be_write (len=<optimized out>, buf=0x7ffe2a082640 "\n\331\367\325\001\200\377\377\200&\b*\376\177", s=0x7f0712304670)
>     at qemu-char.c:306
> #15 fd_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x7f0712304670) at qemu-char.c:1012
> #16 0x00007f070e04c9ba in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
> #17 0x00007f070fe61678 in glib_pollfds_poll () at main-loop.c:199
> #18 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:244
> #19 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:493
> #20 0x00007f070fc24e9e in main_loop () at vl.c:1901
> #21 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4649
> 
> Kevin
> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 644f52a..9c0bcfe 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -494,6 +494,7 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>>      return 1;
>>  }
>>
>> +/* Called with rcu_read_lock() to protect migration_bitmap */
>>  static inline
>>  ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
>>                                                   ram_addr_t start)
>> @@ -502,26 +503,31 @@ ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
>>      unsigned long nr = base + (start >> TARGET_PAGE_BITS);
>>      uint64_t mr_size = TARGET_PAGE_ALIGN(memory_region_size(mr));
>>      unsigned long size = base + (mr_size >> TARGET_PAGE_BITS);
>> +    unsigned long *bitmap;
>>
>>      unsigned long next;
>>
>> +    bitmap = atomic_rcu_read(&migration_bitmap);
>>      if (ram_bulk_stage && nr > base) {
>>          next = nr + 1;
>>      } else {
>> -        next = find_next_bit(migration_bitmap, size, nr);
>> +        next = find_next_bit(bitmap, size, nr);
>>      }
>>
>>      if (next < size) {
>> -        clear_bit(next, migration_bitmap);
>> +        clear_bit(next, bitmap);
>>          migration_dirty_pages--;
>>      }
>>      return (next - base) << TARGET_PAGE_BITS;
>>  }
>>
>> +/* Called with rcu_read_lock() to protect migration_bitmap */
>>  static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
>>  {
>> +    unsigned long *bitmap;
>> +    bitmap = atomic_rcu_read(&migration_bitmap);
>>      migration_dirty_pages +=
>> -        cpu_physical_memory_sync_dirty_bitmap(migration_bitmap, start, length);
>> +        cpu_physical_memory_sync_dirty_bitmap(bitmap, start, length);
>>  }
>>
>>
>> @@ -1017,10 +1023,15 @@ void free_xbzrle_decoded_buf(void)
>>
>>  static void migration_end(void)
>>  {
>> -    if (migration_bitmap) {
>> +    /* The caller holds the iothread lock or is in a bh, so there is
>> +     * no write race against migration_bitmap.
>> +     */
>> +    unsigned long *bitmap = migration_bitmap;
>> +    atomic_rcu_set(&migration_bitmap, NULL);
>> +    if (bitmap) {
>>          memory_global_dirty_log_stop();
>> -        g_free(migration_bitmap);
>> -        migration_bitmap = NULL;
>> +        synchronize_rcu();
>> +        g_free(bitmap);
>>      }
>>
>>      XBZRLE_cache_lock();
>> -- 
>> 2.4.3
>>
>>
> 
> 
