qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Juan Quintela <quintela@redhat.com>
To: qemu-devel@nongnu.org
Cc: Hailiang Zhang <zhang.zhanghailiang@huawei.com>,
	Li Zhijian <lizhijian@cn.fujitsu.com>,
	Juan Quintela <quintela@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Zhang Chen <chen.zhang@intel.com>, "Rao, Lei" <lei.rao@intel.com>
Subject: [PULL 06/11] Fixed SVM hang when do failover before PVM crash
Date: Wed,  3 Nov 2021 09:46:00 +0100	[thread overview]
Message-ID: <20211103084605.20027-7-quintela@redhat.com> (raw)
In-Reply-To: <20211103084605.20027-1-quintela@redhat.com>

From: "Rao, Lei" <lei.rao@intel.com>

This patch fixed as follows:
    Thread 1 (Thread 0x7f34ee738d80 (LWP 11212)):
    #0 __pthread_clockjoin_ex (threadid=139847152957184, thread_return=0x7f30b1febf30, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145
    #1 0x0000563401998e36 in qemu_thread_join (thread=0x563402d66610) at util/qemu-thread-posix.c:587
    #2 0x00005634017a79fa in process_incoming_migration_co (opaque=0x0) at migration/migration.c:502
    #3 0x00005634019b59c9 in coroutine_trampoline (i0=63395504, i1=22068) at util/coroutine-ucontext.c:115
    #4 0x00007f34ef860660 in ?? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91 from /lib/x86_64-linux-gnu/libc.so.6
    #5 0x00007f30b21ee730 in ?? ()
    #6 0x0000000000000000 in ?? ()

    Thread 13 (Thread 0x7f30b3dff700 (LWP 11747)):
    #0  __lll_lock_wait (futex=futex@entry=0x56340218ffa0 <qemu_global_mutex>, private=0) at lowlevellock.c:52
    #1  0x00007f34efa000a3 in _GI__pthread_mutex_lock (mutex=0x56340218ffa0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
    #2  0x0000563401997f99 in qemu_mutex_lock_impl (mutex=0x56340218ffa0 <qemu_global_mutex>, file=0x563401b7a80e "migration/colo.c", line=806) at util/qemu-thread-posix.c:78
    #3  0x0000563401407144 in qemu_mutex_lock_iothread_impl (file=0x563401b7a80e "migration/colo.c", line=806) at /home/workspace/colo-qemu/cpus.c:1899
    #4  0x00005634017ba8e8 in colo_process_incoming_thread (opaque=0x563402d664c0) at migration/colo.c:806
    #5  0x0000563401998b72 in qemu_thread_start (args=0x5634039f8370) at util/qemu-thread-posix.c:519
    #6  0x00007f34ef9fd609 in start_thread (arg=<optimized out>) at pthread_create.c:477
    #7  0x00007f34ef924293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

    The QEMU main thread is holding the lock:
    (gdb) p qemu_global_mutex
    $1 = {lock = {_data = {lock = 2, __count = 0, __owner = 11212, __nusers = 9, __kind = 0, __spins = 0, __elision = 0, __list = {_prev = 0x0, __next = 0x0}},
     __size = "\002\000\000\000\000\000\000\000\314+\000\000\t", '\000' <repeats 26 times>, __align = 2}, file = 0x563401c07e4b "util/main-loop.c", line = 240,
    initialized = true}

>From the call trace, we can see it is a deadlock bug. and the QEMU main thread holds the global mutex to wait until the COLO thread ends. and the colo thread
wants to acquire the global mutex, which will cause a deadlock. So, we should release the qemu_global_mutex before waiting colo thread ends.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index 3fb856f6e1..abaf6f9e3d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -590,8 +590,10 @@ static void process_incoming_migration_co(void *opaque)
         mis->have_colo_incoming_thread = true;
         qemu_coroutine_yield();
 
+        qemu_mutex_unlock_iothread();
         /* Wait checkpoint incoming thread exit before free resource */
         qemu_thread_join(&mis->colo_incoming_thread);
+        qemu_mutex_lock_iothread();
         /* We hold the global iothread lock, so it is safe here */
         colo_release_ram_cache();
     }
-- 
2.33.1



  parent reply	other threads:[~2021-11-03  8:55 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-03  8:45 [PULL 00/11] Migration 20211102 patches Juan Quintela
2021-11-03  8:45 ` [PULL 01/11] migration: provide an error message to migration_cancel() Juan Quintela
2021-11-03  8:45 ` [PULL 02/11] migration: initialise compression_counters for a new migration Juan Quintela
2021-11-03  8:45 ` [PULL 03/11] migration: Zero migration compression counters Juan Quintela
2021-11-03  8:45 ` [PULL 04/11] Some minor optimizations for COLO Juan Quintela
2021-11-03  8:45 ` [PULL 05/11] Fixed qemu crash when guest power off in COLO mode Juan Quintela
2021-11-03  8:46 ` Juan Quintela [this message]
2021-11-03  8:46 ` [PULL 07/11] colo: fixed 'Segmentation fault' when the simplex mode PVM poweroff Juan Quintela
2021-11-03  8:46 ` [PULL 08/11] Removed the qemu_fclose() in colo_process_incoming_thread Juan Quintela
2021-11-03  8:46 ` [PULL 09/11] Changed the last-mode to none of first start COLO Juan Quintela
2021-11-03  8:46 ` [PULL 10/11] colo: Don't dump colo cache if dump-guest-core=off Juan Quintela
2021-11-03  8:46 ` [PULL 11/11] Optimized the function of fill_connection_key Juan Quintela
2021-11-03 15:06   ` Juan Quintela
2021-11-04 10:32 ` [PULL 00/11] Migration 20211102 patches Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211103084605.20027-7-quintela@redhat.com \
    --to=quintela@redhat.com \
    --cc=chen.zhang@intel.com \
    --cc=dgilbert@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=lei.rao@intel.com \
    --cc=lizhijian@cn.fujitsu.com \
    --cc=qemu-devel@nongnu.org \
    --cc=zhang.zhanghailiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).