From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <1443437748.13911.2.camel@virtuozzo.com>
From: Igor Redko
Date: Mon, 28 Sep 2015 13:55:48 +0300
In-Reply-To: <560517FB.9080909@cn.fujitsu.com>
References: <56050489.9010306@cn.fujitsu.com> <1443172180-1005-1-git-send-email-den@openvz.org> <560517FB.9080909@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH v2 0/2] migration: fix deadlock
To: Wen Congyang
Cc: Juan Quintela, qemu-devel@nongnu.org, Anna Melekhova, Paolo Bonzini, Amit Shah, "Denis V. Lunev"

On Fri, 2015-09-25 at 17:46 +0800, Wen Congyang wrote:
> On 09/25/2015 05:09 PM, Denis V. Lunev wrote:
> > Release the qemu global mutex (QGM) before calling synchronize_rcu().
> > synchronize_rcu() waits for all readers to finish their critical
> > sections. There is at least one critical section in which we try
> > to get QGM (the critical section is in address_space_rw() and
> > prepare_mmio_access() is trying to acquire QGM).
> >
> > Both functions (migration_end() and migration_bitmap_extend())
> > are called from the main thread, which is holding QGM.
> >
> > Thus there is a race condition that ends in a deadlock:
> >     main thread                     working thread
> >     Lock QGM                              |
> >        |                      Call KVM_EXIT_IO handler
> >        |                                  |
> >        |                Open rcu reader's critical section
> >     Migration cleanup bh                  |
> >        |                                  |
> >     synchronize_rcu() is                  |
> >     waiting for readers                   |
> >        |            prepare_mmio_access() is waiting for QGM
> >          \                               /
> >                      deadlock
> >
> > Patches here are quick and dirty, compile-tested only to validate the
> > architectural approach.
> >
> > Igor, Anna, can you pls start your tests with these patches instead of your
> > original one. Thank you.
>
> Can you give me the backtrace of the working thread?
>
> I think it is very bad to wait for a lock in an rcu reader's critical section.

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f1ef113ccfd in __GI___pthread_mutex_lock (mutex=0x7f1ef4145ce0)
    at ../nptl/pthread_mutex_lock.c:80
#2  0x00007f1ef3c36546 in qemu_mutex_lock (mutex=0x7f1ef4145ce0)
    at util/qemu-thread-posix.c:73
#3  0x00007f1ef387ff46 in qemu_mutex_lock_iothread ()
    at /home/user/my_qemu/qemu/cpus.c:1170
#4  0x00007f1ef38514a2 in prepare_mmio_access (mr=0x7f1ef612f200)
    at /home/user/my_qemu/qemu/exec.c:2390
#5  0x00007f1ef385157e in address_space_rw (as=0x7f1ef40ec940, addr=49402,
    attrs=..., buf=0x7f1ef3f97000 "\001", len=1, is_write=true)
    at /home/user/my_qemu/qemu/exec.c:2425
#6  0x00007f1ef3897c53 in kvm_handle_io (port=49402, attrs=...,
    data=0x7f1ef3f97000, direction=1, size=1, count=1)
    at /home/user/my_qemu/qemu/kvm-all.c:1680
#7  0x00007f1ef3898144 in kvm_cpu_exec (cpu=0x7f1ef5010fc0)
    at /home/user/my_qemu/qemu/kvm-all.c:1849
#8  0x00007f1ef387fa91 in qemu_kvm_cpu_thread_fn (arg=0x7f1ef5010fc0)
    at /home/user/my_qemu/qemu/cpus.c:979
#9  0x00007f1ef113a6aa in start_thread (arg=0x7f1eef0b9700)
    at pthread_create.c:333
#10 0x00007f1ef0e6feed in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

>
> >
> > Signed-off-by: Denis V. Lunev
> > CC: Igor Redko
> > CC: Anna Melekhova
> > CC: Juan Quintela
> > CC: Amit Shah
> >
> > Denis V. Lunev (2):
> >   migration: bitmap_set is unnecessary as bitmap_new uses g_try_malloc0
> >   migration: fix deadlock
> >
> >  migration/ram.c | 45 ++++++++++++++++++++++++++++-----------------
> >  1 file changed, 28 insertions(+), 17 deletions(-)
> >
>
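
For illustration only, the cycle in the quoted diagram can be modelled as a
wait-for graph: each blocked thread points at the thread it is waiting on, and
a cycle means deadlock. This is a toy sketch in Python, not QEMU code. The
function names (find_cycle, build_wait_graph) and the thread labels are
hypothetical, and the model assumes that the only two waits involved are the
ones shown in the diagram and in the backtrace.

```python
# Toy wait-for-graph model of the deadlock discussed in this thread.
# Every name below is illustrative; none of it is taken from QEMU.


def find_cycle(waits_for):
    """Return the list of parties forming a wait cycle, or None.

    waits_for maps each blocked party to the single party it waits on.
    """
    for start in waits_for:
        seen = []
        node = start
        # Follow the chain of waits until it ends or revisits a node.
        while node in waits_for and node not in seen:
            seen.append(node)
            node = waits_for[node]
        if node in seen:
            return seen[seen.index(node):]
    return None


def build_wait_graph(main_holds_qgm):
    """Build the wait-for edges for the scenario in the quoted diagram."""
    waits_for = {
        # synchronize_rcu() in the main thread cannot return until the
        # working thread leaves its RCU read-side critical section.
        "main thread": "working thread",
    }
    if main_holds_qgm:
        # prepare_mmio_access() in the working thread blocks on the
        # global mutex, which the main thread still holds.
        waits_for["working thread"] = "main thread"
    return waits_for


if __name__ == "__main__":
    print("QGM held across synchronize_rcu():",
          find_cycle(build_wait_graph(main_holds_qgm=True)))
    print("QGM released before synchronize_rcu():",
          find_cycle(build_wait_graph(main_holds_qgm=False)))
```

In this model, holding QGM across synchronize_rcu() produces a two-node cycle,
while dropping QGM first removes the edge from the working thread back to the
main thread. The working thread can then take the mutex, leave its read-side
section, and let synchronize_rcu() return.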