qemu-devel.nongnu.org archive mirror
From: Igor Redko <redkoi@virtuozzo.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>,
	Anna Melekhova <annam@virtuozzo.com>,
	qemu-devel@nongnu.org, "Denis V. Lunev" <den@openvz.org>,
	Amit Shah <amit.shah@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 0/2] migration: fix deadlock
Date: Wed, 30 Sep 2015 17:28:05 +0300
Message-ID: <560BF175.4030503@virtuozzo.com>
In-Reply-To: <20150929084715.GA2684@work-vm>

On 29.09.2015 11:47, Dr. David Alan Gilbert wrote:
> * Igor Redko (redkoi@virtuozzo.com) wrote:
>> On Fri, 2015-09-25 at 17:46 +0800, Wen Congyang wrote:
>>> On 09/25/2015 05:09 PM, Denis V. Lunev wrote:
>>>> Release the qemu global mutex (QGM) before calling synchronize_rcu().
>>>> synchronize_rcu() waits for all readers to finish their critical
>>>> sections. There is at least one critical section in which we try
>>>> to get the QGM (the critical section is in address_space_rw(), and
>>>> prepare_mmio_access() tries to acquire the QGM).
>>>>
>>>> Both functions (migration_end() and migration_bitmap_extend())
>>>> are called from the main thread, which holds the QGM.
>>>>
>>>> Thus there is a race condition that ends in a deadlock:
>>>> main thread     working thread
>>>> Lock QGM                |
>>>> |             Call KVM_EXIT_IO handler
>>>> |                       |
>>>> |        Open rcu reader's critical section
>>>> Migration cleanup bh    |
>>>> |                       |
>>>> synchronize_rcu() is    |
>>>> waiting for readers     |
>>>> |            prepare_mmio_access() is waiting for QGM
>>>>    \                   /
>>>>           deadlock
>>>>
>>>> Patches here are quick and dirty, compile-tested only to validate the
>>>> architectural approach.
>>>>
>>>> Igor, Anna, could you please start your tests with these patches instead of
>>>> your original one? Thank you.
>>>
>>> Can you give me the backtrace of the working thread?
>>>
>>> I think it is very bad to wait for a lock inside an RCU reader's critical section.
>>
>> #0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
>> #1  0x00007f1ef113ccfd in __GI___pthread_mutex_lock (mutex=0x7f1ef4145ce0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
>> #2  0x00007f1ef3c36546 in qemu_mutex_lock (mutex=0x7f1ef4145ce0 <qemu_global_mutex>) at util/qemu-thread-posix.c:73
>> #3  0x00007f1ef387ff46 in qemu_mutex_lock_iothread () at /home/user/my_qemu/qemu/cpus.c:1170
>> #4  0x00007f1ef38514a2 in prepare_mmio_access (mr=0x7f1ef612f200) at /home/user/my_qemu/qemu/exec.c:2390
>> #5  0x00007f1ef385157e in address_space_rw (as=0x7f1ef40ec940 <address_space_io>, addr=49402, attrs=..., buf=0x7f1ef3f97000 "\001", len=1, is_write=true)
>>      at /home/user/my_qemu/qemu/exec.c:2425
>> #6  0x00007f1ef3897c53 in kvm_handle_io (port=49402, attrs=..., data=0x7f1ef3f97000, direction=1, size=1, count=1) at /home/user/my_qemu/qemu/kvm-all.c:1680
>> #7  0x00007f1ef3898144 in kvm_cpu_exec (cpu=0x7f1ef5010fc0) at /home/user/my_qemu/qemu/kvm-all.c:1849
>> #8  0x00007f1ef387fa91 in qemu_kvm_cpu_thread_fn (arg=0x7f1ef5010fc0) at /home/user/my_qemu/qemu/cpus.c:979
>> #9  0x00007f1ef113a6aa in start_thread (arg=0x7f1eef0b9700) at pthread_create.c:333
>> #10 0x00007f1ef0e6feed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>
> Do you have a test to run in the guest that easily triggers this?
>
> Dave

There are two ways to trigger this. Both of them need 2 hosts with 
qemu+libvirt (host0 and host1) configured for migration.

First way:
0. Create a VM on host0 and install CentOS 7.
1. Shut down the VM.
2. Start the VM (virsh start <VM_name>) and right after that start migration
to host1 (something like 'virsh migrate --live --verbose <VM_name>
"qemu+ssh://host1/system"').
3. Stop the migration after ~1 sec (after the migration process has
started, but before it has completed, for example when you see
"Migration: [  5 %]").

Result: deadlock. There is no response from the VM and none from the qemu
monitor (for example 'virsh qemu-monitor-command --hmp <VM_NAME> "info migrate"'
will hang indefinitely). This reproduces in about 9 out of 10 attempts.

Second way:
0. Create a VM with an e1000 network card on host0 and install CentOS 7.
1. Run iperf in the VM (or any other network load).
2. Start the migration.
3. Stop the migration before it completes.

For this approach the e1000 network card is essential because it generates
KVM_EXIT_MMIO.
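
Because the second way depends on the NIC model, it may be worth checking the
domain before starting. A minimal sketch, assuming the usual table printed by
'virsh domiflist' (two header lines, then one row per interface with the
model in the fourth column); the function name has_e1000_nic is made up for
this example:

```shell
#!/bin/sh
# Check that a libvirt domain has an e1000 NIC before trying the second way.
# Assumes 'virsh domiflist' prints: Interface Type Source Model MAC.

has_e1000_nic() {
    virsh domiflist "$1" |
        awk 'NR > 2 && $4 == "e1000" { found = 1 } END { exit !found }'
}

# Run only when explicitly requested, so the file can be sourced safely.
if [ "${1:-}" = "check" ]; then
    if has_e1000_nic "$2"; then
        echo "$2: e1000 NIC present"
    else
        echo "$2: no e1000 NIC found" >&2
        exit 1
    fi
fi
```

Usage: 'sh check_nic.sh check <VM_name>' (the file name is arbitrary).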

Igor
>>>
>>>>
>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>> CC: Igor Redko <redkoi@virtuozzo.com>
>>>> CC: Anna Melekhova <annam@virtuozzo.com>
>>>> CC: Juan Quintela <quintela@redhat.com>
>>>> CC: Amit Shah <amit.shah@redhat.com>
>>>>
>>>> Denis V. Lunev (2):
>>>>    migration: bitmap_set is unnecessary as bitmap_new uses g_try_malloc0
>>>>    migration: fix deadlock
>>>>
>>>>   migration/ram.c | 45 ++++++++++++++++++++++++++++-----------------
>>>>   1 file changed, 28 insertions(+), 17 deletions(-)
>>>>
>>>
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>

Thread overview: 18+ messages
2015-09-24 12:53 [Qemu-devel] [PATCH 1/1] migration: fix deadlock Denis V. Lunev
2015-09-25  1:21 ` Wen Congyang
2015-09-25  8:03   ` Denis V. Lunev
2015-09-25  8:23     ` Wen Congyang
2015-09-25  9:09       ` [Qemu-devel] [PATCH v2 0/2] " Denis V. Lunev
2015-09-25  9:09         ` [Qemu-devel] [PATCH 1/2] migration: bitmap_set is unnecessary as bitmap_new uses g_try_malloc0 Denis V. Lunev
2015-09-25  9:24           ` Wen Congyang
2015-09-25  9:31             ` Denis V. Lunev
2015-09-25  9:37               ` Wen Congyang
2015-09-25 10:05                 ` Denis V. Lunev
2015-09-25  9:09         ` [Qemu-devel] [PATCH 2/2] migration: fix deadlock Denis V. Lunev
2015-09-25  9:35           ` Wen Congyang
2015-09-25  9:46         ` [Qemu-devel] [PATCH v2 0/2] " Wen Congyang
2015-09-28 10:55           ` Igor Redko
2015-09-28 15:12             ` Igor Redko
2015-09-29  8:47             ` Dr. David Alan Gilbert
2015-09-30 14:28               ` Igor Redko [this message]
2015-09-29 15:32       ` [Qemu-devel] [PATCH 1/1] " Igor Redko
