* [PATCH 0/2] migration: multifd live migration improvement
@ 2021-11-26 15:31 Li Zhang
  2021-11-26 15:31 ` [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever Li Zhang
  2021-11-26 15:31 ` [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure Li Zhang
  0 siblings, 2 replies; 25+ messages in thread
From: Li Zhang @ 2021-11-26 15:31 UTC (permalink / raw)
  To: quintela, dgilbert, cfontana, qemu-devel; +Cc: Li Zhang

When testing live migration with multifd channels (8, 16, or a bigger
number) and using qemu -incoming (without "defer"), if a network error
occurs (for example, one that triggers the kernel's SYN flooding
detection), the migration fails and the guest hangs forever.

The test environment and command lines are as follows:

QEMU version: QEMU emulator version 6.2.91 (v6.2.0-rc1-47-gc5fbdd60cf)
Host OS: SLE 15 with kernel 5.14.5-1-default
Network card: mlx5 100Gbps
Network card: Intel Corporation I350 Gigabit (1Gbps)

Source:
qemu-system-x86_64 -M q35 -smp 32 -nographic \
  -serial telnet:10.156.208.153:4321,server,nowait \
  -m 4096 -enable-kvm -hda /var/lib/libvirt/images/openSUSE-15.3.img \
  -monitor stdio

Dest:
qemu-system-x86_64 -M q35 -smp 32 -nographic \
  -serial telnet:10.156.208.154:4321,server,nowait \
  -m 4096 -enable-kvm -hda /var/lib/libvirt/images/openSUSE-15.3.img \
  -monitor stdio \
  -incoming tcp:1.0.8.154:4000

(qemu) migrate_set_parameter max-bandwidth 100G
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_parameter multifd-channels 16

The guest hangs when executing the command: migrate -d tcp:1.0.8.154:4000.

If a network problem happens, the TCP ACK is not received by the
destination and the destination resets the connection with RST.

No.  Time      Source     Destination  Protocol  Length  Info
119  1.021169  1.0.8.153  1.0.8.154    TCP       1410    60166 → 4000 [PSH, ACK] Seq=65 Ack=1 Win=62720 Len=1344 TSval=1338662881 TSecr=1399531897
No.
Time      Source     Destination  Protocol  Length  Info
125  1.021181  1.0.8.154  1.0.8.153    TCP       54      4000 → 60166 [RST] Seq=1 Win=0 Len=0

kernel log:
[334520.229445] TCP: request_sock_TCP: Possible SYN flooding on port 4000. Sending cookies. Check SNMP counters.
[334562.994919] TCP: request_sock_TCP: Possible SYN flooding on port 4000. Sending cookies. Check SNMP counters.
[334695.519927] TCP: request_sock_TCP: Possible SYN flooding on port 4000. Sending cookies. Check SNMP counters.
[334734.689511] TCP: request_sock_TCP: Possible SYN flooding on port 4000. Sending cookies. Check SNMP counters.
[335687.740415] TCP: request_sock_TCP: Possible SYN flooding on port 4000. Sending cookies. Check SNMP counters.
[335730.013598] TCP: request_sock_TCP: Possible SYN flooding on port 4000. Sending cookies. Check SNMP counters.

There are two problems here:
1. When live migration fails, the guest hangs and no error is reported,
   even though an error has happened.
2. The network problem causes the live migration to fail when the channel
   number is 8, 16, or larger.

The two patches fix these two problems.

Li Zhang (2):
  multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid
    waiting forever
  migration: Set the socket backlog number to reduce chance of live
    migration failure

 migration/multifd.c | 2 +-
 migration/socket.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
2.31.1

^ permalink raw reply	[flat|nested] 25+ messages in thread
* [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever
  2021-11-26 15:31 [PATCH 0/2] migration: multifd live migration improvement Li Zhang
@ 2021-11-26 15:31 ` Li Zhang
  2021-11-26 15:49   ` Daniel P. Berrangé
  2021-11-26 16:33   ` Juan Quintela
  2021-11-26 15:31 ` [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure Li Zhang
  1 sibling, 2 replies; 25+ messages in thread
From: Li Zhang @ 2021-11-26 15:31 UTC (permalink / raw)
  To: quintela, dgilbert, cfontana, qemu-devel; +Cc: Li Zhang

When doing live migration with 8, 16, or more multifd channels, the
guest hangs in the presence of network errors such as missing TCP ACKs.

At the sender's side:
The main thread is blocked on qemu_thread_join: migrate_fd_cleanup is
called because one thread fails in qio_channel_write_all when the
network problem happens, while the other send threads are blocked in
sendmsg and cannot be terminated. So the main thread is blocked on
qemu_thread_join, waiting for those threads to terminate.
(gdb) bt
 0  0x00007f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
 1  0x000055cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627
 2  0x000055cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
 3  0x000055cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808
 4  0x000055cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850
 5  0x000055cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
 6  0x000055cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
 7  0x000055cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381
 8  0x000055cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311
 9  0x00007f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
10  0x000055cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
11  0x000055cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255
12  0x000055cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
13  0x000055cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
14  0x000055cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c578888, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50

At the receiver's side:
Several receive threads are not created successfully, and the receive
threads that have been created are blocked on qemu_sem_wait. No
semaphores are posted: migration is not started unless all the receive
threads are created successfully, so multifd_recv_sync_main, which posts
the semaphore to the receive threads, is never called. The receive
threads therefore wait on the semaphore and never return.

They shouldn't wait on the semaphore forever. Use qemu_sem_timedwait to
wait for a while, then return and close the channels, so the guest no
longer hangs.
(gdb) bt
 0  0x00007fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0
 1  0x00007fd61c43f158 in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0
 2  0x000056075916014a in qemu_sem_wait (sem=0x56075b6515f0) at ../util/qemu-thread-posix.c:358
 3  0x0000560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at ../migration/multifd.c:1112
 4  0x0000560759160598 in qemu_thread_start (args=0x56075befad00) at ../util/qemu-thread-posix.c:556
 5  0x00007fd61c43594a in start_thread () at /lib64/libpthread.so.0
 6  0x00007fd61c158d0f in clone () at /lib64/libc.so.6

Signed-off-by: Li Zhang <lizhang@suse.de>
---
 migration/multifd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 7c9deb1921..656239ca2a 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1109,7 +1109,7 @@ static void *multifd_recv_thread(void *opaque)
         if (flags & MULTIFD_FLAG_SYNC) {
             qemu_sem_post(&multifd_recv_state->sem_sync);
-            qemu_sem_wait(&p->sem_sync);
+            qemu_sem_timedwait(&p->sem_sync, 1000);
         }
     }
-- 
2.31.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-26 15:31 ` [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever Li Zhang @ 2021-11-26 15:49 ` Daniel P. Berrangé 2021-11-26 16:44 ` Li Zhang 2021-11-29 11:20 ` Dr. David Alan Gilbert 2021-11-26 16:33 ` Juan Quintela 1 sibling, 2 replies; 25+ messages in thread From: Daniel P. Berrangé @ 2021-11-26 15:49 UTC (permalink / raw) To: Li Zhang; +Cc: qemu-devel, cfontana, dgilbert, quintela On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > When doing live migration with multifd channels 8, 16 or larger number, > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > At sender's side: > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > is called because one thread fails on qio_channel_write_all when > the network problem happens and other send threads are blocked on sendmsg. > They could not be terminated. So the main thread is blocked on qemu_thread_join > to wait for the threads terminated. Isn't the right answer here to ensure we've called 'shutdown' on all the FDs, so that the threads get kicked out of sendmsg, before trying to join the thread ? Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever
  2021-11-26 15:49 ` Daniel P. Berrangé
@ 2021-11-26 16:44   ` Li Zhang
  2021-11-26 16:51     ` Daniel P. Berrangé
  0 siblings, 1 reply; 25+ messages in thread
From: Li Zhang @ 2021-11-26 16:44 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel, cfontana, dgilbert, quintela

On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>> When doing live migration with multifd channels 8, 16 or larger number,
>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>
>> At sender's side:
>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>> is called because one thread fails on qio_channel_write_all when
>> the network problem happens and other send threads are blocked on sendmsg.
>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>> to wait for the threads terminated.
> Isn't the right answer here to ensure we've called 'shutdown' on
> all the FDs, so that the threads get kicked out of sendmsg, before
> trying to join the thread ?

If we shut down the channels at the sender's side, that terminates the
send threads, but the receive threads are still waiting there.

From the receiver's side, if the semaphore wait times out, the channels
can eventually be terminated, and then the sender threads are terminated
as well. If we close the channels on both sides, I am not sure whether it
works correctly.

>
> Regards,
> Daniel

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-26 16:44 ` Li Zhang @ 2021-11-26 16:51 ` Daniel P. Berrangé 2021-11-26 17:00 ` Li Zhang 0 siblings, 1 reply; 25+ messages in thread From: Daniel P. Berrangé @ 2021-11-26 16:51 UTC (permalink / raw) To: Li Zhang; +Cc: qemu-devel, cfontana, dgilbert, quintela On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote: > > On 11/26/21 4:49 PM, Daniel P. Berrangé wrote: > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > > > When doing live migration with multifd channels 8, 16 or larger number, > > > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > > > > > At sender's side: > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > > > is called because one thread fails on qio_channel_write_all when > > > the network problem happens and other send threads are blocked on sendmsg. > > > They could not be terminated. So the main thread is blocked on qemu_thread_join > > > to wait for the threads terminated. > > Isn't the right answer here to ensure we've called 'shutdown' on > > all the FDs, so that the threads get kicked out of sendmsg, before > > trying to join the thread ? > > If we shutdown the channels at sender's side, it could terminate send > threads. The receive threads are still waiting there. > > From receiver's side, if wait semaphore is timeout, the channels can be > terminated at last. And the sender threads also be terminated at last. If something goes wrong on the sender side, the mgmt app should be tearing down the destination QEMU entirely, so I'm not sure we need to do anything special to deal with received threads. Using semtimedwait just feels risky because it will introduce false failures if the system/network is under high load such that the connections don't all establish within 1 second. 
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-26 16:51 ` Daniel P. Berrangé @ 2021-11-26 17:00 ` Li Zhang 2021-11-26 17:13 ` Daniel P. Berrangé 0 siblings, 1 reply; 25+ messages in thread From: Li Zhang @ 2021-11-26 17:00 UTC (permalink / raw) To: Daniel P. Berrangé; +Cc: quintela, qemu-devel, dgilbert, cfontana On 11/26/21 5:51 PM, Daniel P. Berrangé wrote: > On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote: >> On 11/26/21 4:49 PM, Daniel P. Berrangé wrote: >>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: >>>> When doing live migration with multifd channels 8, 16 or larger number, >>>> the guest hangs in the presence of the network errors such as missing TCP ACKs. >>>> >>>> At sender's side: >>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >>>> is called because one thread fails on qio_channel_write_all when >>>> the network problem happens and other send threads are blocked on sendmsg. >>>> They could not be terminated. So the main thread is blocked on qemu_thread_join >>>> to wait for the threads terminated. >>> Isn't the right answer here to ensure we've called 'shutdown' on >>> all the FDs, so that the threads get kicked out of sendmsg, before >>> trying to join the thread ? >> If we shutdown the channels at sender's side, it could terminate send >> threads. The receive threads are still waiting there. >> >> From receiver's side, if wait semaphore is timeout, the channels can be >> terminated at last. And the sender threads also be terminated at last. > If something goes wrong on the sender side, the mgmt app should be > tearing down the destination QEMU entirely, so I'm not sure we need > to do anything special to deal with received threads. > > Using semtimedwait just feels risky because it will introduce false > failures if the system/network is under high load such that the > connections don't all establish within 1 second. You are right. 
This may be a risk. I am not sure whether the interval is appropriate; we
could set it longer.

>
> Regards,
> Daniel

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-26 17:00 ` Li Zhang @ 2021-11-26 17:13 ` Daniel P. Berrangé 2021-11-26 17:44 ` Li Zhang 0 siblings, 1 reply; 25+ messages in thread From: Daniel P. Berrangé @ 2021-11-26 17:13 UTC (permalink / raw) To: Li Zhang; +Cc: quintela, qemu-devel, dgilbert, cfontana On Fri, Nov 26, 2021 at 06:00:24PM +0100, Li Zhang wrote: > > On 11/26/21 5:51 PM, Daniel P. Berrangé wrote: > > On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote: > > > On 11/26/21 4:49 PM, Daniel P. Berrangé wrote: > > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > > > > > When doing live migration with multifd channels 8, 16 or larger number, > > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > > > > > > > > > At sender's side: > > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > > > > > is called because one thread fails on qio_channel_write_all when > > > > > the network problem happens and other send threads are blocked on sendmsg. > > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join > > > > > to wait for the threads terminated. > > > > Isn't the right answer here to ensure we've called 'shutdown' on > > > > all the FDs, so that the threads get kicked out of sendmsg, before > > > > trying to join the thread ? > > > If we shutdown the channels at sender's side, it could terminate send > > > threads. The receive threads are still waiting there. > > > > > > From receiver's side, if wait semaphore is timeout, the channels can be > > > terminated at last. And the sender threads also be terminated at last. > > If something goes wrong on the sender side, the mgmt app should be > > tearing down the destination QEMU entirely, so I'm not sure we need > > to do anything special to deal with received threads. 
> > > > Using semtimedwait just feels risky because it will introduce false > > failures if the system/network is under high load such that the > > connections don't all establish within 1 second. > > You are right. This may be a risk. I am not sure if the interval is proper, > we can set longer. I don't think any kind of timeout is right in this context. There should be a sem_post() invoked in every scenario where we want to tear down the recv thread. That should only be the case when we see the other end of the connection close IMHO. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-26 17:13 ` Daniel P. Berrangé @ 2021-11-26 17:44 ` Li Zhang 0 siblings, 0 replies; 25+ messages in thread From: Li Zhang @ 2021-11-26 17:44 UTC (permalink / raw) To: Daniel P. Berrangé; +Cc: cfontana, qemu-devel, dgilbert, quintela On 11/26/21 6:13 PM, Daniel P. Berrangé wrote: > On Fri, Nov 26, 2021 at 06:00:24PM +0100, Li Zhang wrote: >> On 11/26/21 5:51 PM, Daniel P. Berrangé wrote: >>> On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote: >>>> On 11/26/21 4:49 PM, Daniel P. Berrangé wrote: >>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: >>>>>> When doing live migration with multifd channels 8, 16 or larger number, >>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs. >>>>>> >>>>>> At sender's side: >>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >>>>>> is called because one thread fails on qio_channel_write_all when >>>>>> the network problem happens and other send threads are blocked on sendmsg. >>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join >>>>>> to wait for the threads terminated. >>>>> Isn't the right answer here to ensure we've called 'shutdown' on >>>>> all the FDs, so that the threads get kicked out of sendmsg, before >>>>> trying to join the thread ? >>>> If we shutdown the channels at sender's side, it could terminate send >>>> threads. The receive threads are still waiting there. >>>> >>>> From receiver's side, if wait semaphore is timeout, the channels can be >>>> terminated at last. And the sender threads also be terminated at last. >>> If something goes wrong on the sender side, the mgmt app should be >>> tearing down the destination QEMU entirely, so I'm not sure we need >>> to do anything special to deal with received threads. 
>>>
>>> Using semtimedwait just feels risky because it will introduce false
>>> failures if the system/network is under high load such that the
>>> connections don't all establish within 1 second.
>> You are right. This may be a risk. I am not sure if the interval is proper,
>> we can set longer.
> I don't think any kind of timeout is right in this context. There should
> be a sem_post() invoked in every scenario where we want to tear down the
> recv thread. That should only be the case when we see the other end of
> the connection close IMHO.

OK, I need to think about that. It may be better to shut down the channels
from the sender's side.

Thanks
Li

>
> Regards,
> Daniel

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-26 15:49 ` Daniel P. Berrangé 2021-11-26 16:44 ` Li Zhang @ 2021-11-29 11:20 ` Dr. David Alan Gilbert 2021-11-29 13:37 ` Li Zhang 2021-11-29 14:58 ` Daniel P. Berrangé 1 sibling, 2 replies; 25+ messages in thread From: Dr. David Alan Gilbert @ 2021-11-29 11:20 UTC (permalink / raw) To: Daniel P. Berrangé; +Cc: qemu-devel, Li Zhang, cfontana, quintela * Daniel P. Berrangé (berrange@redhat.com) wrote: > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > > When doing live migration with multifd channels 8, 16 or larger number, > > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > > > At sender's side: > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > > is called because one thread fails on qio_channel_write_all when > > the network problem happens and other send threads are blocked on sendmsg. > > They could not be terminated. So the main thread is blocked on qemu_thread_join > > to wait for the threads terminated. > > Isn't the right answer here to ensure we've called 'shutdown' on > all the FDs, so that the threads get kicked out of sendmsg, before > trying to join the thread ? I agree a timeout is wrong here; there is no way to get a good timeout value. However, I'm a bit confused - we should be able to try a shutdown on the receive side using the 'yank' command. - that's what it's there for; Li does this solve your problem? multifd_load_cleanup already kicks sem_sync before trying to do a thread_join - so have we managed to trigger that on the receive side? Dave > > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > -- Dr. 
David Alan Gilbert / dgilbert@redhat.com / Manchester, UK ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-29 11:20 ` Dr. David Alan Gilbert @ 2021-11-29 13:37 ` Li Zhang 2021-11-29 14:50 ` Dr. David Alan Gilbert 2021-11-29 14:58 ` Daniel P. Berrangé 1 sibling, 1 reply; 25+ messages in thread From: Li Zhang @ 2021-11-29 13:37 UTC (permalink / raw) To: Dr. David Alan Gilbert, Daniel P. Berrangé Cc: quintela, qemu-devel, cfontana On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote: > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: >>> When doing live migration with multifd channels 8, 16 or larger number, >>> the guest hangs in the presence of the network errors such as missing TCP ACKs. >>> >>> At sender's side: >>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >>> is called because one thread fails on qio_channel_write_all when >>> the network problem happens and other send threads are blocked on sendmsg. >>> They could not be terminated. So the main thread is blocked on qemu_thread_join >>> to wait for the threads terminated. >> Isn't the right answer here to ensure we've called 'shutdown' on >> all the FDs, so that the threads get kicked out of sendmsg, before >> trying to join the thread ? > I agree a timeout is wrong here; there is no way to get a good timeout > value. > However, I'm a bit confused - we should be able to try a shutdown on the > receive side using the 'yank' command. - that's what it's there for; Li > does this solve your problem? No, I tried to register 'yank' on the receive side, the receive threads are still waiting there. It seems that on send side, 'yank' doesn't work either when the send threads are blocked. This may be not the case to call yank. I am not quite sure about it. > > multifd_load_cleanup already kicks sem_sync before trying to do a > thread_join - so have we managed to trigger that on the receive side? 
There is no problem with sem_sync in the function multifd_load_cleanup.
But it is not called in my case, because no errors are detected on the
receive side.

The problem is here:

void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
{
    MigrationIncomingState *mis = migration_incoming_get_current();
    Error *local_err = NULL;
    bool start_migration;

    ...

    if (!mis->from_src_file) {

    ...

    } else {
        /* Multiple connections */
        assert(migrate_use_multifd());
        start_migration = multifd_recv_new_channel(ioc, &local_err);
        if (local_err) {
            error_propagate(errp, local_err);
            return;
        }
    }

    if (start_migration) {
        migration_incoming_process();
    }
}

start_migration is always 0, and migration is not started because some
receive threads are not created. No errors are detected here and the main
process works well, but the receive threads are all waiting for the
semaphore.

It's hard to know whether the receive threads were created. If we can
find a way to check whether any receive threads were not created, we can
kick sem_sync and do the cleanup.

From the source code, the thread is created when the QIO channel detects
something via the GIO watch, if I understand correctly. If nothing is
detected, socket_accept_incoming_migration won't be called and the thread
will not be created.
socket_start_incoming_migration_internal ->
    qio_net_listener_set_client_func_full(listener,
                                          socket_accept_incoming_migration,
                                          NULL, NULL,
                                          g_main_context_get_thread_default());

qio_net_listener_set_client_func_full ->
    qio_channel_add_watch_source(
        QIO_CHANNEL(listener->sioc[i]), G_IO_IN,
        qio_net_listener_channel_func,
        listener, (GDestroyNotify)object_unref, context);

socket_accept_incoming_migration ->
migration_channel_process_incoming ->
migration_ioc_process_incoming ->
multifd_recv_new_channel ->
    qemu_thread_create(&p->thread, p->name,
                       multifd_recv_thread, p,
                       QEMU_THREAD_JOINABLE);

>
> Dave
>
>> Regards,
>> Daniel
>> --
>> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
>> |: https://libvirt.org -o- https://fstop138.berrange.com :|
>> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>>

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-29 13:37 ` Li Zhang @ 2021-11-29 14:50 ` Dr. David Alan Gilbert 2021-11-29 15:34 ` Li Zhang 2021-12-01 12:11 ` Li Zhang 0 siblings, 2 replies; 25+ messages in thread From: Dr. David Alan Gilbert @ 2021-11-29 14:50 UTC (permalink / raw) To: Li Zhang; +Cc: quintela, Daniel P. Berrangé, qemu-devel, cfontana * Li Zhang (lizhang@suse.de) wrote: > > On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote: > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > > > > When doing live migration with multifd channels 8, 16 or larger number, > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > > > > > > > At sender's side: > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > > > > is called because one thread fails on qio_channel_write_all when > > > > the network problem happens and other send threads are blocked on sendmsg. > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join > > > > to wait for the threads terminated. > > > Isn't the right answer here to ensure we've called 'shutdown' on > > > all the FDs, so that the threads get kicked out of sendmsg, before > > > trying to join the thread ? > > I agree a timeout is wrong here; there is no way to get a good timeout > > value. > > However, I'm a bit confused - we should be able to try a shutdown on the > > receive side using the 'yank' command. - that's what it's there for; Li > > does this solve your problem? > > No, I tried to register 'yank' on the receive side, the receive threads are > still waiting there. > > It seems that on send side, 'yank' doesn't work either when the send threads > are blocked. > > This may be not the case to call yank. I am not quite sure about it. We need to fix that; 'yank' should be able to recover from any network issue. 
If it's not working we need to understand why. > > > > multifd_load_cleanup already kicks sem_sync before trying to do a > > thread_join - so have we managed to trigger that on the receive side? > > There is no problem with sem_sync in function multifd_load_cleanup. > > But it is not called in my case, because no errors are detected on the > receive side. If you're getting TCP errors why aren't you seeing any errors on the receive side? > The problem is here: > > void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) > { > MigrationIncomingState *mis = migration_incoming_get_current(); > Error *local_err = NULL; > bool start_migration; > > ... > > if (!mis->from_src_file) { > > ... > > } else { > /* Multiple connections */ > assert(migrate_use_multifd()); > start_migration = multifd_recv_new_channel(ioc, &local_err); > if (local_err) { > error_propagate(errp, local_err); > return; > } > } > if (start_migration) { > migration_incoming_process(); > } > } > > start_migration is always 0, and migration is not started because some > receive threads are not created. > > No errors are detected here and the main process works well but receive > threads are all waiting for semaphore. > > It's hard to know if the receive threads are not created. If we can find a > way to check if any receive threads So is this only a problem for network issues that happen during startup, before all the threads have been created? Dave > are not created, we can kick the sem_sync and do cleanup. > > From the source code, the thread will be created when QIO channel detects > something by GIO watch if I understand correctly. > > If nothing is detected, socket_accept_icoming_migration won't be called, the > thread will not be created. 
> > socket_start_incoming_migration_internal -> > > qio_net_listener_set_client_func_full(listener, > socket_accept_incoming_migration, > NULL, NULL, > g_main_context_get_thread_default()); > > qio_net_listener_set_client_func_full -> > > qio_channel_add_watch_source( > QIO_CHANNEL(listener->sioc[i]), G_IO_IN, > qio_net_listener_channel_func, > listener, (GDestroyNotify)object_unref, context); > > socket_accept_incoming_migration -> > > migration_channel_process_incoming -> > > migration_ioc_process_incoming -> > > multifd_recv_new_channel -> > > qemu_thread_create(&p->thread, p->name, > multifd_recv_thread, p, > QEMU_THREAD_JOINABLE); > > > > > Dave > > > > > Regards, > > > Daniel > > > -- > > > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > > > |: https://libvirt.org -o- https://fstop138.berrange.com :| > > > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > > > > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-29 14:50 ` Dr. David Alan Gilbert @ 2021-11-29 15:34 ` Li Zhang 2021-12-01 12:11 ` Li Zhang 1 sibling, 0 replies; 25+ messages in thread From: Li Zhang @ 2021-11-29 15:34 UTC (permalink / raw) To: Dr. David Alan Gilbert Cc: quintela, Daniel P. Berrangé, qemu-devel, cfontana On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote: > * Li Zhang (lizhang@suse.de) wrote: >> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote: >>> * Daniel P. Berrangé (berrange@redhat.com) wrote: >>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: >>>>> When doing live migration with multifd channels 8, 16 or larger number, >>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs. >>>>> >>>>> At sender's side: >>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >>>>> is called because one thread fails on qio_channel_write_all when >>>>> the network problem happens and other send threads are blocked on sendmsg. >>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join >>>>> to wait for the threads terminated. >>>> Isn't the right answer here to ensure we've called 'shutdown' on >>>> all the FDs, so that the threads get kicked out of sendmsg, before >>>> trying to join the thread ? >>> I agree a timeout is wrong here; there is no way to get a good timeout >>> value. >>> However, I'm a bit confused - we should be able to try a shutdown on the >>> receive side using the 'yank' command. - that's what it's there for; Li >>> does this solve your problem? >> No, I tried to register 'yank' on the receive side, the receive threads are >> still waiting there. >> >> It seems that on send side, 'yank' doesn't work either when the send threads >> are blocked. >> >> This may be not the case to call yank. I am not quite sure about it. > We need to fix that; 'yank' should be able to recover from any network > issue. 
If it's not working we need to understand why. OK, I will look into it. > >>> multifd_load_cleanup already kicks sem_sync before trying to do a >>> thread_join - so have we managed to trigger that on the receive side? >> There is no problem with sem_sync in function multifd_load_cleanup. >> >> But it is not called in my case, because no errors are detected on the >> receive side. > If you're getting TCP errors why aren't you seeing any errors on the > receive side? That's a good point. I need to find out it. > >> The problem is here: >> >> void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) >> { >> MigrationIncomingState *mis = migration_incoming_get_current(); >> Error *local_err = NULL; >> bool start_migration; >> >> ... >> >> if (!mis->from_src_file) { >> >> ... >> >> } else { >> /* Multiple connections */ >> assert(migrate_use_multifd()); >> start_migration = multifd_recv_new_channel(ioc, &local_err); >> if (local_err) { >> error_propagate(errp, local_err); >> return; >> } >> } >> if (start_migration) { >> migration_incoming_process(); >> } >> } >> >> start_migration is always 0, and migration is not started because some >> receive threads are not created. >> >> No errors are detected here and the main process works well but receive >> threads are all waiting for semaphore. >> >> It's hard to know if the receive threads are not created. If we can find a >> way to check if any receive threads > So is this only a problem for network issues that happen during startup, > before all the threads have been created? Yes, it is. > > Dave > >> are not created, we can kick the sem_sync and do cleanup. >> >> From the source code, the thread will be created when QIO channel detects >> something by GIO watch if I understand correctly. >> >> If nothing is detected, socket_accept_icoming_migration won't be called, the >> thread will not be created. 
>> >> socket_start_incoming_migration_internal -> >> >> qio_net_listener_set_client_func_full(listener, >> socket_accept_incoming_migration, >> NULL, NULL, >> g_main_context_get_thread_default()); >> >> qio_net_listener_set_client_func_full -> >> >> qio_channel_add_watch_source( >> QIO_CHANNEL(listener->sioc[i]), G_IO_IN, >> qio_net_listener_channel_func, >> listener, (GDestroyNotify)object_unref, context); >> >> socket_accept_incoming_migration -> >> >> migration_channel_process_incoming -> >> >> migration_ioc_process_incoming -> >> >> multifd_recv_new_channel -> >> >> qemu_thread_create(&p->thread, p->name, >> multifd_recv_thread, p, >> QEMU_THREAD_JOINABLE); >> >>> Dave >>> >>>> Regards, >>>> Daniel >>>> -- >>>> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| >>>> |: https://libvirt.org -o- https://fstop138.berrange.com :| >>>> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| >>>> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-29 14:50 ` Dr. David Alan Gilbert 2021-11-29 15:34 ` Li Zhang @ 2021-12-01 12:11 ` Li Zhang 2021-12-01 12:22 ` Daniel P. Berrangé 1 sibling, 1 reply; 25+ messages in thread From: Li Zhang @ 2021-12-01 12:11 UTC (permalink / raw) To: Dr. David Alan Gilbert Cc: quintela, Daniel P. Berrangé, qemu-devel, cfontana On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote: > * Li Zhang (lizhang@suse.de) wrote: >> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote: >>> * Daniel P. Berrangé (berrange@redhat.com) wrote: >>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: >>>>> When doing live migration with multifd channels 8, 16 or larger number, >>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs. >>>>> >>>>> At sender's side: >>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >>>>> is called because one thread fails on qio_channel_write_all when >>>>> the network problem happens and other send threads are blocked on sendmsg. >>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join >>>>> to wait for the threads terminated. >>>> Isn't the right answer here to ensure we've called 'shutdown' on >>>> all the FDs, so that the threads get kicked out of sendmsg, before >>>> trying to join the thread ? >>> I agree a timeout is wrong here; there is no way to get a good timeout >>> value. >>> However, I'm a bit confused - we should be able to try a shutdown on the >>> receive side using the 'yank' command. - that's what it's there for; Li >>> does this solve your problem? >> No, I tried to register 'yank' on the receive side, the receive threads are >> still waiting there. >> >> It seems that on send side, 'yank' doesn't work either when the send threads >> are blocked. >> >> This may be not the case to call yank. I am not quite sure about it. 
> We need to fix that; 'yank' should be able to recover from any network > issue. If it's not working we need to understand why. Hi Dr. David, On the receive side, I register 'yank' and it is called. But it is just to shut down the channels; it doesn't fix the problem of the receive threads that are waiting on the semaphore. So the receive threads are still waiting there. On the send side, the main process is blocked on qemu_thread_join(), and when I tried the 'yank' command via QMP, it was not handled. So QMP doesn't work and yank doesn't work. I think it's necessary to shut down the channels before terminating the threads, which can prevent the send threads from being blocked on sendmsg. From investigating the source code of yank, it only shuts down the channels, so live migration may recover when something goes wrong in the I/O channels. But if the threads are blocked on semaphores, locks or something else, it can't be recovered by the yank command. > >>> multifd_load_cleanup already kicks sem_sync before trying to do a > thread_join - so have we managed to trigger that on the receive side? >> There is no problem with sem_sync in function multifd_load_cleanup. >> >> But it is not called in my case, because no errors are detected on the >> receive side. > If you're getting TCP errors why aren't you seeing any errors on the > receive side? From the kernel log, TCP SYN flooding is detected. This causes TCP ACKs to go missing, and the receive side just sends a RST to reset the connection forcibly, without errors. If TCP SYN flooding detection is disabled, the problem can be avoided and it can continue to transfer the data, and live migration works; but I don't think TCP SYN flooding detection should be disabled. On the send side, this causes a failure when writing the qio channels, and migration_save_cleanup is called. 
Thank Li > >> The problem is here: >> >> void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) >> { >> MigrationIncomingState *mis = migration_incoming_get_current(); >> Error *local_err = NULL; >> bool start_migration; >> >> ... >> >> if (!mis->from_src_file) { >> >> ... >> >> } else { >> /* Multiple connections */ >> assert(migrate_use_multifd()); >> start_migration = multifd_recv_new_channel(ioc, &local_err); >> if (local_err) { >> error_propagate(errp, local_err); >> return; >> } >> } >> if (start_migration) { >> migration_incoming_process(); >> } >> } >> >> start_migration is always 0, and migration is not started because some >> receive threads are not created. >> >> No errors are detected here and the main process works well but receive >> threads are all waiting for semaphore. >> >> It's hard to know if the receive threads are not created. If we can find a >> way to check if any receive threads > So is this only a problem for network issues that happen during startup, > before all the threads have been created? > > Dave > >> are not created, we can kick the sem_sync and do cleanup. >> >> From the source code, the thread will be created when QIO channel detects >> something by GIO watch if I understand correctly. >> >> If nothing is detected, socket_accept_icoming_migration won't be called, the >> thread will not be created. 
>> >> socket_start_incoming_migration_internal -> >> >> qio_net_listener_set_client_func_full(listener, >> socket_accept_incoming_migration, >> NULL, NULL, >> g_main_context_get_thread_default()); >> >> qio_net_listener_set_client_func_full -> >> >> qio_channel_add_watch_source( >> QIO_CHANNEL(listener->sioc[i]), G_IO_IN, >> qio_net_listener_channel_func, >> listener, (GDestroyNotify)object_unref, context); >> >> socket_accept_incoming_migration -> >> >> migration_channel_process_incoming -> >> >> migration_ioc_process_incoming -> >> >> multifd_recv_new_channel -> >> >> qemu_thread_create(&p->thread, p->name, >> multifd_recv_thread, p, >> QEMU_THREAD_JOINABLE); >> >>> Dave >>> >>>> Regards, >>>> Daniel >>>> -- >>>> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| >>>> |: https://libvirt.org -o- https://fstop138.berrange.com :| >>>> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| >>>> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-12-01 12:11 ` Li Zhang @ 2021-12-01 12:22 ` Daniel P. Berrangé 2021-12-01 13:42 ` Li Zhang 0 siblings, 1 reply; 25+ messages in thread From: Daniel P. Berrangé @ 2021-12-01 12:22 UTC (permalink / raw) To: Li Zhang; +Cc: cfontana, quintela, Dr. David Alan Gilbert, qemu-devel On Wed, Dec 01, 2021 at 01:11:13PM +0100, Li Zhang wrote: > > On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote: > > * Li Zhang (lizhang@suse.de) wrote: > > > On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote: > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > > > > > > When doing live migration with multifd channels 8, 16 or larger number, > > > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > > > > > > > > > > > At sender's side: > > > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > > > > > > is called because one thread fails on qio_channel_write_all when > > > > > > the network problem happens and other send threads are blocked on sendmsg. > > > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join > > > > > > to wait for the threads terminated. > > > > > Isn't the right answer here to ensure we've called 'shutdown' on > > > > > all the FDs, so that the threads get kicked out of sendmsg, before > > > > > trying to join the thread ? > > > > I agree a timeout is wrong here; there is no way to get a good timeout > > > > value. > > > > However, I'm a bit confused - we should be able to try a shutdown on the > > > > receive side using the 'yank' command. - that's what it's there for; Li > > > > does this solve your problem? > > > No, I tried to register 'yank' on the receive side, the receive threads are > > > still waiting there. 
> > > > > > It seems that on send side, 'yank' doesn't work either when the send threads > > > are blocked. > > > > > > This may be not the case to call yank. I am not quite sure about it. > > We need to fix that; 'yank' should be able to recover from any network > > issue. If it's not working we need to understand why. > > Hi Dr. David, > > On the receive side, I register 'yank' and it is called. But it is just to > shut down the channels, > > it couldn't fix the problem of the receive threads which are waiting for the > semaphore. > > So the receive threads are still waiting there. > > On the send side, the main process is blocked on qemu_thread_join(), when I > tried the 'yank' > > command with QMP, it is not handled. So the QMP doesn't work and yank > doesn't work. IOW, there is a bug in QEMU on the send side. It should not be calling qemu_thread_join() from the main thread, unless it is extremely confident that the thread in question has already finished. You seem to be showing that the thread(s) are still running, so we need to understand why that is the case, and why the main thread still decided to try to join these threads which haven't finished. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-12-01 12:22 ` Daniel P. Berrangé @ 2021-12-01 13:42 ` Li Zhang 2021-12-01 14:09 ` Daniel P. Berrangé 0 siblings, 1 reply; 25+ messages in thread From: Li Zhang @ 2021-12-01 13:42 UTC (permalink / raw) To: Daniel P. Berrangé Cc: qemu-devel, cfontana, Dr. David Alan Gilbert, quintela On 12/1/21 1:22 PM, Daniel P. Berrangé wrote: > On Wed, Dec 01, 2021 at 01:11:13PM +0100, Li Zhang wrote: >> On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote: >>> * Li Zhang (lizhang@suse.de) wrote: >>>> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote: >>>>> * Daniel P. Berrangé (berrange@redhat.com) wrote: >>>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: >>>>>>> When doing live migration with multifd channels 8, 16 or larger number, >>>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs. >>>>>>> >>>>>>> At sender's side: >>>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >>>>>>> is called because one thread fails on qio_channel_write_all when >>>>>>> the network problem happens and other send threads are blocked on sendmsg. >>>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join >>>>>>> to wait for the threads terminated. >>>>>> Isn't the right answer here to ensure we've called 'shutdown' on >>>>>> all the FDs, so that the threads get kicked out of sendmsg, before >>>>>> trying to join the thread ? >>>>> I agree a timeout is wrong here; there is no way to get a good timeout >>>>> value. >>>>> However, I'm a bit confused - we should be able to try a shutdown on the >>>>> receive side using the 'yank' command. - that's what it's there for; Li >>>>> does this solve your problem? >>>> No, I tried to register 'yank' on the receive side, the receive threads are >>>> still waiting there. 
>>>> >>>> It seems that on send side, 'yank' doesn't work either when the send threads >>>> are blocked. >>>> >>>> This may be not the case to call yank. I am not quite sure about it. >>> We need to fix that; 'yank' should be able to recover from any network >>> issue. If it's not working we need to understand why. >> Hi Dr. David, >> >> On the receive side, I register 'yank' and it is called. But it is just to >> shut down the channels, >> >> it couldn't fix the problem of the receive threads which are waiting for the >> semaphore. >> >> So the receive threads are still waiting there. >> >> On the send side, the main process is blocked on qemu_thread_join(), when I >> tried the 'yank' >> >> command with QMP, it is not handled. So the QMP doesn't work and yank >> doesn't work. > IOW, there is a bug in QEMU on the send side. It should not be calling > qemu_thread_join() from the main thread, unless it is extremely > confident that the thread in question has already finished. > > You seem to be showing that the thread(s) are still running, so we > need to understand why that is the case, and why the main thread > still decided to try to join these threads which haven't finished. Some threads are still running, but one thread fails in qio_channel_write_all. In migration_thread(), it detects the error here: thr_error = migration_detect_error(s); if (thr_error == MIG_THR_ERR_FATAL) { /* Stop migration */ break; It will then stop the migration and clean up. > > Regards, > Daniel ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-12-01 13:42 ` Li Zhang @ 2021-12-01 14:09 ` Daniel P. Berrangé 2021-12-01 14:15 ` Li Zhang 0 siblings, 1 reply; 25+ messages in thread From: Daniel P. Berrangé @ 2021-12-01 14:09 UTC (permalink / raw) To: Li Zhang; +Cc: qemu-devel, cfontana, Dr. David Alan Gilbert, quintela On Wed, Dec 01, 2021 at 02:42:04PM +0100, Li Zhang wrote: > > On 12/1/21 1:22 PM, Daniel P. Berrangé wrote: > > On Wed, Dec 01, 2021 at 01:11:13PM +0100, Li Zhang wrote: > > > On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote: > > > > * Li Zhang (lizhang@suse.de) wrote: > > > > > On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote: > > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > > > > > > > > When doing live migration with multifd channels 8, 16 or larger number, > > > > > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > > > > > > > > > > > > > > > At sender's side: > > > > > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > > > > > > > > is called because one thread fails on qio_channel_write_all when > > > > > > > > the network problem happens and other send threads are blocked on sendmsg. > > > > > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join > > > > > > > > to wait for the threads terminated. > > > > > > > Isn't the right answer here to ensure we've called 'shutdown' on > > > > > > > all the FDs, so that the threads get kicked out of sendmsg, before > > > > > > > trying to join the thread ? > > > > > > I agree a timeout is wrong here; there is no way to get a good timeout > > > > > > value. > > > > > > However, I'm a bit confused - we should be able to try a shutdown on the > > > > > > receive side using the 'yank' command. - that's what it's there for; Li > > > > > > does this solve your problem? 
> > > > > No, I tried to register 'yank' on the receive side, the receive threads are > > > > > still waiting there. > > > > > > > > > > It seems that on send side, 'yank' doesn't work either when the send threads > > > > > are blocked. > > > > > > > > > > This may be not the case to call yank. I am not quite sure about it. > > > > We need to fix that; 'yank' should be able to recover from any network > > > > issue. If it's not working we need to understand why. > > > Hi Dr. David, > > > > > > On the receive side, I register 'yank' and it is called. But it is just to > > > shut down the channels, > > > > > > it couldn't fix the problem of the receive threads which are waiting for the > > > semaphore. > > > > > > So the receive threads are still waiting there. > > > > > > On the send side, the main process is blocked on qemu_thread_join(), when I > > > tried the 'yank' > > > > > > command with QMP, it is not handled. So the QMP doesn't work and yank > > > doesn't work. > > IOW, there is a bug in QEMU on the send side. It should not be calling > > qemu_thread_join() from the main thread, unless it is extremely > > confident that the thread in question has already finished. > > > > You seem to be showing that the thread(s) are still running, so we > > need to understand why that is the case, and why the main thread > > still decided to try to join these threads which haven't finished. > > Some threads are running. But there is one thread fails to > qio_channel_write_all. > > In migration_thread(), it detects an error here: > > thr_error = migration_detect_error(s); > if (thr_error == MIG_THR_ERR_FATAL) { > /* Stop migration */ > break; > > It will stop migration and cleanup. Those threads which are still running need to be made to terminate before trying to join them A quick glance at multifd_send_terminate_threads() makes me suspect multifd shutdown is not reliable. It is merely setting some boolean flags and posting to a semaphore. 
It is doing nothing to shut down the socket associated with each thread, so the threads can still be waiting in an I/O call. IMHO multifd_send_terminate_threads needs to call qio_channel_shutdown(p->c, QIO_CHANNEL_SHUTDOWN_BOTH). Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-12-01 14:09 ` Daniel P. Berrangé @ 2021-12-01 14:15 ` Li Zhang 0 siblings, 0 replies; 25+ messages in thread From: Li Zhang @ 2021-12-01 14:15 UTC (permalink / raw) To: Daniel P. Berrangé Cc: qemu-devel, cfontana, Dr. David Alan Gilbert, quintela On 12/1/21 3:09 PM, Daniel P. Berrangé wrote: > On Wed, Dec 01, 2021 at 02:42:04PM +0100, Li Zhang wrote: >> On 12/1/21 1:22 PM, Daniel P. Berrangé wrote: >>> On Wed, Dec 01, 2021 at 01:11:13PM +0100, Li Zhang wrote: >>>> On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote: >>>>> * Li Zhang (lizhang@suse.de) wrote: >>>>>> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote: >>>>>>> * Daniel P. Berrangé (berrange@redhat.com) wrote: >>>>>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: >>>>>>>>> When doing live migration with multifd channels 8, 16 or larger number, >>>>>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs. >>>>>>>>> >>>>>>>>> At sender's side: >>>>>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >>>>>>>>> is called because one thread fails on qio_channel_write_all when >>>>>>>>> the network problem happens and other send threads are blocked on sendmsg. >>>>>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join >>>>>>>>> to wait for the threads terminated. >>>>>>>> Isn't the right answer here to ensure we've called 'shutdown' on >>>>>>>> all the FDs, so that the threads get kicked out of sendmsg, before >>>>>>>> trying to join the thread ? >>>>>>> I agree a timeout is wrong here; there is no way to get a good timeout >>>>>>> value. >>>>>>> However, I'm a bit confused - we should be able to try a shutdown on the >>>>>>> receive side using the 'yank' command. - that's what it's there for; Li >>>>>>> does this solve your problem? 
>>>>>> No, I tried to register 'yank' on the receive side, the receive threads are >>>>>> still waiting there. >>>>>> >>>>>> It seems that on send side, 'yank' doesn't work either when the send threads >>>>>> are blocked. >>>>>> >>>>>> This may be not the case to call yank. I am not quite sure about it. >>>>> We need to fix that; 'yank' should be able to recover from any network >>>>> issue. If it's not working we need to understand why. >>>> Hi Dr. David, >>>> >>>> On the receive side, I register 'yank' and it is called. But it is just to >>>> shut down the channels, >>>> >>>> it couldn't fix the problem of the receive threads which are waiting for the >>>> semaphore. >>>> >>>> So the receive threads are still waiting there. >>>> >>>> On the send side, the main process is blocked on qemu_thread_join(), when I >>>> tried the 'yank' >>>> >>>> command with QMP, it is not handled. So the QMP doesn't work and yank >>>> doesn't work. >>> IOW, there is a bug in QEMU on the send side. It should not be calling >>> qemu_thread_join() from the main thread, unless it is extremely >>> confident that the thread in question has already finished. >>> >>> You seem to be showing that the thread(s) are still running, so we >>> need to understand why that is the case, and why the main thread >>> still decided to try to join these threads which haven't finished. >> Some threads are running. But there is one thread fails to >> qio_channel_write_all. >> >> In migration_thread(), it detects an error here: >> >> thr_error = migration_detect_error(s); >> if (thr_error == MIG_THR_ERR_FATAL) { >> /* Stop migration */ >> break; >> >> It will stop migration and cleanup. > Those threads which are still running need to be made to > terminate before trying to join them > > A quick glance at multifd_send_terminate_threads() makes me > suspect multifd shutdown is not reliable. > > It is merely setting some boolean flags and posting to a > semaphore. 
It is doing nothing to shutdown the socket > associated with each thread, so the threads can still be > waiting in an I/O call. IMHO multifd_send_terminate_threads > needs to call qio_chanel_shutdown(p->c, QIO_CHANNEL_SHUTDOWN_BOTH) Agree with you. > > Regards, > Daniel ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-29 11:20 ` Dr. David Alan Gilbert 2021-11-29 13:37 ` Li Zhang @ 2021-11-29 14:58 ` Daniel P. Berrangé 2021-11-29 15:49 ` Dr. David Alan Gilbert 1 sibling, 1 reply; 25+ messages in thread From: Daniel P. Berrangé @ 2021-11-29 14:58 UTC (permalink / raw) To: Dr. David Alan Gilbert; +Cc: qemu-devel, Li Zhang, cfontana, quintela On Mon, Nov 29, 2021 at 11:20:08AM +0000, Dr. David Alan Gilbert wrote: > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > > > When doing live migration with multifd channels 8, 16 or larger number, > > > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > > > > > At sender's side: > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > > > is called because one thread fails on qio_channel_write_all when > > > the network problem happens and other send threads are blocked on sendmsg. > > > They could not be terminated. So the main thread is blocked on qemu_thread_join > > > to wait for the threads terminated. > > > > Isn't the right answer here to ensure we've called 'shutdown' on > > all the FDs, so that the threads get kicked out of sendmsg, before > > trying to join the thread ? > > I agree a timeout is wrong here; there is no way to get a good timeout > value. > However, I'm a bit confused - we should be able to try a shutdown on the > receive side using the 'yank' command. - that's what it's there for; Li > does this solve your problem? Why do we even need to use 'yank' on the receive side ? Until migration has switched over from src to dst, the receive side is discardable and the whole process can just be terminated with kill(SIGTERM/SIGKILL). On the source side 'yank' is needed, because the QEMU process is still running the live workload and thus is precious and mustn't be killed. 
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-29 14:58 ` Daniel P. Berrangé @ 2021-11-29 15:49 ` Dr. David Alan Gilbert 2021-12-06 9:28 ` Li Zhang 0 siblings, 1 reply; 25+ messages in thread From: Dr. David Alan Gilbert @ 2021-11-29 15:49 UTC (permalink / raw) To: Daniel P. Berrangé; +Cc: qemu-devel, Li Zhang, cfontana, quintela * Daniel P. Berrangé (berrange@redhat.com) wrote: > On Mon, Nov 29, 2021 at 11:20:08AM +0000, Dr. David Alan Gilbert wrote: > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: > > > > When doing live migration with multifd channels 8, 16 or larger number, > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > > > > > > > At sender's side: > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > > > > is called because one thread fails on qio_channel_write_all when > > > > the network problem happens and other send threads are blocked on sendmsg. > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join > > > > to wait for the threads terminated. > > > > > > Isn't the right answer here to ensure we've called 'shutdown' on > > > all the FDs, so that the threads get kicked out of sendmsg, before > > > trying to join the thread ? > > > > I agree a timeout is wrong here; there is no way to get a good timeout > > value. > > However, I'm a bit confused - we should be able to try a shutdown on the > > receive side using the 'yank' command. - that's what it's there for; Li > > does this solve your problem? > > Why do we even need to use 'yank' on the receive side ? Until migration > has switched over from src to dst, the receive side is discardable and > the whole process can just be teminated with kill(SIGTERM/SIGKILL). True, although it's nice to be able to quit cleanly. 
> On the source side 'yank' is needed, because the QEMU process is still > running the live workload and thus is precious and mustn't be killed. True. Dave > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-29 15:49 ` Dr. David Alan Gilbert @ 2021-12-06 9:28 ` Li Zhang 0 siblings, 0 replies; 25+ messages in thread From: Li Zhang @ 2021-12-06 9:28 UTC (permalink / raw) To: Dr. David Alan Gilbert, Daniel P. Berrangé Cc: qemu-devel, cfontana, quintela On 11/29/21 4:49 PM, Dr. David Alan Gilbert wrote: > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> On Mon, Nov 29, 2021 at 11:20:08AM +0000, Dr. David Alan Gilbert wrote: >>> * Daniel P. Berrangé (berrange@redhat.com) wrote: >>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote: >>>>> When doing live migration with multifd channels 8, 16 or larger number, >>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs. >>>>> >>>>> At sender's side: >>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >>>>> is called because one thread fails on qio_channel_write_all when >>>>> the network problem happens and other send threads are blocked on sendmsg. >>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join >>>>> to wait for the threads terminated. >>>> Isn't the right answer here to ensure we've called 'shutdown' on >>>> all the FDs, so that the threads get kicked out of sendmsg, before >>>> trying to join the thread ? >>> I agree a timeout is wrong here; there is no way to get a good timeout >>> value. >>> However, I'm a bit confused - we should be able to try a shutdown on the >>> receive side using the 'yank' command. - that's what it's there for; Li >>> does this solve your problem? >> Why do we even need to use 'yank' on the receive side ? Until migration >> has switched over from src to dst, the receive side is discardable and >> the whole process can just be teminated with kill(SIGTERM/SIGKILL). > True, although it's nice to be able to quit cleanly. I found that the 'yank' function has been registered on receive side actually. 
It's different from the send side. It's in the function: void migration_channel_process_incoming(QIOChannel *ioc) { MigrationState *s = migrate_get_current(); Error *local_err = NULL; trace_migration_set_incoming_channel( ioc, object_get_typename(OBJECT(ioc))); if (s->parameters.tls_creds && *s->parameters.tls_creds && !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS)) { migration_tls_channel_process_incoming(s, ioc, &local_err); } else { migration_ioc_register_yank(ioc); migration_ioc_process_incoming(ioc, &local_err); } if (local_err) { error_report_err(local_err); } } > >> On the source side 'yank' is needed, because the QEMU process is still >> running the live workload and thus is precious and mustn't be killed. > True. > > Dave > >> Regards, >> Daniel >> -- >> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| >> |: https://libvirt.org -o- https://fstop138.berrange.com :| >> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| >> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-26 15:31 ` [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever Li Zhang 2021-11-26 15:49 ` Daniel P. Berrangé @ 2021-11-26 16:33 ` Juan Quintela 2021-11-26 16:56 ` Li Zhang 1 sibling, 1 reply; 25+ messages in thread From: Juan Quintela @ 2021-11-26 16:33 UTC (permalink / raw) To: Li Zhang; +Cc: qemu-devel, dgilbert, cfontana Li Zhang <lizhang@suse.de> wrote: > When doing live migration with multifd channels 8, 16 or larger number, > the guest hangs in the presence of the network errors such as missing TCP ACKs. > > At sender's side: > The main thread is blocked on qemu_thread_join, migration_fd_cleanup > is called because one thread fails on qio_channel_write_all when > the network problem happens and other send threads are blocked on sendmsg. > They could not be terminated. So the main thread is blocked on qemu_thread_join > to wait for the threads terminated. 
> > (gdb) bt > 0 0x00007f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0 > 1 0x000055cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627 > 2 0x000055cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542 > 3 0x000055cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808 > 4 0x000055cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850 > 5 0x000055cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141 > 6 0x000055cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169 > 7 0x000055cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381 > 8 0x000055cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311 > 9 0x00007f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0 > 10 0x000055cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232 > 11 0x000055cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255 > 12 0x000055cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531 > 13 0x000055cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726 > 14 0x000055cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c578888, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50 > > At receiver's side: > Several receive threads are not created successfully and the receive threads > which have been created are blocked on qemu_sem_wait. No semaphores are posted > because migration is not started if not all the receive threads are created > successfully and multifd_recv_sync_main is not called which posts the semaphore > to receive threads. So the receive threads are waiting on the semaphore and > never return. It shouldn't wait for the semaphore forever. > Use qemu_sem_timedwait to wait for a while, then return and close the channels. > So the guest doesn't hang anymore. 
> > (gdb) bt > 0 0x00007fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0 > 1 0x00007fd61c43f158 in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0 > 2 0x000056075916014a in qemu_sem_wait (sem=0x56075b6515f0) at ../util/qemu-thread-posix.c:358 > 3 0x0000560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at ../migration/multifd.c:1112 > 4 0x0000560759160598 in qemu_thread_start (args=0x56075befad00) at ../util/qemu-thread-posix.c:556 > 5 0x00007fd61c43594a in start_thread () at /lib64/libpthread.so.0 > 6 0x00007fd61c158d0f in clone () at /lib64/libc.so.6 > > Signed-off-by: Li Zhang <lizhang@suse.de> > --- > migration/multifd.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/migration/multifd.c b/migration/multifd.c > index 7c9deb1921..656239ca2a 100644 > --- a/migration/multifd.c > +++ b/migration/multifd.c > @@ -1109,7 +1109,7 @@ static void *multifd_recv_thread(void *opaque) > > if (flags & MULTIFD_FLAG_SYNC) { > qemu_sem_post(&multifd_recv_state->sem_sync); > - qemu_sem_wait(&p->sem_sync); > + qemu_sem_timedwait(&p->sem_sync, 1000); > } > } Problem happens here, but I think that the solution is not wrong. We are returning from the semaphore without giving a single error message. Later, Juan. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever 2021-11-26 16:33 ` Juan Quintela @ 2021-11-26 16:56 ` Li Zhang 0 siblings, 0 replies; 25+ messages in thread From: Li Zhang @ 2021-11-26 16:56 UTC (permalink / raw) To: quintela; +Cc: qemu-devel, dgilbert, cfontana On 11/26/21 5:33 PM, Juan Quintela wrote: > Li Zhang <lizhang@suse.de> wrote: >> When doing live migration with multifd channels 8, 16 or larger number, >> the guest hangs in the presence of the network errors such as missing TCP ACKs. >> >> At sender's side: >> The main thread is blocked on qemu_thread_join, migration_fd_cleanup >> is called because one thread fails on qio_channel_write_all when >> the network problem happens and other send threads are blocked on sendmsg. >> They could not be terminated. So the main thread is blocked on qemu_thread_join >> to wait for the threads terminated. >> >> (gdb) bt >> 0 0x00007f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0 >> 1 0x000055cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627 >> 2 0x000055cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542 >> 3 0x000055cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808 >> 4 0x000055cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850 >> 5 0x000055cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141 >> 6 0x000055cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169 >> 7 0x000055cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381 >> 8 0x000055cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311 >> 9 0x00007f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0 >> 10 0x000055cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232 >> 11 0x000055cbb718521c in os_host_main_loop_wait 
(timeout=42251070366) at ../util/main-loop.c:255 >> 12 0x000055cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531 >> 13 0x000055cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726 >> 14 0x000055cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c578888, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50 >> >> At receiver's side: >> Several receive threads are not created successfully and the receive threads >> which have been created are blocked on qemu_sem_wait. No semaphores are posted >> because migration is not started if not all the receive threads are created >> successfully and multifd_recv_sync_main is not called which posts the semaphore >> to receive threads. So the receive threads are waiting on the semaphore and >> never return. It shouldn't wait for the semaphore forever. >> Use qemu_sem_timedwait to wait for a while, then return and close the channels. >> So the guest doesn't hang anymore. >> >> (gdb) bt >> 0 0x00007fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0 >> 1 0x00007fd61c43f158 in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0 >> 2 0x000056075916014a in qemu_sem_wait (sem=0x56075b6515f0) at ../util/qemu-thread-posix.c:358 >> 3 0x0000560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at ../migration/multifd.c:1112 >> 4 0x0000560759160598 in qemu_thread_start (args=0x56075befad00) at ../util/qemu-thread-posix.c:556 >> 5 0x00007fd61c43594a in start_thread () at /lib64/libpthread.so.0 >> 6 0x00007fd61c158d0f in clone () at /lib64/libc.so.6 >> >> Signed-off-by: Li Zhang <lizhang@suse.de> >> --- >> migration/multifd.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/migration/multifd.c b/migration/multifd.c >> index 7c9deb1921..656239ca2a 100644 >> --- a/migration/multifd.c >> +++ b/migration/multifd.c >> @@ -1109,7 +1109,7 @@ static void *multifd_recv_thread(void *opaque) >> >> if (flags & MULTIFD_FLAG_SYNC) { >> qemu_sem_post(&multifd_recv_state->sem_sync); >> - 
qemu_sem_wait(&p->sem_sync); >> + qemu_sem_timedwait(&p->sem_sync, 1000); >> } >> } > Problem happens here, but I think that the solution is not wrong. We > are returning from the semaphore without giving a single error message. Ah, okay. I can add an error message. Thanks Li > > Later, Juan. > ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure 2021-11-26 15:31 [PATCH 0/2] migration: multifd live migration improvement Li Zhang 2021-11-26 15:31 ` [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever Li Zhang @ 2021-11-26 15:31 ` Li Zhang 2021-11-26 16:32 ` Juan Quintela 1 sibling, 1 reply; 25+ messages in thread From: Li Zhang @ 2021-11-26 15:31 UTC (permalink / raw) To: quintela, dgilbert, cfontana, qemu-devel; +Cc: Li Zhang When creating the guest on destination with -incoming ip:port in QEMU command line, the source code that sets the backlog number as the same as multifd channels doesn't get called. So the number of backlog is always 1. It's very easy to cause live migration failure, so a bigger number is preferred to reduce the chance of the failure. Signed-off-by: Li Zhang <lizhang@suse.de> --- migration/socket.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/migration/socket.c b/migration/socket.c index 05705a32d8..398d4c10fa 100644 --- a/migration/socket.c +++ b/migration/socket.c @@ -152,7 +152,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr, QIONetListener *listener = qio_net_listener_new(); MigrationIncomingState *mis = migration_incoming_get_current(); size_t i; - int num = 1; + int num = 16; qio_net_listener_set_name(listener, "migration-socket-listener"); -- 2.31.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure 2021-11-26 15:31 ` [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure Li Zhang @ 2021-11-26 16:32 ` Juan Quintela 2021-11-26 16:44 ` Li Zhang 0 siblings, 1 reply; 25+ messages in thread From: Juan Quintela @ 2021-11-26 16:32 UTC (permalink / raw) To: Li Zhang; +Cc: qemu-devel, dgilbert, cfontana Li Zhang <lizhang@suse.de> wrote: > When creating the guest on destination with -incoming ip:port in QEMU command line, > the source code that sets the backlog number as the same as multifd channels doesn't > get called. So the number of backlog is always 1. It's very easy to cause live migration > failure, so a bigger number is preferred to reduce the chance of the failure. > > Signed-off-by: Li Zhang <lizhang@suse.de> > --- > migration/socket.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/migration/socket.c b/migration/socket.c > index 05705a32d8..398d4c10fa 100644 > --- a/migration/socket.c > +++ b/migration/socket.c > @@ -152,7 +152,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr, > QIONetListener *listener = qio_net_listener_new(); > MigrationIncomingState *mis = migration_incoming_get_current(); > size_t i; > - int num = 1; > + int num = 16; > > qio_net_listener_set_name(listener, "migration-socket-listener"); Here, the right answer is to use -incoming defer. Later, Juan. ^ permalink raw reply [flat|nested] 25+ messages in thread
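[Editor's note: the deferred-incoming workflow Juan refers to looks roughly like this, with addresses and parameters taken from the cover letter; a sketch, not a complete command line. With defer, the listening socket is only created once migrate_incoming runs, after the multifd parameters are set, so the backlog can match the channel count.]

```shell
# Destination: start QEMU without a listening socket
qemu-system-x86_64 -M q35 -smp 32 -m 4096 -enable-kvm ... -incoming defer

# Destination monitor: set parameters *before* creating the listener
# (qemu) migrate_set_capability multifd on
# (qemu) migrate_set_parameter multifd-channels 16
# (qemu) migrate_incoming tcp:1.0.8.154:4000

# Source monitor, as in the cover letter:
# (qemu) migrate_set_capability multifd on
# (qemu) migrate_set_parameter multifd-channels 16
# (qemu) migrate -d tcp:1.0.8.154:4000
```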
* Re: [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure 2021-11-26 16:32 ` Juan Quintela @ 2021-11-26 16:44 ` Li Zhang 0 siblings, 0 replies; 25+ messages in thread From: Li Zhang @ 2021-11-26 16:44 UTC (permalink / raw) To: quintela; +Cc: cfontana, qemu-devel, dgilbert On 11/26/21 5:32 PM, Juan Quintela wrote: > Li Zhang <lizhang@suse.de> wrote: >> When creating the guest on destination with -incoming ip:port in QEMU command line, >> the source code that sets the backlog number as the same as multifd channels doesn't >> get called. So the number of backlog is always 1. It's very easy to cause live migration >> failure, so a bigger number is preferred to reduce the chance of the failure. >> >> Signed-off-by: Li Zhang <lizhang@suse.de> >> --- >> migration/socket.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/migration/socket.c b/migration/socket.c >> index 05705a32d8..398d4c10fa 100644 >> --- a/migration/socket.c >> +++ b/migration/socket.c >> @@ -152,7 +152,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr, >> QIONetListener *listener = qio_net_listener_new(); >> MigrationIncomingState *mis = migration_incoming_get_current(); >> size_t i; >> - int num = 1; >> + int num = 16; >> >> qio_net_listener_set_name(listener, "migration-socket-listener"); > Here, the right answer is to use -incoming defer. Ok, thanks a lot. > > Later, Juan. > > ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2021-12-06 9:30 UTC | newest] Thread overview: 25+ messages -- 2021-11-26 15:31 [PATCH 0/2] migration: multifd live migration improvement Li Zhang 2021-11-26 15:31 ` [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever Li Zhang 2021-11-26 15:49 ` Daniel P. Berrangé 2021-11-26 16:44 ` Li Zhang 2021-11-26 16:51 ` Daniel P. Berrangé 2021-11-26 17:00 ` Li Zhang 2021-11-26 17:13 ` Daniel P. Berrangé 2021-11-26 17:44 ` Li Zhang 2021-11-29 11:20 ` Dr. David Alan Gilbert 2021-11-29 13:37 ` Li Zhang 2021-11-29 14:50 ` Dr. David Alan Gilbert 2021-11-29 15:34 ` Li Zhang 2021-12-01 12:11 ` Li Zhang 2021-12-01 12:22 ` Daniel P. Berrangé 2021-12-01 13:42 ` Li Zhang 2021-12-01 14:09 ` Daniel P. Berrangé 2021-12-01 14:15 ` Li Zhang 2021-11-29 14:58 ` Daniel P. Berrangé 2021-11-29 15:49 ` Dr. David Alan Gilbert 2021-12-06 9:28 ` Li Zhang 2021-11-26 16:33 ` Juan Quintela 2021-11-26 16:56 ` Li Zhang 2021-11-26 15:31 ` [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure Li Zhang 2021-11-26 16:32 ` Juan Quintela 2021-11-26 16:44 ` Li Zhang