From: "Denis V. Lunev" <den@virtuozzo.com>
Date: Mon, 25 Apr 2016 20:30:45 +0300
Subject: Re: [Qemu-devel] long irresponsibility or stuck in the migration code
To: "Dr. David Alan Gilbert"
Cc: Amit Shah, QEMU <qemu-devel@nongnu.org>, Dmitry Mishin

On 04/25/2016 08:18 PM, Dr. David Alan Gilbert wrote:
> * Denis V. Lunev (den@virtuozzo.com) wrote:
>> Hello, Amit!
>
> (That's unresponsiveness not irresponsibility :-)

;) thank you

>> We have faced a very interesting issue with the QEMU migration code.
>> The migration thread performs the following operation:
>>
>> #0  0x00007f61abe9978d in sendmsg () at ../sysdeps/unix/syscall-template.S:81
>> #1  0x00007f61b2942055 in do_send_recv (sockfd=sockfd@entry=104, iov=iov@entry=0x7f61b71a8030,
>>     iov_cnt=<optimized out>, do_send=do_send@entry=true) at util/iov.c:104
>> #2  0x00007f61b2942528 in iov_send_recv (sockfd=104, iov=iov@entry=0x7f61b71a8030, iov_cnt=iov_cnt@entry=1,
>>     offset=27532, offset@entry=0, bytes=5236, bytes@entry=32768, do_send=do_send@entry=true) at util/iov.c:181
>> #3  0x00007f61b287724a in socket_writev_buffer (opaque=0x7f61b6ec8070, iov=0x7f61b71a8030, iovcnt=1,
>>     pos=<optimized out>) at migration/qemu-file-unix.c:43
>> #4  0x00007f61b2875caa in qemu_fflush (f=f@entry=0x7f61b71a0000) at migration/qemu-file.c:109
>> #5  0x00007f61b2875e1a in qemu_put_buffer (f=0x7f61b71a0000, buf=buf@entry=0x7f61b662e030 "", size=size@entry=842)
>>     at migration/qemu-file.c:323
>> #6  0x00007f61b287674f in qemu_put_buffer (size=842, buf=0x7f61b662e030 "", f=0x7f61b2875caa)
>>     at migration/qemu-file.c:589
>> #7  qemu_put_qemu_file (f_des=f_des@entry=0x7f61b71a0000, f_src=0x7f61b662e000) at migration/qemu-file.c:589
>> #8  0x00007f61b26fab01 in compress_page_with_multi_thread (bytes_transferred=0x7f61b2dfe578,
>>     offset=2138677280, block=0x7f61b51e9b80, f=0x7f61b71a0000) at /usr/src/debug/qemu-2.3.0/migration/ram.c:872
>> #9  ram_save_compressed_page (bytes_transferred=0x7f61b2dfe578, last_stage=true,
>>     offset=2138677280, block=0x7f61b51e9b80, f=0x7f61b71a0000) at /usr/src/debug/qemu-2.3.0/migration/ram.c:957
>> #10 ram_find_and_save_block (f=f@entry=0x7f61b71a0000, last_stage=last_stage@entry=true,
>>     bytes_transferred=0x7f61b2dfe578) at /usr/src/debug/qemu-2.3.0/migration/ram.c:1015
>> #11 0x00007f61b26faed5 in ram_save_complete (f=0x7f61b71a0000, opaque=<optimized out>)
>>     at /usr/src/debug/qemu-2.3.0/migration/ram.c:1280
>> #12 0x00007f61b26ff241 in qemu_savevm_state_complete_precopy (f=0x7f61b71a0000,
>>     iterable_only=iterable_only@entry=false) at /usr/src/debug/qemu-2.3.0/migration/savevm.c:976
>> #13 0x00007f61b2872ecb in migration_completion (start_time=<optimized out>, old_vm_running=<optimized out>,
>>     current_active_state=<optimized out>, s=0x7f61b2d8bfc0) at migration/migration.c:1212
>> #14 migration_thread (opaque=0x7f61b2d8bfc0) at migration/migration.c:1307
>> #15 0x00007f61abe92dc5 in start_thread (arg=0x7f6117ff8700) at pthread_create.c:308
>> #16 0x00007f61abbc028d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:11
>>
>> which can take a really BIG period of time.
>>
>> The problem is that we have taken qemu_global_mutex:
>>
>> static void migration_completion(MigrationState *s, int current_active_state,
>>                                  bool *old_vm_running,
>>                                  int64_t *start_time)
>> {
>>     int ret;
>>
>>     if (s->state == MIGRATION_STATUS_ACTIVE) {
>>         qemu_mutex_lock_iothread();
>>         *start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
>>         *old_vm_running = runstate_is_running();
>>         ret = global_state_store();
>>
>>         if (!ret) {
>>             ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>>             if (ret >= 0) {
>>                 qemu_file_set_rate_limit(s->file, INT64_MAX);
>>                 qemu_savevm_state_complete_precopy(s->file, false);
>>             }
>>         }
>>         qemu_mutex_unlock_iothread();
>>         ...
>>
>> and thus the QEMU process is unresponsive to any management requests.
>> In our case there is a misconfiguration and the destination does not
>> read the data, but this could happen in other cases too.
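>> For illustration only (an untested sketch, not a proper fix): a send
>> timeout on the migration socket would at least bound the stall. Assuming
>> we can get at the raw fd where the socket QEMUFile is created, something
>> like:
>>
>>     /* make a blocked sendmsg() fail after 30 seconds instead of
>>      * waiting forever while qemu_global_mutex is held */
>>     struct timeval tv = { .tv_sec = 30, .tv_usec = 0 };
>>     if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0) {
>>         error_report("failed to set send timeout: %s", strerror(errno));
>>     }
>>
>> would make the migration fail with EAGAIN rather than hang, but it does
>> not make the monitor responsive while the send is in progress.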
>> From my point of view we should drop the iothread lock (i.e. call
>> qemu_mutex_unlock_iothread()) before any socket operation, but doing
>> this in a straightforward way (just dropping the lock) seems improper.
>>
>> Do you have any opinion on the problem?
> Yeah, I've seen this before; it needs fixing but it's not obvious how.
> If you set the migration speed reasonably, the amount of data sent at
> this point is a lot smaller, so the send shouldn't take too long -
> however, if the destination stalls at this point you're in trouble.
>
> The tricky bit is understanding exactly why we're holding the lock
> at this point; I can think of a few reasons but I'm not sure if it's
> all of them:
>   a) We don't want any hot-add/remove while we're trying to save the
>      device state.
>   b) We want to be able to stop the guest.
>   c) We want to be able to stop any IO.
>
> I did suggest ( https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg01711.html )
> that a lock-free monitor would be nice, where you could run some status
> commands and perhaps issue a migrate_cancel; but it sounds like it's
> messy untangling the monitor.
>
> If we knew all the reasons we were taking the lock there, then perhaps
> we could split it into finer locks and let the monitor carry on; but
> I'm sure it's a huge task.

I have the same feeling. Thank you for the response. At least I am not
alone here.
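Just to make the "straight way" concrete (an untested sketch of the
final part of migration_completion(), and exactly the thing that seems
improper):

    if (ret >= 0) {
        qemu_file_set_rate_limit(s->file, INT64_MAX);
        /* the guest is already stopped; drop the big lock so the
         * monitor stays responsive during the blocking send */
        qemu_mutex_unlock_iothread();
        qemu_savevm_state_complete_precopy(s->file, false);
        qemu_mutex_lock_iothread();
    }

With this the monitor could, for example, hot-unplug a device while its
state is still being serialized, which is exactly your reason (a). So
we need something smarter.

Den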