From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59117) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1axsih-0007xm-QP for qemu-devel@nongnu.org; Wed, 04 May 2016 05:03:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1axsiW-0004Xv-3h for qemu-devel@nongnu.org; Wed, 04 May 2016 05:03:42 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52606) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1axsiV-0004V7-Ux for qemu-devel@nongnu.org; Wed, 04 May 2016 05:03:36 -0400 From: Juan Quintela In-Reply-To: <20160503163908.GJ2242@work-vm> (David Alan Gilbert's message of "Tue, 3 May 2016 17:39:09 +0100") References: <1462257521-16075-1-git-send-email-liang.z.li@intel.com> <20160503163908.GJ2242@work-vm> Reply-To: quintela@redhat.com Date: Wed, 04 May 2016 11:03:21 +0200 Message-ID: <87d1p2urdy.fsf@emacs.mitica> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: [Qemu-devel] [PATCH] migration: Fix multi-thread compression bug List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Dr. David Alan Gilbert" Cc: Liang Li , qemu-devel@nongnu.org, amit.shah@redhat.com "Dr. David Alan Gilbert" wrote: > * Liang Li (liang.z.li@intel.com) wrote: >> Recently, a bug related to multiple thread compression feature for >> live migration is reported. The destination side will be blocked >> during live migration if there are heavy workload in host and >> memory intensive workload in guest, this is most likely to happen >> when there is one decompression thread. >> >> Some parts of the decompression code are incorrect: >> 1. The main thread receives data from source side will enter a busy >> loop to wait for a free decompression thread. >> 2. A lock is needed to protect the decomp_param[idx]->start, because >> it is checked in the main thread and is updated in the decompression >> thread. >> >> Fix these two issues by following the code pattern for compression. >> > > OK, I think that's an improvement - but I have a question. > Since it's an improvement (and basically now the same as compress): > > Reviewed-by: Dr. David Alan Gilbert > > however, my question is: > What guarantee's that all of the decompression has finished by the time > the VM is started? I see in migration/migration.c that we have: > > if (!global_state_received() || > global_state_get_runstate() == RUN_STATE_RUNNING) { > if (autostart) { > vm_start(); > } else { > runstate_set(RUN_STATE_PAUSED); > } > } else { > runstate_set(global_state_get_runstate()); > } > migrate_decompress_threads_join(); You are right here. If we don't want to do the join earlier, we need to do the equivalent of looking that all threads have set param->done as true. That is what I do on the multiple-fd code. Later, Juan.