From: "Denis V. Lunev"
Message-ID: <5743E824.40900@openvz.org>
Date: Tue, 24 May 2016 08:35:32 +0300
Subject: Re: [Qemu-devel] [PATCH 1/1] migration: fix ram decompression race deadlock
To: "Li, Liang Z", Amit Shah
Cc: qemu-devel@nongnu.org, Maxim Nestratov, Juan Quintela,
 "Dr. David Alan Gilbert"
References: <1463124421-16587-1-git-send-email-den@openvz.org>
 <20160523071942.GB24417@grmbl.mre>

On 05/24/2016 05:07 AM, Li, Liang Z wrote:
>> Adding Liang Li to CC for his comments as the author of the feature.
>>
>> On (Fri) 13 May 2016 [10:27:01], Denis V. Lunev wrote:
>>> From: Maxim Nestratov
>>>
>>> There is a race between do_data_decompress() and start_decompression():
>>>
>>> do_data_decompress()
>>>     while (!quit_decomp_thread) {
>>>         qemu_mutex_lock(&param->mutex);
>>>         while (!param->start && !quit_decomp_thread) {
>>>             qemu_cond_wait(&param->cond, &param->mutex);
>>>             ...
>>>             param->start = false;
>>>         }
>>>         qemu_mutex_unlock(&param->mutex);
>>>         [ preempted here, start_decompression() is executed ]
>>>     }
>>>
>>> start_decompression()
>>> {
>>>     qemu_mutex_lock(&param->mutex);
>>>     param->start = true;
>>>     qemu_cond_signal(&param->cond);
>>>     qemu_mutex_unlock(&param->mutex);
>>> }
>>>
>>> In this case do_data_decompress() will never enter the inner loop again
>>> and will eat 100% CPU. The patch fixes this problem by leaving only the
>>> condition wait inside the inner loop and moving all other actions out
>>> of it.
>>>
>>> Signed-off-by: Maxim Nestratov
>>> Signed-off-by: Denis V. Lunev
>>> CC: Juan Quintela
>>> CC: Amit Shah
>>> ---
>>>  migration/ram.c | 22 +++++++++++-----------
>>>  1 file changed, 11 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/migration/ram.c b/migration/ram.c
>>> index 3f05738..579bfc0 100644
>>> --- a/migration/ram.c
>>> +++ b/migration/ram.c
>>> @@ -2193,18 +2193,18 @@ static void *do_data_decompress(void *opaque)
>>>          qemu_mutex_lock(&param->mutex);
>>>          while (!param->start && !quit_decomp_thread) {
>>>              qemu_cond_wait(&param->cond, &param->mutex);
>>> -            pagesize = TARGET_PAGE_SIZE;
>>> -            if (!quit_decomp_thread) {
>>> -                /* uncompress() will return failed in some case, especially
>>> -                 * when the page is dirted when doing the compression, it's
>>> -                 * not a problem because the dirty page will be retransferred
>>> -                 * and uncompress() won't break the data in other pages.
>>> -                 */
>>> -                uncompress((Bytef *)param->des, &pagesize,
>>> -                           (const Bytef *)param->compbuf, param->len);
>>> -            }
>>> -            param->start = false;
>>>          }
>>> +        pagesize = TARGET_PAGE_SIZE;
>>> +        if (!quit_decomp_thread) {
>>> +            /* uncompress() will return failed in some case, especially
>>> +             * when the page is dirted when doing the compression, it's
>>> +             * not a problem because the dirty page will be retransferred
>>> +             * and uncompress() won't break the data in other pages.
>>> +             */
>>> +            uncompress((Bytef *)param->des, &pagesize,
>>> +                       (const Bytef *)param->compbuf, param->len);
>>> +        }
>>> +        param->start = false;
>>>          qemu_mutex_unlock(&param->mutex);
>>>      }
>>> }
>
> Thanks for your patch! And it can help to fix this issue.
> Actually, this is not the only issue in the current multi-thread
> (de)compression code, so I submitted a patch set to fix all of them
> before your patch. And sorry for my mistake.
>
> Refer to the link:
> https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg00631.html
> for more information.
>
> Thanks
> Liang

Yep, thanks a lot. Your patch would be better ;)
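
For readers following along, here is a minimal, self-contained pthreads sketch
of the wait-loop shape the patch moves do_data_decompress() to. It is an
illustration only, not QEMU code: qemu_mutex/qemu_cond are replaced by plain
pthreads, and the names worker, submit, stop, start and quit are invented for
the example. The point is that the inner while contains nothing but the
condition wait; the work and the clearing of the request flag happen after it,
so a request posted while the worker is outside the wait is picked up on the
next check of the flags instead of being silently skipped.

/* Sketch of the fixed pattern: wait-only inner loop, work outside it.
 * Build with: gcc -pthread -o wait_loop wait_loop.c
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool start;   /* a request is pending (cf. param->start)       */
static bool quit;    /* shut the worker down (cf. quit_decomp_thread) */

static void *worker(void *opaque)
{
    (void)opaque;
    for (;;) {
        pthread_mutex_lock(&lock);
        /* Wait only; no work and no flag changes inside this loop. */
        while (!start && !quit) {
            pthread_cond_wait(&cond, &lock);
        }
        if (start) {
            /* ram.c decompresses one page here, still holding the mutex */
            printf("worker: handled one request\n");
            start = false;          /* mark the request consumed */
        } else {                    /* quit requested, nothing pending */
            pthread_mutex_unlock(&lock);
            break;
        }
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

static void submit(void)            /* plays the role of start_decompression() */
{
    pthread_mutex_lock(&lock);
    start = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

static void stop(void)
{
    pthread_mutex_lock(&lock);
    quit = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

int main(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, NULL) != 0) {
        return 1;
    }
    submit();                       /* post one request ...            */
    stop();                         /* ... then ask the worker to exit */
    pthread_join(tid, NULL);
    return 0;
}

In the pre-patch shape of ram.c, where both the work and the clearing of
param->start sat inside the inner while, a signal arriving while the worker
was between iterations left param->start set: the inner condition was already
false, so the loop body that clears it never ran again and the outer loop spun
at 100% CPU, as the commit message explains.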