From: Xiao Guangrong
To: "Dr. David Alan Gilbert"
Cc: pbonzini@redhat.com, mst@redhat.com, mtosatti@redhat.com, qemu-devel@nongnu.org, kvm@vger.kernel.org, peterx@redhat.com, jiang.biao2@zte.com.cn, wei.w.wang@intel.com, Xiao Guangrong
Subject: Re: [Qemu-devel] [PATCH 08/12] migration: do not flush_compressed_data at the end of each iteration
Date: Wed, 18 Jul 2018 16:44:09 +0800
Message-ID: <250fa742-5fe5-7701-bfb3-a4302d59b772@gmail.com>
In-Reply-To: <20180713180120.GD2434@work-vm>
References: <20180604095520.8563-1-xiaoguangrong@tencent.com> <20180604095520.8563-9-xiaoguangrong@tencent.com> <20180713180120.GD2434@work-vm>

On 07/14/2018 02:01 AM, Dr.
David Alan Gilbert wrote:
> * guangrong.xiao@gmail.com (guangrong.xiao@gmail.com) wrote:
>> From: Xiao Guangrong
>>
>> flush_compressed_data() needs to wait for all compression threads to
>> finish their work; after that, all threads are idle until the
>> migration feeds new requests to them. Reducing its calls can improve
>> throughput and use CPU resources more effectively.
>>
>> We do not need to flush all threads at the end of each iteration; the
>> data can be kept locally until the memory block is changed.
>>
>> Signed-off-by: Xiao Guangrong
>> ---
>>  migration/ram.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index f9a8646520..0a38c1c61e 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1994,6 +1994,7 @@ static void ram_save_cleanup(void *opaque)
>>      }
>>
>>      xbzrle_cleanup();
>> +    flush_compressed_data(*rsp);
>>      compress_threads_save_cleanup();
>>      ram_state_cleanup(rsp);
>>  }
>
> I'm not sure why this change corresponds to the other removal.
> We should already have sent all remaining data in ram_save_complete()'s
> call to flush_compressed_data - so what is this one for?

This is for the error condition: if any error occurs during live
migration, there is no chance to call ram_save_complete(). After
switching to the lockless multithreads model, we assert that all
requests have been handled before destroying the worker threads.

>> @@ -2690,7 +2691,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>          }
>>          i++;
>>      }
>> -    flush_compressed_data(rs);
>>      rcu_read_unlock();
>
> Hmm - are we sure there's no other cases that depend on ordering of all
> of the compressed data being sent out between threads?

Err, I tried to think it over carefully, but still missed the case you
mentioned. :( Anyway, calling flush_compressed_data() every 50 ms hurts
us too much.
> I think the one I'd most worry about is the case where:
>
>    iteration one:
>       thread 1: Save compressed page 'n'
>
>    iteration two:
>       thread 2: Save compressed page 'n'
>
> What guarantees that the version of page 'n'
> from thread 2 reaches the destination first without
> this flush?

Hmm... you are right, I missed this case. So how about avoiding it by
doing this check at the beginning of ram_save_iterate():

    if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
        flush_compressed_data(rs);
        rs->dirty_sync_count = ram_counters.dirty_sync_count;
    }