From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40788)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZtacP-0002yX-A1 for qemu-devel@nongnu.org;
	Tue, 03 Nov 2015 07:23:18 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1ZtacM-000118-6N for qemu-devel@nongnu.org;
	Tue, 03 Nov 2015 07:23:17 -0500
Received: from mx1.redhat.com ([209.132.183.28]:57217)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZtacM-000112-1Q for qemu-devel@nongnu.org;
	Tue, 03 Nov 2015 07:23:14 -0500
Date: Tue, 3 Nov 2015 17:53:06 +0530
From: Amit Shah
Message-ID: <20151103122306.GI7673@grmbl.mre>
References: <1446449823-25049-1-git-send-email-liang.z.li@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1446449823-25049-1-git-send-email-liang.z.li@intel.com>
Subject: Re: [Qemu-devel] [v2 RESEND 0/4] Fix long vm downtime during live migration
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Liang Li
Cc: yong.y.wang@intel.com, pbonzini@redhat.com, qemu-devel@nongnu.org,
	stefanha@redhat.com, quintela@redhat.com

On (Mon) 02 Nov 2015 [15:36:59], Liang Li wrote:
> The patch 3ea3b7fa9af067982f34b of kvm introduces a mechanism for
> lazily collapsing small sptes into large sptes, which is intended to
> solve the performance drop if live migration fails or is canceled. The
> rmap will be scanned in the KVM_SET_USER_MEMORY_REGION ioctl context
> when dirty logging is stopped so as to drop the small sptes. Scanning
> the rmap and dropping the small sptes is a time-consuming operation
> that takes dozens of milliseconds; the actual time depends on the VM's
> memory size. For a VM with 8GB RAM, it takes about 30ms.
>
> The current QEMU code stops dirty logging during the pause and
> copy stage by calling the migration_end() function. Now migration_end()
> is a time-consuming operation because it calls
> memory_global_dirty_log_stop(), which triggers the rmap scan and the
> dropping of small sptes. So calling migration_end() before all
> the vmstate data has been transferred to the destination will
> prolong VM downtime.
>
> migration_end() should be deferred until after all the data has been
> transferred to the destination. blk_mig_cleanup() can be deferred too.

Reviewed-by: Amit Shah

Thanks for adding to the commit message, that helped.

		Amit