From: Paolo Bonzini
Date: Thu, 18 Oct 2012 11:00:28 +0200
Message-ID: <507FC52C.7040802@redhat.com>
In-Reply-To: <1350545426-23172-1-git-send-email-quintela@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 00/30] Migration thread 20121017 edition
To: Juan Quintela
Cc: qemu-devel@nongnu.org

On 18/10/2012 09:29, Juan Quintela wrote:
> v3:
>
> This is work in progress on top of the previous migration series just sent.
>
> - Introduces a thread for migration instead of using a timer and callback.
> - Removes the writing to the fd from under the iothread lock.
> - Makes the writes synchronous.
> - Introduces a new pending method that returns how many bytes are pending
>   for one save live section.
> - The last patch just adds printfs to show where the time is being spent
>   in the migration completion phase.
>   (Yes, it pollutes all uses of stop on the monitor.)
>
> So far I have found that we spend a lot of time in bdrv_flush_all().  It
> can take from 1 ms to 600 ms (yes, that is not a typo).  That dwarfs the
> default migration downtime (30 ms).
>
> Stop all vcpus:
>
> - It works now (after the changes to qemu_cpu_is_vcpu in the previous
>   series); the caveat is that the time bdrv_flush_all() takes is
>   "unpredictable".  Any silver bullets?

You could reuse the "block" live migration item.  In block_save_pending,
start a bdrv_aio_flush() on all block devices that have already completed
the previous one.  (A rough sketch of the idea follows below.)

But that's not a regression caused by the migration thread, is it?

Paolo
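
[Editor's note: a minimal, untested sketch of the bdrv_aio_flush() suggestion
above.  It assumes QEMU's 2012-era block API (bdrv_aio_flush() with a
completion callback); the FlushTracker struct, flush_cb(),
start_background_flushes(), and the hook into block_save_pending() are
illustrative names, not taken from the actual series.]

  /* Sketch: keep flushes running in the background while RAM migration
   * proceeds, so the final bdrv_flush_all() at VM-stop time has little
   * dirty data left to write out. */

  #include <stdbool.h>
  #include "block.h"          /* BlockDriverState, bdrv_aio_flush() */

  typedef struct FlushTracker {
      BlockDriverState *bs;
      bool in_flight;         /* true while an aio flush is still outstanding */
  } FlushTracker;

  static void flush_cb(void *opaque, int ret)
  {
      FlushTracker *t = opaque;

      t->in_flight = false;   /* previous flush finished; a new one may start */
  }

  /* Called from a (hypothetical) block_save_pending() hook: start an
   * asynchronous flush on every device whose previous flush has completed. */
  static void start_background_flushes(FlushTracker *trackers, int n)
  {
      int i;

      for (i = 0; i < n; i++) {
          if (!trackers[i].in_flight) {
              trackers[i].in_flight = true;
              bdrv_aio_flush(trackers[i].bs, flush_cb, &trackers[i]);
          }
      }
  }

The point of the suggestion is that each flush then overlaps with the ongoing
live migration instead of being paid all at once inside bdrv_flush_all()
during the downtime window.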