Date: Thu, 24 Mar 2016 17:56:47 +0200
From: "Michael S. Tsirkin"
Message-ID: <20160324175503-mutt-send-email-mst@redhat.com>
Subject: Re: [Qemu-devel] [RFC Design Doc] Speed up live migration by skipping free pages
To: "Li, Liang Z"
Cc: "rkagan@virtuozzo.com", "linux-kernel@vger.kernel.org", "ehabkost@redhat.com",
    "kvm@vger.kernel.org", "quintela@redhat.com", "simhan@hpe.com",
    "Dr. David Alan Gilbert", "qemu-devel@nongnu.org", "jitendra.kolhe@hpe.com",
    "mohan_parthasarathy@hpe.com", "amit.shah@redhat.com", "pbonzini@redhat.com",
    Wei Yang, "rth@twiddle.net"

On Thu, Mar 24, 2016 at 03:53:25PM +0000, Li, Liang Z wrote:
> > > > > Not very complex; we can implement it like this:
> > > > >
> > > > > 1. Set all the bits in migration_bitmap_rcu->bmap to 1.
> > > > > 2. Clear all the bits in
> > > > >    ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION].
> > > > > 3. Send the get_free_page_bitmap request.
> > > > > 4. Start to send pages to the destination and check whether the
> > > > >    free_page_bitmap is ready:
> > > > >        if (is_ready) {
> > > > >            filter out the free pages from migration_bitmap_rcu->bmap;
> > > > >            migration_bitmap_sync();
> > > > >        }
> > > > >    Continue until live migration completes.
> > > > >
> > > > > Is that right?
> > > >
> > > > The order I'm trying to understand is something like:
> > > >
> > > > a) Send the get_free_page_bitmap request
> > > > b) Start sending pages
> > > > c) Reach the end of memory
> > > >    [ is_ready is false - the guest hasn't made the free map yet ]
> > > > d) Normal migration_bitmap_sync() at the end of the first pass
> > > > e) Carry on sending dirty pages
> > > > f) is_ready is true
> > > >    f.1) filter out free pages?
> > > >    f.2) migration_bitmap_sync()
> > > >
> > > > It's f.1 I'm worried about. If the guest started generating the
> > > > free bitmap before (d), then a page marked as 'free' in f.1 might
> > > > have become dirty before (d), and so (f.2) doesn't set the dirty
> > > > bit again, and so we can't filter out pages in f.1.
> > > >
> > >
> > > As you described, the order is incorrect.
> > >
> > > Liang
> >
> > So to make it safe, what is required is to make sure no free list is
> > outstanding before calling migration_bitmap_sync.
> >
> > If one is outstanding, filter out pages before calling
> > migration_bitmap_sync.
> >
> > Of course, if we just do it the way we normally do migration, then by
> > the time we call migration_bitmap_sync the dirty bitmap is completely
> > empty, so there won't be anything to filter out.
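
To make the quoted "filter out the free pages from migration_bitmap_rcu->bmap"
step concrete, it would be something along these lines (the names below are
made up for the sketch, not the actual QEMU code):

/*
 * Sketch only: clear guest-reported free pages from the bitmap of
 * pages still to be sent.  The names are made up for illustration
 * (the thread is talking about migration_bitmap_rcu->bmap); this is
 * not the actual QEMU code.
 */
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * 8)

void filter_out_free_pages(unsigned long *to_send_bitmap,
                           const unsigned long *free_bitmap,
                           size_t nr_pages)
{
    size_t i, nr_longs = (nr_pages + BITS_PER_LONG - 1) / BITS_PER_LONG;

    for (i = 0; i < nr_longs; i++) {
        /* keep a page only if it is still to be sent and not reported free */
        to_send_bitmap[i] &= ~free_bitmap[i];
    }
}

Whether this runs before or after migration_bitmap_sync() is exactly the
ordering question being discussed here.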
> >
> > One way to address this is to call migration_bitmap_sync in the IO
> > handler, while the VCPU is stopped, then make sure to filter out
> > pages before the next migration_bitmap_sync.
> >
> > Another is to start filtering out pages in the IO handler, but make
> > sure to flush the queue before calling migration_bitmap_sync.
> >
>
> It's really complex. Maybe we should start with something simple: just
> skip the free pages in the ram bulk stage and make it asynchronous?
>
> Liang

You mean like your patches do? No, blocking bulk migration until the
guest responds is basically a non-starter.

--
MST
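
For reference, one way the "filter out pages before calling
migration_bitmap_sync" rule from the discussion could look in code;
everything below is a hypothetical sketch (migration_bitmap_sync() is only
a stub), not the actual QEMU implementation:

/*
 * Sketch only: apply a pending guest free-page bitmap before syncing
 * the dirty log, per the "filter out pages before calling
 * migration_bitmap_sync" rule above.  The report structure and helper
 * names are hypothetical; this is not the actual QEMU code.
 */
#include <stdbool.h>
#include <stddef.h>

/* Helper from the earlier sketch. */
void filter_out_free_pages(unsigned long *to_send_bitmap,
                           const unsigned long *free_bitmap,
                           size_t nr_pages);

/* Stand-in for QEMU's real dirty-log sync. */
static void migration_bitmap_sync(void)
{
    /* The real function folds the dirty log into the migration bitmap. */
}

/* A free-page bitmap received from the guest but not yet applied. */
struct free_page_report {
    bool pending;
    const unsigned long *free_bitmap;
    size_t nr_pages;
};

/* Filter first, then sync: a page the guest dirtied after reporting it
 * free is set again by the sync, so it is still sent. */
static void migration_bitmap_sync_filtered(unsigned long *to_send_bitmap,
                                           struct free_page_report *report)
{
    if (report->pending) {
        filter_out_free_pages(to_send_bitmap, report->free_bitmap,
                              report->nr_pages);
        report->pending = false;
    }
    migration_bitmap_sync();
}

This only illustrates the ordering constraint; where the call happens (the
IO handler versus the regular sync point) is the open question in the
thread.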