Date: Wed, 25 Mar 2015 11:53:38 +0100
From: Kevin Wolf
To: Juan Quintela
Cc: amit.shah@redhat.com, "Zhang, Yang Z", "Li, Liang Z", qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM
Message-ID: <20150325105338.GA4581@noname.str.redhat.com>
In-Reply-To: <87zj711hd0.fsf@neno.neno>
References: <1426582438-9698-1-git-send-email-liang.z.li@intel.com> <87wq2fkelb.fsf@neno.neno> <20150318111709.GB4576@noname.redhat.com> <87k2yeh498.fsf@neno.neno> <87zj711hd0.fsf@neno.neno>

On 25.03.2015 at 11:50, Juan Quintela wrote:
> "Li, Liang Z" wrote:
> >> >> > Right now, we don't have an interface to detect those cases and go
> >> >> > back to the iterative stage.
> >> >>
> >> >> How about going back to the iterative stage when we detect that the
> >> >> pending_size is larger than max_size, like this:
> >> >>
> >> >> +        /* The flush here is aimed at shortening the VM downtime;
> >> >> +         * bdrv_flush_all() is a time-consuming operation
> >> >> +         * when the guest has done some file writing. */
> >> >> +        bdrv_flush_all();
> >> >> +        pending_size = qemu_savevm_state_pending(s->file, max_size);
> >> >> +        if (pending_size && pending_size >= max_size) {
> >> >> +            qemu_mutex_unlock_iothread();
> >> >> +            continue;
> >> >> +        }
> >> >>         ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> >> >>         if (ret >= 0) {
> >> >>             qemu_file_set_rate_limit(s->file, INT64_MAX);
> >> >>
> >> >> and this is quite simple.
> >> >
> >> > Yes, but it is too simple. If you hold all the locks during
> >> > bdrv_flush_all(), your VM will effectively stop as soon as it performs
> >> > the next I/O access, so you don't win much. And you still don't have a
> >> > timeout for cases where the flush takes really long.
> >>
> >> This is probably better than what we have now (basically, we "measure"
> >> after bdrv_flush_all() how much the amount of dirty memory has changed,
> >> and return to the iterative stage if it changed too much). A timeout
> >> would be better anyway. An interface to start the synchronization
> >> sooner, asynchronously, would also be good.
> >>
> >> Notice that my understanding is that any proper fix for this is 2.4 material.
> >
> > Then how should we deal with this issue in 2.3: leave it as it is, or
> > make an incomplete fix like the one I suggested above?
>
> I think it is better to leave it as it is for 2.3. With a patch like this
> one, we improve for one load and get worse for a different load (it
> depends a lot on the ratio of memory dirtying vs. disk writes). I have no
> data on which load is more common, so I prefer to be conservative this
> late in the cycle. What do you think?

I agree, it's too late in the release cycle for such a change.

Kevin
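
As an illustration of the direction Juan hints at (a time budget in addition
to re-checking the pending state after the flush), here is a minimal,
self-contained C sketch. It is not QEMU code: fake_flush_all(), fake_pending(),
now_ms() and the flush_budget_ms value are stand-ins for bdrv_flush_all(),
qemu_savevm_state_pending() and whatever tunable a real patch would add; the
only point is the decision of whether to stop the guest now or return to the
iterative stage.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    /* Monotonic clock in milliseconds. */
    static int64_t now_ms(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1000LL + ts.tv_nsec / 1000000LL;
    }

    /* Stand-ins for bdrv_flush_all() and qemu_savevm_state_pending(). */
    static void fake_flush_all(void)
    {
        /* Pretend to flush all block devices; may be slow in real life. */
    }

    static uint64_t fake_pending(void)
    {
        return 4 << 20;     /* pretend 4 MiB of state is still outstanding */
    }

    /*
     * true  -> stop the guest and finish the migration now
     * false -> go back to the iterative (live) stage
     */
    static bool ready_to_complete(uint64_t max_size, int64_t flush_budget_ms)
    {
        int64_t start = now_ms();

        fake_flush_all();                       /* the potentially slow flush */

        int64_t elapsed = now_ms() - start;
        uint64_t pending = fake_pending();      /* re-check after the flush */

        if (elapsed > flush_budget_ms) {
            return false;                       /* flush blew the time budget */
        }
        return !(pending && pending >= max_size);
    }

    int main(void)
    {
        /* 1 MiB of allowed remaining state, 300 ms flush budget (made up). */
        bool ok = ready_to_complete(1 << 20, 300);
        printf("complete now: %s\n", ok ? "yes" : "no, back to iterative stage");
        return 0;
    }

In a real patch this kind of check would sit where the quoted diff puts it,
right before vm_stop_force_state(), with the elapsed-time condition acting as
the extra guard that the quoted version lacks.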