Date: Thu, 25 Jun 2015 13:34:43 +0100
From: Stefan Hajnoczi
Message-ID: <20150625123443.GA4419@stefanha-thinkpad.redhat.com>
References: <1426582438-9698-1-git-send-email-liang.z.li@intel.com>
 <87wq2fkelb.fsf@neno.neno>
 <20150318111709.GB4576@noname.redhat.com>
 <87k2yeh498.fsf@neno.neno>
 <87zj711hd0.fsf@neno.neno>
 <20150325105338.GA4581@noname.str.redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] migration: flush the bdrv before stopping VM
To: "Li, Liang Z"
Cc: Kevin Wolf, "qemu-block@nongnu.org", Juan Quintela, "qemu-devel@nongnu.org", "Zhang, Yang Z", "amit.shah@redhat.com"

On Wed, Jun 24, 2015 at 11:08:43AM +0000, Li, Liang Z wrote:
> > > >> >> > Right now, we don't have an interface to detect that cases and
> > > >> >> > got back to the iterative stage.
> > > >> >> >
> > > >> >> How about go back to the iterative stage when detect that the
> > > >> >> pending_size is larger than max_size, like this:
> > > >> >>
> > > >> >> +            /* do flush here is aimed to shorten the VM downtime,
> > > >> >> +             * bdrv_flush_all is a time consuming operation
> > > >> >> +             * when the guest has done some file writing */
> > > >> >> +            bdrv_flush_all();
> > > >> >> +            pending_size = qemu_savevm_state_pending(s->file, max_size);
> > > >> >> +            if (pending_size && pending_size >= max_size) {
> > > >> >> +                qemu_mutex_unlock_iothread();
> > > >> >> +                continue;
> > > >> >> +            }
> > > >> >>              ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> > > >> >>              if (ret >= 0) {
> > > >> >>                  qemu_file_set_rate_limit(s->file, INT64_MAX);
> > > >> >>
> > > >> >> and this is quite simple.
> > > >> >
> > > >> > Yes, but it is too simple. If you hold all the locks during
> > > >> > bdrv_flush_all(), your VM will effectively stop as soon as it
> > > >> > performs the next I/O access, so you don't win much. And you
> > > >> > still don't have a timeout for cases where the flush takes really long.
> > > >>
> > > >> This is probably better than what we had now (basically we are "measuring"
> > > >> after bdrv_flush_all how much the amount of dirty memory has
> > > >> changed, and return to iterative stage if it took too much). A
> > > >> timeout would be better anyways. And an interface to start the
> > > >> synchronization sooner asynchronously would be also good.
> > > >>
> > > >> Notice that my understanding is that any proper fix for this is 2.4 material.
> > > >
> > > > Then, how to deal with this issue in 2.3, leave it here? or make an
> > > > incomplete fix like I do above?
> > >
> > > I think it is better to leave it here for 2.3. With a patch like this
> > > one, we improve in one load and we got worse in a different load
> > > (depends a lot on the ratio of dirtying memory vs disk).
> > > I have no data which load is more common, so I prefer to be
> > > conservative so late in the cycle. What do you think?
> >
> > I agree, it's too late in the release cycle for such a change.
> >
> > Kevin
>
> Hi Juan & Kevin,
>
> I have not found the related patches to fix the issue which lead to long VM downtime, how is it going?

Kevin is on vacation and QEMU is currently in 2.4 soft freeze. Unless
patches have been posted/merged that I'm not aware of, it is unlikely
that anything will happen before QEMU 2.4 is released.

Stefan