Date: Wed, 7 Feb 2018 18:29:30 +0000
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Lieven <pl@kamp.de>
Cc: qemu-devel@nongnu.org, Juan Quintela, Fam Zheng, Stefan Hajnoczi, qemu block, jjherne@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] Block Migration and CPU throttling
Message-ID: <20180207182930.GV2665@work-vm>
In-Reply-To: <8ed6cb1e-4770-21ec-164e-7142e649eab3@kamp.de>
References: <6cc4a99c-0212-6b7b-4a12-3e898215bea9@kamp.de> <20170919143829.GH2107@work-vm> <48cdd8ff-4940-9aa1-9aba-1acf8ce74ebe@kamp.de> <20170919144130.GI2107@work-vm> <20170921123625.GE2717@work-vm> <20171212170533.GG2409@work-vm> <8ed6cb1e-4770-21ec-164e-7142e649eab3@kamp.de>

* Peter Lieven (pl@kamp.de) wrote:
> On 12.12.2017 at 18:05, Dr. David Alan Gilbert wrote:
> > * Peter Lieven (pl@kamp.de) wrote:
> > > On 21.09.2017 at 14:36, Dr. David Alan Gilbert wrote:
> > > > * Peter Lieven (pl@kamp.de) wrote:
> > > > > On 19.09.2017 at 16:41, Dr. David Alan Gilbert wrote:
> > > > > > * Peter Lieven (pl@kamp.de) wrote:
> > > > > > > On 19.09.2017 at 16:38, Dr. David Alan Gilbert wrote:
> > > > > > > > * Peter Lieven (pl@kamp.de) wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > I just noticed that CPU throttling and block migration don't work together very well.
> > > > > > > > > During block migration the throttling heuristic detects that we apparently make no
> > > > > > > > > progress in RAM transfer, but the reason is the running block migration, not a too
> > > > > > > > > high dirty-page rate.
> > > > > > > > >
> > > > > > > > > The result is that any VM is throttled by 99% during block migration.
> > > > > > > > Hmm, that's unfortunate; do you have a bandwidth set lower than your
> > > > > > > > actual network connection? I'm just wondering if it's actually going
> > > > > > > > between the block and RAM iterative sections or getting stuck in one.
> > > > > > > It happens also if source and dest are on the same machine and speed is set to 100G.
> > > > > > But does it happen if they're not and the speed is set low?
> > > > > Yes, it does. I noticed it in our test environment between different nodes with a 10G
> > > > > link in between, but it's totally clear why it happens. During block migration we
> > > > > transfer all dirty memory pages in each round (if there is moderate memory load), and
> > > > > the dirty pages are obviously more than 50% of the RAM transferred in that round; in
> > > > > fact it is 100%. The current logic triggers on this condition.
> > > > >
> > > > > I think I will go forward and send a patch which disables auto-converge during the
> > > > > block migration bulk stage.
> > > > Yes, that's fair; it probably would also make sense to throttle the RAM
> > > > migration during the block migration bulk stage, since the chances are
> > > > it's not going to get far. (I think in the nbd setup, the main
> > > > migration process isn't started until the end of bulk.)
> > > Catching up with the idea of delaying RAM migration until the block bulk stage has
> > > completed: what do you think is the easiest way to achieve this?
> >
> > I think the answer depends on whether we think this is a one-off 'special' or whether
> > we need a new general-purpose mechanism.
> >
> > If it was really general then we'd probably want to split the iterative
> > stage in two somehow, and only do RAM in the second half.
> >
> > But I'm not sure it's worth it; I suspect the easiest way is:
> >
> >    a) Add a counter in migration/ram.c or in the RAM state somewhere
> >    b) Make ram_save_inhibit increment the counter
> >    c) Check the counter at the head of ram_save_iterate and just exit
> >       if it's non-zero
> >    d) Call ram_save_inhibit from block_save_setup
> >    e) Then release it when you've finished the bulk stage
> >
> > Make sure you still count the RAM in the pending totals, otherwise
> > migration might think it's finished a bit early.
>
> Is there any catch I don't see, or is it as easy as this?

Hmm, looks promising, doesn't it; it might need an include or two tidied up,
but it looks worth a try. Just be careful that there are no cases where block
migration can't transfer data in that state, otherwise we'll keep coming back
here and spewing empty sections.

Dave
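A minimal, standalone sketch of the counter-based scheme from points a) to e)
above, assuming hypothetical names taken from that outline:
ram_save_inhibit()/ram_save_uninhibit() are not existing QEMU functions, and
the iterate stub below has none of the real ram_save_iterate() signature or
state; it only models the inhibit/release flow.

/*
 * Toy model of the counter-based approach sketched in a)-e) above.
 * Function names come from that outline; nothing here is QEMU code.
 */
#include <stdio.h>

static unsigned int ram_save_inhibit_count;   /* a) the counter */

static void ram_save_inhibit(void)            /* b) bump it */
{
    ram_save_inhibit_count++;
}

static void ram_save_uninhibit(void)          /* e) release it */
{
    ram_save_inhibit_count--;
}

static void ram_save_iterate_stub(void)       /* c) early exit while held */
{
    if (ram_save_inhibit_count) {
        /* RAM must still be reported in the pending totals elsewhere,
         * or migration may think it has converged too early. */
        printf("RAM iteration skipped (inhibited)\n");
        return;
    }
    printf("transferring dirty RAM pages\n");
}

int main(void)
{
    ram_save_inhibit();       /* d) block_save_setup() would do this */
    ram_save_iterate_stub();  /* block bulk stage still running */
    ram_save_uninhibit();     /* bulk stage finished */
    ram_save_iterate_stub();  /* normal RAM migration resumes */
    return 0;
}

The patch quoted below takes a simpler route: instead of a counter owned by
the RAM code, ram_save_iterate() asks the block layer directly through
blk_mig_bulk_active().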
> diff --git a/migration/ram.c b/migration/ram.c
> index cb1950f..c67bcf1 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2255,6 +2255,13 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      int64_t t0;
>      int done = 0;
>
> +    if (blk_mig_bulk_active()) {
> +        /* Avoid transferring RAM during bulk phase of block migration as
> +         * the bulk phase will usually take a lot of time and transferring
> +         * RAM updates again and again is pointless. */
> +        goto out;
> +    }
> +
>      rcu_read_lock();
>      if (ram_list.version != rs->last_version) {
>          ram_state_reset(rs);
> @@ -2301,6 +2308,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>       */
>      ram_control_after_iterate(f, RAM_CONTROL_ROUND);
>
> +out:
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>      ram_counters.transferred += 8;
>
>
> Peter
>

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
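For background on the 50% condition mentioned in the thread: auto-converge
compares the RAM dirtied during a sync period with the RAM actually
transferred in that period and throttles the guest harder each time the
dirtied side exceeds half of it. The following standalone sketch uses
invented page sizes, throttle steps and per-round numbers, not the actual
code in migration/ram.c; it only shows why a round in which every dirty page
is resent (the 100% case) trips the check each time until the 99% cap is hit.

/*
 * Illustration of the auto-converge condition discussed in the thread.
 * All constants are invented; the real logic in migration/ram.c differs
 * in detail (counters, hysteresis, tunables).
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096

static unsigned throttle_pct;

static void mig_throttle_guest_down_stub(void)
{
    /* start throttling and step it up on each trigger, capped at 99% */
    throttle_pct = throttle_pct ? throttle_pct + 10 : 20;
    if (throttle_pct > 99) {
        throttle_pct = 99;
    }
}

/* Trigger when the RAM dirtied in the period exceeds half of the RAM
 * bytes transferred in the same period. */
static void check_auto_converge(uint64_t dirty_pages, uint64_t ram_bytes_sent)
{
    if (dirty_pages * PAGE_SIZE > ram_bytes_sent / 2) {
        mig_throttle_guest_down_stub();
    }
}

int main(void)
{
    for (int round = 0; round < 10; round++) {
        uint64_t dirty_pages = 20000;
        /* During the block bulk stage every dirty page is resent each
         * round, so dirtied RAM equals transferred RAM (the "100%" case),
         * which is always more than the 50% threshold. */
        uint64_t ram_bytes_sent = dirty_pages * PAGE_SIZE;
        check_auto_converge(dirty_pages, ram_bytes_sent);
    }
    printf("guest ends up throttled to %u%%\n", throttle_pct);
    return 0;
}

Skipping the RAM iteration (or disabling auto-converge) during the bulk stage
keeps this check from being fed rounds whose lack of progress is caused by
block migration rather than by guest memory load.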