From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <518003EC.6060002@hp.com>
Date: Tue, 30 Apr 2013 10:48:28 -0700
From: Chegu Vinod
References: <1367095836-19318-1-git-send-email-chegu_vinod@hp.com> <877gjkm4ds.fsf@elfo.elfo> <517FE964.2050702@hp.com> <87y5c0knwv.fsf@elfo.elfo>
In-Reply-To: <87y5c0knwv.fsf@elfo.elfo>
Subject: Re: [Qemu-devel] [RFC PATCH v2] Throttle-down guest when live migration does not converge.
To: quintela@redhat.com
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, anthony@codemonkey.ws, owasserm@redhat.com

On 4/30/2013 9:01 AM, Juan Quintela wrote:
> Chegu Vinod wrote:
>> On 4/30/2013 8:20 AM, Juan Quintela wrote:
>>>> (qemu) info migrate
>>>> capabilities: xbzrle: off auto-converge: off  <----
>>>> Migration status: active
>>>> total time: 1487503 milliseconds
>>> 148 seconds
>> 1487 seconds, and still the migration is not completed.
>>
>>>> expected downtime: 519 milliseconds
>>>> transferred ram: 383749347 kbytes
>>>> remaining ram: 2753372 kbytes
>>>> total ram: 268444224 kbytes
>>>> duplicate: 65461532 pages
>>>> skipped: 64901568 pages
>>>> normal: 95750218 pages
>>>> normal bytes: 383000872 kbytes
>>>> dirty pages rate: 67551 pages
>>>>
>>>> ---
>>>>
>>>> (qemu) info migrate
>>>> capabilities: xbzrle: off auto-converge: on  <----
>>>> Migration status: completed
>>>> total time: 241161 milliseconds
>>>> downtime: 6373 milliseconds
>>> 6.3 seconds and finished, not bad at all O:-)
>> That's the *downtime*. The total time for the migration to complete is
>> 241 secs. (SpecJBB is one of those workloads that dirties memory quite
>> a bit.)
> Sorry, you are right. Impressive anyway for such a small change.
>
>>>> +/* To reduce the dirty rate, explicitly disallow the VCPUs from spending
>>>> + much time in the VM. The migration thread will try to catch up.
>>>> + The workload will experience a greater performance drop, but for a
>>>> + shorter duration.
>>>> +*/
>>>> +void *migration_throttle_down(void *opaque)
>>>> +{
>>>> +    throttling = true;
>>>> +    while (throttling_needed()) {
>>>> +        CPUArchState *penv = first_cpu;
>>> I am not sure that we can follow the list without the iothread lock
>>> here.
>> Hmm.. Is this due to vcpu hot plug that might happen at the time of
>> live migration, or due to something else? I was trying to avoid holding
>> the iothread lock for a longer duration and slowing down the migration
>> thread...
> Well, thinking back about it, what we should do is disable cpu
> hotplug/unplug during migration

I tend to agree. For now I am not going to hold the iothread lock while
following the list...

> (it is not working well anyways as today).

Yes... and I see that Igor, Eduardo et al. are trying to fix this.

Vinod

>
> Thanks, Juan.