From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:60303)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1UXD0C-0003Kd-0V for qemu-devel@nongnu.org;
    Tue, 30 Apr 2013 12:02:05 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1UXD06-0007Zg-UR for qemu-devel@nongnu.org;
    Tue, 30 Apr 2013 12:01:59 -0400
Received: from mx1.redhat.com ([209.132.183.28]:43118)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1UXD06-0007ZJ-DS for qemu-devel@nongnu.org;
    Tue, 30 Apr 2013 12:01:54 -0400
From: Juan Quintela
In-Reply-To: <517FE964.2050702@hp.com> (Chegu Vinod's message of "Tue, 30 Apr 2013 08:55:16 -0700")
References: <1367095836-19318-1-git-send-email-chegu_vinod@hp.com>
    <877gjkm4ds.fsf@elfo.elfo> <517FE964.2050702@hp.com>
Date: Tue, 30 Apr 2013 18:01:52 +0200
Message-ID: <87y5c0knwv.fsf@elfo.elfo>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [RFC PATCH v2] Throttle-down guest when live migration does not converge.
Reply-To: quintela@redhat.com
To: Chegu Vinod
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, anthony@codemonkey.ws, owasserm@redhat.com

Chegu Vinod wrote:
> On 4/30/2013 8:20 AM, Juan Quintela wrote:
>>>
>>> (qemu) info migrate
>>> capabilities: xbzrle: off auto-converge: off <----
>>> Migration status: active
>>> total time: 1487503 milliseconds
>> 148 seconds
>
> 1487 seconds and still the Migration is not completed.
>
>>
>>> expected downtime: 519 milliseconds
>>> transferred ram: 383749347 kbytes
>>> remaining ram: 2753372 kbytes
>>> total ram: 268444224 kbytes
>>> duplicate: 65461532 pages
>>> skipped: 64901568 pages
>>> normal: 95750218 pages
>>> normal bytes: 383000872 kbytes
>>> dirty pages rate: 67551 pages
>>>
>>> ---
>>>
>>> (qemu) info migrate
>>> capabilities: xbzrle: off auto-converge: on <----
>>> Migration status: completed
>>> total time: 241161 milliseconds
>>> downtime: 6373 milliseconds
>> 6.3 seconds and finished, not bad at all O:-)
> That's the *downtime*.. The total time for migration to complete is
> 241 secs. (SpecJBB is
> one of those workloads that dirties memory quite a bit).

Sorry, you are right. Impressive anyway for such a small change.

>>> +/* To reduce the dirty rate explicitly disallow the VCPUs from spending
>>> +   much time in the VM. The migration thread will try to catchup.
>>> +   Workload will experience a greater performance drop but for a shorter
>>> +   duration.
>>> +*/
>>> +void *migration_throttle_down(void *opaque)
>>> +{
>>> +    throttling = true;
>>> +    while (throttling_needed()) {
>>> +        CPUArchState *penv = first_cpu;
>> I am not sure that we can follow the list without the iothread lock
>> here.
>
> Hmm.. Is this due to vcpu hot plug that might happen at the time of
> live migration (or) due
> to something else ? I was trying to avoid holding the iothread lock
> for longer duration and slow
> down the migration thread...

Well, thinking back about it, what we should do is disable CPU
hotplug/unplug during migration (it is not working well anyway as of
today).

Thanks, Juan.
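
To make the locking concern above concrete (following the VCPU list while a hotplug or unplug could change it), here is a minimal sketch. It is not QEMU code: DemoCPU, demo_list_lock, demo_first_cpu, demo_cpu_add() and demo_throttle_all_cpus() are hypothetical names invented for illustration, and a plain pthread mutex stands in for the iothread lock. The only point is that the walker and anything that modifies the list take the same lock, and that the lock is held only for the short walk.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a VCPU list entry; this is NOT QEMU's
 * CPUArchState. */
typedef struct DemoCPU {
    struct DemoCPU *next;
    int throttle_requests;   /* how many times this VCPU was asked to pause */
} DemoCPU;

/* Hypothetical lock protecting the list; it plays the role that the
 * iothread lock plays in the discussion above. */
static pthread_mutex_t demo_list_lock = PTHREAD_MUTEX_INITIALIZER;
static DemoCPU *demo_first_cpu;

/* Add a VCPU at the head of the list, taking the same lock as the walker. */
void demo_cpu_add(DemoCPU *cpu)
{
    assert(cpu != NULL);
    pthread_mutex_lock(&demo_list_lock);
    cpu->next = demo_first_cpu;
    demo_first_cpu = cpu;
    pthread_mutex_unlock(&demo_list_lock);
}

/* Walk the list with the lock held so that a concurrent add cannot change
 * the links mid-traversal. The lock is dropped as soon as the walk ends.
 * Returns the number of VCPUs visited. */
int demo_throttle_all_cpus(void)
{
    int visited = 0;

    pthread_mutex_lock(&demo_list_lock);
    for (DemoCPU *cpu = demo_first_cpu; cpu != NULL; cpu = cpu->next) {
        cpu->throttle_requests++;
        visited++;
    }
    pthread_mutex_unlock(&demo_list_lock);

    return visited;
}
```

If hotplug/unplug is disabled for the duration of the migration, as suggested above, the list cannot change under the walker and the lock in demo_throttle_all_cpus() would no longer be needed for that reason.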