From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 2 Jun 2015 15:57:18 +0100
From: "Dr. David Alan Gilbert"
To: "Jason J. Herne"
Cc: amit.shah@redhat.com, borntraeger@de.ibm.com, quintela@redhat.com,
	qemu-devel@nongnu.org, afaerber@suse.de
Subject: Re: [Qemu-devel] [PATCH 2/2] migration: Dynamic cpu throttling for auto-converge
Message-ID: <20150602145718.GD2139@work-vm>
In-Reply-To: <556DBF9F.9000004@linux.vnet.ibm.com>
References: <1433171851-18507-1-git-send-email-jjherne@linux.vnet.ibm.com>
 <1433171851-18507-3-git-send-email-jjherne@linux.vnet.ibm.com>
 <20150601153259.GK2314@work-vm>
 <556C936F.8090606@linux.vnet.ibm.com>
 <20150602135855.GC2139@work-vm>
 <556DBF9F.9000004@linux.vnet.ibm.com>

* Jason J. Herne (jjherne@linux.vnet.ibm.com) wrote:
> On 06/02/2015 09:58 AM, Dr. David Alan Gilbert wrote:
> >* Jason J. Herne (jjherne@linux.vnet.ibm.com) wrote:
> >>On 06/01/2015 11:32 AM, Dr. David Alan Gilbert wrote:
> >>>* Jason J. Herne (jjherne@linux.vnet.ibm.com) wrote:
> >>>>Remove traditional auto-converge static 30ms throttling code and replace it
> >>>>with a dynamic throttling algorithm.
> >>>>
> >>>>Additionally, be more aggressive when deciding when to start throttling.
> >>>>Previously we waited until four unproductive memory passes. Now we begin
> >>>>throttling after only two unproductive memory passes. Four seemed quite
> >>>>arbitrary, and only waiting for two passes allows us to complete the
> >>>>migration faster.
> >>>>
> >>>>Signed-off-by: Jason J. Herne
> >>>>Reviewed-by: Matthew Rosato
> >>>>---
> >>>> arch_init.c           | 95 +++++++++++++++++----------------------------------
> >>>> migration/migration.c |  9 +++++
> >>>> 2 files changed, 41 insertions(+), 63 deletions(-)
> >>>>
> >>>>diff --git a/arch_init.c b/arch_init.c
> >>>>index 23d3feb..73ae494 100644
> >>>>--- a/arch_init.c
> >>>>+++ b/arch_init.c
> >>>>@@ -111,9 +111,7 @@ int graphic_depth = 32;
> >>>> #endif
> >>>>
> >>>> const uint32_t arch_type = QEMU_ARCH;
> >>>>-static bool mig_throttle_on;
> >>>> static int dirty_rate_high_cnt;
> >>>>-static void check_guest_throttling(void);
> >>>>
> >>>> static uint64_t bitmap_sync_count;
> >>>>
> >>>>@@ -487,6 +485,31 @@ static size_t save_page_header(QEMUFile *f, RAMBlock *block, ram_addr_t offset)
> >>>>     return size;
> >>>> }
> >>>>
> >>>>+/* Reduce amount of guest cpu execution to hopefully slow down memory writes.
> >>>>+ * If guest dirty memory rate is reduced below the rate at which we can
> >>>>+ * transfer pages to the destination then we should be able to complete
> >>>>+ * migration. Some workloads dirty memory way too fast and will not effectively
> >>>>+ * converge, even with auto-converge. For these workloads we will continue to
> >>>>+ * increase throttling until the guest is paused long enough to complete the
> >>>>+ * migration. This essentially becomes a non-live migration.
> >>>>+ */
> >>>>+static void mig_throttle_guest_down(void)
> >>>>+{
> >>>>+    CPUState *cpu;
> >>>>+
> >>>>+    CPU_FOREACH(cpu) {
> >>>>+        /* We have not started throttling yet. Let's start it. */
> >>>>+        if (!cpu_throttle_active(cpu)) {
> >>>>+            cpu_throttle_start(cpu, 0.2);
> >>>>+        }
> >>>>+
> >>>>+        /* Throttling is already in place. Just increase the throttling rate. */
> >>>>+        else {
> >>>>+            cpu_throttle_start(cpu, cpu_throttle_get_ratio(cpu) * 2);
> >>>>+        }
> >>>
> >>>Now that migration has migrate_parameters, it would be best to replace
> >>>the magic numbers (the 0.2, the *2 - anything else?) with parameters that can
> >>>change the starting throttling and the increase rate. It would probably also
> >>>be good to make the current throttling rate visible in info somewhere; maybe
> >>>info migrate?
> >>
> >>I did consider all of this. However, I don't think that the controls
> >>this patch provides are an ideal throttling mechanism. I suspect someone
> >>with vcpu/scheduling experience could whip up something more user-friendly
> >>and cleaner. I merely propose this because it seems better than what we
> >>have today for auto-converge.
> >>
> >>Also, I'm not sure how useful the information really is to the user. The
> >>fact that it is a ratio instead of a percentage might be confusing. Also,
> >>I suspect it is not truly very accurate. Again, I was going for "make it
> >>better", not "make it perfect".
> >>
> >>Lastly, if we define this external interface we are kind of stuck with it,
> >>yes?
> >
> >Well, one thing you can do is add a parameter with a name starting with x-,
> >which means it's not a fixed interface (so things like libvirt won't use it).
> >And the reason I was interested in seeing the information was that otherwise
> >we don't really have any way of knowing how well the code is working;
> >is it already throttling down more and more?
> 
> Fair point. Can we add a qmp/hmp command like "info x-cpu-throttle"? Or a
> new line in "info migrate" output, "x-throttle-ratio: 1.0" perhaps?
> Would this mess up libvirt's parsing of the hmp/qmp data?

A friendly libvirt person said that it will ignore extra data (and if it
doesn't, it's a bug).

Dave

> --
> -- Jason J. Herne (jjherne@linux.vnet.ibm.com)

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK