Date: Mon, 13 May 2013 13:33:27 +0100
From: "Daniel P. Berrange"
Message-ID: <20130513123327.GA32268@redhat.com>
References: <1368128600-30721-1-git-send-email-chegu_vinod@hp.com> <1368128600-30721-4-git-send-email-chegu_vinod@hp.com> <87y5bnc7a0.fsf@codemonkey.ws> <20130510141759.GO13475@redhat.com> <87li7mzxd6.fsf@codemonkey.ws>
In-Reply-To: <87li7mzxd6.fsf@codemonkey.ws>
Subject: Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convergence of live migration
To: Anthony Liguori
Cc: quintela@redhat.com, Chegu Vinod, qemu-devel@nongnu.org, owasserm@redhat.com, pbonzini@redhat.com

On Fri, May 10, 2013 at 10:08:05AM -0500, Anthony Liguori wrote:
> "Daniel P. Berrange" writes:
>
> > On Fri, May 10, 2013 at 08:07:51AM -0500, Anthony Liguori wrote:
> >> Chegu Vinod writes:
> >>
> >> > If a user chooses to turn on the auto-converge migration capability,
> >> > these changes detect the lack of convergence and throttle down the
> >> > guest, i.e. force the VCPUs out of the guest for some duration
> >> > and let the migration thread catch up and help convergence.
> >> >
> >> > Verified the convergence using the following:
> >> > - SpecJbb2005 workload running on a 20VCPU/256G guest (~80% busy)
> >> > - OLTP-like workload running on an 80VCPU/512G guest (~80% busy)
> >> >
> >> > Sample results with SpecJbb2005 workload: (migrate speed set to 20Gb and
> >> > migrate downtime set to 4 seconds).
> >>
> >> Would it make sense to separate out the "slow the VCPU down" part of
> >> this?
> >>
> >> That would give a management tool more flexibility to create policies
> >> around slowing the VCPU down to encourage migration.
> >>
> >> In fact, I wonder if we need anything in the migration path if we just
> >> expose the "slow the VCPU down" bit as a feature.
> >>
> >> Slowing the VCPU down is not quite the same as setting the priority of
> >> the VCPU thread, largely because of the QBL, so I recognize the need to
> >> have something for this in QEMU.
> >
> > Rather than the priority, could you perhaps do the VCPU slow-down
> > using the cfs_quota_us + cfs_period_us settings though? These let you
> > place hard caps on the scheduler time afforded to vCPUs, and we can
> > already control those via libvirt + cgroups.
>
> The problem with the bandwidth controller is the same as with priorities.
> You can end up causing lock holder pre-emption, which would negatively
> impact migration performance.
>
> It's far better for QEMU to voluntarily give up some time, knowing that
> it's not holding the QBL, since then migration can continue without
> impact.
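(For concreteness, the "hard cap" I'm referring to is just the CFS
bandwidth controller knobs. A rough sketch of driving them directly is
below; the cgroup mount point and the per-vCPU directory name are
illustrative assumptions, not the actual layout libvirt creates:)

    #!/usr/bin/env python3
    # Sketch only: cap a vCPU thread's runtime with the CFS bandwidth controller.
    # The cgroup path below is an assumption; the real hierarchy is whatever
    # libvirt / the management layer sets up for the guest's vCPU threads.

    VCPU_CGROUP = "/sys/fs/cgroup/cpu/machine/qemu-guest/vcpu0"

    def set_hard_cap(quota_us, period_us=100000):
        """Allow at most quota_us of CPU time per period_us,
        e.g. 20000/100000 caps the vCPU at ~20% of one host CPU."""
        with open(VCPU_CGROUP + "/cpu.cfs_period_us", "w") as f:
            f.write(str(period_us))
        with open(VCPU_CGROUP + "/cpu.cfs_quota_us", "w") as f:
            f.write(str(quota_us))

    if __name__ == "__main__":
        set_hard_cap(20000)   # throttle to roughly 20% of one CPU

(Writing -1 to cpu.cfs_quota_us removes the cap again, so a management
tool can tighten and relax the limit over the course of a migration
without any new interface in QEMU.)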
IMHO it'd be nice to get some clear benchmark numbers of just how big the
lock holder pre-emption problem is when using cgroup hard caps, before we
invent another mechanism for throttling the CPUs that has to be plumbed
into the whole stack.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|