Date: Thu, 21 Mar 2013 15:28:57 +0100
From: Benoît Canet
Message-ID: <20130321142856.GA4259@irqsave.net>
In-Reply-To: <20130321103453.GB12555@stefanha-thinkpad.redhat.com>
Subject: Re: [Qemu-devel] [PATCH] block: fix bdrv_exceed_iops_limits wait computation
To: Stefan Hajnoczi
Cc: Benoît Canet, kwolf@redhat.com, wuzhy@linux.vnet.ibm.com, qemu-devel@nongnu.org

The +1 was there to account for the current request as if it had already completed in this slice. Statistically, there is a 50% chance that this is wrong. I toyed with using +0.5 instead: wait_time no longer drifts and the iops no longer oscillate.
diff --git a/block.c b/block.c
index 0a062c9..455d8b0 100644
--- a/block.c
+++ b/block.c
@@ -3739,7 +3739,7 @@ static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
     }

     /* Calc approx time to dispatch */
-    wait_time = (ios_base + 1) / iops_limit;
+    wait_time = (ios_base + 0.5) / iops_limit;
     if (wait_time > elapsed_time) {
         wait_time = wait_time - elapsed_time;
     } else {

I will let a VM run this patch for a while.

Regards

Benoît

> On Thursday 21 Mar 2013 at 11:34:53 (+0100), Stefan Hajnoczi wrote:
> On Wed, Mar 20, 2013 at 04:27:14PM +0100, Benoît Canet wrote:
> > > Now there is no oscillation and the wait_times do not grow or shrink
> > > under constant load from dd(1).
> > >
> > > Can you try this patch by itself to see if it fixes the oscillation?
> >
> > On my test setup it fixes the oscillation and leads to an average of
> > 149.88 iops. However, another pattern appears.
> > iostat -d 1 -x shows something between 150 and 160 iops for several
> > samples, then one sample shows around 70 iops to compensate for the
> > additional ios, and this cycle restarts.
>
> I've begun drilling down on these fluctuations.
>
> I think the problem is that I/O throttling uses bdrv_acct_done()
> accounting. bdrv_acct_done() is only called when requests complete. This
> has the following problem:
>
> Number of IOPS in this slice @ 150 IOPS = 15 ops per 100 ms slice
>
> 14 ops have completed already; only 1 more can proceed.
>
> 3 ops arrive in rapid succession:
>
> Op #1: Allowed through since 1 op can proceed. We submit the op.
> Op #2: Allowed through since op #1 is still in progress so
>        bdrv_acct_done() has not been called yet.
> Op #3: Allowed through since op #1 & #2 are still in progress so
>        bdrv_acct_done() has not been called yet.
>
> Now when the ops start completing and the slice is extended, we end up
> with weird wait times since we overspent our budget.
>
> I'm going to try a fix for delayed accounting. Will report back with
> patches if it is successful.
>
> Stefan