Date: Thu, 21 Mar 2013 15:28:57 +0100
From: Benoît Canet
Message-ID: <20130321142856.GA4259@irqsave.net>
In-Reply-To: <20130321103453.GB12555@stefanha-thinkpad.redhat.com>
Subject: Re: [Qemu-devel] [PATCH] block: fix bdrv_exceed_iops_limits wait computation
To: Stefan Hajnoczi
Cc: Benoît Canet, kwolf@redhat.com, wuzhy@linux.vnet.ibm.com, qemu-devel@nongnu.org

The +1 was there to account for the current request as if it had already completed in this slice. Statistically, there is a 50% chance that this is wrong. I toyed with using +0.5 instead: wait_time no longer drifts and the iops no longer oscillate.
diff --git a/block.c b/block.c
index 0a062c9..455d8b0 100644
--- a/block.c
+++ b/block.c
@@ -3739,7 +3739,7 @@ static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
     }

     /* Calc approx time to dispatch */
-    wait_time = (ios_base + 1) / iops_limit;
+    wait_time = (ios_base + 0.5) / iops_limit;
     if (wait_time > elapsed_time) {
         wait_time = wait_time - elapsed_time;
     } else {

I will let a VM run this patch for a while.

Regards

Benoît

> On Thursday 21 Mar 2013 at 11:34:53 (+0100), Stefan Hajnoczi wrote:
> On Wed, Mar 20, 2013 at 04:27:14PM +0100, Benoît Canet wrote:
> > > Now there is no oscillation and the wait_times do not grow or shrink
> > > under constant load from dd(1).
> > >
> > > Can you try this patch by itself to see if it fixes the oscillation?
> >
> > On my test setup it fixes the oscillation and leads to an average of
> > 149.88 iops. However, another pattern appears.
> > iostat -d 1 -x shows something between 150 and 160 iops for several
> > samples, then one sample shows around 70 iops to compensate for the
> > additional ios, and this cycle restarts.
>
> I've begun drilling down on these fluctuations.
>
> I think the problem is that I/O throttling uses bdrv_acct_done()
> accounting. bdrv_acct_done() is only called when requests complete. This
> has the following problem:
>
> Number of IOPS in this slice @ 150 IOPS = 15 ops per 100 ms slice
>
> 14 ops have completed already; only 1 more can proceed.
>
> 3 ops arrive in rapid succession:
>
> Op #1: Allowed through since 1 op can proceed. We submit the op.
> Op #2: Allowed through since op #1 is still in progress so
>        bdrv_acct_done() has not been called yet.
> Op #3: Allowed through since op #1 & #2 are still in progress so
>        bdrv_acct_done() has not been called yet.
>
> Now when the ops start completing and the slice is extended, we end up
> with weird wait times since we overspent our budget.
>
> I'm going to try a fix for delayed accounting. Will report back with
> patches if it is successful.
>
> Stefan