Date: Wed, 20 Mar 2013 14:29:24 +0100
From: Stefan Hajnoczi
To: Benoît Canet
Cc: kwolf@redhat.com, wuzhy@linux.vnet.ibm.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] block: fix bdrv_exceed_iops_limits wait computation
Message-ID: <20130320132924.GB14441@stefanha-thinkpad.muc.redhat.com>
In-Reply-To: <1363770734-30970-2-git-send-email-benoit@irqsave.net>

On Wed, Mar 20, 2013 at 10:12:14AM +0100, Benoît Canet wrote:
> This patch fixes an I/O throttling behavior triggered by limiting at 150 iops
> and running a load of 50 threads doing random pwrites on a raw virtio device.
>
> After a few moments the iops count starts to oscillate steadily between 0 and
> a value higher than the limit.
>
> As the load keeps running, the period and the amplitude of the oscillation
> increase until I/O bursts reaching the physical storage's maximum iops count
> occur.
>
> These bursts are followed by guest I/O starvation.
>
> Since the period of this oscillation cycle keeps increasing, the cause must
> be a computation error that slowly inflates the wait time.
>
> This patch makes the wait time a bit smaller and tests confirm that it solves
> the oscillating behavior.
>
> Signed-off-by: Benoit Canet
> ---
>  block.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block.c b/block.c
> index 0a062c9..455d8b0 100644
> --- a/block.c
> +++ b/block.c
> @@ -3739,7 +3739,7 @@ static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
>      }
>  
>      /* Calc approx time to dispatch */
> -    wait_time = (ios_base + 1) / iops_limit;
> +    wait_time = ios_base / iops_limit;
>      if (wait_time > elapsed_time) {
>          wait_time = wait_time - elapsed_time;
>      } else {

I tried reproducing without your test case:

  dd if=/dev/vdb of=/dev/null bs=4096 iflag=direct

I've pasted printfs below which reveal that the wait time increases
monotonically!  In other words, dd is slowed down more and more as it runs:

bs 0x7f56fe187a30 co 0x7f56fe211cb0 throttled for 1426 ms
bs 0x7f56fe187a30 co 0x7f56fe211cb0 woke up from throttled_reqs after sleeping
bs 0x7f56fe187a30 co 0x7f56fe211cb0 throttled for 1431 ms
bs 0x7f56fe187a30 co 0x7f56fe211cb0 woke up from throttled_reqs after sleeping
bs 0x7f56fe187a30 co 0x7f56fe211cb0 throttled for 1437 ms
bs 0x7f56fe187a30 co 0x7f56fe211cb0 woke up from throttled_reqs after sleeping
...

Killing dd and starting it again resets the accumulated delay (probably
because we end the slice and state is cleared).  This suggests workloads
that are constantly at the I/O limit will experience creeping delay or the
oscillations you found.
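
For scale: before the clamp against elapsed_time, the patch shifts the
computed wait for each throttling decision by exactly 1/iops_limit.  A tiny
standalone calculation (made-up numbers, not QEMU code) shows the
per-decision difference between the two formulas:

  #include <stdio.h>

  int main(void)
  {
      /* Made-up example values: 150 IOPS limit, 220 I/Os accounted in the
       * current slice, 1.2 s elapsed since slice_start.  wait_time in the
       * real formula is in seconds, hence the * 1e3 for milliseconds. */
      double iops_limit   = 150.0;
      double ios_base     = 220.0;
      double elapsed_time = 1.2;

      double wait_old = (ios_base + 1) / iops_limit - elapsed_time; /* current code */
      double wait_new = ios_base / iops_limit - elapsed_time;       /* with this patch */

      /* Prints: old 273.3 ms, new 266.7 ms, delta 6.7 ms (= 1000 / iops_limit) */
      printf("old %.1f ms, new %.1f ms, delta %.1f ms\n",
             wait_old * 1e3, wait_new * 1e3, (wait_old - wait_new) * 1e3);
      return 0;
  }

So the change itself is tiny per decision; the open question is why a ~7 ms
shift accumulates into the multi-second delays shown above.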
After applying your patch I observed the opposite behavior: the wait time
decreases until it resets itself.  Perhaps we're waiting less and less
until we just finish the slice and all values reset:

bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 496 ms
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 woke up from throttled_reqs after sleeping
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 489 ms
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 woke up from throttled_reqs after sleeping
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 484 ms
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 woke up from throttled_reqs after sleeping
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 480 ms
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 woke up from throttled_reqs after sleeping
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 474 ms
...
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 300 ms
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 woke up from throttled_reqs after sleeping
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 299 ms
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 woke up from throttled_reqs after sleeping
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 298 ms
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 woke up from throttled_reqs after sleeping
bs 0x7f2cd2c52a30 co 0x7f2cd2ce3910 throttled for 494 ms

I'm not confident that I understand the effects of your patch.  Do you
have an explanation for these results?

More digging will probably be necessary to solve the underlying problem
here.

diff --git a/block.c b/block.c
index 0a062c9..7a8c9e6 100644
--- a/block.c
+++ b/block.c
@@ -175,7 +175,9 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
     int64_t wait_time = -1;
 
     if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
+        fprintf(stderr, "bs %p co %p waiting for throttled_reqs\n", bs, qemu_coroutine_self());
         qemu_co_queue_wait(&bs->throttled_reqs);
+        fprintf(stderr, "bs %p co %p woke up from throttled_reqs\n", bs, qemu_coroutine_self());
     }
 
     /* In fact, we hope to keep each request's timing, in FIFO mode. The next
@@ -188,7 +190,9 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
     while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) {
         qemu_mod_timer(bs->block_timer,
                        wait_time + qemu_get_clock_ns(vm_clock));
+        fprintf(stderr, "bs %p co %p throttled for %"PRId64" ms\n", bs, qemu_coroutine_self(), wait_time);
         qemu_co_queue_wait_insert_head(&bs->throttled_reqs);
+        fprintf(stderr, "bs %p co %p woke up from throttled_reqs after sleeping\n", bs, qemu_coroutine_self());
     }
 
     qemu_co_queue_next(&bs->throttled_reqs);
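
For anyone who wants to reproduce this, an iops limit on the -drive option
plus the dd loop above is enough.  Something along these lines should work
(the image path and the other -drive options here are only placeholders,
pick whatever matches your setup):

  qemu-system-x86_64 ... \
      -drive file=test.raw,if=virtio,cache=none,iops=150

and then run the dd command against the corresponding virtio disk (e.g.
/dev/vdb) inside the guest.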