From: Eric Blake <eblake@redhat.com>
To: "Denis V. Lunev" <den@openvz.org>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <famz@redhat.com>,
	Evgeny Yakovlev <eyakovlev@virtuozzo.com>,
	Max Reitz <mreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	John Snow <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v5 4/4] block: ignore flush requests when storage is clean
Date: Thu, 7 Jul 2016 17:04:48 -0600
Message-ID: <577EE010.7090106@redhat.com>
In-Reply-To: <1467643124-29778-5-git-send-email-den@openvz.org>

On 07/04/2016 08:38 AM, Denis V. Lunev wrote:
> From: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
> 
> Some guests (win2008 server for example) do a lot of unnecessary
> flushing when underlying media has not changed. This adds additional
> overhead on host when calling fsync/fdatasync.
> 
> This change introduces a write generation scheme in BlockDriverState.
> Current write generation is checked against last flushed generation to
> avoid unnecessary flushes.
> 
> The problem with excessive flushing was found by a performance test
> which does parallel directory tree creation (from 2 processes).
> Results improved from 0.424 loops/sec to 0.432 loops/sec.
> Each loop creates 10^3 directories with 10 files in each.
> 

> +++ b/block/io.c
> @@ -1294,6 +1294,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
>      }
>      bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
>  
> +    ++bs->write_gen;

Why pre-increment?  Most code uses post-increment when done as a
statement in isolation.

>      bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
>  
>      if (bs->wr_highest_offset < offset + bytes) {
> @@ -2211,6 +2212,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  {
>      int ret;
>      BdrvTrackedRequest req;
> +    int current_gen = bs->write_gen;
>  
>      if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
>          bdrv_is_sg(bs)) {
> @@ -2219,6 +2221,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  
>      tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH);
>  
> +    /* Wait until any previous flushes are completed */
> +    while (bs->flush_started_gen != bs->flushed_gen) {

Should this be an inequality, as in s/!=/</, in case several flushes can
be started in parallel and where the later flush ends up finishing
before the earlier flush?
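
To make the concern concrete, here is a minimal sketch. It is not QEMU
code: struct gen_state and both helpers are hypothetical names, and the
ordered variant assumes that a flush is still in flight only while
flushed_gen trails flush_started_gen (the direction is my guess).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counters the patch adds to
 * BlockDriverState; this is not the real QEMU structure. */
struct gen_state {
    unsigned int write_gen;
    unsigned int flush_started_gen;
    unsigned int flushed_gen;
};

/* The wait condition as posted: wait whenever the two differ. */
bool flush_pending_ne(const struct gen_state *s)
{
    return s->flush_started_gen != s->flushed_gen;
}

/* Ordered variant (assumed direction): wait only while the last
 * completed flush is behind the last started one. */
bool flush_pending_ordered(const struct gen_state *s)
{
    return s->flushed_gen < s->flush_started_gen;
}

/* Usage: a later flush (generation 7) completed before an earlier one
 * (generation 5) recorded its start, so flushed_gen is ahead. */
void example_out_of_order(void)
{
    struct gen_state s = {
        .write_gen = 7, .flush_started_gen = 5, .flushed_gen = 7,
    };
    assert(flush_pending_ne(&s));       /* != would keep waiting */
    assert(!flush_pending_ordered(&s)); /* ordered check proceeds */
}
```

If flushes are fully serialized by the queue, that state may never
arise, in which case the two checks behave the same.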

> +        qemu_co_queue_wait(&bs->flush_queue);
> +    }
> +    bs->flush_started_gen = current_gen;
> +
>      /* Write back all layers by calling one driver function */
>      if (bs->drv->bdrv_co_flush) {
>          ret = bs->drv->bdrv_co_flush(bs);
> @@ -2239,6 +2247,11 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>          goto flush_parent;
>      }
>  
> +    /* Check if we really need to flush anything */
> +    if (bs->flushed_gen == current_gen) {

Likewise, if you are tracking generations, should this be s/==/<=/ (am I
getting the direction correct)?
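
A small sketch of the two candidate checks, under the assumption that
"clean" means the last completed flush covers at least the generation
sampled on entry. The helper names are hypothetical, and counter
wraparound is ignored here.

```c
#include <assert.h>
#include <stdbool.h>

/* The check as posted: skip the flush only on an exact match. */
bool can_skip_eq(unsigned int flushed_gen, unsigned int current_gen)
{
    return flushed_gen == current_gen;
}

/* Ordered variant (assumed direction): skip when the flushed
 * generation is at or beyond the one sampled on entry. */
bool can_skip_ordered(unsigned int flushed_gen, unsigned int current_gen)
{
    return current_gen <= flushed_gen;
}

/* Usage: this request sampled generation 5, but another flush has
 * since recorded generation 7 as flushed. */
void example_skip_check(void)
{
    assert(!can_skip_eq(7, 5));     /* == would flush again */
    assert(can_skip_ordered(7, 5)); /* ordered check skips */
}
```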

> +++ b/include/block/block_int.h
> @@ -420,6 +420,11 @@ struct BlockDriverState {
>                           note this is a reference count */
>      bool probed;
>  
> +    CoQueue flush_queue;            /* Serializing flush queue */
> +    unsigned int write_gen;         /* Current data generation */
> +    unsigned int flush_started_gen; /* Generation for which flush has started */
> +    unsigned int flushed_gen;       /* Flushed write generation */

Should these be 64-bit integers to avoid risk of overflow after just
2^32 flush attempts?
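
Overflow matters most if the comparisons become ordered, as suggested
above. A minimal sketch of the effect, using hypothetical helper names;
the wrap-tolerant variant is the usual serial-number trick and is only
one possible alternative to widening the fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Plain ordered comparison: gives the wrong answer once the
 * 32-bit counter has wrapped past zero. */
bool gen_before_plain(uint32_t a, uint32_t b)
{
    return a < b;
}

/* Wrap-tolerant comparison: a is "before" b when the modular
 * difference, viewed as signed, is negative. Valid while the two
 * values are less than 2^31 apart. The unsigned-to-signed cast is
 * implementation-defined in C, but wraps on mainstream compilers. */
bool gen_before_wrapsafe(uint32_t a, uint32_t b)
{
    return (int32_t)(uint32_t)(a - b) < 0;
}

/* Usage: one more increment after UINT32_MAX lands on 0. */
void example_wrap(void)
{
    uint32_t before_wrap = UINT32_MAX;
    uint32_t after_wrap = before_wrap + 1u; /* wraps to 0 */

    assert(after_wrap == 0);
    assert(!gen_before_plain(before_wrap, after_wrap));
    assert(gen_before_wrapsafe(before_wrap, after_wrap));
}
```

With uint64_t fields the plain comparison is enough in practice: at a
hypothetical one million increments per second, a 32-bit counter wraps
in roughly 72 minutes, while a 64-bit one would take several hundred
thousand years.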

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

