From: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
To: John Snow <jsnow@redhat.com>, "Denis V. Lunev" <den@openvz.org>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <famz@redhat.com>,
	Max Reitz <mreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v5 4/4] block: ignore flush requests when storage is clean
Date: Mon, 11 Jul 2016 13:12:47 +0300
Message-ID: <5783711F.8000309@virtuozzo.com>
In-Reply-To: <234286ed-aa95-1c0d-4207-a1e1706231d1@redhat.com>



On 08.07.2016 21:44, John Snow wrote:
>
> On 07/04/2016 10:38 AM, Denis V. Lunev wrote:
>> From: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
>>
>> Some guests (win2008 server for example) do a lot of unnecessary
>> flushing when underlying media has not changed. This adds additional
>> overhead on host when calling fsync/fdatasync.
>>
>> This change introduces a write generation scheme in BlockDriverState.
>> Current write generation is checked against last flushed generation to
>> avoid unnecessary flushes.
>>
>> The problem with excessive flushing was found by a performance test
>> which does parallel directory tree creation (from 2 processes).
>> Results improved from 0.424 loops/sec to 0.432 loops/sec.
>> Each loop creates 10^3 directories with 10 files in each.
>>
>> Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Max Reitz <mreitz@redhat.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Fam Zheng <famz@redhat.com>
>> CC: John Snow <jsnow@redhat.com>
>> ---
>>   block.c                   |  3 +++
>>   block/io.c                | 18 ++++++++++++++++++
>>   include/block/block_int.h |  5 +++++
>>   3 files changed, 26 insertions(+)
>>
>> diff --git a/block.c b/block.c
>> index f4648e9..366fad6 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -234,6 +234,8 @@ BlockDriverState *bdrv_new(void)
>>       bs->refcnt = 1;
>>       bs->aio_context = qemu_get_aio_context();
>>   
>> +    qemu_co_queue_init(&bs->flush_queue);
>> +
>>       QTAILQ_INSERT_TAIL(&all_bdrv_states, bs, bs_list);
>>   
>>       return bs;
>> @@ -2582,6 +2584,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
>>           ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
>>           bdrv_dirty_bitmap_truncate(bs);
>>           bdrv_parent_cb_resize(bs);
>> +        ++bs->write_gen;
>>       }
>>       return ret;
>>   }
>> diff --git a/block/io.c b/block/io.c
>> index 7cf3645..a5451b6 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1294,6 +1294,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
>>       }
>>       bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
>>   
>> +    ++bs->write_gen;
>>       bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
>>   
>>       if (bs->wr_highest_offset < offset + bytes) {
>> @@ -2211,6 +2212,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>   {
>>       int ret;
>>       BdrvTrackedRequest req;
>> +    int current_gen = bs->write_gen;
>>   
>>       if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
>>           bdrv_is_sg(bs)) {
>> @@ -2219,6 +2221,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>   
>>       tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH);
>>   
>> +    /* Wait until any previous flushes are completed */
>> +    while (bs->flush_started_gen != bs->flushed_gen) {
>> +        qemu_co_queue_wait(&bs->flush_queue);
>> +    }
>> +    bs->flush_started_gen = current_gen;
>> +
>>       /* Write back all layers by calling one driver function */
>>       if (bs->drv->bdrv_co_flush) {
>>           ret = bs->drv->bdrv_co_flush(bs);
>> @@ -2239,6 +2247,11 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>           goto flush_parent;
>>       }
>>   
>> +    /* Check if we really need to flush anything */
>> +    if (bs->flushed_gen == current_gen) {
>> +        goto flush_parent;
>> +    }
>> +
>>       BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
>>       if (bs->drv->bdrv_co_flush_to_disk) {
>>           ret = bs->drv->bdrv_co_flush_to_disk(bs);
>> @@ -2279,6 +2292,10 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>   flush_parent:
>>       ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
>>   out:
>> +    /* Notify any pending flushes that we have completed */
>> +    bs->flushed_gen = current_gen;
>> +    qemu_co_queue_restart_all(&bs->flush_queue);
>> +
>>       tracked_request_end(&req);
>>       return ret;
>>   }
>> @@ -2402,6 +2419,7 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
>>       }
>>       ret = 0;
>>   out:
>> +    ++bs->write_gen;
>>       bdrv_set_dirty(bs, req.offset >> BDRV_SECTOR_BITS,
>>                      req.bytes >> BDRV_SECTOR_BITS);
>>       tracked_request_end(&req);
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 2057156..8543daf 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -420,6 +420,11 @@ struct BlockDriverState {
>>                            note this is a reference count */
>>       bool probed;
>>   
>> +    CoQueue flush_queue;            /* Serializing flush queue */
>> +    unsigned int write_gen;         /* Current data generation */
>> +    unsigned int flush_started_gen; /* Generation for which flush has started */
>> +    unsigned int flushed_gen;       /* Flushed write generation */
>> +
>>       BlockDriver *drv; /* NULL means no media */
>>       void *opaque;
>>   
>>
> Breaks qcow2 iotests 026 089 141 144

Sorry, I didn't know those tests existed; I had only run make check before.
Looking at 026, it appears to be the same problem as in the IDE and AHCI
tests. The test case injects blkdebug write errors that should be triggered
by flushes and expects to see them in the output. However, those flushes are
now skipped, so no events are generated. Otherwise the resulting image looks
consistent and all data was flushed. I expect the other tests to fail for
the same reason, but maybe the test case itself is incorrect now?
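
To make the skipped-flush behaviour concrete, here is a minimal standalone
sketch of the generation check. It is not QEMU code: ToyDev, toy_write,
toy_flush and toy_demo are hypothetical names used only for illustration,
and the coroutine queue that serializes flushes in the patch is left out.
It only shows that a flush on a device with no new writes never reaches
the point where an injected flush error could fire.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the generation fields added by the patch. */
typedef struct {
    unsigned int write_gen;    /* bumped on every write */
    unsigned int flushed_gen;  /* generation covered by the last flush */
    unsigned int disk_flushes; /* times the "disk" flush path was reached */
} ToyDev;

/* Any modification of the device moves it to a new write generation. */
static void toy_write(ToyDev *d)
{
    ++d->write_gen;
}

/* Returns true if the flush reached the disk, false if it was skipped. */
static bool toy_flush(ToyDev *d)
{
    unsigned int current_gen = d->write_gen;

    if (d->flushed_gen == current_gen) {
        return false; /* nothing written since the last flush: skip */
    }
    ++d->disk_flushes; /* an injected flush error would only fire here */
    d->flushed_gen = current_gen;
    return true;
}

/* Flush a clean device, write once, then flush twice. */
unsigned int toy_demo(void)
{
    ToyDev d = {0, 0, 0};

    assert(!toy_flush(&d)); /* clean device: flush is skipped */
    toy_write(&d);
    assert(toy_flush(&d));  /* dirty device: flush reaches the disk */
    assert(!toy_flush(&d)); /* no new writes: skipped again */
    return d.disk_flushes;
}
```

In this sketch only one of the three flush requests reaches the disk path,
which mirrors why a test that flushes without writing first no longer sees
the events it expects.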

>
> --js
