* [Qemu-devel] [PATCH 1/1] qcow2: avoid extra flushes in qcow2
@ 2016-06-01  9:12 Denis V. Lunev
  2016-06-01 10:07 ` Kevin Wolf
From: Denis V. Lunev @ 2016-06-01  9:12 UTC
  To: qemu-block, qemu-devel; +Cc: den, Pavel Borzenkov, Kevin Wolf, Max Reitz

qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
out the entries of a particular cache. This can lead to as many as
two additional fdatasync calls inside bdrv_flush.

We can simply skip these fdatasync calls inside qcow2_co_flush_to_os,
as the enclosing bdrv_flush will do the job anyway. The extra flushes
seriously hurt the performance of database operations inside the guest.
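
For a single guest flush, the call chain looks roughly like this
(a simplified sketch, not the exact code):

  bdrv_co_flush(bs)
      qcow2_co_flush_to_os(bs)
          qcow2_cache_flush(bs, s->l2_table_cache)
              bdrv_flush(bs->file->bs)     <- fdatasync 1
          qcow2_cache_flush(bs, s->refcount_block_cache)
              bdrv_flush(bs->file->bs)     <- fdatasync 2
      bdrv_co_flush(bs->file->bs)          <- fdatasync 3, the only one
                                              actually needed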

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cache.c | 11 +++++++++--
 block/qcow2.c       |  4 ++--
 block/qcow2.h       |  1 +
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 0fe8eda..6079c4a 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -226,7 +226,7 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
     return 0;
 }
 
-int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
+int qcow2_cache_flush_nosync(BlockDriverState *bs, Qcow2Cache *c)
 {
     BDRVQcow2State *s = bs->opaque;
     int result = 0;
@@ -242,8 +242,15 @@ int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
         }
     }
 
+    return result;
+}
+
+int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
+{
+    int result = qcow2_cache_flush_nosync(bs, c);
+
     if (result == 0) {
-        ret = bdrv_flush(bs->file->bs);
+        int ret = bdrv_flush(bs->file->bs);
         if (ret < 0) {
             result = ret;
         }
diff --git a/block/qcow2.c b/block/qcow2.c
index 38caa66..bb6b788 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2837,14 +2837,14 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
     int ret;
 
     qemu_co_mutex_lock(&s->lock);
-    ret = qcow2_cache_flush(bs, s->l2_table_cache);
+    ret = qcow2_cache_flush_nosync(bs, s->l2_table_cache);
     if (ret < 0) {
         qemu_co_mutex_unlock(&s->lock);
         return ret;
     }
 
     if (qcow2_need_accurate_refcounts(s)) {
-        ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+        ret = qcow2_cache_flush_nosync(bs, s->refcount_block_cache);
         if (ret < 0) {
             qemu_co_mutex_unlock(&s->lock);
             return ret;
diff --git a/block/qcow2.h b/block/qcow2.h
index a063a3c..0751225 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -583,6 +583,7 @@ int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
 void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, Qcow2Cache *c,
      void *table);
 int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
+int qcow2_cache_flush_nosync(BlockDriverState *bs, Qcow2Cache *c);
 int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
     Qcow2Cache *dependency);
 void qcow2_cache_depends_on_flush(Qcow2Cache *c);
-- 
2.1.4


* Re: [Qemu-devel] [PATCH 1/1] qcow2: avoid extra flushes in qcow2
  2016-06-01  9:12 Denis V. Lunev
@ 2016-06-01 10:07 ` Kevin Wolf
  2016-06-01 11:35   ` Pavel Borzenkov
  2016-06-02 13:38   ` Pavel Borzenkov
From: Kevin Wolf @ 2016-06-01 10:07 UTC
  To: Denis V. Lunev; +Cc: qemu-block, qemu-devel, Pavel Borzenkov, Max Reitz, jsnow

On 01.06.2016 at 11:12, Denis V. Lunev wrote:
> qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
> out the entries of a particular cache. This can lead to as many as
> two additional fdatasync calls inside bdrv_flush.
> 
> We can simply skip these fdatasync calls inside qcow2_co_flush_to_os,
> as the enclosing bdrv_flush will do the job anyway.

This looked wrong at first because flushes are needed to keep the right
order of writes to the different caches. However, I see that you keep
the flush in qcow2_cache_flush_dependency(), so in the code this is
actually fine.
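
For the record, the dependency path still syncs; roughly (lightly
simplified from block/qcow2-cache.c):

    int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c)
    {
        int ret;

        /* The dependency goes through the full qcow2_cache_flush(),
         * including its bdrv_flush() of bs->file, which is what keeps
         * writes to the two caches ordered on disk. */
        ret = qcow2_cache_flush(bs, c->depends);
        if (ret < 0) {
            return ret;
        }

        c->depends = NULL;
        c->depends_on_flush = false;

        return 0;
    }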

Can you make that more explicit in the commit message?

> The extra flushes seriously hurt the performance of database
> operations inside the guest.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>

Do you have performance numbers for master and with your patch applied?
(No performance-related patch should come without numbers in its commit
message!)

What I find interesting is that this seems to help even though
duplicated flushes should actually be really cheap because there is no
new data that could be flushed in the second request. Makes me wonder if
guests send duplicated flushes, too, and whether we should optimise
that.

Maybe it would also be interesting to measure how things perform if we
removed the flush from qcow2_cache_flush_dependency(). This would be
incorrect code (corruption possible after host crash), but I'd like to
know how much performance we actually lose here. This is performance
that could potentially be gained by using a journal.

Kevin


* Re: [Qemu-devel] [PATCH 1/1] qcow2: avoid extra flushes in qcow2
  2016-06-01 10:07 ` Kevin Wolf
@ 2016-06-01 11:35   ` Pavel Borzenkov
  2016-06-02 13:38   ` Pavel Borzenkov
From: Pavel Borzenkov @ 2016-06-01 11:35 UTC
  To: Kevin Wolf; +Cc: Denis V. Lunev, qemu-block, qemu-devel, Max Reitz, jsnow

On Wed, Jun 01, 2016 at 12:07:01PM +0200, Kevin Wolf wrote:
> On 01.06.2016 at 11:12, Denis V. Lunev wrote:
> > qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
> > out the entries of a particular cache. This can lead to as many as
> > two additional fdatasync calls inside bdrv_flush.
> > 
> > We can simply skip these fdatasync calls inside qcow2_co_flush_to_os,
> > as the enclosing bdrv_flush will do the job anyway.
> 
> This looked wrong at first because flushes are needed to keep the right
> order of writes to the different caches. However, I see that you keep
> the flush in qcow2_cache_flush_dependency(), so in the code this is
> actually fine.
> 
> Can you make that more explicit in the commit message?
> 
> > The extra flushes seriously hurt the performance of database
> > operations inside the guest.
> > 
> > Signed-off-by: Denis V. Lunev <den@openvz.org>
> > CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
> > CC: Kevin Wolf <kwolf@redhat.com>
> > CC: Max Reitz <mreitz@redhat.com>
> 
> Do you have performance numbers for master and with your patch applied?
> (No performance-related patch should come without numbers in its commit
> message!)

The problem with excessive flushing was found by a couple of performance tests:
  - parallel directory tree creation (from 2 processes)
  - 32 cached writes + fsync at the end in a loop

For the first one, results improved from 2.6 loops/sec to 3.5 loops/sec.
Each loop creates 10^3 directories with 10 files in each.

For the second one, results improved from ~600 fsync/sec to ~1100
fsync/sec. It was run on an SSD, though, so it probably won't show such
a performance gain on rotational media.
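
The second test is essentially the following loop (a sketch; the
harness, block size and names here are ours, not QEMU code):

    #include <err.h>
    #include <unistd.h>

    /* 32 buffered writes, then one fsync(); the reported figure is
     * completed fsync() calls per second. */
    static void bench(int fd, const char *buf, size_t len)
    {
        for (;;) {
            for (int i = 0; i < 32; i++) {
                if (write(fd, buf, len) != (ssize_t)len) {
                    err(1, "write");
                }
            }
            if (fsync(fd) < 0) {
                err(1, "fsync");
            }
        }
    }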

> 
> What I find interesting is that this seems to help even though
> duplicated flushes should actually be really cheap because there is no
> new data that could be flushed in the second request. Makes me wonder if
> guests send duplicated flushes, too, and whether we should optimise
> that.

SSDs are affected by flushes a lot. It looks like flushes mess with
their allocation algorithms.

Also, we are not alone on the machine. Other processes might have
written some data after the first flush already, so the second one might
not be that cheap after all (the disk is going to wait for that data to
reach persistent media).

Pavel

> 
> Maybe it would also be interesting to measure how things perform if we
> removed the flush from qcow2_cache_flush_dependency(). This would be
> incorrect code (corruption possible after host crash), but I'd like to
> know how much performance we actually lose here. This is performance
> that could potentially be gained by using a journal.
> 
> Kevin


* Re: [Qemu-devel] [PATCH 1/1] qcow2: avoid extra flushes in qcow2
  2016-06-01 10:07 ` Kevin Wolf
  2016-06-01 11:35   ` Pavel Borzenkov
@ 2016-06-02 13:38   ` Pavel Borzenkov
From: Pavel Borzenkov @ 2016-06-02 13:38 UTC
  To: Kevin Wolf; +Cc: Denis V. Lunev, qemu-block, qemu-devel, Max Reitz, jsnow

On Wed, Jun 01, 2016 at 12:07:01PM +0200, Kevin Wolf wrote:
> On 01.06.2016 at 11:12, Denis V. Lunev wrote:
> > qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
> > out the entries of a particular cache. This can lead to as many as
> > two additional fdatasync calls inside bdrv_flush.
> > 
> > We can simply skip these fdatasync calls inside qcow2_co_flush_to_os,
> > as the enclosing bdrv_flush will do the job anyway.
> 
> This looked wrong at first because flushes are needed to keep the right
> order of writes to the different caches. However, I see that you keep
> the flush in qcow2_cache_flush_dependency(), so in the code this is
> actually fine.
> 
> Can you make that more explicit in the commit message?
> 
> > The extra flushes seriously hurt the performance of database
> > operations inside the guest.
> > 
> > Signed-off-by: Denis V. Lunev <den@openvz.org>
> > CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
> > CC: Kevin Wolf <kwolf@redhat.com>
> > CC: Max Reitz <mreitz@redhat.com>
> 
> Do you have performance numbers for master and with your patch applied?
> (No performance-related patch should come without numbers in its commit
> message!)
> 
> What I find interesting is that this seems to help even though
> duplicated flushes should actually be really cheap because there is no
> new data that could be flushed in the second request. Makes me wonder if
> guests send duplicated flushes, too, and whether we should optimise
> that.
> 
> Maybe it would also be interesting to measure how things perform if we
> removed the flush from qcow2_cache_flush_dependency(). This would be
> incorrect code (corruption possible after host crash), but I'd like to
> know how much performance we actually lose here. This is performance
> that could potentially be gained by using a journal.

Here are the results of the following test case: sequential write of an
8 GB file in 64 KB blocks, on an unallocated qcow2 image, with fsync()
after every 64 blocks. Lazy refcounts are disabled, so we have a
dependent cache here. Results from an SSD machine are as follows (every
result is an average over 10 iterations):

w/o patches: ~420 blocks/sec
with Den's patch: ~650 blocks/sec
with Den's patch + qcow2_cache_flush_dependency() switched to
qcow2_cache_flush_nosync(): ~720 blocks/sec
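
In code, the test case amounts to roughly the following (a sketch; the
file name, sizes and error handling are ours):

    #include <err.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Sequentially write 8 GB in 64 KB blocks to a file on the qcow2
     * image, with an fsync() after every 64 blocks. */
    int main(void)
    {
        static char buf[64 * 1024];
        long long total = 8LL << 30;
        int fd = open("/mnt/test/data", O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (fd < 0) {
            err(1, "open");
        }
        for (long long n = 0; n * (long long)sizeof(buf) < total; n++) {
            if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
                err(1, "write");
            }
            if (n % 64 == 63 && fsync(fd) < 0) {
                err(1, "fsync");
            }
        }
        return 0;
    }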

> 
> Kevin


* [Qemu-devel] [PATCH 1/1] qcow2: avoid extra flushes in qcow2
@ 2016-06-02 15:58 Denis V. Lunev
  2016-06-02 18:34 ` [Qemu-devel] [PATCH v2 " Denis V. Lunev
  2016-06-03  8:38 ` [Qemu-devel] [PATCH " Kevin Wolf
From: Denis V. Lunev @ 2016-06-02 15:58 UTC
  To: qemu-block, qemu-devel; +Cc: den, Pavel Borzenkov, Kevin Wolf, Max Reitz

The problem with excessive flushing was found by a couple of performance
tests:
  - parallel directory tree creation (from 2 processes)
  - 32 cached writes + fsync at the end in a loop

For the first one, results improved from 2.6 loops/sec to 3.5 loops/sec.
Each loop creates 10^3 directories with 10 files in each.

For the second one, results improved from ~600 fsync/sec to ~1100
fsync/sec. It was run on an SSD, though, so it probably won't show such
a performance gain on rotational media.

qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
out the entries of a particular cache. This can lead to as many as
two additional fdatasync calls inside bdrv_flush.

We can simply skip these fdatasync calls inside qcow2_co_flush_to_os,
as the enclosing bdrv_flush will do the job anyway. Such flushes would
normally be needed to keep the right order of writes to the different
caches, but in the current code base this ordering is already ensured
by the flush in qcow2_cache_flush_dependency().

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
---
Changes from v1:
- renamed qcow2_cache_flush_nosync to qcow2_cache_write (looks better to me)
- rewritten commit message entirely

 block/qcow2-cache.c | 11 +++++++++--
 block/qcow2.c       |  4 ++--
 block/qcow2.h       |  1 +
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 0fe8eda..208a060 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -226,7 +226,7 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
     return 0;
 }
 
-int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
+int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c)
 {
     BDRVQcow2State *s = bs->opaque;
     int result = 0;
@@ -242,8 +242,15 @@ int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
         }
     }
 
+    return result;
+}
+
+int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
+{
+    int result = qcow2_cache_write(bs, c);
+
     if (result == 0) {
-        ret = bdrv_flush(bs->file->bs);
+        int ret = bdrv_flush(bs->file->bs);
         if (ret < 0) {
             result = ret;
         }
diff --git a/block/qcow2.c b/block/qcow2.c
index 38caa66..a194a8a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2837,14 +2837,14 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
     int ret;
 
     qemu_co_mutex_lock(&s->lock);
-    ret = qcow2_cache_flush(bs, s->l2_table_cache);
+    ret = qcow2_cache_write(bs, s->l2_table_cache);
     if (ret < 0) {
         qemu_co_mutex_unlock(&s->lock);
         return ret;
     }
 
     if (qcow2_need_accurate_refcounts(s)) {
-        ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+        ret = qcow2_cache_write(bs, s->refcount_block_cache);
         if (ret < 0) {
             qemu_co_mutex_unlock(&s->lock);
             return ret;
diff --git a/block/qcow2.h b/block/qcow2.h
index a063a3c..7db9795 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -583,6 +583,7 @@ int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
 void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, Qcow2Cache *c,
      void *table);
 int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
+int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c);
 int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
     Qcow2Cache *dependency);
 void qcow2_cache_depends_on_flush(Qcow2Cache *c);
-- 
2.1.4


* Re: [Qemu-devel] [PATCH v2 1/1] qcow2: avoid extra flushes in qcow2
  2016-06-02 15:58 [Qemu-devel] [PATCH 1/1] qcow2: avoid extra flushes in qcow2 Denis V. Lunev
@ 2016-06-02 18:34 ` Denis V. Lunev
  2016-06-03  8:38 ` [Qemu-devel] [PATCH " Kevin Wolf
From: Denis V. Lunev @ 2016-06-02 18:34 UTC
  To: qemu-block, qemu-devel; +Cc: Pavel Borzenkov, Kevin Wolf, Max Reitz

On 06/02/2016 06:58 PM, Denis V. Lunev wrote:
> The problem with excessive flushing was found by a couple of performance
> tests:
>    - parallel directory tree creation (from 2 processes)
>    - 32 cached writes + fsync at the end in a loop
>
> For the first one, results improved from 2.6 loops/sec to 3.5 loops/sec.
> Each loop creates 10^3 directories with 10 files in each.
>
> For the second one, results improved from ~600 fsync/sec to ~1100
> fsync/sec. It was run on an SSD, though, so it probably won't show such
> a performance gain on rotational media.
>
> qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
> out the entries of a particular cache. This can lead to as many as
> two additional fdatasync calls inside bdrv_flush.
>
> We can simply skip these fdatasync calls inside qcow2_co_flush_to_os,
> as the enclosing bdrv_flush will do the job anyway. Such flushes would
> normally be needed to keep the right order of writes to the different
> caches, but in the current code base this ordering is already ensured
> by the flush in qcow2_cache_flush_dependency().
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>
Actually, this is the v2 version of the patch; I missed the version
number in the subject.


* Re: [Qemu-devel] [PATCH 1/1] qcow2: avoid extra flushes in qcow2
  2016-06-02 15:58 [Qemu-devel] [PATCH 1/1] qcow2: avoid extra flushes in qcow2 Denis V. Lunev
  2016-06-02 18:34 ` [Qemu-devel] [PATCH v2 " Denis V. Lunev
@ 2016-06-03  8:38 ` Kevin Wolf
From: Kevin Wolf @ 2016-06-03  8:38 UTC
  To: Denis V. Lunev; +Cc: qemu-block, qemu-devel, Pavel Borzenkov, Max Reitz

On 02.06.2016 at 17:58, Denis V. Lunev wrote:
> The problem with excessive flushing was found by a couple of performance
> tests:
>   - parallel directory tree creation (from 2 processes)
>   - 32 cached writes + fsync at the end in a loop
> 
> For the first one, results improved from 2.6 loops/sec to 3.5 loops/sec.
> Each loop creates 10^3 directories with 10 files in each.
> 
> For the second one, results improved from ~600 fsync/sec to ~1100
> fsync/sec. It was run on an SSD, though, so it probably won't show such
> a performance gain on rotational media.
> 
> qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
> out the entries of a particular cache. This can lead to as many as
> two additional fdatasync calls inside bdrv_flush.
> 
> We can simply skip these fdatasync calls inside qcow2_co_flush_to_os,
> as the enclosing bdrv_flush will do the job anyway. Such flushes would
> normally be needed to keep the right order of writes to the different
> caches, but in the current code base this ordering is already ensured
> by the flush in qcow2_cache_flush_dependency().
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>

Thanks, applied to the block branch.

Kevin

