qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/2] block/io: Delay decrementing the quiesce_counter
@ 2019-05-22 14:28 Max Reitz
  2019-05-22 14:28 ` [Qemu-devel] [PATCH 1/2] " Max Reitz
  2019-05-22 14:28 ` [Qemu-devel] [PATCH 2/2] iotests: Test cancelling a job and closing the VM Max Reitz
  0 siblings, 2 replies; 4+ messages in thread
From: Max Reitz @ 2019-05-22 14:28 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi, Max Reitz

See patch 1 for the code-based explanation, and patch 2 for a case where
this bites in practice.

Max Reitz (2):
  block/io: Delay decrementing the quiesce_counter
  iotests: Test cancelling a job and closing the VM

 block/io.c                 |  3 ++-
 tests/qemu-iotests/255     | 54 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/255.out | 24 +++++++++++++++++
 3 files changed, 80 insertions(+), 1 deletion(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Qemu-devel] [PATCH 1/2] block/io: Delay decrementing the quiesce_counter
  2019-05-22 14:28 [Qemu-devel] [PATCH 0/2] block/io: Delay decrementing the quiesce_counter Max Reitz
@ 2019-05-22 14:28 ` Max Reitz
  2019-05-22 14:28 ` [Qemu-devel] [PATCH 2/2] iotests: Test cancelling a job and closing the VM Max Reitz
  1 sibling, 0 replies; 4+ messages in thread
From: Max Reitz @ 2019-05-22 14:28 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi, Max Reitz

When ending a drained section, bdrv_do_drained_end() currently first
decrements the quiesce_counter, and only then actually ends the drain.

The bdrv_drain_invoke(bs, false) call may cause graph changes.  Say the
graph change involves replacing an existing BB's ("blk") BDS
(blk_bs(blk)) by @bs.  Let us introducing the following values:
- bs_oqc = old_quiesce_counter
  (so bs->quiesce_counter == bs_oqc - 1)
- obs_qc = blk_bs(blk)->quiesce_counter (before bdrv_drain_invoke())

Let us assume there is no blk_pthread_unthrottled() involved, so
blk->quiesce_counter == obs_qc (before bdrv_drain_invoke()).

Now replacing blk_bs(blk) by @bs will reduce blk->quiesce_counter by
obs_qc (making it 0) and increase it by bs_oqc-1 (making it bs_oqc-1).

bdrv_drain_invoke() returns and we invoke bdrv_parent_drained_end().
This will decrement blk->quiesce_counter by one, so it would be -1 --
were there not an assertion against that in blk_root_drained_end().

We therefore have to keep the quiesce_counter up at least until
bdrv_drain_invoke() returns, so that bdrv_parent_drained_end() does the
right thing for the parents @bs got during bdrv_drain_invoke().

But let us delay it even further, namely until bdrv_parent_drained_end()
returns, because then it mirrors bdrv_do_drained_begin(): There, we
first increment the quiesce_counter, then begin draining the parents,
and then call bdrv_drain_invoke().  It makes sense to let
bdrv_do_drained_end() unravel this exactly in reverse.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 150358c3b1..0f6ebd001c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -422,11 +422,12 @@ static void bdrv_do_drained_end(BlockDriverState *bs, bool recursive,
         return;
     }
     assert(bs->quiesce_counter > 0);
-    old_quiesce_counter = atomic_fetch_dec(&bs->quiesce_counter);
 
     /* Re-enable things in child-to-parent order */
     bdrv_drain_invoke(bs, false);
     bdrv_parent_drained_end(bs, parent, ignore_bds_parents);
+
+    old_quiesce_counter = atomic_fetch_dec(&bs->quiesce_counter);
     if (old_quiesce_counter == 1) {
         aio_enable_external(bdrv_get_aio_context(bs));
     }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [Qemu-devel] [PATCH 2/2] iotests: Test cancelling a job and closing the VM
  2019-05-22 14:28 [Qemu-devel] [PATCH 0/2] block/io: Delay decrementing the quiesce_counter Max Reitz
  2019-05-22 14:28 ` [Qemu-devel] [PATCH 1/2] " Max Reitz
@ 2019-05-22 14:28 ` Max Reitz
  2019-05-22 14:31   ` Max Reitz
  1 sibling, 1 reply; 4+ messages in thread
From: Max Reitz @ 2019-05-22 14:28 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi, Max Reitz

This patch adds a test where we cancel a throttled mirror job and
immediately close the VM before it can be cancelled.  Doing so will
invoke bdrv_drain_all() while the mirror job tries to drain the
throttled node.  When bdrv_drain_all_end() tries to lift its drain on
the throttle node, the job will exit and replace the current root node
of the BB drive0 (which is the job's filter node) by the throttle node.
Before the previous patch, this replacement did not increase drive0's
quiesce_counter by a sufficient amount, so when
bdrv_parent_drained_end() (invoked by bdrv_do_drained_end(), invoked by
bdrv_drain_all_end()) tried to end the drain on all of the throttle
node's parents, it decreased drive0's quiesce_counter below 0 -- which
fails an assertion.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/255     | 54 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/255.out | 24 +++++++++++++++++
 2 files changed, 78 insertions(+)

diff --git a/tests/qemu-iotests/255 b/tests/qemu-iotests/255
index c0bb37a9b0..b1bbe3ff0a 100755
--- a/tests/qemu-iotests/255
+++ b/tests/qemu-iotests/255
@@ -35,6 +35,10 @@ def blockdev_create(vm, options):
         vm.run_job('job0')
     iotests.log("")
 
+iotests.log('Finishing a commit job with background reads')
+iotests.log('============================================')
+iotests.log('')
+
 with iotests.FilePath('t.qcow2') as disk_path, \
      iotests.FilePath('t.qcow2.mid') as mid_path, \
      iotests.FilePath('t.qcow2.base') as base_path, \
@@ -80,4 +84,54 @@ with iotests.FilePath('t.qcow2') as disk_path, \
     vm.run_job('job0', auto_finalize=False, pre_finalize=start_requests,
                 auto_dismiss=True)
 
+    #vm.qmp_log('block-job-cancel', device='job0')
+
+    vm.shutdown()
+
+iotests.log('')
+iotests.log('Closing the VM while a job is being cancelled')
+iotests.log('=============================================')
+iotests.log('')
+
+with iotests.FilePath('src.qcow2') as src_path, \
+     iotests.FilePath('dst.qcow2') as dst_path, \
+     iotests.VM() as vm:
+
+    iotests.log('=== Create images and start VM ===')
+    iotests.log('')
+
+    size = 128 * 1024 * 1024
+    size_str = str(size)
+
+    iotests.qemu_img_log('create', '-f', iotests.imgfmt, src_path, size_str)
+    iotests.qemu_img_log('create', '-f', iotests.imgfmt, dst_path, size_str)
+
+    iotests.log(iotests.qemu_io('-f', iotests.imgfmt, '-c', 'write 0 1M',
+                                src_path),
+                filters=[iotests.filter_test_dir, iotests.filter_qemu_io])
+
+    vm.add_object('throttle-group,x-bps-read=4096,id=throttle0')
+
+    vm.add_blockdev('file,node-name=src-file,filename=%s' % (src_path))
+    vm.add_blockdev('%s,node-name=src,file=src-file' % (iotests.imgfmt))
+
+    vm.add_blockdev('file,node-name=dst-file,filename=%s' % (dst_path))
+    vm.add_blockdev('%s,node-name=dst,file=dst-file' % (iotests.imgfmt))
+
+    vm.add_blockdev('throttle,node-name=src-throttled,' +
+                    'throttle-group=throttle0,file=src')
+
+    vm.add_device('virtio-blk,drive=src-throttled')
+
+    vm.launch()
+
+    iotests.log('=== Start a mirror job ===')
+    iotests.log('')
+
+    vm.qmp_log('blockdev-mirror', job_id='job0', device='src-throttled',
+                                  target='dst', sync='full')
+
+    vm.qmp_log('block-job-cancel', device='job0')
+    vm.qmp_log('quit')
+
     vm.shutdown()
diff --git a/tests/qemu-iotests/255.out b/tests/qemu-iotests/255.out
index 9a2d7cbb77..348909fdef 100644
--- a/tests/qemu-iotests/255.out
+++ b/tests/qemu-iotests/255.out
@@ -1,3 +1,6 @@
+Finishing a commit job with background reads
+============================================
+
 === Create backing chain and start VM ===
 
 Formatting 'TEST_DIR/PID-t.qcow2.mid', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16
@@ -14,3 +17,24 @@ Formatting 'TEST_DIR/PID-t.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 l
 {"return": {}}
 {"data": {"id": "job0", "type": "commit"}, "event": "BLOCK_JOB_PENDING", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"device": "job0", "len": 134217728, "offset": 134217728, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+
+Closing the VM while a job is being cancelled
+=============================================
+
+=== Create images and start VM ===
+
+Formatting 'TEST_DIR/PID-src.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-dst.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16
+
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Start a mirror job ===
+
+{"execute": "blockdev-mirror", "arguments": {"device": "src-throttled", "job-id": "job0", "sync": "full", "target": "dst"}}
+{"return": {}}
+{"execute": "block-job-cancel", "arguments": {"device": "job0"}}
+{"return": {}}
+{"execute": "quit", "arguments": {}}
+{"return": {}}
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] iotests: Test cancelling a job and closing the VM
  2019-05-22 14:28 ` [Qemu-devel] [PATCH 2/2] iotests: Test cancelling a job and closing the VM Max Reitz
@ 2019-05-22 14:31   ` Max Reitz
  0 siblings, 0 replies; 4+ messages in thread
From: Max Reitz @ 2019-05-22 14:31 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 2064 bytes --]

On 22.05.19 16:28, Max Reitz wrote:
> This patch adds a test where we cancel a throttled mirror job and
> immediately close the VM before it can be cancelled.  Doing so will
> invoke bdrv_drain_all() while the mirror job tries to drain the
> throttled node.  When bdrv_drain_all_end() tries to lift its drain on
> the throttle node, the job will exit and replace the current root node
> of the BB drive0 (which is the job's filter node) by the throttle node.
> Before the previous patch, this replacement did not increase drive0's
> quiesce_counter by a sufficient amount, so when
> bdrv_parent_drained_end() (invoked by bdrv_do_drained_end(), invoked by
> bdrv_drain_all_end()) tried to end the drain on all of the throttle
> node's parents, it decreased drive0's quiesce_counter below 0 -- which
> fails an assertion.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/255     | 54 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/255.out | 24 +++++++++++++++++
>  2 files changed, 78 insertions(+)
> 
> diff --git a/tests/qemu-iotests/255 b/tests/qemu-iotests/255
> index c0bb37a9b0..b1bbe3ff0a 100755
> --- a/tests/qemu-iotests/255
> +++ b/tests/qemu-iotests/255
> @@ -35,6 +35,10 @@ def blockdev_create(vm, options):
>          vm.run_job('job0')
>      iotests.log("")
>  
> +iotests.log('Finishing a commit job with background reads')
> +iotests.log('============================================')
> +iotests.log('')
> +
>  with iotests.FilePath('t.qcow2') as disk_path, \
>       iotests.FilePath('t.qcow2.mid') as mid_path, \
>       iotests.FilePath('t.qcow2.base') as base_path, \
> @@ -80,4 +84,54 @@ with iotests.FilePath('t.qcow2') as disk_path, \
>      vm.run_job('job0', auto_finalize=False, pre_finalize=start_requests,
>                  auto_dismiss=True)
>  
> +    #vm.qmp_log('block-job-cancel', device='job0')
> +
> +    vm.shutdown()

Er, oops, I took so much time with the commit messages that I forgot
this.  Let me try again...

Shamefully,

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-05-22 14:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-05-22 14:28 [Qemu-devel] [PATCH 0/2] block/io: Delay decrementing the quiesce_counter Max Reitz
2019-05-22 14:28 ` [Qemu-devel] [PATCH 1/2] " Max Reitz
2019-05-22 14:28 ` [Qemu-devel] [PATCH 2/2] iotests: Test cancelling a job and closing the VM Max Reitz
2019-05-22 14:31   ` Max Reitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).