qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again
@ 2015-12-01  9:36 Fam Zheng
  2015-12-01  9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Fam Zheng @ 2015-12-01  9:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Stefan Hajnoczi

This is basically a supplementary fix of 06c3916b.

On 512 disks, the crash only happens when copy-on-read is enabled, which is
covered by the previou fix.  But on 4k disks the write request that triggers
the notifier itself may be serialised, in which case the read req from
backup_do_cow will still be serialised in bdrv_aligned_preadv, resulting in the
same assertion failure.

Fam Zheng (3):
  block: Don't wait serialising for non-COR read requests
  iotests: Add "add_drive_raw" method
  iotests: Add regresion test case for write notifier assertion failure

 block/backup.c                |  2 +-
 block/io.c                    | 12 +++++++-----
 include/block/block.h         |  4 ++--
 tests/qemu-iotests/056        | 25 +++++++++++++++++++++++++
 tests/qemu-iotests/056.out    |  4 ++--
 tests/qemu-iotests/iotests.py |  5 +++++
 trace-events                  |  2 +-
 7 files changed, 43 insertions(+), 11 deletions(-)

-- 
2.4.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests
  2015-12-01  9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng
@ 2015-12-01  9:36 ` Fam Zheng
  2015-12-01  9:54   ` Kevin Wolf
  2015-12-01  9:36 ` [Qemu-devel] [PATCH 2/3] iotests: Add "add_drive_raw" method Fam Zheng
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2015-12-01  9:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Stefan Hajnoczi

The assertion problem was noticed in 06c3916b35a, but it wasn't
completely fixed, because even though the req is not marked as
serialising, it still gets serialised by wait_serialising_requests
against other serialising requests, which could lead to the same
assertion failure.

Fix it by even more explicitly skipping the serialising for this
specific case.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/backup.c        |  2 +-
 block/io.c            | 12 +++++++-----
 include/block/block.h |  4 ++--
 trace-events          |  2 +-
 4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 3b39119..705bb77 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -132,7 +132,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
         qemu_iovec_init_external(&bounce_qiov, &iov, 1);
 
         if (is_write_notifier) {
-            ret = bdrv_co_no_copy_on_readv(bs,
+            ret = bdrv_co_readv_no_serialising(bs,
                                            start * BACKUP_SECTORS_PER_CLUSTER,
                                            n, &bounce_qiov);
         } else {
diff --git a/block/io.c b/block/io.c
index adc1eab..e00fb5d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -863,7 +863,9 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
         mark_request_serialising(req, bdrv_get_cluster_size(bs));
     }
 
-    wait_serialising_requests(req);
+    if (!(flags & BDRV_REQ_NO_SERIALISING)) {
+        wait_serialising_requests(req);
+    }
 
     if (flags & BDRV_REQ_COPY_ON_READ) {
         int pnum;
@@ -952,7 +954,7 @@ static int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
     }
 
     /* Don't do copy-on-read if we read data before write operation */
-    if (bs->copy_on_read && !(flags & BDRV_REQ_NO_COPY_ON_READ)) {
+    if (bs->copy_on_read && !(flags & BDRV_REQ_NO_SERIALISING)) {
         flags |= BDRV_REQ_COPY_ON_READ;
     }
 
@@ -1021,13 +1023,13 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
     return bdrv_co_do_readv(bs, sector_num, nb_sectors, qiov, 0);
 }
 
-int coroutine_fn bdrv_co_no_copy_on_readv(BlockDriverState *bs,
+int coroutine_fn bdrv_co_readv_no_serialising(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
-    trace_bdrv_co_no_copy_on_readv(bs, sector_num, nb_sectors);
+    trace_bdrv_co_readv_no_serialising(bs, sector_num, nb_sectors);
 
     return bdrv_co_do_readv(bs, sector_num, nb_sectors, qiov,
-                            BDRV_REQ_NO_COPY_ON_READ);
+                            BDRV_REQ_NO_SERIALISING);
 }
 
 int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
diff --git a/include/block/block.h b/include/block/block.h
index 73edb1a..3477328 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -61,7 +61,7 @@ typedef enum {
      * opened with BDRV_O_UNMAP.
      */
     BDRV_REQ_MAY_UNMAP          = 0x4,
-    BDRV_REQ_NO_COPY_ON_READ    = 0x8,
+    BDRV_REQ_NO_SERIALISING     = 0x8,
 } BdrvRequestFlags;
 
 typedef struct BlockSizes {
@@ -248,7 +248,7 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
     int nb_sectors, QEMUIOVector *qiov);
 int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
-int coroutine_fn bdrv_co_no_copy_on_readv(BlockDriverState *bs,
+int coroutine_fn bdrv_co_readv_no_serialising(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
 int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
     int nb_sectors, QEMUIOVector *qiov);
diff --git a/trace-events b/trace-events
index 0b0ff02..2fce98e 100644
--- a/trace-events
+++ b/trace-events
@@ -69,7 +69,7 @@ bdrv_aio_write_zeroes(void *bs, int64_t sector_num, int nb_sectors, int flags, v
 bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
 bdrv_co_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
 bdrv_co_copy_on_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
-bdrv_co_no_copy_on_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
+bdrv_co_readv_no_serialising(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
 bdrv_co_writev(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
 bdrv_co_write_zeroes(void *bs, int64_t sector_num, int nb_sector, int flags) "bs %p sector_num %"PRId64" nb_sectors %d flags %#x"
 bdrv_co_io_em(void *bs, int64_t sector_num, int nb_sectors, int is_write, void *acb) "bs %p sector_num %"PRId64" nb_sectors %d is_write %d acb %p"
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 2/3] iotests: Add "add_drive_raw" method
  2015-12-01  9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng
  2015-12-01  9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng
@ 2015-12-01  9:36 ` Fam Zheng
  2015-12-01  9:36 ` [Qemu-devel] [PATCH 3/3] iotests: Add regresion test case for write notifier assertion failure Fam Zheng
  2015-12-03  6:50 ` [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Stefan Hajnoczi
  3 siblings, 0 replies; 8+ messages in thread
From: Fam Zheng @ 2015-12-01  9:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Stefan Hajnoczi

This offers full manual control over the "-drive" options.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/iotests.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ff5905f..e02245e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -140,6 +140,11 @@ class VM(object):
         self._args.append('-monitor')
         self._args.append(args)
 
+    def add_drive_raw(self, opts):
+        self._args.append('-drive')
+        self._args.append(opts)
+        return self
+
     def add_drive(self, path, opts='', interface='virtio'):
         '''Add a virtio-blk drive to the VM'''
         options = ['if=%s' % interface,
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 3/3] iotests: Add regresion test case for write notifier assertion failure
  2015-12-01  9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng
  2015-12-01  9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng
  2015-12-01  9:36 ` [Qemu-devel] [PATCH 2/3] iotests: Add "add_drive_raw" method Fam Zheng
@ 2015-12-01  9:36 ` Fam Zheng
  2015-12-03  6:50 ` [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Stefan Hajnoczi
  3 siblings, 0 replies; 8+ messages in thread
From: Fam Zheng @ 2015-12-01  9:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Stefan Hajnoczi

The idea is to let the top level bs have a big request alignment with
blkdebug, so that the aio_write request issued from monitor will be
serialised. This tests that QEMU doesn't crash upon the read request
from the backup job's write notifier, which is a very special case of
"reentrant" request.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/056     | 25 +++++++++++++++++++++++++
 tests/qemu-iotests/056.out |  4 ++--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
index 54e4bd0..04f2c3c 100755
--- a/tests/qemu-iotests/056
+++ b/tests/qemu-iotests/056
@@ -82,6 +82,31 @@ class TestSyncModesNoneAndTop(iotests.QMPTestCase):
         time.sleep(1)
         self.assertEqual(-1, qemu_io('-c', 'read -P0x41 0 512', target_img).find("verification failed"))
 
+class TestBeforeWriteNotifier(iotests.QMPTestCase):
+    def setUp(self):
+        self.vm = iotests.VM().add_drive_raw("file=blkdebug::null-co://,id=drive0,align=65536,driver=blkdebug")
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(target_img)
+
+    def test_before_write_notifier(self):
+        self.vm.pause_drive("drive0")
+        result = self.vm.qmp('drive-backup', device='drive0',
+                             sync='full', target=target_img,
+                             format="file", speed=1)
+        self.assert_qmp(result, 'return', {})
+        result = self.vm.qmp('block-job-pause', device="drive0")
+        self.assert_qmp(result, 'return', {})
+        # Speed is low enough that this must be an uncopied range, which will
+        # trigger the before write notifier
+        self.vm.hmp_qemu_io('drive0', 'aio_write -P 1 512512 512')
+        self.vm.resume_drive("drive0")
+        result = self.vm.qmp('block-job-resume', device="drive0")
+        self.assert_qmp(result, 'return', {})
+        event = self.cancel_and_wait()
+        self.assert_qmp(event, 'data/type', 'backup')
 
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2', 'qed'])
diff --git a/tests/qemu-iotests/056.out b/tests/qemu-iotests/056.out
index fbc63e6..8d7e996 100644
--- a/tests/qemu-iotests/056.out
+++ b/tests/qemu-iotests/056.out
@@ -1,5 +1,5 @@
-..
+...
 ----------------------------------------------------------------------
-Ran 2 tests
+Ran 3 tests
 
 OK
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests
  2015-12-01  9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng
@ 2015-12-01  9:54   ` Kevin Wolf
  2015-12-01 10:26     ` Fam Zheng
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2015-12-01  9:54 UTC (permalink / raw)
  To: Fam Zheng; +Cc: Jeff Cody, qemu-block, qemu-devel, Stefan Hajnoczi

Am 01.12.2015 um 10:36 hat Fam Zheng geschrieben:
> The assertion problem was noticed in 06c3916b35a, but it wasn't
> completely fixed, because even though the req is not marked as
> serialising, it still gets serialised by wait_serialising_requests
> against other serialising requests, which could lead to the same
> assertion failure.
> 
> Fix it by even more explicitly skipping the serialising for this
> specific case.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

And this, my friends, is another example why read/write notifiers are
wrong and should die sooner rather than later. </broken-record>

Kevin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests
  2015-12-01  9:54   ` Kevin Wolf
@ 2015-12-01 10:26     ` Fam Zheng
  2015-12-01 11:00       ` Kevin Wolf
  0 siblings, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2015-12-01 10:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Jeff Cody, qemu-block, qemu-devel, Stefan Hajnoczi

On Tue, 12/01 10:54, Kevin Wolf wrote:
> Am 01.12.2015 um 10:36 hat Fam Zheng geschrieben:
> > The assertion problem was noticed in 06c3916b35a, but it wasn't
> > completely fixed, because even though the req is not marked as
> > serialising, it still gets serialised by wait_serialising_requests
> > against other serialising requests, which could lead to the same
> > assertion failure.
> > 
> > Fix it by even more explicitly skipping the serialising for this
> > specific case.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> 
> And this, my friends, is another example why read/write notifiers are
> wrong and should die sooner rather than later. </broken-record>
> 

Yes, I agree, except it's not clear to me what a better alternative solution
should be. A immediate question is, with whatever approach we will have,
wouldn't we still need to do this sort of "reentrant" COW before the data is
overwritten?

Fam

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests
  2015-12-01 10:26     ` Fam Zheng
@ 2015-12-01 11:00       ` Kevin Wolf
  0 siblings, 0 replies; 8+ messages in thread
From: Kevin Wolf @ 2015-12-01 11:00 UTC (permalink / raw)
  To: Fam Zheng; +Cc: Jeff Cody, qemu-block, qemu-devel, Stefan Hajnoczi

Am 01.12.2015 um 11:26 hat Fam Zheng geschrieben:
> On Tue, 12/01 10:54, Kevin Wolf wrote:
> > Am 01.12.2015 um 10:36 hat Fam Zheng geschrieben:
> > > The assertion problem was noticed in 06c3916b35a, but it wasn't
> > > completely fixed, because even though the req is not marked as
> > > serialising, it still gets serialised by wait_serialising_requests
> > > against other serialising requests, which could lead to the same
> > > assertion failure.
> > > 
> > > Fix it by even more explicitly skipping the serialising for this
> > > specific case.
> > > 
> > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > 
> > And this, my friends, is another example why read/write notifiers are
> > wrong and should die sooner rather than later. </broken-record>
> > 
> 
> Yes, I agree, except it's not clear to me what a better alternative solution
> should be. A immediate question is, with whatever approach we will have,
> wouldn't we still need to do this sort of "reentrant" COW before the data is
> overwritten?

If the backup job could temporarily insert a filter driver, you wouldn't
get reentrance, but another BDS in the stack.

Now that much of the blockdev-add stuff seems to be falling into place
at last, dynamic reconfiguration and filters might be the next big nut
for us to crack.

Kevin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again
  2015-12-01  9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng
                   ` (2 preceding siblings ...)
  2015-12-01  9:36 ` [Qemu-devel] [PATCH 3/3] iotests: Add regresion test case for write notifier assertion failure Fam Zheng
@ 2015-12-03  6:50 ` Stefan Hajnoczi
  3 siblings, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2015-12-03  6:50 UTC (permalink / raw)
  To: Fam Zheng; +Cc: Kevin Wolf, Jeff Cody, qemu-devel, qemu-block

[-- Attachment #1: Type: text/plain, Size: 1172 bytes --]

On Tue, Dec 01, 2015 at 05:36:27PM +0800, Fam Zheng wrote:
> This is basically a supplementary fix of 06c3916b.
> 
> On 512 disks, the crash only happens when copy-on-read is enabled, which is
> covered by the previou fix.  But on 4k disks the write request that triggers
> the notifier itself may be serialised, in which case the read req from
> backup_do_cow will still be serialised in bdrv_aligned_preadv, resulting in the
> same assertion failure.
> 
> Fam Zheng (3):
>   block: Don't wait serialising for non-COR read requests
>   iotests: Add "add_drive_raw" method
>   iotests: Add regresion test case for write notifier assertion failure
> 
>  block/backup.c                |  2 +-
>  block/io.c                    | 12 +++++++-----
>  include/block/block.h         |  4 ++--
>  tests/qemu-iotests/056        | 25 +++++++++++++++++++++++++
>  tests/qemu-iotests/056.out    |  4 ++--
>  tests/qemu-iotests/iotests.py |  5 +++++
>  trace-events                  |  2 +-
>  7 files changed, 43 insertions(+), 11 deletions(-)
> 
> -- 
> 2.4.3
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-12-03  6:51 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-01  9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng
2015-12-01  9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng
2015-12-01  9:54   ` Kevin Wolf
2015-12-01 10:26     ` Fam Zheng
2015-12-01 11:00       ` Kevin Wolf
2015-12-01  9:36 ` [Qemu-devel] [PATCH 2/3] iotests: Add "add_drive_raw" method Fam Zheng
2015-12-01  9:36 ` [Qemu-devel] [PATCH 3/3] iotests: Add regresion test case for write notifier assertion failure Fam Zheng
2015-12-03  6:50 ` [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).