* [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again @ 2015-12-01 9:36 Fam Zheng 2015-12-01 9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng ` (3 more replies) 0 siblings, 4 replies; 8+ messages in thread From: Fam Zheng @ 2015-12-01 9:36 UTC (permalink / raw) To: qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Stefan Hajnoczi This is basically a supplementary fix of 06c3916b. On 512 disks, the crash only happens when copy-on-read is enabled, which is covered by the previou fix. But on 4k disks the write request that triggers the notifier itself may be serialised, in which case the read req from backup_do_cow will still be serialised in bdrv_aligned_preadv, resulting in the same assertion failure. Fam Zheng (3): block: Don't wait serialising for non-COR read requests iotests: Add "add_drive_raw" method iotests: Add regresion test case for write notifier assertion failure block/backup.c | 2 +- block/io.c | 12 +++++++----- include/block/block.h | 4 ++-- tests/qemu-iotests/056 | 25 +++++++++++++++++++++++++ tests/qemu-iotests/056.out | 4 ++-- tests/qemu-iotests/iotests.py | 5 +++++ trace-events | 2 +- 7 files changed, 43 insertions(+), 11 deletions(-) -- 2.4.3 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests 2015-12-01 9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng @ 2015-12-01 9:36 ` Fam Zheng 2015-12-01 9:54 ` Kevin Wolf 2015-12-01 9:36 ` [Qemu-devel] [PATCH 2/3] iotests: Add "add_drive_raw" method Fam Zheng ` (2 subsequent siblings) 3 siblings, 1 reply; 8+ messages in thread From: Fam Zheng @ 2015-12-01 9:36 UTC (permalink / raw) To: qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Stefan Hajnoczi The assertion problem was noticed in 06c3916b35a, but it wasn't completely fixed, because even though the req is not marked as serialising, it still gets serialised by wait_serialising_requests against other serialising requests, which could lead to the same assertion failure. Fix it by even more explicitly skipping the serialising for this specific case. Signed-off-by: Fam Zheng <famz@redhat.com> --- block/backup.c | 2 +- block/io.c | 12 +++++++----- include/block/block.h | 4 ++-- trace-events | 2 +- 4 files changed, 11 insertions(+), 9 deletions(-) diff --git a/block/backup.c b/block/backup.c index 3b39119..705bb77 100644 --- a/block/backup.c +++ b/block/backup.c @@ -132,7 +132,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs, qemu_iovec_init_external(&bounce_qiov, &iov, 1); if (is_write_notifier) { - ret = bdrv_co_no_copy_on_readv(bs, + ret = bdrv_co_readv_no_serialising(bs, start * BACKUP_SECTORS_PER_CLUSTER, n, &bounce_qiov); } else { diff --git a/block/io.c b/block/io.c index adc1eab..e00fb5d 100644 --- a/block/io.c +++ b/block/io.c @@ -863,7 +863,9 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs, mark_request_serialising(req, bdrv_get_cluster_size(bs)); } - wait_serialising_requests(req); + if (!(flags & BDRV_REQ_NO_SERIALISING)) { + wait_serialising_requests(req); + } if (flags & BDRV_REQ_COPY_ON_READ) { int pnum; @@ -952,7 +954,7 @@ static int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs, } /* Don't do copy-on-read if we read data before write operation */ - if (bs->copy_on_read && !(flags & BDRV_REQ_NO_COPY_ON_READ)) { + if (bs->copy_on_read && !(flags & BDRV_REQ_NO_SERIALISING)) { flags |= BDRV_REQ_COPY_ON_READ; } @@ -1021,13 +1023,13 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num, return bdrv_co_do_readv(bs, sector_num, nb_sectors, qiov, 0); } -int coroutine_fn bdrv_co_no_copy_on_readv(BlockDriverState *bs, +int coroutine_fn bdrv_co_readv_no_serialising(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov) { - trace_bdrv_co_no_copy_on_readv(bs, sector_num, nb_sectors); + trace_bdrv_co_readv_no_serialising(bs, sector_num, nb_sectors); return bdrv_co_do_readv(bs, sector_num, nb_sectors, qiov, - BDRV_REQ_NO_COPY_ON_READ); + BDRV_REQ_NO_SERIALISING); } int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs, diff --git a/include/block/block.h b/include/block/block.h index 73edb1a..3477328 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -61,7 +61,7 @@ typedef enum { * opened with BDRV_O_UNMAP. */ BDRV_REQ_MAY_UNMAP = 0x4, - BDRV_REQ_NO_COPY_ON_READ = 0x8, + BDRV_REQ_NO_SERIALISING = 0x8, } BdrvRequestFlags; typedef struct BlockSizes { @@ -248,7 +248,7 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); -int coroutine_fn bdrv_co_no_copy_on_readv(BlockDriverState *bs, +int coroutine_fn bdrv_co_readv_no_serialising(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); diff --git a/trace-events b/trace-events index 0b0ff02..2fce98e 100644 --- a/trace-events +++ b/trace-events @@ -69,7 +69,7 @@ bdrv_aio_write_zeroes(void *bs, int64_t sector_num, int nb_sectors, int flags, v bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d" bdrv_co_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d" bdrv_co_copy_on_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d" -bdrv_co_no_copy_on_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d" +bdrv_co_readv_no_serialising(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d" bdrv_co_writev(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d" bdrv_co_write_zeroes(void *bs, int64_t sector_num, int nb_sector, int flags) "bs %p sector_num %"PRId64" nb_sectors %d flags %#x" bdrv_co_io_em(void *bs, int64_t sector_num, int nb_sectors, int is_write, void *acb) "bs %p sector_num %"PRId64" nb_sectors %d is_write %d acb %p" -- 2.4.3 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests 2015-12-01 9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng @ 2015-12-01 9:54 ` Kevin Wolf 2015-12-01 10:26 ` Fam Zheng 0 siblings, 1 reply; 8+ messages in thread From: Kevin Wolf @ 2015-12-01 9:54 UTC (permalink / raw) To: Fam Zheng; +Cc: Jeff Cody, qemu-block, qemu-devel, Stefan Hajnoczi Am 01.12.2015 um 10:36 hat Fam Zheng geschrieben: > The assertion problem was noticed in 06c3916b35a, but it wasn't > completely fixed, because even though the req is not marked as > serialising, it still gets serialised by wait_serialising_requests > against other serialising requests, which could lead to the same > assertion failure. > > Fix it by even more explicitly skipping the serialising for this > specific case. > > Signed-off-by: Fam Zheng <famz@redhat.com> And this, my friends, is another example why read/write notifiers are wrong and should die sooner rather than later. </broken-record> Kevin ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests 2015-12-01 9:54 ` Kevin Wolf @ 2015-12-01 10:26 ` Fam Zheng 2015-12-01 11:00 ` Kevin Wolf 0 siblings, 1 reply; 8+ messages in thread From: Fam Zheng @ 2015-12-01 10:26 UTC (permalink / raw) To: Kevin Wolf; +Cc: Jeff Cody, qemu-block, qemu-devel, Stefan Hajnoczi On Tue, 12/01 10:54, Kevin Wolf wrote: > Am 01.12.2015 um 10:36 hat Fam Zheng geschrieben: > > The assertion problem was noticed in 06c3916b35a, but it wasn't > > completely fixed, because even though the req is not marked as > > serialising, it still gets serialised by wait_serialising_requests > > against other serialising requests, which could lead to the same > > assertion failure. > > > > Fix it by even more explicitly skipping the serialising for this > > specific case. > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > And this, my friends, is another example why read/write notifiers are > wrong and should die sooner rather than later. </broken-record> > Yes, I agree, except it's not clear to me what a better alternative solution should be. A immediate question is, with whatever approach we will have, wouldn't we still need to do this sort of "reentrant" COW before the data is overwritten? Fam ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests 2015-12-01 10:26 ` Fam Zheng @ 2015-12-01 11:00 ` Kevin Wolf 0 siblings, 0 replies; 8+ messages in thread From: Kevin Wolf @ 2015-12-01 11:00 UTC (permalink / raw) To: Fam Zheng; +Cc: Jeff Cody, qemu-block, qemu-devel, Stefan Hajnoczi Am 01.12.2015 um 11:26 hat Fam Zheng geschrieben: > On Tue, 12/01 10:54, Kevin Wolf wrote: > > Am 01.12.2015 um 10:36 hat Fam Zheng geschrieben: > > > The assertion problem was noticed in 06c3916b35a, but it wasn't > > > completely fixed, because even though the req is not marked as > > > serialising, it still gets serialised by wait_serialising_requests > > > against other serialising requests, which could lead to the same > > > assertion failure. > > > > > > Fix it by even more explicitly skipping the serialising for this > > > specific case. > > > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > > > And this, my friends, is another example why read/write notifiers are > > wrong and should die sooner rather than later. </broken-record> > > > > Yes, I agree, except it's not clear to me what a better alternative solution > should be. A immediate question is, with whatever approach we will have, > wouldn't we still need to do this sort of "reentrant" COW before the data is > overwritten? If the backup job could temporarily insert a filter driver, you wouldn't get reentrance, but another BDS in the stack. Now that much of the blockdev-add stuff seems to be falling into place at last, dynamic reconfiguration and filters might be the next big nut for us to crack. Kevin ^ permalink raw reply [flat|nested] 8+ messages in thread
* [Qemu-devel] [PATCH 2/3] iotests: Add "add_drive_raw" method 2015-12-01 9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng 2015-12-01 9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng @ 2015-12-01 9:36 ` Fam Zheng 2015-12-01 9:36 ` [Qemu-devel] [PATCH 3/3] iotests: Add regresion test case for write notifier assertion failure Fam Zheng 2015-12-03 6:50 ` [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Stefan Hajnoczi 3 siblings, 0 replies; 8+ messages in thread From: Fam Zheng @ 2015-12-01 9:36 UTC (permalink / raw) To: qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Stefan Hajnoczi This offers full manual control over the "-drive" options. Signed-off-by: Fam Zheng <famz@redhat.com> --- tests/qemu-iotests/iotests.py | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py index ff5905f..e02245e 100644 --- a/tests/qemu-iotests/iotests.py +++ b/tests/qemu-iotests/iotests.py @@ -140,6 +140,11 @@ class VM(object): self._args.append('-monitor') self._args.append(args) + def add_drive_raw(self, opts): + self._args.append('-drive') + self._args.append(opts) + return self + def add_drive(self, path, opts='', interface='virtio'): '''Add a virtio-blk drive to the VM''' options = ['if=%s' % interface, -- 2.4.3 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Qemu-devel] [PATCH 3/3] iotests: Add regresion test case for write notifier assertion failure 2015-12-01 9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng 2015-12-01 9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng 2015-12-01 9:36 ` [Qemu-devel] [PATCH 2/3] iotests: Add "add_drive_raw" method Fam Zheng @ 2015-12-01 9:36 ` Fam Zheng 2015-12-03 6:50 ` [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Stefan Hajnoczi 3 siblings, 0 replies; 8+ messages in thread From: Fam Zheng @ 2015-12-01 9:36 UTC (permalink / raw) To: qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Stefan Hajnoczi The idea is to let the top level bs have a big request alignment with blkdebug, so that the aio_write request issued from monitor will be serialised. This tests that QEMU doesn't crash upon the read request from the backup job's write notifier, which is a very special case of "reentrant" request. Signed-off-by: Fam Zheng <famz@redhat.com> --- tests/qemu-iotests/056 | 25 +++++++++++++++++++++++++ tests/qemu-iotests/056.out | 4 ++-- 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056 index 54e4bd0..04f2c3c 100755 --- a/tests/qemu-iotests/056 +++ b/tests/qemu-iotests/056 @@ -82,6 +82,31 @@ class TestSyncModesNoneAndTop(iotests.QMPTestCase): time.sleep(1) self.assertEqual(-1, qemu_io('-c', 'read -P0x41 0 512', target_img).find("verification failed")) +class TestBeforeWriteNotifier(iotests.QMPTestCase): + def setUp(self): + self.vm = iotests.VM().add_drive_raw("file=blkdebug::null-co://,id=drive0,align=65536,driver=blkdebug") + self.vm.launch() + + def tearDown(self): + self.vm.shutdown() + os.remove(target_img) + + def test_before_write_notifier(self): + self.vm.pause_drive("drive0") + result = self.vm.qmp('drive-backup', device='drive0', + sync='full', target=target_img, + format="file", speed=1) + self.assert_qmp(result, 'return', {}) + result = self.vm.qmp('block-job-pause', device="drive0") + self.assert_qmp(result, 'return', {}) + # Speed is low enough that this must be an uncopied range, which will + # trigger the before write notifier + self.vm.hmp_qemu_io('drive0', 'aio_write -P 1 512512 512') + self.vm.resume_drive("drive0") + result = self.vm.qmp('block-job-resume', device="drive0") + self.assert_qmp(result, 'return', {}) + event = self.cancel_and_wait() + self.assert_qmp(event, 'data/type', 'backup') if __name__ == '__main__': iotests.main(supported_fmts=['qcow2', 'qed']) diff --git a/tests/qemu-iotests/056.out b/tests/qemu-iotests/056.out index fbc63e6..8d7e996 100644 --- a/tests/qemu-iotests/056.out +++ b/tests/qemu-iotests/056.out @@ -1,5 +1,5 @@ -.. +... ---------------------------------------------------------------------- -Ran 2 tests +Ran 3 tests OK -- 2.4.3 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again 2015-12-01 9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng ` (2 preceding siblings ...) 2015-12-01 9:36 ` [Qemu-devel] [PATCH 3/3] iotests: Add regresion test case for write notifier assertion failure Fam Zheng @ 2015-12-03 6:50 ` Stefan Hajnoczi 3 siblings, 0 replies; 8+ messages in thread From: Stefan Hajnoczi @ 2015-12-03 6:50 UTC (permalink / raw) To: Fam Zheng; +Cc: Kevin Wolf, Jeff Cody, qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 1172 bytes --] On Tue, Dec 01, 2015 at 05:36:27PM +0800, Fam Zheng wrote: > This is basically a supplementary fix of 06c3916b. > > On 512 disks, the crash only happens when copy-on-read is enabled, which is > covered by the previou fix. But on 4k disks the write request that triggers > the notifier itself may be serialised, in which case the read req from > backup_do_cow will still be serialised in bdrv_aligned_preadv, resulting in the > same assertion failure. > > Fam Zheng (3): > block: Don't wait serialising for non-COR read requests > iotests: Add "add_drive_raw" method > iotests: Add regresion test case for write notifier assertion failure > > block/backup.c | 2 +- > block/io.c | 12 +++++++----- > include/block/block.h | 4 ++-- > tests/qemu-iotests/056 | 25 +++++++++++++++++++++++++ > tests/qemu-iotests/056.out | 4 ++-- > tests/qemu-iotests/iotests.py | 5 +++++ > trace-events | 2 +- > 7 files changed, 43 insertions(+), 11 deletions(-) > > -- > 2.4.3 > Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-12-03 6:51 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-12-01 9:36 [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Fam Zheng 2015-12-01 9:36 ` [Qemu-devel] [PATCH 1/3] block: Don't wait serialising for non-COR read requests Fam Zheng 2015-12-01 9:54 ` Kevin Wolf 2015-12-01 10:26 ` Fam Zheng 2015-12-01 11:00 ` Kevin Wolf 2015-12-01 9:36 ` [Qemu-devel] [PATCH 2/3] iotests: Add "add_drive_raw" method Fam Zheng 2015-12-01 9:36 ` [Qemu-devel] [PATCH 3/3] iotests: Add regresion test case for write notifier assertion failure Fam Zheng 2015-12-03 6:50 ` [Qemu-devel] [PATCH 0/3] block: Fix assertion failure with before write notifier again Stefan Hajnoczi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).