* [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading
@ 2018-07-10 6:31 Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 01/10] block: Prefix file driver trace points with "file_" Fam Zheng
` (10 more replies)
0 siblings, 11 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
Based-on: 20180709163719.87107-1-vsementsov@virtuozzo.com
v3: - Rebase onto Vladmir's:
[PATCH v5 0/4] fix image fleecing
on top of master, which has blklogwrites to be converted to BdrvChild for
the bdrv_co_pdiscard() parameter. [Kevin]
- Add file_ prefix to file protocol trace points. [Kevin]
- Assert that BdrvTrackedRequest bytes and offset don't overflow INT64_MAX.
[Kevin]
- Don't misuse req->offset/req->bytes in bdrv_co_write_req_prepare/finish.
[Kevin]
- Fix stat64_max. [Kevin]
- Keep lines within 80 columns. [Kevin]
- Grammar fix in commit message and rev-by. [Eric]
Qcow2 allocates new clusters after the end of the file. If it is the destinaton
of copy offloading, we must adjust dst->bs->total_sectors. Otherwise, further
reads will drop to the "beyond EOF" code path and return zeroes, which problem
is caught by iotests 222.
Follow the logic in the normal write code and update bs->total_sectors after
I/O is done.
While at it, add a few convenient trace points to aid future debug experiences
in the topic.
Fam Zheng (10):
block: Prefix file driver trace points with "file_"
block: Add copy offloading trace points
block: Use BdrvChild to discard
block: Use uint64_t for BdrvTrackedRequest byte fields
block: Extract common write req handling
block: Fix handling of image enlarging write
block: Use common req handling for discard
block: Use common req handling in copy offloading
block: Fix bdrv_co_truncate overlap check
block: Use common write req handling in truncate
block/blkdebug.c | 2 +-
block/blklogwrites.c | 2 +-
block/blkreplay.c | 2 +-
block/block-backend.c | 2 +-
block/copy-on-read.c | 2 +-
block/file-posix.c | 4 +-
block/file-win32.c | 2 +-
block/io.c | 213 ++++++++++++++++++++++++--------------
block/iscsi.c | 3 +
block/mirror.c | 2 +-
block/qcow2-refcount.c | 2 +-
block/raw-format.c | 2 +-
block/throttle.c | 2 +-
block/trace-events | 10 +-
include/block/block.h | 4 +-
include/block/block_int.h | 4 +-
16 files changed, 163 insertions(+), 95 deletions(-)
--
2.17.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 01/10] block: Prefix file driver trace points with "file_"
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 02/10] block: Add copy offloading trace points Fam Zheng
` (9 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
With in one module, trace points usually have a common prefix named
after the module name. paio_submit and paio_submit_co are the only two
trace points so far in the two file protocol drivers. As we are adding
more, having a common prefix here is better so that trace points can be
enabled with a glob. Rename them.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/file-posix.c | 2 +-
block/file-win32.c | 2 +-
block/trace-events | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 4fec8cb53c..1185c7c5cc 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1743,7 +1743,7 @@ static int paio_submit_co_full(BlockDriverState *bs, int fd,
assert(qiov->size == bytes);
}
- trace_paio_submit_co(offset, bytes, type);
+ trace_file_paio_submit_co(offset, bytes, type);
pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
return thread_pool_submit_co(pool, aio_worker, acb);
}
diff --git a/block/file-win32.c b/block/file-win32.c
index 0411fe80fd..f1e2187f3b 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -162,7 +162,7 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
acb->aio_nbytes = count;
acb->aio_offset = offset;
- trace_paio_submit(acb, opaque, offset, count, type);
+ trace_file_paio_submit(acb, opaque, offset, count, type);
pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
}
diff --git a/block/trace-events b/block/trace-events
index c35287b48a..854d5525af 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -55,8 +55,8 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p"
# block/file-win32.c
# block/file-posix.c
-paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d"
-paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d"
+file_paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d"
+file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d"
# block/qcow2.c
qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" PRIx64 " bytes %d"
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 02/10] block: Add copy offloading trace points
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 01/10] block: Prefix file driver trace points with "file_" Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 03/10] block: Use BdrvChild to discard Fam Zheng
` (8 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
A few trace points that can help reveal what is happening in a copy
offloading I/O path.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/file-posix.c | 2 ++
block/io.c | 4 ++++
block/iscsi.c | 3 +++
block/trace-events | 6 ++++++
4 files changed, 15 insertions(+)
diff --git a/block/file-posix.c b/block/file-posix.c
index 1185c7c5cc..09f6b938f6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1488,6 +1488,8 @@ static ssize_t handle_aiocb_copy_range(RawPosixAIOData *aiocb)
ssize_t ret = copy_file_range(aiocb->aio_fildes, &in_off,
aiocb->aio_fd2, &out_off,
bytes, 0);
+ trace_file_copy_file_range(aiocb->bs, aiocb->aio_fildes, in_off,
+ aiocb->aio_fd2, out_off, bytes, 0, ret);
if (ret == 0) {
/* No progress (e.g. when beyond EOF), let the caller fall back to
* buffer I/O. */
diff --git a/block/io.c b/block/io.c
index 102cb0e1ba..600060c936 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2997,6 +2997,8 @@ int coroutine_fn bdrv_co_copy_range_from(BdrvChild *src, uint64_t src_offset,
BdrvRequestFlags read_flags,
BdrvRequestFlags write_flags)
{
+ trace_bdrv_co_copy_range_from(src, src_offset, dst, dst_offset, bytes,
+ read_flags, write_flags);
return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset,
bytes, read_flags, write_flags, true);
}
@@ -3011,6 +3013,8 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, uint64_t src_offset,
BdrvRequestFlags read_flags,
BdrvRequestFlags write_flags)
{
+ trace_bdrv_co_copy_range_to(src, src_offset, dst, dst_offset, bytes,
+ read_flags, write_flags);
return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset,
bytes, read_flags, write_flags, false);
}
diff --git a/block/iscsi.c b/block/iscsi.c
index 38b917a1e5..bb69faf34a 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -44,6 +44,7 @@
#include "qapi/qmp/qstring.h"
#include "crypto/secret.h"
#include "scsi/utils.h"
+#include "trace.h"
/* Conflict between scsi/utils.h and libiscsi! :( */
#define SCSI_XFER_NONE ISCSI_XFER_NONE
@@ -2399,6 +2400,8 @@ retry:
}
out_unlock:
+
+ trace_iscsi_xcopy(src_lun, src_offset, dst_lun, dst_offset, bytes, r);
g_free(iscsi_task.task);
qemu_mutex_unlock(&dst_lun->mutex);
g_free(iscsi_task.err_str);
diff --git a/block/trace-events b/block/trace-events
index 854d5525af..3e8c47bb24 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -15,6 +15,8 @@ bdrv_co_preadv(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs
bdrv_co_pwritev(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x"
bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p offset %"PRId64" count %d flags 0x%x"
bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t cluster_offset, int64_t cluster_bytes) "bs %p offset %"PRId64" bytes %u cluster_offset %"PRId64" cluster_bytes %"PRId64
+bdrv_co_copy_range_from(void *src, uint64_t src_offset, void *dst, uint64_t dst_offset, uint64_t bytes, int read_flags, int write_flags) "src %p offset %"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" rw flags 0x%x 0x%x"
+bdrv_co_copy_range_to(void *src, uint64_t src_offset, void *dst, uint64_t dst_offset, uint64_t bytes, int read_flags, int write_flags) "src %p offset %"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" rw flags 0x%x 0x%x"
# block/stream.c
stream_one_iteration(void *s, int64_t offset, uint64_t bytes, int is_allocated) "s %p offset %" PRId64 " bytes %" PRIu64 " is_allocated %d"
@@ -57,6 +59,7 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p"
# block/file-posix.c
file_paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d"
file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d"
+file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_off, int64_t bytes, int flags, int64_t ret) "bs %p src_fd %d offset %"PRIu64" dst_fd %d offset %"PRIu64" bytes %"PRIu64" flags %d ret %"PRId64
# block/qcow2.c
qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" PRIx64 " bytes %d"
@@ -150,3 +153,6 @@ nvme_free_req_queue_wait(void *q) "q %p"
nvme_cmd_map_qiov(void *s, void *cmd, void *req, void *qiov, int entries) "s %p cmd %p req %p qiov %p entries %d"
nvme_cmd_map_qiov_pages(void *s, int i, uint64_t page) "s %p page[%d] 0x%"PRIx64
nvme_cmd_map_qiov_iov(void *s, int i, void *page, int pages) "s %p iov[%d] %p pages %d"
+
+# block/iscsi.c
+iscsi_xcopy(void *src_lun, uint64_t src_off, void *dst_lun, uint64_t dst_off, uint64_t bytes, int ret) "src_lun %p offset %"PRIu64" dst_lun %p offset %"PRIu64" bytes %"PRIu64" ret %d"
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 03/10] block: Use BdrvChild to discard
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 01/10] block: Prefix file driver trace points with "file_" Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 02/10] block: Add copy offloading trace points Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 04/10] block: Use uint64_t for BdrvTrackedRequest byte fields Fam Zheng
` (7 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
Other I/O functions are already using a BdrvChild pointer in the API, so
make discard do the same. It makes it possible to initiate the same
permission checks before doing I/O, and much easier to share the
helper functions for this, which will be added and used by write,
truncate and copy range paths.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/blkdebug.c | 2 +-
block/blklogwrites.c | 2 +-
block/blkreplay.c | 2 +-
block/block-backend.c | 2 +-
block/copy-on-read.c | 2 +-
block/io.c | 18 +++++++++---------
block/mirror.c | 2 +-
block/qcow2-refcount.c | 2 +-
block/raw-format.c | 2 +-
block/throttle.c | 2 +-
include/block/block.h | 4 ++--
11 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/block/blkdebug.c b/block/blkdebug.c
index 526af2a808..0457bf5b66 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -625,7 +625,7 @@ static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
return err;
}
- return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
}
static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs,
diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index 63bf6b34a9..34e98e4ff5 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -483,7 +483,7 @@ static int coroutine_fn blk_log_writes_co_do_file_flush(BlkLogWritesFileReq *fr)
static int coroutine_fn
blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr)
{
- return bdrv_co_pdiscard(fr->bs->file->bs, fr->offset, fr->bytes);
+ return bdrv_co_pdiscard(fr->bs->file, fr->offset, fr->bytes);
}
static int coroutine_fn
diff --git a/block/blkreplay.c b/block/blkreplay.c
index b016dbeee7..766150ade6 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -113,7 +113,7 @@ static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs,
int64_t offset, int bytes)
{
uint64_t reqid = blkreplay_next_id();
- int ret = bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+ int ret = bdrv_co_pdiscard(bs->file, offset, bytes);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
diff --git a/block/block-backend.c b/block/block-backend.c
index ac8c3e0b1c..fa120630be 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1559,7 +1559,7 @@ int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
return ret;
}
- return bdrv_co_pdiscard(blk_bs(blk), offset, bytes);
+ return bdrv_co_pdiscard(blk->root, offset, bytes);
}
int blk_co_flush(BlockBackend *blk)
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 1dcdaeed69..a19164f9eb 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -116,7 +116,7 @@ static int coroutine_fn cor_co_pwrite_zeroes(BlockDriverState *bs,
static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs,
int64_t offset, int bytes)
{
- return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
}
diff --git a/block/io.c b/block/io.c
index 600060c936..77f35b1013 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2611,7 +2611,7 @@ int bdrv_flush(BlockDriverState *bs)
}
typedef struct DiscardCo {
- BlockDriverState *bs;
+ BdrvChild *child;
int64_t offset;
int bytes;
int ret;
@@ -2620,17 +2620,17 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
{
DiscardCo *rwco = opaque;
- rwco->ret = bdrv_co_pdiscard(rwco->bs, rwco->offset, rwco->bytes);
+ rwco->ret = bdrv_co_pdiscard(rwco->child, rwco->offset, rwco->bytes);
}
-int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
- int bytes)
+int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
{
BdrvTrackedRequest req;
int max_pdiscard, ret;
int head, tail, align;
+ BlockDriverState *bs = child->bs;
- if (!bs->drv) {
+ if (!bs || !bs->drv) {
return -ENOMEDIUM;
}
@@ -2741,11 +2741,11 @@ out:
return ret;
}
-int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
+int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
{
Coroutine *co;
DiscardCo rwco = {
- .bs = bs,
+ .child = child,
.offset = offset,
.bytes = bytes,
.ret = NOT_DONE,
@@ -2756,8 +2756,8 @@ int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
bdrv_pdiscard_co_entry(&rwco);
} else {
co = qemu_coroutine_create(bdrv_pdiscard_co_entry, &rwco);
- bdrv_coroutine_enter(bs, co);
- BDRV_POLL_WHILE(bs, rwco.ret == NOT_DONE);
+ bdrv_coroutine_enter(child->bs, co);
+ BDRV_POLL_WHILE(child->bs, rwco.ret == NOT_DONE);
}
return rwco.ret;
diff --git a/block/mirror.c b/block/mirror.c
index 61bd9f3cf1..b48c3f8cf5 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1333,7 +1333,7 @@ static int coroutine_fn bdrv_mirror_top_do_write(BlockDriverState *bs,
break;
case MIRROR_METHOD_DISCARD:
- ret = bdrv_co_pdiscard(bs->backing->bs, offset, bytes);
+ ret = bdrv_co_pdiscard(bs->backing, offset, bytes);
break;
default:
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 18c729aa27..4e1589ad7a 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -734,7 +734,7 @@ void qcow2_process_discards(BlockDriverState *bs, int ret)
/* Discard is optional, ignore the return value */
if (ret >= 0) {
- bdrv_pdiscard(bs->file->bs, d->offset, d->bytes);
+ bdrv_pdiscard(bs->file, d->offset, d->bytes);
}
g_free(d);
diff --git a/block/raw-format.c b/block/raw-format.c
index a3591985f6..dee262875a 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -297,7 +297,7 @@ static int coroutine_fn raw_co_pdiscard(BlockDriverState *bs,
if (ret) {
return ret;
}
- return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
}
static int64_t raw_getlength(BlockDriverState *bs)
diff --git a/block/throttle.c b/block/throttle.c
index f617f23a12..636c9764aa 100644
--- a/block/throttle.c
+++ b/block/throttle.c
@@ -149,7 +149,7 @@ static int coroutine_fn throttle_co_pdiscard(BlockDriverState *bs,
ThrottleGroupMember *tgm = bs->opaque;
throttle_group_co_io_limits_intercept(tgm, bytes, true);
- return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
}
static int throttle_co_flush(BlockDriverState *bs)
diff --git a/include/block/block.h b/include/block/block.h
index 6623d15432..875284de6d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -416,8 +416,8 @@ AioWait *bdrv_get_aio_wait(BlockDriverState *bs);
bdrv_get_aio_context(bs_), \
cond); })
-int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes);
-int bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes);
+int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes);
+int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes);
int bdrv_has_zero_init_1(BlockDriverState *bs);
int bdrv_has_zero_init(BlockDriverState *bs);
bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 04/10] block: Use uint64_t for BdrvTrackedRequest byte fields
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
` (2 preceding siblings ...)
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 03/10] block: Use BdrvChild to discard Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 05/10] block: Extract common write req handling Fam Zheng
` (6 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
This matches the types used for bytes in the rest parts of block layer.
In the case of bdrv_co_truncate, new_bytes can be the image size which
probably doesn't fit in a 32 bit int.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/io.c | 8 +++++---
include/block/block_int.h | 4 ++--
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/block/io.c b/block/io.c
index 77f35b1013..c56e329bb5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -587,9 +587,11 @@ static void tracked_request_end(BdrvTrackedRequest *req)
static void tracked_request_begin(BdrvTrackedRequest *req,
BlockDriverState *bs,
int64_t offset,
- unsigned int bytes,
+ uint64_t bytes,
enum BdrvTrackedRequestType type)
{
+ assert(bytes <= INT64_MAX && offset <= INT64_MAX - bytes);
+
*req = (BdrvTrackedRequest){
.bs = bs,
.offset = offset,
@@ -611,7 +613,7 @@ static void tracked_request_begin(BdrvTrackedRequest *req,
static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
{
int64_t overlap_offset = req->offset & ~(align - 1);
- unsigned int overlap_bytes = ROUND_UP(req->offset + req->bytes, align)
+ uint64_t overlap_bytes = ROUND_UP(req->offset + req->bytes, align)
- overlap_offset;
if (!req->serialising) {
@@ -667,7 +669,7 @@ static int bdrv_get_cluster_size(BlockDriverState *bs)
}
static bool tracked_request_overlaps(BdrvTrackedRequest *req,
- int64_t offset, unsigned int bytes)
+ int64_t offset, uint64_t bytes)
{
/* aaaa bbbb */
if (offset >= req->overlap_offset + req->overlap_bytes) {
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 119b6e4ea5..90a33e4b0e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -69,12 +69,12 @@ enum BdrvTrackedRequestType {
typedef struct BdrvTrackedRequest {
BlockDriverState *bs;
int64_t offset;
- unsigned int bytes;
+ uint64_t bytes;
enum BdrvTrackedRequestType type;
bool serialising;
int64_t overlap_offset;
- unsigned int overlap_bytes;
+ uint64_t overlap_bytes;
QLIST_ENTRY(BdrvTrackedRequest) list;
Coroutine *co; /* owner, used for deadlock detection */
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 05/10] block: Extract common write req handling
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
` (3 preceding siblings ...)
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 04/10] block: Use uint64_t for BdrvTrackedRequest byte fields Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 06/10] block: Fix handling of image enlarging write Fam Zheng
` (5 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
As a mechanical refactoring patch, this is the first step towards
unified and more correct write code paths. This is helpful because
multiple BlockDriverState fields need to be updated after modifying
image data, and it's hard to maintain in multiple places such as copy
offload, discard and truncate.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/io.c | 91 ++++++++++++++++++++++++++++++++++--------------------
1 file changed, 57 insertions(+), 34 deletions(-)
diff --git a/block/io.c b/block/io.c
index c56e329bb5..960e1492d0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1553,6 +1553,61 @@ fail:
return ret;
}
+static inline int coroutine_fn
+bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
+ BdrvTrackedRequest *req, int flags)
+{
+ BlockDriverState *bs = child->bs;
+ bool waited;
+ int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+
+ if (bs->read_only) {
+ return -EPERM;
+ }
+
+ /* BDRV_REQ_NO_SERIALISING is only for read operation */
+ assert(!(flags & BDRV_REQ_NO_SERIALISING));
+ assert(!(bs->open_flags & BDRV_O_INACTIVE));
+ assert((bs->open_flags & BDRV_O_NO_IO) == 0);
+ assert(!(flags & ~BDRV_REQ_MASK));
+
+ if (flags & BDRV_REQ_SERIALISING) {
+ mark_request_serialising(req, bdrv_get_cluster_size(bs));
+ }
+
+ waited = wait_serialising_requests(req);
+
+ assert(!waited || !req->serialising ||
+ is_request_serialising_and_aligned(req));
+ assert(req->overlap_offset <= offset);
+ assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
+
+ if (flags & BDRV_REQ_WRITE_UNCHANGED) {
+ assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
+ } else {
+ assert(child->perm & BLK_PERM_WRITE);
+ }
+ assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
+ return notifier_with_return_list_notify(&bs->before_write_notifiers, req);
+}
+
+static inline void coroutine_fn
+bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes,
+ BdrvTrackedRequest *req, int ret)
+{
+ int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+ BlockDriverState *bs = child->bs;
+
+ atomic_inc(&bs->write_gen);
+ bdrv_set_dirty(bs, offset, bytes);
+
+ stat64_max(&bs->wr_highest_offset, offset + bytes);
+
+ if (ret == 0) {
+ bs->total_sectors = MAX(bs->total_sectors, end_sector);
+ }
+}
+
/*
* Forwards an already correctly aligned write request to the BlockDriver,
* after possibly fragmenting it.
@@ -1563,10 +1618,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
{
BlockDriverState *bs = child->bs;
BlockDriver *drv = bs->drv;
- bool waited;
int ret;
- int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
uint64_t bytes_remaining = bytes;
int max_transfer;
@@ -1582,31 +1635,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
assert((offset & (align - 1)) == 0);
assert((bytes & (align - 1)) == 0);
assert(!qiov || bytes == qiov->size);
- assert((bs->open_flags & BDRV_O_NO_IO) == 0);
- assert(!(flags & ~BDRV_REQ_MASK));
max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
align);
- /* BDRV_REQ_NO_SERIALISING is only for read operation */
- assert(!(flags & BDRV_REQ_NO_SERIALISING));
-
- if (flags & BDRV_REQ_SERIALISING) {
- mark_request_serialising(req, bdrv_get_cluster_size(bs));
- }
-
- waited = wait_serialising_requests(req);
- assert(!waited || !req->serialising ||
- is_request_serialising_and_aligned(req));
- assert(req->overlap_offset <= offset);
- assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
- if (flags & BDRV_REQ_WRITE_UNCHANGED) {
- assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
- } else {
- assert(child->perm & BLK_PERM_WRITE);
- }
- assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
-
- ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);
+ ret = bdrv_co_write_req_prepare(child, offset, bytes, req, flags);
if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF &&
!(flags & BDRV_REQ_ZERO_WRITE) && drv->bdrv_co_pwrite_zeroes &&
@@ -1655,15 +1687,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
}
bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
- atomic_inc(&bs->write_gen);
- bdrv_set_dirty(bs, offset, bytes);
-
- stat64_max(&bs->wr_highest_offset, offset + bytes);
-
if (ret >= 0) {
- bs->total_sectors = MAX(bs->total_sectors, end_sector);
ret = 0;
}
+ bdrv_co_write_req_finish(child, offset, bytes, req, ret);
return ret;
}
@@ -1778,10 +1805,6 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
if (!bs->drv) {
return -ENOMEDIUM;
}
- if (bs->read_only) {
- return -EPERM;
- }
- assert(!(bs->open_flags & BDRV_O_INACTIVE));
ret = bdrv_check_byte_request(bs, offset, bytes);
if (ret < 0) {
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 06/10] block: Fix handling of image enlarging write
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
` (4 preceding siblings ...)
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 05/10] block: Extract common write req handling Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 07/10] block: Use common req handling for discard Fam Zheng
` (4 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
Two problems exist when a write request that enlarges the image (i.e.
write beyond EOF) finishes:
1) parent is not notified about size change;
2) dirty bitmap is not resized although we try to set the dirty bits;
Fix them just like how bdrv_co_truncate works.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/io.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/block/io.c b/block/io.c
index 960e1492d0..10a475302a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -40,6 +40,7 @@
static AioWait drain_all_aio_wait;
+static void bdrv_parent_cb_resize(BlockDriverState *bs);
static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int bytes, BdrvRequestFlags flags);
@@ -1599,13 +1600,16 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes,
BlockDriverState *bs = child->bs;
atomic_inc(&bs->write_gen);
- bdrv_set_dirty(bs, offset, bytes);
stat64_max(&bs->wr_highest_offset, offset + bytes);
- if (ret == 0) {
- bs->total_sectors = MAX(bs->total_sectors, end_sector);
+ if (ret == 0 &&
+ end_sector > bs->total_sectors) {
+ bs->total_sectors = end_sector;
+ bdrv_parent_cb_resize(bs);
+ bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
}
+ bdrv_set_dirty(bs, offset, bytes);
}
/*
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 07/10] block: Use common req handling for discard
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
` (5 preceding siblings ...)
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 06/10] block: Fix handling of image enlarging write Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 14:45 ` Kevin Wolf
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 08/10] block: Use common req handling in copy offloading Fam Zheng
` (3 subsequent siblings)
10 siblings, 1 reply; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
Reuse the new bdrv_co_write_req_prepare/finish helpers. The variation
here is that discard requests don't affect bs->wr_highest_offset, and it
cannot extend the image.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/io.c | 32 ++++++++++++++++++++++----------
1 file changed, 22 insertions(+), 10 deletions(-)
diff --git a/block/io.c b/block/io.c
index 10a475302a..cc426bab2e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1601,15 +1601,31 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes,
atomic_inc(&bs->write_gen);
- stat64_max(&bs->wr_highest_offset, offset + bytes);
-
+ /* Discard cannot extend the image, but in error handling cases, such as
+ * when reverting a qcow2 cluster allocation, the discarded range can pass
+ * the end of image file, so we cannot assert about BDRV_TRACKED_DISCARD
+ * here. Instead, just skip it, since semantically a discard request
+ * beyond EOF cannot expand the image anyway.
+ * */
if (ret == 0 &&
- end_sector > bs->total_sectors) {
+ end_sector > bs->total_sectors &&
+ req->type != BDRV_TRACKED_DISCARD) {
bs->total_sectors = end_sector;
bdrv_parent_cb_resize(bs);
bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
}
- bdrv_set_dirty(bs, offset, bytes);
+ if (req->bytes) {
+ switch (req->type) {
+ case BDRV_TRACKED_WRITE:
+ stat64_max(&bs->wr_highest_offset, offset + bytes);
+ /* fall through, to set dirty bits */
+ case BDRV_TRACKED_DISCARD:
+ bdrv_set_dirty(bs, offset, bytes);
+ break;
+ default:
+ break;
+ }
+ }
}
/*
@@ -2670,10 +2686,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
ret = bdrv_check_byte_request(bs, offset, bytes);
if (ret < 0) {
return ret;
- } else if (bs->read_only) {
- return -EPERM;
}
- assert(!(bs->open_flags & BDRV_O_INACTIVE));
/* Do nothing if disabled. */
if (!(bs->open_flags & BDRV_O_UNMAP)) {
@@ -2697,7 +2710,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
bdrv_inc_in_flight(bs);
tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_DISCARD);
- ret = notifier_with_return_list_notify(&bs->before_write_notifiers, &req);
+ ret = bdrv_co_write_req_prepare(child, offset, bytes, &req, 0);
if (ret < 0) {
goto out;
}
@@ -2763,8 +2776,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
}
ret = 0;
out:
- atomic_inc(&bs->write_gen);
- bdrv_set_dirty(bs, req.offset, req.bytes);
+ bdrv_co_write_req_finish(child, req.offset, req.bytes, &req, ret);
tracked_request_end(&req);
bdrv_dec_in_flight(bs);
return ret;
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 08/10] block: Use common req handling in copy offloading
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
` (6 preceding siblings ...)
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 07/10] block: Use common req handling for discard Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 09/10] block: Fix bdrv_co_truncate overlap check Fam Zheng
` (2 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
This brings the request handling logic inline with write and discard,
fixing write_gen, resize_cb, dirty bitmaps and image size refreshing.
The last of these issues broke iotest case 222, which is now fixed.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/io.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/block/io.c b/block/io.c
index cc426bab2e..9687da1ce9 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3007,20 +3007,16 @@ static int coroutine_fn bdrv_co_copy_range_internal(
bdrv_inc_in_flight(dst->bs);
tracked_request_begin(&req, dst->bs, dst_offset, bytes,
BDRV_TRACKED_WRITE);
-
- /* BDRV_REQ_NO_SERIALISING is only for read operation */
- assert(!(write_flags & BDRV_REQ_NO_SERIALISING));
- if (write_flags & BDRV_REQ_SERIALISING) {
- mark_request_serialising(&req, bdrv_get_cluster_size(dst->bs));
+ ret = bdrv_co_write_req_prepare(dst, dst_offset, bytes, &req,
+ write_flags);
+ if (!ret) {
+ ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs,
+ src, src_offset,
+ dst, dst_offset,
+ bytes,
+ read_flags, write_flags);
}
- wait_serialising_requests(&req);
-
- ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs,
- src, src_offset,
- dst, dst_offset,
- bytes,
- read_flags, write_flags);
-
+ bdrv_co_write_req_finish(dst, dst_offset, bytes, &req, ret);
tracked_request_end(&req);
bdrv_dec_in_flight(dst->bs);
}
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 09/10] block: Fix bdrv_co_truncate overlap check
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
` (7 preceding siblings ...)
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 08/10] block: Use common req handling in copy offloading Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 10/10] block: Use common write req handling in truncate Fam Zheng
2018-07-10 14:46 ` [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Kevin Wolf
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
If we are growing the image and potentially using preallocation for the
new area, we need to make sure that no write requests are made to the
"preallocated" area which is [@old_size, @offset), not
[@offset, offset * 2 - @old_size).
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
block/io.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/io.c b/block/io.c
index 9687da1ce9..e3e2d5286d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3113,7 +3113,8 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
}
bdrv_inc_in_flight(bs);
- tracked_request_begin(&req, bs, offset, new_bytes, BDRV_TRACKED_TRUNCATE);
+ tracked_request_begin(&req, bs, offset - new_bytes, new_bytes,
+ BDRV_TRACKED_TRUNCATE);
/* If we are growing the image and potentially using preallocation for the
* new area, we need to make sure that no write requests are made to it
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v3 10/10] block: Use common write req handling in truncate
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
` (8 preceding siblings ...)
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 09/10] block: Fix bdrv_co_truncate overlap check Fam Zheng
@ 2018-07-10 6:31 ` Fam Zheng
2018-07-10 14:46 ` [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Kevin Wolf
10 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-07-10 6:31 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, vsementsov, qemu-block, Fam Zheng, Kevin Wolf,
Max Reitz, Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
Truncation is the last to convert from open coded req handling to
reusing helpers. This time the permission check in prepare has to adapt
to the new caller: it checks a different permission bit, and doesn't
trigger the before write notifier.
Also, truncation should always trigger a bs->total_sectors update and in
turn call parent resize_cb. Update the condition in finish helper, too.
It's intended to do a duplicated bs->read_only check before calling
bdrv_co_write_req_prepare() so that we can be more informative with the
error message, as bdrv_co_write_req_prepare() doesn't have Error
parameter.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/io.c | 57 ++++++++++++++++++++++++++++++++++--------------------
1 file changed, 36 insertions(+), 21 deletions(-)
diff --git a/block/io.c b/block/io.c
index e3e2d5286d..abcb8b3de6 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1582,14 +1582,24 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
is_request_serialising_and_aligned(req));
assert(req->overlap_offset <= offset);
assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
-
- if (flags & BDRV_REQ_WRITE_UNCHANGED) {
- assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
- } else {
- assert(child->perm & BLK_PERM_WRITE);
- }
assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
- return notifier_with_return_list_notify(&bs->before_write_notifiers, req);
+
+ switch (req->type) {
+ case BDRV_TRACKED_WRITE:
+ case BDRV_TRACKED_DISCARD:
+ if (flags & BDRV_REQ_WRITE_UNCHANGED) {
+ assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
+ } else {
+ assert(child->perm & BLK_PERM_WRITE);
+ }
+ return notifier_with_return_list_notify(&bs->before_write_notifiers,
+ req);
+ case BDRV_TRACKED_TRUNCATE:
+ assert(child->perm & BLK_PERM_RESIZE);
+ return 0;
+ default:
+ abort();
+ }
}
static inline void coroutine_fn
@@ -1608,8 +1618,9 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes,
* beyond EOF cannot expand the image anyway.
* */
if (ret == 0 &&
- end_sector > bs->total_sectors &&
- req->type != BDRV_TRACKED_DISCARD) {
+ (req->type == BDRV_TRACKED_TRUNCATE ||
+ end_sector > bs->total_sectors) &&
+ req->type != BDRV_TRACKED_DISCARD) {
bs->total_sectors = end_sector;
bdrv_parent_cb_resize(bs);
bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
@@ -3088,7 +3099,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
int64_t old_size, new_bytes;
int ret;
- assert(child->perm & BLK_PERM_RESIZE);
/* if bs->drv == NULL, bs is closed, so there's nothing to do here */
if (!drv) {
@@ -3121,7 +3131,18 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
* concurrently or they might be overwritten by preallocation. */
if (new_bytes) {
mark_request_serialising(&req, 1);
- wait_serialising_requests(&req);
+ }
+ if (bs->read_only) {
+ error_setg(errp, "Image is read-only");
+ ret = -EACCES;
+ goto out;
+ }
+ ret = bdrv_co_write_req_prepare(child, offset - new_bytes, new_bytes, &req,
+ 0);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret,
+ "Failed to prepare request for truncation");
+ goto out;
}
if (!drv->bdrv_co_truncate) {
@@ -3133,13 +3154,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
ret = -ENOTSUP;
goto out;
}
- if (bs->read_only) {
- error_setg(errp, "Image is read-only");
- ret = -EACCES;
- goto out;
- }
-
- assert(!(bs->open_flags & BDRV_O_INACTIVE));
ret = drv->bdrv_co_truncate(bs, offset, prealloc, errp);
if (ret < 0) {
@@ -3151,9 +3165,10 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
} else {
offset = bs->total_sectors * BDRV_SECTOR_SIZE;
}
- bdrv_dirty_bitmap_truncate(bs, offset);
- bdrv_parent_cb_resize(bs);
- atomic_inc(&bs->write_gen);
+ /* It's possible that truncation succeeded but refresh_total_sectors
+ * failed, but the latter doesn't affect how we should finish the request.
+ * Pass 0 as the last parameter so that dirty bitmaps etc. are handled. */
+ bdrv_co_write_req_finish(child, offset - new_bytes, new_bytes, &req, 0);
out:
tracked_request_end(&req);
--
2.17.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v3 07/10] block: Use common req handling for discard
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 07/10] block: Use common req handling for discard Fam Zheng
@ 2018-07-10 14:45 ` Kevin Wolf
0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2018-07-10 14:45 UTC (permalink / raw)
To: Fam Zheng
Cc: qemu-devel, Paolo Bonzini, vsementsov, qemu-block, Max Reitz,
Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
Am 10.07.2018 um 08:31 hat Fam Zheng geschrieben:
> Reuse the new bdrv_co_write_req_prepare/finish helpers. The variation
> here is that discard requests don't affect bs->wr_highest_offset, and it
> cannot extend the image.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> block/io.c | 32 ++++++++++++++++++++++----------
> 1 file changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/block/io.c b/block/io.c
> index 10a475302a..cc426bab2e 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1601,15 +1601,31 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes,
>
> atomic_inc(&bs->write_gen);
>
> - stat64_max(&bs->wr_highest_offset, offset + bytes);
> -
> + /* Discard cannot extend the image, but in error handling cases, such as
> + * when reverting a qcow2 cluster allocation, the discarded range can pass
> + * the end of image file, so we cannot assert about BDRV_TRACKED_DISCARD
> + * here. Instead, just skip it, since semantically a discard request
> + * beyond EOF cannot expand the image anyway.
> + * */
Oh, yet another style for the end of a multiline comment. :-)
I'll fix this while applying.
Kevin
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
` (9 preceding siblings ...)
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 10/10] block: Use common write req handling in truncate Fam Zheng
@ 2018-07-10 14:46 ` Kevin Wolf
10 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2018-07-10 14:46 UTC (permalink / raw)
To: Fam Zheng
Cc: qemu-devel, Paolo Bonzini, vsementsov, qemu-block, Max Reitz,
Jeff Cody, Eric Blake, John Snow, Stefan Hajnoczi
Am 10.07.2018 um 08:31 hat Fam Zheng geschrieben:
> Based-on: 20180709163719.87107-1-vsementsov@virtuozzo.com
>
> v3: - Rebase onto Vladmir's:
> [PATCH v5 0/4] fix image fleecing
> on top of master, which has blklogwrites to be converted to BdrvChild for
> the bdrv_co_pdiscard() parameter. [Kevin]
> - Add file_ prefix to file protocol trace points. [Kevin]
> - Assert that BdrvTrackedRequest bytes and offset don't overflow INT64_MAX.
> [Kevin]
> - Don't misuse req->offset/req->bytes in bdrv_co_write_req_prepare/finish.
> [Kevin]
> - Fix stat64_max. [Kevin]
> - Keep lines within 80 columns. [Kevin]
> - Grammar fix in commit message and rev-by. [Eric]
>
> Qcow2 allocates new clusters after the end of the file. If it is the destinaton
> of copy offloading, we must adjust dst->bs->total_sectors. Otherwise, further
> reads will drop to the "beyond EOF" code path and return zeroes, which problem
> is caught by iotests 222.
>
> Follow the logic in the normal write code and update bs->total_sectors after
> I/O is done.
>
> While at it, add a few convenient trace points to aid future debug experiences
> in the topic.
Thanks, applied to the block branch.
Kevin
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-07-10 14:47 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-10 6:31 [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 01/10] block: Prefix file driver trace points with "file_" Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 02/10] block: Add copy offloading trace points Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 03/10] block: Use BdrvChild to discard Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 04/10] block: Use uint64_t for BdrvTrackedRequest byte fields Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 05/10] block: Extract common write req handling Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 06/10] block: Fix handling of image enlarging write Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 07/10] block: Use common req handling for discard Fam Zheng
2018-07-10 14:45 ` Kevin Wolf
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 08/10] block: Use common req handling in copy offloading Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 09/10] block: Fix bdrv_co_truncate overlap check Fam Zheng
2018-07-10 6:31 ` [Qemu-devel] [PATCH v3 10/10] block: Use common write req handling in truncate Fam Zheng
2018-07-10 14:46 ` [Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading Kevin Wolf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).